[2020-11-03 18:14 UTC] cmb@php.net
-Status: Open
+Status: Suspended
Description:
------------
mbstring is missing an `explode` equivalent.

• `explode` itself can't be used for the purpose, because it compares bytes and therefore fails for encodings that are not self-synchronizing (such as UTF-16), or in which identical byte sequences otherwise do not map uniquely to characters.
• `mb_split` takes a regular expression as its pattern, so it is unsuitable when the delimiter is not known at the time the code is written.
• The `mb_strpos` + `mb_substr` combination is very slow.

Test script:
---------------
<?php
// Print every byte of $string as hexadecimal, separated by spaces.
function printHexString($string) {
    foreach (str_split($string) as $character) {
        echo dechex(ord($character)), ' ';
    }
    echo "\n";
}

mb_internal_encoding('utf-8'); // should match the actual encoding

$input = mb_convert_encoding('Ω☡Ω☡Ω☡☦abc', 'utf-16'); // Ω is U+2126, ☡ is U+2621
$delimiter = mb_convert_encoding('☦', 'utf-16'); // ☦ is U+2626

printHexString($input);
echo "\n";

$exploded = explode($delimiter, $input);

foreach ($exploded as $element) {
    printHexString($element);
}

Expected result:
----------------
(assuming `mb_explode` instead of `explode`)

21 26 26 21 21 26 26 21 21 26 26 21 26 26 0 61 0 62 0 63

21 26 26 21 21 26 26 21 21 26 26 21
0 61 0 62 0 63

Actual result:
--------------
21 26 26 21 21 26 26 21 21 26 26 21 26 26 0 61 0 62 0 63

21
21 21
21 21
21
0 61 0 62 0 63
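
For illustration only, here is a minimal sketch of how the requested behaviour could be approximated in userland today. It is not an mbstring function: the name `mb_explode_sketch` and its signature are hypothetical. The idea is to convert both operands to UTF-8, which is self-synchronizing, run the byte-wise `explode` there, and convert each piece back. It assumes the input is well-formed in the given encoding, since `mb_convert_encoding` substitutes invalid sequences. It also pays for two conversions, so it is a workaround rather than a replacement for a native function.

```php
<?php
// Hypothetical helper, NOT part of mbstring: splits $string on $delimiter
// character-wise by round-tripping through UTF-8.
function mb_explode_sketch(string $delimiter, string $string, string $encoding): array
{
    if ($delimiter === '') {
        throw new InvalidArgumentException('Delimiter must not be empty');
    }

    // UTF-8 is self-synchronizing, so a byte-wise explode() cannot match
    // across character boundaries once both operands are UTF-8.
    $utf8Delimiter = mb_convert_encoding($delimiter, 'UTF-8', $encoding);
    $utf8String    = mb_convert_encoding($string, 'UTF-8', $encoding);

    $parts = explode($utf8Delimiter, $utf8String);

    // Convert every piece back to the caller's encoding.
    return array_map(
        function (string $part) use ($encoding): string {
            return mb_convert_encoding($part, $encoding, 'UTF-8');
        },
        $parts
    );
}

// Same data as the test script, written with \u{...} escapes (which yield
// UTF-8) so the example does not depend on the source file's encoding.
$input     = mb_convert_encoding("\u{2126}\u{2621}\u{2126}\u{2621}\u{2126}\u{2621}\u{2626}abc", 'UTF-16BE', 'UTF-8');
$delimiter = mb_convert_encoding("\u{2626}", 'UTF-16BE', 'UTF-8');

$parts = mb_explode_sketch($delimiter, $input, 'UTF-16BE');

foreach ($parts as $part) {
    echo bin2hex($part), "\n";
}
```

With the sample input this yields two elements, matching the expected result above, instead of the five fragments produced by plain `explode`.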