[2005-02-02 07:21 UTC] gullevek at gullevek dot org
Description:
------------
If you want to get the length of a string containing Japanese kanji, then with 4.3.10 the first kanji counts as 8 characters instead of 2. Any other double-byte character afterwards is counted as 2 bytes.

The problem is that mb_strlen should return only 1 and not 2. If I count with strlen, there should be 2.

I get the wrong result in every case: with no default charset set, with a default charset set, with the charset passed to mb_strlen, and with the charset obtained via mb_detect_encoding. It always returns the wrong length.

This did not happen in versions before 4.3.10.
Okay, perhaps it is not 100% a bug. The problem is that if you have ISO-2022-JP encoded data and you don't have a default encoding set, PHP doesn't read it correctly (because ISO-2022-JP is encoded very differently).

See the example below. Enter two characters, one single-byte (e.g. a) and one double-byte (e.g. あ). You will then see that, in the output where no ISO-2022-JP encoding is given, the length is wrong.

But I don't know why 4.3.10 behaves differently from 4.3.9 ...

<?php
import_request_variables("p");
if ($send) {
    echo "S: $string<br>";
    echo "D: ".mb_detect_encoding($string,"iso-2022-jp")."<br>";
    echo strlen($string)." -- without iso: ".mb_strlen($string)." -- with iso: ".mb_strlen($string,"iso-2022-jp")."<br>";
}
?>
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-2022-JP">
</head>
<body>
<form method="post" name="foo" enctype="multipart/form-data">
<input type="text" name="string" size="50" value="<? echo $string; ?>"><br>
<input type="submit" name="send" value="Send">
</form></body></html>
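
As a side note on why ISO-2022-JP lengths look so odd: the encoding is 7-bit and switches character sets with three-byte escape sequences, so a single kanji or kana surrounded by ASCII occupies 3 + 2 + 3 = 8 bytes, which may be where the "8" in the original report comes from. The sketch below is only an illustration, not a statement about what changed between 4.3.9 and 4.3.10. It assumes a current PHP with the mbstring extension loaded, and it writes the test string as raw bytes (an assumption made here so the result does not depend on the source file's encoding).

```php
<?php
// "a" followed by あ, written out byte by byte in ISO-2022-JP.
$string = "a"                // ASCII 'a'                            (1 byte)
        . "\x1b\x24\x42"     // ESC $ B : switch to JIS X 0208       (3 bytes)
        . "\x24\x22"         // あ in JIS X 0208                      (2 bytes)
        . "\x1b\x28\x42";    // ESC ( B : switch back to ASCII       (3 bytes)

// strlen() counts raw bytes, escape sequences included.
$bytes = strlen($string);

// With a single-byte encoding every byte is treated as one character,
// so the escape sequences are counted as well.
$withoutIso = mb_strlen($string, "ASCII");

// With the correct encoding the escape sequences are consumed
// and only the two actual characters are counted.
$withIso = mb_strlen($string, "ISO-2022-JP");

echo "strlen: $bytes -- without iso: $withoutIso -- with iso: $withIso\n";
```

Under these assumptions the script reports 9 bytes, 9 "characters" when the encoding is not given correctly, and 2 characters when "ISO-2022-JP" is passed explicitly.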