[2006-06-13 11:53 UTC] jdespatis at yahoo dot fr
Description:
------------
preg_split("/\W/u", $utf8_string) splits UTF-8 words apart at every character instead of only at non-word characters.
Reproduce code:
---------------
print_r(preg_split("/(\W)/u", "этот", -1, PREG_SPLIT_DELIM_CAPTURE));
(Note: the string above is UTF-8. If it shows up as HTML entities, convert them to UTF-8 first. It is a Russian word that reads "etot"; the first letter looks like a reversed epsilon.)
For now, I have made my code work by using \P{L} instead of \W.
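A minimal sketch of that workaround, assuming the script file is saved as UTF-8 and PCRE was built with Unicode property support. The variable names and the second input string are made up for illustration and are not part of the original report:

```php
<?php
// Assumption: this file is UTF-8 encoded and PCRE supports \p / \P escapes.
$word = "этот";
$sentence = "этот тест"; // hypothetical second input with one space

// \P{L} matches any character that is NOT a Unicode letter,
// so Cyrillic letters are never treated as delimiters.
$wordParts = preg_split('/(\P{L})/u', $word, -1, PREG_SPLIT_DELIM_CAPTURE);
$sentenceParts = preg_split('/(\P{L})/u', $sentence, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($wordParts);     // the word contains only letters, so it stays whole
print_r($sentenceParts); // split at the space, which is kept as a captured delimiter
```

Note that \P{L} is not an exact replacement for \W: digits and the underscore are not letters, so they count as delimiters here, whereas \W leaves them inside words.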
Expected result:
----------------
Array
(
[0] => этот
)
Actual result:
--------------
Array
(
[0] =>
[1] => э
[2] =>
[3] => т
[4] =>
[5] => о
[6] =>
[7] => т
[8] =>
)
Sorry, my last comment was incorrect. In UTF-8 mode you should use the Unicode property escapes (\p{..}) instead of escapes that are not UTF-8 aware, such as \W.
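As an illustration of using a property escape directly, the following sketch extracts words by matching runs of letters rather than splitting on non-letters. The input string and variable names are hypothetical, and the file is assumed to be saved as UTF-8:

```php
<?php
// Assumption: this file is UTF-8 encoded and PCRE supports \p escapes.
$text = "этот тест, 2006"; // hypothetical input: two words, punctuation, digits

// \p{L}+ matches one or more consecutive Unicode letters in any script.
preg_match_all('/\p{L}+/u', $text, $matches);
$words = $matches[0];

print_r($words); // only the runs of letters; the comma, spaces and digits are skipped
```

If digits should count as part of a word, one option is a character class such as [\p{L}\p{N}]+ in place of \p{L}+.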