[2018-09-29 20:36 UTC] tobias at tromm dot no-ip dot org
Description:
------------
Hi. The rtrim function does not work correctly if a user uses characters from the ASCII table to cheat the system. Please check the full example I am giving, which includes a fix (there is probably a better way to fix it, but it works).

Test script:
---------------
https://www.papinho.com/PHP_bug.zip

Expected result:
----------------
Result String: "David"

This is the result with my fix script.

Actual result:
--------------
Given String: " David "

Given String with the use of php trim function: " David "
<?php
// Note: in the report, the padding around "David" includes bytes 194 and 160
// (a UTF-8 non-breaking space), which may not survive copy and paste.
$apelido = " David ";

echo "Given String: \"".$apelido."\"<br><br>";
echo "Given String with the use of php trim function: \"".trim($apelido)."\"<br><br>";
echo "Now we will check the ASCII table to check which characters are being used:<br><br>";

// Print the value of every byte in the string
for ($cont = 0; $cont < strlen($apelido); $cont++) {
    echo "ORD: ".ord($apelido[$cont])."<br>";
}

// Keep stripping bytes 32, 194 and 160 from both ends until nothing changes
$fazer = 1;
do {
    $fazer = 0;

    // Check the beginning of the string (guard against an empty string)
    if ($apelido !== '' && ($apelido[0] == chr(32) || $apelido[0] == chr(194) || $apelido[0] == chr(160))) {
        $fazer = 1;
        $apelido = ltrim($apelido, chr(32));
        $apelido = ltrim($apelido, chr(194));
        $apelido = ltrim($apelido, chr(160));
    }

    // Check the end of the string (guard against an empty string)
    $fim = strlen($apelido) - 1;
    if ($apelido !== '' && ($apelido[$fim] == chr(32) || $apelido[$fim] == chr(194) || $apelido[$fim] == chr(160))) {
        $fazer = 1;
        $apelido = rtrim($apelido, chr(32));
        $apelido = rtrim($apelido, chr(194));
        $apelido = rtrim($apelido, chr(160));
    }
} while ($fazer == 1);

echo "<br>Result String: \"".$apelido."\"<br><br>";
?>

You probably need the longer explanation. Welcome to the world of character encodings.

Nearly all of PHP's normal functions work on bytes. You are thinking about characters. Since PHP doesn't internally manage Unicode characters, the only reasonable way PHP can currently convert between the two is to look at the byte range that all ("all") character encodings can agree upon: the 0-127 range. That means functions like trim will only deal with the bytes \n\r\t\v and space, and \0 for the fun of it, and they will not cover anything above \x7F.

The bytes blocking trim from reducing the entire string to "David" are above \x7F. The exact interpretation of what characters those bytes are depends on the character encoding. In Latin1, \xA0 (\240) is a non-breaking space, but in UTF-8 it is not a character at all - instead it is part of a 2-4 byte sequence that represents a character (and the sequence for a non-breaking space is \xC2\xA0 or \302\240).
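
To make the bytes-versus-characters point concrete, here is a minimal sketch. It assumes the input is UTF-8 and padded with U+00A0, which is what the byte values 194 and 160 in the test script suggest; the sample strings are made up for illustration.

```php
<?php
// Assumption: the input is UTF-8 and padded with U+00A0 (NO-BREAK SPACE),
// which UTF-8 encodes as the two bytes \xC2\xA0.
$name = "\xC2\xA0David\xC2\xA0";

// strlen() counts bytes, not characters: 2 + 5 + 2 = 9.
$byteLength = strlen($name);

// bin2hex() makes the invisible bytes visible.
$hex = bin2hex($name);

// trim() with its default mask leaves the string untouched, because
// neither \xC2 nor \xA0 is in " \n\r\t\v\0".
$trimmed = trim($name);

// Stripping \xC2 and \xA0 as independent bytes can damage other characters:
// "à" is \xC3\xA0 in UTF-8, so its second byte is removed here.
$damaged = rtrim("Sof\xC3\xA0", " \xC2\xA0");

echo $byteLength, "\n";
echo $hex, "\n";
echo var_export($trimmed === $name, true), "\n";
echo bin2hex($damaged), "\n";
```

The last case is the reason a byte-by-byte workaround is risky: a name that legitimately ends in "à" would come out as an invalid UTF-8 string.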
There is no mb_trim function but you can use preg_replace with \s and the /u option.

For what it's worth,

preg_replace('/(^[\t\n\r\000\v\pZ]+)|([\t\n\r\000\v\pZ]+$)/u', '', $apelido);

removes leading/trailing instances of everything that trim() removes and also everything that Unicode considers a "whitespace" character (assumes UTF-8 encoding).

Unfortunately, some user is still finding a way to insert a database value with spaces at the end of the string. If anyone has any idea how, I would be grateful. I also tried

preg_replace('/(^[\t\n\r\000\v\pZ]+)|([\t\n\r\000\v\pZ]+$)/u', '', $apelido);

In my tests everything works, but the user is doing something to get around it, leaving the following ord sequence at the end:

ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32
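
For reference, the preg_replace pattern quoted in this thread can be wrapped in a small helper. This is only a sketch: unicode_trim is a hypothetical name, not a PHP function, and falling back to trim() for input that is not valid UTF-8 is an assumption (rejecting such input may be the better choice).

```php
<?php
// Hypothetical helper, not part of PHP: wraps the pattern from this thread.
function unicode_trim(string $value): string
{
    $pattern = '/(^[\t\n\r\000\v\pZ]+)|([\t\n\r\000\v\pZ]+$)/u';
    $result = preg_replace($pattern, '', $value);

    // With /u, preg_replace() returns null when $value is not valid UTF-8.
    // Falling back to plain trim() here is an assumption of this sketch.
    return $result ?? trim($value);
}

// UTF-8 non-breaking spaces and ordinary spaces are removed from both ends.
echo bin2hex(unicode_trim("\xC2\xA0 David \xC2\xA0")), "\n";

// A run of plain spaces (ORD 32) at the end is removed as well.
echo bin2hex(unicode_trim("David        ")), "\n";
```

Since a run of ORD 32 bytes is something both this helper and plain trim() remove, it may be worth checking that every code path that writes the value to the database actually goes through the trimming step.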