Bug #66069 UTF8 problem with substr
Submitted: 2013-11-08 23:11 UTC
Modified: 2013-11-09 00:04 UTC
From: p dot danesh at afagh dot info
Assigned:
Status: Not a bug
Package: *General Issues
PHP Version: 5.5.5
OS: LINUX
Private report: No
CVE-ID: None
 [2013-11-08 23:11 UTC] p dot danesh at afagh dot info
Description:
------------
---
From manual page: http://www.php.net/function.substr
---
Hello,
When we use substr() on a UTF-8 string, a stray "�" character appears at the end of the output.
We don't have this problem with mb_substr().


Test script:
---------------
$str="نمونه متن ساده جهت برش توسط تابع ساب اس تی آر در پی اچ پی";
echo substr($str ,0 ,28);

Expected result:
----------------
نمونه متن ساده �


History

 [2013-11-09 00:04 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2013-11-09 00:04 UTC] requinix@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

substr() works with bytes, not logical characters. If you use it on a string in a multibyte encoding, you can accidentally cut it off in the middle of a byte sequence, which is what happened here.
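
To make the byte-versus-character point concrete, here is a minimal sketch using the string from the test script. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded (needed only for mb_check_encoding()). Each Persian letter in the sample takes two bytes and each space takes one, so byte 28 lands on the first byte of the letter "ج".

```php
<?php
// Sample string from the report; this file is assumed to be saved as UTF-8.
$str = "نمونه متن ساده جهت برش توسط تابع ساب اس تی آر در پی اچ پی";

// substr() counts BYTES: this takes the first 28 bytes, not 28 characters.
$cut = substr($str, 0, 28);

// The result is exactly 28 bytes long...
var_dump(strlen($cut));                      // int(28)

// ...but it is no longer valid UTF-8, because the last character was split.
var_dump(mb_check_encoding($cut, 'UTF-8'));  // bool(false)

// The dangling byte is the lead byte of "ج" (U+062C, encoded as D8 AC).
$lastByte = bin2hex(substr($cut, -1));
var_dump($lastByte);                         // string(2) "d8"
```

The orphaned lead byte is what a browser or terminal renders as "�".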

Don't use substr() for a UTF-8 string.
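
As a sketch of the alternative, the mbstring functions below cut on character boundaries. mb_substr() is the function the reporter already found to work; mb_strcut() is offered here as an option for cases where the limit really is a byte budget. The values 15 and 28 are illustrative, and the mbstring extension is assumed to be available.

```php
<?php
// Sample string from the report; this file is assumed to be saved as UTF-8.
$str = "نمونه متن ساده جهت برش توسط تابع ساب اس تی آر در پی اچ پی";

// Character-based cut: the first 15 characters (three words plus spaces).
$byChars = mb_substr($str, 0, 15, 'UTF-8');

// Byte-budgeted cut: at most 28 bytes, moved back to a character boundary
// so that no multibyte sequence is split.
$byBytes = mb_strcut($str, 0, 28, 'UTF-8');

echo $byChars, "\n";
echo $byBytes, "\n";

// Both results are valid UTF-8, so no replacement character appears.
var_dump(mb_check_encoding($byChars, 'UTF-8'));  // bool(true)
var_dump(mb_check_encoding($byBytes, 'UTF-8'));  // bool(true)
```

For this particular string both calls return the same text, because the 28-byte limit is rounded down to the 27 bytes that hold the first 15 characters.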
 