Bug #66069 UTF8 problem with substr
Submitted: 2013-11-08 23:11 UTC
Modified: 2013-11-09 00:04 UTC
From: p dot danesh at afagh dot info
Assigned:
Status: Not a bug
Package: *General Issues
PHP Version: 5.5.5
OS: LINUX
Private report: No
CVE-ID: None

 

 [2013-11-08 23:11 UTC] p dot danesh at afagh dot info
Description:
------------
---
From manual page: http://www.php.net/function.substr
---
Hello,
when we use substr() on a UTF-8 string, a "�" character appears at the end of the output.
We don't have this problem with mb_substr().


Test script:
---------------
<?php
$str = "نمونه متن ساده جهت برش توسط تابع ساب اس تی آر در پی اچ پی";
// Take the first 28 bytes of the UTF-8 string.
echo substr($str, 0, 28);

Actual result:
--------------
نمونه متن ساده �


 [2013-11-09 00:04 UTC] requinix@php.net
-Status: Open
+Status: Not a bug
 [2013-11-09 00:04 UTC] requinix@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

substr() works with bytes, not logical characters. If you use it on a string in a multibyte encoding, you may cut the string off in the middle of a byte sequence, which is what happened here: each of these letters takes two bytes in UTF-8, and the 28th byte is only the first half of a letter.

Don't use substr() for a UTF-8 string.
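
To illustrate the difference, here is a minimal sketch that compares the byte-based cut from the test script with two multibyte-aware alternatives. It assumes the mbstring extension is loaded and that the script file itself is saved as UTF-8; the variable names and the character count of 15 are chosen for this example and are not part of the original report.

```php
<?php
// Sketch only: assumes ext/mbstring is available and this file is UTF-8 encoded.
$str = "نمونه متن ساده جهت برش توسط تابع ساب اس تی آر در پی اچ پی";

// substr() counts bytes. Byte 28 is the first byte of a two-byte letter,
// so the result ends with an incomplete sequence (shown as "�").
$byBytes = substr($str, 0, 28);

// mb_substr() counts characters. The first three words plus their
// trailing spaces are 5 + 1 + 3 + 1 + 4 + 1 = 15 characters.
$byChars = mb_substr($str, 0, 15, 'UTF-8');

// mb_strcut() takes a byte length like substr(), but does not split
// a multibyte character at the cut position.
$byBytesSafe = mb_strcut($str, 0, 28, 'UTF-8');

// Check whether each result is still well-formed UTF-8.
var_dump(mb_check_encoding($byBytes, 'UTF-8'));
var_dump(mb_check_encoding($byChars, 'UTF-8'));
var_dump(mb_check_encoding($byBytesSafe, 'UTF-8'));
```

mb_substr() fits when the limit is a number of characters; mb_strcut() fits when the limit is a byte budget (for example a fixed-size storage field) but the result still has to be valid UTF-8.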
 
PHP Copyright © 2001-2025 The PHP Group
All rights reserved.
Last updated: Fri Nov 07 05:00:01 2025 UTC