Request #76827 Introduce a new function struncate or some such
Submitted: 2018-08-31 08:27 UTC Modified: 2018-08-31 18:03 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: zjz at zjz dot name Assigned:
Status: Assigned Package: Performance problem
PHP Version: Irrelevant OS:
Private report: No CVE-ID: None
 [2018-08-31 08:27 UTC] zjz at zjz dot name
Description:
------------
Please consider introducing a new function |struncate|, or something similar, to resolve the performance problem of truncating a very long string to a slightly shorter one.

For example, suppose the user wants to truncate a 50000-byte string to a 49999-byte string (i.e. to delete the last character), and **the untruncated one is no longer in use**.

Right now the user has to write code such as:

$str = substr($str, 0, -1);

However, this performs poorly: copying the string is unnecessary when the untruncated one is no longer in use, as mentioned above.

So I'd like to suggest adding a new function |struncate| that takes two parameters: a reference to the string to be truncated, and the length to truncate it to:

function struncate(&$string, $length)


 [2018-08-31 08:31 UTC] spam2 at rhsoft dot net
$str = substr($str, 0, -1);
why do you think that copies the whole string?

function struncate(&$string, $length)
and why do you think that would gain anything?

to me it looks like you don't understand the copy-on-write nature of the Zend Engine, or that references in PHP in most cases don't do what you think and hence are often a bad idea
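
For readers unfamiliar with the term, here is a minimal sketch of copy-on-write with a reference-counted string. It is a toy model in plain C, not the Zend Engine's code: `rc_string`, `rc_new`, `rc_share`, `rc_release` and `rc_separate` are hypothetical names made up for this illustration. It only shows the general idea that assigning a string shares the buffer, and bytes are copied only when a shared buffer is about to be modified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted string; a stand-in for the idea, not PHP's zend_string. */
typedef struct {
    unsigned refcount;
    size_t   len;
    char     val[];   /* len bytes followed by a trailing NUL */
} rc_string;

/* Allocate a new string holding a copy of the first len bytes of src. */
rc_string *rc_new(const char *src, size_t len) {
    rc_string *s = malloc(sizeof *s + len + 1);
    assert(s != NULL);
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* "Assignment": add one more owner of the same buffer, copying no bytes. */
rc_string *rc_share(rc_string *s) {
    s->refcount++;
    return s;
}

/* Drop one owner; free the buffer when the last owner is gone. */
void rc_release(rc_string *s) {
    if (--s->refcount == 0) {
        free(s);
    }
}

/* Copy-on-write: return a buffer that is safe to modify.
 * Bytes are copied only if someone else still holds the original. */
rc_string *rc_separate(rc_string *s) {
    if (s->refcount == 1) {
        return s;
    }
    rc_string *copy = rc_new(s->val, s->len);
    rc_release(s);
    return copy;
}
```

In this model, passing a string to a function or assigning it to a second variable corresponds to `rc_share`, which is why neither needs to copy the bytes. Building a new, shorter string is a separate matter, and that is what the rest of the thread is about.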
 [2018-08-31 08:42 UTC] zjz at zjz dot name
Thanks for the reply.

So if substr doesn't copy, what happens if I use a different variable (say $str1) to receive the truncated string from $str?

$str1 = substr($str, 0, -1);

Doesn't that copy?

Or did you mean that PHP is smart enough to avoid copying when the variable that receives the return value is the same as the one passed as the parameter?
 [2018-08-31 08:46 UTC] spam2 at rhsoft dot net
why the hell should it copy the param and why do you think a reference is helpful?

http://www.phpinternalsbook.com/zvals/memory_management.html
http://schlueters.de/blog/archives/125-Do-not-use-PHP-references.html
 [2018-08-31 09:04 UTC] zjz at zjz dot name
I think I understand the copy-on-write nature. I am just not sure **whether substr can optimise out** the copying when the variable that receives the return value is the same one that was passed as the parameter.

I haven't looked into the C code yet, but my guess is that substr doesn't know beforehand whether the variable that receives the return value is the same one that was passed as the parameter, so I guess the process is as follows:

Step 1. Copy the string up to the truncated length (while calling substr).
Step 2. Do the assignment (in $str = substr()). The original string now loses its last owner, its reference count drops to zero, and it is freed in this step.

So I want to repeat what I asked in my previous comment: is PHP smart enough to avoid copying when the variable that receives the return value is the same as the one passed as the parameter?
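
The two steps guessed at above can be modelled in a few lines of plain C. This is only a sketch of the process as described in this comment, not PHP's actual substr implementation: `toy_str`, `toy_new`, `toy_substr_prefix`, `toy_assign` and the `bytes_copied` counter are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted string, standing in for a PHP string value. */
typedef struct {
    unsigned refcount;
    size_t   len;
    char     val[];   /* len bytes followed by a trailing NUL */
} toy_str;

/* Running total of payload bytes copied, so the cost is visible. */
size_t bytes_copied = 0;

/* Allocate a new string holding a copy of the first len bytes of src. */
toy_str *toy_new(const char *src, size_t len) {
    toy_str *s = malloc(sizeof *s + len + 1);
    assert(s != NULL);
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    bytes_copied += len;
    return s;
}

/* Step 1: the function builds its result as a brand new string.
 * It has no idea which variable the result will be assigned to. */
toy_str *toy_substr_prefix(const toy_str *s, size_t new_len) {
    assert(new_len <= s->len);
    return toy_new(s->val, new_len);
}

/* Step 2: assignment stores the new value, then releases the old one.
 * If the variable was the last owner, the old buffer is freed here. */
void toy_assign(toy_str **var, toy_str *value) {
    toy_str *old = *var;
    *var = value;
    if (old != NULL && --old->refcount == 0) {
        free(old);
    }
}
```

In this model, `toy_assign(&str, toy_substr_prefix(str, str->len - 1))` copies `len - 1` bytes and then frees the original buffer, which is exactly the cost the request is asking to avoid.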
 [2018-08-31 09:08 UTC] spam2 at rhsoft dot net
you miss the point:

a new function with a reference would gain *nothing* that couldn't also be done in substr() itself, without cluttering the function list
 [2018-08-31 09:56 UTC] zjz at zjz dot name
$str = substr($str, 0, -1); first makes a new string of the truncated length and then frees the original string, which seems unnecessary to me.

As to why I suggested a new function with a reference parameter: **I did NOT AT ALL mean that referencing itself is more efficient**. I just meant that, in the new function, **something magic** can be done in C code so that the original string is truncated **directly**, without first making a new truncated string somewhere else and then freeing the original one, which is unnecessary.
 [2018-08-31 10:34 UTC] zjz at zjz dot name
Well, let me look into the C code and then give a code example to explain what I meant.
 [2018-08-31 10:39 UTC] zjz at zjz dot name
The general idea I suggested is to directly modify the **len** value of the str:

To be clear, I am not familiar with the PHP source code, so the code below is not exact, production-ready code; it simply demonstrates the general idea:

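PHP_FUNCTION(struncate)
{
	zend_string *str;
	zend_long    len;

	ZEND_PARSE_PARAMETERS_START(...)
		Z_PARAM_STR(str)
		Z_PARAM_LONG(len)
	ZEND_PARSE_PARAMETERS_END();

	/* Only ever shrink; ignore lengths that are out of range. */
	if (len < 0 || (size_t)len >= ZSTR_LEN(str)) {
		return;
	}

	/* Sketch only: assumes str is not shared with another variable
	 * and is not an interned string. */
	str->len = len;
	str->val[len] = '\0';	/* keep the string NUL-terminated */
}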
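
One caveat with writing to the length field directly is that the same string buffer may be shared by several variables. The following is a minimal, self-contained sketch in plain C of how an in-place truncation could guard against that. It is a toy model under stated assumptions, not PHP source code: `toy_str`, `toy_new` and `toy_struncate` are hypothetical names, and the refcount check only stands in for whatever separation logic a real implementation would need.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted string, standing in for a PHP string value. */
typedef struct {
    unsigned refcount;
    size_t   len;
    char     val[];   /* len bytes followed by a trailing NUL */
} toy_str;

/* Allocate a new string holding a copy of the first len bytes of src. */
toy_str *toy_new(const char *src, size_t len) {
    toy_str *s = malloc(sizeof *s + len + 1);
    assert(s != NULL);
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* Hypothetical struncate(): shorten the string held by *var to new_len bytes.
 * Returns 1 if no bytes had to be copied, 0 if a copy was needed
 * because the buffer was shared with another owner. */
int toy_struncate(toy_str **var, size_t new_len) {
    toy_str *s = *var;

    if (new_len >= s->len) {
        return 1;               /* nothing to cut off */
    }
    if (s->refcount > 1) {
        /* Shared: the other owners must keep seeing the full string,
         * so give this variable its own, already shortened, copy. */
        s->refcount--;
        *var = toy_new(s->val, new_len);
        return 0;
    }
    /* Sole owner: adjust the length in place, no copy. */
    s->len = new_len;
    s->val[new_len] = '\0';
    return 1;
}
```

With a single owner the call only rewrites the length and the terminator. With two owners it falls back to copying the shortened prefix, so the other variable still sees the original contents.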
 [2018-08-31 10:47 UTC] zjz at zjz dot name
I think it's far more efficient than substr when the new string is still very long after being truncated, isn't it?
 [2018-08-31 11:09 UTC] zjz at zjz dot name
Strictly speaking, the parameter passed to struncate doesn't have to be a reference at all, since str->len = len; or similar code can do the job regardless of whether the parameter is a reference or not.

But I'd like to suggest that the parameter of the new function be a reference; otherwise the signature wouldn't reflect the fact that the parameter's value will be changed (in other words, its length will change), which would confuse the user.
 [2018-08-31 18:03 UTC] zjz at zjz dot name
-Status: Open +Status: Assigned
 [2018-08-31 18:03 UTC] zjz at zjz dot name
Let me try to offer a patch on Github.
 