Bug #35427 str_word_count() handles '-' incorrectly
Submitted: 2005-11-27 19:12 UTC Modified: 2005-11-29 17:14 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: tomas_matousek at hotmail dot com Assigned: iliaa (profile)
Status: Closed Package: Strings related
PHP Version: 5.1.0 OS: *
Private report: No CVE-ID: None
 [2005-11-27 19:12 UTC] tomas_matousek at hotmail dot com
Description:
------------
Characters passed in the third parameter of str_word_count() should be treated the same as letters, right?
This works when such a character follows an apostrophe but not when it follows a hyphen.

Reproduce code:
---------------
var_dump(str_word_count("foo'0 bar-0var", 2, "0"));


Expected result:
----------------
array(2) {
  [0]=>
  string(5) "foo'0"
  [6]=>
  string(8) "bar-0var"
}


Actual result:
--------------
array(3) {
  [0]=>
  string(5) "foo'0"
  [6]=>
  string(3) "bar"
  [10]=>
  string(4) "0var"
}


 [2005-11-27 19:28 UTC] tony2001@php.net
"bar-0var" doesn't look like a valid *WORD* to me.
Or is it?
 [2005-11-27 20:00 UTC] tomas_matousek at hotmail dot com
By passing "0" as the third parameter, one declares the '0' character a legal word character, which should make it equivalent to any other letter, e.g. 'x'. "bar-xbar" is considered to be a word, so "bar-0bar" should be a word as well.
 [2005-11-28 21:27 UTC] tomas_matousek at hotmail dot com
No, I needn't. str_word_count("bar-var") returns 1, so '-' is considered part of the word if it is followed by a 'word' character.

See the source code. The bug is clear there.
 [2005-11-29 09:41 UTC] tomas_matousek at hotmail dot com
File string.c, line 4744:

while (isalpha(*p) || *p == '\'' || (*p == '-' && isalpha(*(p+1))) || (char_list && ch[(unsigned char)*p])) 

should be:

while (isalpha(*p) || *p == '\'' || (*p == '-' && (isalpha(*(p+1) || (char_list && ch[(unsigned char)*p])))) || (char_list && ch[(unsigned char)*p]))
 [2005-11-29 09:45 UTC] tomas_matousek at hotmail dot com
One more correction:

while (isalpha(*p) || *p == '\'' || (*p == '-' && (isalpha(*(p+1)) || char_list && ch[(unsigned char)*(p+1)]))
|| (char_list && ch[(unsigned char)*p]))
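To make the difference between the two conditions concrete, here is a minimal, self-contained sketch of a word scanner in C. It is not the code from string.c: the names build_mask, is_word_char and count_words and the fixed flag are hypothetical, and it ignores any other special cases the real function may handle. With fixed set to 0 it follows the original condition quoted above; with fixed set to 1 it follows the corrected one, where a '-' also counts when the next character is in the extra character list.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build a 256-entry lookup table from the extra
 * characters (a stand-in for the ch[] table indexed in the report). */
static void build_mask(const char *extra, unsigned char mask[256])
{
    memset(mask, 0, 256);
    if (extra != NULL) {
        for (; *extra != '\0'; extra++) {
            mask[(unsigned char)*extra] = 1;
        }
    }
}

/* Returns nonzero if the character at p belongs to a word.
 * fixed == 0: '-' counts only when followed by a letter (original test).
 * fixed != 0: '-' also counts when followed by an extra character. */
static int is_word_char(const char *p, const unsigned char mask[256], int fixed)
{
    unsigned char c = (unsigned char)p[0];

    if (c == '\0') {
        return 0; /* never look past the terminator */
    }
    if (isalpha(c) || c == '\'' || mask[c]) {
        return 1;
    }
    if (c == '-') {
        unsigned char next = (unsigned char)p[1];
        return isalpha(next) || (fixed && mask[next]);
    }
    return 0;
}

/* Simplified word counter; an illustration, not the PHP implementation. */
size_t count_words(const char *s, const char *extra, int fixed)
{
    unsigned char mask[256];
    size_t words = 0;
    const char *p = s;

    assert(s != NULL);
    build_mask(extra, mask);

    while (*p != '\0') {
        if (!is_word_char(p, mask, fixed)) {
            p++; /* skip separators */
            continue;
        }
        words++;
        while (is_word_char(p, mask, fixed)) {
            p++; /* consume the rest of the word */
        }
    }
    return words;
}
```

Under these assumptions, count_words("foo'0 bar-0var", "0", 0) yields 3 because the scan stops at the '-', while count_words("foo'0 bar-0var", "0", 1) yields 2 because "bar-0var" is kept together.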
 [2005-11-29 17:14 UTC] iliaa@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.