Bug #35427 str_word_count() handles '-' incorrectly
Submitted: 2005-11-27 19:12 UTC Modified: 2005-11-29 17:14 UTC
From: tomas_matousek at hotmail dot com
Assigned: iliaa
Status: Closed
Package: Strings related
PHP Version: 5.1.0
OS: *
Private report: No
CVE-ID: None
 [2005-11-27 19:12 UTC] tomas_matousek at hotmail dot com
Description:
------------
Characters specified in str_word_count() should be treated equally to letters, right?
This works for apostrophe but doesn't for hyphen.

Reproduce code:
---------------
var_dump(str_word_count("foo'0 bar-0var", 2, "0"));


Expected result:
----------------
array(2) {
  [0]=>
  string(5) "foo'0"
  [6]=>
  string(8) "bar-0var"
}


Actual result:
--------------
array(3) {
  [0]=>
  string(5) "foo'0"
  [6]=>
  string(3) "bar"
  [10]=>
  string(4) "0var"
}


 [2005-11-27 19:28 UTC] tony2001@php.net
"bar-0var" doesn't look like a valid *WORD* to me.
Or is it?
 [2005-11-27 20:00 UTC] tomas_matousek at hotmail dot com
By passing "0" as the third parameter, one declares the '0' character to be a legal word character, which should make it equivalent to any other letter, e.g. 'x'. "bar-xbar" is considered to be a word, so "bar-0bar" should be a word as well.
 [2005-11-28 21:27 UTC] tomas_matousek at hotmail dot com
No, I needn't. str_word_count("bar-var") returns 1, so '-' is considered part of the word if it is followed by a 'word' character.

See the source code. The bug is clear there.
 [2005-11-29 09:41 UTC] tomas_matousek at hotmail dot com
File string.c, line 4744:

while (isalpha(*p) || *p == '\'' || (*p == '-' && isalpha(*(p+1))) || (char_list && ch[(unsigned char)*p])) 

should be:

while (isalpha(*p) || *p == '\'' || (*p == '-' && (isalpha(*(p+1) || (char_list && ch[(unsigned char)*p])))) || (char_list && ch[(unsigned char)*p]))
 [2005-11-29 09:45 UTC] tomas_matousek at hotmail dot com
One more correction:

while (isalpha(*p) || *p == '\'' || (*p == '-' && (isalpha(*(p+1)) || (char_list && ch[(unsigned char)*(p+1)]))) || (char_list && ch[(unsigned char)*p]))
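
To make the difference between the two conditions easier to check, here is a minimal standalone sketch in C. It is not the code from string.c: the names build_mask, is_word_char and count_words are hypothetical, the mask builder handles only a plain list of bytes, and the scanner only counts words. The `fixed` flag switches between the original hyphen rule (the next byte must be a letter) and the proposed rule (the next byte may also be a char_list character).

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: mark every byte of char_list as an extra word
 * character. Simplified assumption: a plain list of bytes only. */
static void build_mask(const char *char_list, unsigned char ch[256])
{
    memset(ch, 0, 256);
    if (char_list) {
        for (; *char_list; char_list++) {
            ch[(unsigned char)*char_list] = 1;
        }
    }
}

/* Returns 1 if the byte at p counts as part of a word.
 * fixed == 0: a hyphen is accepted only when followed by a letter.
 * fixed != 0: a hyphen is also accepted when followed by a char_list byte.
 * The caller must ensure *p is not the terminating NUL. */
static int is_word_char(const char *p, const unsigned char ch[256], int fixed)
{
    unsigned char c = (unsigned char)p[0];
    unsigned char next = (unsigned char)p[1];

    if (isalpha(c) || c == '\'' || ch[c]) {
        return 1;
    }
    if (c == '-') {
        if (isalpha(next)) {
            return 1;
        }
        if (fixed && ch[next]) {
            return 1;
        }
    }
    return 0;
}

/* Counts maximal runs of word characters in s. */
static int count_words(const char *s, const char *char_list, int fixed)
{
    unsigned char ch[256];
    const char *p = s;
    int words = 0;

    build_mask(char_list, ch);
    while (*p) {
        if (!is_word_char(p, ch, fixed)) {
            p++;            /* skip separators */
            continue;
        }
        while (*p && is_word_char(p, ch, fixed)) {
            p++;            /* consume one word */
        }
        words++;
    }
    return words;
}
```

With this sketch, count_words("foo'0 bar-0var", "0", 0) gives 3 (the word breaks at the hyphen, as in the actual result above), while count_words("foo'0 bar-0var", "0", 1) gives 2. count_words("bar-var", NULL, 0) gives 1 in either mode.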
 [2005-11-29 17:14 UTC] iliaa@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 