Bug #35427 str_word_count() handles '-' incorrectly
Submitted: 2005-11-27 19:12 UTC Modified: 2005-11-29 17:14 UTC
Avg. Score:3.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: tomas_matousek at hotmail dot com Assigned: iliaa (profile)
Status: Closed Package: Strings related
PHP Version: 5.1.0 OS: *
Private report: No CVE-ID: None

 [2005-11-27 19:12 UTC] tomas_matousek at hotmail dot com
Characters passed in the charlist argument of str_word_count() should be treated the same as letters, right?
This works after an apostrophe but not after a hyphen.

Reproduce code:
var_dump(str_word_count("foo'0 bar-0var", 2, "0"));

Expected result:
array(2) {
  [0]=>
  string(5) "foo'0"
  [6]=>
  string(8) "bar-0var"
}

Actual result:
array(3) {
  [0]=>
  string(5) "foo'0"
  [6]=>
  string(3) "bar"
  [10]=>
  string(4) "0var"
}


 [2005-11-27 19:28 UTC]
"bar-0var" doesn't look like a valid *WORD* to me.
Or is it?
 [2005-11-27 20:00 UTC] tomas_matousek at hotmail dot com
By passing "0" as the third parameter, one declares the '0' character to be a legal word character, which should make it equivalent to any other letter, e.g. 'x'. "bar-xbar" is considered a single word, so "bar-0bar" should be a single word as well.
 [2005-11-28 21:27 UTC] tomas_matousek at hotmail dot com
No, I needn't. str_word_count("bar-var") returns 1, so '-' is treated as part of the word when it is followed by a word character.

See the source code. The bug is clear there.
 [2005-11-29 09:41 UTC] tomas_matousek at hotmail dot com
File string.c, line 4744:

while (isalpha(*p) || *p == '\'' || (*p == '-' && isalpha(*(p+1))) || (char_list && ch[(unsigned char)*p])) 

should be:

while (isalpha(*p) || *p == '\'' || (*p == '-' && (isalpha(*(p+1) || (char_list && ch[(unsigned char)*p])))) || (char_list && ch[(unsigned char)*p]))
 [2005-11-29 09:45 UTC] tomas_matousek at hotmail dot com
One more correction:

while (isalpha(*p) || *p == '\'' || (*p == '-' && (isalpha(*(p+1)) || (char_list && ch[(unsigned char)*(p+1)]))) || (char_list && ch[(unsigned char)*p]))
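
To make the difference between the two conditions easier to check, here is a minimal standalone sketch in C. It is not the code from string.c: count_words(), is_word_char() and the "fixed" flag are hypothetical names introduced for illustration, and the extra characters are passed as a plain string instead of the ch[] lookup table. With fixed set to false the hyphen lookahead uses isalpha() only, as in the line quoted from string.c; with fixed set to true it also accepts the extra characters, as in the proposed correction.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if c is a letter or appears in extra (may be NULL). */
static bool is_word_char(char c, const char *extra)
{
    if (c == '\0') {
        return false; /* strchr() would otherwise match the terminator */
    }
    if (isalpha((unsigned char)c)) {
        return true;
    }
    return extra != NULL && strchr(extra, c) != NULL;
}

/*
 * Hypothetical sketch of the word scan discussed in this report.
 * fixed == false: a hyphen continues a word only if a letter follows.
 * fixed == true:  a hyphen continues a word if a letter OR an extra
 *                 character follows.
 */
static size_t count_words(const char *s, const char *extra, bool fixed)
{
    assert(s != NULL);

    size_t count = 0;
    const char *p = s;

    while (*p != '\0') {
        if (!is_word_char(*p, extra)) {
            p++; /* skip separators */
            continue;
        }
        p++; /* consume the first character of the word */
        for (;;) {
            bool hyphen_ok = false;
            if (*p == '-') {
                char next = *(p + 1); /* safe: *p is not the terminator */
                hyphen_ok = fixed ? is_word_char(next, extra)
                                  : (isalpha((unsigned char)next) != 0);
            }
            if (is_word_char(*p, extra) || *p == '\'' || hyphen_ok) {
                p++;
            } else {
                break;
            }
        }
        count++;
    }
    return count;
}
```

Under these assumptions, count_words("foo'0 bar-0var", "0", false) splits "bar-0var" at the hyphen, while the same call with fixed set to true keeps it as one word. "bar-var" is one word in both modes.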
 [2005-11-29 17:14 UTC]
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
Thank you for the report, and for helping us make PHP better.

Last updated: Sun May 31 00:01:25 2020 UTC