Request #59018 Please support word boundaries
Submitted: 2010-01-03 18:47 UTC
Modified: 2012-07-25 08:30 UTC
From: chx1975 at gmail dot com
Assigned: cataphract
Status: Closed
Package: intl (PECL)
PHP Version: Trunk SVN-2010-01-03 (dev)
OS: irrelevant
Private report: No
CVE-ID: None

 

 [2010-01-03 18:47 UTC] chx1975 at gmail dot com
Description:
------------
http://www.unicode.org/reports/tr29/#Default_Word_Boundaries 
defines this and AFAIK ICU supports it. Please add. 


 [2012-07-25 08:20 UTC] cataphract@php.net
Already supported in ext/intl (trunk) and PECL/intl (3.0.0a2).
 [2012-07-25 08:20 UTC] cataphract@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: cataphract
 [2012-07-25 08:24 UTC] chx@php.net
Mind if I ask how? I do not see a "break into words" function.
 [2012-07-25 08:30 UTC] cataphract@php.net
See e.g. https://github.com/cataphract/PECL-intl/blob/master/tests/breakiter_getPartsIterator_basic.phpt

You can look at the rule status (::getRuleStatus()) to determine whether the
current token is actually a word. The predefined rules have certain ranges for
these statuses:
https://github.com/cataphract/PECL-intl/blob/master/breakiterator/breakiterator_class.cpp#L366
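
For illustration, a minimal sketch of a "break into words" helper built on the approach described above. It assumes ext/intl with IntlBreakIterator is available; extractWords() is a hypothetical name, not something the extension provides. It also assumes that any rule status at or above IntlBreakIterator::WORD_NONE_LIMIT (numbers, letters, kana, ideographs) should count as a word, which may not match every use case.

```php
<?php
// Sketch only: extractWords() is a hypothetical helper, not part of ext/intl.
function extractWords(string $text, string $locale = 'en'): array
{
    $bi = IntlBreakIterator::createWordInstance($locale);
    $bi->setText($text);

    $words = [];
    foreach ($bi->getPartsIterator() as $part) {
        // The parts iterator advances $bi itself, so the rule status
        // describes the segment that was just returned.
        $status = $bi->getRuleStatus();

        // Statuses in [WORD_NONE, WORD_NONE_LIMIT) mark spaces and
        // punctuation; everything from WORD_NONE_LIMIT up is treated
        // as a word here (an assumption of this sketch).
        if ($status >= IntlBreakIterator::WORD_NONE_LIMIT) {
            $words[] = $part;
        }
    }

    return $words;
}

print_r(extractWords('Hello, world! It is 2012.'));
```

To keep only alphabetic words, the check could be narrowed to the range from IntlBreakIterator::WORD_LETTER up to (but not including) IntlBreakIterator::WORD_LETTER_LIMIT.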
 