Request #59018 Please support word boundaries
Submitted: 2010-01-03 18:47 UTC
Modified: 2012-07-25 08:30 UTC
From: chx1975 at gmail dot com
Assigned: cataphract
Status: Closed
Package: intl (PECL)
PHP Version: Trunk SVN-2010-01-03 (dev)
OS: irrelevant
Private report: No
CVE-ID: None
 [2010-01-03 18:47 UTC] chx1975 at gmail dot com
Description:
------------
http://www.unicode.org/reports/tr29/#Default_Word_Boundaries
defines default word boundaries, and AFAIK ICU supports them. Please add
support for them.


 [2012-07-25 08:20 UTC] cataphract@php.net
Already supported in ext/intl (trunk) and PECL/intl (3.0.0a2).
 [2012-07-25 08:20 UTC] cataphract@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: cataphract
 [2012-07-25 08:24 UTC] chx@php.net
Mind if I ask how? I do not see a "break into words" function.
 [2012-07-25 08:30 UTC] cataphract@php.net
See e.g. https://github.com/cataphract/PECL-intl/blob/master/tests/breakiter_getPartsIterator_basic.phpt

You can look at the rule status (::getRuleStatus()) to determine whether the
current token is actually a word. The predefined rules have certain ranges for
these statuses:
https://github.com/cataphract/PECL-intl/blob/master/breakiterator/breakiterator_class.cpp#L366
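
A minimal sketch of the approach described above, assuming ext/intl with the IntlBreakIterator class is loaded. The helper name extractWords() and the sample string are hypothetical and are not taken from the linked test. It walks the parts iterator and keeps only those parts whose rule status is outside the "not a word" range, i.e. at or above IntlBreakIterator::WORD_NONE_LIMIT.

```php
<?php
// Hypothetical helper, not part of ext/intl: returns only the tokens that
// the word break rules classify as words (letters, numbers, kana, ideographs).
function extractWords(string $text, string $locale = 'en'): array
{
    $bi = IntlBreakIterator::createWordInstance($locale);
    $bi->setText($text);

    $words = [];
    // Each $part is the text between two consecutive boundaries.
    foreach ($bi->getPartsIterator() as $part) {
        // The rule status describes the boundary that ends $part.
        // Statuses in [WORD_NONE, WORD_NONE_LIMIT) mark spaces and punctuation.
        if ($bi->getRuleStatus() >= IntlBreakIterator::WORD_NONE_LIMIT) {
            $words[] = $part;
        }
    }

    return $words;
}

print_r(extractWords("Hello, world! 42"));
```

Comparing against the other range constants (for example WORD_NUMBER and WORD_NUMBER_LIMIT) would let the caller tell numbers apart from letter words, if that distinction is needed.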
 
Last updated: Thu Nov 26 04:01:23 2020 UTC