PHP Bugs  

Bug #39415 Compilation failure on preg_match_all()
Submitted: 7 Nov 2006 4:45pm UTC    Modified: 13 Nov 2006 10:09am UTC
From: jordi at telematictraining dot com    Assigned to:
Status: Bogus    Category: PCRE related
Version: 5.2.0    OS: Debian GNU/Linux Stable

[7 Nov 2006 4:45pm UTC] jordi at telematictraining dot com
Description:
------------
Hi there,

What did we do? Updated PHP 5.1.6 to 5.2.0.
What did we want to happen? We expected version 5.2.0 to behave
as the previous ones did (5.1.2, 5.1.4 and 5.1.6).
What actually happened? It didn't.

We have a PHP-based app that has been running since PHP 5.1.2.
With the new version (5.2.0) there seems to be a problem with the
preg_match_all() function.

The call is preg_match_all(string $pattern, string $subject,
array $coincidences), and these are the values of the variables:

$pattern =
"/((field):(codigo_doa|titulo_obra|titulo_alternativo|num_serie|ejemplar
_serie|ej
emplares_obra|nombre_tecnica|materiales_soporte|medidas_diametro|lista_a
rtistas|m
edidas_peso|medidas_resolucion|epoca_corriente|medidas_minutaje|color|so
nido_cana
les|numero_normalizado|valor|fecha_creacion|medidas_longitud|es_firmado|
exactitud
_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|unidades_medidas|unidades
_peso|tip
o_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nombre_fo
rmato|son
ido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicacion))|(
(barcode)
:((\_[ABC]){0,1}(\((\d+)\)){0,1}(\{(codigo_doa|titulo_obra|titulo_altern
ativo|num
_serie|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|
medidas_d
iametro|lista_artistas|medidas_peso|medidas_resolucion|epoca_corriente|m
edidas_mi
nutaje|color|sonido_canales|numero_normalizado|valor|fecha_creacion|medi
das_longi
tud|es_firmado|exactitud_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|u
nidades_m
edidas|unidades_peso|tipo_formato|nombre_formato_imagen|arquitectura|hay
_sonido|sonido_nombre_forma
to|sonido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicaci
on)\}){1,
45}))/";

$subject = "<table width="100%"> <tr> <td class="field_label">Tipo de
objeto / Type of object: </td> <td class="field">field:tipologia_objeto
(field:tipo_obra)</td> </tr> <tr> <td class="field_label">Objeto de arte
registrado en AICOA / Work of art registered in AICOA: </td> <td
class="field">field:codigo_doa</td> </tr> <tr> <td
class="field_label">Título de la obra (Título alternativo) / Title
(Alternative title): </td> <td class="field">field:titulo_obra
(field:titulo_alternativo)</td> </tr> <tr> <td class="field_label">Autor
/ Author: </td> <td class="field">field:lista_artistas</td> </tr> <tr>
<td class="field_label">Fecha realización / Date or period: </td> <td
class="field">field:fecha_creacion (field:exactitud_fecha)</td> </tr>
<tr> <td class="field_label">Escuela, corriente estilística / School,
art movement: </td> <td class="field">field:epoca_corriente</td> </tr>
<tr> <td class="field_label">Datos de la serie / Serial Number: </td>
<td class="field">field:ejemplar_serie / field:ejemplares_obra -- serie:
field:num_serie</td> </tr> <tr> <td class="field_label">Características
del formato / Format characteristics: </td> <td
class="field">field:tipo_tecnica , resolución:
field:medidas_resolucion</td> </tr> <tr> <td class="field_label">Técnica
/ Technique:</td> <td class="field">field:nombre_tecnica</td> </tr> <tr>
<td class="field_label">Materiales-Soporte / Material-Support:</td> <td
class="field">field:materiales_soporte</td> </tr> <tr> <td
class="field_label">Medidas / dimensions:</td> <td class="field">
field:medidas_longitud field:unidades_medidas // Ø
field:medidas_diametro field:unidades_medidas // field:medidas_peso
field:unidades_peso </td> </tr> <tr> <td class="field_label">Firmado /
Signed</td> <td class="field">field:es_firmado</td> </tr> </table>"

The function fails with this message:

Warning: preg_match_all() [function.preg-match-all]: Compilation failed:
repeated subpattern is too long at offset 1153 in
/home/.../dcombs_controller.php on line 723

Warning: preg_match_all() [function.preg-match-all]: Compilation failed:
repeated subpattern is too long at offset 1153 in
/home/.../dcombs_controller.php on line 723

We don't know whether this is a bug, but we haven't seen any related
change in the 5.2.0 changelog. In case this is a variable size
limitation (1024?), the $pattern is 1158 characters long and the
$subject is 1738 characters.

Reproduce code:
---------------
See description.

Configure line: ./configure --prefix=/usr/local
--with-config-file-path=/usr/local/etc --with-apxs2=/usr/bin/apxs2
--with-mod_charset --with-openssl --with-kerberos --with-zlib
--enable-bcmath --with-bz2 --enable-calendar --with-curl
--with-curlwrappers --with-gd --with-ttf --enable-gd-native-ttf
--with-gettext --with-mcrypt --with-mysql --with-mysqli --with-snmp
--enable-wddx-with-xmlrpc --with-xsl --enable-sysvmsg --enable-sysvsem
--enable-sysvshm --with-freetype-dir --with-xml --with-libxml
--with-expat-dir --with-xmlrpc --enable-soap --enable-mbstring
--enable-mbstr-enc-trans --with-pgsql --with-tidy

We also tried the same configure line without --with-tidy.

Expected result:
----------------
preg_match_all() should work as it did on every PHP 5.x version
released so far (that is, all except 5.2.0).

Actual result:
--------------
Warning: preg_match_all() [function.preg-match-all]: Compilation failed:
repeated subpattern is too long at offset 1153 in
/home/.../dcombs_controller.php on line 723

Warning: preg_match_all() [function.preg-match-all]: Compilation failed:
repeated subpattern is too long at offset 1153 in
/home/.../dcombs_controller.php on line 723
[7 Nov 2006 4:47pm UTC] jordi at telematictraining dot com
The content of $pattern got extra line breaks where there shouldn't be
any, so here's the correct one:

$pattern="/((field):(codigo_doa|titulo_obra|titulo_alternativo|num_serie
|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|medida
s_diametro|lista_artistas|medidas_peso|medidas_resolucion|epoca_corrient
e|medidas_minutaje|color|sonido_canales|numero_normalizado|valor|fecha_c
reacion|medidas_longitud|es_firmado|exactitud_fecha|tipo_obra|tipologia_
objeto|tipo_tecnica|unidades_medidas|unidades_peso|tipo_formato|nombre_f
ormato_imagen|arquitectura|hay_sonido|sonido_nombre_formato|sonido_muest
reo|sonido_amplitud|sonido_idioma_original|pais_publicacion))|((barcode)
:((\_[ABC]){0,1}(\((\d+)\)){0,1}(\{(codigo_doa|titulo_obra|titulo_altern
ativo|num_serie|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales
_soporte|medidas_diametro|lista_artistas|medidas_peso|medidas_resolucion
|epoca_corriente|medidas_minutaje|color|sonido_canales|numero_normalizad
o|valor|fecha_creacion|medidas_longitud|es_firmado|exactitud_fecha|tipo_
obra|tipologia_objeto|tipo_tecnica|unidades_medidas|unidades_pe
so|tipo_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nom
bre_formato|sonido_muestreo|sonido_amplitud|sonido_idioma_original|pais_
publicacion)\}){1,45}))/"
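
The tracker still wraps the pattern above at a fixed width, which makes its structure hard to see. As a minimal sketch (not part of the original report), the same pattern can be assembled from the list of field names, which appears twice in identical order. The variable names `$fields` and `$alt` are assumptions introduced here for illustration.

```php
<?php
// Field names copied from the pattern above; the same 36 names appear
// in both the "field:" branch and the "barcode:" branch.
$fields = array(
    'codigo_doa', 'titulo_obra', 'titulo_alternativo', 'num_serie',
    'ejemplar_serie', 'ejemplares_obra', 'nombre_tecnica',
    'materiales_soporte', 'medidas_diametro', 'lista_artistas',
    'medidas_peso', 'medidas_resolucion', 'epoca_corriente',
    'medidas_minutaje', 'color', 'sonido_canales', 'numero_normalizado',
    'valor', 'fecha_creacion', 'medidas_longitud', 'es_firmado',
    'exactitud_fecha', 'tipo_obra', 'tipologia_objeto', 'tipo_tecnica',
    'unidades_medidas', 'unidades_peso', 'tipo_formato',
    'nombre_formato_imagen', 'arquitectura', 'hay_sonido',
    'sonido_nombre_formato', 'sonido_muestreo', 'sonido_amplitud',
    'sonido_idioma_original', 'pais_publicacion',
);

// Alternation shared by both branches: codigo_doa|titulo_obra|...
$alt = implode('|', $fields);

// Single quotes keep the backslashes exactly as written.
$pattern = '/((field):(' . $alt . '))'
         . '|((barcode):((\_[ABC]){0,1}(\((\d+)\)){0,1}'
         . '(\{(' . $alt . ')\}){1,45}))/';

echo strlen($pattern), "\n";
```

Written this way, the pattern contains no stray line breaks, and the part that is repeated up to 45 times (the whole field alternation wrapped in braces) is easy to spot.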
[8 Nov 2006 9:52am UTC] jordi at telematictraining dot com
Sorry for not posting the script earlier; here it is:

#!/usr/local/bin/php

<?php

/* Bug in the PHP 5.2.0 preg_match_all() function? */

$pattern =
"/((field):(codigo_doa|titulo_obra|titulo_alternativo|num_serie|"
. "ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|"
. "medidas_diametro|lista_artistas|medidas_peso|medidas_resolucion|"
. "epoca_corriente|medidas_minutaje|color|sonido_canales|"
. "numero_normalizado|valor|fecha_creacion|medidas_longitud|es_firmado|"
. "exactitud_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|"
. "unidades_medidas|unidades_peso|tipo_formato|nombre_formato_imagen|"
. "arquitectura|hay_sonido|sonido_nombre_formato|sonido_muestreo|"
. "sonido_amplitud|sonido_idioma_original|pais_publicacion))|((barcode):"
. "((\_[ABC]){0,1}(\((\d+)\)){0,1}(\{(codigo_doa|titulo_obra|"
. "titulo_alternativo|num_serie|ejemplar_serie|ejemplares_obra|"
. "nombre_tecnica|materiales_soporte|medidas_diametro|lista_artistas|"
. "medidas_peso|medidas_resolucion|epoca_corriente|medidas_minutaje|"
. "color|sonido_canales|numero_normalizado|valor|fecha_creacion|"
. "medidas_longitud|es_firmado|exactitud_fecha|tipo_obra|"
. "tipologia_objeto|tipo_tecnica|unidades_medidas|unidades_peso|"
. "tipo_formato|nombre_formato_imagen|arquitectura|hay_sonido|"
. "sonido_nombre_formato|sonido_muestreo|sonido_amplitud|"
. "sonido_idioma_original|pais_publicacion)\}){1,45}))/";

$subject = "<table width=\"100%\"> <tr> <td class=\"field_label\">Tipo
de objeto / Type of object: </td> <td
class=\"field\">field:tipologia_objeto (field:tipo_obra)</td> </tr> <tr>
<td class=\"field_label\">Objeto de arte registrado en AICOA / Work of
art registered in AICOA: </td> <td class=\"field\">field:codigo_doa</td>
</tr> <tr> <td class=\"field_label\">Título de la obra (Título
alternativo) / Title (Alternative title): </td> <td
class=\"field\">field:titulo_obra (field:titulo_alternativo)</td> </tr>
<tr> <td class=\"field_label\">Autor / Author: </td> <td
class=\"field\">field:lista_artistas</td> </tr> <tr> <td
class=\"field_label\">Fecha realización / Date or period: </td> <td
class=\"field\">field:fecha_creacion (field:exactitud_fecha)</td> </tr>
<tr> <td class=\"field_label\">Escuela, corriente estilística / School,
art movement: </td> <td class=\"field\">field:epoca_corriente</td> </tr>
<tr> <td class=\"field_label\">Datos de la serie / Serial Number: </td>
<td class=\"field\">field:ejemplar_serie / field:ejemplares_obra --
serie: field:num_serie</td> </tr> <tr> <td
class=\"field_label\">Características del formato / Format
characteristics: </td> <td class=\"field\">field:tipo_tecnica ,
resolución: field:medidas_resolucion</td> </tr> <tr> <td
class=\"field_label\">Técnica / Technique:</td> <td
class=\"field\">field:nombre_tecnica</td> </tr> <tr> <td
class=\"field_label\">Materiales-Soporte / Material-Support:</td> <td
class=\"field\">field:materiales_soporte</td> </tr> <tr> <td
class=\"field_label\">Medidas / dimensions:</td> <td class=\"field\">
field:medidas_longitud field:unidades_medidas // Ø
field:medidas_diametro field:unidades_medidas // field:medidas_peso
field:unidades_peso </td> </tr> <tr> <td class=\"field_label\">Firmado /
Signed</td> <td class=\"field\">field:es_firmado</td> </tr> </table>";

if(preg_match_all($pattern, $subject, $coincidences))
   print_r($coincidences);

?>
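
Note that the `if` in the script above cannot tell a failed call from a call that found nothing: preg_match_all() returns false on failure and 0 when there are no matches, and both are falsy. A minimal sketch of telling them apart follows; the helper name `run_match` and the two short test patterns are hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper: report a failure separately from "no matches".
function run_match($pattern, $subject)
{
    // @ hides the warning; the failure is still visible in the return value.
    $count = @preg_match_all($pattern, $subject, $coincidences);
    if ($count === false) {
        return 'error';
    }
    return $count === 0 ? 'no matches' : "$count matches";
}

// A valid pattern, then one with an unclosed parenthesis that cannot compile.
echo run_match('/(field):(\w+)/', 'field:color field:valor'), "\n";
echo run_match('/(unclosed/', 'field:color'), "\n";
```

With the strict `=== false` check, a compilation failure like the one reported here is handled explicitly instead of silently looking like an empty result.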
[8 Nov 2006 1:34pm UTC] tony2001@php.net
This is a limitation of the PCRE library, not PHP.
[8 Nov 2006 2:56pm UTC] jordi at telematictraining dot com
This failure didn't show up on PHP 5.1.2, 5.1.4 or 5.1.6, and we
don't recall any PCRE library update. The only thing we updated was
PHP, to version 5.2.0.

Besides, once the error showed up on 5.2.0, we went back to 5.1.6 and
the failure disappeared (it's not showing right now either, which
suggests the PCRE library is not the one at fault). We're also fairly
sure the error doesn't show on 5.1.2 or 5.1.4.

That's why we think this may be a PHP problem rather than a PCRE
library one.
[9 Nov 2006 8:54am UTC] tony2001@php.net
02 Nov 2006, PHP 5.2.0
- Updated PCRE to version 6.7. (Ilia)
[13 Nov 2006 10:09am UTC] jordi at telematictraining dot com
OK, thanks for pointing that out.

FYI for anyone interested in this issue: we've e-mailed the PCRE
developer to report the situation, but haven't received a response
yet.
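
For readers who hit the same "repeated subpattern is too long" error, one possible way around it, offered as a sketch under assumptions and not as a fix confirmed in this thread, is to keep the repeated group small and check the field names in PHP instead of inside the regex. The function name `find_barcodes` and the shortened `$fields` list are hypothetical, and this has not been tested against PCRE 6.7. Unlike the original pattern, it rejects a barcode token entirely if any of its names is unknown.

```php
<?php
// Shortened list for illustration; the real list has 36 field names.
$fields = array('codigo_doa', 'titulo_obra', 'tipo_obra');

// Hypothetical helper: match barcode tokens with a small generic pattern,
// then validate the names against $fields in PHP.
function find_barcodes($subject, array $fields)
{
    $found = array();
    // The repeated group is only \{[a-z_]+\}, not the full alternation.
    $pattern = '/barcode:(_[ABC])?(?:\((\d+)\))?((?:\{[a-z_]+\}){1,45})/';
    if (preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER) === false) {
        return false; // the call itself failed
    }
    foreach ($matches as $m) {
        // $m[3] holds the run of {name} groups; pull out the bare names.
        preg_match_all('/\{([a-z_]+)\}/', $m[3], $names);
        // Keep the token only if every name is a known field.
        if (count(array_diff($names[1], $fields)) === 0) {
            $found[] = array('token' => $m[0], 'fields' => $names[1]);
        }
    }
    return $found;
}

$result = find_barcodes(
    'x barcode:_A(12){codigo_doa}{tipo_obra} y barcode:{bogus}',
    $fields
);
print_r($result);
```

The design choice here is to trade one large pattern for a small pattern plus an array lookup, so the size of the field list no longer affects the size of the repeated group.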

