Bug #39415 Compilation failure on preg_match_all()
Submitted: 2006-11-07 16:45 UTC Modified: 2010-12-20 12:20 UTC
From: jordi at telematictraining dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.2.0 OS: Debian GNU/Linux Stable
Private report: No CVE-ID: None
 [2006-11-07 16:45 UTC] jordi at telematictraining dot com
Description:
------------
Hi there,

What did we do? Updated PHP from 5.1.6 to 5.2.0.
What did we want to happen? We expected version 5.2.0 to behave like the previous ones (5.1.2, 5.1.4 and 5.1.6).
What actually happened? It didn't.

We have a PHP-based app that has been running since PHP 5.1.2. With the new version (5.2.0), however, there seems to be a problem with the preg_match_all() function.

The call is preg_match_all(string $pattern, string $subject, array $coincidences), and these are the values of the variables:

$pattern = "/((field):(codigo_doa|titulo_obra|titulo_alternativo|num_serie|ejemplar_serie|ej
emplares_obra|nombre_tecnica|materiales_soporte|medidas_diametro|lista_artistas|m
edidas_peso|medidas_resolucion|epoca_corriente|medidas_minutaje|color|sonido_cana
les|numero_normalizado|valor|fecha_creacion|medidas_longitud|es_firmado|exactitud
_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|unidades_medidas|unidades_peso|tip
o_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nombre_formato|son
ido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicacion))|((barcode)
:((\_[ABC]){0,1}(\((\d+)\)){0,1}(\{(codigo_doa|titulo_obra|titulo_alternativo|num
_serie|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|medidas_d
iametro|lista_artistas|medidas_peso|medidas_resolucion|epoca_corriente|medidas_mi
nutaje|color|sonido_canales|numero_normalizado|valor|fecha_creacion|medidas_longi
tud|es_firmado|exactitud_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|unidades_m
edidas|unidades_peso|tipo_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nombre_forma
to|sonido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicacion)\}){1,
45}))/";

$subject = "<table width="100%"> <tr> <td class="field_label">Tipo de objeto / Type of object: </td> <td class="field">field:tipologia_objeto (field:tipo_obra)</td> </tr> <tr> <td class="field_label">Objeto de arte registrado en AICOA / Work of art registered in AICOA: </td> <td class="field">field:codigo_doa</td> </tr> <tr> <td class="field_label">T?tulo de la obra (T?tulo alternativo) / Title (Alternative title): </td> <td class="field">field:titulo_obra (field:titulo_alternativo)</td> </tr> <tr> <td class="field_label">Autor / Author: </td> <td class="field">field:lista_artistas</td> </tr> <tr> <td class="field_label">Fecha realizaci?n / Date or period: </td> <td class="field">field:fecha_creacion (field:exactitud_fecha)</td> </tr> <tr> <td class="field_label">Escuela, corriente estil?stica / School, art movement: </td> <td class="field">field:epoca_corriente</td> </tr> <tr> <td class="field_label">Datos de la serie / Serial Number: </td> <td class="field">field:ejemplar_serie / field:ejemplares_obra -- serie: field:num_serie</td> </tr> <tr> <td class="field_label">Caracter?sticas del formato / Format characteristics: </td> <td class="field">field:tipo_tecnica , resoluci?n: field:medidas_resolucion</td> </tr> <tr> <td class="field_label">T?cnica / Technique:</td> <td class="field">field:nombre_tecnica</td> </tr> <tr> <td class="field_label">Materiales-Soporte / Material-Support:</td> <td class="field">field:materiales_soporte</td> </tr> <tr> <td class="field_label">Medidas / dimensions:</td> <td class="field"> field:medidas_longitud field:unidades_medidas // ? field:medidas_diametro field:unidades_medidas // field:medidas_peso field:unidades_peso </td> </tr> <tr> <td class="field_label">Firmado / Signed</td> <td class="field">field:es_firmado</td> </tr> </table>"

The function fails with this message:

Warning: preg_match_all() [function.preg-match-all]: Compilation failed: repeated subpattern is too long at offset 1153 in /home/.../dcombs_controller.php on line 723

Warning: preg_match_all() [function.preg-match-all]: Compilation failed: repeated subpattern is too long at offset 1153 in /home/.../dcombs_controller.php on line 723

We don't know if this is a bug, but we haven't seen any change related to this in the 5.2.0 changelog. In case this is a variable size limitation (of 1024?), the $pattern is 1158 characters long and the $subject is 1738 characters.


Reproduce code:
---------------
See description.

Configure line: ./configure --prefix=/usr/local --with-config-file-path=/usr/local/etc --with-apxs2=/usr/bin/apxs2 --with-mod_charset --with-openssl --with-kerberos --with-zlib --enable-bcmath --with-bz2 --enable-calendar --with-curl --with-curlwrappers --with-gd --with-ttf --enable-gd-native-ttf --with-gettext --with-mcrypt --with-mysql --with-mysqli --with-snmp --enable-wddx-with-xmlrpc --with-xsl --enable-sysvmsg --enable-sysvsem --enable-sysvshm --with-freetype-dir --with-xml --with-libxml --with-expat-dir --with-xmlrpc --enable-soap --enable-mbstring --enable-mbstr-enc-trans --with-pgsql --with-tidy

The other configure line we use is the same one without --with-tidy.


Expected result:
----------------
We expected preg_match_all() to work as it did on every 5.x PHP version to date (except 5.2.0).

Actual result:
--------------
Warning: preg_match_all() [function.preg-match-all]: Compilation failed: repeated subpattern is too long at offset 1153 in /home/.../dcombs_controller.php on line 723

Warning: preg_match_all() [function.preg-match-all]: Compilation failed: repeated subpattern is too long at offset 1153 in /home/.../dcombs_controller.php on line 723

 [2006-11-07 16:47 UTC] jordi at telematictraining dot com
The $pattern above was altered when posting (line breaks were inserted where there shouldn't be any), so here is the correct one:

$pattern="/((field):(codigo_doa|titulo_obra|titulo_alternativo|num_serie|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|medidas_diametro|lista_artistas|medidas_peso|medidas_resolucion|epoca_corriente|medidas_minutaje|color|sonido_canales|numero_normalizado|valor|fecha_creacion|medidas_longitud|es_firmado|exactitud_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|unidades_medidas|unidades_peso|tipo_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nombre_formato|sonido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicacion))|((barcode):((\_[ABC]){0,1}(\((\d+)\)){0,1}(\{(codigo_doa|titulo_obra|titulo_alternativo|num_serie|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|medidas_diametro|lista_artistas|medidas_peso|medidas_resolucion|epoca_corriente|medidas_minutaje|color|sonido_canales|numero_normalizado|valor|fecha_creacion|medidas_longitud|es_firmado|exactitud_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|unidades_medidas|unidades_peso|tipo_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nombre_formato|sonido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicacion)\}){1,45}))/"
 [2006-11-08 09:52 UTC] jordi at telematictraining dot com
Sorry for not posting the script; here it is:

#!/usr/local/bin/php

<?php

/* Bug in the PHP 5.2.0 preg_match_all() function? */

$pattern = "/((field):(codigo_doa|titulo_obra|titulo_alternativo|num_serie|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|medidas_diametro|lista_artistas|medidas_peso|medidas_resolucion|epoca_corriente|medidas_minutaje|color|sonido_canales|numero_normalizado|valor|fecha_creacion|medidas_longitud|es_firmado|exactitud_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|unidades_medidas|unidades_peso|tipo_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nombre_formato|sonido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicacion))|((barcode):((\_[ABC]){0,1}(\((\d+)\)){0,1}(\{(codigo_doa|titulo_obra|titulo_alternativo|num_serie|ejemplar_serie|ejemplares_obra|nombre_tecnica|materiales_soporte|medidas_diametro|lista_artistas|medidas_peso|medidas_resolucion|epoca_corriente|medidas_minutaje|color|sonido_canales|numero_normalizado|valor|fecha_creacion|medidas_longitud|es_firmado|exactitud_fecha|tipo_obra|tipologia_objeto|tipo_tecnica|unidades_medidas|unidades_peso|tipo_formato|nombre_formato_imagen|arquitectura|hay_sonido|sonido_nombre_formato|sonido_muestreo|sonido_amplitud|sonido_idioma_original|pais_publicacion)\}){1,45}))/";

$subject = "<table width=\"100%\"> <tr> <td class=\"field_label\">Tipo de objeto / Type of object: </td> <td class=\"field\">field:tipologia_objeto (field:tipo_obra)</td> </tr> <tr> <td class=\"field_label\">Objeto de arte registrado en AICOA / Work of art registered in AICOA: </td> <td class=\"field\">field:codigo_doa</td> </tr> <tr> <td class=\"field_label\">T?tulo de la obra (T?tulo alternativo) / Title (Alternative title): </td> <td class=\"field\">field:titulo_obra (field:titulo_alternativo)</td> </tr> <tr> <td class=\"field_label\">Autor / Author: </td> <td class=\"field\">field:lista_artistas</td> </tr> <tr> <td class=\"field_label\">Fecha realizaci?n / Date or period: </td> <td class=\"field\">field:fecha_creacion (field:exactitud_fecha)</td> </tr> <tr> <td class=\"field_label\">Escuela, corriente estil?stica / School, art movement: </td> <td class=\"field\">field:epoca_corriente</td> </tr> <tr> <td class=\"field_label\">Datos de la serie / Serial Number: </td> <td class=\"field\">field:ejemplar_serie / field:ejemplares_obra -- serie: field:num_serie</td> </tr> <tr> <td class=\"field_label\">Caracter?sticas del formato / Format characteristics: </td> <td class=\"field\">field:tipo_tecnica , resoluci?n: field:medidas_resolucion</td> </tr> <tr> <td class=\"field_label\">T?cnica / Technique:</td> <td class=\"field\">field:nombre_tecnica</td> </tr> <tr> <td class=\"field_label\">Materiales-Soporte / Material-Support:</td> <td class=\"field\">field:materiales_soporte</td> </tr> <tr> <td class=\"field_label\">Medidas / dimensions:</td> <td class=\"field\"> field:medidas_longitud field:unidades_medidas // ? field:medidas_diametro field:unidades_medidas // field:medidas_peso field:unidades_peso </td> </tr> <tr> <td class=\"field_label\">Firmado / Signed</td> <td class=\"field\">field:es_firmado</td> </tr> </table>";

if(preg_match_all($pattern, $subject, $coincidences))
   print_r($coincidences);

?>
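
Note on the script above: `if(preg_match_all(...))` treats a compilation failure (return value false) the same way as "no matches" (return value 0). A minimal sketch of how the two cases could be told apart follows; the helper name count_matches() is hypothetical and not part of the reported application.

```php
<?php
// Hypothetical helper: returns the number of matches, or null when
// preg_match_all() itself fails (for example, the pattern does not compile).
function count_matches($pattern, $subject)
{
    // "@" hides the PHP warning here so the caller can decide how to report it.
    $result = @preg_match_all($pattern, $subject, $matches);
    if ($result === false) {
        return null; // failure, not "zero matches"
    }
    return $result;
}

$count = count_matches('/field:[a-z_]+/', 'field:tipo_obra field:valor');
if ($count === null) {
    echo "pattern could not be used\n";
} else {
    echo "matches: ", $count, "\n";
}
```

With a check like this, a pattern that stops compiling after an upgrade is reported as a failure instead of silently looking like an empty result.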
 [2006-11-08 13:34 UTC] tony2001@php.net
This is a limitation of the PCRE library, not PHP.
 [2006-11-08 14:56 UTC] jordi at telematictraining dot com
This failure did not appear on PHP 5.1.2, 5.1.4 or 5.1.6, and we cannot remember any PCRE library update. The only thing we updated was PHP, to version 5.2.0.

Besides, once the error showed up on PHP 5.2.0, we went back to 5.1.6 and the failure disappeared (and it still does not occur now, which we would not expect if the PCRE library were the cause). We are also fairly sure the error does not occur on 5.1.2 or 5.1.4.

That's why we think this may be a PHP problem rather than a PCRE library one.
 [2006-11-09 08:54 UTC] tony2001@php.net
02 Nov 2006, PHP 5.2.0
- Updated PCRE to version 6.7. (Ilia)
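
Since PHP ships with its own copy of PCRE, a PHP upgrade can change the PCRE version without any separate library update. A small sketch for checking which PCRE version a given PHP build reports, using the predefined PCRE_VERSION constant; the helper name pcre_version_number() is made up for this example.

```php
<?php
// PCRE_VERSION is a predefined PHP constant: the version of the PCRE
// library in use, as a string (version number followed by a release date).
// Hypothetical helper: extracts just the leading "major.minor" part.
function pcre_version_number($versionString = PCRE_VERSION)
{
    if (preg_match('/^\d+\.\d+/', $versionString, $m)) {
        return $m[0];
    }
    return null; // unexpected format
}

echo 'PHP ', PHP_VERSION, ' reports PCRE ', pcre_version_number(), "\n";
```

Running this on both the old and the new PHP installation shows whether the regex engine changed between them.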
 [2006-11-13 10:09 UTC] jordi at telematictraining dot com
Ok, thanks for pointing that out.

Just FYI for anyone interested in this issue: we've e-mailed the PCRE developer to report this situation, although we haven't received a response yet.
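
For anyone hitting the same warning, one possible workaround is to keep the repeated group short. The sketch below is not the reporter's code: it assumes field names consist only of lowercase letters and underscores, matches each {name} with a generic character class instead of the long alternation, and checks the names against a list afterwards. The function find_barcodes() and the abbreviated $knownFields list are hypothetical.

```php
<?php
// Abbreviated list for illustration only; the real pattern lists many more names.
$knownFields = array('codigo_doa', 'titulo_obra', 'tipo_obra');

// Hypothetical helper: finds "barcode:..." tokens whose {name} parts are all known.
function find_barcodes($subject, array $knownFields)
{
    // The group repeated by {1,45} is now a short character class, so the
    // quantified subpattern stays small regardless of how many names exist.
    $pattern = '/barcode:(_[ABC])?(\((\d+)\))?((?:\{[a-z_]+\}){1,45})/';
    $valid = array();
    if (preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER) === false) {
        return $valid; // the pattern could not be used
    }
    foreach ($matches as $m) {
        // $m[4] holds the "{a}{b}..." run; split it into individual names.
        preg_match_all('/\{([a-z_]+)\}/', $m[4], $names);
        // Keep the token only if every name is in the known list.
        if (count(array_diff($names[1], $knownFields)) === 0) {
            $valid[] = array('match' => $m[0], 'fields' => $names[1]);
        }
    }
    return $valid;
}

print_r(find_barcodes('barcode:_A(12){codigo_doa}{tipo_obra}', $knownFields));
```

Note that this is slightly stricter than the original pattern: a token containing an unknown name is rejected as a whole, whereas the original alternation would match up to the last known name.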
 [2010-12-20 12:20 UTC] jani@php.net
-Package: Tidy +Package: PCRE related
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Mar 29 01:01:28 2024 UTC