Bug #12860 problem with strip_tags()
Submitted: 2001-08-20 09:45 UTC Modified: 2001-08-22 19:15 UTC
From: costrova at prdel dot cz Assigned:
Status: Closed Package: Strings related
PHP Version: 4.0.6 OS: Linux and Win
Private report: No CVE-ID: None

 [2001-08-20 09:45 UTC] costrova at prdel dot cz
When text exported from MS Word to HTML contains "<?xml:namespace..." (see $string) and I try to strip the tags from it, I get only the text that precedes it.

example:
<?php

$string = <<<EOD
<BODY><P class=MsoNormal><B><U>I am hungry<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /></U></B></P> <P class=MsoNormal>I am really hungry<o:p></o:p></SPAN></BODY>  
EOD;

var_dump(strip_tags($string));

?>

output is:

string(11) "I am hungry"

 [2001-08-20 11:02 UTC] swm@php.net
This markup is not XHTML 1.0 compliant, from my reading.
That is, I'm pretty sure <?xml ...> must be the first line
of the document.

This, however, is not the reason why strip_tags is failing.
It is failing because it recognises '<?' (from <?xml ) as
the beginning of PHP code - not XML. This presents problems
when applying strip_tags to XHTML. The reason why this
probably hasn't been picked up is that correct (??) XML
declarations are of the form <?xml ... ?> - which should
not contain anything to be output anyway.

If you/anyone can show that this is valid we can work
around it (by demoting the PHP strip_tags state to an HTML state).
 [2001-08-21 05:50 UTC] costrova at prdel dot cz
It is generated by MS Word and I can't affect it.
Although it isn't compliant, I still think this is a bug. The strip_tags() function should strip all tags anywhere inside the string.
I can't check whether all tags in the string are valid.
If you want me to show that this is valid, I would have to call Redmond :-)
 [2001-08-21 22:02 UTC] swm@php.net
Latest CVS now checks if <? is followed by 'xml'. If so,
it treats it just like HTML.

Check latest CVS to confirm that this works (www.php.net/downloads.php)

Gavin
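
To illustrate the check described above, here is a minimal sketch of a tag stripper in PHP. It is not the actual strip_tags() source (which is written in C). The function name strip_tags_sketch() and the $xmlAware flag are hypothetical and exist only to contrast the old and new behaviour. The sketch assumes a tag simply ends at the next '>' and ignores quoting, comments, and the allowed-tags argument.

```php
<?php
// Illustrative sketch only: NOT the real strip_tags() implementation.
// strip_tags_sketch() and $xmlAware are hypothetical names.
function strip_tags_sketch(string $input, bool $xmlAware = true): string
{
    $out = '';
    $len = strlen($input);
    $i = 0;
    while ($i < $len) {
        if ($input[$i] !== '<') {
            $out .= $input[$i];   // plain text: copy through
            $i++;
            continue;
        }
        // A '<' followed by '?' is treated as the start of a PHP block,
        // which only ends at the closing PHP tag.
        $isPhp = substr($input, $i, 2) === '<?';
        // The fix: if 'xml' follows, demote to an ordinary HTML-style tag.
        if ($isPhp && $xmlAware && substr($input, $i, 5) === '<?xml') {
            $isPhp = false;
        }
        $terminator = $isPhp ? '?>' : '>';
        $end = strpos($input, $terminator, $i);
        if ($end === false) {
            break;                // unterminated: everything after is dropped
        }
        $i = $end + strlen($terminator);
    }
    return $out;
}
```

With the $string from the original report, strip_tags_sketch($string, false) stops at "I am hungry" as reported, because no closing PHP tag is ever found. With the check enabled, the text after the <?xml:namespace ... /> tag ("I am really hungry") is kept as well.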
 [2001-08-22 19:15 UTC] cynic@php.net
erm, why does strip_tags() allow PHP tags in the first place? this doesn't look right.. (am I missing something?)