Bug #12860 problem with strip_tags()
Submitted: 2001-08-20 09:45 UTC Modified: 2001-08-22 19:15 UTC
From: costrova at prdel dot cz Assigned:
Status: Closed Package: Strings related
PHP Version: 4.0.6 OS: Linux and Win
Private report: No CVE-ID: None
 [2001-08-20 09:45 UTC] costrova at prdel dot cz
When I strip tags from text exported from MS Word to HTML that contains "<?xml:namespace..." (see $string), I get only the text that precedes it.

example:
<?php

$string = <<<EOD
<BODY><P class=MsoNormal><B><U>I am hungry<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /></U></B></P> <P class=MsoNormal>I am really hungry<o:p></o:p></SPAN></BODY>  
EOD;

var_dump(strip_tags($string));

?>

output is:

string(11) "I am hungry"

 [2001-08-20 11:02 UTC] swm@php.net
This markup is not XHTML 1.0 compliant, from my reading.
That is, I'm pretty sure <?xml ...> must be the first line
of the document.

This, however, is not the reason why strip_tags is failing.
It is failing because it recognises '<?' (from <?xml ) as
the beginning of PHP code - not xml. This presents problems
with applying strip_tags to xhtml. The reason why this
probably hasn't been picked up is that correct (??) XML
declarations are of the form <?xml ... ?> - which should
not contain anything to be output anyway.

If you/anyone can show that this is valid we can work
around it (by demoting the PHP strip_tags state to an HTML state).
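The behaviour described above can be modelled with a small state machine. This is only a rough sketch, not the real strip_tags() implementation: the function name sketch_strip(), its $xmlAware flag, and the shortened sample string are all hypothetical, and the model ignores quoting, comments, and allowed tags. It shows how entering a "PHP" state at '<?' swallows everything up to a closing '?>' that never arrives, and how demoting '<?xml' to the HTML state would let the tag end at the next '>'.

```php
<?php
// Hypothetical, simplified model of a tag-stripping state machine.
// It is NOT the real strip_tags() code; names and the flag are assumptions.
function sketch_strip(string $input, bool $xmlAware): string
{
    $out = '';
    $state = 'text';            // one of 'text', 'html', 'php'
    $len = strlen($input);
    for ($i = 0; $i < $len; $i++) {
        $ch = $input[$i];
        if ($state === 'text') {
            if ($ch === '<') {
                $isPi  = substr($input, $i + 1, 1) === '?';
                $isXml = strtolower(substr($input, $i + 2, 3)) === 'xml';
                // With $xmlAware set, "<?xml" is demoted to an ordinary HTML tag.
                $state = ($isPi && !($xmlAware && $isXml)) ? 'php' : 'html';
            } else {
                $out .= $ch;    // plain text is copied through
            }
        } elseif ($state === 'html') {
            if ($ch === '>') {
                $state = 'text';
            }
        } else {
            // 'php' state: discard everything up to the closing marker
            if ($ch === '?' && substr($input, $i + 1, 1) === '>') {
                $state = 'text';
                $i++;           // skip the '>' of the closing marker
            }
        }
    }
    return $out;
}

// Shortened, hypothetical variant of the reporter's markup.
$sample = '<B>I am hungry<?xml:namespace prefix = o /></B> <P>I am really hungry</P>';

var_dump(sketch_strip($sample, false)); // string(11) "I am hungry"
var_dump(sketch_strip($sample, true));  // string(30) "I am hungry I am really hungry"
```

With the flag off, the model reproduces the truncated output from the report, because the '<?xml:namespace ... />' tag has no '?>' to leave the PHP state. With the flag on, the tag closes at '/>' and the rest of the text survives.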
 [2001-08-21 05:50 UTC] costrova at prdel dot cz
It is generated by MS Word and I can't affect it.
Although the markup isn't compliant, I think this is a bug: strip_tags() should strip all tags anywhere inside the string.
I can't check whether all tags in the string are valid.
To show whether this is valid, I would have to call Redmond :-)
 [2001-08-21 22:02 UTC] swm@php.net
Latest CVS now checks if <? is followed by 'xml'. If so,
it treats it just like HTML.

Check latest CVS to confirm that this works (www.php.net/downloads.php)

Gavin
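One way to confirm the fix on a given build is a small check like the one below. It is a sketch under assumptions: the sample string is a shortened variant of the reporter's markup, and the variable names and messages are made up for illustration. It only tests whether the text after the '<?xml' tag survives, rather than asserting an exact output.

```php
<?php
// Shortened variant of the markup from the report (an assumption, not the original string).
$string = '<B><U>I am hungry<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /></U></B> <P>I am really hungry</P>';

$stripped = strip_tags($string);

// If the text following the "<?xml" tag is still there, the build treats it like HTML.
$fixed = strpos($stripped, 'I am really hungry') !== false;

echo $fixed ? "fix present\n" : "text after <?xml was dropped\n";
```

On a build without the change, the second sentence would be expected to be missing, as in the original report.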
 [2001-08-22 19:15 UTC] cynic@php.net
erm, why does strip_tags() allow PHP tags in the first place? this doesn't look right.. (am I missing something?)
 