[1999-11-14 03:47 UTC] joey at cvs dot php dot net
[2000-05-30 19:18 UTC] rasmus at cvs dot php dot net
Demo script:

    <?php
    Header("Content-type: text/plain");
    $data = 'HREF="blah.blah">test</A> inside <A HREF="brackets.com">brackets</A>. What\'s it gonna do?';
    $data = strip_tags($data);
    echo "$data\n";
    ?>

Output:

    HREF="blah.blah"test inside brackets. What's it gonna do?

Config:

    ./configure --prefix=/www --with-apache=../apache_1.3.3 --with-mysql --with-imap --with-zlib --with-config-file-path --enable-debug=yes --enable-track-vars=yes --enable-magic-quotes=yes --enable-memory-limit=yes

php.ini not relevant.

When doing "one line at a time" stripping, the state engine simply removes any extraneous > signs. When I wrote a similar function to handle individual lines of HTML (no multi-line processing), it set a boolean as soon as it saw a < sign. If it saw a > before it had ever seen a <, the function logic "assumed" that everything leading up to the > was HTML and removed it. Worked like a champ.

Something else, although this is purely aesthetic. After a >, when the state engine goes back to zero, it should plunk a space into the spot vacated by the removed HTML if the next character is not a whitespace character or a less-than sign (<). Otherwise this little test program:

    <?php
    Header("Content-type: text/plain");
    $data = '<TABLE BORDER=0><TR><TD>Hi there</TD></TR><TD>Ooops</TD></TR></TABLE>';
    $data = strip_tags($data);
    echo "$data\n";
    ?>

Results in this:

    Hi thereOoops

Something like this should fix that (I think):

    case '>':
        if (state == 1) {
            /* Pad with a space unless the next character is '<' or
               whitespace; isspace() needs <ctype.h>. */
            if (*(p+1) != '<' && !isspace((unsigned char)*(p+1))) {
                *(rp++) = ' ';
            }
            lc = '>';
            state = 0;
        } else if (state == 2) {
            if (!br && lc != '\"' && *(p-1) == '?') {
                state = 0;
            }
        }
        break;
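
For anyone who wants to try both ideas outside the PHP source tree, here is a rough standalone sketch of a one-line stripper. It is not PHP's strip_tags(): the function name strip_line_sketch and the flags in_tag and seen_bracket are made up for illustration, and it ignores quotes, comments and <? ?> blocks. It also adds two guards that are not in the patch above (no padding at the very start of the output, and none right after whitespace that was already emitted), which are my own assumptions to avoid leading and doubled spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of PHP. Copies `in` to `out` with tags
 * removed. `out` must hold at least strlen(in) + 1 bytes; the result is
 * never longer than the input because each padding space replaces a '>'.
 */
void strip_line_sketch(const char *in, char *out)
{
    const char *p;
    char *rp = out;
    int in_tag = 0;        /* 1 while between '<' and '>' */
    int seen_bracket = 0;  /* set once any '<' or '>' has been seen */

    assert(in != NULL && out != NULL);

    for (p = in; *p != '\0'; p++) {
        if (*p == '<') {
            in_tag = 1;
            seen_bracket = 1;
        } else if (*p == '>') {
            /* Markup was removed if we were inside a tag, or if this is
               the first bracket on the line (tail of a tag opened earlier). */
            int removed = in_tag || !seen_bracket;

            if (!in_tag && !seen_bracket) {
                rp = out;  /* discard everything copied so far */
            }
            if (removed && rp > out
                    && !isspace((unsigned char)rp[-1])
                    && p[1] != '\0' && p[1] != '<'
                    && !isspace((unsigned char)p[1])) {
                *rp++ = ' ';  /* fill the gap left by the removed tag */
            }
            in_tag = 0;
            seen_bracket = 1;
        } else if (!in_tag) {
            *rp++ = *p;
        }
    }
    *rp = '\0';
}
```

Fed the two test strings from this report, the sketch gives "test inside brackets . What's it gonna do?" and "Hi there Ooops". Note the space before the period in the first result: padding after every closing tag that is followed by a non-space character has that side effect, so the rule may need an exception for punctuation.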