Bug #69203 FILTER_FLAG_STRIP_HIGH doesn't strip ASCII 127
Submitted: 2015-03-09 11:12 UTC Modified: 2015-04-21 11:11 UTC
From: Assigned: whatthejeff
Status: Closed Package: Filter related
PHP Version: 5.5Git-2015-03-09 (Git) OS:
Private report: No CVE-ID: None


 [2015-03-09 11:12 UTC]
FILTER_FLAG_STRIP_HIGH doesn't strip ASCII 127. This is inconsistent with FILTER_FLAG_ENCODE_HIGH which encodes ASCII 127 as expected.

Test script:
<?php
var_dump(filter_var("\x7f", FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_HIGH));

var_dump(filter_var("\x7f", FILTER_UNSAFE_RAW, FILTER_FLAG_ENCODE_HIGH));

Expected result:
string(0) ""
string(0) ""
string(0) ""
string(0) ""
string(6) "&#127;"
string(6) "&#127;"
string(3) "%7F"
string(6) "&#127;"

Actual result:
string(1) ""
string(1) ""
string(3) "%7F"
string(1) ""
string(6) "&#127;"
string(6) "&#127;"
string(3) "%7F"
string(6) "&#127;"


 [2015-03-14 08:20 UTC]
Automatic comment on behalf of
Log: Fix #69203: FILTER_FLAG_STRIP_HIGH doesn't strip ASCII 127
 [2015-03-14 08:20 UTC]
-Status: Open +Status: Closed
 [2015-04-20 11:26 UTC] dominic at mailinator dot com
Doesn't FILTER_FLAG_ENCODE_HIGH contain the error here? 127 is lower ASCII.
 [2015-04-20 11:52 UTC]
-Assigned To: +Assigned To: whatthejeff
 [2015-04-20 12:03 UTC]
ASCII only encodes 128 characters. I think LOW/HIGH was meant to reference the non-printable characters.
 [2015-04-20 12:25 UTC] dominic at mailinator dot com
That's a little confusing. Lots of people mean Extended ASCII when they refer to ASCII. Extended ASCII encodes 256 characters, with the additional 128 characters known as 'High ASCII'.

Given that the PHP documentation says that these filters encode/strip characters higher than 127 (which is now no longer true?), I suspect the filter creators and the documenters were referring to High ASCII.
 [2015-04-20 14:04 UTC] dominic at mailinator dot com
It looks like FILTER_FLAG_ENCODE_HIGH's behaviour changed in commit 7d7248390cb85f61150304bbdd3eace0a2023a86

with message:

Filter fixes:
Fixed possible double encoding problem with sanitizing filters
Make use of space-strict strip_tags() function

So it seems like an unintentional side-effect. Until then, both STRIP and ENCODE were for >127.
 [2015-04-21 08:23 UTC]
> So it seems like an unintentional side-effect. Until then, both STRIP and ENCODE were for >127.

I think there was only one PHP release that included this extension before that commit. Hard to say if it was intentional or not, but I guess we could ask Derick or Ilia if they remember.

My assumption is that these flags were intended to optionally strip/encode characters that don't fall within the range of printable ASCII characters (0x20-0x7e). It's true that the original code didn't strip/encode the DEL control character (0x7f), but I'm not sure if that was intentional or an oversight.
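
To make the "printable ASCII" reading concrete, here is a minimal sketch in plain PHP. It is not the filter extension's code; strip_non_printable() is a hypothetical helper written only for this illustration. It assumes that reading means "drop every byte outside 0x20-0x7e", which removes DEL (0x7f) along with low control bytes and bytes >= 0x80.

```php
<?php
// Hypothetical helper, not part of ext/filter: keeps only printable
// ASCII bytes (0x20-0x7e) and drops everything else, including DEL (0x7f).
function strip_non_printable($input)
{
    // Byte-wise match (no /u modifier), so each byte is tested on its own.
    return preg_replace('/[^\x20-\x7E]/', '', $input);
}

// DEL, a low control byte and a high byte are all removed under this reading.
var_dump(strip_non_printable("a\x7f\x1fb\x80") === "ab");
```

Under this reading 0x7f is treated like any other control character, which is the behaviour the fix for this bug moved towards.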
 [2015-04-21 08:56 UTC]
IIRC, they should strip / encode for >= 128 only.
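
For comparison, a minimal sketch of the ">= 128 only" rule in plain PHP. The helpers strip_high_only() and encode_high_only() are hypothetical names for this illustration and are not functions from the filter extension. The numeric entity form (&#NNN;) is assumed here because it matches the 6-character output shown in the report.

```php
<?php
// Hypothetical helpers, not part of ext/filter. Both act on bytes
// 0x80-0xff only, so DEL (0x7f) passes through untouched.
function strip_high_only($input)
{
    return preg_replace('/[\x80-\xFF]/', '', $input);
}

function encode_high_only($input)
{
    // Replace each high byte with a decimal numeric entity, e.g. 0xe9 -> &#233;
    return preg_replace_callback(
        '/[\x80-\xFF]/',
        function ($m) {
            return '&#' . ord($m[0]) . ';';
        },
        $input
    );
}

// 0x7f survives in both cases; only 0x80 and above are affected.
var_dump(strip_high_only("a\x7f\x80b") === "a\x7fb");
var_dump(encode_high_only("\x7f\xe9") === "\x7f&#233;");
```

With this rule the two flags stay consistent with each other and with the documented "greater than 127" wording.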
 [2015-04-21 10:44 UTC]
BTW, that's what the docs say too:
 [2015-04-21 11:11 UTC]
I think the intended behavior was a little ambiguous since the docs for FILTER_FLAG_ENCODE_HIGH have been out of sync with the implementation/tests since 5.2.1 (it seems).
 [2015-04-21 11:16 UTC]
I guess we should revert this and fix FILTER_FLAG_ENCODE_HIGH if they're not behaving as intended.
 [2015-04-21 11:42 UTC] dominic at mailinator dot com
That would get my vote, possibly with new filters FILTER_FLAG_[ENCODE|STRIP]_CONTROL_CODES.