Request #43402 FILTER_VALIDATE_EMAIL is not RFC2822 compliant
Submitted: 2007-11-25 22:22 UTC
Modified: 2009-12-11 16:18 UTC
Votes: 17
Avg. Score: 4.4 ± 0.8
Reproduced: 14 of 16 (87.5%)
Same Version: 6 (42.9%)
Same OS: 10 (71.4%)
From: nobody at example dot org
Assigned:
Status: No Feedback
Package: Feature/Change Request
PHP Version: 5.2.5
OS: *
Private report: No
CVE-ID: None
 [2007-11-25 22:22 UTC] nobody at example dot org
Description:
------------
The regex used in php_filter_validate_email does not permit all valid atom characters from RFC 2822 (e.g. ASCII 61 '=' and ASCII 63 '?').

Reproduce code:
---------------
<?php

$valid="!#$%&'*+-/=.?^_`{|}~@[1.0.0.127]";

echo filter_var($valid, FILTER_VALIDATE_EMAIL)? 'Valid': 'Invalid', "\n";


Expected result:
----------------
Valid

Actual result:
--------------
Invalid
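
For reference, the dot-atom rule the report relies on can be sketched in user-land PHP. This is only an illustration of the RFC 2822 grammar (1*atext *("." 1*atext)), not the code used by ext/filter; the function name is_rfc2822_dot_atom is hypothetical, and quoted-string local parts are deliberately ignored.

```php
<?php
// Hypothetical helper, not part of ext/filter: checks a local part against
// the RFC 2822 dot-atom-text grammar, 1*atext *("." 1*atext).
function is_rfc2822_dot_atom($local)
{
    // atext: letters, digits and the 19 printable specials from RFC 2822
    // section 3.2.4, including "=" (ASCII 61) and "?" (ASCII 63).
    $atext = 'A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-';

    // One or more atext runs separated by single dots; /D stops "$" from
    // matching before a trailing newline.
    return preg_match('/^[' . $atext . ']+(?:\.[' . $atext . ']+)*$/D', $local) === 1;
}

var_dump(is_rfc2822_dot_atom('!#$%&\'*+-/=.?^_`{|}~')); // bool(true)
var_dump(is_rfc2822_dot_atom('.fails'));                 // bool(false)
```

The local part from the reproduce code passes because both dot-separated runs consist only of atext characters, while a leading dot leaves an empty first run and is rejected.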

 [2007-11-25 23:46 UTC] nobody at example dot org
--TEST--
RFC2822 conformance for local atoms
--SKIPIF--
<?php if (!extension_loaded("filter")) die("skip"); ?>
--FILE--
<?php
	$var = "!#$%&'*+-/=.?^_`{|}~@[1.0.0.127]";
	var_dump(filter_var($var, FILTER_VALIDATE_EMAIL));
?>
--EXPECT--	
bool(true)


# Apologies for bug spam
 [2007-11-26 11:34 UTC] nobody at example dot org
I may be missing something about the unit tests: the following regex update to php_filter_validate_email() does not pass my test case (after running rm ext/filter/tests/*.o ext/filter/tests/*.lo, clearing the .out, .log, .exp and .diff files from tests, and running make; make test).

const char regexp[] = "/^((\\\"[^\\\"\\f\\n\\r\\t\\b]+\\\")|([\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}\\=\\?]+(\\.[\\w\\!\\#\\$\\%\\&\\'\\*\\+\\-\\~\\/\\^\\`\\|\\{\\}\\=\\?]+)*))@((\\[(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))\\])|(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))|((([A-Za-z0-9\\-])+\\.)+[A-Za-z\\-]+))$/D";

Yet the equivalent regex works as expected in both PHP and my patched install.

<?php
error_reporting(E_ALL|E_STRICT);

function validate_email($_)
{
  /* Original from PEAR QuickForm Email.php rev: 1.4 */
  $r = '/^((\"[^\"\f\n\r\t\v\b]+\")|([\w\!\#\$\%\&\'\*\+\-\~\/\^\`\|\{\}\=\?]+(\.[\w\!\#\$\%\&\'\*\+\-\~\/\^\`\|\{\}\=\?]+)*))@((\[(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))\])|(((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9]))\.((25[0-5])|(2[0-4][0-9])|([0-1]?[0-9]?[0-9])))|((([A-Za-z0-9\-])+\.)+[A-Za-z\-]+))$/D';

  return (bool) preg_match($r, $_);

}


$test = array('nobody@example.org'=>true,
              '.fails@example.org'=>false,
              "!#$%&'*+-/=.?^_`{|}~@[1.0.0.127]"=>true,
              );

$failed = 0;
$fail = array();

foreach ($test as $k => $v){
  if (!(validate_email($k) === $v)){
    $failed++;
    $fail[] = $k;
  }
}

if ($failed > 0){
  echo "Failed $failed of ",count($test), " tests using PHP func\n";
  print_r($fail);
}
$failed = 0;
$fail = array();

foreach ($test as $k => $v){
  if (!((bool)filter_var($k, FILTER_VALIDATE_EMAIL) == (bool)$v)){
    $failed++;
    $fail[] = $k;
  }
}

if ($failed > 0){
  echo "Failed $failed of ",count($test), " tests using filter func\n";
  print_r($fail);
}
 [2007-11-26 14:23 UTC] nobody at example dot org
Updated test: php_filter_validate_email() returns the string on success. Surely bool would be a more appropriate return value for a logic filter?

The updated regex above fixes the specific issue I was having, but I'm uncertain about other edge cases (e.g. user\@site@example.org).

--TEST--
Bug 43402, RFC2822 allows chars (?, =) in dot-atoms
--SKIPIF--
<?php if (!extension_loaded("filter")) die("skip"); ?>
--FILE--
<?php

$var="!#$%&'*+-/=.?^_`{|}~@[1.0.0.127]";
var_dump((bool)filter_var($var, FILTER_VALIDATE_EMAIL));
?>
--EXPECT--      
bool(true)
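
A minimal sketch of how a caller can get a strict boolean today: filter_var() returns the filtered string on success and false on failure, so comparing against false avoids relying on the cast. The wrapper name is_valid_email is hypothetical.

```php
<?php
// Hypothetical wrapper: filter_var() yields the filtered string on success
// and false on failure, so a strict comparison gives a real bool.
function is_valid_email($address)
{
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_valid_email('nobody@example.org')); // bool(true)
var_dump(is_valid_email('not an address'));     // bool(false)
```

For this filter the (bool) cast in the test above gives the same answer, since a validated address is never an empty string or "0"; the strict comparison just makes the intent explicit.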
 [2008-09-16 19:37 UTC] drewish at katherinehouse dot com
The current code also bounces valid email addresses like "foo@localhost". I haven't been able to test out the suggested regex.
 [2008-09-16 20:00 UTC] matt at mattfarina dot com
Please correct me if I'm wrong, but isn't localhost an alias, and doesn't RFC 2822 require a fully qualified domain name or IP address? That would be the issue with foo@localhost.
 [2008-09-17 12:41 UTC] matt at mattfarina dot com
RFC 2822 allows for email addresses like user@localhost or user@example, but RFC 2821 (the SMTP standard) does not. See sections 4.1.2 and 4.1.3 for more detail.

The question with email addresses is whether we should support RFC 2822 or RFC 2821. For routing, FILTER_VALIDATE_EMAIL currently follows RFC 2821.
 [2008-09-22 16:01 UTC] nobody at example dot org
I see no reason support for hostnames can't be added, for example behind a new opt-in flag:

 filter_var ($addr, FILTER_VALIDATE_EMAIL, FILTER_PERMIT_NON_FQDNS);

That's fine on a LAN and the additional flag stops web miscreants doing what would, if this were the default behaviour, otherwise be inevitable.
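
Until something like that flag exists, the opt-in behaviour can be approximated in user-land. The following is only a sketch under stated assumptions: the function validate_email_host and its $permitNonFqdn parameter are hypothetical, FILTER_PERMIT_NON_FQDNS does not exist in ext/filter, and the single-label hostname pattern is an assumption about what "non-FQDN" should mean.

```php
<?php
// Hypothetical user-land stand-in for the proposed flag; neither this
// function nor FILTER_PERMIT_NON_FQDNS exists in ext/filter.
function validate_email_host($address, $permitNonFqdn = false)
{
    // Anything the built-in filter already accepts is accepted here too.
    if (filter_var($address, FILTER_VALIDATE_EMAIL) !== false) {
        return true;
    }
    if (!$permitNonFqdn) {
        return false;
    }

    // Split on the last "@" so that a quoted local part may contain one.
    $at = strrpos($address, '@');
    if ($at === false) {
        return false;
    }
    $local = substr($address, 0, $at);
    $host  = substr($address, $at + 1);

    // Assumed rule for a bare hostname: one label of letters, digits and
    // inner hyphens, at most 63 characters.
    if (preg_match('/^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/D', $host) !== 1) {
        return false;
    }

    // Reuse the built-in filter for the local part by pairing it with a
    // placeholder fully qualified domain.
    return filter_var($local . '@example.org', FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(validate_email_host('foo@localhost', true));
var_dump(validate_email_host('.fails@localhost', true));
```

Keeping the permissive branch behind an explicit argument mirrors the point above: bare hostnames are only accepted when the caller asks for them.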

Back on topic, FILTER_VALIDATE_EMAIL validates nothing. It fails to ensure an address is syntactically valid. 

<?php

function _ ($_, $inv = false) {
  $bool = (filter_var ($_, FILTER_VALIDATE_EMAIL) === $_);
  echo (($inv)? !$bool: $bool)? 'OK  ': 'ERR ', "$_\n";
}


// RFC2821
// 4.1.2
// Should pass
_ ('escaped\"quote@example.org');

// 4.5.3.1
// should both fail
_ ('this-local-part-is-over-64-chars-in-length-'
  .'and-therefore-not-valid@na.tld', true);
_ ('test@'.str_repeat('d', 256).'.com', true);

// RFC2822 ('=' and '?' still fail as of PHP 5.3.0alpha3-dev) 
_ ("!#$%&'*+-/=.?^_`{|}~@[1.0.0.127]");
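
The two length cases above can be checked independently of the regex. A minimal sketch, assuming the RFC 2821 section 4.5.3.1 limits of 64 characters for the local part and 255 for the domain; the function name within_rfc2821_lengths is hypothetical and the check says nothing about syntax.

```php
<?php
// Hypothetical pre-check for the RFC 2821 section 4.5.3.1 size limits:
// local part at most 64 characters, domain at most 255.
function within_rfc2821_lengths($address)
{
    // Split on the last "@"; everything before it is the local part.
    $at = strrpos($address, '@');
    if ($at === false) {
        return false; // no "@", so there is nothing to measure
    }

    $localLength  = $at;
    $domainLength = strlen($address) - $at - 1;

    return $localLength >= 1 && $localLength <= 64
        && $domainLength >= 1 && $domainLength <= 255;
}

var_dump(within_rfc2821_lengths('nobody@example.org'));                  // bool(true)
var_dump(within_rfc2821_lengths('test@' . str_repeat('d', 256) . '.com')); // bool(false)
```

A check like this would run alongside the syntax validation rather than replace it.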
 [2008-10-01 10:16 UTC] alexanderpas at yahoo dot co dot uk
RFC5322 is out, which obsoletes RFC2822
http://tools.ietf.org/html/rfc5322
 [2009-04-06 15:59 UTC] damien at mc-kenna dot com
An extremely detailed analysis of the various RFC requirements and 
errata has been compiled, along with CPAL 1.0-licensed code:
http://www.dominicsayers.com/isemail/
If PHP is going to bother having any support for email validation it 
needs to be authoritative rather than "well this should work for most 
uses".
 [2009-04-20 14:29 UTC] dominic dot sayers at gmail dot com
The 228 test cases I collated might help determine which approach to 
follow in resolving this. They can also be found at 
http://www.dominicsayers.com/isemail

Both the test cases and the validation code are now GPL licensed at 
Damien's request.
 [2009-04-20 14:43 UTC] pajoye@php.net
hi,

We can't include GPL code. Is it possible to provide them under BSD?
 [2009-04-28 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2009-10-06 07:27 UTC] hickseydr at optusnet dot com dot au
pajoye: this is just a heads up that Dominic Sayers' is_email code at 
http://code.google.com/p/isemail/source/browse/trunk/is_email.php is 
now licensed under a BSD license.
 [2009-11-10 21:55 UTC] adminekb at mail dot ru
And what about this bug? Can anyone fix it?
 [2009-12-11 16:18 UTC] hm2k@php.net
Please review this article with regards to this issue -> http://www.hm2k.com/posts/what-is-a-valid-email-address
 