Doc Bug #45675 The documentation for ord() is inaccurate
Submitted: 2008-08-01 14:39 UTC Modified: 2008-08-04 13:01 UTC
From: mtesta at money-media dot com Assigned:
Status: Not a bug Package: Documentation problem
PHP Version: Irrelevant OS: Windows XP
Private report: No CVE-ID: None
 [2008-08-01 14:39 UTC] mtesta at money-media dot com
Description:
------------
The documentation for ord() is inaccurate. From my tests, ord() returns the ANSI (Windows-1252) value of the character. Below is a simple test to confirm this. "š" is not in the ASCII character set, but the value of "š" in the ANSI character set is 154.

Reproduce code:
---------------
echo ord('š');

Actual result:
--------------
154

 [2008-08-04 12:42 UTC] rquadling@php.net
At the windows command line ...

php -r "echo ord('š');"

I get 154.

If the file is encoded as Windows-1252, I get 154.

The character is Latin small letter s with caron.

If the file is encoded as UTF-8, I get 197, because ord() works on binary strings: š is stored as the bytes 0xC5 0xA1, and 0xC5 = 197. (The character is U+0161 according to Charmap.)

In PHP 5, ord() currently works on binary strings. So, depending on the encoding used, the binary string may start with whatever byte that encoding produces.
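
As a rough illustration of the byte-level behaviour described above, the sketch below writes š as explicit byte escapes, so the result does not depend on how the script file itself is saved. The variable names are made up for this example.

```php
<?php
// The same character, š, as explicit bytes in two encodings.
$utf8   = "\xC5\xA1"; // UTF-8 encoding of U+0161 (two bytes)
$cp1252 = "\x9A";     // Windows-1252 encoding of the same character (one byte)

// ord() only ever looks at the first byte of the string.
$utf8First   = ord($utf8);
$cp1252First = ord($cp1252);

echo $utf8First, "\n";   // 197
echo $cp1252First, "\n"; // 154
echo strlen($utf8), " ", strlen($cp1252), "\n"; // 2 1
```

The two inputs are different byte strings, which is why the two results differ.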

There is a user note on the ord() manual page about getting the value of a Unicode "character".

This returns a value of 353, which, as if by magic, is 0x0161 in hexadecimal.
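
The user note itself is not reproduced in this report. The following is only a minimal sketch of how such a calculation could look. The helper name utf8_first_code_point is hypothetical, and the code assumes a non-empty, well-formed UTF-8 string with no validation.

```php
<?php
// Hypothetical helper (not the user note's code): decode the first UTF-8
// sequence of $s into its Unicode code point.
// Assumes $s is non-empty and well-formed UTF-8; nothing is validated.
function utf8_first_code_point($s)
{
    $b0 = ord($s[0]);
    if ($b0 < 0x80) {
        return $b0;                 // 1 byte:  0xxxxxxx
    }
    if ($b0 < 0xE0) {
        $len = 2; $cp = $b0 & 0x1F; // 2 bytes: 110xxxxx
    } elseif ($b0 < 0xF0) {
        $len = 3; $cp = $b0 & 0x0F; // 3 bytes: 1110xxxx
    } else {
        $len = 4; $cp = $b0 & 0x07; // 4 bytes: 11110xxx
    }
    // Each continuation byte (10xxxxxx) contributes its low 6 bits.
    for ($i = 1; $i < $len; $i++) {
        $cp = ($cp << 6) | (ord($s[$i]) & 0x3F);
    }
    return $cp;
}

$codePoint = utf8_first_code_point("\xC5\xA1"); // UTF-8 bytes of š
echo $codePoint, "\n";         // 353
echo dechex($codePoint), "\n"; // 161
```

On PHP 7.2 and later with the mbstring extension, mb_ord() should give the same code point without a hand-written decoder.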

 [2008-08-04 12:47 UTC] mtesta at money-media dot com
That's exactly my point. The PHP documentation (http://us.php.net/manual/en/function.ord.php) says: "Returns the ASCII value of the first character of string". This is not true, as we have both shown, and should be updated appropriately.
 [2008-08-04 12:54 UTC] rquadling@php.net
The only thing that is potentially missing is that you will get a different response depending upon the encoding of the source file.

If you use UTF-8, the first byte (which in PHP 5 and lower is synonymous with a character) has a value of 197.

If you are using Windows-1252, then the character has a value of 154.

Both are correct as they are in effect different strings.

If you need UTF-8 string processing, the functions mentioned in the user notes, as well as other multi-byte string libraries, are available.


 [2008-08-04 13:01 UTC] mtesta at money-media dot com
I'm not talking about the user notes or about the accuracy of the function. The function works absolutely fine. In fact, I'm glad it works the way it does. I'm just saying that the documentation is misleading. The ASCII character set only goes from 0 to 127. When I read the documentation and it tells me that ord() returns the ASCII value, that tells me to expect a return value only between 0 and 127 (or 32 to 126 for printable characters).

I have no problems with the way the ord() function works. It is only that the documentation does not reflect what the function does.
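
To make the range point concrete, here is a small sketch (variable names are illustrative) that feeds every possible single byte through ord() and records the smallest and largest values returned.

```php
<?php
// Pass each possible byte value through chr() and back through ord().
$min = PHP_INT_MAX;
$max = -1;
for ($i = 0; $i <= 255; $i++) {
    $v = ord(chr($i));
    $min = min($min, $v);
    $max = max($max, $v);
}
echo $min, " ", $max, "\n"; // 0 255

// The Windows-1252 byte for š lies outside the 0-127 ASCII range.
$isAscii = ord("\x9A") <= 127;
var_dump($isAscii); // bool(false)
```

So the return value covers the full byte range 0 to 255, not only the 0 to 127 that "ASCII value" suggests.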