Doc Bug #30801	imagettftext 'text' parameter wrongly describes Unicode character entities
Submitted:	2004-11-15 20:44 UTC	Modified:	2004-12-29 12:20 UTC
From:	andy at andyh dot co dot uk	Assigned:
Status:	Closed	Package:	Documentation problem
PHP Version:	5.0.2	OS:	n/a
Private report:	No	CVE-ID:	None

[2004-11-15 20:44 UTC] andy at andyh dot co dot uk

Description:
------------
The description of the 'text' parameter in imagettftext wrongly describes decimal numeric character references as "UTF-8 character sequences", whereas the function actually accepts both UTF-8 character sequences and decimal numeric character references.

http://uk2.php.net/imagettftext
"
text
The text string.

May include any UTF-8 character sequences (of the form: &#123;) to access characters in a font beyond the first 255
"

&#123; is not a UTF-8 character sequence; it is a decimal numeric character reference as per section 5.3.1 of HTML 4.01.

http://www.w3.org/TR/html4/charset.html#h-5.3.1

This has nothing to do with the UTF-8 encoding of Unicode characters; the reference is entirely ASCII, and the character it refers to is the code point in the Unicode character set - not the corresponding UTF-8 encoding of that character.
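
To make the distinction concrete, here is a minimal sketch using only core PHP string functions. It takes the Euro sign reference from the suggestion further down, checks that the reference itself is pure ASCII, and prints the UTF-8 bytes of the character it names. The variable names are illustrative only.

```php
<?php
// The reference itself: seven plain ASCII bytes naming code point 8364 (U+20AC).
$reference = '&#8364;';
$isAscii = preg_match('/^[\x00-\x7F]+$/', $reference) === 1;

// Decode the reference to the character it names, as a UTF-8 string.
$utf8 = html_entity_decode($reference, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// The UTF-8 encoding of that code point is a separate byte sequence;
// the digits "8364" do not appear in it.
$utf8Hex = bin2hex($utf8);

printf("reference:   %s (ASCII only: %s)\n", $reference, $isAscii ? 'yes' : 'no');
printf("code point:  %d = U+%04X\n", 8364, 8364);
printf("UTF-8 bytes: %s (%d bytes)\n", $utf8Hex, strlen($utf8));
```

The number in the reference is the Unicode code point; the three bytes e2 82 ac are what a UTF-8 encoded string would carry for the same character.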

Also, the example value "123" is not beyond the first 255 characters in the font.

imagettftext passes the string to GD, which expects a UTF-8 encoded string according to its documentation. So a string containing any single-byte character above 127 (for example, ISO-8859-1 text) is a malformed UTF-8 string.

See: http://www.boutell.com/gd/manual2.0.33.html#gdImageStringFT
"
The null-terminated string argument is considered to be encoded via the UTF_8 standard; also, HTML entities are supported, including decimal, hexadecimal, and named entities (2.0.26).
"

Given characters in the 128-255 range, which are not valid UTF-8 single-byte characters, GD appears to fall back to treating each one as a single-byte character, but this is not documented.
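
As a rough illustration of why a lone byte in the 128-255 range is not well-formed UTF-8, the sketch below checks a Latin-1 string before and after re-encoding. Both isValidUtf8() and latin1ToUtf8() are hypothetical helpers written for this example, not part of PHP or GD, and the sketch says nothing about how GD's undocumented fallback is implemented.

```php
<?php
// Returns true if $s is a well-formed UTF-8 byte sequence.
// preg_match() with the /u modifier returns false on malformed UTF-8 input.
function isValidUtf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// Hypothetical helper: re-encode an ISO-8859-1 string as UTF-8.
// Bytes below 0x80 are unchanged; each byte 0x80-0xFF becomes two bytes.
function latin1ToUtf8(string $s): string
{
    $out = '';
    foreach (str_split($s) as $ch) {
        $b = ord($ch);
        $out .= ($b < 0x80) ? $ch : (chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F)));
    }
    return $out;
}

$latin1 = "caf\xE9";              // "cafe" with e-acute as the single byte 0xE9
$utf8   = latin1ToUtf8($latin1);  // e-acute becomes the two bytes 0xC3 0xA9

var_dump(isValidUtf8($latin1));   // lone high byte: not well-formed UTF-8
var_dump(isValidUtf8($utf8));     // re-encoded: well-formed
echo bin2hex($utf8), "\n";
```

Re-encoding the input this way before calling imagettftext would avoid relying on the fallback at all.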

Suggest the description be changed to something like:

"
text
The text string, encoded in UTF-8.

May include decimal numeric character references (of the form: &#8364;) to access characters in a font beyond position 127.
"

(The value 8364 is the Euro symbol)
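
A minimal sketch of the two input forms described in this report, assuming GD is built with FreeType support. The font path and output file name are hypothetical, so the drawing step is skipped when the font is missing. That both calls render the same glyph is what this report and the GD manual describe, not something the sketch itself verifies.

```php
<?php
// Two ways to write the Euro sign (U+20AC) for the 'text' parameter.
$direct    = "\u{20AC}100";   // UTF-8 bytes written directly (PHP 7+ escape syntax)
$reference = '&#8364;100';    // ASCII-only decimal numeric character reference

// Both name the same characters once the reference is decoded.
$same = html_entity_decode($reference, ENT_QUOTES | ENT_HTML401, 'UTF-8') === $direct;

// Assumption: hypothetical font path; substitute a TrueType font that contains U+20AC.
$font = __DIR__ . '/DejaVuSans.ttf';

if (function_exists('imagettftext') && is_file($font)) {
    $im    = imagecreatetruecolor(200, 80);
    $white = imagecolorallocate($im, 255, 255, 255);
    $black = imagecolorallocate($im, 0, 0, 0);
    imagefilledrectangle($im, 0, 0, 199, 79, $white);

    // First line: UTF-8 text; second line: numeric character reference.
    imagettftext($im, 16, 0, 10, 30, $black, $font, $direct);
    imagettftext($im, 16, 0, 10, 60, $black, $font, $reference);

    imagepng($im, __DIR__ . '/euro.png'); // hypothetical output file
}
```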

[2004-11-16 10:33 UTC] vrana@php.net

This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.

[2004-12-19 18:41 UTC] andy at andyh dot co dot uk

The updated documentation has the following problems:

(1) It does not mention UTF-8 at all, despite this being one of the major points: that the function can accept UTF-8 encoded Unicode strings directly.

(2) The example character reference following the words "of the form:" appears as three garbled characters "â‚¬"; was the leading '&' not escaped as '&amp;' so that it appears as a literal & on the page?

Thanks.

[2004-12-29 12:20 UTC] vrana@php.net

This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.


Copyright © 2001-2025 The PHP Group All rights reserved.	Last updated: Tue Jul 08 22:01:31 2025 UTC