Bug #78043 UTF-8 BOM is carried from an included file into a Header Content
Submitted: 2019-05-20 17:23 UTC Modified: 2019-05-20 17:27 UTC
From: barryd dot it at gmail dot com Assigned:
Status: Duplicate Package: *Unicode Issues
PHP Version: 7.2.18 OS: Windows
Private report: No CVE-ID: None


 [2019-05-20 17:23 UTC] barryd dot it at gmail dot com
I included a file in my script and found that, because the file was saved as UTF-8, a UTF-8 BOM sits at the beginning of the file, just before the <?php ?> tags. That BOM is sent along with the content delivered under Content-Disposition: attachment;, and no matter what my Content-Type: application/rtf; charset=iso-8859-1//TRANSLIT header says, the downloaded content turns out to be UTF-8 instead of ANSI. The string itself was entirely ASCII. The BOM needs to be filtered out of the content so that the Content-Type can be honored. To recreate this scenario, change the encoding of include.php to UTF-8 and use the code below. Once you save the file sent from the web server to your system, you too will find that the content is UTF-8.

Test script:
$rtf = 'Testing';
header('Content-Type: application/rtf; charset=iso-8859-1//TRANSLIT');
header('Content-Disposition: attachment; filename="' . '10-Jane-Doe' . '.rtf"');
header('Content-Length: ' . strlen($rtf));
header('Expires: Fri, 01 Jan 2010 05:00:00 GMT');
header('Last-Modified: ' . gmdate( 'D, d M Y H:i:s' ) . ' GMT'); 
header('Cache-Control: private, must-revalidate');
header('Pragma: no-cache');
echo $rtf;

Expected result:
I would expect the web server to honor the Content-Type of the string, the web browser to honor the encoding of the file being downloaded, and PHP not to push content that is not part of the string into the output buffer.


Actual result:
Testing, preceded by the BOM, is downloaded into the .rtf file, which makes the file UTF-8 and prevents LibreOffice from rendering it properly.
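
To confirm that the extra bytes come from the included file rather than from $rtf, one can check whether the received body starts with the three BOM bytes EF BB BF. The following is only a minimal sketch: hasUtf8Bom is a hypothetical helper, not a PHP built-in, and the body is simulated instead of being fetched from a server.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether a byte string
// starts with the UTF-8 byte order mark EF BB BF.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Simulated download body: the BOM emitted by the included file,
// followed by the string echoed by the test script.
$body = "\xEF\xBB\xBF" . 'Testing';

var_dump(hasUtf8Bom($body));      // bool(true): the body starts with a BOM
var_dump(hasUtf8Bom('Testing'));  // bool(false): the string itself has none
var_dump(strlen($body));          // int(10), although Content-Length declared strlen('Testing'), i.e. 7
```

The same check could be applied to the included file on disk, for example by passing the first three bytes read with file_get_contents('include.php', false, null, 0, 3) to the helper.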


 [2019-05-20 17:26 UTC] spam2 at rhsoft dot net
the BOM is *before* <?php and so sent to the client
after that you can no longer send headers
don't create files with a BOM - it's that easy
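
The advice above amounts to removing the BOM from the source file. As a rough sketch of one way to do that, the hypothetical helper stripUtf8Bom below drops a leading BOM from a byte string. The helper name and the sample input are assumptions made for illustration; most editors can also simply save the file as UTF-8 without a BOM.

```php
<?php
// Hypothetical helper (not part of PHP): returns the input without a
// leading UTF-8 BOM; input that has no BOM is returned unchanged.
function stripUtf8Bom(string $bytes): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($bytes, $bom, strlen($bom)) === 0) {
        return substr($bytes, strlen($bom));
    }
    return $bytes;
}

// Sample contents of a file saved with a BOM in front of the opening tag.
$withBom = "\xEF\xBB\xBF" . '<?php echo 1;';
$clean = stripUtf8Bom($withBom);

var_dump($clean === '<?php echo 1;');       // bool(true): BOM removed
var_dump(stripUtf8Bom($clean) === $clean);  // bool(true): a second pass changes nothing
```

Applied to a real file, this could look like file_put_contents($path, stripUtf8Bom(file_get_contents($path))), where $path is a placeholder; it is safer to try this on a copy of the file first.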
 [2019-05-20 17:27 UTC]
-Status: Open +Status: Duplicate
 [2019-05-20 17:27 UTC]
Note that the UTF-8 spec does not recommend using a BOM in the first place.

Duplicate because the issue of PHP recognizing file encodings, especially BOMs, has been raised many times before. And simply discarding it is not proper.
Last updated: Sat Nov 27 12:03:14 2021 UTC