[2020-10-28 09:22 UTC] jozyah-etienne at eerees dot com
Description:
------------
PHP's default session handler (in all versions so far) does lock session files, which is good for preventing race conditions, but its write operation is not atomic. It writes data directly to the destination file, so in case of a failure (power outage, disk failure, crash, loss of connection to networked storage, etc.) the session file gets corrupted.

The correct way to write session data to files is to write the data to a temporary file (such as /path/to/sessions/sess_id.tmp) and, only on success, rename it to the final destination (/path/to/sessions/sess_id). If the rename succeeded, the write operation succeeded too; otherwise the old data remains intact. The temporary file should also live in the destination directory itself: if the temporary file and the destination file are on separate devices or mount points, data loss may happen as well.

On modern operating systems (POSIX OSes, NTFS on Windows 10 1607 and later, and possibly others), rename is an atomic operation and should be used to minimize failures and mitigate data loss. The garbage collector can remove temporary files left over from a failure, or those files can simply be overwritten by the next write operation. There is also no need to acquire a lock on the temporary file, because the whole session operation is already locked.

More info in this Stack Overflow thread: https://stackoverflow.com/questions/64565698/are-php-session-writes-atomic

Expected result:
----------------
The session write operation should be atomic.

Actual result:
--------------
The session write operation is not atomic.
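The write-to-temp-then-rename pattern the report proposes can be sketched in userland PHP as follows. This is a minimal illustration, not PHP's internal handler code; the function name and demo session id are made up for the example (requires PHP 8 for the type declarations).

```php
<?php
// Sketch of the proposed atomic write. The temporary file lives in the
// same directory as the destination, so rename() never crosses a
// filesystem boundary and stays atomic on POSIX (and NTFS on
// Windows 10 1607+): readers see either the old file or the new one,
// never a half-written file.
function atomic_session_write(string $dir, string $id, string $data): bool
{
    $dest = $dir . '/sess_' . $id;
    $tmp  = $dest . '.tmp';   // same directory => same mount point

    // Write to the temporary file first; if this fails, the old
    // session file is left untouched.
    if (file_put_contents($tmp, $data) === false) {
        return false;
    }

    // Atomically replace the destination. If the process dies before
    // this point, only the .tmp file is lost.
    return rename($tmp, $dest);
}

$dir = sys_get_temp_dir();
atomic_session_write($dir, 'abc123', 'user|s:5:"alice";');
echo file_get_contents($dir . '/sess_abc123'), "\n";
```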
> session files are *temporary* data and if you really insist on atomic writes please use a sql-based session handler but don't make my setups slower for cases which never happen at all

They are *temporary*, not *cached*. Temporary data are important; cached data are not. You need that temporary data to serve your user, otherwise bad things can happen, from a simple user logout to massive data losses. And even cached data matter if a data loss during a write could break your production system without you being aware of it.

> looks like you never were the target of a serious DDoS!
> in that case you are grateful for every overhead you can save

I just executed the following code:

$i = 0;
touch('./temp.0'); // create the initial file so the first rename succeeds
$start = microtime(true);
do {
    rename('./temp.' . $i, './temp.' . ($i + 1));
} while ($i++ < 100000);
$total = microtime(true) - $start;
var_dump($total);
unlink('./temp.' . $i); // clean up the last file

It took less than 1 second to rename a file 100,000 times on my 7-year-old, 2-core, super slow laptop. That's a 0.00001-second overhead per request. You know why? Because the needed inodes (or whatever is needed in the process) have already been located and cached by the operating system and PHP during the write operation.

If you really need to optimize your site by 0.00001 seconds, you have long been on the wrong path, because PHP is definitely not the right tool for your use case. In that case your hardware infrastructure is also not meeting your needs, and spending a few bucks to upgrade it is much better than putting yourself at risk of data loss.

> how often do you have power outages without UPS, dying disks without RAID and crashes at all on production servers?
>
> the one case every few decades where *probably* a session file may get corrupted doesn't matter that much

You are wrong: it's not about how good your infrastructure is, it's about how reliable your tool is and how much you can trust it.
It's all about how important your data is, and whether you (as a programmer) even know that your data really matters, know that the tool you are using is not reliable, and therefore know that you must handle the failure scenario yourself, so that in case of data loss you don't have to bang your head against the wall searching for the source of the problem. It's all about trusting that the tool you use can do its job, do it well, and do it reliably.

And to answer your specific question: yes, data gets lost all the time, even on professional-grade servers in professional-grade data centers. That's why databases are ACID and why we use transactions to avoid exactly these marginal, once-in-a-lifetime disaster cases; otherwise they would be far faster than they are.

And to answer your questions with other questions:

* How often do you suffer an SQL injection or code injection? So why bother spending precious CPU cycles, RAM, and disk space converting passwords to hashes with deliberately slow algorithms?
* How often are you hit by a timing attack on a non-real-time, threaded server and operating system? So why bother using hash_equals() instead of "==="?
* How often are you hit by a DDoS attack so serious that a 0.00001-second operation becomes important enough that you would rather ask for trouble than do the job in a secure, reliable way?

What I'm saying is: most programmers use the standard session handler and are not aware that data loss and data corruption can happen. This is an important bug that needs to be fixed. And if a 0.00001-second operation really is that important under DDoS attacks, then at least an option like session_write_close_atomic() should be added so the programmer can choose between reliability and a 0.00001-second optimization.
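Until the default handler changes, the behavior the report asks for can be approximated in userland with a custom save handler whose write() uses the temp-file-plus-rename pattern. A hedged sketch, assuming PHP 8 for the type declarations; the class name and file layout are illustrative, not part of PHP itself, and this sketch omits the per-session locking the built-in files handler performs:

```php
<?php
// Illustrative userland workaround: a save handler with atomic writes.
// Register before session_start():
//   session_set_save_handler(new AtomicFileSessionHandler(), true);
class AtomicFileSessionHandler implements SessionHandlerInterface
{
    private string $path = '';

    public function open(string $path, string $name): bool
    {
        $this->path = rtrim($path, '/');
        return is_dir($this->path);
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        $file = $this->path . '/sess_' . $id;
        return is_file($file) ? (string) file_get_contents($file) : '';
    }

    public function write(string $id, string $data): bool
    {
        $dest = $this->path . '/sess_' . $id;
        $tmp  = $dest . '.tmp';   // same directory => rename() stays atomic
        if (file_put_contents($tmp, $data) === false) {
            return false;
        }
        // On failure before this point the old session file is intact.
        return rename($tmp, $dest);
    }

    public function destroy(string $id): bool
    {
        $file = $this->path . '/sess_' . $id;
        return !is_file($file) || unlink($file);
    }

    public function gc(int $max_lifetime): int|false
    {
        $purged = 0;
        // Stale .tmp files left by failed writes are matched and
        // cleaned up here too, as the report suggests.
        foreach (glob($this->path . '/sess_*') as $file) {
            if (filemtime($file) + $max_lifetime < time() && unlink($file)) {
                $purged++;
            }
        }
        return $purged;
    }
}
```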