php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #66616 R/W split fails: QOS with mysqlnd_ms_get_last_gtid with Built IN MYSQL 6 GTID
Submitted: 2014-01-31 10:32 UTC Modified: 2014-02-13 10:42 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: a dot devalbray at gmail dot com Assigned:
Status: Wont fix Package: mysqlnd_ms (PECL)
PHP Version: 5.4.24 OS: Centos 6
Private report: No CVE-ID: None
Have you experienced this issue?
Rate the importance of this bug to you:

 [2014-01-31 10:32 UTC] a dot devalbray at gmail dot com
Description:
------------
From http://fr2.php.net/manual/de/function.mysqlnd-ms-get-last-gtid.php

I try to use read your writes service level consitency.
I successfully set up replication with Mysql 6 with gtid_mode enabled.
i use built in gtid to check for last gtid with mysqlnd_ms.

To check latest executed gtid, and verify consitency, mysqlnd_ms queries are (in conf file):
 "global_transaction_id_injection":{
            "fetch_last_gtid" : "SELECT @@GLOBAL.GTID_EXECUTED AS trx_id FROM DUAL",
            "check_for_gtid" : "SELECT GTID_SUBSET('#GTID', @@GLOBAL.GTID_EXECUTED) AS trx_id FROM DUAL",
            "report_error":true
                                    }

At first, load balancing is done just right. mysqlnd_ms_get_last_gtid works just fine.
However, as time goes on... It fails.

This is why. When you execute Show slave status, it indicates:
 Retrieved_Gtid_Set: c23bc1c6-6739-11e3-a3f9-002590829ea0:1-1218510
            Executed_Gtid_Set: c23bc1c6-6739-11e3-a3f9-002590829ea0:1-557510:557512-799729:799731-1156432:1156436-1159353:1159355-1159359:1159361-1159362:1159364-1160126:1160128-1160130:1160132:1160134-1218510

=> Replication still works. But mysqlnd_ms_get_last_gtid do not check for Executed_Gtid properly due to the number between the "1-xx:xx-1218510


check_for_gtid query is not done properly I think.

Test script:
---------------
{
    "myapp": {
        "master": {
            "master_0": {
                "host": "localhost",
                "port": "3306"
            }
        },
        "slave": {
            "slave_0": {
                "host": "xx.xxx.xxx.xxx",
                "port": "3306"}


                 },
        "global_transaction_id_injection":{
            "fetch_last_gtid" : "SELECT @@GLOBAL.GTID_EXECUTED AS trx_id FROM DUAL",
            "check_for_gtid" : "SELECT GTID_SUBSET('#GTID', @@GLOBAL.GTID_EXECUTED) AS trx_id FROM DUAL",
            "report_error":true
        },
        "server_charset": "utf8",
        "failover": {"strategy": "master"}
        }
}



Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2014-02-12 15:13 UTC] uw@php.net
On the first read, I fail to understand the issue.

GTIDs are server generated. All a client, including mysqlnd_ms, can do is read the set of GTIDs and compare. For comparison MySQL provides a SQL function. If either the set of GTIDs reported by the server or the SQL function is faulty, this is not a client matter.

Are you saying the comparison logic used by mysqlnd_ms is wrong?
 [2014-02-12 15:43 UTC] uw@php.net
Without knowing the GTID sets of both the master and the slave, I can only speculate that the issue is around the gaps in the slave's GTID set. It could be the slave has not executed all transactions run on the master (for whatever reason). A possible scenario to provoke such a situation is the use of replicate-do-db or any other filter.

Armed with nothing but the full set of GTIDs executed on the master to identify the last transaction run on the master, the clients test of the slaves status is very coarse. When the client runs a trx on the master, the masters GTID set/trx counter increases. This may be a transaction, say id=1, may not be replicated to the slave because of a filter. Then, the next transaction run on the master, id=2, is one what shall be replicated to the slave. At this point, if you want read-your-writes and try to use GTIDs for searching a matching slave, you are stuck. The master never reports the id of the trx in question. Instead, it says "I've run id=1...2". Then you go to the slaves and ask the wrong question which is "did you run id=1..2". Due to the filter none of the slaves has id=1...2. The slave trx histories would show gaps. One slave would show, for example, id=1...2,4,7...9. Another slave may report id=3,5..6,10. 

As you slaves trx history shows gaps I could envision that as a reason.

I should probably add a big, big warning to the docs. I know the idea of using GTIDs for such "read your write" games have bubbled up to the MySQL replication folks but if and when they will ever... until then, I should probably mark this ms feature "instable" (with instable meaning server does not properly support it).
 [2014-02-12 15:43 UTC] uw@php.net
-Status: Open +Status: Feedback
 [2014-02-12 16:47 UTC] a dot devalbray at gmail dot com
-Status: Feedback +Status: Open
 [2014-02-12 16:47 UTC] a dot devalbray at gmail dot com
Hi,

Thank you for your answer.

Master GTID executed was: c23bc1c6-6739-11e3-a3f9-002590829ea0:1-1218510
Slave Executed_Gtid_Set: c23bc1c6-6739-11e3-a3f9-002590829ea0:1-557510:557512-799729:799731-1156432:1156436-1159353:1159355-1159359:1159361-1159362:1159364-1160126:1160128-1160130:1160132:1160134-1218510

Which means that when the client tries to check if slave has executed master's gtid properly...the answer can only be NO! Even if, as you can see GTID 1218510 has been properly executed on both server.

You are right to say that we are using replicate-do-db And binlog-do-db.

Your answer is quite clever and i think you are right.

Check_for_gtid SELECT should take into account that their might be gaps between executed gtids. it should only check for the last gtid's number executed (in the example, 1218510), And return true.

This ms feature seems indeed unstable.
Thank you for your reply,
 [2014-02-13 10:42 UTC] uw@php.net
-Status: Open +Status: Wont fix
 [2014-02-13 10:42 UTC] uw@php.net
Trouble is, a client cannot do any better currently. In order to do better the minimum requirement on the server is to give the GTID of the last transaction, and only that GTID. At the time of writing, there's no way to obtain this very information. 

There are additional cases to consider to provide the full range of possible consistencies and the current approach in the ms code may not cover them. However, what's there in the code is the best possible.  

As said, I'll update the docs accordingly.

It's a can't fix, no idea what status to use. Setting to won't fix...
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Wed Apr 24 10:01:31 2024 UTC