MySQL XA transaction processing has a series of pitfalls and issues that make it neither crash safe nor fault tolerant. In this talk the author shares his insights, findings and analysis of these issues and of MySQL transaction processing in general, and explains how he and his team solved them to make MySQL XA transaction processing truly crash safe and fault tolerant, which is crucial when using MySQL as the storage nodes of a distributed DBMS such as Kunlun.
In an era of data explosion, more and more database users will need distributed databases to scale out and meet ever-increasing data management needs. From his past decade of database kernel development experience at Oracle and Tencent, the author knows that although MySQL is an excellent DBMS in itself, its practical storage capacity is limited to a couple of terabytes. Beyond that, users need table sharding across multiple MySQL storage clusters to maintain good performance. Using MySQL as part of a distributed database will become more and more pervasive, so it is crucial that the MySQL XA transaction processing issues be well understood and solved.
In his former work on Tencent's TDSQL development team, the author took the initiative to evolve TDSQL from a simple table sharding solution into a full-blown distributed DBMS that handles distributed transactions and query processing, among other things. TDSQL has since been widely used inside Tencent and in Tencent's public cloud service. In this work the team solved a series of MySQL 5.7 XA transaction processing issues and pitfalls, making TDSQL crash safe and fault tolerant. They also reported these XA crash-safety bugs to the official MySQL team and contributed the initial patches.
In August 2019 the author left Tencent and began developing the Kunlun distributed DBMS, which uses MySQL 8.0 for its storage nodes. Building on his experience with MySQL 5.7 in TDSQL, he and his team implemented a series of features to make MySQL 8.0 XA transaction processing crash safe and fault tolerant. Note that the XA modules in community MySQL lack such guarantees.
In this session the author shares insights and findings on some of these XA transaction processing issues and pitfalls, together with their analysis and solutions.