如何排查与分析MySQL死锁问题 - 文章 - 开发者社区

问题描述

在真实的生产环境中，死锁问题是非常常见的，我们应该如何排查此类问题并进行规避？

问题分析

表结构与测试数据：

CREATE TABLE `dead_lock` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col1` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `kcol1` (`col1`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8

mysql> select * from dead_lock;
+----+------+
| id | col1 |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
+----+------+
4 rows in set (0.03 sec)

事务的执行顺序如下：

步骤	transaction_1	transaction_2
1		begin;
2		delete from dead_lock where col1=2;
3	begin；
4	delete from dead_lock where col1=2; 出现等待
5		insert into dead_lock(col1) values(2);
6	ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
7	thread_id=3301099	thread_id= 3300738

使用 show engine innodb status 打出最近一次死锁信息：

------------------------
LATEST DETECTED DEADLOCK
------------------------
2022-02-10 16:31:21 0x7fe759f2f700
*** (1) TRANSACTION:
TRANSACTION 22150690, ACTIVE 34 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 3301099, OS thread handle 140632233637632, query id 169406906 52.82.38.248 52.82.38.248 rudonx updating
delete from dead_lock where col1=2
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 303 page no 4 n bits 72 index kcol1 of table `rudonx`.`dead_lock` trx id 22150690 lock_mode X locks rec but not gap waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 80000002; asc     ;;
 1: len 4; hex 80000002; asc     ;;

*** (2) TRANSACTION:
TRANSACTION 22150533, ACTIVE 113 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 3300738, OS thread handle 140631623268096, query id 169409104 52.82.38.248 52.82.38.248 rudonx update
insert into dead_lock(col1) values(2)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 303 page no 4 n bits 72 index kcol1 of table `rudonx`.`dead_lock` trx id 22150533 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 80000002; asc     ;;
 1: len 4; hex 80000002; asc     ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 303 page no 4 n bits 72 index kcol1 of table `rudonx`.`dead_lock` trx id 22150533 lock mode S waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 80000002; asc     ;;
 1: len 4; hex 80000002; asc     ;;

*** WE ROLL BACK TRANSACTION (1)**

从上面的输出中，我们可以得出如下信息：

transaction 2 ( thread_id =3300738 ) 持有的锁为 lock_mode X locks rec but not gap。
transaction 1 (thread_id = 3301099 ) 请求的锁为 lock_mode X locks rec。
同时transaction 2 ( thread_id =3300738 )请求的锁为 lock mode S waiting Record lock。
出现死锁之后，MySQL 选择回滚掉transaction 1，因为回滚这个事务的代价是最小的。

解决方案

死锁产生会有如下条件：

两个或者两个以上的事务。
每个事务已经持有锁，并希望申请新的锁。
事务与事务由于持有锁和申请锁导致形成循环等待。

从这些条件入手，我们会得出一些解决死锁问题的办法。这里有一些建议，供您参考：

事务要及时提交，避免长事务，复杂事务。
调整优化业务逻辑，优化事务加锁的顺序。
事务尽量要小，事务时间变长则导致事务中锁的持有时间变长，影响整体的数据库吞吐量。

同时，还有如下最佳实践供您参考：

可以使用比较低的隔离级别，RC 隔离级别在锁冲突方面优于RR，避免RR模式带来的gap锁竞争。
为表添加合适的索引，避免全表扫描。
开启 innodb_print_all_deadlocks，将死锁日志写入到MySQL error log中，以便后续的分析。
开启 innodb_status_output_locks 获详细的事务和死锁信息。

如果您有其他问题，欢迎您联系火山引擎技术支持服务