问题描述
在真实的生产环境中,死锁问题是非常常见的,我们应该如何排查此类问题并进行规避?
问题分析
表结构与测试数据:
CREATE TABLE `dead_lock` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`col1` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `kcol1` (`col1`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8
mysql> select * from dead_lock;
+----+------+
| id | col1 |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
+----+------+
4 rows in set (0.03 sec)
事务的执行顺序如下:
步骤 | transaction_1 | transaction_2 |
---|---|---|
1 | begin; | |
2 | delete from dead_lock where col1=2; | |
3 | begin; | |
4 | delete from dead_lock where col1=2; 出现等待 | |
5 | insert into dead_lock(col1) values(2); | |
6 | ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction | |
7 | thread_id=3301099 | thread_id= 3300738 |
使用 show engine innodb status 打出最近一次死锁信息:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2022-02-10 16:31:21 0x7fe759f2f700
*** (1) TRANSACTION:
TRANSACTION 22150690, ACTIVE 34 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 3301099, OS thread handle 140632233637632, query id 169406906 52.82.38.248 52.82.38.248 rudonx updating
delete from dead_lock where col1=2
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 303 page no 4 n bits 72 index kcol1 of table `rudonx`.`dead_lock` trx id 22150690 lock_mode X locks rec but not gap waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 4; hex 80000002; asc ;;
1: len 4; hex 80000002; asc ;;
*** (2) TRANSACTION:
TRANSACTION 22150533, ACTIVE 113 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 3300738, OS thread handle 140631623268096, query id 169409104 52.82.38.248 52.82.38.248 rudonx update
insert into dead_lock(col1) values(2)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 303 page no 4 n bits 72 index kcol1 of table `rudonx`.`dead_lock` trx id 22150533 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 4; hex 80000002; asc ;;
1: len 4; hex 80000002; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 303 page no 4 n bits 72 index kcol1 of table `rudonx`.`dead_lock` trx id 22150533 lock mode S waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 4; hex 80000002; asc ;;
1: len 4; hex 80000002; asc ;;
*** WE ROLL BACK TRANSACTION (1)**
从上面的输出中,我们可以得出如下信息:
- transaction 2 ( thread_id =3300738 ) 持有的锁为 lock_mode X locks rec but not gap。
- transaction 1 (thread_id = 3301099 ) 请求的锁为 lock_mode X locks rec。
- 同时transaction 2 ( thread_id =3300738 )请求的锁为 lock mode S waiting Record lock。
- 出现死锁之后,MySQL 选择回滚掉transaction 1,因为回滚这个事务的代价是最小的。
解决方案
死锁产生会有如下条件:
- 两个或者两个以上的事务。
- 每个事务已经持有锁,并希望申请新的锁。
- 事务与事务由于持有锁和申请锁导致形成循环等待。
从这些条件入手,我们会得出一些解决死锁问题的办法。这里有一些建议,供您参考:
- 事务要及时提交,避免长事务,复杂事务。
- 调整优化业务逻辑,优化事务加锁的顺序。
- 事务尽量要小,事务时间变长则导致事务中锁的持有时间变长,影响整体的数据库吞吐量。
同时,还有如下最佳实践供您参考:
- 可以使用比较低的隔离级别,RC 隔离级别在锁冲突方面优于RR,避免RR模式带来的gap锁竞争。
- 为表添加合适的索引,避免全表扫描。
- 开启 innodb_print_all_deadlocks,将死锁日志写入到MySQL error log中,以便后续的分析。
- 开启 innodb_status_output_locks 获详细的事务和死锁信息。
如果您有其他问题,欢迎您联系火山引擎技术支持服务