RDS for MySQL 如何查看持有表级锁的线程?

数据库关系型数据库技术服务知识库
问题描述

经常遇到 MySQL 因为表级锁导致的 SQL 阻塞问题,那么如何找到 MySQL 中被表级锁占用的线程和具体SQL内容?

问题分析

如果需要分析查看持有表级锁的线程及执行 SQL,需要开启 performance_schema 库并开启表级锁对应的instruments(wait/lock/metadata/sql/mdl),全局配置项 global_instrumentation 控制着 setup_consumers 的记录,所以也需要开启。

问题复现

Session 1:

mysql> begin;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into test values(2,2,2,'22',2);
Query OK, 1 row affected (0.00 sec)

mysql> select * from test;
+---------+------+-------+------+----------+
| auto_id | code | times | name | test_int |
+---------+------+-------+------+----------+
|       1 |    1 |     1 | 1    |     NULL |
|       2 |    2 |     2 | 22   |        2 |
|       6 |    6 |     6 | 66   |     NULL |
+---------+------+-------+------+----------+
3 rows in set (0.00 sec)

session 2:

mysql> alter table test drop column test_int;

session 3:

1. 查看当前 User 为 lxb 线程信息

mysql> SELECT * FROM information_schema.processlist where User='lxb';
+---------+------+---------------------+--------------------+---------+------+---------------------------------+----------------------------------------------------------------------+
| ID      | USER | HOST                | DB                 | COMMAND | TIME | STATE                           | INFO                                                                 |
+---------+------+---------------------+--------------------+---------+------+---------------------------------+----------------------------------------------------------------------+
| 4005295 | lxb  | 192.168.0.128:52830 | lxb                | Query   |  437 | Waiting for table metadata lock | /*rdsproxy:lxb@192.168.0.128*/ alter table test drop column test_int |
| 4003183 | lxb  | 192.168.0.128:52870 | information_schema | Query   |    0 | executing                       | SELECT * FROM information_schema.processlist where User='lxb'        |
| 4002776 | lxb  | 192.168.0.128:52132 | lxb                | Sleep   |  498 |                                 | NULL                                                                 |
+---------+------+---------------------+--------------------+---------+------+---------------------------------+----------------------------------------------------------------------+
3 rows in set (0.00 sec)

可以看到alter table test drop column test_int 的状态是 Waiting for table metadata lock

2. 查看 test 表的 MDL 锁信息

mysql> select * from performance_schema.metadata_locks where OBJECT_TYPE='TABLE' and OBJECT_NAME='test'\G
*************************** 1. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: lxb
          OBJECT_NAME: test
OBJECT_INSTANCE_BEGIN: 140015122787616
            LOCK_TYPE: SHARED_WRITE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: 
      OWNER_THREAD_ID: 4002796
       OWNER_EVENT_ID: 28
*************************** 2. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: lxb
          OBJECT_NAME: test
OBJECT_INSTANCE_BEGIN: 140015061993184
            LOCK_TYPE: SHARED_UPGRADABLE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: 
      OWNER_THREAD_ID: 4005315
       OWNER_EVENT_ID: 7
*************************** 3. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: lxb
          OBJECT_NAME: test
OBJECT_INSTANCE_BEGIN: 140014953141584
            LOCK_TYPE: EXCLUSIVE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: PENDING
               SOURCE: 
      OWNER_THREAD_ID: 4005315
       OWNER_EVENT_ID: 7
3 rows in set (0.00 sec)

可以看到 OWNER_THREAD_ID为 4005315 的线程被 OWNER_THREAD_ID 为 4002796 的线程阻塞 处于 Pending 状态。

3.如何知道内部ID 4002796 对应的 process id是多少呢?

我们可以通过 performance_schema.threads 表查询

select processlist_id from performance_schema.threads where thread_id=4002796;
+----------------+
| processlist_id |
+----------------+
|  4002776 |
+----------------+
1 row in set (0.00 sec)

4. 排查具体的 SQL 语句

通过上述查询到的相关数据,4002776 的 processlist_id 对 sbtest1 表显式加了表级读锁,而且长时间处于 sleep 状态,但我们并不知道该线程正在执行什么 SQL 语句,我们可以通过 performance_schema.events_statements_current 表查询,如下:

# 查看当前事务正在执行的 SQL 
mysql> select * from performance_schema.events_statements_current where thread_id=4002796\G;
*************************** 1. row ***************************
              THREAD_ID: 4002796
               EVENT_ID: 27
           END_EVENT_ID: 27
             EVENT_NAME: statement/sql/insert
                 SOURCE: 
            TIMER_START: 1696627893207823000
              TIMER_END: 1696627893378879000
             TIMER_WAIT: 171056000
              LOCK_TIME: 57000000
               SQL_TEXT: insert into test values(2,2,2,'22',2)
                 DIGEST: 8791a6928992bb852a2a054cb32a5437
            DIGEST_TEXT: INSERT INTO `test` VALUES (...) 
         CURRENT_SCHEMA: lxb
            OBJECT_TYPE: NULL
          OBJECT_SCHEMA: NULL
            OBJECT_NAME: NULL
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: 00000
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 1
              ROWS_SENT: 0
          ROWS_EXAMINED: 0
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
 SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 0
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 0
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: NULL
     NESTING_EVENT_TYPE: NULL
    NESTING_EVENT_LEVEL: 0
解决方案

在确认了阻塞线程和具体的 SQL 后, 我们需要和业务确认是否可以手动杀掉任务,或者将其提交

mysql> kill   4002776

杀掉进程之后观察 session 1 中的数据
mysql> select * from test;
+---------+------+-------+------+
| auto_id | code | times | name |
+---------+------+-------+------+
|       1 |    1 |     1 | 1    |
|       6 |    6 |     6 | 66   |
+---------+------+-------+------+
2 rows in set (0.00 sec)

发现 auto_id 为2 的数据被回滚, drop 语句正常执行,test_int 字段被删除。

业务一定要确认清楚,才可以 kill 掉线程,线上请谨慎操作。

如果您有其他问题,欢迎您联系火山引擎技术支持服务

0
0
0
0
评论
未登录
看完啦,登录分享一下感受吧~
暂无评论