如何排查MySQL中MDL锁等待

数据库关系型数据库技术服务知识库
问题描述

使用 show full processlist 查看当前数据库正在运行的 SQL 语句,我们可能会发现 State 列为 Waiting for table metadata lock,我们该如何排查此类问题?

问题复现
  1. 在会话 1 中显式开启一个事务,执行一个 update 操作,不进行提交
mysql> begin;
Query OK, 0 rows affected (0.00 sec)


mysql> update t1 set name='liwangz' where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
```
`

2. 在会话 2 中对表执行 DDL 操作:我们会发现操作被阻塞

````undefined
mysql> alter table t1 add index i_c(id);
```
`

3. 使用 `show full processlist` 命令进行排查,会发现 State 字段出现 **Waiting for table metadata lock**

![图片](https://lf3-volc-editor.volccdn.com/obj/volcfe/sop-public/upload_9b15372045d69fcfe68a398c5dfb0577.png)
# 解决方案
### 1. 前提条件

* 排查 MDL 锁依赖于 performance_schema,请提前确认 performance_schema 已经开启
* MDL 锁记录对应的 instruments 为**wait/lock/metadata/sql/mdl**,需要提前开启

### 2. 查看 metadata_lock 表获取锁信息
````undefined
mysql> select * from performance_schema.metadata_locks where OWNER_THREAD_ID!=sys.ps_thread_id(connection_id())\G;
*************************** 1. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: rudonx
          OBJECT_NAME: t1
OBJECT_INSTANCE_BEGIN: 139681262016240
            LOCK_TYPE: SHARED_WRITE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: sql_parse.cc:5993
      OWNER_THREAD_ID: 48
       OWNER_EVENT_ID: 282
*************************** 2. row ***************************
          OBJECT_TYPE: GLOBAL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 139681127989264
            LOCK_TYPE: INTENTION_EXCLUSIVE
        LOCK_DURATION: STATEMENT
          LOCK_STATUS: GRANTED
               SOURCE: sql_base.cc:5497
      OWNER_THREAD_ID: 54
       OWNER_EVENT_ID: 1064
*************************** 3. row ***************************
          OBJECT_TYPE: SCHEMA
        OBJECT_SCHEMA: rudonx
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 139681137621328
            LOCK_TYPE: INTENTION_EXCLUSIVE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: sql_base.cc:5482
      OWNER_THREAD_ID: 54
       OWNER_EVENT_ID: 1064
*************************** 4. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: rudonx
          OBJECT_NAME: t1
OBJECT_INSTANCE_BEGIN: 139681127961472
            LOCK_TYPE: SHARED_UPGRADABLE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: sql_parse.cc:5993
      OWNER_THREAD_ID: 54
       OWNER_EVENT_ID: 1066
*************************** 5. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: rudonx
          OBJECT_NAME: t1
OBJECT_INSTANCE_BEGIN: 139681137618432
            LOCK_TYPE: EXCLUSIVE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: PENDING
               SOURCE: mdl.cc:3891
      OWNER_THREAD_ID: 54
       OWNER_EVENT_ID: 1137
5 rows in set (0.00 sec)
```
`
输出中共有 5 行数据,其中我们可以发现第一行为 t1 表的SHARED_WRITE 锁,编号为 48 的线程所持有(GRANTED)
后续 4 行中体现了 t1 表的 SHARED_UPGRADABLE、EXCLUSIVE 锁,其中 SHARED_UPGRADABLE 处于 GRANTED 状态,EXCLUSIVE 处于 PENDING 状态,编号为 54 的线程在等待 MDL 锁。
### 3. 查询 events_statements_current 获取查询语句
从 performance_schema.events_statements_current 表来查询某个线程最后一次执行的语句:
````undefined
mysql> select * from performance_schema.events_statements_current where thread_id=48\G;
*************************** 1. row ***************************
              THREAD_ID: 48
               EVENT_ID: 403
           END_EVENT_ID: 457
             EVENT_NAME: statement/sql/update
                 SOURCE: socket_connection.cc:101
            TIMER_START: 8819577691629633000
              TIMER_END: 8819577691947950000
             TIMER_WAIT: 318317000
              LOCK_TIME: 160000000
               SQL_TEXT: update t1 set name='liwangz' where id=1
                 DIGEST: 3c26d58f8e27a87167a1237254ff60c5
            DIGEST_TEXT: UPDATE `t1` SET NAME = ? WHERE `id` = ?
         CURRENT_SCHEMA: rudonx
            OBJECT_TYPE: NULL
          OBJECT_SCHEMA: NULL
            OBJECT_NAME: NULL
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: 00000
           MESSAGE_TEXT: Rows matched: 1  Changed: 0  Warnings: 0
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 0
          ROWS_EXAMINED: 1
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
 SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 0
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 0
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: NULL
     NESTING_EVENT_TYPE: NULL
    NESTING_EVENT_LEVEL: 0
1 row in set (0.00 sec)

ERROR:
No query specified

```
`
通过上述 SQL 回显的信息,我们可以看到相关 SQL 文本的信息。如果我们单纯去查询 `show full processlist`,不会显示 SQL 文本,会给定位问题带来一定的困扰。同样的去查询 `information_schema.innodb_trx\G`,我们只能看到有一个未提交的事务,也不会有文本信息。
````undefined
mysql> select * from information_schema.innodb_trx\G;
*************************** 1. row ***************************
                    trx_id: 36257
                 trx_state: RUNNING
               trx_started: 2021-07-06 14:35:27
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 3
       trx_mysql_thread_id: 22
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 1
          trx_lock_structs: 2
     trx_lock_memory_bytes: 1136
           trx_rows_locked: 1
         trx_rows_modified: 1
   trx_concurrency_tickets: 0
       trx_isolation_level: READ COMMITTED
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)

ERROR:
No query specified
```
`
### 4. kill 掉相关线程
通过 `performance_schema.events_statements_current` 表中的 THREAD_ID 字段值来获取 processlist ID,相关语句如下:
````undefined
mysql> select THREAD_ID,PROCESSLIST_ID from performance_schema.threads where THREAD_ID=48;
+-----------+----------------+
| THREAD_ID | PROCESSLIST_ID |
+-----------+----------------+
|   48      |      22        |
+-----------+----------------+
1 row in set (0.03 sec)
```
`
当和业务确认之后,可以执行如下操作来 kill 掉相关连接,将锁释放。
````undefined
mysql> kill 22;
Query OK, 0 rows affected (0.06 sec)
```
`
上述所有 SQL 语句的排查目的在于发现到底是什么语句持有 MDL 锁而引起阻塞,如果想快速恢复业务,可以使用如下方式:
````undefined
SELECT
   concat( 'kill ', p1.id, ';' ) 
FROM
   information_schema.PROCESSLIST p1,
   (
   SELECT
      id,
      time 
   FROM
      information_schema.PROCESSLIST 
   WHERE
      time = ( SELECT max( time ) FROM information_schema.PROCESSLIST WHERE state = 'Waiting for table metadata lock'
      )) p2 
WHERE
   p1.time >= p2.time 
   AND p1.command IN ( 'Sleep', 'Query' ) 
   AND p1.id NOT IN ( connection_id(), p2.id );
```
`
# 参考文档
[1] [https://dev.mysql.com/doc/refman/5.7/en/metadata-locking.html](https://dev.mysql.com/doc/refman/5.7/en/metadata-locking.html)
**如果您有其他问题,欢迎您联系火山引擎**[技术支持服务](https://console.volcengine.com/ticket/createTicketV2/)

0
0
0
0
评论
未登录
看完啦,登录分享一下感受吧~
暂无评论