Li_阴宅

这个屌丝很懒，什么也没留下！

热门标签

MySQL性能优化[实践篇]-使用B树索引_mysql 没写using btree

作者：Li_阴宅 | 2024-08-16 05:37:50

踩

mysql 没写using btree

系列文章：

准备测试数据

create table tb_user(
  id int auto_increment primary key,
  name varchar(10),
  birth date
);
1
2
3
4
5

使用存储过程创建测试数据，不过在这之前我们先创建两个工具函数。

mysql里面默认以;作为一条完整语句的终结符，为了不与函数和存储过程中的;冲突，我们要提前使用delimiter命令将命令结束符修改成其他符号。

比如用delimiter $$可以将结束符修改为$$。

在定义完函数后用$$将缓冲区中的命令发送到服务器处理，然后有必要的话建议把终结符设置回默认的;

delimiter $$
# 生成随机字符串
create function rand_str(n int) returns varchar(10)
begin
  declare CHARS char(52) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
  declare result varchar(255) default '';
  declare i int default 1;
  while i < n do
    set result = concat(result, substr(CHARS,floor(1+RAND()*52), 1));
    set i = i + 1;
  end while;
  return result;
end
$$

# 生成随机数(i <= R < j)
create function rand_num(i int, j int) returns int
begin
  return floor(i + rand() * (j - i));
end
$$

delimiter ;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23

测试数据存储过程：

delimiter $$
create procedure insert_tb_user(c int)
begin
  declare start_date int default TO_DAYS(STR_TO_DATE('1970-01-1','%Y-%m-%e'));
  declare end_date int default TO_DAYS(CURDATE());
  declare i int default 0;
  set autocommit = 0;
  repeat
    set i = i + 1;
    insert into tb_user(name,birth) values(rand_str(10),FROM_DAYS(rand_num(start_date,end_date)));
  until i = c end repeat;
  set autocommit = 1;
end
$$
delimiter ;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

调用存储过程生成一百万条随机数据（可能需要两三分钟）：

call insert_tb_user(1000000);
1

未创建索引

为了能更细致的看到每个查询的耗时情况，我们把profiling打开。

mysql> set profiling=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
1
2

使用主键`id`查询

mysql> select * from tb_user where id=100000;
+--------+-----------+------------+
| id     | name      | birth      |
+--------+-----------+------------+
| 100000 | UZlihlHEY | 1975-10-09 |
+--------+-----------+------------+
1 row in set (0.00 sec)

mysql> show profiles;
+----------+------------+---------------------------------------+
| Query_ID | Duration   | Query                                 |
+----------+------------+---------------------------------------+
|        1 | 0.00048275 | select * from tb_user where id=100000 |
+----------+------------+---------------------------------------+
1 row in set, 1 warning (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

使用id查询耗时不到0.5毫秒(不同机器肯定会有所区别)。

MySQL默认创建的主键索引就是B树索引，所以主键查询速度会很快。

# tb_user目前只有一个主键索引
mysql> show index from tb_user\G
*************************** 1. row ***************************
        Table: tb_user
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id     # id列
    Collation: A
  Cardinality: 977600
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE  # 主键索引也是B树索引
      Comment:
Index_comment:
1 row in set (0.00 sec)

mysql> explain select * from tb_user where id=100000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tb_user
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY  # 用到了主键索引
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33

关于主键还有很多内容，会在后续文章中介绍。

使用`name`字段查询

mysql> select * from tb_user where name='UZlihlHEY';
+--------+-----------+------------+
| id     | name      | birth      |
+--------+-----------+------------+
| 100000 | UZlihlHEY | 1975-10-09 |
+--------+-----------+------------+
1 row in set (0.42 sec)

mysql> show profiles;
+----------+------------+-----------------------------------------------+
| Query_ID | Duration   | Query                                         |
+----------+------------+-----------------------------------------------+
|        1 | 0.00048275 | select * from tb_user where id=100000         |
|        2 | 0.00061350 | show index from tb_user                       |
|        3 | 0.00052125 | explain select * from tb_user where id=100000 |
|        4 | 0.42458000 | select * from tb_user where name='UZlihlHEY'  |
+----------+------------+-----------------------------------------------+
4 rows in set, 1 warning (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

使用id查询耗时达到0.42秒，与主键查询相比速度慢了近千倍。

我们可以用show profile比较两个查询，看看name字段查询慢在哪里。

mysql> show profile for query 1;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000106 |
| checking permissions | 0.000008 |
| Opening tables       | 0.000025 |
| init                 | 0.000031 |
| System lock          | 0.000016 |
| optimizing           | 0.000013 |
| statistics           | 0.000076 |
| preparing            | 0.000013 |
| executing            | 0.000002 |
| Sending data         | 0.000016 |
| end                  | 0.000004 |
| query end            | 0.000007 |
| closing tables       | 0.000010 |
| freeing items        | 0.000118 |
| cleaning up          | 0.000040 |
+----------------------+----------+
15 rows in set, 1 warning (0.00 sec)

mysql> show profile for query 4;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000073 |
| checking permissions | 0.000008 |
| Opening tables       | 0.000018 |
| init                 | 0.000030 |
| System lock          | 0.000011 |
| optimizing           | 0.000010 |
| statistics           | 0.000018 |
| preparing            | 0.000013 |
| executing            | 0.000002 |
| Sending data         | 0.424281 |  # 这个阶段最耗时
| end                  | 0.000013 |
| query end            | 0.000009 |
| closing tables       | 0.000009 |
| freeing items        | 0.000071 |
| cleaning up          | 0.000016 |
+----------------------+----------+
15 rows in set, 1 warning (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43

不要被Sending data的字面意思给误导了，它并不是指网络传输的速度慢，executing状态的时间也并不是整个SQL语句执行的时间。

MySQL官方手册对这两种状态的解释如下：

executing
The thread has begun executing a statement.
线程已经开始执行语句

Sending data
The thread is reading and processing rows for a SELECT statement, and sending data to the client. 
Because operations occurring during this state tend to perform large amounts of disk access (reads), 
it is often the longest-running state over the lifetime of a given query.
线程正在读取并处理select语句选择的行数据，然后将数据发送给客户端。
因为这个状态期间的操作偏重执行大量的磁盘访问(读取磁盘)，
它通常是整个查询生命周期中运行时间最长的状态。
1
2
3
4
5
6
7
8
9
10
11

事实证明磁盘IO才是罪魁祸首。

对`name`字段创建索引

# 不加“using btree”子句，mysql也会默认创建B树索引
mysql> create index idx_tb_user_name on tb_user(name);
Query OK, 0 rows affected (5.32 sec)
Records: 0  Duplicates: 0  Warnings: 0

# 现在有两条索引了
mysql> show index from tb_user \G
*************************** 1. row ***************************
        Table: tb_user
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id
    Collation: A
  Cardinality: 977600
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
*************************** 2. row ***************************
        Table: tb_user
   Non_unique: 1
     Key_name: idx_tb_user_name
 Seq_in_index: 1
  Column_name: name      # name列创建的索引
    Collation: A
  Cardinality: 987287
     Sub_part: NULL
       Packed: NULL
         Null: YES
   Index_type: BTREE     # B树索引
      Comment:
Index_comment:
2 rows in set (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36

创建索引的详细语法可以参考官方文档create index和 alter table

创建索引后查询

mysql> select * from tb_user where name='UZlihlHEY';
+--------+-----------+------+
| id     | name      | age  |
+--------+-----------+------+
| 100000 | DnGrPZAGD |    2 |
+--------+-----------+------+
1 row in set (0.00 sec)

# 创建索引后耗时从原来的0.5秒缩短到现在的0.67毫秒
mysql> show profiles;
+----------+------------+------------------------------------------------+
| Query_ID | Duration   | Query                                          |
+----------+------------+------------------------------------------------+
|        1 | 0.00048275 | select * from tb_user where id=100000          |
|        2 | 0.00061350 | show index from tb_user                        |
|        3 | 0.00052125 | explain select * from tb_user where id=100000  |
|        4 | 0.42458000 | select * from tb_user where name='UZlihlHEY'   |
|        5 | 5.31390800 | create index idx_tb_user_name on tb_user(name) |
|        6 | 0.00033175 | show index from tb_user                        |
|        7 | 0.00046250 | select * from tb_user where name='UZlihlHEY'   |
+----------+------------+------------------------------------------------+
7 rows in set, 1 warning (0.00 sec)

mysql> explain select * from tb_user where name='UZlihlHEY'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tb_user
   partitions: NULL
         type: ref
possible_keys: idx_tb_user_name
          key: idx_tb_user_name     # 使用了刚刚创建的idx_tb_user_name索引
      key_len: 33
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38

即使创建了索引，使用name字段查询仍然没有主键查询的速度快，这是因为name字段创建的索引是二级索引，InnoDB的二级索引叶子节点并不保存实际的数据行或数据行的引用，而是保存了主键id。二级索引走了一遍得到主键id，还要拿着主键id在主键索引再走一遍，才能查到数据行。关于这部分内容会在后续的文章中详细介绍。

正确使用索引

索引虽然能加快我们的查询速度，但是如果姿势不对还是不能达到预期的速度。

1. 避免索引列在表达式中出现

# 索引列在表达式中
mysql> select * from tb_user where id+30000=130000;
+--------+-----------+------+
| id     | name      | age  |
+--------+-----------+------+
| 100000 | DnGrPZAGD |    2 |
+--------+-----------+------+
1 row in set (0.38 sec)

# SQL解析器无法优化这样的等式
mysql> explain select * from tb_user where id+30000=130000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tb_user
   partitions: NULL
         type: ALL   # 全表扫描
possible_keys: NULL
          key: NULL  # 没有用到主键索引
      key_len: NULL
          ref: NULL
         rows: 998430
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

# 索引列单独在等式的一端，仍可以使用索引
mysql> select * from tb_user where id=30000+70000;
+--------+-----------+------+
| id     | name      | age  |
+--------+-----------+------+
| 100000 | DnGrPZAGD |    2 |
+--------+-----------+------+
1 row in set (0.00 sec)

mysql> explain select * from tb_user where id=30000+70000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tb_user
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50

2. 避免索引列作为函数参数

# 如果birth列创建了索引，但查询条件中这一列作为函数参数，birth这个索引就不会被用到
select count(*) from tb_user where YEAR(current_date) - YEAR(birth) >= 18;
1
2

3. 前缀索引

进行字符串模糊匹配的时候，经常会用like操作符，通配符开头的字符串匹配无法使用索引，

mysql> select * from tb_user where name like '%UZlih%';
+--------+-----------+------------+
| id     | name      | birth      |
+--------+-----------+------------+
|  33090 | ZjUZlihkF | 1988-08-03 |
|  37993 | iUZlihlKQ | 1970-08-07 |
...// 省略
| 903589 | mgTauZLIh | 2004-12-12 |
| 999434 | IUzLiHNSa | 1994-03-06 |
+--------+-----------+------------+
35 rows in set (0.51 sec)  # 耗时0.50994975秒

mysql> explain select * from tb_user where name like '%UZlih%'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tb_user
   partitions: NULL
         type: ALL    # 全表扫描
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 998244
     filtered: 11.11
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

mysql> select * from tb_user where name like 'UZlih%';
+--------+-----------+------------+
| id     | name      | birth      |
+--------+-----------+------------+
| 309324 | uZLIhKgBm | 2001-03-18 |
| 100000 | UZlihlHEY | 1975-10-09 |
| 891682 | uZLIhLlUQ | 1988-06-24 |
| 568787 | uzlIHmnFl | 1982-07-30 |
| 621614 | uZLIhMnfM | 1987-08-30 |
|  41360 | uZLIhMpmq | 2011-08-25 |
|  42195 | UzLiHNSYo | 1993-04-18 |
| 600167 | uZLIhNtFU | 2002-05-30 |
+--------+-----------+------------+
8 rows in set (0.00 sec)   # 耗时0.00046500秒

mysql> explain select * from tb_user where name like 'UZlih%'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tb_user
   partitions: NULL
         type: range             # 范围查询
possible_keys: idx_tb_user_name
          key: idx_tb_user_name  # 用到了name索引
      key_len: 33
          ref: NULL
         rows: 8
     filtered: 100.00
        Extra: Using index condition  # 使用索引进行条件查询
1 row in set, 1 warning (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58

参考：

《高性能MySQL》

https://dev.mysql.com/doc/refman/5.7/en/index-btree-hash.html

https://dev.mysql.com/doc/refman/5.7/en/index-condition-pushdown-optimization.html

声明：本文内容由网友自发贡献，转载请注明出处：【wpsshop】