mysql order by 索引问题

作者: dreamfly 分类: mysql 发布时间: 2020-05-11 10:38

MySQL 是一款非常优秀的数据库软件，它有着出色的性能，但是如果使用不好的话，也可能会导致查询效率缓慢，笔者以工作中遇到的一个案例为由，讲解如何优化和创建索引。

案例

两个表如下：

CREATE TABLE `cmf_portal_category_post` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `post_id` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT '文章id',
    `category_id` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT '分类id',
    `list_order` FLOAT(12) NOT NULL DEFAULT '10000' COMMENT '排序',
    `status` TINYINT(3) UNSIGNED NOT NULL DEFAULT '1' COMMENT '状态,1:发布;0:不发布',
    PRIMARY KEY (`id`) USING BTREE,
    INDEX `term_taxonomy_id` (`category_id`) USING BTREE,
    INDEX `index_category_id` (`category_id`) USING BTREE,
    INDEX `index_post_id` (`post_id`) USING BTREE
)
COMMENT='portal应用 分类文章对应表'
COLLATE='utf8_general_ci'
ENGINE=InnoDB
AUTO_INCREMENT=1
;

CREATE TABLE `cmf_portal_post` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `parent_id` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT '父级id',
    `post_type` TINYINT(3) UNSIGNED NOT NULL DEFAULT '1' COMMENT '类型,1:文章;2:页面',
    `post_format` TINYINT(3) UNSIGNED NOT NULL DEFAULT '1' COMMENT '内容格式;1:html;2:md',
    `user_id` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT '发表者用户id',
    `post_status` TINYINT(3) UNSIGNED NOT NULL DEFAULT '1' COMMENT '状态;1:已发布;0:未发布;',
    `comment_status` TINYINT(3) UNSIGNED NOT NULL DEFAULT '1' COMMENT '评论状态;1:允许;0:不允许',
    `is_top` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0' COMMENT '是否置顶;1:置顶;0:不置顶',
    `recommended` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0' COMMENT '是否推荐;1:推荐;0:不推荐',
    `post_hits` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT '查看数',
    `post_favorites` INT(10) UNSIGNED NOT NULL DEFAULT '0' COMMENT '收藏数',
    `post_like` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT '点赞数',
    `comment_count` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT '评论数',
    `create_time` INT(10) UNSIGNED NOT NULL DEFAULT '0' COMMENT '创建时间',
    `update_time` INT(10) UNSIGNED NOT NULL DEFAULT '0' COMMENT '更新时间',
    `published_time` INT(10) UNSIGNED NOT NULL DEFAULT '0' COMMENT '发布时间',
    `delete_time` INT(10) UNSIGNED NOT NULL DEFAULT '0' COMMENT '删除时间',
    `post_title` VARCHAR(100) NOT NULL DEFAULT '' COMMENT 'post标题' COLLATE 'utf8_unicode_ci',
    `post_keywords` VARCHAR(150) NOT NULL DEFAULT '' COMMENT 'seo keywords' COLLATE 'utf8_unicode_ci',
    `post_excerpt` VARCHAR(500) NOT NULL DEFAULT '' COMMENT 'post摘要' COLLATE 'utf8_unicode_ci',
    `post_source` VARCHAR(150) NOT NULL DEFAULT '' COMMENT '转载文章的来源' COLLATE 'utf8_unicode_ci',
    `thumbnail` VARCHAR(100) NOT NULL DEFAULT '' COMMENT '缩略图' COLLATE 'utf8_unicode_ci',
    `post_content` TEXT(65535) NULL DEFAULT NULL COMMENT '文章内容' COLLATE 'utf8_unicode_ci',
    `post_content_filtered` TEXT(65535) NULL DEFAULT NULL COMMENT '处理过的文章内容' COLLATE 'utf8_unicode_ci',
    `more` TEXT(65535) NULL DEFAULT NULL COMMENT '扩展属性,如缩略图;格式为json' COLLATE 'utf8_unicode_ci',
    PRIMARY KEY (`id`) USING BTREE,
    INDEX `type_status_date` (`post_type`, `post_status`, `create_time`, `id`) USING BTREE,
    INDEX `parent_id` (`parent_id`) USING BTREE,
    INDEX `user_id` (`user_id`) USING BTREE,
    INDEX `create_time` (`create_time`) USING BTREE,
    INDEX `index_create_time` (`create_time`) USING BTREE,
    INDEX `index_update_time` (`update_time`) USING BTREE,
    INDEX `index_published_time` (`published_time`) USING BTREE,
    INDEX `index_post_status` (`post_status`) USING BTREE,
    INDEX `index_post_type` (`post_type`) USING BTREE,
    INDEX `index_delete_time` (`delete_time`) USING BTREE
)
COMMENT='portal应用 文章表'
COLLATE='utf8_unicode_ci'
ENGINE=InnoDB
ROW_FORMAT=COMPACT
AUTO_INCREMENT=1
;

从上面的表结构可以看出，cmf_portal_post表没有保存category_id，因此查询条件中含有category_id字段的时候，就需要关联cmf_portal_category_post表进行关联查询，下面我们给出一个查询语句。

explain SELECT post.* FROM `cmf_portal_post` `post` left JOIN `cmf_portal_category_post` `category_post` ON `post`.`id`=`category_post`.`post_id` WHERE  `post`.`post_status` = 1 AND `post`.`post_type` = 1 AND `post`.`delete_time` = 0 AND `category_post`.`category_id` IN ('18','30','31','32','34') AND ( `post`.`is_top` = 0 )  ORDER BY `post`.`published_time` DESC LIMIT 3;

查询结果如下：

cmf_portal_post

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	post	index_merge	PRIMARY,type_status_date,index_post_status,index_post_type,index_delete_time,id_published_time	index_post_status,index_post_type,index_delete_time	1,1,4	\N	2993	Using intersect(index_post_status,index_post_type,index_delete_time); Using where; Using filesort
1	SIMPLE	category_post	ref	term_taxonomy_id,index_category_id,index_post_id	index_post_id	8	furniture_80shih.post.id	1	Using where

我们从结果中可以看出，查询post的时候，使用了Using filesort,也就是使用了文件排序，我们知道文件排序是操作硬盘的，因此效率非常低，笔者在只有3万多条数据的时候，查询执行时间竟然用了0.7秒，这是非常高的时间消耗。为什么我们明明在published_time创建了索引，但是没有起作用呢？

这是因为order by的原理决定的。

因为我们查询的结果字段不仅仅包含published_time字段，还包含其它字段，因此单独的索引查询之后，我们还需要返回回表查询其它字段。

更据查询字段，我们可以建立联合索引

create INDEX `index_news` (`post_type`, `post_status`, `published_time`, `delete_time`, `post_title`, `post_excerpt`(255))

创建这样的索引之后，我们再执行sql语句，就会发现结果如下：

cmf_portal_post

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	post	range	PRIMARY,type_status_date,index_post_status,index_post_type,index_delete_time,index_news	index_news	2	\N	11975	Using index condition; Using where
1	SIMPLE	category_post	ref	term_taxonomy_id,index_category_id,index_post_id	index_post_id	8	furniture_80shih.post.id	1	Using where

从结果中可以看到经过这样的修改，using filesort已经不见了。

索引类型

mysql 有多种索引类型，下面我们简单介绍一下。

聚集索引，顾名思义，索引是聚集的，就是在物理结构上是连续的，类似于我们使用汉语字典通过拼音查询。mysql中的主键就是聚集索引，当然它也是唯一的。
非聚集索引，物理结果上是不连续的，类似于我们使用汉语字典通过偏旁部首进行查询。
联合索引，就是多个字段索引，如上面那个案例，当我们返回的字段需要一次返回多个的时候，我们可以根据查询字段创建联合索引。联合索引需要注意的就是where查询时候的条件顺序。比如我们建立了一个name,age,sex的联合索引，那么下面的语句有些是不会使用索引的。

select name,age,sex from test where name = '扎根';-- 会使用索引
select name,age,sex from test where name ='tom' and age = 16; --会使用索引
select name,age,sex from test where age = 16;--不会使用索引
select name,age,sec from test where age = 15 and sex = 1; -- 不会使用索引

从上面的规律我们可以看出，要想使用联合索引，必须从最左侧开始匹配，我们把这个规则叫做最左前缀原则。

全文索引，也叫做倒排索引，也就是我们搜索引擎常用的存储。它存储的是单词在数据库中的位置，和根据内容找单词不同的是，它是根据单词找内容，是相反的，因此叫做倒排。

索引本质

有序的文件查询就会非常快，而乱序的文件查询就会慢，而索引本质就是对文件进行整理排序，当文件有序了，我们查询的时候效率自然就高了。因此排序算法为什么很多公司面试都会问，因为它们不仅基础，而且太重要了。

如果觉得我的文章对您有用，请随意打赏。您的支持将鼓励我继续创作！

评论已关闭!