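Elasticsearch data is usually queried with the Query DSL, but since version 6.3 Elasticsearch has also supported SQL queries. Elasticsearch SQL is an X-Pack component that runs SQL-like queries against Elasticsearch in real time. Whether through the REST interface, the command line, or JDBC, any client can use SQL to natively search and aggregate data in Elasticsearch. You can think of Elasticsearch SQL as a translator that turns SQL into Query DSL.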
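Native integration: Elasticsearch SQL is built from the ground up for Elasticsearch. Every query is executed efficiently against the relevant nodes according to the underlying storage.
No external parts: no extra hardware, processes, runtimes, or libraries are needed to query Elasticsearch; Elasticsearch SQL removes those moving parts by running inside Elasticsearch itself.
Lightweight and efficient: Elasticsearch SQL does not abstract away Elasticsearch's search capabilities. Instead, it embraces them and exposes full-text search through SQL, in real time and in a concise form.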
- PUT my-sql-index/_bulk?refresh
- {"index":{"_id": "JAVA"}}
- {"name": "JAVA", "author": "zhangsan", "release_date": "2022-08-10","page_count": 561}
- {"index":{"_id": "BIGDATA"}}
- {"name": "BIGDATA", "author": "lisi", "release_date": "2022-08-11", "page_count": 482}
- {"index":{"_id": "SCALA"}}
- {"name": "SCALA", "author": "wangwu", "release_date": "2022-08-12", "page_count": 604}
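- # SQL
- # A table in SQL corresponds to an index
- # The format parameter controls the response format; the default is json
- # txt: plain-text table, easier to read at a glance
- # csv: comma-separated values
- # json: JSON data
- # tsv: tab-separated values
- # yaml: YAML data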
- POST _sql?format=txt
- {
- "query": """
- SELECT * FROM "my-sql-index"
- """
- }
- # Query with a condition
- POST _sql?format=txt
- {
- "query": """
- SELECT * FROM "my-sql-index" where page_count > 500
- """
- }
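The same request can be sent from any HTTP client. Below is a minimal Python sketch, using only the standard library, of how the `format` parameter and the `query` body fit together. The base URL `http://localhost:9200` and the helper names `build_sql_request` and `run_sql` are assumptions made for illustration (a local, unsecured development node), not part of Elasticsearch.

```python
import json
import urllib.parse
import urllib.request

# The response formats listed in the comments above.
SUPPORTED_FORMATS = {"txt", "csv", "json", "tsv", "yaml"}


def build_sql_request(query, fmt="json", base_url="http://localhost:9200"):
    """Return (url, body) for a POST to the _sql endpoint.

    base_url is an assumption: a local, unsecured development node.
    """
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    url = f"{base_url}/_sql?{urllib.parse.urlencode({'format': fmt})}"
    body = json.dumps({"query": query}).encode("utf-8")
    return url, body


def run_sql(query, fmt="txt", base_url="http://localhost:9200"):
    """Send the query and return the response text (needs a running cluster)."""
    url, body = build_sql_request(query, fmt, base_url)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


url, body = build_sql_request(
    'SELECT * FROM "my-sql-index" where page_count > 500', fmt="txt"
)
print(url)
print(body.decode("utf-8"))
```

Only `build_sql_request` runs here; calling `run_sql` requires a reachable cluster, and a secured cluster would also need authentication headers.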
When you need the Query DSL, you can write the query in SQL first and convert it with the Translate API; the response is the equivalent query in DSL form.
- # Translate SQL into DSL
- POST _sql/translate
- {
- "query": """
- SELECT * FROM "my-sql-index" where page_count > 500
- """
- }
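If SQL alone still cannot express the query you need, you can combine SQL with DSL: the Query DSL passed in the filter parameter restricts the documents that the SQL statement runs on, so the result contains only rows that satisfy both.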
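- # Mixing SQL and DSL
- # The index name contains hyphens, so it must be double-quoted when used as a table name, and the whole query is wrapped in triple quotes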
- POST _sql?format=txt
- {
- "query": """SELECT * FROM "my-sql-index" """,
- "filter" : {
- "range": {
- "page_count": {
- "gte": 400,
- "lte": 600
- }
- }
- },
- "fetch_size": 2
- }
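To make the effect of combining the two concrete, here is a plain-Python emulation on the three sample documents. It assumes the `filter` narrows the set of documents the SQL statement sees and that `fetch_size` caps the first page; the helper names `range_filter` and `select_all` are hypothetical, and this is not how Elasticsearch executes the request internally.

```python
# Sample documents copied from the bulk request earlier in the article.
BOOKS = [
    {"name": "JAVA", "author": "zhangsan", "release_date": "2022-08-10", "page_count": 561},
    {"name": "BIGDATA", "author": "lisi", "release_date": "2022-08-11", "page_count": 482},
    {"name": "SCALA", "author": "wangwu", "release_date": "2022-08-12", "page_count": 604},
]


def range_filter(docs, field, gte, lte):
    """Keep documents whose field lies in [gte, lte], like a DSL range filter."""
    return [doc for doc in docs if gte <= doc[field] <= lte]


def select_all(docs, fetch_size):
    """Return the first page of a SELECT * with the given fetch_size."""
    return docs[:fetch_size]


candidates = range_filter(BOOKS, "page_count", 400, 600)
first_page = select_all(candidates, fetch_size=2)
print(sorted(doc["name"] for doc in first_page))  # ['BIGDATA', 'JAVA']
```

SCALA (604 pages) falls outside the range, so only two documents remain and they fit in a single page of size 2. Without an ORDER BY, the row order in a real response is not guaranteed, which is why the names are sorted before printing.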
- GET _sql?format=txt
- {
- "query": """
- show tables
- """
- }
- GET _sql?format=txt
- {
- "query": """
- show tables like 'my-sql-index'
- """
- }
- GET _sql?format=txt
- {
- "query": """
- show tables like 'my-%'
- """
- }
- # Create an index first
- PUT myindex
- {
- "mappings":{
- "properties":{
- "sku_id":{
- "type":"long"
- },
- "sku_name":{
- "type":"text"
- },
- "sku_url":{
- "type":"keyword"
- }
- }
- }
- }

- GET _sql?format=txt
- {
- "query": """
- describe myindex
- """
- }
The SQL query syntax in ES is largely the same as in a relational database.
- # Filter with a condition
- POST _sql?format=txt
- {
- "query": """ SELECT * FROM "my-sql-index" where name = 'JAVA' """
- }
- # Group by date
- GET _sql?format=txt
- {
- "query": """
- SELECT release_date FROM "my-sql-index" group by release_date
- """
- }
- # Filter the grouped data
- GET _sql?format=txt
- {
- "query": """
- SELECT sum(page_count), release_date as datacnt FROM "my-sql-index" group by release_date having sum(page_count) > 1000
- """
- }
- # Sort by page count (descending)
- GET _sql?format=txt
- {
- "query": """
- select * from "my-sql-index" order by page_count desc
- """
- }
- # Limit the number of rows returned
- GET _sql?format=txt
- {
- "query": """
- select * from "my-sql-index" limit 3
- """
- }
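The GROUP BY ... HAVING query above is worth checking by hand against the sample data. The following Python sketch emulates it in memory (the helper name `sum_pages_by_date` is hypothetical): each release date holds exactly one book, so no group reaches a total above 1000 and the query returns no rows for this data set.

```python
from collections import defaultdict

# Sample rows copied from the bulk request at the start of the article.
BOOKS = [
    {"name": "JAVA", "release_date": "2022-08-10", "page_count": 561},
    {"name": "BIGDATA", "release_date": "2022-08-11", "page_count": 482},
    {"name": "SCALA", "release_date": "2022-08-12", "page_count": 604},
]


def sum_pages_by_date(rows, having_min):
    """Emulate GROUP BY release_date HAVING sum(page_count) > having_min."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["release_date"]] += row["page_count"]
    # HAVING is applied to the per-group totals, after grouping.
    return sorted(
        (date, total) for date, total in totals.items() if total > having_min
    )


print(sum_pages_by_date(BOOKS, 1000))  # []
print(sum_pages_by_date(BOOKS, 500))   # [('2022-08-10', 561), ('2022-08-12', 604)]
```

Lowering the threshold to 500 keeps two of the three groups, which is a quick way to confirm that the HAVING clause filters groups rather than individual rows.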
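A cursor is a data buffer that the system opens for the user to hold the result of a SQL statement. Each cursor area has a name, and the user can fetch records from it one at a time with SQL statements, assign them to host variables, and hand them to the host language for further processing. In essence, a cursor is a mechanism for retrieving one or more records at a time from a result set that contains many records.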
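- # Query data
- # The result set is large but only a few rows are fetched at a time, so to improve efficiency the remaining rows are kept in a temporary buffer
- # The response format here is json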
- POST _sql?format=json
- {
- "query": """ SELECT * FROM "my-sql-index" order by page_count desc """,
- "fetch_size": 2
- }
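The cursor in the response identifies that buffer, which means the following rows can be fetched directly from it. It works much like an iterator and can be called repeatedly.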
- # The cursor value here is the one returned by the previous response
- POST /_sql?format=json
- {
- "cursor": "v5HqA0RGTACEkd9OwjAUxnvmQgwx8RF8BVG44IKLDew0YRCkUF1MljI6NigtrOVPeCIfwPfTbUDEK7+Lnu80PV+T34EAQYIsQIW+c92WDuVXUI1TLqahVpmprtiMh5HaSIMqYZxm2gAgsIWSs+N7+IIrZFm5KY4y4eNkUFHBrrCNSVSGLFuyJUfWZaZ1k3HBmebhlBkOd9pkaWTKJlQrkyrJRGjSJQ8lk0pb8AnbxvPa2T35k7eFiR6x7lKxCbxxHS/EfEh7a9pJ1NhrdklnqiORKCKSve9FDhaiPhnt7vsjHI/mbi2Yu+3goUEGFMdDGjlntZs+DQ4v+76HiX94JUwK/E5XPb/mpl0SkMHAcf/7y3FaLVQ9crUhPhEA2/C9yZHEJYnf9oLIGS3Ef8lcF0gKEPl4vqofAAAA//8DAA=="
- }
If a request returns no rows, all the data has been read.
Running it again at that point returns an error.
To close the buffer, run the following:
- POST _sql/close
- {
- "cursor": "v5HqA0RGTACEkUtuwjAQhj1phCpUqUfoFUoLCxYsEmjSSgREMbjNJjLBJgFjQ2we4kQ9QE/Ui7RJAJWu+i9G/4zmIX0DIYIEWYAKfee6LR3KS1DlKRPTSKvMVFd0xqJYbaRBlYinmTYACGyh5OzYD59whSwrN0UoK18ng4rtYFfoxiQqQ5Yt6ZIh63KndZMxwahm0ZQaBnfaZGlsyiRSK5MqSUVk0iWLJJVKW/AB28bz2tk9BZO3hYkfPd0lYhP647q3EPMh6a1JJ1Fjv9nFnamORaKwSPaBHzueEPXJaHffH3l8NHdr4dxthw8NPCAeH5LYOavdDEh4eNn3fQ8Hh1dMpfDeyaoX1Ny0i0M8GDjuf7ccp9VC1SNXG/iJANiG7U2OhJckftMLIme0wP+SuS6QFCDy8fxVPwAAAP//AwA="
- }
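The paging steps above can be sketched as a small loop. In this Python sketch, `send` is a hypothetical callable that posts a body to `_sql?format=json` and returns the parsed response; `make_fake_send` is an in-memory stand-in so the loop can run without a cluster. It assumes that a response carries a `cursor` field only while more pages remain.

```python
def fetch_all_rows(send, query, fetch_size):
    """Collect every row of a SQL query by following the cursor.

    `send` is a hypothetical callable: it takes a request body (dict)
    and returns the parsed JSON response (dict).
    """
    response = send({"query": query, "fetch_size": fetch_size})
    rows = list(response.get("rows", []))
    # Assumption: a "cursor" is present only while more pages remain.
    while "cursor" in response:
        response = send({"cursor": response["cursor"]})
        rows.extend(response.get("rows", []))
    return rows


def make_fake_send(all_rows):
    """In-memory stand-in for the cluster, used only for this demo."""
    state = {"page_size": None}

    def send(body):
        if "query" in body:
            state["page_size"] = body["fetch_size"]
            start = 0
        else:
            start = int(body["cursor"])
        end = start + state["page_size"]
        response = {"rows": all_rows[start:end]}
        if end < len(all_rows):
            response["cursor"] = str(end)
        return response

    return send


sample = [["SCALA", 604], ["JAVA", 561], ["BIGDATA", 482]]
rows = fetch_all_rows(
    make_fake_send(sample),
    'SELECT name, page_count FROM "my-sql-index" ORDER BY page_count DESC',
    fetch_size=2,
)
print(rows)  # [['SCALA', 604], ['JAVA', 561], ['BIGDATA', 482]]
```

Real cursors are opaque strings like the ones shown above, not offsets; the fake transport uses an offset only to keep the demo short. If a client stops before the last page, it should still close the cursor as shown in the preceding request.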
- GET _sql?format=txt
- {
- "query": """
- SELECT
- MIN(page_count) min,
- MAX(page_count) max,
- AVG(page_count) avg,
- SUM(page_count) sum,
- COUNT(*) count,
- COUNT(DISTINCT name) distinct_count
- FROM "my-sql-index"
- """
- }
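As a cross-check of what these aggregate functions compute, the same figures can be worked out in plain Python from the three sample documents. This is only an arithmetic illustration; the exact column types and formatting in the Elasticsearch response may differ.

```python
# page_count per book name, copied from the sample bulk request.
PAGE_COUNTS = {"JAVA": 561, "BIGDATA": 482, "SCALA": 604}

values = list(PAGE_COUNTS.values())
summary = {
    "min": min(values),
    "max": max(values),
    "avg": sum(values) / len(values),
    "sum": sum(values),
    "count": len(values),
    "distinct_count": len(set(PAGE_COUNTS)),  # distinct book names
}
print(summary)
```

For this data the minimum is 482, the maximum 604, the sum 1647, and the average 549.0, with three rows and three distinct names.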
- -- Equality
- SELECT * FROM "my-sql-index" WHERE name = 'JAVA'
-
- -- Null Safe Equality
- SELECT 'elastic' <=> null AS "equals"
- SELECT null <=> null AS "equals"
-
- -- Inequality
- SELECT * FROM "my-sql-index" WHERE name <> 'JAVA'
- SELECT * FROM "my-sql-index" WHERE name != 'JAVA'
-
- -- Comparison
- SELECT * FROM "my-sql-index" WHERE page_count > 500
- SELECT * FROM "my-sql-index" WHERE page_count >= 500
- SELECT * FROM "my-sql-index" WHERE page_count < 500
- SELECT * FROM "my-sql-index" WHERE page_count <= 500
-
- -- BETWEEN
- SELECT * FROM "my-sql-index" WHERE page_count between 100 and 500
-
- -- Is Null / Is Not Null
- SELECT * FROM "my-sql-index" WHERE name is not null
- SELECT * FROM "my-sql-index" WHERE name is null
-
- -- IN
- SELECT * FROM "my-sql-index" WHERE name in ('JAVA', 'SCALA')

- -- AND
- SELECT * FROM "my-sql-index" WHERE name = 'JAVA' AND page_count > 100
-
- -- OR
- SELECT * FROM "my-sql-index" WHERE name = 'JAVA' OR name = 'SCALA'
-
- -- NOT
- SELECT * FROM "my-sql-index" WHERE NOT name = 'JAVA'
- -- Arithmetic: add, subtract, negate, multiply, divide, modulo
- select 1 + 1 as x
- select 1 - 1 as x
- select - 1 as x
- select 6 * 6 as x
- select 30 / 5 as x
- select 30 % 7 as x
SELECT '123'::long AS long
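- -- LIKE wildcards: % matches any sequence of characters, _ matches exactly one character
- SELECT * FROM "my-sql-index" WHERE name like 'JAVA%'
- SELECT * FROM "my-sql-index" WHERE name like 'JAVA_'
-
- -- To match a wildcard character literally, use an escape character
- SELECT * FROM "my-sql-index" WHERE name like 'JAVA/%' ESCAPE '/'
-
- -- RLIKE: the R stands for regular expression (regex), not for a direction such as "right"
- SELECT * FROM "my-sql-index" WHERE name like 'JAV*A'
- SELECT * FROM "my-sql-index" WHERE name rlike 'JAV*A'
-
- -- Although LIKE is a valid option for searching or filtering in Elasticsearch SQL, the full-text predicates MATCH and QUERY are faster and more powerful, and are the preferred alternative.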
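The difference between the two predicates is easiest to see by translating a LIKE pattern into a regular expression. The Python sketch below is an emulation for illustration only: `like_to_regex` and `like` are hypothetical helpers, and Elasticsearch evaluates RLIKE with its own regular expression syntax, which is not identical to Python's `re` module even though the simple pattern used here behaves the same way.

```python
import re


def like_to_regex(pattern, escape=None):
    """Translate a SQL LIKE pattern into a compiled Python regex."""
    parts = []
    i = 0
    while i < len(pattern):
        char = pattern[i]
        if escape is not None and char == escape and i + 1 < len(pattern):
            # The character after the escape is matched literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if char == "%":
            parts.append(".*")  # any sequence of characters, including none
        elif char == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(char))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern, escape=None):
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern, escape).fullmatch(value) is not None


print(like("JAVA", "JAVA%"))                # True: % may match nothing
print(like("JAVA", "JAVA_"))                # False: _ needs one more character
print(like("JAVA%", "JAVA/%", escape="/"))  # True: escaped % is literal
print(like("JAVA", "JAV*A"))                # False: * is not a LIKE wildcard
print(re.fullmatch("JAV*A", "JAVA") is not None)  # True: regex V* allows repeats of V
```

With the sample data, this suggests why `like 'JAV*A'` finds nothing while `rlike 'JAV*A'` matches the JAVA document.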
- -- FIRST / FIRST_VALUE: FIRST(field, ordering field)
- SELECT first(name, release_date) FROM "my-sql-index"
- SELECT first_value(substring(name,2,1)) FROM "my-sql-index"
-
- -- LAST / LAST_VALUE: LAST(field, ordering field)
- SELECT last(name, release_date) FROM "my-sql-index"
- SELECT last_value(substring(name,2,1)) FROM "my-sql-index"
-
- -- KURTOSIS: quantifies the shape (peakedness) of the field's value distribution
- SELECT KURTOSIS(page_count) FROM "my-sql-index"
-
- -- MAD: median absolute deviation, a measure of how spread out the values are
- SELECT MAD(page_count) FROM "my-sql-index"
- -- HISTOGRAM: buckets values into fixed-width intervals
- SELECT HISTOGRAM(page_count, 100) as c,count(*) FROM "my-sql-index" group by c
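To illustrate the bucketing, here is a small Python emulation of a histogram with an interval of 100 over the sample page counts. It assumes each bucket is labelled by its lower bound (the value rounded down to a multiple of the interval); the helper name `histogram` is hypothetical.

```python
from collections import Counter

# page_count values from the three sample documents.
PAGE_COUNTS = [561, 482, 604]


def histogram(values, interval):
    """Count values per bucket; each bucket is labelled by its lower bound."""
    buckets = Counter((value // interval) * interval for value in values)
    return sorted(buckets.items())


print(histogram(PAGE_COUNTS, 100))  # [(400, 1), (500, 1), (600, 1)]
```

Each sample book lands in its own bucket here; a wider interval such as 200 would merge 482 and 561 into the bucket starting at 400.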
- -- ABS: absolute value of a number
- select ABS(page_count) from "my-sql-index" limit 5
-
- -- CBRT: cube root of a number, returns a double
- select page_count v,CBRT(page_count) cbrt from "my-sql-index" limit 5
-
- -- CEIL: smallest integer greater than or equal to the given expression
- select page_count v,CEIL(page_count) from "my-sql-index" limit 5
-
- -- CEILING: same as CEIL
- select page_count v,CEILING(page_count) from "my-sql-index" limit 5
-
- -- E: returns Euler's number e (2.718281828459045); it takes no argument
- select page_count,E() from "my-sql-index" limit 5
-
- -- ROUND: rounds to the nearest integer when no precision is given
- select ROUND(-3.14)
-
- -- FLOOR: rounds down to the nearest integer
- select FLOOR(3.14)
-
- -- LOG: natural logarithm (base e)
- select LOG(4)
-
- -- LOG10: base-10 logarithm
- select LOG10(100)
-
- -- SQRT: square root of a non-negative number
- select SQRT(9)
-
- -- EXP: e (the base of the natural logarithm) raised to the power of X
- select EXP(3)
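The constant-argument examples can be cross-checked with Python's `math` module. In Elasticsearch SQL, LOG is the natural logarithm and LOG10 the base-10 logarithm, which matches `math.log` and `math.log10`. This is a sketch of the underlying arithmetic, not of the exact types or formatting Elasticsearch returns; Python's `round` uses round-half-to-even on exact ties, so it is only a stand-in here, where no tie occurs.

```python
import math

# Each key is the SQL expression, each value its Python counterpart.
checks = {
    "ROUND(-3.14)": round(-3.14),
    "FLOOR(3.14)": math.floor(3.14),
    "LOG(4)": math.log(4),          # natural logarithm
    "LOG10(100)": math.log10(100),  # base-10 logarithm
    "SQRT(9)": math.sqrt(9),
    "EXP(3)": math.exp(3),
    "E()": math.e,
}

for expression, value in checks.items():
    print(f"{expression} = {value}")
```

The natural logarithm of 4 is about 1.386, not 2, which is a quick way to tell LOG apart from a base-2 logarithm.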
