赞
踩
平时使用Elasticsearch的时候,会在Kibana中使用Query DSL来查询数据.每次要用到Query DSL时都基本忘光了,需要重新在回顾一遍,最近发现Elasticsearch已经支持SQL查询了(6.3版本以后),整理了下一些用法.
Elasticsearch SQL是一个X-Pack组件,它允许针对Elasticsearch实时执行类似SQL的查询.无论使用REST接口,命令行还是JDBC,任何客户端都可以使用SQL对Elasticsearch中的数据进行原生搜索和聚合数据.可以将Elasticsearch SQL看作是一种翻译器,它可以将SQL翻译成Query DSL.
Elasticsearch SQL具有如下特性:
先安装好Elasticsearch和Kibana,这里安装的是7.17.0版本
安装完成后在Kibana中 http://127.0.0.1:5601/app/dev_tools#/console{target=“_blank”}
导入测试数据,数据地址: https://github.com/macrozheng/mall-learning/blob/master/document/json/accounts.json{target=“_blank”}
直接在Kibana的Dev Tools中运行如下命令即可:
POST /account/_bulk {"index":{"_id":"1"}} {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"} {"index":{"_id":"6"}} {"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"} {"index":{"_id":"13"}} {"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA"} {"index":{"_id":"18"}} {"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","employer":"Boink","email":"daleadams@boink.com","city":"Orick","state":"MD"} {"index":{"_id":"20"}} {"account_number":20,"balance":16418,"firstname":"Elinor","lastname":"Ratliff","age":36,"gender":"M","address":"282 Kings Place","employer":"Scentric","email":"elinorratliff@scentric.com","city":"Ribera","state":"WA"} {"index":{"_id":"25"}} {"account_number":25,"balance":40540,"firstname":"Virginia","lastname":"Ayala","age":39,"gender":"F","address":"171 Putnam Avenue","employer":"Filodyne","email":"virginiaayala@filodyne.com","city":"Nicholson","state":"PA"} {"index":{"_id":"32"}} {"account_number":32,"balance":48086,"firstname":"Dillard","lastname":"Mcpherson","age":34,"gender":"F","address":"702 Quentin Street","employer":"Quailcom","email":"dillardmcpherson@quailcom.com","city":"Veguita","state":"IN"} {"index":{"_id":"37"}} {"account_number":37,"balance":18612,"firstname":"Mcgee","lastname":"Mooney","age":39,"gender":"M","address":"826 Fillmore Place","employer":"Reversus","email":"mcgeemooney@reversus.com","city":"Tooleville","state":"OK"} {"index":{"_id":"44"}} {"account_number":44,"balance":34487,"firstname":"Aurelia","lastname":"Harding","age":37,"gender":"M","address":"502 Baycliff Terrace","employer":"Orbalix","email":"aureliaharding@orbalix.com","city":"Yardville","state":"DE"} {"index":{"_id":"49"}} {"account_number":49,"balance":29104,"firstname":"Fulton","lastname":"Holt","age":23,"gender":"F","address":"451 Humboldt Street","employer":"Anocha","email":"fultonholt@anocha.com","city":"Sunriver","state":"RI"}
我们使用SQL来查询下前10条记录,可以通过format参数控制返回结果的格式,txt表示文本格式,看起来更直观点,默认为json格式.
在Kibana的Dev Tools中输入如下命令:
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance FROM account LIMIT 10"
}
查询结果显示如下.
account_number | address | age | balance
---------------+--------------------+---------------+---------------
1 |880 Holmes Lane |32 |39225
6 |671 Bristol Street |36 |5686
13 |789 Madison Street |28 |32838
18 |467 Hutchinson Court|33 |4180
20 |282 Kings Place |36 |16418
25 |171 Putnam Avenue |39 |40540
32 |702 Quentin Street |34 |48086
37 |826 Fillmore Place |39 |18612
44 |502 Baycliff Terrace|37 |34487
49 |451 Humboldt Street |23 |29104
如上实例,使用 _sql
指明使用SQL模块,在 query
字段中指定要执行的SQL语句.使用 format
指定返回数据的格式,数据格式可选项有以下几个,它们都是见名识意的:
format | Accept Http header | 说明 |
---|---|---|
csv | text/csv | 逗号分隔 |
json | application/json | Json 格式 |
tsv | text/tab-separated-values | tab 分隔 |
txt | text/plain | 文本格式 |
yaml | application/yaml | yaml |
cbor | application/cbor | 简洁的二进制对象表示格式 |
smile | application/smile | 类似于 cbor 的另一种二进制格式 |
当我们需要使用Query DSL时,也可以先使用SQL来查询,然后通过Translate API转换即可.
例如我们翻译以下查询语句:
POST /_sql/translate
{
"query": "SELECT account_number,address,age,balance FROM account WHERE age>32 LIMIT 10"
}
最终获取到Query DSL结果如下.
{ "size" : 10, "query" : { "range" : { "age" : { "from" : 32, "to" : null, "include_lower" : false, "include_upper" : false, "boost" : 1.0 } } }, "_source" : false, "fields" : [ { "field" : "account_number" }, { "field" : "address" }, { "field" : "age" }, { "field" : "balance" } ], "sort" : [ { "_doc" : { "order" : "asc" } } ] }
然后可以用Query DSL 语法来查询:
GET /account/_search { "size": 10, "query": { "range": { "age": { "from": 32, "to": null, "include_lower": false, "include_upper": false, "boost": 1 } } }, "_source": false, "fields": [ { "field": "account_number" }, { "field": "address" }, { "field": "age" }, { "field": "balance" } ], "sort": [ { "_doc": { "order": "asc" } } ] }
我们还可以将SQL和Query DSL混合使用,比如使用Query DSL来设置过滤条件.
例如查询 age在30-35
之间的记录,可以使用如下查询语句:
POST /_sql?format=txt
{
"query": "SELECT account_number,address,age,balance FROM account",
"filter": {
"range": {
"age": {
"gte": 30,
"lte": 35
}
}
},
"fetch_size": 10
}
虽然 SQL 和 Elasticsearch 对于数据的组织方式(以及不同的语义)有不同的术语,但本质上它们的用途是相同的.下面是它们的映射关系表:
SQL | Elasticsearch | 说明 |
---|---|---|
column | field | 在 Elasticsearch 字段时,SQL 将这样的条目调用为 column.注意,在 Elasticsearch,一个字段可以包含同一类型的多个值(本质上是一个列表) ,而在 SQL 中,一个列可以只包含一个表示类型的值.Elasticsearch SQL 将尽最大努力保留 SQL 语义,并根据查询的不同,拒绝那些返回多个值的字段. |
row | document | 列和字段本身不存在; 它们是行或文档的一部分.两者的语义略有不同: 行row往往是严格的(并且有更多的强制执行),而文档往往更灵活或更松散(同时仍然具有结构). |
table | index | 在 SQL 还是 Elasticsearch 中查询针对的目标 |
schema | implicit | 在关系型数据库中,schema 主要是表的名称空间,通常用作安全边界.Elasticsearch没有为它提供一个等价的概念. |
虽然这些概念之间的映射在语义上有些不同,但它们间更多的是有共同点,而不是不同点.
ES SQL 的词法结构很大程度上类似于 ANSI SQL 本身.ES SQL 当前一次只能接受一个命令,这里的命令是由输入流结尾结束的 token 序列.这些 token 可以是关键字,标识符(带引号或者不带引号),文本(或者常量),特殊字符符号(通常是分隔符).
关键词这个其实跟我们写 SQL 语句那种关键字的定义是一样的,例如 SELECT,FROM 等都是关键字,需要注意的是,关键字不区分大小写.
SELECT * FROM my_table
如上示例,共有 4 个 token:SELECT, * ,FROM ,my_table
,其中 SELECT,* ,FROM
是关键词,表示在 SQL 具有固定含义的词.而 my_table
是一个标识符,其表示了 SQL 中实体,如表,列等
标识符有两种类型:带引号的和不带引号的,示例如下:
SELECT ip_address FROM "hosts-*"
如上示例,查询中有两个标识符分别为不带引号的 ip_address
和带引号的 hosts-*
(通配符模式).
因为 ip_address
不与任何关键字冲突,所以可以不带引号.而 hosts-*
与 -
(减号操作)和 *
冲突,所以要加引号.
声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/小桥流水78/article/detail/919313
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。