当前位置:   article > 正文

ElasticSearch+SpringBoot聚合、分组、排序、分页、10000条查询限制及遇到的问题_elasticsearch 分组排序

elasticsearch 分组排序

一、SQL转换

Select

From

Where and or not

Group by

Having

Order by

1、单条件查询

Select from book where name = “小明”
  • 1

精确匹配

QueryBuilders.matchQuery("name",”小明”);
  • 1

(1)模糊查询

1.常用的字符串查询
 Select  from book where name like “%小%”

QueryBuilders.queryStringQuery("fieldValue").field("fieldName");//左右模糊
  • 1
  • 2
  • 3
2.前缀查询 如果字段没分词,就匹配整个字段前缀
QueryBuilders.prefixQuery("fieldName","fieldValue");
  • 1

(2)多字段模糊查询

String[] fieldName = new String[2];

fieldName[0] = "明";

fieldName[1] = "年";

QueryBuilders.moreLikeThisQuery(fieldName);
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

(3)wildcard query

通配符查询,支持* 任意字符串;?任意一个字符

QueryBuilders.wildcardQuery("fieldName","ctr*");//前面是fieldname,后面是带匹配字符的字符串

QueryBuilders.wildcardQuery("fieldName","c?r?");
  • 1
  • 2
  • 3

2、多条件查询

1、and、or、not

组合查询BoolQueryBuilder

  must(QueryBuilders)  :AND

  mustNot(QueryBuilders) :NOT

  should:        :OR
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

例子:

Select * from book where author = “年轻” or count =1

builder.must(QueryBuilders.queryStringQuery("年轻").field("author"));

 builder.should(QueryBuilders.matchQuery("count",1));
  • 1
  • 2
  • 3
  • 4
  • 5

2、排序查询

1、排序查询

Select * from book order by count asc

FieldSortBuilder sort = SortBuilders.fieldSort("count").order(SortOrder.ASC);
  • 1
  • 2
  • 3

3、一个字段匹配多个值

builder.must(QueryBuilders.termsQuery("desc", "年","书"));
  • 1

4、一个值匹配多个字段

 builder.must(QueryBuilders.multiMatchQuery("小", "author", "desc"));
  • 1

5、范围查询


闭区间查询

QueryBuilder queryBuilder0 = QueryBuilders.rangeQuery("fieldName").from("fieldValue1").to("fieldValue2");
  • 1

开区间查询

QueryBuilder queryBuilder1 = QueryBuilders.rangeQuery("fieldName").from("fieldValue1").to("fieldValue2").includeUpper(false).includeLower(false);//默认是true,也就是包含
  • 1

大于

QueryBuilder queryBuilder2 = QueryBuilders.rangeQuery("fieldName").gt("fieldValue");
  • 1

大于等于

QueryBuilder queryBuilder3 = QueryBuilders.rangeQuery("fieldName").gte("fieldValue");
  • 1

小于

QueryBuilder queryBuilder4 = QueryBuilders.rangeQuery(“fieldName”).lt(“fieldValue”);

小于等于

QueryBuilder queryBuilder5 = QueryBuilders.rangeQuery("fieldName").lte("fieldValue");1、
  • 1

聚合查询

桶Buckets:类似于group by分组后的结果放入一个一个的桶中

指标Metrics:类似于min,max、sum筛选计算。

(1)统计某个字段的数量

 ValueCountBuilder vcb=  AggregationBuilders.count("count_uid").field("uid");
  • 1

(2)去重统计某个字段的数量(有少量误差)

 CardinalityBuilder cb= AggregationBuilders.cardinality("distinct_count_uid").field("uid");
  • 1

(3)聚合过滤

FilterAggregationBuilder fab= AggregationBuilders.filter("uid_filter").filter(QueryBuilders.queryStringQuery("uid:001"));
  • 1

(4)按某个字段分组

TermsBuilder tb=  AggregationBuilders.terms("group_name").field("name");
  • 1

(5)求和

SumBuilder  sumBuilder=	AggregationBuilders.sum("sum_price").field("price");
  • 1

(6)求平均

AvgBuilder ab= AggregationBuilders.avg("avg_price").field("price");
  • 1

(7)求最大值

MaxBuilder mb= AggregationBuilders.max("max_price").field("price"); 
  • 1

(8)求最小值

MinBuilder min=	AggregationBuilders.min("min_price").field("price");
  • 1

(9)按日期间隔分组

DateHistogramBuilder dhb= AggregationBuilders.dateHistogram("dh").field("date");
  • 1

(10)获取聚合里面的结果

TopHitsBuilder thb=  AggregationBuilders.topHits("top_result");
  • 1

(11)嵌套的聚合

NestedBuilder nb= AggregationBuilders.nested("negsted_path").path("quests");
  • 1

(12)反转嵌套

AggregationBuilders.reverseNested("res_negsted").path("kps ");
  • 1

获取数据查询结果

  • 第一种方式
 Page<Book> search = bookRepository.search(nativeSearchQuery);

 List<Book> content = search.getContent();
  • 1
  • 2
  • 3
  • 第二种方式

使用封装的对象。

 List<Book> bookList = elasticsearchTemplate.queryForList(nativeSearchQuery, Book.class);

  for (Book b1:bookList){

   System.out.printf(b1.getAuthor());

 }
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

获取聚合函数结果

    Aggregations aggregations = elasticsearchTemplate.query(nativeSearchQuery, new ResultsExtractor<Aggregations>() {

​      @Override

​      public Aggregations extract(SearchResponse searchResponse) {

​        return searchResponse.getAggregations();

​      }

​    });
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11

1、代码样例

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-qn7KVp6Z-1605506616609)(file:///C:\Users\SZ_JHT~1\AppData\Local\Temp\ksohtml15096\wps2.jpg)]

1、布尔查询

BoolQueryBuilder queryBuilder = new BoolQueryBuilder();
  • 1

字段匹配

queryBuilder.must(QueryBuilders.matchPhraseQuery("itemKey", itemKey));
  • 1

范围查询

RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("statDate").from(startDate).to(endDate);
  • 1

类似于 sql语句中 hourkey in (1,2,3,4);

集合查询

if (null != hour) {
  queryBuilder.must(QueryBuilders.termsQuery("hourKey",hour));
}

queryBuilder.must(rangeQueryBuilder);

NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

构建查询对象

nativeSearchQueryBuilder.withIndices("dwl_over_time_event_report_hour") //索引    .withQuery(queryBuilder)				//查询语句   .addAggregation(AggregationBuilders.terms("statDateGroup").field("statDate") //聚合       .subAggregation(AggregationBuilders.sum("over_time_event_num_total").field("overTimeEventNumTotal"))); NativeSearchQuery nativeSearchQuery = nativeSearchQueryBuilder.build(); List<DwlOverTimeEventReportHourEs> list = new ArrayList<>();
  • 1

3、获取聚合结果集

Aggregations aggregations = elasticsearchTemplate.query(nativeSearchQuery, new ResultsExtractor<Aggregations>() { @Override
  public Aggregations extract(SearchResponse searchResponse) {
   return searchResponse.getAggregations();
  }
});
  • 1
  • 2
  • 3
  • 4
  • 5

聚合结果转成map

Map<String, Aggregation> aggregationMap = aggregations.asMap();

//获取聚合后的桶、类似group by 后的结果。

LongTerms hourKeyGroup_list = (LongTerms) aggregationMap.get("statDateGroup");

List<LongTerms.Bucket> buckets1 = hourKeyGroup_list.getBuckets();
for (int i = 0; i < buckets1.size(); i++) {
  Aggregations aggregations1 = buckets1.get(i).getAggregations();
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9

获取分组字段

  String statDate = buckets1.get(i).getKeyAsString();
  Map<String, Aggregation> aggregationMap1 = aggregations1.asMap();
  • 1
  • 2

获取对应类型的聚合值

 InternalSum sum_over_time_event_num_total = (InternalSum) aggregationMap1.get("over_time_event_num_total");
  int value = (int) sum_over_time_event_num_total.getValue();

  DwlOverTimeEventReportHourEs dwlOverTimeEventReportHour = new DwlOverTimeEventReportHourEs();
  dwlOverTimeEventReportHour.setStatDate(DateUtil.parseDate(statDate,"yyyy-MM-dd"));
  dwlOverTimeEventReportHour.setOverTimeEventNumTotal(value);

  list.add(dwlOverTimeEventReportHour);
}
return list; 
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10

6、常见问题

1、匹配的字段:

要转换成与ES中字段类型

例如Es时间类型:yyyy-MM-dd ,我们也要转换成这种类型才能匹配

2、条件查询中范围查询

Ps:注意names是一个字符串数组,获取是一个整形数组的话,你需要范围查询时,一定要在后面加上.keyword,不然无法起到类似于mysql中in的类型。

if (null != names) {
  queryBuilder.must(QueryBuilders.termsQuery("name.keyword",name));
}
  • 1
  • 2
  • 3

3、排序查询:

Ps:排序查询需要注意的是,如果是string类型,不希望分词就要加上.keyword,不然会报错。

Ps:还有一个我认为重要的就是,如果索引建立了,没有数据的情况其中(也就是es中的index没有字段),使用排序查询需要在后面对象字段的类型.unmappedType(“String”)。不然会报错,原因:es中没有对应的字段,也就没有对应字段的类型。

FieldSortBuilder sort = SortBuilders.fieldSort("name.keywrod").order(SortOrder.DESC).unmappedType("string");
  • 1

4、分组后的桶类型

Ps:根据你分组的字段类型进行类型转换,例如,你的分组字段类型是int,flat,double类型的,可以全部转成LongTerms类型(Date日期类型可以也是LongTerms)

如果是其他类型的,例如String转换成StringTerms类型。

LongTerms hourKeyGroup_list = (LongTerms) aggregationMap.get("statDateGroup");
  • 1

5、条件查询和聚合分组查询的结果是分开的

PS:es中这里很难受,不先mysql,一条sql语句就解决了,

Es中条件查询中有聚合分组,结果是聚合分组之前的原始数据。无法得到分组后的结果,

这也就是上面为什么要单独获取聚合分组后的结果,然后进行数据组装了。

PS:分组查询一定要住处一个问题,默认返回10条数据,一定要设置分组后的条数,不认查询的数据不准确。多分terms都需要设置。

.addAggregation(AggregationBuilders.terms("yearmonthKeyGroup").field("yearmonthKey.keyword").size(EsPageConstant.totalPageSize)
  • 1

6、聚合查询时报UnmappedTerms转换错误

出现这种情况是因为,ES索引建立了,但是没有记录查询,也就是没有数据。

解决:

 Aggregation yearmonthKeyMapper = aggregationMap.get("yearmonthKeyGroup");
 if (yearmonthKeyMapper instanceof UnmappedTerms){
 	return list;
 }
  • 1
  • 2
  • 3
  • 4

二、ES索引操作

添加字段并赋值

POST http://10.10.206.34:9200/dwl_high_volatility_report_week/_update_by_query
{
    "script": {
        "float": "painless",
        "inline": "if (ctx._source.intervEventDiff== null) {ctx._source.intervEventDiff= 0}"
    }
}
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

删除字段

POST index/type/_update_by_query
{
"script":{
"lang":"painless",
"inline":"ctx._source.remove(\"dept_name\")"
}
}
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

添加mappping映射

PUT /myindex/_mapping/article
{
  "properties": {
       "new_field_name": {
           "type":  "text"
       }
   }
}
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8

修改mapping字段类型

https://www.cnblogs.com/royfans/p/11436395.html

一、原索引

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "create_date": {
          "type":   "date",
          "format": "yyyy-MM-dd ||yyyy/MM/dd"
           
        }
      }
    }
  }
}
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14

二、创建新索引

    PUT my_index2
    {
      "mappings": {
        "_doc": {
          "properties": {
            "create_date": {
              "type":   "text"

            }
          }
        }
      }
    }
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13

三、同步数据

POST _reindex                   
{
  "source": {
    "index": "my_index"
  },
  "dest": {
    "index": "my_index2"
  }
}
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9

四、删除原索引

DELETE 	my_index
  • 1

五、设置别名

POST /_aliases
  {
        "actions": [
            {"add": {"index": "my_index2", "alias": "my_index"}}
        ]
  }
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6

三、ES分页查询

int pageNum = 1;
int pageSize = 10;
Pageable pageable = PageRequest.of(pageNum-1, pageSize);
FieldSortBuilder fsb = SortBuilders.fieldSort("index").order(SortOrder.DESC)
                .unmappedType("string");

//NativeSearchQuery searchQuery = new NativeSearchQueryBuilder().withSort(fsb).build();
        
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder().withSort(fsb)
                .withPageable(pageable).build();

//NativeSearchQuery searchQuery = new NativeSearchQueryBuilder().build();

Page<TestModelEs> page = (Page<TestModelEs>)testModelEsRepository.search(searchQuery);

//获取总条数
long totalElements = page.getTotalElements();
//获取总页数
int totalPages = page.getTotalPages();

List<TestModelEs> content = page.getContent();
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21

二、ES查询限制(<10000)

1、不用重启集群就可以实现(亲测,可实现)

例如:
curl -H “Content-Type: application/json” -XPUT http://10.10.203.90:9200/test_es_page_limit/_settings -d ‘{ “index” : { “max_result_window” : 50000}}’

2、通过配置文件设置(需重启集群,未测试)

配置文件路径:config/elasticsearch.yml
新配置:
max_result_window: 200000

3、scroll深度搜索(推荐使用)

//要查询第几页(传递的参数)
int pageNum = 0;
Int pageSize = 10;
List resultAdminLogs = new ArrayList<>();

    //排序
    FieldSortBuilder fsb = SortBuilders.fieldSort("index").order(SortOrder.DESC)
            .unmappedType("string");

    //构建查询
    NativeSearchQueryBuilder nsQueryBuilder = new NativeSearchQueryBuilder();

    //分页
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8

//pageSize:每页查出多条数据
int count = 0;
Pageable pageable = PageRequest.of(count, pageSize);

nsQueryBuilder.withPageable(pageable).withIndices(“test_es_page_limit”)
.withSort(fsb);

    //深度搜索
    ScrolledPage<TestModelEs> scrollAdminLog = elasticsearchTemplate
            .startScroll(SCROLL_TIMEOUT, nsQueryBuilder.build(), TestModelEs.class);

    while (scrollAdminLog.hasContent()){
        count++;
        List<TestModelEs> content = scrollAdminLog.getContent();
        if(pageNum == count){
            return content; //返回结果
        }
        resultAdminLogs.addAll(content);
        //取下一页,scrollId在es服务器上可能会发生变化,需要用最新的;发起				continueScroll请求会重新刷新快照保留时间
        scrollAdminLog = (ScrolledPage<TestModelEs>) elasticsearchTemplate
                .continueScroll(scrollAdminLog.getScrollId(), SCROLL_TIMEOUT, 					TestModelEs.class);
    }
    //及时清除es快照,释放资源
    elasticsearchTemplate.clearScroll(scrollAdminLog.getScrollId());
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/小小林熬夜学编程/article/detail/200654
推荐阅读
相关标签
  

闽ICP备14008679号