赞
踩
1.elasticsearch简单介绍
①什么是elasticsearch?
开源的分布式搜索引擎,实现海量数据搜索,日志统计,分析,系统监控等功能
②什么是elastic stack?
是以elasticsearch为核心的技术栈,包括beats,Logstash,kibana, elasticsearch
③什么是Lucene?
是Apache开源搜索引擎类库,提供搜索引擎的API
2.正向索引和倒排索引
①什么是文档document和词条term?
文档:每一条数据就是文档
词条:对文档的内容分词
②什么是正向索引?
基于文档id创建索引,查询词条时必须找到文档,然后判断该文档是否包含词条。
③什么是倒排索引?
对文档内容进行分词,对词条创建索引,并记录词条所在文档的信息。
搜索时要进行分词,然后去词条列表查询文档id,通过文档id查询文档,把结果存放在结果集。
④倒排索引的组成
词条词典:记录所有词条, 词条和倒排列表PostingList之间关系。给词条创建索引,提高查询和插入效率
倒排列表PostingList:记录词条所在的文档id,词条出现频率,词条在文档的位置
3.ES的基本概念
①索引index:
同类型文档的集合。类似Mysql的表table
②文档document:
一条数据就是一个文档,文档数据在es是Json格式。类似Mysql的行row
③字段field:
Json文档的字段。类似Mysql的列
④映射Mapping:
字段的类型约束信息,类似表的结构约束
4.ES-MySQL的架构
ES:海量数据搜索
Mysql:事务操作,数据安全性和一致性
1.索引库 (建表)的概念
①mapping是对文档字段的约束。
②mapping约束的属性
type:字段类型(
字符串:text(分词)、keyword(不分词)
数字:long、integer、short、byte、double、float
布尔:boolean
日期:date
对象:object
index:该字段是否用约束,默认true
analyzer:使用的分词器
properties:该字段下的子字段
2.创建索引库
ES通过Restful请求操作索引库和文档。请求内容通过DSL语句表示
创建索引库:创建一个索引库heima,有字段info,email,name有子字段firstName, lastName
# 创建索引库heima PUT /heima { "mappings": { "properties": { "info":{ "type":"text", "analyzer":"ik_smart" }, "email":{ "type": "keyword", "index": false }, "name":{ "type": "object", "properties": { "firstName":{ "type":"keyword" }, "lastName":{ "type":"keyword" } } } } } }
查看索引库:GET /索引库名
删除索引库:DELETE /索引库名
添加索引库字段:PUT /索引库名/_mapping
- PUT /heima/_mapping
- {
- "properties":{
- "age":{
- "type":"integer"
- }
- }
- }
3.文档操作(添加数据)
①增
- POST /heima/_doc/1
- {
- "age": 20,
- "info": "这是学生",
- "email": "1111@qq.com",
- "name": {
- "lastName": "wang",
- "firsrName":"zhi"
- }
- }
②删
DELETE /索引库名/_doc/文档id
③改
方式一:根据主键值id全量修改,删除旧文档,新增以下字段文档。此时只有age
- PUT /heima/_doc/1
- {
- "age": 10
- }
方式二:修改某个字段值,只把年龄修改
- POST /heima/_update/1
- {
- "doc":{
- "age": 10
- }
-
- }
④查
GET /索引库名/_doc/文档id
GET /heima/_doc/1
4.文档操作有哪些?
全量修改:PUT /索引库名/_doc/文档id { json文档 }
增量修改:POST /索引库名/_update/文档id { "doc": {字段}}
1.什么是RestClient
RestClient本质是组装DSL语句,通过http请求发送ES
2.利用JavaRestClient实现创建、删除索引库。判断索引库是否存在
步骤1:导入资料包hotel-demo
步骤2:分析数据结构
①mapping要考虑问题:字段名,数据类型type,是否搜索index,是否分词analyzer(分词器)
mapping映射
# 酒店的mapping PUT /hotel { "mappings": { "properties": { "id":{ "type": "keyword" }, "name":{ "type": "text", "analyzer": "ik_max_word" }, "address":{ "type": "keyword", "index": false }, "price":{ "type": "integer" }, "score":{ "type":"integer" }, "brand":{ "type":"keyword" }, "city":{ "type":"keyword" }, "starName":{ "type":"keyword" }, "business":{ "type":"keyword" }, "location":{ "type":"geo_point" }, "pic":{ "type":"keyword", "index":false } } } }
修改:把全部搜索的字段组合放在all字段
# 酒店的mapping PUT /hotel { "mappings": { "properties": { "id":{ "type": "keyword" }, "name":{ "type": "text", "analyzer": "ik_max_word", "copy_to": "all" }, "address":{ "type": "keyword", "index": false }, "price":{ "type": "integer" }, "score":{ "type":"integer" }, "brand":{ "type":"keyword", "copy_to": "all" }, "city":{ "type":"keyword" }, "starName":{ "type":"keyword" }, "business":{ "type":"keyword" }, "location":{ "type":"geo_point" }, "pic":{ "type":"keyword", "index":false }, "all":{ "type": "text", "analyzer": "ik_max_word" } } } }
步骤3:初始化JavaRestClient
①导入es依赖
- <dependency>
- <groupId>org.elasticsearch.client</groupId>
- <artifactId>elasticsearch-rest-high-level-client</artifactId>
- <version>7.12.1</version>
- </dependency>
②初始化RestHighLevelClient:
- public class HotelIndexTest {
- private RestHighLevelClient client;
- @Test
- void testInit(){
-
- }
- @BeforeEach
- void setUp() {
- this.client = new RestHighLevelClient(RestClient.builder(
- HttpHost.create("http://192.168.137.129:9200")
- ));
- }
- @AfterEach
- void tearDown() throws IOException {
- this.client.close();
- }
-
- }

所有的单元测试,先运行@BeforeEach再@Test,最后@AfterEach
步骤4:
MAPPING_TEMPLATE
- public class HotelConstants {
- public static final String MAPPING_TEMPLATE =
- "{\n" +
- " \"mappings\": {\n" +
- " \"properties\": {\n" +
- " \"id\":{\n" +
- " \"type\": \"keyword\"\n" +
- " },\n" +
- " \"name\":{\n" +
- " \"type\": \"text\",\n" +
- " \"analyzer\": \"ik_max_word\",\n" +
- " \"copy_to\": \"all\"\n" +
- " },\n" +
- " \"address\":{\n" +
- " \"type\": \"keyword\",\n" +
- " \"index\": false\n" +
- " },\n" +
- " \"price\":{\n" +
- " \"type\": \"integer\"\n" +
- " },\n" +
- " \"score\":{\n" +
- " \"type\":\"integer\"\n" +
- " },\n" +
- " \"brand\":{\n" +
- " \"type\":\"keyword\",\n" +
- " \"copy_to\": \"all\"\n" +
- " },\n" +
- " \"city\":{\n" +
- " \"type\":\"keyword\"\n" +
- " }, \n" +
- " \"starName\":{\n" +
- " \"type\":\"keyword\"\n" +
- " },\n" +
- " \"business\":{\n" +
- " \"type\":\"keyword\"\n" +
- " },\n" +
- " \"location\":{\n" +
- " \"type\":\"geo_point\"\n" +
- " },\n" +
- " \"pic\":{\n" +
- " \"type\":\"keyword\",\n" +
- " \"index\":false\n" +
- " },\n" +
- " \"all\":{\n" +
- " \"type\": \"text\",\n" +
- " \"analyzer\": \"ik_max_word\"\n" +
- " }\n" +
- " }\n" +
- " }\n" +
- "}\n";
-
-
- }

①创建索引库
- @Test
- void createHotelIndex() throws IOException {
- // 1.创建request对象(索引库)
- CreateIndexRequest request = new CreateIndexRequest("hotel");
- // 2.请求参数(mapping)
- request.source(MAPPING_TEMPLATE, XContentType.JSON);
- // 3.发送请求(创建)
- client.indices().create(request, RequestOptions.DEFAULT);
- }
②查询索引库是否存在
- @Test
- void getHotelIndex() throws IOException {
- // 1.创建查询对象
- GetIndexRequest request = new GetIndexRequest("hotel");
- // 2.执行查询
- Boolean response = client.indices().exists(request, RequestOptions.DEFAULT);
- System.out.println(response);
- }
③删除索引库
- @Test
- void deleteHotelIndex() throws IOException {
- // 1.创建删除对象
- DeleteIndexRequest request = new DeleteIndexRequest("hotel");
- // 2.执行删除
- client.indices().delete(request,RequestOptions.DEFAULT);
- }
索引库操作的基本步骤:
1.添加文档client.index()
①查询数据库里面ID号为38609酒店信息
②把查询的信息转换为json插入,id从信息里面获取
- @Test
- void testIndexDocument() throws IOException {
- // 1.根据id查询
- Hotel hotel = hotelService.getById(38609L);
- // 转换为文档数据
- HotelDoc hotelDoc = new HotelDoc(hotel);
-
- // 2.创建request对象
- IndexRequest request = new IndexRequest("hotel").id(hotelDoc.getId().toString());
- // 创建JSON文档
- request.source(JSON.toJSONString(hotelDoc), XContentType.JSON);
- // 发送请求
- client.index(request, RequestOptions.DEFAULT);
- }
2.查询文档client.get()
- @Test
- void testGetIndex() throws IOException {
- // 查询请求
- GetRequest request = new GetRequest("hotel","38609");
- // 发送请求
- GetResponse response = client.get(request, RequestOptions.DEFAULT);
- // 获取结果_source的JSON
- String json = response.getSourceAsString();
- // JSON转换为对象
- HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
- System.out.println(hotelDoc);
-
- }
3.更新文档client.update()
- @Test
- void testUpdate() throws IOException {
- // 更新请求
- UpdateRequest request = new UpdateRequest("hotel","38609");
- // 更新字段 K,V 逗号隔开
- request.doc(
- "price","666",
- "starName","四钻"
- );
- // 执行更新
- client.update(request,RequestOptions.DEFAULT);
-
- }
4.根据ID删除文档
- @Test
- void testDelete() throws IOException {
- // 删除请求
- DeleteRequest request = new DeleteRequest("hotel","38609");
- // 执行删除
- client.delete(request,RequestOptions.DEFAULT);
- }
5.批量导入数据BulkRequest
- @Test
- void testBulk() throws IOException {
- // 查询数据
- List<Hotel> hotels = hotelService.list();
- // 创建Request
- BulkRequest request = new BulkRequest();
- // 批量数据
- for (Hotel hotel : hotels) {
- HotelDoc hotelDoc = new HotelDoc(hotel);
- request.add(new IndexRequest("hotel")
- .id(hotelDoc.getId().toString())
- .source(JSON.toJSONString(hotelDoc),XContentType.JSON));
- }
- // 执行添加
- client.bulk(request,RequestOptions.DEFAULT);
- }

6.总结文档操作步骤
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。