Learning the Elasticsearch API

2018-02-09 12:41:06 | Source: https://www.jianshu.com/p/a0fb9311014d | Author: 琯琯

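Create the index yii2blog with the type articles
In the commands below, "json" is a placeholder for the JSON request body printed after the command. Where both a PUT and a POST form are listed, they are alternatives; PUT is the documented method for creating an index. The mapping uses the Elasticsearch 2.x syntax (string fields with "index": "analyzed").
curl -v -X PUT "http://localhost:9200/yii2blog?pretty=true" -d "json"
curl -v -X POST "http://localhost:9200/yii2blog?pretty=true" -d "json"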
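{
"settings":{
"refresh_interval": "5s", // refresh the index every 5 seconds
"number_of_shards": 1, // one shard, because there is currently a single machine
"number_of_replicas": 0 // no replicas
},
"mappings": {
"_default_": {
"_all": {
"enabled": true // include every field in the _all field
}
},
"articles": { // type name; any name works, for example the table name
"dynamic": false , // dynamic mapping off: fields not listed below are not added to the mapping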
"properties": {
"article_id": {
"type": "long"
},
"post_title": {
"type": "string",
"index": "analyzed",
"analyzer": "ik"
},
"post_excerpt": {
"type": "string",
"index": "analyzed",
"analyzer": "ik"
}
}
}
}
}
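
The same request body can also be assembled in code, which avoids hand-editing JSON. The following is a minimal sketch, assuming the Elasticsearch 2.x mapping syntax used above and a node at localhost:9200; the helper names build_index_body and create_index_request are hypothetical, and the request is only prepared, not sent.

```python
import json
import urllib.request


def build_index_body(type_name, text_fields, analyzer="ik"):
    """Build the settings/mappings body for the index described above."""
    properties = {"article_id": {"type": "long"}}
    for field in text_fields:
        # ES 2.x style analyzed string field, as in the original mapping.
        properties[field] = {
            "type": "string",
            "index": "analyzed",
            "analyzer": analyzer,
        }
    return {
        "settings": {
            "refresh_interval": "5s",   # refresh every 5 seconds
            "number_of_shards": 1,      # single machine, single shard
            "number_of_replicas": 0,    # no replicas
        },
        "mappings": {
            "_default_": {"_all": {"enabled": True}},
            type_name: {"dynamic": False, "properties": properties},
        },
    }


def create_index_request(index, body, host="http://localhost:9200"):
    """Prepare (but do not send) the PUT request that creates the index."""
    return urllib.request.Request(
        url=f"{host}/{index}?pretty=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


body = build_index_body("articles", ["post_title", "post_excerpt"])
request = create_index_request("yii2blog", body)
print(request.get_method(), request.full_url)
print(json.dumps(body, indent=2))
```

Passing the prepared request to urllib.request.urlopen would send it to a running node.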

Delete the index yii2blog (and with it the articles type)
curl -v -X DELETE "localhost:9200/yii2blog?pretty=true"

Add a record
curl -v -X PUT "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
curl -v -X POST "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
{
"article_id" : 1,
"post_title" : "这是文章标题",
"post_excerpt" : "这是文章描述"
}

View a record
curl -v -X GET "http://localhost:9200/yii2blog/articles/1?pretty=true"

Delete a record
curl -v -X DELETE "http://localhost:9200/yii2blog/articles/1?pretty=true"

Update a record (indexing the same ID again replaces the whole document)
curl -v -X PUT "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
curl -v -X POST "http://localhost:9200/yii2blog/articles/1?pretty=true" -d "json"
{
"article_id" : 1,
"post_title" : "更新文章标题",
"post_excerpt" : "更新文章描述"
}
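
The add, view, delete and update commands share one URL and differ only in HTTP method and body. The following is a minimal sketch of that pattern, assuming a node at localhost:9200; document_request is a hypothetical helper, and the requests are prepared without being sent.

```python
import json
import urllib.request

HOST = "http://localhost:9200"  # assumption: the local node used in this article


def document_request(method, index, type_name, doc_id, source=None):
    """Prepare a request for a single document; source is the JSON body, if any."""
    url = f"{HOST}/{index}/{type_name}/{doc_id}?pretty=true"
    data = None
    headers = {}
    if source is not None:
        # Keep Chinese text readable in the body instead of \uXXXX escapes.
        data = json.dumps(source, ensure_ascii=False).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url=url, data=data, headers=headers, method=method)


article = {"article_id": 1, "post_title": "这是文章标题", "post_excerpt": "这是文章描述"}
updated = {"article_id": 1, "post_title": "更新文章标题", "post_excerpt": "更新文章描述"}

add = document_request("PUT", "yii2blog", "articles", 1, article)
view = document_request("GET", "yii2blog", "articles", 1)
update = document_request("PUT", "yii2blog", "articles", 1, updated)
delete = document_request("DELETE", "yii2blog", "articles", 1)

for req in (add, view, update, delete):
    print(req.get_method(), req.full_url)
```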

Querying data
1. Return records from every index in Elastic
curl -v -X GET "http://localhost:9200/_search?pretty=true"

2. Return all records in yii2blog
curl -v -X GET "http://localhost:9200/yii2blog/_search?pretty=true"

3. Return all records in yii2blog/articles
curl -v -X GET "http://localhost:9200/yii2blog/articles/_search?pretty=true"
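
The three search scopes differ only in how many path segments precede _search. The following is a minimal sketch of that rule; search_url is a hypothetical helper and the host is an assumption.

```python
HOST = "http://localhost:9200"  # assumption: the local node used in this article


def search_url(index=None, type_name=None, host=HOST):
    """Return the _search URL for all indexes, one index, or one type."""
    if type_name is not None and index is None:
        raise ValueError("a type can only be searched inside an index")
    parts = [host]
    if index is not None:
        parts.append(index)
    if type_name is not None:
        parts.append(type_name)
    parts.append("_search?pretty=true")
    return "/".join(parts)


print(search_url())                        # every index
print(search_url("yii2blog"))              # one index
print(search_url("yii2blog", "articles"))  # one type inside an index
```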

Full-text search
1. Use a match query; the condition is that the post_excerpt field contains the word "描述"
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述" }}
}

2. Return 2 records (Elastic returns 10 results per request by default)
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述" }},
"size": 2
}

3. Start at position 3 (the default is position 0) and return only 5 results.
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述" }},
"from": 3,
"size": 5
}
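
The from and size parameters are usually derived from a page number. The following is a minimal sketch of that arithmetic, assuming 1-based page numbers; page_params and paged_match are hypothetical helpers.

```python
def page_params(page, per_page=10):
    """Translate a 1-based page number into Elastic's from/size parameters."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    return {"from": (page - 1) * per_page, "size": per_page}


def paged_match(field, text, page, per_page=10):
    """Build a match query body restricted to one page of results."""
    body = {"query": {"match": {field: text}}}
    body.update(page_params(page, per_page))
    return body


print(page_params(1))                            # first page with the default size
print(paged_match("post_excerpt", "描述", 3, 5))  # third page, 5 results per page
```

Page 1 maps to from 0, which matches the default starting position.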

4. Search for "描述" OR "文章": terms separated by a space in a match query are combined with OR.
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query" : { "match" : { "post_excerpt" : "描述 文章" }}
}

5. Search for "描述" AND "文章" with a bool query, using one must clause per term.
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query": {
"bool": {
"must": [
{ "match": { "post_excerpt": "描述" } },
{ "match": { "post_excerpt": "文章" } }
]
}
}
}
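
A bool query with one must clause per term can be generated from a list of terms. The sketch below also shows the match query's operator parameter, which expresses roughly the same AND condition in a single clause (every term produced by analyzing the text must match). Both helper names are hypothetical.

```python
import json


def match_all_terms(field, terms):
    """AND search: a bool query with one must clause per term."""
    return {"query": {"bool": {"must": [{"match": {field: term}} for term in terms]}}}


def match_with_and_operator(field, text):
    """Alternative: one match clause whose analyzed terms must all match."""
    return {"query": {"match": {field: {"query": text, "operator": "and"}}}}


bool_body = match_all_terms("post_excerpt", ["描述", "文章"])
operator_body = match_with_and_operator("post_excerpt", "描述 文章")

print(json.dumps(bool_body, ensure_ascii=False, indent=2))
print(json.dumps(operator_body, ensure_ascii=False, indent=2))
```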

6. Use a multi_match query on the post_excerpt field for the word "描述" and highlight the matches
curl -v -X POST "http://localhost:9200/yii2blog/articles/_search?pretty=true" -d "json"
{
"query": {
"multi_match": {
"query": "描述",
"fields": [
"post_excerpt"
]
}
},
"highlight": {
"pre_tags": [
"<b class=\"highlight\">"
],
"post_tags": [
"</b>"
],
"fields": {
"post_excerpt": {}
}
}
}
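
Quotes inside the highlight tags have to be escaped with a backslash in JSON, which is easy to get wrong by hand. The following is a minimal sketch that lets json.dumps do the escaping; highlighted_search is a hypothetical helper and the CSS class name is only an example.

```python
import json


def highlighted_search(text, fields, css_class="highlight"):
    """Build a multi_match query that wraps matches in <b class="..."> tags."""
    fields = list(fields)
    return {
        "query": {"multi_match": {"query": text, "fields": fields}},
        "highlight": {
            "pre_tags": [f'<b class="{css_class}">'],
            "post_tags": ["</b>"],
            # An empty object means "use the default highlight settings".
            "fields": {field: {} for field in fields},
        },
    }


body = highlighted_search("描述", ["post_excerpt"])
encoded = json.dumps(body, ensure_ascii=False)
print(encoded)
```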

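Result (sample output; the highlighted term 更新 and the plain <b> tag suggest it was captured from a different run than the query above):
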
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.48288077,
"hits": [
{
"_index": "yii2blog",
"_type": "articles",
"_id": "1",
"_score": 0.48288077,
"_source": {
"article_id": 1,
"post_title": "这是文章标题",
"post_excerpt": "这是文章描述,更新"
},
"highlight": {
"post_excerpt": [
"这是文章描述,<b>更新</b>"
]
}
},
{
"_index": "yii2blog",
"_type": "articles",
"_id": "2",
"_score": 0.48288077,
"_source": {
"article_id": 1,
"post_title": "这是文章标题",
"post_excerpt": "这是文章描述,更新"
},
"highlight": {
"post_excerpt": [
"这是文章描述,<b>更新</b>"
]
}
}
]
}
}

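Chinese analyzer setup

First, install a Chinese analysis plugin. ik is used here; other plugins such as smartcn are also an option.

$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip

The command above installs version 5.5.1 of the plugin, which pairs with Elastic 5.5.1.

Next, restart Elastic; it loads the newly installed plugin automatically.

Then create a new Index and specify the fields that need analysis. This step depends on the data structure, and the command below applies only to this article. As a rule, every Chinese field that should be searchable has to be configured individually.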



$ curl -X PUT 'localhost:9200/accounts' -d '
{
"mappings": {
"person": {
"properties": {
"user": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
},
"title": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
},
"desc": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
}
}
}
}'

The command above first creates an Index named accounts, which contains a Type named person. person has three fields.


user
title
desc

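All three fields hold Chinese text and have the type text, so they need a Chinese analyzer; the default standard analyzer splits Chinese text into single characters.

Elastic calls its tokenizers analyzers. We specify an analyzer for each field.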



"user": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}

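In the snippet above, analyzer is the analyzer applied to the field text at index time, and search_analyzer is the analyzer applied to the search terms. The ik_max_word analyzer is provided by the ik plugin and splits text into the largest possible number of tokens.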
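
Because every Chinese field repeats the same three settings, the mapping can be generated from a list of field names. The following is a minimal sketch, assuming the Elasticsearch 5.x mapping syntax shown above; ik_text_mappings is a hypothetical helper.

```python
import json


def ik_text_mappings(type_name, fields, analyzer="ik_max_word", search_analyzer=None):
    """Build a mappings body in which every listed field is analyzed by ik."""
    if search_analyzer is None:
        # By default, search terms are analyzed the same way as the field text.
        search_analyzer = analyzer
    properties = {
        field: {
            "type": "text",
            "analyzer": analyzer,
            "search_analyzer": search_analyzer,
        }
        for field in fields
    }
    return {"mappings": {type_name: {"properties": properties}}}


body = ik_text_mappings("person", ["user", "title", "desc"])
print(json.dumps(body, indent=2))
```

A different search_analyzer, for example the plugin's coarser ik_smart, can be passed as the fourth argument.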


Analyzer test
1. The built-in standard analyzer
curl -v -X POST "http://localhost:9200/_analyze?analyzer=standard&pretty=true" -d "这是一段汉字"

Result


{
"tokens": [
{
"token": "这",
"start_offset": 1,
"end_offset": 2,
"type": "<IDEOGRAPHIC>",
"position": 0
},
{
"token": "是",
"start_offset": 2,
"end_offset": 3,
"type": "<IDEOGRAPHIC>",
"position": 1
},
{
"token": "一",
"start_offset": 3,
"end_offset": 4,
"type": "<IDEOGRAPHIC>",
"position": 2
},
{
"token": "段",
"start_offset": 4,
"end_offset": 5,
"type": "<IDEOGRAPHIC>",
"position": 3
},
{
"token": "汉",
"start_offset": 5,
"end_offset": 6,
"type": "<IDEOGRAPHIC>",
"position": 4
},
{
"token": "字",
"start_offset": 6,
"end_offset": 7,
"type": "<IDEOGRAPHIC>",
"position": 5
}
]
}

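2. The Chinese analysis plugin ik (the analyzer name ik applies to plugin versions before 5.0; from 5.0 on the plugin exposes ik_smart and ik_max_word instead)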
curl -v -X POST "http://localhost:9200/_analyze?analyzer=ik&pretty=true" -d "这是一段汉字"

Result


{
"tokens": [
{
"token": "这是",
"start_offset": 1,
"end_offset": 3,
"type": "CN_WORD",
"position": 0
},
{
"token": "一段",
"start_offset": 3,
"end_offset": 5,
"type": "CN_WORD",
"position": 1
},
{
"token": "一",
"start_offset": 3,
"end_offset": 4,
"type": "TYPE_CNUM",
"position": 2
},
{
"token": "段",
"start_offset": 4,
"end_offset": 5,
"type": "COUNT",
"position": 3
},
{
"token": "汉字",
"start_offset": 5,
"end_offset": 7,
"type": "CN_WORD",
"position": 4
},
{
"token": "汉",
"start_offset": 5,
"end_offset": 6,
"type": "CN_WORD",
"position": 5
},
{
"token": "字",
"start_offset": 6,
"end_offset": 7,
"type": "CN_CHAR",
"position": 6
}
]
}
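
The difference between the two analyzers is easiest to see by reducing each _analyze response to its token texts. The following is a minimal sketch that works on responses already parsed into dictionaries; the sample data repeats the tokens and offsets from the two results above, and both helper names are hypothetical.

```python
def token_texts(analyze_response):
    """Return the token strings of an _analyze response, in order."""
    return [token["token"] for token in analyze_response["tokens"]]


def word_tokens(analyze_response):
    """Return only the tokens that span more than one character."""
    return [
        token["token"]
        for token in analyze_response["tokens"]
        if token["end_offset"] - token["start_offset"] > 1
    ]


def sample_response(rows):
    """Build a trimmed response from (token, start_offset, end_offset) rows."""
    return {
        "tokens": [
            {"token": text, "start_offset": start, "end_offset": end}
            for text, start, end in rows
        ]
    }


standard_response = sample_response(
    [("这", 1, 2), ("是", 2, 3), ("一", 3, 4), ("段", 4, 5), ("汉", 5, 6), ("字", 6, 7)]
)
ik_response = sample_response(
    [("这是", 1, 3), ("一段", 3, 5), ("一", 3, 4), ("段", 4, 5),
     ("汉字", 5, 7), ("汉", 5, 6), ("字", 6, 7)]
)

print(token_texts(standard_response))
print(token_texts(ik_response))
print(word_tokens(ik_response))
```

The standard analyzer yields no multi-character tokens for this sentence, while ik yields both whole words and their single characters.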

References:
Elasticsearch Reference [2.2], Document APIs
全文搜索引擎 Elasticsearch 入门教程 (Full-Text Search Engine Elasticsearch: A Beginner's Tutorial)






