elasticsearch-restful-api

Notes on the Elasticsearch RESTful API

(1) Viewing ES and index information

(1.1) View basic ES information

$ curl 'http://localhost:9200?pretty'
{
  "name" : "elasticsearch_001_data",
  "cluster_name" : "elasticsearch_test",
  "cluster_uuid" : "NsxYKhI1Qw63MzaPKl34dA",
  "version" : {
    "number" : "6.6.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "3bd3e59",
    "build_date" : "2019-03-06T15:16:26.864148Z",
    "build_snapshot" : false,
    "lucene_version" : "7.6.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

(1.2) Count the documents in the cluster

$ curl 'http://localhost:9200/_count?pretty'
{
  "count" : 3,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  }
}

The returned count is 3.

(1.3) List all indices in ES

$ curl 'localhost:9200/_cat/indices?v'
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   megacorp EUu3nzyRQN2Mp7tRw2u_nQ   5   1          3            0     17.5kb         17.5kb
yellow open   gb       kAXZAJZTRg2X0Wm72-n5qQ   5   1          1            0      5.7kb          5.7kb
yellow open   us       Q6ubyCClQvyMB5iX8f9BZA   5   1          1            0      5.7kb          5.7kb
yellow open   tweet    hNRmZ9RETWWPoUMIK3BivA   5   1         12            0       26kb           26kb
yellow open   website  1Ptx5N-iTR2nDNtVgVMEpw   5   1          5            0     21.2kb         21.2kb
yellow open   blogs    bW_JTJkfS2GVN8FE_gX-Hg   3   2          0            0       783b           783b
yellow open   user     rOprq90rQsuyP0mad7I6iQ   5   1          2            0     10.2kb         10.2kb

(1.3) Insert data

$ curl -X PUT "localhost:9200/megacorp/employee/1" -H 'Content-Type: application/json' -d'
> {
>     "first_name" : "John",
>     "last_name" :  "Smith",
>     "age" :        25,
>     "about" :      "I love to go rock climbing",
>     "interests": [ "sports", "music" ]
> }
> '
{"_index":"megacorp","_type":"employee","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1}

(1.4) Get a document by id

$ curl -X GET "localhost:9200/megacorp/employee/1?pretty"
{
  "_index" : "megacorp",
  "_type" : "employee",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "first_name" : "John",
    "last_name" : "Smith",
    "age" : 25,
    "about" : "I love to go rock climbing",
    "interests" : [
      "sports",
      "music"
    ]
  }
}

(1.5) Query all the data in an index

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty"
{
  "took" : 17,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "Douglas",
          "last_name" : "Fir",
          "age" : 35,
          "about" : "I like to build cabinets",
          "interests" : [
            "forestry"
          ]
        }
      }
    ]
  }
}

(1.5) Search by a field

$ curl -X GET "localhost:9200/megacorp/employee/_search?q=last_name:Smith&pretty"
{
  "took" : 112,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}

(1.6) View the index mapping

$ curl 'http://localhost:9200/megacorp/_mapping?pretty=true'
{
  "megacorp" : {
    "mappings" : {
      "employee" : {
        "properties" : {
          "about" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "age" : {
            "type" : "long"
          },
          "first_name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "interests" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            },
            "fielddata" : true
          },
          "last_name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}

(1.x) Filter the fields returned in _source

GET /_search
{
    "_source": {
        "includes": [ "obj1.*", "obj2.*" ],
        "excludes": [ "*.description" ]
    },
    "query" : {
        "term" : { "user" : "kimchy" }
    }
}
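
The _source block in the request above limits which fields of each hit are returned. The following is a simplified local sketch of that include/exclude logic, not Elasticsearch's implementation: it flattens a document to dotted paths and matches them with shell-style wildcards. The sample document and the helper names (flatten, filter_source) are invented for illustration, and the result is kept flat rather than re-nested.

```python
from fnmatch import fnmatchcase


def flatten(doc, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def filter_source(doc, includes, excludes):
    """Keep leaves whose path matches an include pattern and no exclude pattern."""
    kept = {}
    for path, value in flatten(doc):
        included = any(fnmatchcase(path, pattern) for pattern in includes)
        excluded = any(fnmatchcase(path, pattern) for pattern in excludes)
        if included and not excluded:
            kept[path] = value
    return kept


# Hypothetical document, invented only to exercise the patterns above.
doc = {
    "obj1": {"name": "a", "description": "long text"},
    "obj2": {"count": 2},
    "obj3": {"name": "not requested"},
}
filtered = filter_source(doc, ["obj1.*", "obj2.*"], ["*.description"])
print(filtered)
```

Only obj1.name and obj2.count survive: obj3 is never included, and obj1.description is removed by the exclude pattern.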

(2) Searching with the query DSL (domain-specific language)

With the query DSL, a search request is expressed as a JSON body.
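
As a minimal sketch of what "a JSON body" means in practice, the snippet below builds a DSL request from a Python dict using only the standard library. It assumes the same local node (localhost:9200) and the megacorp/employee index used in the curl examples; build_search_request is a hypothetical helper name, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: a local node on the default HTTP port, as in the curl examples.
ES_URL = "http://localhost:9200"


def build_search_request(index, doc_type, query):
    """Build (but do not send) a search request whose body is the JSON DSL."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/{doc_type}/_search?pretty",
        data=body,
        headers={"Content-Type": "application/json"},
        method="GET",
    )


# The match query from section (2.2), expressed as a Python dict.
request = build_search_request("megacorp", "employee", {"match": {"last_name": "Smith"}})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To run it against a live node:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Keeping the query as a dict until the last moment makes it easy to compose larger queries before serializing them.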

(2.1) Count the documents in the cluster

$ curl -XGET 'http://localhost:9200/_count?pretty' -H "Content-Type: application/json" -d '
> {
>     "query": {
>         "match_all": {}
>     }
> }
> '
{
  "count" : 3,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  }
}

The count is 3.

(2.2) Search with a query expression

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match" : {
>             "last_name" : "Smith"
>         }
>     }
> }
> '
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}

(2.3) Query with multiple conditions

Search for employees whose last name is Smith, but this time only those older than 30.

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "bool": {
>             "must": {
>                 "match" : {
>                     "last_name" : "smith"
>                 }
>             },
>             "filter": {
>                 "range" : {
>                     "age" : { "gt" : 30 }
>                 }
>             }
>         }
>     }
> }
> '
{
  "took" : 32,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      }
    ]
  }
}

The range filter finds the documents whose age is greater than 30; gt stands for "greater than".
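
To make the range operators concrete, here is a small local sketch that rebuilds the bool query from this section and evaluates its range clause against the three sample employees. It is an illustration only: the match clause is approximated by a case-insensitive comparison, and the helper names (matches_range, smith_older_than) are hypothetical. In the real query the filter clause only includes or excludes documents and does not contribute to the score.

```python
import operator

# Sample documents copied from the megacorp/employee index above.
EMPLOYEES = [
    {"first_name": "John", "last_name": "Smith", "age": 25},
    {"first_name": "Jane", "last_name": "Smith", "age": 32},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35},
]

# The four comparison keys accepted by a range clause.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def matches_range(doc, range_clause):
    """Return True if doc satisfies every bound in a {"field": {"gt": ...}} clause."""
    return all(
        RANGE_OPS[op](doc[field], bound)
        for field, bounds in range_clause.items()
        for op, bound in bounds.items()
    )


def smith_older_than(age):
    """Build the bool query from this section for an arbitrary age bound."""
    return {
        "bool": {
            "must": {"match": {"last_name": "smith"}},
            "filter": {"range": {"age": {"gt": age}}},
        }
    }


query = smith_older_than(30)
range_clause = query["bool"]["filter"]["range"]
# Local stand-in for the match clause: case-insensitive equality on last_name.
hits = [
    doc for doc in EMPLOYEES
    if doc["last_name"].lower() == "smith" and matches_range(doc, range_clause)
]
print([doc["first_name"] for doc in hits])
```

Only Jane Smith (age 32) passes both conditions, which agrees with the single hit in the response above.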

(2.4) Full-text search

Search for all employees who enjoy rock climbing.

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match" : {
>             "about" : "rock climbing"
>         }
>     }
> }
> '
{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      }
    ]
  }
}

(2.5) Phrase search

Match an exact sequence of words, that is, a phrase.

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match_phrase" : {
>             "about" : "rock climbing"
>         }
>     }
> }
> '
{
  "took" : 16,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}
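
The difference between match in (2.4) and match_phrase here can be illustrated with a toy model. The sketch below is not how Elasticsearch scores or analyzes text; it assumes a naive lowercase tokenizer, treats match as "any term present", and treats match_phrase as "terms adjacent and in order". All function names are local to the example.

```python
import re

# The "about" values of the three sample employees, keyed by document id.
ABOUT = {
    "1": "I love to go rock climbing",
    "2": "I like to collect rock albums",
    "3": "I like to build cabinets",
}


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a rough stand-in for an analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(text, query):
    """Toy match: true if the text contains any of the query terms."""
    return bool(set(tokenize(text)) & set(tokenize(query)))


def match_phrase(text, query):
    """Toy match_phrase: true only if the terms appear adjacent and in order."""
    tokens, terms = tokenize(text), tokenize(query)
    return any(
        tokens[i:i + len(terms)] == terms
        for i in range(len(tokens) - len(terms) + 1)
    )


match_ids = [doc_id for doc_id, text in ABOUT.items() if match(text, "rock climbing")]
phrase_ids = [doc_id for doc_id, text in ABOUT.items() if match_phrase(text, "rock climbing")]
print(match_ids)
print(phrase_ids)
```

The toy match finds documents 1 and 2 (both contain "rock"), while the toy match_phrase keeps only document 1, mirroring the hit counts of the two responses.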

(2.6) Highlighted search

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>     "query" : {
>         "match_phrase" : {
>             "about" : "rock climbing"
>         }
>     },
>     "highlight": {
>         "fields" : {
>             "about" : {}
>         }
>     }
> }
> '
{
  "took" : 305,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        },
        "highlight" : {
          "about" : [
            "I love to go <em>rock</em> <em>climbing</em>"
          ]
        }
      }
    ]
  }
}
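
The highlight section of the response wraps each matched term in <em> tags. As a rough local approximation (a regular-expression sketch, not the highlighter Elasticsearch uses), the same fragment can be reproduced as follows; the function and its pre_tag/post_tag parameters are hypothetical names for this example.

```python
import re


def highlight(text, query, pre_tag="<em>", post_tag="</em>"):
    """Wrap each whole-word occurrence of a query term in the given tags."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre_tag + m.group(0) + post_tag, text)


fragment = highlight("I love to go rock climbing", "rock climbing")
print(fragment)
```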

(2.7) Aggregations

Aggregations are similar to GROUP BY in SQL, but more powerful.

Find the most popular interests among the employees.

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>   "aggs": {
>     "all_interests": {
>       "terms": { "field": "interests" }
>     }
>   }
> }
> '
{
  "took" : 139,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "Douglas",
          "last_name" : "Fir",
          "age" : 35,
          "about" : "I like to build cabinets",
          "interests" : [
            "forestry"
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "all_interests" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "music",
          "doc_count" : 2
        },
        {
          "key" : "forestry",
          "doc_count" : 1
        },
        {
          "key" : "sports",
          "doc_count" : 1
        }
      ]
    }
  }
}

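Two employees are interested in music, one in forestry, and one in sports. These aggregation results are not precomputed; they are generated on the fly from the documents that match the current query.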
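
The bucket counts can be reproduced locally, which also shows how the set of matching documents determines the result. This is a plain-Python sketch over the three sample employees, not Elasticsearch's implementation; terms_buckets is a hypothetical helper that orders buckets by count descending and then by key, like the response above.

```python
from collections import Counter

# Interests of the three sample employees (ids 1, 2, 3).
EMPLOYEES = [
    {"last_name": "Smith", "interests": ["sports", "music"]},
    {"last_name": "Smith", "interests": ["music"]},
    {"last_name": "Fir", "interests": ["forestry"]},
]


def terms_buckets(docs, field):
    """Count how many documents contain each distinct value of a field."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc[field]))  # a document counts once per value
    # Order like the response above: doc_count descending, then key ascending.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


all_buckets = terms_buckets(EMPLOYEES, "interests")
print(all_buckets)

# Narrowing the documents first changes the buckets, as a query would.
smiths = [doc for doc in EMPLOYEES if doc["last_name"] == "Smith"]
smith_buckets = terms_buckets(smiths, "interests")
print(smith_buckets)
```

With all three documents the buckets are music (2), forestry (1), sports (1); restricted to the Smiths, forestry disappears.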

(2.8) Aggregation with a query

Find the most popular interests among employees named Smith.

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>   "query": {
>     "match": {
>       "last_name": "smith"
>     }
>   },
>   "aggs": {
>     "all_interests": {
>       "terms": {
>         "field": "interests"
>       }
>     }
>   }
> }
> '
{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "all_interests" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "music",
          "doc_count" : 2
        },
        {
          "key" : "sports",
          "doc_count" : 1
        }
      ]
    }
  }
}

(2.9) Summaries within an aggregation

Find the average age of the employees who share each interest.

$ curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>     "aggs" : {
>         "all_interests" : {
>             "terms" : { "field" : "interests" },
>             "aggs" : {
>                 "avg_age" : {
>                     "avg" : { "field" : "age" }
>                 }
>             }
>         }
>     }
> }
> '
{
  "took" : 36,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "Douglas",
          "last_name" : "Fir",
          "age" : 35,
          "about" : "I like to build cabinets",
          "interests" : [
            "forestry"
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "all_interests" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "music",
          "doc_count" : 2,
          "avg_age" : {
            "value" : 28.5
          }
        },
        {
          "key" : "forestry",
          "doc_count" : 1,
          "avg_age" : {
            "value" : 35.0
          }
        },
        {
          "key" : "sports",
          "doc_count" : 1,
          "avg_age" : {
            "value" : 25.0
          }
        }
      ]
    }
  }
}
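
The avg_age values in the buckets above can be checked by hand. The sketch below groups the three sample employees by interest and averages their ages in plain Python; it only illustrates the arithmetic behind the nested avg aggregation, and avg_age_by_interest is a hypothetical helper name.

```python
from collections import defaultdict

# Age and interests of the three sample employees (ids 1, 2, 3).
EMPLOYEES = [
    {"age": 25, "interests": ["sports", "music"]},
    {"age": 32, "interests": ["music"]},
    {"age": 35, "interests": ["forestry"]},
]


def avg_age_by_interest(docs):
    """Group ages by interest, then average each group."""
    ages = defaultdict(list)
    for doc in docs:
        for interest in set(doc["interests"]):
            ages[interest].append(doc["age"])
    return {interest: sum(values) / len(values) for interest, values in ages.items()}


averages = avg_age_by_interest(EMPLOYEES)
print(averages)
```

For music this is (25 + 32) / 2 = 28.5, matching the response; forestry and sports each contain a single employee.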

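(3) Statements for changing ES configuration

(3.1) Enable fielddata on a text field (ES 5.x and later)

Since 5.x, sorting and aggregating on a text field depend on a separate in-memory data structure (fielddata), which is disabled by default and has to be enabled explicitly.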

$ curl -X PUT "localhost:9200/megacorp/_mapping/employee/" -H 'Content-Type: application/json' -d'
> {
>   "properties": {
>     "interests": {
>       "type":     "text",
>       "fielddata": true
>     }
>   }
> }
> '
{"acknowledged":true}

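Aggregating on a keyword field, such as the interests.keyword sub-field shown in the mapping earlier, is the recommended alternative.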
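
As a sketch of the keyword-based alternative, the request body below aggregates on interests.keyword, the sub-field that appears in the megacorp mapping shown earlier, so fielddata on the text field is not needed. The helper name is hypothetical, and setting size to 0 (to omit the hits and return only the buckets) is a choice made for this example rather than something the original commands do.

```python
import json


def interests_aggregation(field="interests.keyword", size=0):
    """Build a terms aggregation body that targets a keyword field."""
    return {
        "size": size,  # 0 means: return no hits, only the aggregation
        "aggs": {
            "all_interests": {
                "terms": {"field": field}
            }
        },
    }


body = interests_aggregation()
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the -d payload of the same _search curl command used in section (2.7).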

(3.2) Create an index and set its shard and replica counts

$ curl -X PUT "localhost:9200/blogs?pretty" -H 'Content-Type: application/json' -d'
> {
>    "settings" : {
>       "number_of_shards" : 3,
>       "number_of_replicas" : 1
>    }
> }
> '
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "blogs"
}

(3.3) Set the number of replicas

$ curl -X PUT "localhost:9200/blogs/_settings?pretty" -H 'Content-Type: application/json' -d'
> {
>    "number_of_replicas" : 2
> }
> '
{
  "acknowledged" : true
}
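
As a quick check on what these two settings imply, the total number of shard copies is the number of primaries times one plus the number of replicas. The helper below is a hypothetical one-liner for that arithmetic. On a single-node cluster like the one in these notes, replica copies cannot be placed on the same node as their primary, which is consistent with the yellow health shown by _cat/indices.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Primaries plus one full set of copies per replica."""
    return number_of_shards * (1 + number_of_replicas)


before = total_shard_copies(3, 1)  # blogs as created in (3.2)
after = total_shard_copies(3, 2)   # blogs after the update in (3.3)
print(before, after)
```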

(3.4) View node transport settings

$ curl -X GET "localhost:9200/_nodes/transport?error_trace=true&pretty=true"
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elasticsearch_test",
  "nodes" : {
    "urmXtplyRmyt_LKCTC6_3w" : {
      "name" : "elasticsearch_001_data",
      "transport_address" : "127.0.0.1:9300",
      "host" : "127.0.0.1",
      "ip" : "127.0.0.1",
      "version" : "6.6.2",
      "build_flavor" : "default",
      "build_type" : "tar",
      "build_hash" : "3bd3e59",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "attributes" : {
        "ml.machine_memory" : "1927528448",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20",
        "ml.enabled" : "true"
      },
      "transport" : {
        "bound_address" : [
          "127.0.0.1:9300"
        ],
        "publish_address" : "127.0.0.1:9300",
        "profiles" : { }
      }
    }
  }
}
