1. References
Geek Time course "Elasticsearch Core Technology and Practice" by Ruan Yiming
Elasticsearch Aggregation: A Detailed Summary (aggregation statistics)
Elasticsearch Aggregations: Bucket Aggregations
Elasticsearch Aggregations: Metrics Aggregations
Elasticsearch Aggregations: Pipeline Aggregations
Official docs: search-aggregations
Geo distance filter
Elasticsearch: An Introduction to Aggregation
ES Aggregation in Detail
Aggregation in Detail 1 (overview)
Aggregation in Detail 2 (metrics aggregations)
Aggregation in Detail 3 (bucket aggregation)
Aggregation in Detail 4 (pipeline aggregations)
[Elasticsearch] Filtering Queries and Aggregations
Official docs: search-aggregations-bucket
Official docs: search-aggregations-metrics
Official docs: search-aggregations-pipeline
Official docs: search-aggregations-matrix
Using a bucket script aggregation inside filter aggreagtion
Question: with a nested query, how do you aggregate inside it and then filter?
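2. Getting started
Data preparation: <Elasticsearch 7.x In Depth: Data Preparation>
Aggregation categories
Aggregations provide aggregated data based on a search query. They fall into the following categories:
- Bucket
A family of aggregations that build buckets, where each bucket is associated with a key and a document criterion. When the aggregation runs, every bucket criterion is evaluated against every document in the context; when a criterion matches, the document is considered to "fall into" that bucket. At the end of the process we get a list of buckets, each with the set of documents that "belong" to it.
- Metric
Aggregations that track and compute metrics over a set of documents.
- Matrix
A family of aggregations that operate on multiple fields and produce a matrix result from the values extracted from the requested document fields. Unlike Bucket and Metric aggregations, this family does not yet support scripting.
- Pipeline
Aggregations that aggregate the output of other aggregations and their associated metrics.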
Aggregation syntax
"aggregations" : { // keyword ("aggs" also works)
  "<aggregation_name>" : { // name of the aggregation, chosen freely
    "<aggregation_type>" : { // type of the aggregation
      <aggregation_body>
    }
    [,"meta" : { [<meta_data_body>] } ]?
    [,"aggregations" : { [<sub_aggregation>]+ } ]? // sub-aggregations
  }
  [,"<aggregation_name_2>" : { ... } ]* // sibling aggregations at the same level
}
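To make the nesting rules concrete, here is a minimal Python sketch that assembles a request body following the syntax above. The helper `agg` and the aggregation names `max_amount` and `total_amount` are made up for illustration; only the resulting JSON shape comes from the syntax definition.

```python
import json

def agg(agg_type, body, sub_aggs=None, meta=None):
    """Return {<aggregation_type>: body} plus optional "meta" and sub-aggregations."""
    definition = {agg_type: body}
    if meta is not None:
        definition["meta"] = meta
    if sub_aggs:
        definition["aggs"] = sub_aggs  # "aggs" is the short form of "aggregations"
    return definition

request_body = {
    "size": 0,
    "aggs": {
        # an aggregation that carries one sub-aggregation
        "group_by_goodsName": agg(
            "terms",
            {"field": "goodsName.keyword", "size": 10},
            sub_aggs={"max_amount": agg("max", {"field": "amount"})},
        ),
        # a sibling aggregation at the same level
        "total_amount": agg("sum", {"field": "amount"}),
    },
}

print(json.dumps(request_body, indent=2))
```

The printed JSON can be pasted as the body of a `GET /aggs_order/_search` request.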
Let's go through them one by one.
Bucket
The ES docs list many bucket aggregation types, so they are not all enumerated here.
- Terms
- Range
- Date Range
- Histogram
- Date Histogram
- ...
Example 1: terms
As a first example, let's see which products appear in the orders.
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"group_by_goodsName": {
"terms": {
"field": "goodsName.keyword",
"size": 10
}
}
}
}
Here is the result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_goodsName" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "IPhone 8 Plus",
"doc_count" : 2
},
{
"key" : "IPhone 9 Plus",
"doc_count" : 2
},
{
"key" : "IPhone 10 Plus",
"doc_count" : 1
}
]
}
}
}
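- Optimizing terms aggregation performance [set eager_global_ordinals to true in the mapping]
This is worth enabling when a field is aggregated on frequently while new documents keep being written.
- min_doc_count: the minimum document count for a term; only terms that match at least this many documents are returned.
Meaning of the fields in a terms aggregation response:
Field | Meaning |
---|---|
doc_count_error_upper_bound | Upper bound on the document count of any term that was left out of the response |
sum_other_doc_count | Total document count of all terms that are not in the returned buckets (overall total minus the returned total) |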
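As a rough illustration of where `sum_other_doc_count` comes from, the following sketch counts terms over a single in-memory list and keeps only the top `size` buckets. It is a simplification, not Elasticsearch's implementation: there is only one "shard", so the cross-shard error that `doc_count_error_upper_bound` describes is not modelled. The function name `terms_agg` is hypothetical.

```python
from collections import Counter

def terms_agg(values, size):
    """Top-`size` terms by doc count (ties broken by key ascending)."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ordered[:size]
    returned = sum(count for _, count in top)
    return {
        # documents whose term did not make it into the returned buckets
        "sum_other_doc_count": sum(counts.values()) - returned,
        "buckets": [{"key": key, "doc_count": count} for key, count in top],
    }

# The five goodsName values from the example above
goods = ["IPhone 8 Plus", "IPhone 8 Plus", "IPhone 9 Plus",
         "IPhone 9 Plus", "IPhone 10 Plus"]
result = terms_agg(goods, size=2)
print(result)
```

With `size` set to 2, the single "IPhone 10 Plus" order is no longer returned as a bucket and is counted in `sum_other_doc_count` instead.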
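Example 2: sub-aggregation
For each product, fetch the single most expensive order.
# Group by goodsName.keyword first, then sort each group by amount in descending order and take the first hit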
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"group_by_goodsName": {
"terms": {
"field": "goodsName.keyword",
"size": 10
},
"aggs": {
"more_amount": {
"top_hits": {
"size": 1,
"sort": [
{
"amount": {
"order": "desc"
}
}
]
}
}
}
}
}
}
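Here is the response. Note that the only IPhone 10 Plus order has no amount field, so its sort value is the smallest long value (-9223372036854775808).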
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_goodsName" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "IPhone 8 Plus",
"doc_count" : 2,
"more_amount" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "aggs_order",
"_type" : "_doc",
"_id" : "HAOY-SKXIS-LIWN",
"_score" : null,
"_source" : {
"platform" : "IOS",
"amount" : 1200,
"createTime" : "2020-04-15 10:00",
"originatorId" : 2,
"originatorName" : "李四",
"goodsId" : 1,
"goodsName" : "IPhone 8 Plus"
},
"sort" : [
1200
]
}
]
}
}
},
{
"key" : "IPhone 9 Plus",
"doc_count" : 2,
"more_amount" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "aggs_order",
"_type" : "_doc",
"_id" : "USYX_SJJSUL_XUSYA",
"_score" : null,
"_source" : {
"platform" : "PC",
"amount" : 500,
"createTime" : "2020-01-20 10:00",
"originatorId" : 1,
"originatorName" : "張三",
"goodsId" : 2,
"goodsName" : "IPhone 9 Plus"
},
"sort" : [
500
]
}
]
}
}
},
{
"key" : "IPhone 10 Plus",
"doc_count" : 1,
"more_amount" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "aggs_order",
"_type" : "_doc",
"_id" : "XXSA-KSUWL-USIA",
"_score" : null,
"_source" : {
"platform" : "PC",
"createTime" : "2020-01-20 10:00",
"originatorId" : 3,
"originatorName" : "王五",
"goodsId" : 3,
"goodsName" : "IPhone 10 Plus"
},
"sort" : [
-9223372036854775808
]
}
]
}
}
}
]
}
}
}
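Example 3: range
Group orders by price range. The result shows that each range is half-open, i.e. from is included and to is excluded: the order with amount 300 lands in 300.0-700.0, not in *-300.0.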
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"amount_range": {
"range": {
"field": "amount",
"ranges": [
{
"to": 300
},
{
"from": 300,
"to": 700
},
{
"key": "gt 700",
"from": 700
}
]
}
}
}
}
Here is the result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"amount_range" : {
"buckets" : [
{
"key" : "*-300.0",
"to" : 300.0,
"doc_count" : 1
},
{
"key" : "300.0-700.0",
"from" : 300.0,
"to" : 700.0,
"doc_count" : 2
},
{
"key" : "gt 700",
"from" : 700.0,
"doc_count" : 1
}
]
}
}
}
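The half-open behaviour can be reproduced with a small Python sketch. The function `range_agg` is hypothetical and only mimics the bucketing rule; the `amounts` list is reconstructed from the responses in this article (100, 500, 300, 1200 and one order without an amount).

```python
def range_agg(values, ranges):
    """Bucket values into [from, to) ranges; documents without a value are skipped."""
    buckets = []
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        default_key = f"{'*' if lo is None else float(lo)}-{'*' if hi is None else float(hi)}"
        count = sum(
            1 for v in values
            if v is not None
            and (lo is None or v >= lo)   # "from" is inclusive
            and (hi is None or v < hi)    # "to" is exclusive
        )
        buckets.append({"key": r.get("key", default_key), "doc_count": count})
    return buckets

amounts = [100, 500, 300, None, 1200]
ranges = [{"to": 300}, {"from": 300, "to": 700}, {"key": "gt 700", "from": 700}]
buckets = range_agg(amounts, ranges)
for bucket in buckets:
    print(bucket)
```

The order with amount 300 is counted in the second bucket only, which matches the doc counts 1, 2, 1 in the response above.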
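Example 4: script
First compute how many years ago each order was created (the year passed in as now minus the year of createTime), then group the orders by that value.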
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"group_by_year": {
"range": {
"script": {
"source": """
JodaCompatibleZonedDateTime dateTime = doc['createTime'].value;
return params.now - dateTime.getYear();
""",
"params": {
"now": 2020
}
},
"ranges": [
{
"to": 1
},
{
"from": 1,
"to": 3
},
{
"from": 3,
"to": 5
},
{
"from": 5
}
]
}
}
}
}
Example 5: geo_distance
Draw rings around a given origin and count the documents whose coordinates fall inside each ring.
GET /aggs_hotel/_search
{
"size": 0,
"aggs": {
"rings_around_amsterdam": {
"geo_distance": {
"field": "location",
"origin": {
"lon": 109.0000000,
"lat": 34.0000000
},
"ranges": [
{ "to" : 100000 },
{ "from" : 100000, "to" : 300000 },
{ "from" : 300000 }
]
}
}
}
}
Here is the result:
{
"took" : 82,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"rings_around_amsterdam" : {
"buckets" : [
{
"key" : "*-100000.0",
"from" : 0.0,
"to" : 100000.0,
"doc_count" : 6
},
{
"key" : "100000.0-300000.0",
"from" : 100000.0,
"to" : 300000.0,
"doc_count" : 0
},
{
"key" : "300000.0-*",
"from" : 300000.0,
"doc_count" : 2
}
]
}
}
}
- The unit parameter sets the distance unit; the default is m.
By default, the distance unit is m (meters) but it can also accept: mi (miles), in (inches),
yd (yards), km (kilometers), cm (centimeters), mm (millimeters).
- Setting keyed turns the buckets array into a hash with one entry per bucket key.
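The ring bucketing, the `unit` parameter and `keyed` can be sketched in plain Python. This is only an approximation under stated assumptions: distances use the haversine formula with an assumed mean Earth radius, only the units `m` and `km` are handled, and the three hotel coordinates are invented for the example.

```python
import math

EARTH_RADIUS_M = 6371008.8                 # assumed mean Earth radius in meters
UNIT_IN_METERS = {"m": 1.0, "km": 1000.0}  # subset of the supported units

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in meters."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def geo_distance_agg(points, origin, ranges, unit="m", keyed=False):
    distances = [
        haversine_m(origin["lat"], origin["lon"], p["lat"], p["lon"]) / UNIT_IN_METERS[unit]
        for p in points
    ]
    buckets = []
    for r in ranges:
        lo, hi = r.get("from", 0.0), r.get("to")
        key = f"{'*' if 'from' not in r else float(lo)}-{'*' if hi is None else float(hi)}"
        count = sum(1 for d in distances if d >= lo and (hi is None or d < hi))
        buckets.append({"key": key, "doc_count": count})
    # keyed=True returns a hash keyed by bucket key instead of a list
    return {b["key"]: b for b in buckets} if keyed else buckets

origin = {"lat": 34.0, "lon": 109.0}
hotels = [{"lat": 34.5, "lon": 109.0}, {"lat": 35.0, "lon": 109.0}, {"lat": 39.0, "lon": 109.0}]
rings_m = geo_distance_agg(hotels, origin, [{"to": 100000}, {"from": 100000, "to": 300000}, {"from": 300000}])
rings_km = geo_distance_agg(hotels, origin, [{"to": 100}, {"from": 100, "to": 300}, {"from": 300}],
                            unit="km", keyed=True)
print(rings_m)
print(rings_km)
```

Switching the unit to km only rescales the range boundaries; the same hotels end up in the same rings.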
Example 6: filter, nested
For the hotel "澤蘭雅家酒店", look up the price information for membership level 001 and a stay in [2020-05-01, 2020-05-03).
# First form: filter directly in the query
GET /aggs_hotel_price/_search
{
"size": 0,
"query": {
"constant_score": {
"filter": {
"term": {
"name.keyword": "澤蘭雅家酒店"
}
}
}
},
"aggs": {
"prices": {
"nested": {
"path": "prices"
},
"aggs": {
"group_by_level": {
"terms": {
"field": "prices.level",
"size": 1,
"include": "001"
},
"aggs": {
"date_range": {
"date_range": {
"field": "prices.selldate",
"ranges": [
{
"from": "2020-05-01",
"to": "2020-05-03"
}
]
},
"aggs": {
"stats": {
"stats": {
"field": "prices.price"
}
}
}
}
}
}
}
}
}
}
# Second form - not really recommended, because the hotel name is filtered inside an aggregation instead of narrowing the query first
GET /aggs_hotel_price/_search
{
"size": 0,
"aggs": {
"group_by_name": {
"filter": {
"term": {
"name.keyword": "澤蘭雅家酒店"
}
},
"aggs": {
"prices": {
"nested": {
"path": "prices"
},
"aggs": {
"group_by_level": {
"terms": {
"field": "prices.level",
"size": 1,
"include": "001"
},
"aggs": {
"date_range": {
"date_range": {
"field": "prices.selldate",
"ranges": [
{
"from": "2020-05-01",
"to": "2020-05-03"
}
]
},
"aggs": {
"stats": {
"stats": {
"field": "prices.price"
}
}
}
}
}
}
}
}
}
}
}
}
Here is the result of the second form [note that the two forms return differently shaped results, because the second one adds one more aggregation level]
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_name" : {
"doc_count" : 1,
"prices" : {
"doc_count" : 6,
"group_by_level" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "001",
"doc_count" : 2,
"date_range" : {
"buckets" : [
{
"key" : "2020-05-01-2020-05-03",
"from" : 1.5882912E12,
"from_as_string" : "2020-05-01",
"to" : 1.588464E12,
"to_as_string" : "2020-05-03",
"doc_count" : 2,
"stats" : {
"count" : 2,
"min" : 9.0,
"max" : 15.0,
"avg" : 12.0,
"sum" : 24.0
}
}
]
}
}
]
}
}
}
}
}
Metric
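Metric aggregations come in two kinds, single-value and multi-value. We look at each in turn.
Single-value (returns a single result)
The ES docs list many types, so they are not all enumerated here.
- min: minimum
- max: maximum
- avg: average
- sum: total
- cardinality: number of distinct values
Here is an example.
- Find the smallest payment amount among the orders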
GET /aggs_order/_search
{
"size": 0, // there is no query part here and we do not care about the hits, so size is 0
"aggs": { // keyword, must not be changed
"min_aggs": { // name of this aggregation, chosen freely
"min": { // aggregation type, must be one of the types listed above
"field": "amount" // field to aggregate on
}
}
}
}
The response looks like this:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"min_aggs" : {
"value" : 100.0 // the minimum value is returned here
}
}
}
We can also request several aggregations at once. Here we query the maximum, minimum and average payment amount of the orders.
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"min_aggs": {
"min": {
"field": "amount"
}
},
"max_aggs": {
"max": {
"field": "amount"
}
},
"avg_aggs": {
"avg": {
"field": "amount"
}
}
}
}
Here is the result:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"avg_aggs" : { // average
"value" : 525.0
},
"min_aggs" : { // minimum
"value" : 100.0
},
"max_aggs" : { // maximum
"value" : 1200.0
}
}
}
Multi-value (returns several results)
The ES docs list many types, so they are not all enumerated here.
- stats
- extended stats
- percentile
- percentile rank
- top hits
- ...
As an example, let's look at the summary statistics of amount across the orders, such as the maximum and the minimum.
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"stats_aggs": {
"stats": { // the aggregation type is stats, one of the multi-value types
"field": "amount"
}
}
}
}
Here is the response:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"stats_aggs" : {
"count" : 4, // number of documents that have an amount field
"min" : 100.0, // minimum
"max" : 1200.0, // maximum
"avg" : 525.0, // average
"sum" : 2100.0 // total
}
}
}
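The numbers in this response can be checked by hand. The sketch below is a plain Python stand-in for the stats aggregation, with the `amounts` list reconstructed from the responses in this article; documents without the field (here `None`) are skipped, which is why count is 4 even though the index holds 5 orders.

```python
def stats(values):
    """count, min, max, avg and sum over the documents that have a value."""
    present = [v for v in values if v is not None]
    if not present:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    total = float(sum(present))
    return {
        "count": len(present),
        "min": float(min(present)),
        "max": float(max(present)),
        "avg": total / len(present),
        "sum": total,
    }

amounts = [100, 500, 300, None, 1200]  # one order has no amount field
summary = stats(amounts)
print(summary)
```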
Pipeline
Pipeline aggregations aggregate over the results of other aggregations.
- Where the result of a pipeline aggregation appears in the response depends on its kind [the explanation below is excerpted from 冰鐘, with thanks]
- Sibling - takes the result of a sibling (same-level) aggregation as input and computes a new aggregation result from it. The new result sits at the same level as the sibling aggregation.
max_bucket, min_bucket, avg_bucket, sum_bucket
stats_bucket, extended_stats_bucket
percentiles_bucket
...
- Parent - takes the result of its parent aggregation as input. It can compute new buckets, or add new aggregation results to the existing buckets.
derivative
cumulative_sum
moving_fn [sliding window]
...
Every pipeline aggregation must specify buckets_path. Here is the syntax of that path.
buckets_path syntax
# Aggregation separator ">": expresses the parent/child relation between aggregations, e.g. "my_bucket>my_stats"
AGG_SEPARATOR = `>` ;
# Metric separator ".": selects one metric of a metric aggregation, e.g. "my_stats.avg"
# From my own experiments: use > between two bucket aggregations; between a bucket aggregation and a metric aggregation either > or . works; between a multi-value metric aggregation and one of its metrics use .
METRIC_SEPARATOR = `.` ;
# Name of the aggregation
AGG_NAME = <the name of the aggregation> ;
# For a multi-value metric aggregation, the name of the metric to read
METRIC = <the name of the metric (in case of multi-value metrics aggregation)> ;
# For a multi-bucket aggregation, selects the bucket with the given key
# e.g. sale_type['hat']>sales
MULTIBUCKET_KEY = `[<KEY_NAME>]`
# The complete path grammar:
PATH = <AGG_NAME><MULTIBUCKET_KEY>? (<AGG_SEPARATOR>, <AGG_NAME> )* ( <METRIC_SEPARATOR>, <METRIC> ) ;
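To see how a path decomposes under this grammar, here is a small hypothetical parser. It is a reading aid only: it assumes bucket keys are written in single quotes and contain no `>`, and it does not reproduce how Elasticsearch resolves paths internally.

```python
import re

# <AGG_NAME> optionally followed by ['<KEY_NAME>']
_SEGMENT = re.compile(r"([^\[\]]+)(?:\['([^']*)'\])?")

def parse_buckets_path(path):
    """Split a buckets_path into (aggregation name, bucket key) pairs and a metric."""
    segments = path.split(">")            # AGG_SEPARATOR
    metric = None
    last = segments[-1]
    if "." in last and not last.endswith("]"):
        last, metric = last.rsplit(".", 1)  # METRIC_SEPARATOR
        segments[-1] = last
    aggs = []
    for segment in segments:
        match = _SEGMENT.fullmatch(segment)
        if match is None:
            raise ValueError(f"cannot parse segment: {segment!r}")
        aggs.append((match.group(1), match.group(2)))
    return {"aggs": aggs, "metric": metric}

print(parse_buckets_path("my_bucket>my_stats.avg"))
print(parse_buckets_path("sale_type['hat']>sales"))
```

For `my_bucket>my_stats.avg` the parser reports two aggregation names and the metric `avg`; for `sale_type['hat']>sales` it reports the bucket key `hat` on the first aggregation and no metric.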
Example 1: min_bucket
Compute each person's average order amount, then pick the smallest of those averages.
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"group_by_originatorId": {
"terms": {
"field": "originatorName"
},
"aggs": {
"avg_amount": {
"avg": {
"field": "amount",
"missing": 0
}
}
}
},
"min_avg_amount": { // name of the pipeline aggregation, chosen freely
"min_bucket": { // keyword: the pipeline aggregation type
"buckets_path": "group_by_originatorId>avg_amount" // path to the aggregation to read from
}
}
}
}
Here is the result. Because min_bucket is a sibling pipeline aggregation, its result sits at the same level as the sibling aggregation it reads from.
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_originatorId" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "張三",
"doc_count" : 2,
"avg_amount" : {
"value" : 300.0
}
},
{
"key" : "王五",
"doc_count" : 2,
"avg_amount" : {
"value" : 150.0
}
},
{
"key" : "李四",
"doc_count" : 1,
"avg_amount" : {
"value" : 1200.0
}
}
]
},
"min_avg_amount" : {
"value" : 150.0,
"keys" : [
"王五"
]
}
}
}
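The response above can be recomputed by hand. The sketch below is a plain Python approximation of the two steps (per-bucket avg with `missing: 0`, then min_bucket over the buckets); the per-person amounts are reconstructed from the responses in this article, and the helper names are hypothetical.

```python
def avg_with_missing(values, missing=0):
    """Average where documents without a value count as `missing`."""
    filled = [missing if v is None else v for v in values]
    return sum(filled) / len(filled)

def min_bucket(buckets, metric_name):
    """Smallest metric value across sibling buckets, plus the keys that hold it."""
    values = {b["key"]: b[metric_name]["value"] for b in buckets}
    lowest = min(values.values())
    return {"value": lowest, "keys": [k for k, v in values.items() if v == lowest]}

# 王五 has two orders, one of them without an amount
orders = {"張三": [100, 500], "王五": [300, None], "李四": [1200]}
buckets = [
    {"key": name, "doc_count": len(amounts),
     "avg_amount": {"value": avg_with_missing(amounts)}}
    for name, amounts in orders.items()
]
result = min_bucket(buckets, "avg_amount")
print(result)
```

Because of `missing: 0`, the order without an amount pulls 王五's average down to 150, which is why 王五 is reported as the minimum even though the only recorded amount is 300.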
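Aggregation scope
By default an aggregation runs over the result set of the query.
The scope can be changed in the following ways.
post filter
Filters the hits after the aggregations have been computed.
# Bucket by name and compute each person's order amount stats [shown under aggregations in the response], then keep only 張三's orders [shown under hits in the response]
GET /aggs_order/_search
{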
"aggs": {
"group_by_originatorName": {
"terms": {
"field": "originatorName"
},
"aggs": {
"stats": {
"stats": {
"field": "amount"
}
}
}
}
},
"post_filter": {
"term": {
"originatorName": "張三"
}
}
}
Query result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "aggs_order",
"_type" : "_doc",
"_id" : "HASA-XSIAN-SIWU",
"_score" : 1.0,
"_source" : {
"platform" : "Android",
"amount" : 100,
"createTime" : "2019-05-20 10:00",
"originatorId" : 1,
"originatorName" : "張三",
"goodsId" : 1,
"goodsName" : "IPhone 8 Plus"
}
},
{
"_index" : "aggs_order",
"_type" : "_doc",
"_id" : "USYX_SJJSUL_XUSYA",
"_score" : 1.0,
"_source" : {
"platform" : "PC",
"amount" : 500,
"createTime" : "2020-01-20 10:00",
"originatorId" : 1,
"originatorName" : "張三",
"goodsId" : 2,
"goodsName" : "IPhone 9 Plus"
}
}
]
},
"aggregations" : {
"group_by_originatorName" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "張三",
"doc_count" : 2,
"stats" : {
"count" : 2,
"min" : 100.0,
"max" : 500.0,
"avg" : 300.0,
"sum" : 600.0
}
},
{
"key" : "王五",
"doc_count" : 2,
"stats" : {
"count" : 1,
"min" : 300.0,
"max" : 300.0,
"avg" : 300.0,
"sum" : 300.0
}
},
{
"key" : "李四",
"doc_count" : 1,
"stats" : {
"count" : 1,
"min" : 1200.0,
"max" : 1200.0,
"avg" : 1200.0,
"sum" : 1200.0
}
}
]
}
}
}
global
Inside a global aggregation, the restrictions of the query part are ignored.
GET /aggs_order/_search
{
"size": 0,
"query": {
"range": {
"amount": {
"gt": 100
}
}
},
"aggs": {
"group_by_originatorName": {
"terms": {
"field": "originatorName"
},
"aggs": {
"stats": {
"stats": {
"field": "amount"
}
}
}
},
"all": {
"global": {},
"aggs": {
"group_by_originatorName": {
"terms": {
"field": "originatorName"
},
"aggs": {
"stats": {
"stats": {
"field": "amount"
}
}
}
}
}
}
}
}
Let's look at the result and compare the two aggregations:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"all" : {
"doc_count" : 5,
"group_by_originatorName" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "張三",
"doc_count" : 2,
"stats" : {
"count" : 2,
"min" : 100.0,
"max" : 500.0,
"avg" : 300.0,
"sum" : 600.0
}
},
{
"key" : "王五",
"doc_count" : 2,
"stats" : {
"count" : 1,
"min" : 300.0,
"max" : 300.0,
"avg" : 300.0,
"sum" : 300.0
}
},
{
"key" : "李四",
"doc_count" : 1,
"stats" : {
"count" : 1,
"min" : 1200.0,
"max" : 1200.0,
"avg" : 1200.0,
"sum" : 1200.0
}
}
]
}
},
"group_by_originatorName" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "張三",
"doc_count" : 1,
"stats" : {
"count" : 1,
"min" : 500.0,
"max" : 500.0,
"avg" : 500.0,
"sum" : 500.0
}
},
{
"key" : "李四",
"doc_count" : 1,
"stats" : {
"count" : 1,
"min" : 1200.0,
"max" : 1200.0,
"avg" : 1200.0,
"sum" : 1200.0
}
},
{
"key" : "王五",
"doc_count" : 1,
"stats" : {
"count" : 1,
"min" : 300.0,
"max" : 300.0,
"avg" : 300.0,
"sum" : 300.0
}
}
]
}
}
}
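The three scopes in this section (query, post_filter, global) can be summarised in a short Python model. It is a deliberately simplified sketch: `search` is a made-up function, the aggregation is reduced to a per-name document count, and the five orders are reconstructed from the responses in this article.

```python
from collections import Counter

def search(docs, query=None, post_filter=None):
    matched = [d for d in docs if query is None or query(d)]
    return {
        # regular aggregations see the documents matched by the query
        "aggs": Counter(d["originatorName"] for d in matched),
        # a global aggregation ignores the query and sees every document
        "global_aggs": Counter(d["originatorName"] for d in docs),
        # post_filter only narrows the hits, after aggregations are computed
        "hits": [d for d in matched if post_filter is None or post_filter(d)],
    }

orders = [
    {"originatorName": "張三", "amount": 100},
    {"originatorName": "張三", "amount": 500},
    {"originatorName": "王五", "amount": 300},
    {"originatorName": "王五"},                    # no amount field
    {"originatorName": "李四", "amount": 1200},
]

response = search(
    orders,
    query=lambda d: d.get("amount") is not None and d["amount"] > 100,
    post_filter=lambda d: d["originatorName"] == "張三",
)
print(response["aggs"])
print(response["global_aggs"])
print(response["hits"])
```

The query keeps three orders, so the regular aggregation counts one order per person, while the global aggregation still counts all five; the post filter then leaves a single hit without touching either aggregation.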
Ordering
Ordering by built-in keys
- _count
- _key
Order the buckets by document count, then by bucket key
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"group_by_originatorName": {
"terms": {
"field": "originatorName",
"order": [
{"_count": "desc"},
{"_key": "desc"}
]
}
}
}
}
Ordering by a single-value sub-aggregation
Use a sub-aggregation that returns a single value, such as min, max or avg, as the sort criterion
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"group_by_originatorName": {
"terms": {
"field": "originatorName",
"order": {
"avg_amount": "desc"
}
},
"aggs": {
"avg_amount": {
"avg": {
"field": "amount"
}
}
}
}
}
}
Ordering by a multi-value sub-aggregation
Use one of the values returned by a multi-value sub-aggregation such as stats as the sort criterion
GET /aggs_order/_search
{
"size": 0,
"aggs": {
"group_by_originatorName": {
"terms": {
"field": "originatorName",
"order": {
"stats_amount.sum": "desc"
}
},
"aggs": {
"stats_amount": {
"stats": {
"field": "amount"
}
}
}
}
}
}
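The three ordering variants boil down to sorting buckets by one or more lookup paths. The following sketch shows that idea in plain Python; `order_buckets` and `lookup` are hypothetical helpers, and the bucket values are taken from the stats response shown earlier in this article.

```python
def lookup(bucket, path):
    """Resolve _count, _key, "agg_name" (single value) or "agg_name.metric"."""
    if path == "_count":
        return bucket["doc_count"]
    if path == "_key":
        return bucket["key"]
    agg_name, _, metric = path.partition(".")
    return bucket[agg_name][metric or "value"]

def order_buckets(buckets, order):
    ordered = list(buckets)
    # sort by the least significant criterion first; list.sort is stable
    for path, direction in reversed(order):
        ordered.sort(key=lambda b: lookup(b, path), reverse=(direction == "desc"))
    return ordered

buckets = [
    {"key": "張三", "doc_count": 2, "stats_amount": {"sum": 600.0}},
    {"key": "王五", "doc_count": 2, "stats_amount": {"sum": 300.0}},
    {"key": "李四", "doc_count": 1, "stats_amount": {"sum": 1200.0}},
]

by_count_then_key = order_buckets(buckets, [("_count", "desc"), ("_key", "desc")])
by_sum = order_buckets(buckets, [("stats_amount.sum", "desc")])
print([b["key"] for b in by_count_then_key])
print([b["key"] for b in by_sum])
```

Keys are compared by code point here, which is an assumption of the sketch rather than a statement about how Elasticsearch compares keyword values.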
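Exercise
With a nested query, how do you aggregate inside it and then filter?
Business scenario: there are 1,000,000 users and 500,000 red envelope records, and one user can have several red envelope records. First index 1,000,000 user documents, and in each user document use a nested field to store that user's current list of red envelopes.
A red envelope record has: an amount and an expiry date.
Requirement: implement a feature that answers how many users have a cumulative amount of still-valid red envelopes that reaches a given threshold.
I came across this question while looking for material, tried it out myself, and here are my answers.
- First approach
This approach needs a size in the terms aggregation. With multiple shards the results can be inexact, and a large size uses more memory, so use it with care.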
GET /aggs_user_envelope/_search
{
"size": 0,
"aggs": {
"aggs_nested": {
"nested": {
"path": "envelope"
},
"aggs": {
"filter_date": {
"filters": {
"filters": {
"range": {
"range": {
"envelope.until": {
"gte": "2020-05-30 00:00"
}
}
}
}
},
"aggs": {
"group_by_username": {
"terms": {
"field": "envelope.userId",
"size": 10
},
"aggs": {
"sum_of_money": {
"sum": {
"field": "envelope.money"
}
},
"filter_money": {
"bucket_selector": {
"buckets_path": {
"money": "sum_of_money"
},
"script": "params.money >= 50"
}
},
"sort": {
"bucket_sort": {
"sort": [
{"sum_of_money": {"order": "desc"}}
,{"_count": {"order": "desc"}}
,{"_key": {"order": "desc"}}
]
}
}
}
}
}
}
}
}
}
}
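- Second approach
The official docs recommend the composite aggregation for paging through buckets, similar to scroll. composite has its own restriction: at the time of writing, its sources could only be terms, histogram and date histogram.
First page: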
GET /aggs_user_envelope/_search
{
"size": 0,
"aggs": {
"nested_wrapper": {
"nested": {
"path": "envelope"
},
"aggs": {
"group_by_userName": {
"composite": {
"size": 2,
"sources": [
{
"userName": {
"terms": {
"field": "envelope.userId",
"missing_bucket": true
}
}
}
]
},
"aggs": {
"filter_date": {
"filter": {
"range": {
"envelope.until": {
"gte": "2020-05-30 00:00"
}
}
},
"aggs": {
"sum_of_money": {
"sum": {
"field": "envelope.money"
}
}
}
},
"filter_money": {
"bucket_selector": {
"buckets_path": {
"money": "filter_date>sum_of_money"
},
"script": "params.money >= 50"
}
},
"sort": {
"bucket_sort": {
"sort": [
{"filter_date>sum_of_money": {"order": "desc"}}
,{"_count": {"order": "desc"}}
,{"_key": {"order": "desc"}}
]
}
}
}
}
}
}
}
}
Here is the result of the first page:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"nested_wrapper" : {
"doc_count" : 9,
"group_by_userName" : {
"after_key" : {
"userName" : "10086" // use this as the starting point of the next page
},
"buckets" : [
{
"key" : {
"userName" : "10086"
},
"doc_count" : 3,
"filter_date" : {
"doc_count" : 3,
"sum_of_money" : {
"value" : 50.0
}
}
}
]
}
}
}
}
The second page has to specify after:
GET /aggs_user_envelope/_search
{
"size": 0,
"aggs": {
"nested_wrapper": {
"nested": {
"path": "envelope"
},
"aggs": {
"group_by_userName": {
"composite": {
"size": 2,
"sources": [
{
"userName": {
"terms": {
"field": "envelope.userId",
"missing_bucket": true
}
}
}
],
"after": {"userName" : "10086"} // after is specified here
},
"aggs": {
"filter_date": {
"filter": {
"range": {
"envelope.until": {
"gte": "2020-05-30 00:00"
}
}
},
"aggs": {
"sum_of_money": {
"sum": {
"field": "envelope.money"
}
}
}
},
"filter_money": {
"bucket_selector": {
"buckets_path": {
"money": "filter_date>sum_of_money"
},
"script": "params.money >= 50"
}
},
"sort": {
"bucket_sort": {
"sort": [
{"filter_date>sum_of_money": {"order": "desc"}}
,{"_count": {"order": "desc"}}
,{"_key": {"order": "desc"}}
]
}
}
}
}
}
}
}
}
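The paging loop implied by the two requests can be sketched in Python. Everything here is a simulation under stated assumptions: `composite_page` is a made-up stand-in for one search request, the envelope records are invented, and the sketch assumes that `after_key` is the last key of the page before the bucket_selector filter is applied, so a page can come back with fewer buckets than `size`.

```python
def composite_page(envelopes, size, after=None,
                   valid_from="2020-05-30 00:00", min_total=50):
    """One simulated page: composite on userId, date filter, sum, bucket_selector."""
    user_ids = sorted({e["userId"] for e in envelopes})
    if after is not None:
        user_ids = [u for u in user_ids if u > after]
    page = user_ids[:size]
    buckets = []
    for user_id in page:
        total = sum(e["money"] for e in envelopes
                    if e["userId"] == user_id and e["until"] >= valid_from)
        if total >= min_total:                      # bucket_selector
            buckets.append({"key": {"userName": user_id}, "sum_of_money": total})
    buckets.sort(key=lambda b: b["sum_of_money"], reverse=True)  # bucket_sort
    return {"after_key": page[-1] if page else None, "buckets": buckets}

def qualifying_users(envelopes, size=2):
    """Follow after_key until no more pages are returned."""
    users, after = [], None
    while True:
        page = composite_page(envelopes, size, after)
        if page["after_key"] is None:
            break
        users.extend(b["key"]["userName"] for b in page["buckets"])
        after = page["after_key"]
    return users

# Hypothetical nested envelope records
envelopes = [
    {"userId": "10010", "money": 20, "until": "2020-06-30 00:00"},
    {"userId": "10086", "money": 30, "until": "2020-06-30 00:00"},
    {"userId": "10086", "money": 20, "until": "2020-07-15 00:00"},
    {"userId": "10087", "money": 80, "until": "2020-05-01 00:00"},  # expired
    {"userId": "10088", "money": 60, "until": "2020-06-10 00:00"},
]
first_page = composite_page(envelopes, size=2)
print(first_page)
print(qualifying_users(envelopes))
```

The number of qualifying users is then the length of the collected list. Stopping on an empty page of buckets would be wrong in this model, because a page can be empty only because every bucket in it was filtered out.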