ES Bool Query

Advanced product search module on an e-commerce site

https://s1.ax1x.com/2022/06/18/XXFxaQ.png

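Suppose a product search must satisfy the following conditions:

  • The product belongs to the hot-selling album
  • The supplier is 8848
  • The price is between 100 and 200 yuan
  • ……

How can all of these conditions be satisfied at once with good performance? Use a compound query: the bool query.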
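
As a minimal sketch, the three conditions above could be combined into one bool query like this. The request body is built as a Python dict and printed as JSON. The field names `album`, `supplier_id`, and `price`, the value "hot", and the helper `build_product_query` are all hypothetical, since the source does not show the index mapping; the price bounds are assumed to be inclusive. Every condition is placed in `filter` because none of them needs to influence the score, and filter clauses are candidates for caching.

```python
import json


def build_product_query(album, supplier_id, min_price, max_price):
    """Build a bool query body. All field names here are hypothetical."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"album": album}},
                    {"term": {"supplier_id": supplier_id}},
                    # Assumes "between 100 and 200" includes both bounds.
                    {"range": {"price": {"gte": min_price, "lte": max_price}}},
                ]
            }
        }
    }


body = build_product_query("hot", "8848", 100, 200)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted under a `POST /<index>/_search` line in the Kibana console once the field names are replaced with the real ones.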

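1. Bool Query: compound queries with multiple conditions

  • A bool query is a combination of one or more query clauses
    • There are four clause types (must, should, must_not, filter): two of them affect scoring and two do not
  • The more clauses a document matches, the higher its relevance score. In a bool query, the score computed by each scoring clause is merged into the overall relevance score
  • When there is no must or filter clause, at least one should clause must match; otherwise should clauses are optional by default
  • The minimum_should_match parameter controls how many should clauses must match; it can be an absolute number or a percentage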

Clause     Context          Must match?       Contributes to score?
must       Query context    Yes               Yes
should     Query context    Optional          Yes (bonus)
must_not   Filter context   Must not match    No
filter     Filter context   Yes               No
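
To illustrate the minimum_should_match bullet above, here is a small sketch that resolves the two forms mentioned there (an absolute number or a percentage) into a required clause count. `required_should_matches` is a hypothetical helper, not an Elasticsearch API, and it deliberately ignores negative values and combined expressions. It rounds percentages down, which is how Elasticsearch treats them.

```python
def required_should_matches(minimum_should_match, should_count):
    """Resolve a non-negative minimum_should_match to a clause count.

    Accepts an integer such as 2 or a percentage string such as "75%".
    Percentages are rounded down.
    """
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        return should_count * percent // 100  # integer division rounds down
    return int(minimum_should_match)


print(required_should_matches(2, 5))      # 2
print(required_should_matches("75%", 4))  # 3
print(required_should_matches("50%", 3))  # 1
```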

1.1 Bool query syntax
  • Sub-queries can appear in any order
  • Multiple sub-queries can be nested

https://s1.ax1x.com/2022/04/09/LPlxSJ.md.png

1.2 Structured queries: the "contains, not equals" problem

https://s1.ax1x.com/2022/04/09/LP1cc9.md.png

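Solution: change the index structure by adding a genre_count field that stores the number of genres, then combine both conditions in a bool query.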

POST /newmovies/_bulk
{ "index":{ "_id":1}}
{ "title" : "Father of the Bride Part II","year":1995, "genre":"Comedy","genre_count":1}
{ "index":{ "_id": 2}}
{ "title" : "Dave","year":1993,"genre":["Comedy","Romance"],"genre_count":2 } 


GET newmovies/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"genre.keyword": {"value": "Comedy"}}},
        {"term": {"genre_count": {"value": "1"}}}
      ]
    }
  }
}
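
The reason the extra field is needed can be shown with a toy model of term matching on a multi-valued field. This is a sketch for illustration only, not Elasticsearch code: `term_matches` is a hypothetical helper that mimics "a term query matches when any value of the field equals the term".

```python
def term_matches(doc, field, value):
    """Toy term query: an array field matches if ANY element equals value."""
    field_value = doc.get(field)
    values = field_value if isinstance(field_value, list) else [field_value]
    return value in values


movies = [
    {"_id": 1, "genre": "Comedy", "genre_count": 1},
    {"_id": 2, "genre": ["Comedy", "Romance"], "genre_count": 2},
]

# A term query on genre alone behaves like "contains": both movies match.
contains = [m["_id"] for m in movies if term_matches(m, "genre", "Comedy")]

# Adding genre_count == 1 narrows it to "is exactly Comedy".
only_comedy = [
    m["_id"]
    for m in movies
    if term_matches(m, "genre", "Comedy") and term_matches(m, "genre_count", 1)
]

print(contains)     # [1, 2]
print(only_comedy)  # [1]
```
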

1.3 Nested bool queries

https://s1.ax1x.com/2022/04/09/LP8gdx.md.png

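3. Query structure and relevance scoring

  • Competing clauses at the same level carry the same weight
  • Nesting bool queries changes how much each clause influences the score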

https://s1.ax1x.com/2022/04/09/LPGpOs.md.png

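In the query on the right, brown and red together contribute only as much as a single clause one level up. Nesting is therefore a way to change the relative weight of different terms.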

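4. Controlling query precision

Boosting
  • Boosting is one way to control relevance
  • When boost > 1, the clause's relative weight in the score increases
  • When 0 < boost < 1, the clause's relative weight decreases
  • When boost < 0, the clause contributes a negative score (Elasticsearch 7.0 and later reject negative boost values)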

https://s1.ax1x.com/2022/04/09/LPGanA.md.png
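
To make the effect of boost concrete, here is a toy model (not Elasticsearch's actual scoring code) of a bool query with two should clauses: each matching clause's score is multiplied by its boost and the results are summed. The clause names `title` and `content` and all raw scores are made-up values for illustration.

```python
def bool_should_score(clause_scores, boosts):
    """Sum boost * score over the should clauses of one document.

    clause_scores maps a clause name to its raw score (hypothetical values).
    boosts maps a clause name to its boost; missing names default to 1.
    """
    return sum(score * boosts.get(name, 1) for name, score in clause_scores.items())


# Hypothetical raw scores in arbitrary units.
doc_a = {"title": 3, "content": 5}
doc_b = {"title": 5, "content": 3}

# Equal boosts: the two documents tie.
print(bool_should_score(doc_a, {}), bool_should_score(doc_b, {}))  # 8 8

# Doubling the title boost favors the document that matches better on title.
title_boost = {"title": 2}
print(bool_should_score(doc_a, title_boost), bool_should_score(doc_b, title_boost))  # 11 13
```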

Example: products from Apple (the company) should be shown first, ahead of food-related content.

POST /news/_bulk
{ "index": { "_id":1}} 
{ "content":"Apple Mac" }
{ "index": { "_id": 2 }} 
{ "content":"Apple iPad"} 
{ "index":{ "_id":3}}
{ "content":"Apple employee like Apple Pie and Apple Juice"} 


# Document 3 ranks near the top
POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {"content": "apple"}
      }
    }
  }
}

# Document 3 is excluded
POST news/_search
{
  "query": {
    "bool": {
      "must": {"match": {"content": "apple"}},
      "must_not": {"match": {"content": "pie"}}
    }
  }
}

# Document 3 is still returned, but ranked lower
POST news/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {"content": "apple"}},
      "negative": {
        "match": {"content": "pie"}
      },
      "negative_boost": 0.5
    }
  }
}
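
The behavior of the boosting query can be sketched with a toy model: a document that also matches the negative query stays in the results, but its score from the positive query is multiplied by negative_boost. The positive scores below are made up for illustration (real values depend on the index), and `boosting_score` and `rank` are hypothetical helpers, not Elasticsearch APIs.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Toy boosting query: demote, but do not drop, negative matches."""
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# Hypothetical positive-query scores for the three news documents.
docs = [
    {"_id": 1, "content": "Apple Mac", "score": 1.0},
    {"_id": 2, "content": "Apple iPad", "score": 1.0},
    {"_id": 3, "content": "Apple employee like Apple Pie and Apple Juice", "score": 1.2},
]


def rank(docs, negative_term, negative_boost):
    scored = []
    for doc in docs:
        hit = negative_term in doc["content"].lower().split()
        scored.append((boosting_score(doc["score"], hit, negative_boost), doc["_id"]))
    # Highest score first; ties are broken by _id for a deterministic order.
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


print(rank(docs, "pie", 1.0))  # no demotion: [3, 1, 2]
print(rank(docs, "pie", 0.5))  # document 3 demoted: [1, 2, 3]
```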

Summary

  • Query Context vs. Filter Context
  • Bool query with multiple conditions
  • How query structure affects scoring
  • Controlling query precision

Complete query statements

# bool query syntax

POST /products/_search
{
  "query": {
    "bool": {
      "must": {
        "term": {"price": "30"}
      },
      "filter": {
        "term": {"avaliable": "true"}
      },
      "must_not": {
        "range": {
          "price": {"lte": 10}
        }
      },
      "should": [
        {
          "term": {"productID.keyword": "JODL-X-1937-#pV7"}
        },
        {
          "term": {"productID.keyword": "XHDK-A-1293-#fJ3"}
        }
      ],
      "minimum_should_match": 1
    }
  }
}


# Nested bool query that implements "should not" logic
POST /products/_search
{
  "query": {
    "bool": {
      "must": {
        "term": {
          "price": "30"
        }
      },
      "should": [
        {
          "bool": {
            "must_not": {
              "term": {
                "avaliable": "false"
              }
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}


# Structured queries: the "contains, not equals" problem
DELETE movies

POST /movies/_bulk
{ "index":{ "_id":1}}
{ "title" : "Father of the Bride Part II","year":1995, "genre":"Comedy"}
{ "index":{ "_id": 2}}
{ "title" : "Dave","year":1993,"genre":["Comedy","Romance"]}


POST /movies/_search
{
  "query": {"constant_score": {
    "filter": {
      "term": {
        "genre.keyword": "Comedy"
      }
    }
  }}
}

# Adjust the structure: add genre_count
POST /newmovies/_bulk
{ "index":{ "_id":1}}
{ "title" : "Father of the Bride Part II","year":1995, "genre":"Comedy","genre_count":1}
{ "index":{ "_id": 2}}
{ "title" : "Dave","year":1993,"genre":["Comedy","Romance"],"genre_count":2 } 


# must: clauses are scored
GET newmovies/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"genre.keyword": {"value": "Comedy"}}},
        {"term": {"genre_count": {"value": "1"}}}
      ]
    }
  }
}

# filter: clauses are not scored
GET newmovies/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"genre.keyword": {"value": "Comedy"}}},
        {"term": {"genre_count": {"value": "1"}}}
      ]
    }
  }
}




# Use boost to control scoring
POST /blogs/_bulk
{ "index": { "_id":1}}
{"title":"Apple iPad", "content":"Apple iPad,Apple iPad"}
{ "index":{ "_id":2}}
{"title":"Apple iPad,Apple iPad", "content":"Apple iPad"}



POST /blogs/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {
          "title":{
            "query": "apple,ipad",
            "boost": 2
          }
        }},
        
        {"match": {
          "content": {
            "query": "apple,ipad",
            "boost": 1
          }
        }}
      ]
    }
  }
}


# Goal: show products from Apple (the company) first, ahead of food-related content
POST /news/_bulk
{ "index": { "_id":1}} 
{ "content":"Apple Mac" }
{ "index": { "_id": 2 }} 
{ "content":"Apple iPad"} 
{ "index":{ "_id":3}}
{ "content":"Apple employee like Apple Pie and Apple Juice"} 


# Document 3 ranks near the top
POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "content": "apple"
        }
      }
    }
  }
}


POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {"content": "apple"}
      },
      "must_not": {
        "match": {"content": "pie"}
      }
    }
  }
}

# Document 3 should still be returned, but ranked lower
POST news/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "content": "apple"
        }
      },
      "negative": {
        "match": {
          "content": "pie"
        }
      },
      "negative_boost": 0.2
    }
  }
}