Regexp query

edit

Returns documents that contain terms matching a regular expression.

A regular expression is a way to match patterns in data using placeholder characters, called operators. For a list of operators supported by the regexp query, see Regular expression syntax.

Example request

edit

The following search returns documents where the user.id field contains any term that begins with k and ends with y. The .* operators match any characters of any length, including no characters. Matching terms can include ky, kay, and kimchy.

response = client.search(
  body: {
    query: {
      regexp: {
        "user.id": {
          value: 'k.*y',
          flags: 'ALL',
          case_insensitive: true,
          max_determinized_states: 10_000,
          rewrite: 'constant_score'
        }
      }
    }
  }
)
puts response
GET /_search
{
  "query": {
    "regexp": {
      "user.id": {
        "value": "k.*y",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 10000,
        "rewrite": "constant_score"
      }
    }
  }
}

Top-level parameters for regexp

edit
<field>
(Required, object) Field you wish to search.

Parameters for <field>

edit
value

(Required, string) Regular expression for terms you wish to find in the provided <field>. For a list of supported operators, see Regular expression syntax.

By default, regular expressions are limited to 1,000 characters. You can change this limit using the index.max_regex_length setting.

The performance of the regexp query can vary based on the regular expression provided. To improve performance, avoid using wildcard patterns, such as .* or .*?+, without a prefix or suffix.

flags
(Optional, string) Enables optional operators for the regular expression. For valid values and more information, see Regular expression syntax.
case_insensitive [7.10.0] Added in 7.10.0.
(Optional, Boolean) Allows case insensitive matching of the regular expression value with the indexed field values when set to true. Default is false which means the case sensitivity of matching depends on the underlying field’s mapping.
max_determinized_states

(Optional, integer) Maximum number of automaton states required for the query. Default is 10000.

Elasticsearch uses Apache Lucene internally to parse regular expressions. Lucene converts each regular expression to a finite automaton containing a number of determinized states.

You can use this parameter to prevent that conversion from unintentionally consuming too many resources. You may need to increase this limit to run complex regular expressions.

rewrite
(Optional, string) Method used to rewrite the query. For valid values and more information, see the rewrite parameter.

Notes

edit

Allow expensive queries

edit

Regexp queries will not be executed if search.allow_expensive_queries is set to false.