Query DSL
Query DSL is a full-featured JSON-style query language that enables complex searching, filtering, and aggregations. It is the original and most powerful query language for Elasticsearch today.
The _search endpoint accepts queries written in Query DSL syntax.
Query DSL supports a wide range of search techniques, including the following:
- Full-text search: Search text that has been analyzed and indexed to support phrase or proximity queries, fuzzy matches, and more.
- Keyword search: Search for exact matches using keyword fields.
- Semantic search: Search semantic_text fields using dense or sparse vector search on embeddings generated in your Elasticsearch cluster.
- Vector search: Search for similar dense vectors using the kNN algorithm for embeddings generated outside of Elasticsearch.
- Geospatial search: Search for locations and calculate spatial relationships using geospatial queries.
You can also filter data using Query DSL. Filters enable you to include or exclude documents by retrieving documents that match specific field-level criteria. A query that uses the filter parameter indicates filter context.
Aggregations are the primary tool for analyzing Elasticsearch data using Query DSL. Aggregations enable you to build complex summaries of your data and gain insight into key metrics, patterns, and trends.
Because aggregations leverage the same data structures used for search, they are also very fast. This enables you to analyze and visualize your data in real time. You can search documents, filter results, and perform analytics at the same time, on the same data, in a single request. That means aggregations are calculated in the context of the search query.
The following aggregation types are available:
- Metric: Calculate metrics, such as a sum or average, from field values.
- Bucket: Group documents into buckets based on field values, ranges, or other criteria.
- Pipeline: Run aggregations on the results of other aggregations.
Run aggregations by specifying the search API's aggs parameter. Learn more in Run an aggregation.
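As a minimal sketch of searching, filtering, and aggregating in a single request, the Python snippet below builds a search request body as a plain dictionary and prints it as JSON. The field names (status, category, price) and the aggregation names (by_category, avg_price, max_avg_price) are hypothetical and chosen only for illustration; the resulting body would be sent to the _search endpoint.

```python
import json

# Hypothetical fields: status (keyword), category (keyword), price (numeric).
search_body = {
    "size": 0,  # return only aggregation results, no individual hits
    "query": {
        "bool": {"filter": [{"term": {"status": "published"}}]}
    },
    "aggs": {
        # Bucket aggregation: group documents by category.
        "by_category": {
            "terms": {"field": "category"},
            # Metric aggregation: average price inside each bucket.
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        },
        # Pipeline aggregation: pick the bucket with the highest average.
        "max_avg_price": {
            "max_bucket": {"buckets_path": "by_category>avg_price"}
        },
    },
}

print(json.dumps(search_body, indent=2))
```

Because the query and the aggs sit in the same body, the aggregations are computed only over documents that match the query.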
Think of the Query DSL as an AST (Abstract Syntax Tree) of queries, consisting of two types of clauses:
- Leaf query clauses: Leaf query clauses look for a particular value in a particular field, such as the match, term, or range queries. These queries can be used by themselves.
- Compound query clauses: Compound query clauses wrap other leaf or compound queries and are used to combine multiple queries in a logical fashion (such as the bool or dis_max query), or to alter their behavior (such as the constant_score query).
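To make the tree structure concrete, the following sketch represents a query as nested Python dictionaries and walks it to list the leaf clauses it contains. The helper leaf_query_types and the COMPOUND_CHILD_KEYS table are hypothetical names introduced for this illustration, and the table covers only the three compound queries named above; it is not a complete list of compound query types.

```python
# Compound query types named above, mapped to the keys that hold nested clauses.
COMPOUND_CHILD_KEYS = {
    "bool": ("must", "filter", "should", "must_not"),
    "dis_max": ("queries",),
    "constant_score": ("filter",),
}

def leaf_query_types(clause):
    """Return the leaf query types found under a clause, in traversal order."""
    # Each clause is a one-key dictionary: {query_type: body}.
    query_type, body = next(iter(clause.items()))
    if query_type not in COMPOUND_CHILD_KEYS:
        return [query_type]  # anything not in the table is treated as a leaf
    leaves = []
    for key in COMPOUND_CHILD_KEYS[query_type]:
        children = body.get(key, [])
        if isinstance(children, dict):  # a single clause may appear without a list
            children = [children]
        for child in children:
            leaves.extend(leaf_query_types(child))
    return leaves

query = {
    "bool": {
        "must": [{"match": {"title": "search"}}],
        "should": [
            {"dis_max": {"queries": [
                {"term": {"status": "published"}},
                {"range": {"publish_date": {"gte": "2015-01-01"}}},
            ]}},
            {"constant_score": {"filter": {"term": {"status": "published"}}}},
        ],
    }
}

print(leaf_query_types(query))  # ['match', 'term', 'range', 'term']
```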
Query clauses behave differently depending on whether they are used in query context or filter context.
Allow expensive queries: Certain types of queries will generally execute slowly due to the way they are implemented, which can affect the stability of the cluster. Those queries can be categorized as follows:
Queries that need to do linear scans to identify matches:
- script queries
- queries on numeric, date, boolean, ip, geo_point or keyword fields that are not indexed but have doc values enabled
Queries that have a high up-front cost:
- fuzzy queries (except on wildcard fields)
- regexp queries (except on wildcard fields)
- prefix queries (except on wildcard fields or those without index_prefixes)
- wildcard queries (except on wildcard fields)
- range queries on text and keyword fields
Queries that may have a high per-document cost:
The execution of such queries can be prevented by setting the value of the search.allow_expensive_queries setting to false (defaults to true).
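As a rough sketch, the snippet below builds a request body that sets search.allow_expensive_queries to false. It assumes the setting is changed through the cluster update settings API (PUT _cluster/settings) and stored as a persistent setting; whether you can change cluster settings at all depends on your deployment type, so treat this as an illustration and not a recommended configuration.

```python
import json

# Assumption: sent as the body of PUT _cluster/settings.
settings_body = {
    "persistent": {
        "search.allow_expensive_queries": False  # serialized as JSON false
    }
}

print(json.dumps(settings_body))
```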
By default, Elasticsearch sorts matching search results by relevance score, which measures how well each document matches a query.
The relevance score is a positive floating point number, returned in the _score metadata field of the search API. The higher the _score, the more relevant the document. While each query type can calculate relevance scores differently, score calculation also depends on whether the query clause is run in a query or filter context.
In the query context, a query clause answers the question “How well does this document match this query clause?” Besides deciding whether or not the document matches, the query clause also calculates a relevance score in the _score metadata field.
Query context is in effect whenever a query clause is passed to a query parameter, such as the query parameter in the search API.
A filter answers the binary question “Does this document match this query clause?”. The answer is simply "yes" or "no". Filtering has several benefits:
- Simple binary logic: In a filter context, a query clause determines document matches based on a yes/no criterion, without score calculation.
- Performance: Because they don’t compute relevance scores, filters execute faster than queries.
- Caching: Elasticsearch automatically caches frequently used filters, speeding up subsequent search performance.
- Resource efficiency: Filters consume fewer CPU resources than full-text queries.
- Query combination: Filters can be combined with scored queries to refine result sets efficiently.
Filters are particularly effective for querying structured data and implementing "must have" criteria in complex searches.
Structured data refers to information that is highly organized and formatted in a predefined manner. In the context of Elasticsearch, this typically includes:
- Numeric fields (integers, floating-point numbers)
- Dates and timestamps
- Boolean values
- Keyword fields (exact match strings)
- Geo-points and geo-shapes
Unlike full-text fields, structured data has a consistent, predictable format, making it ideal for precise filtering operations.
Common filter applications include:
- Date range checks: for example, is the timestamp field between 2015 and 2016?
- Specific field value checks: for example, is the status field equal to "published", or is the author field equal to "John Doe"?
Filter context applies when a query clause is passed to a filter parameter, such as:
- filter or must_not parameters in bool queries
- filter parameter in constant_score queries
- filter aggregations
Filters optimize query performance and efficiency, especially for structured data queries and when combined with full-text searches.
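The two sketches below show clauses placed in filter context, written as Python dictionaries that serialize to search request bodies. The field names (timestamp, status) follow the examples above, while the date bounds and the "draft" value are assumptions made for illustration.

```python
import json

# Date range check inside a bool query: filter and must_not both run in
# filter context, so neither clause contributes to the score.
date_range_body = {
    "query": {
        "bool": {
            "filter": [
                # Assumed bounds covering all of 2015 and 2016.
                {"range": {"timestamp": {"gte": "2015-01-01", "lt": "2017-01-01"}}}
            ],
            "must_not": [{"term": {"status": "draft"}}],
        }
    }
}

# Field value check inside constant_score: every matching document
# receives the same score, equal to the boost.
status_body = {
    "query": {
        "constant_score": {
            "filter": {"term": {"status": "published"}},
            "boost": 1.0,
        }
    }
}

print(json.dumps(date_range_body, indent=2))
print(json.dumps(status_body, indent=2))
```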
Below is an example of query clauses being used in query and filter context in the search API. This query will match documents where all of the following conditions are met:
- The title field contains the word search.
- The content field contains the word elasticsearch.
- The status field contains the exact word published.
- The publish_date field contains a date from 1 Jan 2015 onwards.
GET /_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Search" }},
        { "match": { "content": "Elasticsearch" }}
      ],
      "filter": [
        { "term": { "status": "published" }},
        { "range": { "publish_date": { "gte": "2015-01-01" }}}
      ]
    }
  }
}
- The query parameter indicates query context.
- The bool and two match clauses are used in query context, which means that they are used to score how well each document matches.
- The filter parameter indicates filter context. Its term and range clauses are used in filter context. They will filter out documents which do not match, but they will not affect the score for matching documents.
Scores calculated for queries in query context are represented as single-precision floating point numbers, which have only 24 bits of significand precision. A score calculation whose result needs more than 24 significant bits is rounded to the nearest representable float, with loss of precision.
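The effect of the 24-bit significand can be seen with a small Python sketch that rounds values to single precision using the standard struct module. The helper name to_float32 is hypothetical, and the snippet illustrates the number format only; it does not reproduce how Elasticsearch computes scores.

```python
import struct

def to_float32(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

exact = float(2 ** 24)            # 16777216 fits in a 24-bit significand
too_precise = float(2 ** 24 + 1)  # 16777217 needs 25 significant bits

print(to_float32(exact))        # 16777216.0
print(to_float32(too_precise))  # 16777216.0, the trailing 1 is lost
```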
Use query clauses in query context for conditions which should affect the score of matching documents (i.e. how well does the document match), and use all other query clauses in filter context.