Elasticsearch highlights
This list summarizes the most important enhancements in Elasticsearch 8.0. For the complete list, go to Elasticsearch release highlights.
7.x REST API compatibility
8.0 introduces several breaking changes to the Elasticsearch REST APIs. While it’s important to update your application to account for these changes, finding and updating every API call in a single upgrade can be painful and error-prone. To make this process easier, we’ve added support for 7.x compatibility headers to our REST APIs. In many cases, these optional headers let you make 7.x-compatible requests to an 8.0 cluster and receive 7.x-compatible responses.
While we still recommend you update your application to use native 8.0 requests and responses, the 7.x API compatibility headers let you safely make these changes over a longer period of time.
For more information about the headers and how to use them, see REST API compatibility.
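As a minimal sketch of what opting in can look like, the Python snippet below builds (but does not send) a request that carries the compatibility media type. The media type string follows the REST API compatibility documentation; the cluster URL, index name, and the helper build_compat_request are placeholders assumed for illustration only.

```python
import json
import urllib.request

# Media type that asks an 8.x cluster for 7.x-compatible behavior.
COMPAT_7 = "application/vnd.elasticsearch+json; compatible-with=7"


def build_compat_request(url, body=None):
    """Build (without sending) a request that opts in to 7.x compatibility.

    Hypothetical helper: it only shows where the headers go.
    """
    headers = {"Accept": COMPAT_7}
    data = None
    if body is not None:
        # A request that carries a body declares the same media type for it.
        headers["Content-Type"] = COMPAT_7
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers)


# Assumed local cluster and index name, used only for illustration.
request = build_compat_request(
    "https://localhost:9200/my-index/_search",
    body={"query": {"match_all": {}}},
)
```

Because the headers are set per request, an application could migrate one call site at a time and drop the header from each call once it has been updated to native 8.0 behavior.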
Security features are enabled and configured by default
Running Elasticsearch without security leaves your cluster exposed to anyone who can send network traffic to Elasticsearch. In previous versions, you had to explicitly enable the Elasticsearch security features such as authentication, authorization, and network encryption (TLS). Starting in Elasticsearch 8.0, security is enabled and configured by default when you start Elasticsearch for the first time.
At startup, we generate enrollment tokens that you use to connect a Kibana instance or enroll additional nodes in your secured Elasticsearch cluster, without having to generate security certificates or update YAML configuration files. Just use the generated enrollment token when starting new nodes or Kibana instances, and the Elastic Stack handles all of the security configuration for you. Out of the box, you’ll get:
- User authentication
- User authorization
- Encrypted internode communication with TLS
- Encrypted communication between Elasticsearch and Kibana with TLS
Need a new enrollment token? Use the elasticsearch-create-enrollment-token tool to create enrollment tokens for Elasticsearch nodes and Kibana instances.
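The sketch below shows one way to assemble the command line for that tool from a script. It assumes the tool lives under bin/ in the Elasticsearch home directory and accepts a -s scope of node or kibana; the helper name and the installation path are hypothetical.

```python
import shlex

# Scopes assumed to be accepted by the tool: one per kind of enrollee.
VALID_SCOPES = ("node", "kibana")


def enrollment_token_command(scope, es_home="."):
    """Return the argv list for creating an enrollment token (hypothetical helper)."""
    if scope not in VALID_SCOPES:
        raise ValueError(f"scope must be one of {VALID_SCOPES}, got {scope!r}")
    return [f"{es_home}/bin/elasticsearch-create-enrollment-token", "-s", scope]


# Example installation path, assumed for illustration.
kibana_cmd = enrollment_token_command("kibana", es_home="/usr/share/elasticsearch")
print(shlex.join(kibana_cmd))
```

The argv list can be passed to subprocess.run on a host where Elasticsearch is installed; the snippet itself only prints the command.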
Better protection for system indices
System indices store configurations and internal data for Elastic features. Generally, system indices are reserved only for internal use by these features. Although it is possible, directly accessing or changing a system index can cause instability and other issues.
In 8.0, we’ve made several changes to protect system indices from direct access.
To access a system index, you must now have the allow_restricted_indices permission set to true. The superuser role also no longer gives write access to system indices. As a result, the built-in elastic superuser can’t change system indices by default.
If available, use Kibana or the associated Elasticsearch APIs to manage data for a feature rather than accessing a system index. If you attempt to directly access a system index, Elasticsearch will return a warning in the header of API responses and in the deprecation logs.
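For orientation only, the sketch below shows where the allow_restricted_indices flag sits in a role definition body of the kind accepted by the create or update roles API. The helper name, the index pattern, and the privilege are assumptions chosen for illustration; managing feature data through Kibana or the feature's own APIs remains the recommended path.

```python
import json


def restricted_index_role(index_patterns, privileges):
    """Build a role body that grants privileges on restricted (system) indices.

    Hypothetical helper: it returns a dict and does not call Elasticsearch.
    """
    return {
        "indices": [
            {
                "names": list(index_patterns),
                "privileges": list(privileges),
                # Without this flag, the entry does not apply to system indices.
                "allow_restricted_indices": True,
            }
        ]
    }


# Illustrative pattern and privilege; read-only access is the safer choice.
role_body = restricted_index_role([".kibana*"], ["read"])
print(json.dumps(role_body, indent=2))
```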
New kNN search API
With 8.0, we’re introducing a technical preview of the kNN search API. Using dense_vector fields, a k-nearest neighbor (kNN) search finds the k nearest vectors to a query vector, as measured by a similarity metric. kNN is commonly used to power recommendation engines and rank relevancy based on natural language processing (NLP) algorithms.
Previously, Elasticsearch only supported exact kNN searches using a script_score query with a vector function. While this method guarantees accurate results, it often results in slow searches and doesn’t scale well with large datasets. In exchange for slower indexing and imperfect accuracy, the new kNN search API lets you run approximate kNN searches on larger datasets and at faster speeds.
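To make the contrast concrete, the sketch below builds the two request bodies side by side: an exact search that scores every document with a vector function inside script_score, and an approximate search in the shape documented for the 8.0 technical-preview _knn_search endpoint. The field name my_vector, the query vector, and both helper functions are assumptions for illustration, and the preview API may change in later releases.

```python
def exact_knn_body(field, query_vector, k):
    """Exact kNN: score every matching document with a vector function."""
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative, as scores must be.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


def approximate_knn_body(field, query_vector, k, num_candidates):
    """Approximate kNN: body shape assumed for the 8.0 _knn_search preview."""
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            # Candidates considered per shard; higher trades speed for accuracy.
            "num_candidates": num_candidates,
        }
    }


# Hypothetical field name and 3-dimensional query vector.
exact = exact_knn_body("my_vector", [0.1, 0.2, 0.3], k=5)
approx = approximate_knn_body("my_vector", [0.1, 0.2, 0.3], k=5, num_candidates=50)
```

The exact body works against any dense_vector field, while the approximate search relies on the field having been indexed for kNN, which is where the slower indexing mentioned above comes from.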
Storage savings for keyword, match_only_text, and text fields
We’ve updated inverted indices, an internal data structure, to use a more space-efficient encoding. This change will benefit keyword fields, match_only_text fields, and, to a lesser extent, text fields. In our benchmarks using application logs, this translated into a 14.4% reduction of the size of the index of the message field (mapped as match_only_text) and an overall 3.5% reduction of the on-disk footprint.
New indices pick up this change automatically. Existing indices pick it up for every new segment they write.
Faster indexing of geo_point, geo_shape, and range fields
We’ve optimized indexing speeds for multi-dimensional points, an internal data structure used for geo_point, geo_shape, and range fields. Lucene-level benchmarks reported 10-15% faster indexing for these field types. Elasticsearch indices and data streams that mostly consist of these fields may see noticeable improvements to indexing speed.
PyTorch model support for natural language processing (NLP)
You can now upload PyTorch models that are trained outside Elasticsearch and use them for inference at ingest time. Third-party model support brings modern natural language processing (NLP) and search use cases to the Elastic Stack, such as:
- Fill-mask
- Named entity recognition (NER)
- Text classification
- Text embedding
- Zero-shot classification
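As a rough sketch of inference at ingest time, the snippet below builds an ingest pipeline definition with a single inference processor. It assumes a model has already been uploaded under the hypothetical ID my-ner-model and that the model reads its input from a field named text_field; the source field, target field, and helper name are likewise placeholders, so check the deployed model's configuration before relying on them.

```python
def nlp_inference_pipeline(model_id, source_field, target_field="ml"):
    """Build an ingest pipeline body with one inference processor.

    Hypothetical helper: it returns a dict and does not call Elasticsearch.
    """
    return {
        "description": f"Run {model_id} on the {source_field} field at ingest time",
        "processors": [
            {
                "inference": {
                    "model_id": model_id,
                    "target_field": target_field,
                    # Map the document field onto the input name the model is
                    # assumed to expect.
                    "field_map": {source_field: "text_field"},
                }
            }
        ],
    }


# Illustrative model ID and document field.
pipeline = nlp_inference_pipeline("my-ner-model", "message")
```

Documents indexed through such a pipeline would carry the model's output under the target field alongside the original text.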