ELSER – Elastic Learned Sparse EncodeR


Elastic Learned Sparse EncodeR (ELSER) is a retrieval model trained by Elastic that enables you to perform semantic search and retrieve more relevant results. Semantic search returns results based on contextual meaning and user intent, rather than exact keyword matches.

ELSER is an out-of-domain model, which means it does not require fine-tuning on your own data. This makes it adaptable to various use cases out of the box.

While ELSER v2 is generally available, ELSER v1 is in technical preview and will remain in technical preview. Functionality in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

Tokens - not synonyms


ELSER expands the indexed and searched passages into collections of terms that are learned to co-occur frequently within a diverse set of training data. The terms that the text is expanded into by the model are not synonyms for the search terms; they are learned associations capturing relevance. These expanded terms are weighted as some of them are more significant than others. Then the Elasticsearch sparse vector (or rank features) field type is used to store the terms and weights at index time, and to search against later.

This approach provides a more understandable search experience compared to vector embeddings. However, attempting to directly interpret the tokens and weights can be misleading, as the expansion essentially results in a vector in a very high-dimensional space. Consequently, certain tokens, especially those with low weight, contain information that is intertwined with other low-weight tokens in the representation. In this regard, they function similarly to a dense vector representation, making it challenging to separate their individual contributions. This complexity can potentially lead to misinterpretations if not carefully considered during analysis.
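As a rough illustration of how weighted terms can produce a relevance score, the sketch below treats a query expansion and a document expansion as token-to-weight maps and sums the products of the weights for tokens they share. The tokens and weights are invented for the example and are not real ELSER output, and the assumption that scoring reduces to this sum is a simplification rather than a description of Elasticsearch internals.

```python
# Illustrative only: tokens and weights are invented, not real ELSER output.

def sparse_dot(query_weights, doc_weights):
    """Sum the products of weights for tokens present in both expansions."""
    return sum(
        weight * doc_weights[token]
        for token, weight in query_weights.items()
        if token in doc_weights
    )

# Hypothetical expansion of a query about muscle soreness after running.
query = {"run": 1.5, "sore": 1.0, "muscle": 2.0}

# Hypothetical expansions of two indexed passages.
doc_a = {"muscle": 1.0, "recovery": 0.5, "run": 2.0}
doc_b = {"recipe": 1.8, "pasta": 1.4}

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, sparse_dot(query, doc))
```

A passage that shares no tokens with the query contributes nothing to the sum, which is why the expansion at index time matters as much as the expansion at query time.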

Requirements


To use ELSER, you must have the appropriate subscription level for semantic search or the trial period activated.

The minimum dedicated ML node size for deploying and using the ELSER model is 4 GB in Elasticsearch Service if deployment autoscaling is turned off. Turning on autoscaling is recommended because it allows your deployment to dynamically adjust resources based on demand. Better performance can be achieved by using more allocations or more threads per allocation, which requires bigger ML nodes. Autoscaling provides bigger nodes when required. If autoscaling is turned off, you must provide suitably sized nodes yourself.

ELSER v2


Compared to the initial version of the model, ELSER v2 offers improved retrieval accuracy and more efficient indexing. These improvements come from two changes: an extended training data set that includes high-quality question and answer pairs, and an improved FLOPS regularizer that reduces the cost of computing the similarity between a query and a document.

ELSER v2 has two versions: one cross-platform version which runs on any hardware and one version which is optimized for Intel® silicon. The Model Management > Trained Models page shows you which version of ELSER v2 is recommended to deploy based on your cluster’s hardware.

If you want to learn more about the ELSER v2 improvements, refer to this blog post.

Upgrading to ELSER v2


ELSER v2 is not backward compatible. If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This tutorial shows you how to create an ingest pipeline with an inference processor that uses ELSER v2, and how to reindex your data through the pipeline.
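The upgrade path described above can be sketched as three request bodies: a mapping for the destination index, an ingest pipeline with an inference processor, and a reindex request that routes documents through that pipeline. The index names (my-index-v1, my-index-v2), the pipeline name (elser-v2-upgrade), and the field names (content, content_embedding) are hypothetical placeholders, and the batch size of 50 is an assumed starting point, not a recommendation from this page.

```python
import json

MODEL_ID = ".elser_model_2"


def destination_mapping(text_field, output_field):
    """Mapping with a sparse_vector field to hold the ELSER v2 tokens and weights."""
    return {
        "mappings": {
            "properties": {
                output_field: {"type": "sparse_vector"},
                text_field: {"type": "text"},
            }
        }
    }


def elser_pipeline(input_field, output_field):
    """Ingest pipeline body with one inference processor that calls ELSER v2."""
    return {
        "processors": [
            {
                "inference": {
                    "model_id": MODEL_ID,
                    "input_output": [
                        {"input_field": input_field, "output_field": output_field}
                    ],
                }
            }
        ]
    }


def reindex_body(source_index, dest_index, pipeline_id, batch_size=50):
    """Reindex request body that sends every document through the pipeline."""
    return {
        "source": {"index": source_index, "size": batch_size},
        "dest": {"index": dest_index, "pipeline": pipeline_id},
    }


# Print the requests in the order you would run them in the Dev Console.
requests = [
    ("PUT my-index-v2", destination_mapping("content", "content_embedding")),
    ("PUT _ingest/pipeline/elser-v2-upgrade", elser_pipeline("content", "content_embedding")),
    ("POST _reindex?wait_for_completion=false", reindex_body("my-index-v1", "my-index-v2", "elser-v2-upgrade")),
]
for line, body in requests:
    print(line)
    print(json.dumps(body, indent=2))
```

A small reindex batch size is used here on the assumption that inference makes each batch slower than a plain reindex; adjust it to your deployment.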

Additionally, the elasticsearch-labs GitHub repository contains an interactive Python notebook that walks through upgrading an index to ELSER v2.

Download and deploy ELSER


You can download and deploy ELSER either from Machine Learning > Trained Models, from Search > Indices, or by using the Dev Console.

Using the Trained Models page

  1. In Kibana, navigate to Machine Learning > Trained Models. ELSER can be found in the list of trained models. There are two versions available: one portable version which runs on any hardware and one version which is optimized for Intel® silicon. You can see which model is recommended to use based on your hardware configuration.
  2. Click the Download model button under Actions. You can check the download status on the Notifications page.

  3. After the download is finished, start the deployment by clicking the Start deployment button.
  4. Provide a deployment ID, select the priority, and set the number of allocations and threads per allocation values.

  5. Click Start.

Using the search indices UI


Alternatively, you can download and deploy ELSER to an inference pipeline using the search indices UI.

  1. In Kibana, navigate to Search > Indices.
  2. Select the index from the list that has an inference pipeline in which you want to use ELSER.
  3. Navigate to the Pipelines tab.
  4. Under Machine Learning Inference Pipelines, click the Deploy button to begin downloading the ELSER model. This may take a few minutes depending on your network.

  5. Once the model is downloaded, click the Start single-threaded button to start the model with basic configuration or select the Fine-tune performance option to navigate to the Trained Models page where you can configure the model deployment.


When your ELSER model is deployed and started, it is ready to be used in a pipeline.

Adding ELSER to an ingest pipeline

To add ELSER to an ingest pipeline, you need to copy the default ingest pipeline and then customize it according to your needs.

  1. Click Copy and customize under the Unlock your custom pipelines block at the top of the page. This enables the Add inference pipeline button.

  2. Under Machine Learning Inference Pipelines, click Add inference pipeline.
  3. Give a name to the pipeline, select ELSER from the list of trained ML models, and click Continue.
  4. Select the source text field, define the target field, and click Add then Continue.
  5. Review the index mappings updates. Click Back if you want to change the mappings. Click Continue if you are satisfied with the updated index mappings.
  6. You can optionally test your pipeline. Click Continue.
  7. Click Create pipeline.

Once your pipeline is created, you are ready to ingest documents and utilize ELSER for text expansions in your search queries.

Using the Dev Console

  1. In Kibana, navigate to the Dev Console.
  2. Create the ELSER model configuration by running the following API call:

    PUT _ml/trained_models/.elser_model_2
    {
      "input": {
        "field_names": ["text_field"]
      }
    }

    The API call automatically initiates the model download if the model is not downloaded yet.

  3. Deploy the model by using the start trained model deployment API with a deployment ID:

    POST _ml/trained_models/.elser_model_2/deployment/_start?deployment_id=for_search

    You can deploy the model multiple times with different deployment IDs.

After the deployment is complete, ELSER is ready to use either in an ingest pipeline or in a text_expansion query to perform semantic search.
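As a minimal sketch of what comes next, the helpers below build a start-deployment request line that also sets the number of allocations and threads per allocation, and a text_expansion search body. The field name content_embedding, the index name my-index, and the query text are hypothetical. The sketch assumes your documents already hold ELSER tokens in that field.

```python
import json
from urllib.parse import urlencode

MODEL_ID = ".elser_model_2"


def start_deployment_path(model_id, deployment_id,
                          number_of_allocations=1, threads_per_allocation=1):
    """Build the path and query string for the start trained model deployment API."""
    params = urlencode({
        "deployment_id": deployment_id,
        "number_of_allocations": number_of_allocations,
        "threads_per_allocation": threads_per_allocation,
    })
    return f"_ml/trained_models/{model_id}/deployment/_start?{params}"


def text_expansion_query(field, query_text, model_id=MODEL_ID):
    """Build a search body that expands query_text with ELSER and matches it against field."""
    return {
        "query": {
            "text_expansion": {
                field: {"model_id": model_id, "model_text": query_text}
            }
        }
    }


print("POST " + start_deployment_path(MODEL_ID, "for_search"))
print("GET my-index/_search")
print(json.dumps(text_expansion_query("content_embedding", "How do I deploy ELSER?"), indent=2))
```

Building the request line from named parameters keeps separate deployments (for example, one for ingest and one for search) easy to tell apart by their deployment IDs.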

Deploy ELSER in an air-gapped environment


If you want to deploy ELSER in a restricted or closed network, you have two options:

  • create your own HTTP/HTTPS endpoint with the model artifacts on it,
  • put the model artifacts into a directory inside the config directory on all master-eligible nodes.

Model artifact files


For the cross-platform version, you need the following files in your system:

https://ml-models.elastic.co/elser_model_2.metadata.json
https://ml-models.elastic.co/elser_model_2.pt
https://ml-models.elastic.co/elser_model_2.vocab.json

For the optimized version, you need the following files in your system:

https://ml-models.elastic.co/elser_model_2_linux-x86_64.metadata.json
https://ml-models.elastic.co/elser_model_2_linux-x86_64.pt
https://ml-models.elastic.co/elser_model_2_linux-x86_64.vocab.json
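
If you fetch the files from a machine that has internet access, a small script can derive the three URLs for a variant and download them. This is a sketch: the variant labels ("cross-platform", "optimized") and the function names are made up for the example, and download_artifacts needs network access when called.

```python
import urllib.request
from pathlib import Path

BASE_URL = "https://ml-models.elastic.co"
SUFFIXES = (".metadata.json", ".pt", ".vocab.json")

# Hypothetical variant labels mapped to the file name stems listed above.
MODEL_NAMES = {
    "cross-platform": "elser_model_2",
    "optimized": "elser_model_2_linux-x86_64",
}


def artifact_urls(variant):
    """Return the three artifact URLs for the given model variant."""
    name = MODEL_NAMES[variant]
    return [f"{BASE_URL}/{name}{suffix}" for suffix in SUFFIXES]


def download_artifacts(variant, target_dir):
    """Download all artifacts for a variant into target_dir (requires network access)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for url in artifact_urls(variant):
        destination = target / url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, str(destination))


for url in artifact_urls("cross-platform"):
    print(url)
```

To download, call download_artifacts("cross-platform", "/path/to/models") and then copy the directory into the restricted network.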

Using an HTTP server


INFO: If you use an existing HTTP server, note that the model downloader only supports passwordless HTTP servers.

You can use any HTTP service to deploy ELSER. This example uses the official Nginx Docker image to set up a new HTTP download service.

  1. Download the model artifact files.
  2. Put the files into a subdirectory of your choice.
  3. Run the following commands:

    export ELASTIC_ML_MODELS="/path/to/models"
    docker run --rm -d -p 8080:80 --name ml-models -v ${ELASTIC_ML_MODELS}:/usr/share/nginx/html nginx

    Don’t forget to change /path/to/models to the path of the subdirectory where the model artifact files are located.

    These commands start a local Docker container that runs an Nginx server and serves the subdirectory containing the model files. As the Docker image has to be downloaded first, the first start might take longer. Subsequent runs start quicker.

  4. Verify that Nginx runs properly by visiting the following URL in your browser:

    http://{IP_ADDRESS_OR_HOSTNAME}:8080/elser_model_2.metadata.json

    If Nginx runs properly, you see the content of the metadata file of the model.

  5. Point your Elasticsearch deployment to the model artifacts on the HTTP server by adding the following line to the config/elasticsearch.yml file:

    xpack.ml.model_repository: http://{IP_ADDRESS_OR_HOSTNAME}:8080

    If you use your own HTTP or HTTPS server, change the address accordingly. It is important to specify the protocol ("http://" or "https://"). Ensure that all master-eligible nodes can reach the server you specify.

  6. Repeat step 5 on all master-eligible nodes.
  7. Restart the master-eligible nodes one by one.
  8. Navigate to the Trained Models page in Kibana. ELSER can be found in the list of trained models.
  9. Click the Add trained model button, select the ELSER model version you downloaded in step 1 and want to deploy, and click Download. The selected model will be downloaded from the HTTP/HTTPS server you configured.
  10. After the download is finished, start the deployment by clicking the Start deployment button.
  11. Provide a deployment ID, select the priority, and set the number of allocations and threads per allocation values.
  12. Click Start.

The HTTP server is only required for downloading the model. After the download has finished, you can stop and delete the service. You can stop the Docker container used in this example by running the following command:

docker stop ml-models
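
Because a missing protocol in the repository address is an easy mistake to make, you could generate the elasticsearch.yml line with a small check. This is a sketch with a hypothetical helper name: it only verifies that the address starts with http:// or https:// and names a host, and it does not test that the server is reachable.

```python
from urllib.parse import urlparse


def model_repository_line(address):
    """Return the elasticsearch.yml line for an HTTP(S) model repository.

    Raises ValueError if the address lacks an http:// or https:// protocol
    or does not name a host.
    """
    parsed = urlparse(address)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"{address!r} must start with http:// or https:// and include a host"
        )
    return f"xpack.ml.model_repository: {address}"


# The host name below is a placeholder for your own server.
print(model_repository_line("http://models.example.internal:8080"))
```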

Using file-based access


For file-based access, follow these steps:

  1. Download the model artifact files.
  2. Put the files into a models subdirectory inside the config directory of your Elasticsearch deployment.
  3. Point your Elasticsearch deployment to the model directory by adding the following line to the config/elasticsearch.yml file:

    xpack.ml.model_repository: file://${path.home}/config/models/
  4. Repeat step 2 and step 3 on all master-eligible nodes.
  5. Restart the master-eligible nodes one by one.
  6. Navigate to the Trained Models page in Kibana. ELSER can be found in the list of trained models.
  7. Click the Add trained model button, select the ELSER model version you downloaded in step 1 and want to deploy, and click Download. The selected model will be downloaded from the model directory where you put the files in step 2.
  8. After the download is finished, start the deployment by clicking the Start deployment button.
  9. Provide a deployment ID, select the priority, and set the number of allocations and threads per allocation values.
  10. Click Start.
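
Before restarting nodes, it can help to confirm that each node's models directory contains all three artifact files. The sketch below does that check for one directory. The function name is hypothetical, and the default file name stem assumes the cross-platform version.

```python
from pathlib import Path

SUFFIXES = (".metadata.json", ".pt", ".vocab.json")


def missing_artifacts(models_dir, model_name="elser_model_2"):
    """Return the artifact file names that are not present in models_dir."""
    directory = Path(models_dir)
    expected = [f"{model_name}{suffix}" for suffix in SUFFIXES]
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    # Replace with the config/models directory of your Elasticsearch node.
    missing = missing_artifacts("/path/to/elasticsearch/config/models")
    print("missing files:", missing if missing else "none")
```

Run the check on every master-eligible node, and pass model_name="elser_model_2_linux-x86_64" if you downloaded the optimized version.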

Testing ELSER


You can test the deployed model in Kibana. Navigate to Model Management > Trained Models, locate the deployed ELSER model in the list of trained models, then select Test model from the Actions menu.

You can use data from an existing index to test the model. Select the index, then a field of the index you want to test ELSER on. Provide a search query and click Test. Evaluating model recall is simpler when using a query related to the documents.

The results contain a list of ten random values for the selected field along with a score showing how relevant each document is to the query. The higher the score, the more relevant the document is. You can reload example documents by clicking Reload examples.


Performance considerations

  • ELSER works best on small-to-medium sized fields that contain natural language. For connector or web crawler use cases, this aligns best with fields like title, description, summary, or abstract. As ELSER encodes only the first 512 tokens of a field, it may not provide results that are as relevant for large fields, such as body_content on web crawler documents or body fields that result from extracting text from office documents with connectors. For larger fields like these, consider "chunking" the content into multiple values, where each chunk is under 512 tokens.
  • Larger documents take longer to process at ingestion time.
  • The more fields your pipeline has to perform inference on, the longer it takes per document to ingest.
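
One simple way to chunk a large field is to split it on whitespace into groups with a fixed maximum word count. This is a sketch under a stated assumption: words are only a proxy for model tokens, because one word can map to several tokens, so the default of 256 words is an assumed conservative budget rather than a guarantee of staying under 512 tokens.

```python
def chunk_words(text, max_words=256):
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


# Hypothetical long field value made of 600 placeholder words.
body_content = " ".join(f"word{i}" for i in range(600))
chunks = chunk_words(body_content)
print([len(chunk.split()) for chunk in chunks])
```

Each chunk can then be stored as a separate value, so that inference runs on every part of the text instead of just the beginning. Splitting on sentence or paragraph boundaries may preserve meaning better than a fixed word count.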

To learn more about ELSER performance, refer to the Benchmark information.

Further reading
