How to automatically build an autocomplete feature for a search application using LLM-generated terms

Learn how to enhance your search application with an automated autocomplete feature in Elastic Cloud using LLM-generated terms for smarter, more dynamic suggestions.

Autocomplete is a crucial feature in search applications, enhancing user experience by providing real-time suggestions as users type. Traditionally, autocomplete in Elasticsearch is implemented using the completion suggester, which relies on predefined terms. This approach requires manual curation of suggestion terms and often lacks contextual relevance. By leveraging LLM-generated terms via OpenAI’s completion endpoint, we can build a more intelligent, scalable, and automated autocomplete feature.

Supercharge your search autocomplete with an LLM

In this article, we’ll explore:

  1. The traditional method of implementing autocomplete in Elasticsearch.
  2. How integrating OpenAI’s LLM improves autocomplete suggestions.
  3. Scaling the solution using Ingest Pipeline and Inference Endpoint in Elastic Cloud.

Traditional autocomplete in Elasticsearch

The conventional approach to building autocomplete in Elasticsearch involves defining a completion field in the index mapping. This allows Elasticsearch to provide suggestions based on predefined terms. It is straightforward to implement, especially if you have already built a comprehensive suggestion list for a fairly static dataset.

Implementation Steps

  1. Create an index with a completion field.
  2. Manually curate suggestion terms and store them in the index.
  3. Query using a completion suggester to retrieve relevant suggestions.

Example: Traditional autocomplete setup

First, create a new index named products_test. In this index, we define a field called suggest of type completion, which is optimized for fast autocomplete suggestions.

PUT /products_test
{
  "mappings": {
    "properties": {
      "suggest": { "type": "completion" }
    }
  }
}

Insert a test document into the products_test index. The suggest field stores multiple completion suggestions.

PUT /products_test/_doc/1
{
  "suggest": ["MacBook Air M2", "Apple Laptop", "Lightweight Laptop"]
}

Finally, we use the completion suggester query to search for suggestions starting with "MacB". The prefix "MacB" will match "MacBook Air M2".

POST /products_test/_search
{
  "suggest": {
    "search-suggestion": {
      "prefix": "MacB",
      "completion": { "field": "suggest" }
    }
  }
}

The suggest section of the response contains the matched suggestions. The options array holds the matching entries, with "text": "MacBook Air M2" as the top suggestion.

"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"suggest": {
"search-suggestion": [
{
"text": "MacB",
"offset": 0,
"length": 4,
"options": [
{
"text": "MacBook Air M2",
"_index": "products_test",
"_id": "1",
"_score": 1,
"_source": {
"suggest": [
"MacBook Air M2",
"Apple Laptop",
"Lightweight Laptop"
]
}
}
]
}
]
}
}

While effective, this method requires manual curation and constant updates to suggestion terms, and it does not adapt dynamically to new products or descriptions.

Enhancing autocomplete with OpenAI LLM

In some use cases, datasets change frequently, which requires you to continuously update the list of valid suggestions. If new products, names, or terms emerge, you have to add them to the suggestion list manually. This is where an LLM steps in: it can dynamically generate relevant completions based on real-world knowledge and live data.

By leveraging OpenAI’s completion endpoint, we can dynamically generate autocomplete suggestions based on product names and descriptions. This allows for:

  • Automatic generation of synonyms and related terms.
  • Context-aware suggestions derived from product descriptions.
  • No need for manual curation, making the system more scalable.

Steps to implement LLM-powered autocomplete

  1. Create an inference endpoint using OpenAI’s completion API.
  2. Set up an Elasticsearch ingest pipeline that builds a prompt with a script processor and queries OpenAI for suggestions.
  3. Store the generated terms in an Elasticsearch index with a completion field.
  4. Use a search request to fetch dynamic autocomplete results.

All the steps above can be completed by copying and pasting the API requests step by step in the Kibana Dev Tools console. In this example, we will be using the gpt-4o-mini model. You will need an OpenAI API key for this step: log in to your OpenAI account, navigate to https://platform.openai.com/api-keys, and create a new secret key or use an existing one.

Creating an inference endpoint

First, we create an inference endpoint. This allows us to interact seamlessly with a machine learning model (in this case OpenAI) via API, while still working within Elastic’s interface.

PUT _inference/completion/openai-completion
{
  "service": "openai",
  "service_settings": {
    "api_key": "<insert_your_api_key>",
    "model_id": "gpt-4o-mini"
  }
}
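
Optionally, you can verify the endpoint is wired up correctly before building the pipeline by sending it an ad-hoc completion request (assuming a recent Elasticsearch version where the inference API accepts direct calls; note that this sends a live request to OpenAI, and the test input below is purely illustrative):

POST _inference/completion/openai-completion
{
  "input": "Suggest autocomplete terms for a lightweight Apple laptop"
}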

Setting up the Elasticsearch ingest pipeline

By setting up an ingest pipeline, we can process data upon indexing. In this case, the pipeline is named autocomplete-LLM-pipeline and it contains:

  1. A script processor, which defines the prompt we are sending to OpenAI to get our suggestion list. Product name and product description are included as dynamic values in the prompt.
  2. An inference processor, which refers to our OpenAI inference endpoint. This processor takes a prompt from the script processor as input, sends it to the LLM model, and stores the result in an output field called results.
  3. A split processor, which splits the comma-separated text output from the LLM in the results field into an array, matching the format expected by the completion field suggest.
  4. Two remove processors, which remove the prompt and results fields after the suggest field has been populated.

PUT _ingest/pipeline/autocomplete-LLM-pipeline
{
  "processors": [
    {
      "script": {
        "source": "ctx.prompt = 'Based on the following product name and product description, create relevant autocomplete suggestion terms from the following product, including the exact product name itself as the first term, synonyms of the product category, and keywords which might commonly be used when searching the following product:' + '\\nProduct Name:\\n' + ctx.ProductName + '\\nProduct Description:\\n' + ctx.Description + '\\nJust include the suggestion terms in the response, as an array encapsulated in double quotes and separated by commas without any prefix or numbering'"
      }
    },
    {
      "inference": {
        "model_id": "openai-completion",
        "input_output": {
          "input_field": "prompt",
          "output_field": "results"
        }
      }
    },
    {
      "split": {
        "field": "results",
        "separator": ",",
        "target_field": "suggest"
      }
    },
    {
      "remove": {
        "field": "prompt"
      }
    },
    {
      "remove": {
        "field": "results"
      }
    }
  ]
}
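
Before reindexing anything, you can dry-run the pipeline with the simulate API to confirm the prompt yields terms in the expected comma-separated format. Note that the inference processor still calls OpenAI during simulation; the sample document below is illustrative:

POST _ingest/pipeline/autocomplete-LLM-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "ProductName": "MacBook Air M2",
        "Description": "The MacBook Air M2 is a powerful, ultra-portable laptop powered by Apple's latest M2 chip."
      }
    }
  ]
}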

Indexing sample documents

For this example, we use the documents API to manually index documents from the Dev Tools console into a temporary index called products. This is not the autocomplete index we will be querying.

PUT products/_doc/1
{
  "ProductName": "MacBook Air M2",
  "Description": "The MacBook Air M2 is a powerful, ultra-portable laptop designed to deliver exceptional performance, all while maintaining an ultra-slim profile. Powered by Apple’s latest M2 chip, this lightweight machine is perfect for both work and play, combining top-tier performance with impressive battery life"
}

PUT products/_doc/2
{
  "ProductName": "DKNY Unisex Black & Grey Printed Medium Trolley Bag",
  "Description": "Black and grey printed medium trolley bag, secured with a TSA lock. One handle on the top and one on the side, has a trolley with a retractable handle on the top and four corner mounted inline skate wheels. One main zip compartment, zip lining, two compression straps with click clasps, one zip compartment on the flap with three zip pockets. Warranty: 5 years. Warranty provided by Brand Owner / Manufacturer"
}
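
If you are loading more than a couple of test documents, the _bulk API achieves the same result in a single request; a minimal sketch with a hypothetical third product:

POST products/_bulk
{ "index": { "_id": "3" } }
{ "ProductName": "Sony WH-1000XM5 Wireless Headphones", "Description": "Industry-leading noise cancelling headphones with up to 30 hours of battery life and multipoint Bluetooth connection" }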

Creating an index with a completion type mapping

Now, we are creating the actual autocomplete index which contains the completion type field called suggest.

PUT products_with_suggestion
{
  "mappings": {
    "properties": {
      "suggest": { "type": "completion" }
    }
  }
}

Reindexing documents to a designated index via the ingest pipeline

In this step, we reindex data from the products index created previously into the actual autocomplete index products_with_suggestion, through our ingest pipeline autocomplete-LLM-pipeline. The pipeline processes the sample documents from the original index and populates the autocomplete suggest field in the destination index.

POST _reindex?slices=auto&wait_for_completion=false
{
  "source": {
    "index": "products"
  },
  "dest": {
    "index": "products_with_suggestion",
    "pipeline": "autocomplete-LLM-pipeline"
  }
}
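
Because wait_for_completion=false runs the reindex asynchronously, the response returns a task ID instead of the final result. You can poll that task to monitor progress until it completes (the task ID below is a placeholder):

GET _tasks/oTUltX4IQMOUUVeiohTt8A:12345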

Sample autocomplete suggestions

As shown below, the new index (products_with_suggestion) now includes a field called suggest, which contains an array of terms and synonyms generated by the LLM.

You can run the following request to check:

GET products_with_suggestion/_search

Results:

{
  "hits": [
    {
      "ProductName": "MacBook Air M2",
      "Description": "The MacBook Air M2 is a powerful, ultra-portable laptop designed to deliver exceptional performance, all while maintaining an ultra-slim profile. Powered by Apple’s latest M2 chip, this lightweight machine is perfect for both work and play, combining top-tier performance with impressive battery life",
      "suggest": [
        "MacBook Air M2",
        "ultra-portable laptop",
        "lightweight laptop",
        "performance laptop",
        "Apple laptop",
        "M2 chip laptop",
        "thin laptop",
        "best laptop for work",
        "laptop with long battery life",
        "powerful lightweight laptop",
        "Apple MacBook",
        "MacBook Air",
        "laptop for students",
        "portable computer",
        "laptop for professionals"
      ]
    },
    {
      "ProductName": "DKNY Unisex Black & Grey Printed Medium Trolley Bag",
      "Description": "Black and grey printed medium trolley bag, secured with a TSA lock. One handle on the top and one on the side, has a trolley with a retractable handle on the top and four corner mounted inline skate wheels. One main zip compartment, zip lining, two compression straps with click clasps, one zip compartment on the flap with three zip pockets. Warranty: 5 years. Warranty provided by Brand Owner / Manufacturer",
      "suggest": [
        "DKNY Unisex Black & Grey Printed Medium Trolley Bag",
        "medium trolley bag",
        "travel bag",
        "luggage",
        "roller bag",
        "printed suitcase",
        "black and grey suitcase",
        "trolley luggage",
        "travel trolley",
        "carry-on trolley",
        "retractable handle bag",
        "inline skate wheels bag",
        "TSA lock luggage",
        "zip compartment suitcase",
        "compression straps bag",
        "soft sided luggage",
        "durable travel bag",
        "wheeled duffel bag",
        "luggage with warranty",
        "brand name luggage"
      ]
    }
  ]
}

Note that the terms generated by the LLM are not always the same, even when the same prompt is used. Check the resulting terms to see whether they suit your search use case; if not, you can modify the prompt in the script processor to get more predictable and consistent suggestion terms.

Now we can test the autocomplete functionality using the completion suggester query. The example below also includes a fuzzy parameter, which improves the user experience by handling minor misspellings in the search query. You can execute the query in the Dev Tools console and check the suggestion results.

POST /products_with_suggestion/_search
{
  "suggest": {
    "search-suggestion": {
      "prefix": "lugg",
      "completion": {
        "field": "suggest",
        "fuzzy": { "fuzziness": 1 }
      }
    }
  }
}
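
Since several documents can share similar generated terms, the completion suggester also accepts size and skip_duplicates parameters to cap and de-duplicate the returned options; a variant of the query above:

POST /products_with_suggestion/_search
{
  "suggest": {
    "search-suggestion": {
      "prefix": "lugg",
      "completion": {
        "field": "suggest",
        "size": 5,
        "skip_duplicates": true,
        "fuzzy": { "fuzziness": 1 }
      }
    }
  }
}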

To visualize the autocomplete results, I have implemented a simple search bar that executes a query against the autocomplete index in Elastic Cloud using the Elasticsearch client. As you type, the search returns results based on the terms in the LLM-generated suggestion list.

Scaling with OpenAI inference integration

By using OpenAI’s completion API as an inference endpoint within Elastic Cloud, we can scale this solution efficiently:

  • The inference endpoint allows automated, scalable LLM-generated suggestions without having to manually create and maintain your own list.
  • The ingest pipeline ensures real-time enrichment of data during indexing.
  • The script processor within the ingest pipeline allows easy editing of the prompt in case you need to customise the nature of the suggestion list in a more specific way.
  • Pipeline execution can also be configured directly upon ingestion via an index template for further automation, so the suggestion list is built on the fly as new products are added to the index (see the sketch after this list).
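
As a sketch of that last point, an index template can declare the ingest pipeline as the default pipeline for matching indices, so every newly added product is enriched automatically at index time (the template name and index pattern below are illustrative):

PUT _index_template/products-autocomplete-template
{
  "index_patterns": ["products_with_suggestion*"],
  "template": {
    "settings": {
      "index.default_pipeline": "autocomplete-LLM-pipeline"
    },
    "mappings": {
      "properties": {
        "suggest": { "type": "completion" }
      }
    }
  }
}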

In terms of cost efficiency, the model is only invoked during the ingestion process, meaning its usage scales with the number of documents processed rather than the search volume. This can result in significant cost savings compared to running the model at search time if you are expecting growth in users or search activity.

Conclusion

Traditionally, autocomplete relies on manually defined terms, which can be limiting and labour-intensive. By leveraging OpenAI’s LLM-generated suggestions, we have the option to automate and enhance autocomplete functionality, improving search relevance and user experience. Furthermore, using Elastic’s ingest pipeline and inference endpoint integration ensures an automated, scalable autocomplete system.

Overall, if your search use case requires a very specific set of suggestions from a well-maintained, curated list, conventionally ingesting the list in bulk via the API, as described in the first part of this article, is still a great and performant option. If managing and updating a suggestion list is a pain point, however, an LLM-based completion system removes that burden by automatically generating contextually relevant suggestions without any manual input.

Elasticsearch has native integrations with industry-leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics, or on building prod-ready apps with the Elastic Vector Database.

To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now.
