Ingest processor reference


An ingest pipeline is made up of a sequence of processors that are applied to documents as they are ingested into an index. Each processor performs a specific task, such as filtering, transforming, or enriching data.

Each successive processor depends on the output of the previous processor, so the order of processors is important. The modified documents are indexed into Elasticsearch after all processors are applied.
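
To see why order matters, the following toy model may help. It is only a sketch: the `rename`, `lowercase`, and `run_pipeline` helpers are hypothetical Python functions written for this illustration, not Elasticsearch code, and they only imitate the idea that each processor receives the previous processor's output.

```python
# Toy model of an ingest pipeline. Each "processor" is a plain Python function
# that modifies a document (a dict). The helper names are hypothetical and this
# is not how Elasticsearch implements processors.

def rename(source, target):
    def apply(doc):
        doc[target] = doc.pop(source)  # raises KeyError if source is missing
        return doc
    return apply

def lowercase(field):
    def apply(doc):
        doc[field] = doc[field].lower()  # raises KeyError if field is missing
        return doc
    return apply

def run_pipeline(processors, doc):
    """Apply processors in order; each one sees the previous one's output."""
    result = dict(doc)  # shallow copy so the caller's document is untouched
    for processor in processors:
        result = processor(result)
    return result

incoming = {"USER": "Alice"}

# Rename first, then lowercase the renamed field: this works.
in_order = run_pipeline([rename("USER", "user"), lowercase("user")], incoming)
print(in_order)

# Swapping the two fails, because "user" does not exist yet when lowercase runs.
try:
    run_pipeline([lowercase("user"), rename("USER", "user")], incoming)
    swapped_failed = False
except KeyError:
    swapped_failed = True
print(swapped_failed)
```

In a real pipeline, the same misordering would typically surface as a processor error for the missing field, unless that processor is configured to ignore missing fields.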

Elasticsearch includes over 40 configurable processors. The subpages in this section contain reference documentation for each processor. To get a list of available processors, use the nodes info API.

resp = client.nodes.info(
    node_id="ingest",
    filter_path="nodes.*.ingest.processors",
)
print(resp)

response = client.nodes.info(
  node_id: 'ingest',
  filter_path: 'nodes.*.ingest.processors'
)
puts response

const response = await client.nodes.info({
  node_id: "ingest",
  filter_path: "nodes.*.ingest.processors",
});
console.log(response);

GET _nodes/ingest?filter_path=nodes.*.ingest.processors
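
The filtered response is nested by node, so a small helper can flatten it into a single list of processor types. The sketch below assumes the response has the shape `nodes.<node_id>.ingest.processors[].type`; the `processor_types` helper and the sample data are illustrative, not real cluster output.

```python
def processor_types(nodes_info):
    """Return the sorted, de-duplicated processor types reported by all nodes."""
    types = set()
    for node in nodes_info.get("nodes", {}).values():
        for processor in node.get("ingest", {}).get("processors", []):
            types.add(processor["type"])
    return sorted(types)

# Hand-written sample in the shape the filtered request is expected to return.
# The node IDs and the processor list are made up for this example.
sample = {
    "nodes": {
        "node-a": {"ingest": {"processors": [{"type": "append"}, {"type": "set"}]}},
        "node-b": {"ingest": {"processors": [{"type": "set"}, {"type": "geoip"}]}},
    }
}

available = processor_types(sample)
print(available)
```

With the Python client, you would likely pass the parsed response body (for example `resp.body`) instead of `sample`.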

Ingest processors by category

We’ve categorized the available processors on this page and summarized their functions. This will help you find the right processor for your use case.

Data enrichment processors

General outcomes

append processor: Appends a value to a field.
date_index_name processor: Points documents to the right time-based index based on a date or timestamp field.
enrich processor: Enriches documents with data from another index.

Refer to Enrich your data for detailed examples of how to use the enrich processor to add data from your existing indices to incoming documents during ingest.

inference processor: Uses machine learning to classify and tag text fields.

Specific outcomes

attachment processor: Parses and indexes binary data, such as PDFs and Word documents.
circle processor: Converts circle definitions of shapes to regular polygons that approximate them.
community_id processor: Computes the Community ID for network flow data.
fingerprint processor: Computes a hash of the document’s content.
geo_grid processor: Converts geo-grid definitions of grid tiles or cells to regular bounding boxes or polygons which describe their shape.
geoip processor: Adds information about the geographical location of an IPv4 or IPv6 address from a MaxMind database.
ip_location processor: Adds information about the geographical location of an IPv4 or IPv6 address from an IP geolocation database.
network_direction processor: Calculates the network direction given a source IP address, destination IP address, and a list of internal networks.
registered_domain processor: Extracts the registered domain (the effective top-level domain plus one label, also known as eTLD+1), subdomain, and top-level domain from a fully qualified domain name (FQDN).
set_security_user processor: Sets user-related details (such as username, roles, email, full_name, metadata, api_key, realm, and authentication_type) from the currently authenticated user on the current document during ingest preprocessing.
uri_parts processor: Parses a Uniform Resource Identifier (URI) string and extracts its components as an object.
urldecode processor: URL-decodes a string.
user_agent processor: Parses user-agent strings to extract information about web clients.
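
As an illustration of how enrichment processors are combined, the following sketch builds a pipeline body that uses the geoip and user_agent processors. The field names (`client.ip`, `client.geo`, `agent`) are assumptions about the incoming documents, not values taken from this reference.

```python
import json

# Illustrative pipeline body. The field names are assumptions about the
# incoming documents and should be adapted to your own data.
enrichment_pipeline = {
    "description": "Add geolocation and parsed user-agent details",
    "processors": [
        # Looks up the IP address in "client.ip" and writes to "client.geo".
        {"geoip": {"field": "client.ip", "target_field": "client.geo"}},
        # Parses the raw user-agent string stored in "agent".
        {"user_agent": {"field": "agent"}},
    ],
}

processor_order = [next(iter(p)) for p in enrichment_pipeline["processors"]]
print(json.dumps(enrichment_pipeline, indent=2))
```

A body like this could then be registered with the create pipeline API and tried against sample documents with the simulate pipeline API before it is used for indexing.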

Data transformation processors

General outcomes

convert processor: Converts a field in the currently ingested document to a different type, such as converting a string to an integer.
dissect processor: Extracts structured fields out of a single text field within a document. Unlike the grok processor, dissect does not use regular expressions. This makes dissect a simpler and often faster alternative.
grok processor: Extracts structured fields out of a single text field within a document, using the Grok regular expression dialect that supports reusable aliased expressions.
gsub processor: Converts a string field by applying a regular expression and a replacement.
redact processor: Uses the Grok rules engine to obscure text in the input document matching the given Grok patterns.
rename processor: Renames an existing field.
set processor: Sets a value on a field.
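
To make the difference between dissect and grok more concrete, the sketch below defines one processor of each kind for the same log line. The sample line and the target key names are illustrative; the local `split` call only imitates dissect's delimiter-based matching and is not the processor itself.

```python
import json

# Sample line that both processors are meant to parse (illustrative).
sample_line = "55.3.244.1 GET /index.html"

# dissect: matches the literal delimiters (here, single spaces) between keys.
dissect_processor = {
    "dissect": {"field": "message", "pattern": "%{client} %{method} %{request}"}
}

# grok: each %{SYNTAX:name} expands to a named regular expression.
grok_processor = {
    "grok": {
        "field": "message",
        "patterns": ["%{IP:client} %{WORD:method} %{URIPATHPARAM:request}"],
    }
}

# Fields either processor would be expected to extract from sample_line.
# The split below only mimics dissect's delimiter-based behavior locally.
expected = dict(zip(["client", "method", "request"], sample_line.split(" ")))

print(json.dumps([dissect_processor, grok_processor], indent=2))
print(expected)
```

The grok version also validates the shape of each field (for example, that `client` looks like an IP address), which is part of what the extra regular-expression cost buys you.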

Specific outcomes

bytes processor: Converts a human-readable byte value to its value in bytes (for example, 1kb becomes 1024).
csv processor: Extracts fields from a single line of CSV data stored in a text field.
date processor: Extracts and converts date fields.
dot_expander processor: Expands a field with dots into an object field.
html_strip processor: Removes HTML tags from a field.
join processor: Joins each element of an array into a single string using a separator character between each element.
kv processor: Parses messages (or specific event fields) containing key-value pairs.
lowercase processor and uppercase processor: Converts a string field to lowercase or uppercase.
split processor: Splits a field into an array of values.
trim processor: Trims whitespace from a field.
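
The byte conversion mentioned above (1kb becomes 1024) can be approximated locally as follows. This is a minimal sketch that assumes binary multiples and a small set of unit suffixes; the `to_bytes` helper is hypothetical, and the real bytes processor may treat edge cases differently.

```python
import re

# Binary multipliers, consistent with the "1kb becomes 1024" example.
_UNITS = {
    "b": 1,
    "kb": 1024,
    "mb": 1024**2,
    "gb": 1024**3,
    "tb": 1024**4,
    "pb": 1024**5,
}

def to_bytes(text):
    """Convert a value such as '1kb' or '2 MB' to a whole number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse byte value: {text!r}")
    number, unit = match.group(1), match.group(2).lower()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    return int(float(number) * _UNITS[unit])

# The matching processor definition; "file.size" is an assumed field name.
bytes_processor = {"bytes": {"field": "file.size"}}

print(to_bytes("1kb"))
```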

Data filtering processors

drop processor: Drops the document without raising any errors.
remove processor: Removes fields from documents.

Pipeline handling processors

fail processor: Raises an exception. Useful when you expect a pipeline to fail and want to relay a specific message to the requester.
pipeline processor: Executes another pipeline.
reroute processor: Reroutes documents to another target index or data stream.
terminate processor: Terminates the current ingest pipeline, causing no further processors to be run.
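
The following sketch shows how the fail and pipeline processors might be combined: the first processor rejects documents that lack a tag, and the second delegates to another pipeline. The `shared-cleanup` pipeline name and the `tags` field are hypothetical and assumed to exist in your cluster and documents.

```python
import json

# "shared-cleanup" is a hypothetical pipeline name assumed to exist already.
handling_pipeline = {
    "description": "Validate, then delegate to a shared pipeline",
    "processors": [
        {
            "fail": {
                # Painless condition: fail only when the tag is missing.
                "if": "ctx.tags == null || !ctx.tags.contains('production')",
                "message": "The production tag is not present",
            }
        },
        # Runs only for documents that passed the check above.
        {"pipeline": {"name": "shared-cleanup"}},
    ],
}

print(json.dumps(handling_pipeline, indent=2))
```

Placing the check first follows from the ordering rule: a failing document never reaches the delegated pipeline.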

Array/JSON handling processors

foreach processor: Runs an ingest processor on each element of an array or object.
json processor: Converts a JSON string into a structured JSON object.
script processor: Runs an inline or stored script on incoming documents. The script runs in the Painless ingest context.
sort processor: Sorts the elements of an array in ascending or descending order.
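
As a sketch of how these processors can work together, the pipeline body below parses a JSON string, uppercases each array element with the `foreach` processor type, and then sorts the array. The `payload` and `parsed` field names are assumptions, and the local computation underneath only approximates the expected result in plain Python.

```python
import json

array_pipeline = {
    "processors": [
        # Parse the JSON text in "payload" into an object stored in "parsed".
        {"json": {"field": "payload", "target_field": "parsed"}},
        # Uppercase every element of "parsed.tags". Inside foreach, the current
        # element is addressed as "_ingest._value".
        {
            "foreach": {
                "field": "parsed.tags",
                "processor": {"uppercase": {"field": "_ingest._value"}},
            }
        },
        # Sort the resulting array in descending order.
        {"sort": {"field": "parsed.tags", "order": "desc"}},
    ]
}

# Local approximation of the same three steps on an illustrative document.
doc = {"payload": '{"tags": ["a", "c", "b"]}'}
parsed = json.loads(doc["payload"])
local_result = sorted((tag.upper() for tag in parsed["tags"]), reverse=True)

print(json.dumps(array_pipeline, indent=2))
print(local_result)
```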

Add additional processors

You can install additional processors as plugins.

You must install any plugin processors on all nodes in your cluster. Otherwise, Elasticsearch will fail to create pipelines containing the processor.

Mark a plugin as mandatory by setting plugin.mandatory in elasticsearch.yml. A node will fail to start if a mandatory plugin is not installed.

plugin.mandatory: my-ingest-plugin
