IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

ICU normalization token filter


Normalizes characters as described in the Unicode normalization forms specification (UAX #15). It registers itself as the icu_normalizer token filter, which is available to all indices without any further configuration. The type of normalization can be specified with the name parameter, which accepts nfc, nfkc, and nfkc_cf (the default).

Which letters are normalized can be controlled by specifying the unicode_set_filter parameter, which accepts a UnicodeSet.
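For example, normalization can be restricted to a subset of characters. The following sketch (the index and filter names are illustrative, and the UnicodeSet shown is an assumption) excludes the Swedish letters å, ä, ö from normalization while normalizing everything else:

```console
PUT icu_sample_filtered
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "swedish_normalized": {
            "tokenizer": "icu_tokenizer",
            "filter": [ "swedish_normalizer" ]
          }
        },
        "filter": {
          "swedish_normalizer": {
            "type": "icu_normalizer",
            "unicode_set_filter": "[^åäöÅÄÖ]"
          }
        }
      }
    }
  }
}
```

The `[^…]` pattern is UnicodeSet negation syntax: every character except those listed passes through the normalizer.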

In most cases you should prefer the ICU normalization character filter, which applies normalization before tokenization rather than after.
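For comparison, a minimal analyzer using the character filter variant might look like the following sketch (the index and analyzer names are illustrative assumptions):

```console
PUT icu_char_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "nfkc_cf_char_normalized": {
            "tokenizer": "icu_tokenizer",
            "char_filter": [ "icu_normalizer" ]
          }
        }
      }
    }
  }
}
```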

Here are two examples: the default usage and a customized token filter.

PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "nfkc_cf_normalized": { 
            "tokenizer": "icu_tokenizer",
            "filter": [
              "icu_normalizer"
            ]
          },
          "nfc_normalized": { 
            "tokenizer": "icu_tokenizer",
            "filter": [
              "nfc_normalizer"
            ]
          }
        },
        "filter": {
          "nfc_normalizer": {
            "type": "icu_normalizer",
            "name": "nfc"
          }
        }
      }
    }
  }
}

The nfkc_cf_normalized analyzer uses the default nfkc_cf normalization.

The nfc_normalized analyzer uses the customized nfc_normalizer token filter, which is set to use nfc normalization.
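The behaviour can be checked with the _analyze API against the index above (a sketch; the sample text is illustrative):

```console
GET icu_sample/_analyze
{
  "analyzer": "nfkc_cf_normalized",
  "text": "Ⅸ ＦＵＬＬ"
}
```

With nfkc_cf normalization, the Roman numeral Ⅸ is decomposed by compatibility mapping and case-folded to ix, and the full-width letters ＦＵＬＬ are folded to lower-case ASCII full.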
