Configuring built-in analyzers

The built-in analyzers can be used directly without any configuration. Some of them, however, support configuration options to alter their behaviour. For instance, the standard analyzer can be configured to support a list of stop words:

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "std_english": { 
          "type":      "standard",
          "stopwords": "_english_"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_text": {
        "type":     "text",
        "analyzer": "standard", 
        "fields": {
          "english": {
            "type":     "text",
            "analyzer": "std_english" 
          }
        }
      }
    }
  }
}

POST my_index/_analyze
{
  "field": "my_text", 
  "text": "The old brown cow"
}

POST my_index/_analyze
{
  "field": "my_text.english", 
  "text": "The old brown cow"
}

We define the std_english analyzer to be based on the standard analyzer, but configured to remove the pre-defined list of English stop words.

The my_text field uses the standard analyzer directly, without any configuration. No stop words will be removed from this field. The resulting terms are: [ the, old, brown, cow ]

The my_text.english field uses the std_english analyzer, so English stop words will be removed. The resulting terms are: [ old, brown, cow ]
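
As a further sketch, the stopwords option of the standard analyzer also accepts an explicit array of words instead of a pre-defined list such as _english_. The index name my_other_index and the analyzer name std_custom_stop below are hypothetical and chosen only for illustration:

PUT my_other_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "std_custom_stop": {
          "type":      "standard",
          "stopwords": [ "the", "old" ]
        }
      }
    }
  }
}

POST my_other_index/_analyze
{
  "analyzer": "std_custom_stop",
  "text": "The old brown cow"
}

Because the standard analyzer lowercases tokens before it removes stop words, this request should return the terms: [ brown, cow ]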