Create Index

edit

The Create Index API is used to manually create an index in Elasticsearch. All documents in Elasticsearch are stored inside of one index or another.

The most basic command is the following:

PUT twitter

This create an index named twitter with all default setting.

Index name limitations

There are several limitations to what you can name your index. The complete list of limitations are:

  • Lowercase only
  • Cannot include \, /, *, ?, ", <, >, |, ` ` (space character), ,, #
  • Indices prior to 7.0 could contain a colon (:), but that’s been deprecated and won’t be supported in 7.0+
  • Cannot start with -, _, +
  • Cannot be . or ..
  • Cannot be longer than 255 bytes (note it is bytes, so multi-byte characters will count towards the 255 limit faster)

Index Settings

edit

Each index created can have specific settings associated with it, defined in the body:

PUT twitter
{
    "settings" : {
        "index" : {
            "number_of_shards" : 3, 
            "number_of_replicas" : 2 
        }
    }
}

Default for number_of_shards is 5

Default for number_of_replicas is 1 (ie one replica for each primary shard)

or more simplified

PUT twitter
{
    "settings" : {
        "number_of_shards" : 3,
        "number_of_replicas" : 2
    }
}

You do not have to explicitly specify index section inside the settings section.

For more information regarding all the different index level settings that can be set when creating an index, please check the index modules section.

Mappings

edit

The create index API allows to provide a type mapping:

PUT test
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "_doc" : {
            "properties" : {
                "field1" : { "type" : "text" }
            }
        }
    }
}

Aliases

edit

The create index API allows also to provide a set of aliases:

PUT test
{
    "aliases" : {
        "alias_1" : {},
        "alias_2" : {
            "filter" : {
                "term" : {"user" : "kimchy" }
            },
            "routing" : "kimchy"
        }
    }
}

Wait For Active Shards

edit

By default, index creation will only return a response to the client when the primary copies of each shard have been started, or the request times out. The index creation response will indicate what happened:

{
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "test"
}

acknowledged indicates whether the index was successfully created in the cluster, while shards_acknowledged indicates whether the requisite number of shard copies were started for each shard in the index before timing out. Note that it is still possible for either acknowledged or shards_acknowledged to be false, but the index creation was successful. These values simply indicate whether the operation completed before the timeout. If acknowledged is false, then we timed out before the cluster state was updated with the newly created index, but it probably will be created sometime soon. If shards_acknowledged is false, then we timed out before the requisite number of shards were started (by default just the primaries), even if the cluster state was successfully updated to reflect the newly created index (i.e. acknowledged=true).

We can change the default of only waiting for the primary shards to start through the index setting index.write.wait_for_active_shards (note that changing this setting will also affect the wait_for_active_shards value on all subsequent write operations):

PUT test
{
    "settings": {
        "index.write.wait_for_active_shards": "2"
    }
}

or through the request parameter wait_for_active_shards:

PUT test?wait_for_active_shards=2

A detailed explanation of wait_for_active_shards and its possible values can be found here.

Skipping types

edit

Types are being removed from Elasticsearch: in 7.0, the mappings element will no longer take the type name as a top-level key by default. You can already opt in for this behavior by setting include_type_name=false and putting mappings directly under mappings in the index creation call, without specifying a type name.

Here is an example:

PUT test?include_type_name=false
{
  "mappings": {
    "properties": {
      "foo": {
        "type": "keyword"
      }
    }
  }
}