Arrays

In Elasticsearch, there is no dedicated array data type. By default, any field can contain zero or more values; however, all values in the array must be of the same data type. For instance:

  • an array of strings: [ "one", "two" ]
  • an array of integers: [ 1, 2 ]
  • an array of arrays: [ 1, [ 2, 3 ]], which is equivalent to [ 1, 2, 3 ]
  • an array of objects: [ { "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]
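
As a rough illustration of the nested-array equivalence in the list above, the following Python sketch flattens nested lists into a single flat list. The `flatten` helper is hypothetical and only models the described behavior; it is not part of Elasticsearch or any of its clients.

```python
def flatten(values):
    """Recursively flatten nested lists into one flat list of values."""
    flat = []
    for value in values:
        if isinstance(value, list):
            flat.extend(flatten(value))
        else:
            flat.append(value)
    return flat


print(flatten([1, [2, 3]]))  # [1, 2, 3]
```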

Arrays of objects

Arrays of objects do not work as you might expect: because object fields are flattened into multi-valued fields, you cannot query each object independently of the other objects in the array. If you need to do this, use the nested data type instead of the object data type.

This is explained in more detail in Nested.
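
To make the limitation concrete, here is a simplified Python model of what happens to an array of objects under the default object data type: the values of each leaf field are collected into one multi-valued field, so the link between values that came from the same object is lost. The field name `user` and the helpers `flatten_objects` and `matches_all` are assumptions made for this sketch, not Elasticsearch APIs.

```python
from collections import defaultdict


def flatten_objects(field, objects):
    """Collapse an array of objects into one multi-valued entry per leaf field."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


def matches_all(flat_doc, criteria):
    """True if each criterion's value appears anywhere in that field's values."""
    return all(value in flat_doc.get(path, []) for path, value in criteria.items())


users = [{"name": "Mary", "age": 12}, {"name": "John", "age": 10}]
doc = flatten_objects("user", users)
# doc == {"user.name": ["Mary", "John"], "user.age": [12, 10]}

# No single object has name "Mary" and age 10, yet the flattened document matches.
print(matches_all(doc, {"user.name": "Mary", "user.age": 10}))  # True
```

The nested data type avoids this cross-object match by keeping each object in the array separately queryable.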

When adding a field dynamically, the first value in the array determines the field type. All subsequent values must be of the same data type or it must at least be possible to coerce subsequent values to the same data type.

Arrays with a mixture of data types are not supported: [ 10, "some string" ]
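
The rule can be pictured with a small Python model: the first value fixes the field type, and every later value must be convertible to it. This is a deliberately simplified sketch that covers only integers (mapped to `long`) and strings (mapped to `text`); `infer_type`, `coerce`, and `index_array` are hypothetical helpers, and the real coercion rules in Elasticsearch are more detailed.

```python
def infer_type(value):
    """Rough stand-in for dynamic mapping of the first value in an array."""
    # bool is a subclass of int in Python, so rule it out explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return "long"
    if isinstance(value, str):
        return "text"
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def coerce(value, field_type):
    """Convert a value to the field type, or raise ValueError if that is impossible."""
    if field_type == "long":
        return int(value)  # int("20") works; int("some string") raises ValueError
    return str(value)


def index_array(values):
    """The first value decides the type; all values are then coerced to it."""
    field_type = infer_type(values[0])
    return field_type, [coerce(v, field_type) for v in values]


print(index_array([10, "20"]))  # ('long', [10, 20])

try:
    index_array([10, "some string"])
except ValueError as err:
    print("rejected:", err)
```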

An array may contain null values, which are either replaced by the configured null_value or skipped entirely. An empty array [] is treated as a missing field — a field with no values.
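
A minimal Python sketch of the null handling described above, assuming a hypothetical helper `indexed_values` that returns the values a field would effectively hold once nulls are replaced or skipped:

```python
def indexed_values(values, null_value=None):
    """Return the values that effectively remain for a field."""
    result = []
    for value in values:
        if value is None:
            # Replace the null if a null_value is configured, otherwise skip it.
            if null_value is not None:
                result.append(null_value)
        else:
            result.append(value)
    return result


print(indexed_values(["a", None, "b"]))          # ['a', 'b']
print(indexed_values(["a", None, "b"], "NULL"))  # ['a', 'NULL', 'b']
print(indexed_values([]))                        # [] (same as a missing field)
```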

Nothing needs to be pre-configured to use arrays in documents; they are supported out of the box:

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    message: 'some arrays in this document...',
    tags: [
      'elasticsearch',
      'wow'
    ],
    lists: [
      {
        name: 'prog_list',
        description: 'programming list'
      },
      {
        name: 'cool_list',
        description: 'cool stuff list'
      }
    ]
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 2,
  body: {
    message: 'no arrays in this document...',
    tags: 'elasticsearch',
    lists: {
      name: 'prog_list',
      description: 'programming list'
    }
  }
)
puts response

response = client.search(
  index: 'my-index-000001',
  body: {
    query: {
      match: {
        tags: 'elasticsearch'
      }
    }
  }
)
puts response

{
	res, err := es.Index(
		"my-index-000001",
		strings.NewReader(`{
	  "message": "some arrays in this document...",
	  "tags": [
	    "elasticsearch",
	    "wow"
	  ],
	  "lists": [
	    {
	      "name": "prog_list",
	      "description": "programming list"
	    },
	    {
	      "name": "cool_list",
	      "description": "cool stuff list"
	    }
	  ]
	}`),
		es.Index.WithDocumentID("1"),
		es.Index.WithPretty(),
	)
	fmt.Println(res, err)
}

{
	res, err := es.Index(
		"my-index-000001",
		strings.NewReader(`{
	  "message": "no arrays in this document...",
	  "tags": "elasticsearch",
	  "lists": {
	    "name": "prog_list",
	    "description": "programming list"
	  }
	}`),
		es.Index.WithDocumentID("2"),
		es.Index.WithPretty(),
	)
	fmt.Println(res, err)
}

{
	res, err := es.Search(
		es.Search.WithIndex("my-index-000001"),
		es.Search.WithBody(strings.NewReader(`{
	  "query": {
	    "match": {
	      "tags": "elasticsearch"
	    }
	  }
	}`)),
		es.Search.WithPretty(),
	)
	fmt.Println(res, err)
}

PUT my-index-000001/_doc/1
{
  "message": "some arrays in this document...",
  "tags":  [ "elasticsearch", "wow" ], 
  "lists": [ 
    {
      "name": "prog_list",
      "description": "programming list"
    },
    {
      "name": "cool_list",
      "description": "cool stuff list"
    }
  ]
}

PUT my-index-000001/_doc/2 
{
  "message": "no arrays in this document...",
  "tags":  "elasticsearch",
  "lists": {
    "name": "prog_list",
    "description": "programming list"
  }
}

GET my-index-000001/_search
{
  "query": {
    "match": {
      "tags": "elasticsearch" 
    }
  }
}

The tags field is dynamically added as a string field.

The lists field is dynamically added as an object field.

The second document contains no arrays, but can be indexed into the same fields.

The query looks for elasticsearch in the tags field, and matches both documents.
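
One way to see why both documents match is to treat a single value as a one-element array. The Python sketch below models that idea in memory with exact value comparison; it ignores text analysis and scoring, and the helpers `as_values` and `match_term` are hypothetical, not client APIs.

```python
def as_values(value):
    """Treat a single value as a one-element array."""
    return value if isinstance(value, list) else [value]


def match_term(docs, field, term):
    """Return the IDs of documents whose field contains the term."""
    return [
        doc_id
        for doc_id, doc in docs.items()
        if term in as_values(doc.get(field, []))
    ]


docs = {
    "1": {"tags": ["elasticsearch", "wow"]},
    "2": {"tags": "elasticsearch"},
}

print(match_term(docs, "tags", "elasticsearch"))  # ['1', '2']
print(match_term(docs, "tags", "wow"))            # ['1']
```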