Auto mapping

edit

When creating a mapping either when creating an index or through the Put Mapping API, NEST offers a feature called auto mapping that can automagically infer the correct Elasticsearch field datatypes from the CLR POCO property types you are mapping.

We’ll look at the features of auto mapping with a number of examples. For this, we’ll define two POCOs, Company, which has a name and a collection of Employees, and Employee which has various properties of different types, and itself has a collection of Employee types.

public class Company
{
    public string Name { get; set; }
    public List<Employee> Employees { get; set; }
}

public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Salary { get; set; }
    public DateTime Birthday { get; set; }
    public bool IsManager { get; set; }
    public List<Employee> Employees { get; set; }
    public TimeSpan Hours { get; set; }
}

Auto mapping can take the pain out of having to define a manual mapping for all properties on the POCO

var descriptor = new CreateIndexDescriptor("myindex")
    .Mappings(ms => ms
        .Map<Company>(m => m.AutoMap()) 
        .Map<Employee>(m => m.AutoMap()) 
    );

Auto map Company

Auto map Employee

{
  "mappings": {
    "company": {
      "properties": {
        "employees": {
          "properties": {
            "birthday": {
              "type": "date"
            },
            "employees": {
              "properties": {},
              "type": "object"
            },
            "firstName": {
              "type": "string"
            },
            "hours": {
              "type": "long"
            },
            "isManager": {
              "type": "boolean"
            },
            "lastName": {
              "type": "string"
            },
            "salary": {
              "type": "integer"
            }
          },
          "type": "object"
        },
        "name": {
          "type": "string"
        }
      }
    },
    "employee": {
      "properties": {
        "birthday": {
          "type": "date"
        },
        "employees": {
          "properties": {},
          "type": "object"
        },
        "firstName": {
          "type": "string"
        },
        "hours": {
          "type": "long"
        },
        "isManager": {
          "type": "boolean"
        },
        "lastName": {
          "type": "string"
        },
        "salary": {
          "type": "integer"
        }
      }
    }
  }
}
var descriptor = new CreateIndexDescriptor("myindex")
    .Mappings(ms => ms
        .Map<ParentWithStringId>(m => m
            .AutoMap()
        )
    );

var expected = new
{
    mappings = new
    {
        parent = new
        {
            properties = new
            {
                id = new
                {
                    type = "string",
                }
            }
        }
    }
};

var settings = WithConnectionSettings(s => s
    .InferMappingFor<ParentWithStringId>(m => m
        .TypeName("parent")
        .Ignore(p => p.Description)
        .Ignore(p => p.IgnoreMe)
    )
);

settings.Expect(expected).WhenSerializing((ICreateIndexRequest) descriptor);

Observe that NEST has inferred the Elasticsearch types based on the CLR type of our POCO properties. In this example,

  • Birthday is mapped as a date,
  • Hours is mapped as a long (TimeSpan ticks)
  • IsManager is mapped as a boolean,
  • Salary is mapped as an integer
  • Employees is mapped as an object

and the remaining string properties as string datatypes.

NEST has inferred mapping support for the following .NET types

  • String maps to "string"
  • Int32 maps to "integer"
  • UInt16 maps to "integer"
  • Int16 maps to "short"
  • Byte maps to "short"
  • Int64 maps to "long"
  • UInt32 maps to "long"
  • TimeSpan maps to "long"
  • Single maps to "float"
  • Double maps to "double"
  • Decimal maps to "double"
  • UInt64 maps to "double"
  • DateTime maps to "date"
  • DateTimeOffset maps to "date"
  • Boolean maps to "boolean"
  • Char maps to "string"
  • Guid maps to "string"

and supports a number of special types defined in NEST

  • Nest.GeoLocation maps to "geo_point"
  • Nest.CompletionField<TPayload> maps to "completion"
  • Nest.Attachment maps to "attachment"

All other types map to "object" by default.

Some .NET types do not have direct equivalent Elasticsearch types. For example, System.Decimal is a type commonly used to express currencies and other financial calculations that require large numbers of significant integral and fractional digits and no round-off errors. There is no equivalent type in Elasticsearch, and the nearest type is double, a double-precision 64-bit IEEE 754 floating point.

When a POCO has a System.Decimal property, it is automapped to the Elasticsearch double type. With the caveat of a potential loss of precision, this is generally acceptable for a lot of use cases, but it can however cause problems in some edge cases.

As the C# Specification states,

For a conversion from decimal to float or double, the decimal value is rounded to the nearest double or float value. While this conversion may lose precision, it never causes an exception to be thrown.
-- C# Specification section 6.2.1

This conversion causes an exception to be thrown at deserialization time for Decimal.MinValue and Decimal.MaxValue because, at serialization time, the nearest double value that is converted to is outside of the bounds of Decimal.MinValue or Decimal.MaxValue, respectively. In these cases, it is advisable to use double as the POCO property type.

Mapping Recursion

edit

If you notice in our previous Company and Employee example, the Employee type is recursive in that the Employee class itself contains a collection of type Employee. By default, .AutoMap() will only traverse a single depth when it encounters recursive instances like this; the collection of type Employee on the Employee class did not get any of its properties mapped.

This is done as a safe-guard to prevent stack overflows and all the fun that comes with infinite recursion. Additionally, in most cases, when it comes to Elasticsearch mappings, it is often an edge case to have deeply nested mappings like this. However, you may still have the need to do this, so you can control the recursion depth of .AutoMap().

Let’s introduce a very simple class, A, which itself has a property Child of type A.

public class A
{
    public A Child { get; set; }
}

By default, .AutoMap() only goes as far as depth 1

var descriptor = new CreateIndexDescriptor("myindex")
    .Mappings(ms => ms
        .Map<A>(m => m.AutoMap())
    );

Thus we do not map properties on the second occurrence of our Child property

{
  "mappings": {
    "a": {
      "properties": {
        "child": {
          "properties": {},
          "type": "object"
        }
      }
    }
  }
}

Now let’s specify a maxRecursion of 3

var withMaxRecursionDescriptor = new CreateIndexDescriptor("myindex")
    .Mappings(ms => ms
        .Map<A>(m => m.AutoMap(3))
    );

.AutoMap() has now mapped three levels of our Child property

{
  "mappings": {
    "a": {
      "properties": {
        "child": {
          "type": "object",
          "properties": {
            "child": {
              "type": "object",
              "properties": {
                "child": {
                  "type": "object",
                  "properties": {
                    "child": {
                      "type": "object",
                      "properties": {}
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}