Best Practices for Elasticsearch mappings

At first, Elasticsearch may appear to be schemaless because you can add new fields at any time, but every field in a document must still match the type recorded in the index mapping.

Dynamic Templates reduce boilerplate

How many times have you opened up a mapping file and found something like this, where the same type definition is repeated over and over again?

{
  "properties": {
    "foo": {
      "type": "keyword"
    },
    "bar": {
      "type": "keyword"
    },
    "qux": {
      "type": "keyword"
    },
    "baz": {
      "type": "keyword"
    },
    "other": {
      "type": "text"
    },
    ...
  }
}

It’s super easy to refactor this into an alternative where all string values are mapped as keyword by default, except for the specific fields explicitly listed as “text”.

{
  "dynamic_templates": [
    {
      "example_name": {
        "match_mapping_type": "string",
        "mapping": {
          "type": "keyword"
        }
      }
    }
  ],
  "properties": {
    "other": {
      "type": "text"
    }
  }
}
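
If you generate mappings from code rather than maintaining them by hand, the same idea can be sketched in Python. This is only an illustration under a few assumptions: the helper name build_mappings and the template name strings_as_keyword are made up for this example, and the "mappings" wrapper assumes the typeless request format used by Elasticsearch 7 and later. Note that dynamic_templates is a sibling of properties in the mapping, not a field inside it.

```python
import json


def build_mappings(text_fields):
    """Return a mappings body: strings default to keyword, listed fields are text.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    return {
        "dynamic_templates": [
            {
                # The template name is arbitrary; this one is illustrative.
                "strings_as_keyword": {
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword"},
                }
            }
        ],
        # Fields listed explicitly here are not affected by the template.
        "properties": {name: {"type": "text"} for name in text_fields},
    }


# Body that could be sent when creating an index.
body = {"mappings": build_mappings(["other"])}
print(json.dumps(body, indent=2))
```

Only the exceptions need to be listed, so adding another keyword field requires no mapping change at all.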

Disable type detection

For new fields, Elasticsearch can automatically detect which type to use, but the guess can be wrong or have unexpected consequences. For example, I’ve seen Elasticsearch map a decimal field as a long because the first value to go into the index had no decimal point. All later documents then failed to index because they did not match. Disabling detection is especially important for fields with a wide range of values (for example, user-controlled input), because you can’t predict whether the first value will look like a number or a date when the field should always be treated as a string.

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html

{
  "mappings": {
    "date_detection": false,
    "numeric_detection": false
  }
}
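
To make the failure mode concrete, here is a toy model in Python. It is not Elasticsearch's actual detection logic: the ToyIndex class and looks_like_date function are invented for this sketch, only string values are handled, and only the YYYY-MM-DD format counts as a date. The one thing it is meant to show is that the first value seen fixes the field's type, and that turning detection off keeps user-controlled strings as strings.

```python
from datetime import datetime


def looks_like_date(value):
    """Rough stand-in for date detection: accepts only YYYY-MM-DD strings."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


class ToyIndex:
    """Toy model: the first value seen for a field fixes that field's type."""

    def __init__(self, date_detection=True):
        self.date_detection = date_detection
        self.field_types = {}

    def index(self, doc):
        for field, value in doc.items():
            if field not in self.field_types:
                # First sighting of the field: guess its type from this value.
                is_date = self.date_detection and looks_like_date(value)
                self.field_types[field] = "date" if is_date else "string"
            elif self.field_types[field] == "date" and not looks_like_date(value):
                # Later values must fit the type that was guessed earlier.
                raise ValueError(f"cannot parse {value!r} as a date for field {field!r}")


# With detection on, a date-like first value locks the field to "date".
with_detection = ToyIndex(date_detection=True)
with_detection.index({"user_input": "2020-01-01"})
try:
    with_detection.index({"user_input": "hello"})
    rejected = False
except ValueError:
    rejected = True

# With detection off, both documents are accepted as strings.
without_detection = ToyIndex(date_detection=False)
without_detection.index({"user_input": "2020-01-01"})
without_detection.index({"user_input": "hello"})
```

In the first index the second document is rejected because the field was already typed as a date; in the second index both documents go in.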