elasticsearch - How to define specific field tokenization on Logstash
I am using Logstash to index MySQL data into Elasticsearch:
input {
  jdbc {
    // jdbc configurations
  }
}
output {
  elasticsearch {
    index => ""
    document_type => ""
    document_id => ""
    hosts => [ "" ]
  }
}
When checking the results, I found that Elasticsearch automatically tokenizes the text like this:
"foo/bar" -> "foo", "bar" "the thing" -> "the", "thing" "fork, knife" -> "fork", "knife"
Well, that is OK for most of the fields. But there is one specific field for which I'd like a custom tokenizer: it is a comma-separated (or semicolon-separated) field, so the result should be:
"foo/bar" -> "foo/bar" "the thing" -> "the thing" "fork, knife" -> "fork", "knife"
I wonder if there is a way to configure this in the Logstash configuration.
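To make the two behaviors concrete, here is a rough Python sketch of both tokenization schemes. The regexes are simplifications for illustration only, not Elasticsearch's actual Lucene analyzers:

```python
import re

def standard_like(text):
    # Rough approximation of Elasticsearch's standard analyzer:
    # split on runs of non-alphanumeric characters and lowercase.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def comma_separated(text):
    # Desired behavior for the special field: split only on
    # commas/semicolons and trim surrounding whitespace.
    return [t.strip() for t in re.split(r"[,;]", text) if t.strip()]

print(standard_like("foo/bar"))        # ['foo', 'bar']
print(comma_separated("foo/bar"))      # ['foo/bar']
print(comma_separated("fork, knife"))  # ['fork', 'knife']
```

This just shows the difference in splitting; the actual fix has to happen in the Elasticsearch mapping, not in Python.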
Update:

This is one example of an index I have. The specific field is "kind":
{ "index-name": { "aliases": {}, "mappings": { "my-type": { "properties": { "@timestamp": { "type": "date", "format": "strict_date_optional_time||epoch_millis" }, "@version": { "type": "string" }, "kind": { "type": "string" }, "id": { "type": "long" }, "text": { "type": "string" }, "version": { "type": "string" } } } }, "settings": { "index": { "creation_date": "", "number_of_shards": "", "number_of_replicas": "", "uuid": "", "version": { "created": "" } } }, "warmers": {} } }
It's possible using an index template. Since Logstash creates the index automatically, the template ensures the right mapping is applied whenever the index is created.
First, delete the current index:
DELETE index_name
Then create an index template with the appropriate mapping for the kind field, like this:
PUT _template/index_name
{
  "template": "index-name",
  "mappings": {
    "my-type": {
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "@version": {
          "type": "string"
        },
        "kind": {
          "type": "string",
          "index": "not_analyzed"
        },
        "id": {
          "type": "long"
        },
        "text": {
          "type": "string"
        },
        "version": {
          "type": "string"
        }
      }
    }
  }
}
Then you can run Logstash again, and the index will be re-created with the proper mapping.
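Once Logstash has re-created the index, you can confirm the mapping was applied, using the same console-style syntax as above (assuming the index name from the template):

GET index-name/_mapping

The kind field should now show "index": "not_analyzed", meaning its value is indexed as a single un-tokenized term.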