Using Grok in a runtime field, by Elastic Content Share

Using Grok in a runtime field can be very powerful. Grok patterns are already widely used across the Elastic Stack: you can use them in Logstash pipelines as well as in Elasticsearch ingest node pipelines. Grok is a simplified way to apply regular expressions (regex) to your fields, built on named, reusable patterns. While you can also use regex patterns directly in a runtime field, a Grok runtime field has several advantages.

With Grok you can use the prebuilt regex patterns as well as create your own. When you create your own Grok patterns, it is good practice to test them in the Kibana Grok Debugger. Of course, you can also use a third-party Grok debugger.

In the Elastic documentation you can find the following example. This runtime field parses the message field, which may contain an Apache log line. If it does, the script extracts the client IP. As of v7.14, it is not possible to extract multiple results from Grok into different fields.

String clientip=grok('%{COMMONAPACHELOG}').extract(doc["message"].value)?.clientip;
if (clientip != null) emit(clientip);
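
A script like this has to be attached to a runtime field definition before it can be queried. A minimal sketch of a request body for PUT my-index/ could look as follows. The index name my-index and the field name http.clientip are placeholders. The sketch maps message as wildcard so that it has doc values, because doc['message'] cannot read a plain text field.

```json
{
  "mappings": {
    "runtime": {
      "http.clientip": {
        "type": "ip",
        "script": "String clientip=grok('%{COMMONAPACHELOG}').extract(doc['message'].value)?.clientip; if (clientip != null) emit(clientip);"
      }
    },
    "properties": {
      "message": { "type": "wildcard" }
    }
  }
}
```

The script is the same one shown above, written on one line with single-quoted strings so that it fits into a JSON string without escaping.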

This example uses a prebuilt Grok pattern to extract the domain from the referrer. It is based on Elastic APM Real User Monitoring (RUM) data. While the first example only does a basic null check, this one is more resilient: it first checks that the document has a value for the field, because calling .value on a document without one throws an error and causes shard failures while searching. This example also emits the field with an empty value if there is no page referrer.

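// Only read the field if this document actually has a value for it.
if (doc['transaction.page.referer'].size()>0) {
    def referer_full = doc["transaction.page.referer"].value;
    if (referer_full != null) {
        // [.] matches a literal dot after the optional "www".
        String domain=grok('https?://(www[.])?%{HOSTNAME:domain}').extract(referer_full)?.domain;
        if (domain != null) emit(domain);
        return;
    }
}
// No referrer on this document: emit an empty value.
emit("");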

In theory, you can build any Grok parsing with a runtime field, but you should always consider the performance impact, because a runtime field is computed at search time for every document it is evaluated on. If you need the field for a long time, it is good practice to do the parsing at ingest time using ingest node pipelines. This will increase your search performance significantly compared to using runtime fields. However, if you want to test a field first, or if you don't want to reindex your data, a runtime field with Grok is a very useful option.
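
For comparison, the ingest-time variant of the referrer example could be sketched as the request body below, sent to PUT _ingest/pipeline/referer-domain. The pipeline name referer-domain is hypothetical. The sketch uses a referrer-domain pattern like the one above, with the dot after www matched literally, and assumes that writing the result to a top-level field called domain is acceptable.

```json
{
  "description": "Extract the referrer domain at ingest time (illustrative)",
  "processors": [
    {
      "grok": {
        "field": "transaction.page.referer",
        "patterns": ["https?://(www[.])?%{HOSTNAME:domain}"],
        "ignore_missing": true,
        "ignore_failure": true
      }
    }
  ]
}
```

Here ignore_missing skips documents without a referrer, and ignore_failure keeps a non-matching referrer from rejecting the document. The pipeline still has to be attached to the index, for example through the index.default_pipeline setting, and it only affects documents indexed after that.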

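Another example accesses the message field directly from _source. This works even when the field has no doc values (for example, a text field), at the cost of slower access.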

String foobar=grok('Returning %{NUMBER:foobar} ads').extract(params._source.message)?.foobar;
// Skip documents where the pattern does not match.
if (foobar != null) emit(foobar);
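
To try out a field like this without touching the index mapping, the script can be defined for a single search only. A minimal sketch of a request body for GET my-index/_search follows. The index name my-index and the runtime field name ads_returned are placeholders, and the field is typed as keyword because the script emits a string.

```json
{
  "runtime_mappings": {
    "ads_returned": {
      "type": "keyword",
      "script": "String foobar=grok('Returning %{NUMBER:foobar} ads').extract(params._source.message)?.foobar; if (foobar != null) emit(foobar);"
    }
  },
  "fields": ["ads_returned"]
}
```

The fields parameter returns the computed value with each hit, which makes it easy to check the pattern against real documents before deciding whether to move the parsing to ingest time.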

About the author: Creator of the Elastic Content Share.
