Transform

This example shows how Apache logs can be handled with Tremor and Elasticsearch. It is considerably more complex than the initial showcases and combines three components:

Kibana, which, once started with docker-compose, can be reached locally. It allows browsing through the logs. If you have never used Kibana before, you can get started by clicking on Management and then, in the Elasticsearch section, on Index Management.

Elasticsearch, which stores the submitted logs.

Tremor, which takes the Apache logs, parses and classifies them, then submits them to indexes in Elasticsearch.

In addition, the file demo/data/apache_access_logs.xz is used as the example payload.

Environment

In example.trickle we define scripts that extract and categorize Apache logs. Any log that does not conform to the predefined format is dropped. All other configuration is the same as in the previous example and is elided here for brevity.

Business Logic

define script extract                                                          # define the script that parses our apache logs
script
  match {"raw": event} of                                                      # we use the dissect extractor to parse the apache log
    case r = %{ raw ~= dissect|%{ip} %{} %{} [%{timestamp}] "%{method} %{path} %{proto}" %{code:int} %{cost:int}\\n| }
            => r.raw                                                           # this first case is hit if the log includes an execution time (cost) for the request
    case r = %{ raw ~= dissect|%{ip} %{} %{} [%{timestamp}] "%{method} %{path} %{proto}" %{code:int} %{}\\n| }
            => r.raw
    default => emit => "bad"
  end
end;
define script categorize                                                       # define the script that classifies the logs
with
  user_error_index = "errors",                                                 # we use with here to default some configuration for
  server_error_index = "errors",                                               # the script, we could then re-use this script in multiple
  ok_index = "requests",                                                       # places with different indexes
  other_index = "requests"
script
  let $doc_type = "log";                                                      # doc_type is used by the offramp, the $ denotes this is stored in event metadata
  let $index = match event of
    case e = %{present code} when e.code >= 200 and e.code < 400              # for http codes between 200 and 400 (exclusive) - those are success codes
      => args.ok_index
    case e = %{present code} when e.code >= 400 and e.code < 500              # 400 to 500 (exclusive) are client side errors
      => args.user_error_index
    case e = %{present code} when e.code >= 500 and e.code < 600
      => args.server_error_index                                              # 500 to 600 (exclusive) are server side errors
    default => args.other_index                                               # if we get any other code we just use a default index
  end;
  event                                                                       # emit the event with its new metadata
end;
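
To reason about which index a given log line should end up in, the two scripts above can be mirrored in a few lines of Python. This is only a rough sketch for checking expectations outside of Tremor: it uses a regular expression as an approximation of the dissect pattern (it does not reproduce the dissect extractor's exact semantics), the function names extract and categorize simply echo the script names, and the sample log lines in the usage note are made up for illustration.

```python
import re

# Regex approximation (an assumption, not the dissect extractor itself) of:
#   %{ip} %{} %{} [%{timestamp}] "%{method} %{path} %{proto}" %{code:int} %{cost:int}
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<code>\d+) (?P<cost>\S+)$'
)


def extract(raw):
    """Parse one access-log line; return None where the script would emit "bad"."""
    m = LOG_RE.match(raw.rstrip("\n"))
    if m is None:
        return None
    rec = m.groupdict()
    rec["code"] = int(rec["code"])
    cost = rec.pop("cost")
    try:
        # First case of the script: the line carries an integer cost.
        rec["cost"] = int(cost)
    except ValueError:
        # Second case: the last field is not an integer, so it is discarded.
        pass
    return rec


def categorize(rec,
               user_error_index="errors",
               server_error_index="errors",
               ok_index="requests",
               other_index="requests"):
    """Return the index name the categorize script would store in $index."""
    code = rec.get("code")
    if code is None:
        return other_index
    if 200 <= code < 400:
        return ok_index
    if 400 <= code < 500:
        return user_error_index
    if 500 <= code < 600:
        return server_error_index
    return other_index
```

With the default arguments, a hypothetical line such as 127.0.0.1 - - [19/Jun/1998:22:00:05 +0000] "GET /index.html HTTP/1.0" 200 1234 maps to the requests index, while the same request with status 404 maps to errors.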

Command line testing during logic development

$ docker-compose up
  ... lots of logs ...

Inject test messages via websocat

Note

websocat can be installed via cargo install websocat, for the lazy/impatient amongst us.

$ xzcat demo/data/apache_access_logs.xz | websocat ws://localhost:4242
...
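
When only a small slice of the payload is wanted during development, the xz file can also be read line by line from a script. The following is a minimal sketch, assuming Python's standard lzma module; the function name replay and its send and limit parameters are hypothetical, and send stands in for whatever delivers a message (for example the send method of a websocket client of your choice).

```python
import lzma
from typing import Callable, Optional


def replay(path: str, send: Callable[[str], None], limit: Optional[int] = None) -> int:
    """Pass each line of an xz-compressed log file to `send`.

    Stops after `limit` lines when a limit is given and returns the
    number of lines handed to `send`. Lines are passed through unchanged,
    including their trailing newline.
    """
    sent = 0
    with lzma.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if limit is not None and sent >= limit:
                break
            send(line)
            sent += 1
    return sent
```

For a quick look at the data, replay("demo/data/apache_access_logs.xz", print, limit=5) prints the first five lines without sending anything anywhere.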

Open the Kibana index management and create indexes to view the data.

Discussion

This is a fairly complex example that combines everything we've seen in the prior examples and a bit more. It should serve as a starting point for using Tremor to ingest, process, filter and classify data and forward it into an upstream system.

Tip

When using this as a baseline, be aware that settings such as batching will need tuning to fit the infrastructure it is pointed at. Also, since this is not an ongoing data stream, we omitted backpressure and classification-based rate limiting from the example.