Inspired by PromQL, Loki has its own query language, LogQL, which works like a distributed grep that can also aggregate log data. Like PromQL, LogQL filters with labels and operators, and it supports two main types of query.

  • Log queries, which return the contents of log lines
  • Metric queries, which calculate values from the log streams that pass the filtering rules

Log queries

A basic log query consists of two parts.

  • log stream selector
  • log pipeline

Due to the design of Loki, all LogQL queries must contain a Log Stream selector. A Log Stream represents log entries that have the same metadata (set of Labels).

The log stream selector determines how many log streams are searched. A more specific selector narrows the search to a manageable number of streams, which can significantly reduce resource consumption during queries.

The log stream selector is optionally followed by a log pipeline, which further processes and filters the log streams. The pipeline is a set of expressions applied to each log line in left-to-right order; each expression can filter, parse, or change the content of the log line and its labels.

The following example shows the operation of a complete log query.

{container="query-frontend",namespace="loki-dev"} |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500

The query statement consists of the following parts.

  • a log stream selector {container="query-frontend",namespace="loki-dev"}, which selects logs from the query-frontend container in the loki-dev namespace
  • a log pipeline |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500, which keeps only the log lines containing metrics.go, parses each line with logfmt to extract more labels, and then filters on those labels

To avoid escaping special characters, you can quote a string with backticks instead of double quotes; for example, `\w+` is the same as "\\w+".

Log Stream Selector

The log stream selector determines which log streams should be included in your query results. The selector consists of one or more key-value pairs, where each key is a log label and each value is the value of that label.

Log stream selectors are written by wrapping key-value pairs in a pair of curly brackets, e.g.

{app="mysql", name="mysql-backup"}

The above example means that all log streams with the label app set to mysql and the label name set to mysql-backup will be included in the query results.

The = operator after the label name is a label matching operator. LogQL supports the following label matching operators.

  • =: exact match
  • !=: not equal
  • =~: regular expression matches
  • !~: regular expression does not match

For example.

  • {name=~"mysql.+"}
  • {name!~"mysql.+"}
  • {name!~"mysql-\\d+"}

The same rules that apply to Prometheus label selectors also apply to Loki log stream selectors.
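
The four matching operators can be illustrated with a small Python sketch. It is not Loki's implementation: the function names and the sample streams are made up for illustration, Python's re module stands in for Go's RE2, and the sketch assumes that regex matchers must match the whole label value, as they do in Prometheus.

```python
import re

def matches(labels, name, op, value):
    """Apply one label matcher to a label set (a missing label counts as an empty string)."""
    actual = labels.get(name, "")
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    # Assumption: regex matchers are fully anchored, hence fullmatch, not search.
    hit = re.fullmatch(value, actual) is not None
    if op == "=~":
        return hit
    if op == "!~":
        return not hit
    raise ValueError(f"unknown operator: {op}")

def select(streams, matchers):
    """Keep the streams whose labels satisfy every matcher."""
    return [s for s in streams if all(matches(s, *m) for m in matchers)]

# Hypothetical streams, each described by its label set.
streams = [
    {"app": "mysql", "name": "mysql-backup"},
    {"app": "mysql", "name": "mysql"},
    {"app": "nginx", "name": "nginx-1"},
]
selected = select(streams, [("app", "=", "mysql"), ("name", "=~", "mysql.+")])
print(selected)
```

Only the mysql-backup stream is selected here: the value mysql does not satisfy mysql.+ because the expression requires at least one more character after mysql.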

Log Pipeline

A log pipeline can be attached to a log stream selector to further process and filter log streams. It typically consists of one or more expressions, each executed in turn for each log line. If a log line is filtered out by an expression, the pipeline will stop there and start processing the next line. Some expressions can change the log content and their respective labels, which can then be used to further filter and process subsequent expressions or metrics queries.

A log pipeline can consist of the following parts.

  • Log line filtering expressions
  • Parser expressions
  • Label filtering expressions
  • Log line formatting expressions
  • Label formatting expressions
  • Unwrap expression

The unwrap expression is a special expression that can only be used in metric queries.

Log line filtering expressions

Log line filtering expressions are used to perform a distributed grep on aggregated logs in a matching log stream.

After the log stream selector, the resulting set of logs can be further filtered with a search expression, which can be plain text or a regular expression, e.g.

  • {job="mysql"} |= "error"
  • {name="kafka"} |~ "tsdb-ops.*io:2003"
  • {name="cassandra"} |~ "error=\\w+"
  • {instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager"

The |=, !=, |~ and !~ filter operators work as follows.

  • |=: the log line contains the string
  • !=: the log line does not contain the string
  • |~: the log line matches the regular expression
  • !~: the log line does not match the regular expression

Filter operators can be chained and are applied in order; a log line must satisfy every filter to be kept. The |~ and !~ operators accept regular expressions in Golang's RE2 syntax. Matching is case-sensitive by default and can be made case-insensitive by prefixing the regular expression with (?i).

While log line filter expressions can be placed anywhere in the pipeline, it is best to place them at the beginning, so that later stages run only on lines that match. For example, while the results are the same, the query {job="mysql"} |= "error" | json | line_format "{{.err}}" will be faster than {job="mysql"} | json | line_format "{{.message}}" |= "error". After log stream selectors, log line filter expressions are the fastest way to filter logs.
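
As a rough illustration of how chained line filters behave, here is a minimal Python sketch. The helper names and sample lines are hypothetical, and Python's re module is used in place of RE2, so it shows the semantics only and says nothing about Loki's performance.

```python
import re

def line_filter(op, pattern):
    """Return a predicate implementing one of |=, !=, |~, !~ for a log line."""
    if op == "|=":
        return lambda line: pattern in line
    if op == "!=":
        return lambda line: pattern not in line
    regex = re.compile(pattern)
    if op == "|~":
        # Line-filter regexes are unanchored: a match anywhere in the line counts.
        return lambda line: regex.search(line) is not None
    if op == "!~":
        return lambda line: regex.search(line) is None
    raise ValueError(f"unknown operator: {op}")

def apply_filters(lines, filters):
    """Keep the lines that satisfy every filter in the chain."""
    predicates = [line_filter(op, pattern) for op, pattern in filters]
    return [line for line in lines if all(p(line) for p in predicates)]

# Hypothetical log lines.
lines = [
    'level=error msg="disk full"',
    'level=ERROR msg="timeout talking to db"',
    'level=info msg="no error here"',
]
# Roughly: |~ "(?i)level=error" != "timeout"
kept = apply_filters(lines, [("|~", "(?i)level=error"), ("!=", "timeout")])
print(kept)
```

Only the first line survives: the second matches the case-insensitive expression but is removed by the != filter, and the third never matches level=error.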

Parser expressions

Parser expressions parse log content and extract labels from it. The extracted labels can then be used in label filtering expressions or for metric aggregation.

The parser automatically sanitizes extracted label keys to follow the Prometheus metric name conventions (they can only contain ASCII letters, digits, underscores and colons, and cannot start with a digit).

For example, passing the following log line through | json produces the label set shown after the arrow.

{ "a.b": { "c": "d" }, "e": "f" }

->

{a_b_c="d", e="f"}
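
The flattening and key sanitizing can be sketched in Python as follows. This only approximates what the json parser does: the function names are hypothetical, the handling of keys that start with a digit (prefixing an underscore) is an assumption made for the sketch, and arrays are simply skipped.

```python
import json
import re

def sanitize(key):
    """Replace characters that are not valid in a label name with '_'."""
    key = re.sub(r"[^a-zA-Z0-9_:]", "_", key)
    # Assumption for this sketch: prefix '_' when the key starts with a digit.
    return "_" + key if key[:1].isdigit() else key

def flatten(obj, prefix=""):
    """Flatten nested JSON objects into label names joined with '_'."""
    labels = {}
    for key, value in obj.items():
        name = sanitize(prefix + "_" + key if prefix else key)
        if isinstance(value, dict):
            labels.update(flatten(value, name))
        elif isinstance(value, list):
            continue  # arrays are skipped
        else:
            # Label values are strings; non-string scalars are serialized.
            labels[name] = value if isinstance(value, str) else json.dumps(value)
    return labels

labels = flatten(json.loads('{ "a.b": { "c": "d" }, "e": "f" }'))
print(labels)
```

For the log line above, the sketch produces the labels a_b_c="d" and e="f".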

If an error occurs, for example because the line is not in the expected format, the log line is not filtered out; instead, a new __error__ label is added to it.

Note that if an extracted label key already exists in the original log stream, the extracted key is suffixed with _extracted to distinguish the two labels. You can use a label formatting expression to force an override of the original label. If an extracted key appears twice, only the latest label value is retained.

The parsers json, logfmt, pattern, regexp and unpack are currently supported.

Prefer the predefined parsers json and logfmt whenever possible, since they are easier to use. When the log line has an unusual structure, use pattern or regexp instead. Multiple parsers can be used in the same log pipeline, which is useful when parsing complex logs.

JSON

The json parser operates in two modes.

  • Without parameters: if the log line is a valid JSON document, adding | json to your pipeline extracts all JSON properties as labels, and nested properties are flattened into label keys using the _ separator.

Note: Arrays are skipped.

For example, the json parser extracts labels from the following document.

{
  "protocol": "HTTP/2.0",
  "servers": ["129.0.1.1", "10.2.1.3"],
  "request": {
    "time": "6.032",
    "method": "GET",
    "host": "foo.grafana.net",
    "size": "55",
    "headers": {
      "Accept": "*/*",
      "User-Agent": "curl/7.68.0"
    }
  },
  "response": {
    "status": 401,
    "size": "228",
    "latency_seconds": "6.031"
  }
}

The following labels are extracted.

"protocol" => "HTTP/2.0"
"request_time" => "6.032"
"request_method" => "GET"
"request_host" => "foo.grafana.net"
"request_size" => "55"
"response_status" => "401"
"response_size" => "228"
"response_latency_seconds" => "6.031"

  • With parameters: using | json label="expression", another="expression" in your pipeline extracts only the specified JSON fields as labels. You can specify one or more expressions this way and, as with label_format, all expressions must be quoted.

Only field access (my.field, my["field"]) and array access (list[0]) are currently supported, as well as combinations of these at any level of nesting (my.list[0]["field"]).

For example, | json first_server="servers[0]", ua="request.headers[\"User-Agent\"]" extracts labels from the following log line.

{
  "protocol": "HTTP/2.0",
  "servers": ["129.0.1.1", "10.2.1.3"],
  "request": {
    "time": "6.032",
    "method": "GET",
    "host": "foo.grafana.net",
    "size": "55",
    "headers": {
      "Accept": "*/*",
      "User-Agent": "curl/7.68.0"
    }
  },
  "response": {
    "status": 401,
    "size": "228",
    "latency_seconds": "6.031"
  }
}

The extracted labels are as follows.

"first_server" => "129.0.1.1"
"ua" => "curl/7.68.0"

If the expression returns an array or object, it is assigned to the label in JSON format. For example, | json server_list="servers", headers="request.headers" extracts the following labels.

"server_list" => `["129.0.1.1","10.2.1.3"]`
"headers" => `{"Accept": "*/*", "User-Agent": "curl/7.68.0"}`

logfmt

The logfmt parser is added with | logfmt and extracts all keys and values from a logfmt-formatted log line.

For example, given the following log line.

at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" service=8ms status=200

The labels will be extracted as shown below.

"at" => "info"
"method" => "GET"
"path" => "/"
"host" => "grafana.net"
"fwd" => "124.133.124.161"
"service" => "8ms"
"status" => "200"
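
To make the extraction concrete, here is a minimal Python sketch of a logfmt-style parser. It is an approximation rather than Loki's parser: parse_logfmt is a hypothetical helper, and shlex is used only as a convenient way to split on spaces while respecting quoted values.

```python
import shlex

def parse_logfmt(line):
    """Split a logfmt line into key/value pairs; shlex strips the quotes around values."""
    labels = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore bare words without '='
            labels[key] = value
    return labels

line = 'at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" service=8ms status=200'
labels = parse_logfmt(line)
for key, value in labels.items():
    print(f'"{key}" => "{value}"')
```

Run against the sample line, the sketch prints the same seven key/value pairs as listed above.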

regexp

Unlike logfmt and json (which extract all values implicitly and without arguments), the regexp parser takes a single argument | regexp "<re>" in the form of a regular expression using Golang RE2 syntax.

The regular expression must contain at least one named submatch (e.g. (?P<name>re)), and each submatch extracts a different label.

For example, the parser | regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)" extracts labels from the following line.

POST /api/prom/api/v1/query_range (200) 1.5s

The extracted labels are as follows.

"method" => "POST"
"path" => "/api/prom/api/v1/query_range"
"status" => "200"
"duration" => "1.5s"
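
The same named-group expression can be tried out in Python, whose re module also accepts the (?P<name>re) syntax. This is only a stand-in for Go's RE2 engine, and parse_regexp is a hypothetical helper; on a failed match the sketch simply returns no labels.

```python
import re

LINE = "POST /api/prom/api/v1/query_range (200) 1.5s"
PATTERN = re.compile(
    r"(?P<method>\w+) (?P<path>[\w|/]+) \((?P<status>\d+?)\) (?P<duration>.*)"
)

def parse_regexp(line):
    """Return one label per named submatch, or no labels if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}

labels = parse_regexp(LINE)
print(labels)
```

Each named group becomes one label, so the sample line yields method, path, status and duration.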

pattern

The pattern parser allows fields to be extracted explicitly from log lines by defining a pattern expression (| pattern "<pattern-expression>") that matches the structure of the log line.

For example, let’s consider the following NGINX log line data.

0.191.12.2 - - [10/Jun/2021:09:14:29 +0000] "GET /api/plugins/versioncheck HTTP/1.1" 200 2 "-" "Go-http-client/2.0" "13.76.247.102, 34.120.177.193" "TLSv1.2" "US" ""

The log line can be parsed with the following expression.

<ip> - - <_> "<method> <uri> <_>" <status> <size> <_> "<agent>" <_>

After parsing, these attributes can be extracted as follows.

"ip" => "0.191.12.2"
"method" => "GET"
"uri" => "/api/plugins/versioncheck"
"status" => "200"
"size" => "2"
"agent" => "Go-http-client/2.0"

A capture in a pattern expression is a field name delimited by the < and > characters; for example, <example> defines the field name example. An unnamed capture is written as <_> and skips the matched content. By default, the pattern expression is anchored at the start of the log line: if the expression starts with literals, the log line must start with the same literals. Put <_> at the beginning of the expression if you don't want it anchored at the start.

For example, let’s look at the following log line data.

level=debug ts=2021-06-10T09:24:13.472094048Z caller=logging.go:66 traceID=0568b66ad2d9294c msg="POST /loki/api/v1/push (204) 16.652862ms"

If we wish to match only the contents of msg=", we can use the following expression to do so.

<_> msg="<method> <path> (<status>) <latency>"

We don't need most of the preceding log data, so a single <_> placeholder skips it, which is much simpler than the equivalent regular expression.
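
One way to build intuition for pattern expressions is to translate them into regular expressions. The Python sketch below does that under simplifying assumptions: every capture lazily consumes text up to the next literal, the last capture runs to the end of the line, and the helper names are hypothetical. Loki's actual pattern matcher is a separate implementation and may differ in edge cases.

```python
import re

CAPTURE = re.compile(r"(<[A-Za-z_][A-Za-z0-9_]*>)")

def pattern_to_regex(expr):
    """Translate a pattern expression into an equivalent regular expression."""
    tokens = [t for t in CAPTURE.split(expr) if t]
    parts = []
    for i, token in enumerate(tokens):
        if CAPTURE.fullmatch(token):
            name = token[1:-1]
            # The last capture runs to the end of the line; others stop at the next literal.
            body = ".*" if i == len(tokens) - 1 else ".*?"
            parts.append(body if name == "_" else f"(?P<{name}>{body})")
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts))

def parse_pattern(expr, line):
    """Extract the named captures; match() anchors at the start of the line."""
    match = pattern_to_regex(expr).match(line)
    return match.groupdict() if match else {}

line = (
    "level=debug ts=2021-06-10T09:24:13.472094048Z caller=logging.go:66 "
    'traceID=0568b66ad2d9294c msg="POST /loki/api/v1/push (204) 16.652862ms"'
)
labels = parse_pattern('<_> msg="<method> <path> (<status>) <latency>"', line)
print(labels)
```

For the msg example this yields method=POST, path=/loki/api/v1/push, status=204 and latency=16.652862ms, while the leading <_> absorbs everything before msg=.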

unpack

The unpack parser parses a JSON log line and unpacks all the labels embedded by Promtail's pack stage. The special property _entry is used to replace the original log line.

For example, applying the | unpack parser to the following log line.

{
  "container": "myapp",
  "pod": "pod-3223f",
  "_entry": "original log message"
}

This extracts the container and pod labels and uses original log message as the new log line.

If the original embedded log lines are in a specific format, you can use unpack in combination with a json parser (or other parser).

Label filtering expressions

A label filter expression filters log lines using their original and extracted labels, and it can contain multiple predicates.

A predicate contains a label identifier, an operator and a value to compare the label with.

For example, in cluster="namespace" the label identifier is cluster, the operator is = and the value is "namespace".

LogQL supports a variety of value types that are automatically inferred from the query input.

  • String is enclosed in double quotes or backticks, such as "200" or `us-central1`.
  • Duration is a sequence of decimal numbers, each with an optional fraction and a unit suffix, such as "300ms", "1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  • Number is a floating point number (64 bits), e.g. 250, 89.923.
  • Bytes is a sequence of decimal numbers, each with an optional fraction and a unit suffix, such as "42MB", "1.5Kib" or "20b". Valid byte units are "b", "kib", "kb", "mib", "mb", "gib", "gb", "tib", "tb", "pib", "pb", "eib", "eb".

The string type works exactly like the Prometheus label matchers used in the log stream selector, which means you can use the same operators (=, !=, =~, !~).

Using Duration, Number and Bytes will convert the tag values before comparing and supports the following comparators.

  • == or = equal comparison
  • != not equal comparison
  • > and >= for greater-than or greater-than-equal comparisons
  • < and <= for less-than or less-than-equal comparisons

For example, logfmt | duration > 1m and bytes_consumed > 20MB is a valid filter expression.

If the conversion of the label value fails, the log line is not filtered out and an __error__ label is added. You can chain multiple predicates with and and or, which represent the binary and and or operations, respectively. and can also be written as a comma, a space, or another pipe. Label filters can be placed anywhere in the log pipeline.

All of the following expressions are equivalent:

| duration >= 20ms or size == 20kb and method!~"2.."
| duration >= 20ms or size == 20kb | method!~"2.."
| duration >= 20ms or size == 20kb,method!~"2.."
| duration >= 20ms or size == 20kb method!~"2.."

By default, multiple predicates are evaluated from left to right. You can wrap predicates in parentheses to force a different order.

For example, the following is equivalent.

| duration >= 20ms or method="GET" and size <= 20KB
| ((duration >= 20ms or method="GET") and size <= 20KB)

This evaluates duration >= 20ms or method="GET" first. To evaluate method="GET" and size <= 20KB first, use parentheses as shown below.

| duration >= 20ms or (method="GET" and size <= 20KB)
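
The difference between the two groupings is easy to see with concrete truth values. The Python sketch below folds the predicates in the order the first example above groups them and compares that with the explicitly parenthesized form; eval_left_to_right is a hypothetical helper and the truth values describe an imaginary log line.

```python
from functools import reduce

def eval_left_to_right(first, rest):
    """Fold predicates in order: ((p1 op p2) op p3) ..."""
    def step(acc, item):
        op, value = item
        return (acc and value) if op == "and" else (acc or value)
    return reduce(step, rest, first)

# An imaginary line with duration >= 20ms, a method other than GET, and size > 20KB.
duration_ok, method_get, size_ok = True, False, False

# | duration >= 20ms or method="GET" and size <= 20KB
default = eval_left_to_right(duration_ok, [("or", method_get), ("and", size_ok)])

# | duration >= 20ms or (method="GET" and size <= 20KB)
grouped = duration_ok or (method_get and size_ok)

print(default, grouped)
```

With these values the ungrouped form rejects the line (False) while the parenthesized form keeps it (True), so the placement of parentheses changes which lines pass the filter.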

Log line formatting expressions

Log line formatting expressions rewrite the content of log lines using Golang's text/template format. The expression takes a single string parameter, | line_format "{{.label_name}}", which is the template. All labels are injected into the template as variables and can be referenced with the {{.label_name}} notation.

For example, the following expression.

{container="frontend"} | logfmt | line_format "{{.query}} {{.duration}}"

The log lines are rewritten to contain only the query and the duration of the request. To avoid escaping special characters, you can write the template as a backtick-quoted string, such as `{{.label_name}}`, instead of a double-quoted one.

line_format also supports mathematical functions. For example, given the labels ip=1.1.1.1, status=200 and duration=3000 (ms), the template "{{.ip}} {{.status}} {{div .duration 1000}}" divides duration by 1000 to get the value in seconds and produces the log line 1.1.1.1 200 3.

Label format expressions

The | label_format expression can rename, modify or add labels. It takes a comma-separated list of operations as arguments, and can perform multiple operations at once.

When both sides are label identifiers, for example dst=src, the operation will rename the src label to dst.

The right side can also be a template string, e.g. dst="{{.status}} {{.query}}", in which case the dst label value is replaced by the result of executing the Golang template. This is the same template engine as in the | line_format expression, so labels are available as variables and the same list of functions can be used.

In both cases above, if the target label does not exist, a new label is created.

The rename form dst=src removes the src label after remapping it to dst. The template form, however, keeps the referenced labels; for example, dst="{{.src}}" results in both dst and src having the same value.

A label name can only appear once in each expression, which means that | label_format foo=bar,foo="new" is not allowed, but you can use two expressions to achieve the desired effect, such as | label_format foo=bar | label_format foo="new" .
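
The difference between the rename form and the template form can be mimicked with plain dictionaries. In this Python sketch the helpers rename and template are hypothetical, and Python's str.format with {src} stands in for the Go template {{.src}}.

```python
def rename(labels, dst, src):
    """dst=src: move the value of src to dst and drop src."""
    out = dict(labels)
    out[dst] = out.pop(src)
    return out

def template(labels, dst, fmt):
    """dst="<template>": render the template from the labels and keep the referenced labels."""
    out = dict(labels)
    out[dst] = fmt.format(**labels)
    return out

# Hypothetical label set.
labels = {"src": "10.0.0.1", "status": "200"}
renamed = rename(labels, "dst", "src")
templated = template(labels, "dst", "{src}")
print(renamed)
print(templated)
```

After the rename only dst and status remain, whereas the template form leaves src in place next to the new dst label.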

Metric queries

LogQL also supports computing metrics from log streams by applying functions to them. Typical uses are calculating the error rate of messages or ranking the top N sources of application log output over time.

Range vectors

Like PromQL, LogQL supports a limited set of range vector operations over log entries, with the following 4 functions.

  • rate: counts log entries per second
  • count_over_time: counts the entries of each log stream within the specified range
  • bytes_rate: calculates the number of bytes per second for a log stream
  • bytes_over_time: the number of bytes used for each log stream in the specified range

For example, to calculate the qps of nginx.

rate({filename="/var/log/nginx/access.log"}[5m])

Calculate the number of times the kernel has experienced oom in the last 5 minutes.

count_over_time({filename="/var/log/message"} |~ "oom_kill_process" [5m])
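
Conceptually, count_over_time counts the entries that fall inside the range and rate divides that count by the length of the range in seconds. The Python sketch below illustrates this for a single stream; the timestamps are made up and the exact window boundaries (open on the left, closed on the right) are an assumption of the sketch.

```python
def count_over_time(timestamps, now, window_s):
    """Number of entries with a timestamp in the window (now - window_s, now]."""
    return sum(1 for t in timestamps if now - window_s < t <= now)

def rate(timestamps, now, window_s):
    """Entries per second over the same window."""
    return count_over_time(timestamps, now, window_s) / window_s

# Hypothetical entry timestamps (in seconds) for a single stream.
timestamps = [10, 70, 130, 250, 290, 299, 400]
now, five_minutes = 300, 300

count = count_over_time(timestamps, now, five_minutes)
per_second = rate(timestamps, now, five_minutes)
print(count, per_second)
```

Six of the seven entries fall inside the five-minute window, which corresponds to a rate of 0.02 entries per second.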

Aggregation Functions

LogQL also supports aggregation, which can be used to aggregate the elements within a single vector to produce a new vector with fewer elements.

  • sum: summation
  • min: minimum value
  • max: maximum value
  • avg: average value
  • stddev: standard deviation
  • stdvar: standard variance
  • count: count
  • bottomk: the smallest k elements
  • topk: the largest k elements

An aggregation has the following form.

<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]

Use by to group by the listed labels, or without to group by all labels except the listed ones. For example, to calculate the qps of nginx and group it by pod.

sum(rate({filename="/var/log/nginx/access.log"}[5m])) by (pod)

Only bottomk and topk take a parameter (k). For example, to calculate the top 5 qps for nginx, grouped by pod.

topk(5,sum(rate({filename="/var/log/nginx/access.log"}[5m])) by (pod))
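
The grouping and ranking steps can be pictured with a few lines of Python. The sketch below is not how Loki evaluates the query: sum_by and topk are hypothetical helpers, and the per-stream rates and the container label are invented sample data.

```python
from collections import defaultdict

def sum_by(samples, label):
    """sum(...) by (label): add up the values of all series sharing that label value."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)

def topk(k, totals):
    """topk(k, ...): keep the k largest entries, largest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Hypothetical per-stream rates, each with its label set.
samples = [
    ({"pod": "nginx-a", "container": "web"}, 2.0),
    ({"pod": "nginx-b", "container": "web"}, 5.0),
    ({"pod": "nginx-a", "container": "sidecar"}, 1.5),
    ({"pod": "nginx-c", "container": "web"}, 0.5),
]
by_pod = sum_by(samples, "pod")
top2 = topk(2, by_pod)
print(by_pod)
print(top2)
```

Summing by pod merges the two nginx-a streams into a single value of 3.5, and topk(2, ...) then keeps nginx-b and nginx-a.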

Binary operations

Arithmetic operations

Loki stores logs, which are all text, so how can you calculate with them? The arithmetic operations in LogQL apply to the numeric results of metric queries, not to the log lines themselves. The supported binary operators are as follows.

  • +: addition
  • -: subtraction
  • *: multiplication
  • /: division
  • %: modulo
  • ^: exponentiation

For example, if we want to find the error rate inside a certain business log, we can calculate it as follows.

sum(rate({app="foo", level="error"}[1m])) / sum(rate({app="foo"}[1m]))

Logical operations

Logical (set) operations are only defined between vectors, and LogQL currently supports

  • and: intersection
  • or: union
  • unless: complement

For example.

rate({app=~"foo|bar"}[1m]) and rate({app="bar"}[1m])

Comparison Operators

LogQL supports the same comparison operators as PromQL, including

  • ==: equal to
  • !=: not equal
  • >: greater than
  • >=: greater than or equal to
  • <: less than
  • <=: less than or equal to

Comparisons are typically applied as thresholds to the result of a range vector calculation, which is useful for alerting, e.g. to find streams with more than 10 error-level log entries within 5 minutes.

count_over_time({app="foo", level="error"}[5m]) > 10

With the bool modifier, the comparison returns a Boolean value instead of filtering: 1 if there are more than 10 error-level log entries within 5 minutes, and 0 otherwise.

count_over_time({app="foo", level="error"}[5m]) > bool 10
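
The effect of the bool modifier can be illustrated in a few lines of Python. The compare helper and the per-stream counts are hypothetical; the sketch only contrasts filtering with mapping each series to 1 or 0.

```python
def compare(vector, threshold, bool_modifier=False):
    """`> threshold` filters series; `> bool threshold` maps each series to 1 or 0."""
    if bool_modifier:
        return {k: (1 if v > threshold else 0) for k, v in vector.items()}
    return {k: v for k, v in vector.items() if v > threshold}

# Hypothetical error counts per stream over the last 5 minutes.
counts = {"foo-1": 14, "foo-2": 3}

filtered = compare(counts, 10)
as_bool = compare(counts, 10, bool_modifier=True)
print(filtered)
print(as_bool)
```

Without bool, only the stream above the threshold is returned with its original value; with bool, every stream is returned with 1 or 0.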

Comments

LogQL queries can contain comments, which start with the # character, e.g.

{app="foo"} # anything that comes after will not be interpreted in your query

In multi-line LogQL queries, you can use # to comment out whole or partial lines.

{app="foo"}
    | json
    # this line will be ignored
    | bar="baz" # this checks if bar = "baz"

Query Example

Here we deploy a sample application, a fake logger that writes debug, info and warning logs to stdout and error logs to stderr. The log messages are generated in JSON format, and a new message is created every 500 milliseconds. The log message format is shown below.

{
    "app":"The fanciest app of mankind",
    "executable":"fake-logger",
    "is_even": true,
    "level":"debug",
    "msg":"This is a debug message. Hope you'll catch the bug",
    "time":"2022-04-04T13:41:50+02:00",
    "version":"1.0.0"
}

Use the following command to create the sample application.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
 labels:
  app: fake-logger
  environment: development
 name: fake-logger
spec:
 selector:
  matchLabels:
   app: fake-logger
   environment: development
 template:
  metadata:
   labels:
    app: fake-logger
    environment: development
  spec:
   containers:
   - image: thorstenhans/fake-logger:0.0.2
     name: fake-logger
     resources:
      requests:
       cpu: 10m
       memory: 32Mi
      limits:
       cpu: 10m
       memory: 32Mi
EOF

We can use {app="fake-logger"} to query the application’s log stream data in Grafana.

Since the logs of our sample application are in JSON format, we can parse them with the JSON parser using the expression {app="fake-logger"} | json.

After parsing the logs with the JSON parser, the Grafana panel colors each entry according to the value of level, and the attributes of our log are now available as log labels.

Now that the JSON properties have become log labels, we can use them to filter log data. For example, to keep only logs with level=error, we use the expression {app="fake-logger"} | json | level="error".

In addition, we can format the output with line_format. For example, the query {app="fake-logger"} | json | is_even="true" | line_format "at {{.time}}, {{.level}}@{{.pod}} Pod generated log {{.msg}}" formats the log output.

Monitoring Panel

Here we use monitoring Kubernetes events as an example. First install kubernetes-event-exporter from https://github.com/opsgenie/kubernetes-event-exporter/tree/master/deploy. The kubernetes-event-exporter logs are printed to stdout, and Promtail then uploads them to Loki.

Then import the Dashboard at https://grafana.com/grafana/dashboards/14003, taking care to change the filter label in each chart to job="monitoring/event-exporter".

After the modification, the Dashboard shows the relevant event information for the cluster. It is recommended, however, to replace the query statements in the panels with recording rules.

Recommendations

  1. Prefer static labels, which have lower overhead. Labels are usually attached to logs before they are sent to Loki. Recommended static labels include

    • Host: kubernetes/hosts
    • Application name: kubernetes/labels/app_kubernetes_io/name
    • Component name: kubernetes/labels/name
    • Namespace: kubernetes/namespace
    • Other static labels, such as environment, version, etc.
  2. Use dynamic labels with caution. Too many label combinations create a large number of streams, which makes Loki store a large index and many small chunk objects. This can significantly degrade Loki's query performance. To avoid these problems, don't add a label until you know you need it. Loki's strength lies in parallel querying, so using filter expressions (label="text", |~ "regex", …) to query the logs is more efficient and faster.

  3. Keep label values bounded. As Loki users or operators, our goal should be to use as few labels as possible to store logs. Fewer labels mean a smaller index, which leads to better performance, so we should always think twice before adding a label.

  4. Configure caching. Loki can cache data for multiple components, using either redis or memcached, which can significantly improve performance.

  5. Use LogQL syntax wisely to dramatically improve query efficiency. Label matchers are your first line of defense and the best way to dramatically reduce the number of logs you search (for example, from 100TB to 1TB). Of course, this means you need good label conventions on the log collection side.