Rules are the cornerstone of processing pipelines. They contain the logic about how to change, enrich, route, and drop messages.
To avoid the complexities of a complete programming language, Graylog supports a small rule language to express processing logic. The rule language is intentionally limited to allow for easier understanding, faster learning, and better runtime optimization.
The real work of rules is done in functions, which are completely pluggable. Graylog already ships with a great number of built-in functions, providing data conversion, string manipulation, data retrieval using lookup tables, JSON parsing, and much more.
We expect that special purpose functions will be written and shared by the community, enabling faster innovation and problem solving than previously possible.
Building upon the previous example in the Pipelines section, let’s look at examples of some of the rules we’ve referenced:
rule "has firewall fields"
when
    has_field("src_ip") && has_field("dst_ip")
then
end

rule "from firewall subnet"
when
    cidr_match("10.10.10.0/24", to_ip($message.gl2_remote_ip))
then
end
Firstly, apart from naming, the rule structure follows a simple when, then pattern. In the when clause we specify
a boolean expression which is evaluated in the context of the current message in the pipeline. These are the conditions
used by the pipeline processor to determine whether to run a rule, and collectively (when evaluating the containing stage’s
match all or match any requirement) whether to continue in a pipeline.
Note that the has firewall fields rule uses the built-in function has_field to check whether the message has
the src_ip and dst_ip fields, as we want to use them in a later stage of the pipeline. This rule has
no actions to run in its then clause, since we only want to use it to determine whether subsequent stages should run.
The second rule, from firewall subnet, uses the built-in function cidr_match, which takes a CIDR pattern
and an IP address. In this case we reference a field from the currently-processed message using the message reference syntax
$message. Graylog always sets the gl2_remote_ip field on messages, so we don’t need to check whether that field exists.
If we wanted to use a field that might not exist on all messages, we’d first use the has_field function to ensure its presence.
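As a minimal sketch of that guard, the rule below checks for a field before using it. The rule name, the choice of src_ip as the possibly missing field, and the 10.0.0.0/8 range are assumptions made for illustration, not part of the original pipeline:

```
rule "from internal source"
when
    // src_ip is assumed to be present on only some messages,
    // so its presence is checked before the value is converted and matched
    has_field("src_ip") &&
    cidr_match("10.0.0.0/8", to_ip($message.src_ip))
then
end
```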
Note the call to to_ip around the gl2_remote_ip field reference. This is necessary since the field is stored as a string internally, and
cidr_match requires an IP address object for its IP address argument.
Requiring an explicit conversion to an IP address object demonstrates an important feature of Graylog’s rule language: enforcement of type safety to ensure that you end up with the data in the correct format. All too often everything is treated as a string, which wastes enormous amounts of cycles on data conversion and prevents proper analysis of the data.
We again have no actions to run, since we’re just using the rule to manage the pipeline’s flow, so the then block is empty.
You might be wondering why we didn’t just combine the has firewall fields and from firewall subnet rules, since they seem to be serving the same purpose.
While we could absolutely do so, recall that rules are intended to be reusable building blocks. Imagine you have another pipeline for a different
firewall subnet. Rather than duplicating the logic to check for src_ip and dst_ip, and updating each rule if anything ever changes (e.g. additional fields),
you can simply add the has firewall fields rule to your new stage. With this approach you only need to update a single rule, with the change immediately
taking effect for all pipelines referencing it. Nice!
As we have seen in the previous section, we need to make sure to use the proper data types when calling functions.
Graylog’s rule language parser rejects invalid use of types, making it safe to write rules.
The six built-in types in Graylog are string (a UTF-8 string), double (corresponds to Java’s Double),
long (Java’s Long), boolean (Boolean), void (indicating a function has no return value, to prevent it
being used in a condition), and ip (a subset of InetAddress), but plugins are free
to add additional types as they see fit. The rule processor takes care of ensuring that values and functions agree on the types being used.
By convention, functions that convert types start with the prefix
to_. Please refer to the Functions index for a list.
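A small sketch of an explicit conversion, assuming a hypothetical bytes field that arrives as a string and should be stored as a number. The rule name and field name are illustrative only:

```
rule "store bytes as a number"
when
    has_field("bytes")
then
    // to_long converts the field value to a long before it is written back;
    // "bytes" is a hypothetical field name
    set_field("bytes", to_long($message.bytes));
end
```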
In Graylog’s rules the when clause is a boolean expression, which is evaluated against the processed message.
Expressions support the common boolean operators AND (or &&), OR (||), NOT (!), and comparison operators (<, <=, >, >=, ==, !=).
Any function that returns a value can be called in the when clause, but the expression must eventually evaluate to a boolean. For example, we were
able to use to_ip in the from firewall subnet rule since its result was passed to cidr_match, which returns a boolean, but we could not
use route_to_stream since it doesn’t return a value.
The condition must not be empty, but can simply consist of the boolean literal
true. This is useful when you always want to execute a rule’s actions.
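For instance, a rule that should run its actions for every message could look like the following sketch. The rule name and the pipeline_seen field are made up for illustration:

```
rule "mark every message"
when
    true
then
    // runs for every message that reaches the stage;
    // "pipeline_seen" is a hypothetical field name
    set_field("pipeline_seen", true);
end
```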
If a condition calls a function which is not present (perhaps due to a typo or missing plugin) the call evaluates to false.
Comparing two fields can be done when both sides use the same data type, e.g.
to_string($message.src_ip) == to_string($message.dst_ip) compares the two strings and evaluates to
true when they match. Comparing different data types evaluates to false.
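Put into a complete rule, such a comparison might be sketched as follows. The rule name and the same_endpoint field are assumptions for illustration; both sides are converted to strings so that the types agree:

```
rule "source equals destination"
when
    has_field("src_ip") && has_field("dst_ip") &&
    to_string($message.src_ip) == to_string($message.dst_ip)
then
    // "same_endpoint" is a hypothetical field name
    set_field("same_endpoint", true);
end
```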
A rule’s then clause contains a list of actions which are evaluated in the order they appear.
There are two different types of actions: function calls and variable assignments.
Function calls look exactly like they do in conditions. All functions, including those which do not return a value, may be used in the then clause.
Variable assignments have the following form:
let name = value;
Variables are useful to avoid recomputing expensive parsing of data, holding on to temporary values, or making rules more readable.
Variables need to be defined before they can be used. Their fields (if any) can be accessed using the
name.field notation in any place
where a value of the field’s type is required.
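The following sketch shows a variable holding a computed value that is then reused by two actions. The rule name and the application and application_tag fields are hypothetical:

```
rule "normalize application name"
when
    has_field("application")
then
    // compute the lowercased name once and reuse it below;
    // "application" and "application_tag" are hypothetical field names
    let app = lowercase(to_string($message.application));
    set_field("application", app);
    set_field("application_tag", concat("app-", app));
end
```

Because app is defined before the two set_field calls, both can use it without repeating the conversion.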
The list of actions can be empty, in which case the rule is essentially a pluggable condition to help manage a pipeline’s processing flow.