Deprecation Note
We published the last version of Graylog Documentation before the release of Graylog 4.2. Now, all documentation and help content for Graylog products are available at https://docs.graylog.org/.
There will be no further updates to these pages as of October 2021.
Rules
Overview
Rules are the cornerstone of processing pipelines. They contain the logic about how to change, enrich, route, and drop messages.
To avoid the complexities of a complete programming language, Graylog supports a small rule language to express processing logic. The rule language is intentionally limited to allow for easier understanding, faster learning, and better runtime optimization.
The real work of rules is done in functions, which are completely pluggable. Graylog already ships with a great number of built-in functions, providing data conversion, string manipulation, data retrieval using lookup tables, JSON parsing, and much more.
We expect that special purpose functions will be written and shared by the community, enabling faster innovation and problem solving than previously possible.
Rule Structure
Building upon the previous example in the Pipelines section, let’s look at examples of some of the rules we’ve referenced:
rule "has firewall fields"
when
    has_field("src_ip") && has_field("dst_ip")
then
end

rule "from firewall subnet"
when
    cidr_match("10.10.10.0/24", to_ip($message.gl2_remote_ip))
then
end
Apart from its name, a rule follows a simple when, then pattern. In the when clause we specify a boolean expression that is evaluated in the context of the current message in the pipeline. The pipeline processor uses these conditions to determine whether to run a rule and, collectively (when evaluating the containing stage’s match all or match either requirement), whether to continue in a pipeline.
Note that the "has firewall fields" rule uses the built-in function has_field to check whether the message has the src_ip and dst_ip fields, as we want to use them in a later stage of the pipeline. This rule has no actions to run in its then clause, since we only want to use it to determine whether subsequent stages should run.
The second rule, "from firewall subnet", uses the built-in function cidr_match, which takes a CIDR pattern and an IP address. In this case we reference a field from the currently processed message using the message reference syntax $message. Graylog always sets the gl2_remote_ip field on messages, so we don’t need to check whether that field exists. If we wanted to use a field that might not exist on all messages, we’d first use the has_field function to ensure its presence.
Note the call to to_ip around the gl2_remote_ip field reference. This is necessary since the field is stored as a string internally, and cidr_match requires an IP address object for its ip parameter.
Requiring an explicit conversion to an IP address object demonstrates an important feature of Graylog’s rule language: enforcement of type safety to ensure that you end up with the data in the correct format. All too often everything is treated as a string, which wastes enormous amounts of cycles on data conversion and prevents proper analysis of the data.
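As a minimal sketch of the guard and conversion described above, the following rule checks for a field before converting it. The rule name is hypothetical, and the sketch assumes that src_ip, unlike gl2_remote_ip, may be missing on some messages:

```
rule "source in firewall subnet"
when
    // has_field guards the field reference;
    // to_ip turns the stored string into an IP address object for cidr_match
    has_field("src_ip") && cidr_match("10.10.10.0/24", to_ip($message.src_ip))
then
end
```

Passing $message.src_ip to cidr_match without the to_ip call should be rejected when the rule is saved, since the ip parameter expects an IP address object rather than a string.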
We again have no actions to run, since we’re just using the rule to manage the pipeline’s flow, so the then block is empty.
You might be wondering why we didn’t just combine the has firewall fields and from firewall subnet rules, since they seem to be serving the same purpose.
While we could absolutely do so, recall that rules are intended to be reusable building blocks. Imagine you have another pipeline for a different firewall subnet. Rather than duplicating the logic to check for src_ip and dst_ip, and updating each rule if anything ever changes (e.g. additional fields), you can simply add the "has firewall fields" rule to your new stage. With this approach you only need to update a single rule, with the change immediately taking effect for all pipelines referencing it. Nice!
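To illustrate the reuse, a second pipeline might reference the shared rule as sketched here. The pipeline name and the rule "from firewall subnet B" are hypothetical and would have to exist in your own setup:

```
pipeline "Firewall subnet B"
stage 0 match all
    // shared rule, defined once and referenced by every firewall pipeline
    rule "has firewall fields";
    // hypothetical rule that matches the second subnet
    rule "from firewall subnet B";
end
```

Because the stage only names the rule, any later change to "has firewall fields" applies to this pipeline as well.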
Data Types
As we have seen in the previous section, we need to make sure to use the proper data types when calling functions.
Graylog’s rule language parser rejects invalid use of types, making it safe to write rules.
The six built-in types in Graylog are string (a UTF-8 string), double (corresponds to Java’s Double), long (Java’s Long), boolean (Boolean), void (indicating a function has no return value, to prevent it being used in a condition), and ip (a subset of InetAddress), but plugins are free to add additional types as they see fit. The rule processor takes care of ensuring that values and functions agree on the types being used.
By convention, functions that convert types start with the prefix to_. Please refer to the Functions index for a list.
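A short sketch of the to_ conversion functions in use follows. The rule name and the field names bytes and response_time are assumptions made for illustration; to_long, to_double, and set_field are built-in functions:

```
rule "store numeric fields as numbers"
when
    has_field("bytes") && has_field("response_time")
then
    // convert the extracted values to numeric types before writing them back
    set_field("bytes", to_long($message.bytes));
    set_field("response_time", to_double($message.response_time));
end
```

Writing the converted values back lets later rules compare them as numbers instead of strings.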
Conditions
In Graylog’s rules the when clause is a boolean expression, which is evaluated against the processed message.
Expressions support the common boolean operators AND (or &&), OR (||), NOT (!), and the comparison operators (<, <=, >, >=, ==, !=).
Any function that returns a value can be called in the when clause, but the clause as a whole must evaluate to a boolean. For example, we were able to use to_ip in the "from firewall subnet" rule since its result was passed to cidr_match, which returns a boolean, but we could not use route_to_stream since it doesn’t return a value.
The condition must not be empty, but can simply consist of the boolean literal true. This is useful when you always want to execute a rule’s actions.
If a condition calls a function which is not present (perhaps due to a typo or missing plugin), the call evaluates to false.
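The two sketches below combine the operators listed above and show the literal true as a condition. The rule names and the fields http_status, took_ms, needs_attention, and pipeline_processed are hypothetical:

```
rule "slow or failing request"
when
    // both sides of each comparison are converted to long
    has_field("http_status") && has_field("took_ms") &&
    (to_long($message.http_status) >= 500 || to_long($message.took_ms) > 1000)
then
    set_field("needs_attention", true);
end

rule "mark as processed"
when
    // always matches, so the action runs for every message
    true
then
    set_field("pipeline_processed", true);
end
```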
Note
Two fields can be compared when they have the same data type, e.g. to_string($message.src_ip) == to_string($message.dst_ip) compares the two strings and evaluates to true if they match. Comparing different data types evaluates to false.
Actions
A rule’s then clause contains a list of actions which are evaluated in the order they appear.
There are two different types of actions:
Function calls
Variable assignments
Function calls look exactly like they do in conditions. All functions, including those which do not return a value, may be used in the then clause.
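As a sketch of function calls used as actions, the rule below calls two built-in functions for their side effects. The rule name, the log_type field, and the stream name "Firewall" are assumptions, and the stream would need to exist:

```
rule "route firewall traffic"
when
    has_field("src_ip") && has_field("dst_ip")
then
    // actions run in the order they appear
    set_field("log_type", "firewall");
    // route_to_stream returns no value, so it is only usable here, not in a condition
    route_to_stream(name: "Firewall");
end
```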
Variable assignments have the following form:
let name = value;
Variables are useful to avoid recomputing expensive parsing of data, holding on to temporary values, or making rules more readable.
Variables need to be defined before they can be used. Their fields (if any) can be accessed using the name.field notation in any place where a value of the field’s type is required.
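The following sketch shows a variable holding the result of the built-in regex function, so the message is matched only once. The rule name, the pattern, and the target field names are hypothetical. It also assumes, based on the function reference, that the result exposes a boolean matches field and that unnamed capture groups are keyed by index starting at "0":

```
rule "extract session id"
when
    has_field("message")
then
    // run the regular expression once and keep the result
    let m = regex("session=([0-9a-f]+)", to_string($message.message));
    // name.field notation: read the boolean matches field of the result
    set_field("session_matched", m.matches);
    // assumed indexing of the first capture group
    set_field("session_id", m["0"]);
end
```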
The list of actions can be empty, in which case the rule is essentially a pluggable condition to help manage a pipeline’s processing flow.