Authored by: Michael Stöckli

In our latest release, Unomaly 3.0, we made significant improvements to search: a single filter type can now take multiple values, which makes it easier to find specific information. This means you can search for more than one system, group, anomaly type or other condition within a single search.

Before 3.0, search could only combine different conditions so that results had to match all of them. For example, the conditions below retrieve results that come from the system named “deploy-15” and where the anomaly has the tag “database”.

fpo

The operative word above is ‘and’. Because all conditions must be satisfied, it was impossible to search for results that, say, come from any of several systems or carry any of several tags.

Modeling the AST

We want to be able to use this filter construct in multiple contexts, which means we need some intermediate representation that can be used within different contexts. In concrete terms, we need a representation of the query that is portable across different languages and doesn’t rely on any implicit information that is otherwise captured by the context itself.

So we built a query AST using Google protobuf.

The conditions we use in the query above look like the following.

fpo

This is a boolean expression tree in which the leaf elements are assertions.

The AST representation reflects the ‘query domain model’ that we want to support. It is strongly typed and can only be generated if valid input types are used.

We parse the conditions as they exist in Unomaly today to generate the AST. Conditions are an array of strings, and if we see the same condition type more than once, we automatically convert that to an ‘or’-based condition. For the query above, if we want to fetch results for both deploy-15 and deploy-10, we would end up with the following AST.
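A minimal sketch of this grouping step, not the production parser. It assumes each condition is a "field:value" string and uses plain TypeScript object types as hypothetical stand-ins for the generated protobuf classes; `conditionsToAst` and the field names are illustrative.

```typescript
// Plain TypeScript stand-ins for the protobuf messages (hypothetical shapes).
type AssertExpr = { kind: "assert"; op: "EQ"; field: string; value: string };
type BoolExpr = { kind: "bool"; op: "AND" | "OR" | "NOT"; children: Expr[] };
type Expr = AssertExpr | BoolExpr;

// Assumption: each condition is a "field:value" string.
function conditionsToAst(conditions: string[]): BoolExpr {
  // Group values by condition type, keeping first-seen order.
  const groups: { field: string; values: string[] }[] = [];
  conditions.forEach((condition) => {
    const idx = condition.indexOf(":");
    if (idx <= 0) throw new Error("Malformed condition: " + condition);
    const field = condition.slice(0, idx);
    const value = condition.slice(idx + 1);
    const existing = groups.filter((g) => g.field === field)[0];
    if (existing) existing.values.push(value);
    else groups.push({ field, values: [value] });
  });

  const children = groups.map((group): Expr => {
    const asserts = group.values.map(
      (value): Expr => ({ kind: "assert", op: "EQ", field: group.field, value })
    );
    // One value stays a plain assertion; repeated types become an 'or'.
    return asserts.length === 1 ? asserts[0] : { kind: "bool", op: "OR", children: asserts };
  });

  // Different condition types are still combined with 'and'.
  return { kind: "bool", op: "AND", children };
}

const ast = conditionsToAst([
  "system.name:deploy-15",
  "system.name:deploy-10",
  "anomaly.tag:database",
]);
console.log(JSON.stringify(ast, null, 2));
```

The two system conditions end up under a single 'or' node, while the tag condition stays a plain assertion next to it under the top-level 'and'.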

fpo

The core of the protobuf definition for expressing the AST looks like the following:

message AssertExpr {
	// op is an enum value: EQ, LT, GT, LTE, etc...
	AssertOp op = 1;
	// entity is based on entities defined in the query model: system,
	// anomaly, etc...
	oneof entity { ... };
}

message BoolExpr {
	// op is an enum value: AND, OR, NOT
	BoolOp op = 1;
	repeated Expr children = 2;
}

// Expr defines a node in our tree that is either a bool expression or
// an assertion expression
message Expr {
	oneof expr {
		BoolExpr bool_expr = 1;
		AssertExpr assert_expr = 2;
	}
}

Here’s an example of an entity that is used in the AssertExpr above:

message Anomaly {
	// field defines which entity field and value is used in the
	// assertion.
	oneof field {
		string tag = 1;
		ClassificationType classification = 2;
		string event = 3;
		// etc...
	}
}

Generate AST by parsing the DSL

We created a parser for a familiar query language (see example below) that generates the AST. This enables us to quickly and fully test the AST generation itself.

(tag:database or tag:service) and not group.name:staging

The above query generates the following AST.

fpo

The parser is implemented in Chevrotain. One of the main selling points of Chevrotain is its playground, which allowed us to iterate quickly on the design. We used the “Calculator separated semantics” example as a starting point and iteratively added new tokens and rules. Initially, the output of the playground was a Lisp representation of what would become the generated AST. After many iterations, we moved the code into our repository and replaced the Lisp generation with the real protobuf AST generation.
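To make the grammar concrete, here is a small hand-written recursive-descent parser for the same kind of query. It is deliberately not Chevrotain and not the production grammar: the `Expr` shape, the `parseQuery` name and the precedence (not binds tighter than and, which binds tighter than or) are assumptions, and values containing spaces or quotes are not handled.

```typescript
type Expr =
  | { kind: "assert"; op: "EQ"; field: string; value: string }
  | { kind: "bool"; op: "AND" | "OR" | "NOT"; children: Expr[] };

function parseQuery(input: string): Expr {
  // Tokens are parentheses or runs of non-space, non-parenthesis characters.
  const tokens: string[] = input.match(/\(|\)|[^\s()]+/g) || [];
  let pos = 0;

  function peek(): string | undefined {
    return tokens[pos];
  }
  function next(): string {
    if (pos >= tokens.length) throw new Error("Unexpected end of query");
    return tokens[pos++];
  }

  // Folds "x op y op z" into one n-ary node; a lone operand is returned as-is.
  function parseChain(keyword: string, op: "AND" | "OR", operand: () => Expr): Expr {
    const children = [operand()];
    while (peek() === keyword) {
      next();
      children.push(operand());
    }
    return children.length === 1 ? children[0] : { kind: "bool", op, children };
  }
  function parseOr(): Expr {
    return parseChain("or", "OR", parseAnd);
  }
  function parseAnd(): Expr {
    return parseChain("and", "AND", parseNot);
  }
  function parseNot(): Expr {
    if (peek() === "not") {
      next();
      return { kind: "bool", op: "NOT", children: [parseNot()] };
    }
    return parsePrimary();
  }
  function parsePrimary(): Expr {
    const token = next();
    if (token === "(") {
      const inner = parseOr();
      if (next() !== ")") throw new Error("Expected ')'");
      return inner;
    }
    const idx = token.indexOf(":");
    if (idx <= 0 || idx === token.length - 1) {
      throw new Error("Expected field:value, got '" + token + "'");
    }
    return { kind: "assert", op: "EQ", field: token.slice(0, idx), value: token.slice(idx + 1) };
  }

  const ast = parseOr();
  if (pos < tokens.length) throw new Error("Unexpected token '" + tokens[pos] + "'");
  return ast;
}

console.log(JSON.stringify(parseQuery("(tag:database or tag:service) and not group.name:staging"), null, 2));
```

A parser library buys you error recovery, better diagnostics and a playground to iterate in; the sketch only shows how the rules nest.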

Example of the query above represented in Lisp form:

(and
	(not
		(eq group.name 'staging'))
	(or
		(eq anomaly.tag 'database')
		(eq anomaly.tag 'service')))
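Producing a Lisp form like this from a tree takes only a few lines. The following is a sketch under assumptions: the `Expr` type is a simplified stand-in for the protobuf messages, and `toLisp` is a hypothetical helper that prints everything on one line.

```typescript
type Expr =
  | { kind: "assert"; op: "EQ"; field: string; value: string }
  | { kind: "bool"; op: "AND" | "OR" | "NOT"; children: Expr[] };

// Renders a node as (op arg arg ...), recursing into boolean children.
function toLisp(expr: Expr): string {
  if (expr.kind === "assert") {
    return "(" + expr.op.toLowerCase() + " " + expr.field + " '" + expr.value + "')";
  }
  const args = expr.children.map((child) => toLisp(child));
  return "(" + expr.op.toLowerCase() + " " + args.join(" ") + ")";
}

// The same tree as the Lisp example, built by hand.
const tree: Expr = {
  kind: "bool",
  op: "AND",
  children: [
    {
      kind: "bool",
      op: "NOT",
      children: [{ kind: "assert", op: "EQ", field: "group.name", value: "staging" }],
    },
    {
      kind: "bool",
      op: "OR",
      children: [
        { kind: "assert", op: "EQ", field: "anomaly.tag", value: "database" },
        { kind: "assert", op: "EQ", field: "anomaly.tag", value: "service" },
      ],
    },
  ],
};

console.log(toLisp(tree));
```

A compact text form like this is convenient in snapshots and in a playground, because a wrong nesting is visible at a glance.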

We used Jest’s snapshot feature to test both the DSL-to-AST generation and the string-conditions-to-AST generation. Each test case produced a snapshot of the AST, which we then inspected to verify that the result was what we expected.

Putting it all together

Our UI needed some minor adjustments to allow the user to select multiple conditions of the same type in the search bar. Once that was done, we added the new conditions parser to the API service. We could keep using the existing endpoint, since the contract didn’t change.

When the API service receives a search request, it extracts the conditions from the request payload and uses the parser to generate an AST. That AST is then traversed in pre-order, generating a SQL ‘where’ clause at each tree node. These clauses are then combined into a single SQL query. In practical terms, we use knex Query Builder and its ability to accept functions as arguments to ‘where’ clauses to build the query.

Here’s a code snippet of what happens when we traverse a leaf node, in this case an assertion that the situation score should be greater than or equal to some given score:

const visitSituationAssert = (op: pb.AssertExpr.Op, situation: pb.Situation): knex.QueryCallback => {
	// field holds the name of the oneof member that is set
	const field = situation.field!;
	switch (field) {
		case "score":
			switch (op) {
				case pb.AssertExpr.Op.GTE:
					return (qb: knex.QueryBuilder) =>
						qb.where("situation.display_score", ">=", situation.score);
				// etc...
			}
		// etc...
	}
};

We then combine all the functions from the leaf nodes of a boolean expression with the appropriate where-clause of the Query Builder, i.e. use ‘builder.orWhere’ for ‘or’ operations, ‘builder.whereNot’ for ‘not’ operations and just ‘builder.where’ for ‘and’ operations. Finally, we execute the builder against our Postgres instance and wait for the results.
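The same combination step can be sketched without knex by rendering the tree into a parameterized ‘where’ fragment with Postgres-style placeholders. This is a stand-in for illustration only: the `Expr` shape, the `toWhere` helper and the `COLUMNS` mapping are assumptions (only the situation.display_score column comes from the snippet above).

```typescript
type AssertExpr = { kind: "assert"; op: "EQ" | "GTE"; field: string; value: string | number };
type BoolExpr = { kind: "bool"; op: "AND" | "OR" | "NOT"; children: Expr[] };
type Expr = AssertExpr | BoolExpr;

// Hypothetical mapping from query-model fields to SQL columns.
const COLUMNS: { [field: string]: string } = {
  "system.name": "system.name",
  "situation.score": "situation.display_score",
};
const OPERATORS = { EQ: "=", GTE: ">=" };

// Renders a node as SQL text and appends its values to params in order.
function toWhere(expr: Expr, params: (string | number)[]): string {
  if (expr.kind === "assert") {
    if (!Object.prototype.hasOwnProperty.call(COLUMNS, expr.field)) {
      throw new Error("Unknown field: " + expr.field);
    }
    params.push(expr.value);
    return COLUMNS[expr.field] + " " + OPERATORS[expr.op] + " $" + params.length;
  }
  if (expr.children.length === 0) throw new Error("Boolean node without children");
  if (expr.op === "NOT" && expr.children.length !== 1) {
    throw new Error("NOT takes exactly one child");
  }
  const parts = expr.children.map((child) => toWhere(child, params));
  if (expr.op === "NOT") return "NOT (" + parts[0] + ")";
  return "(" + parts.join(expr.op === "AND" ? " AND " : " OR ") + ")";
}

const example: Expr = {
  kind: "bool",
  op: "AND",
  children: [
    {
      kind: "bool",
      op: "OR",
      children: [
        { kind: "assert", op: "EQ", field: "system.name", value: "deploy-15" },
        { kind: "assert", op: "EQ", field: "system.name", value: "deploy-10" },
      ],
    },
    { kind: "assert", op: "GTE", field: "situation.score", value: 5 },
  ],
};

const params: (string | number)[] = [];
const where = toWhere(example, params);
console.log(where, params);
```

Values never end up in the SQL text itself, only in the parameter list, which is the same guarantee a query builder gives you.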

If you want to get started writing your own query-based API, start by breaking down your query domain model. Remember that your query domain model is for the user and doesn’t need to fit the internal model of your data.

Happy to answer any questions on Twitter. Find me at @mchlstckl, or chat to my colleagues at @unomaly.

