The Tree Query Language

Query Syntax

A query is a pattern that matches a certain set of nodes in a tree. The expression to match a given node consists of a pair of brackets ([]) containing two things: the node's kind, and optionally, a series of other patterns that match the node's children. For example, this pattern would match any MultiplicativeExpression node that has two children Expression nodes, with an Asterisk node in between:

[MultiplicativeExpression [Expression] [Asterisk] [Expression]]

The children of a node can optionally be labeled. The label is a property of the edge from the node to the child, and is not a property of the child. For example, this pattern will match a MultiplicativeExpression node with the two Expression children labeled left_operand and right_operand:

[MultiplicativeExpression left_operand:[Expression] [Asterisk] right_operand:[Expression]]
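
To make the edge-label idea concrete, here is a minimal Python sketch of a toy tree in which each child is stored together with the label of the edge leading to it. The `Node` class and the `match_children` helper are hypothetical names invented for this illustration, and the tree shape is assumed; this is not the query engine itself, only a model of how a labeled child pattern can be checked against an ordered list of children.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    kind: str
    # Each entry is (edge_label, child): the label belongs to the edge, not the child.
    children: List[Tuple[Optional[str], "Node"]] = field(default_factory=list)


def match_children(children, wanted) -> bool:
    """Return True if `wanted` matches an ordered subsequence of `children`.

    `wanted` is a list of (label_or_None, kind) pairs; a label of None means
    "any label" and a kind of "_" means "any kind".
    """
    if not wanted:
        return True
    if not children:
        return False
    (label, child), (want_label, want_kind) = children[0], wanted[0]
    label_ok = want_label is None or want_label == label
    kind_ok = want_kind == "_" or want_kind == child.kind
    if label_ok and kind_ok and match_children(children[1:], wanted[1:]):
        return True
    # Otherwise skip this child and try the same pattern on later siblings.
    return match_children(children[1:], wanted)


# Toy tree for a multiplication (assumed shape, for illustration only).
product = Node("MultiplicativeExpression", [
    ("left_operand", Node("Expression")),
    ("operator", Node("Asterisk")),
    ("right_operand", Node("Expression")),
])

labeled = [("left_operand", "Expression"), (None, "Asterisk"), ("right_operand", "Expression")]
swapped = [("right_operand", "Expression"), (None, "Asterisk"), ("left_operand", "Expression")]

labeled_ok = match_children(product.children, labeled)
swapped_ok = match_children(product.children, swapped)
print(labeled_ok, swapped_ok)  # True False
```

The two Expression children have the same kind, so only the edge labels tell them apart: swapping the labels in the pattern makes it fail against the same tree.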

You can also match a node's textual content using a string literal. For example, this pattern would match a MultiplicativeExpression whose operator is *; the edge labels and the operand wildcards ([_], which match a node of any kind) are spelled out for clarity:

[MultiplicativeExpression left_operand:[_] operator:["*"] right_operand:[_]]

If you don't care about the kind of a node, you can use an underscore _, which matches any kind. For example, this pattern will match a MultiplicativeExpression node that has a child of any kind labeled left_operand, followed by another child of any kind with any label:

[MultiplicativeExpression left_operand:[_] [_]]

Children can be elided: a pattern does not have to list every child of a node. For example, this pattern matches a MultiplicativeExpression in which at least one child is an Expression of the StringExpression variant. It produces one match per such child, so a node with two StringExpression operands yields two matches:

[MultiplicativeExpression [Expression [StringExpression]]]
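
The "one match per child" behavior can be sketched with a small Python generator. The `Node` class and `match_nested` function below are hypothetical and only model the rule described above: siblings that the pattern does not mention are skipped, and every child that fits produces its own match.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)


def match_nested(node: Node, kinds: List[str]) -> Iterator[List[Node]]:
    """Yield one list of nodes for every way `kinds` matches downwards from `node`.

    Siblings that the pattern does not mention are simply skipped over.
    """
    if node.kind != kinds[0]:
        return
    if len(kinds) == 1:
        yield [node]
        return
    for child in node.children:
        for rest in match_nested(child, kinds[1:]):
            yield [node] + rest


# Two toy trees (assumed shapes): both operands are strings, or only the right one is.
both_strings = Node("MultiplicativeExpression", [
    Node("Expression", [Node("StringExpression")]),
    Node("Asterisk"),
    Node("Expression", [Node("StringExpression")]),
])
one_string = Node("MultiplicativeExpression", [
    Node("Expression", [Node("Identifier")]),
    Node("Asterisk"),
    Node("Expression", [Node("StringExpression")]),
])

pattern = ["MultiplicativeExpression", "Expression", "StringExpression"]
two_matches = list(match_nested(both_strings, pattern))
one_match = list(match_nested(one_string, pattern))
print(len(two_matches), len(one_match))  # 2 1
```

Each match ends in a different StringExpression node, which is what "each match is associated with" a particular child means in this toy model.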

Trivia nodes (whitespace, comments, etc.) will be skipped over when running a query. Furthermore, trivia nodes cannot be explicitly (or implicitly with _) matched by queries.

Capturing Nodes

When matching patterns, you may want to process specific nodes within the pattern. Captures allow you to associate names with specific nodes in a pattern, so that you can later refer to those nodes by those names. Capture names are written before the nodes that they refer to, and start with an @ character.

For example, this pattern would match any struct definition and it would associate the name struct_name with the identifier:

[StructDefinition @struct_name name:[Identifier]]

And this pattern would match all event definitions in a contract, associating the name event_name with the event name and contract_name with the name of the containing contract:

[ContractDefinition
    @contract_name name:[Identifier]
    members:[ContractMembers
        [ContractMember
            [EventDefinition @event_name name:[Identifier]]
        ]
    ]
]
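
As a rough illustration of what captures give you, the following Python sketch walks a toy tree by hand in the same shape as the event query above and returns one dictionary of captured names per match. Everything here is an assumption made for the example: the `Node` class, the helper functions, the edge labels used inside the toy tree, and the choice to record the captured text rather than the node. It is not how the query engine is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    kind: str
    text: str = ""
    # (edge_label, child) pairs, in source order.
    children: List[Tuple[Optional[str], "Node"]] = field(default_factory=list)


def labeled(node: Node, label: str, kind: str) -> List[Node]:
    """Children of `node` reached through an edge named `label` and having `kind`."""
    return [child for (edge, child) in node.children if edge == label and child.kind == kind]


def of_kind(node: Node, kind: str) -> List[Node]:
    """Children of `node` having `kind`, whatever their edge label."""
    return [child for (_, child) in node.children if child.kind == kind]


def event_matches(contract: Node) -> List[Dict[str, str]]:
    """Hand-written equivalent of the event query: one dict of captures per match."""
    matches: List[Dict[str, str]] = []
    if contract.kind != "ContractDefinition":
        return matches
    for contract_name in labeled(contract, "name", "Identifier"):
        for members in labeled(contract, "members", "ContractMembers"):
            for member in of_kind(members, "ContractMember"):
                for event in of_kind(member, "EventDefinition"):
                    for event_name in labeled(event, "name", "Identifier"):
                        matches.append({
                            "contract_name": contract_name.text,
                            "event_name": event_name.text,
                        })
    return matches


def event_member(name: str) -> Node:
    """Build a toy ContractMember wrapping an EventDefinition with the given name."""
    definition = Node("EventDefinition", children=[("name", Node("Identifier", name))])
    return Node("ContractMember", children=[("variant", definition)])


token = Node("ContractDefinition", children=[
    ("name", Node("Identifier", "Token")),
    ("members", Node("ContractMembers", children=[
        (None, event_member("Transfer")),
        (None, Node("ContractMember", children=[("variant", Node("FunctionDefinition"))])),
        (None, event_member("Approval")),
    ])),
])

captures = event_matches(token)
print(captures)
```

The function member is skipped because it does not contain an EventDefinition, and the same contract name appears in every match because each match pairs it with a different event.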

Quantification

You can surround a sequence of patterns in parentheses (()), followed by a ?, * or + operator. The ? operator matches zero or one repetitions of a pattern, the * operator matches zero or more, and the + operator matches one or more.
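
The repetition counts that each quantifier accepts can be summarized in a few lines of Python. This is only a sketch of the bounds stated above; the `BOUNDS` table and `repetitions_allowed` function are hypothetical names, and the sketch says nothing about how captures inside a quantified group are reported.

```python
from typing import Dict, Optional, Tuple

# (minimum, maximum) repetitions per quantifier; None means "no upper bound".
BOUNDS: Dict[str, Tuple[int, Optional[int]]] = {
    "?": (0, 1),
    "*": (0, None),
    "+": (1, None),
}


def repetitions_allowed(quantifier: str, count: int) -> bool:
    """Return True if `count` repetitions satisfy the given quantifier."""
    low, high = BOUNDS[quantifier]
    return count >= low and (high is None or count <= high)


# Which of the counts 0..3 does each quantifier accept?
summary = {q: [n for n in range(4) if repetitions_allowed(q, n)] for q in BOUNDS}
print(summary)  # {'?': [0, 1], '*': [0, 1, 2, 3], '+': [1, 2, 3]}
```

In particular, a group followed by + never matches an empty sequence, while the same group followed by ? or * does.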

For example, this pattern would match a sequence of one or more import directives at the top of the file:

[SourceUnit members:[_ ([_ @import [ImportDirective]])+]]

This pattern would match a structure definition with one or more members, capturing their names:

[StructDefinition
    @name name:[_]
    members:[_ ([_ @member [Identifier]])+]
]

This pattern would match all function calls, capturing a string argument if one was present:

[FunctionCallExpression
    arguments:[ArgumentsDeclaration
        variant:[PositionalArgumentsDeclaration
            arguments:[PositionalArguments
                (@arg [Expression variant:[StringExpression]])?
            ]
        ]
    ]
]

Alternations

An alternation is written as a sequence of patterns separated by | and surrounded by parentheses.

For example, this pattern would match a call to either a variable or an object property. A variable is captured as @function, and a property is captured as @method:

[FunctionCallExpression
    operand:[Expression
        (@function variant:[Identifier]
        | @method variant:[MemberAccessExpression])
    ]
]
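
The effect of this alternation can be modeled in Python as trying each alternative in turn and recording the capture name that belongs to the one that fits. The `Node` class, the `ALTERNATIVES` table, and the `classify_call` and `call_on` helpers are hypothetical, and returning only the first fitting alternative is a simplification chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    kind: str
    # (edge_label, child) pairs, in source order.
    children: List[Tuple[Optional[str], "Node"]] = field(default_factory=list)


# Each alternative pairs a capture name with the kind its `variant` child must have.
ALTERNATIVES = [("function", "Identifier"), ("method", "MemberAccessExpression")]


def classify_call(call: Node) -> Optional[Dict[str, Node]]:
    """Return {capture_name: node} for the first alternative that fits, else None."""
    if call.kind != "FunctionCallExpression":
        return None
    for label, operand in call.children:
        if label != "operand" or operand.kind != "Expression":
            continue
        for capture_name, wanted_kind in ALTERNATIVES:
            for edge, variant in operand.children:
                if edge == "variant" and variant.kind == wanted_kind:
                    return {capture_name: variant}
    return None


def call_on(variant_kind: str) -> Node:
    """Build a toy call whose operand is an Expression of the given variant."""
    operand = Node("Expression", [("variant", Node(variant_kind))])
    return Node("FunctionCallExpression", [("operand", operand)])


plain = classify_call(call_on("Identifier"))
member = classify_call(call_on("MemberAccessExpression"))
other = classify_call(call_on("StringExpression"))
print(sorted(plain), sorted(member), other)  # ['function'] ['method'] None
```

A call whose operand is neither an Identifier nor a MemberAccessExpression fits no alternative, so the toy matcher reports no match at all.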

This pattern would match a set of possible keyword terminals, capturing them as @keyword:

@keyword (
    ["break"]
  | ["delete"]
  | ["else"]
  | ["for"]
  | ["function"]
  | ["if"]
  | ["return"]
  | ["try"]
  | ["while"]
)

Adjacency

By using the adjacency operator . you can constrain a pattern to match only the first or the last child node of its parent.

For example, the following pattern would match only the first parameter declaration in a function definition:

[FunctionDefinition
    [ParametersDeclaration
        [Parameters . @first_param [Parameter]]
    ]
]

And conversely the following will match only the last parameter:

[FunctionDefinition
    [ParametersDeclaration
        [Parameters @last_param [Parameter] .]
    ]
]

If the adjacency operator is used between two patterns, it constrains the two patterns to match consecutive nodes, i.e. without any other sibling node in between. For example, this pattern matches pairs of consecutive statements:

[Statements @stmt1 [Statement] . @stmt2 [Statement]]
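
The three uses of the adjacency operator can be compared on a plain Python list standing in for the children of a Statements node. This is a sketch under assumptions: the helper names are hypothetical, statements are represented by strings, and the "without the operator" case assumes that other siblings may sit between two patterns, as the rule on elided children suggests.

```python
from itertools import combinations
from typing import List, Tuple


def adjacent_pairs(children: List[str]) -> List[Tuple[str, str]]:
    """Pairs of siblings with nothing in between, as with `A . B`."""
    return list(zip(children, children[1:]))


def ordered_pairs(children: List[str]) -> List[Tuple[str, str]]:
    """Pairs of siblings in source order, possibly with others in between, as with `A B`."""
    return list(combinations(children, 2))


statements = ["s1", "s2", "s3"]

first_only = statements[:1]    # `. A` anchors the pattern to the first child
last_only = statements[-1:]    # `A .` anchors the pattern to the last child
consecutive = adjacent_pairs(statements)
any_distance = ordered_pairs(statements)

print(first_only, last_only)  # ['s1'] ['s3']
print(consecutive)            # [('s1', 's2'), ('s2', 's3')]
print(any_distance)           # [('s1', 's2'), ('s1', 's3'), ('s2', 's3')]
```

The pair ('s1', 's3') appears only in the unanchored case, which is exactly the kind of match the adjacency operator rules out.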