The Tree Query Language
Query Syntax
A query is a pattern that matches a certain set of nodes in a tree. The expression to match a given node consists of a pair of brackets ([]) containing two things: the node's kind and, optionally, a series of other patterns that match the node's children. For example, this pattern would match any MultiplicativeExpression node that has two Expression child nodes with an Asterisk node in between:
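A minimal sketch of such a pattern, using only the bracket syntax and the node kinds named in the description above:

```
[MultiplicativeExpression [Expression] [Asterisk] [Expression]]
```

The outer brackets name the parent's kind, and each nested bracket pair is a pattern for one child, in order.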
The children of a node can optionally be labeled. A label is a property of the edge from the node to the child, not a property of the child itself. For example, this pattern will match a MultiplicativeExpression node with its two Expression children labeled left_operand and right_operand:
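A sketch of how this could be written, assuming the label:[Kind] form that the later examples in this section use (for instance name:[Identifier]):

```
[MultiplicativeExpression left_operand:[Expression] [Asterisk] right_operand:[Expression]]
```

The Asterisk child is left unlabeled here, since labels are optional.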
You can also match a node's textual content using a string literal. For example, this pattern would match a MultiplicativeExpression, with the * operator written as a string literal for clarity:
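A sketch of such a pattern, assuming the ["text"] form for string literals that the keyword example later in this section uses:

```
[MultiplicativeExpression [Expression] ["*"] [Expression]]
```

Here the middle child is matched by its text rather than by its kind.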
If you don't care about the kind of a node, you can use an underscore (_), which matches any kind. For example, this pattern will match a MultiplicativeExpression node with two children, one of any kind labeled left_operand and one of any kind:
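A sketch of such a pattern, combining the underscore with the label syntax described earlier:

```
[MultiplicativeExpression left_operand:[_] [_]]
```

The first child must carry the left_operand label; the second may have any kind and any label.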
Children can be elided. For example, this pattern would match a MultiplicativeExpression where at least one of the children is an Expression of the StringExpression variant. It produces multiple matches when there is more than one such child, one match per StringExpression child:
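A sketch of such a pattern, assuming the StringExpression node is a direct child of the Expression node (as in the function call example under Quantification):

```
[MultiplicativeExpression [Expression [StringExpression]]]
```

The operator and the other operand are simply not mentioned, which is what eliding children means here.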
Trivia nodes (whitespace, comments, etc.) are skipped over when running a query. Furthermore, trivia nodes cannot be matched by queries, either explicitly or implicitly with _.
Capturing Nodes
When matching patterns, you may want to process specific nodes within the pattern. Captures allow you to associate names with specific nodes in a pattern, so that you can later refer to those nodes by those names. Capture names are written before the nodes that they refer to, and start with an @ character.
For example, this pattern would match any struct definition and associate the name struct_name with the identifier:
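A sketch of such a pattern. The node kind StructDefinition and the name label are assumptions, chosen by analogy with the EventDefinition example in this section:

```
[StructDefinition @struct_name name:[Identifier]]
```

The capture @struct_name is placed before the label and node it refers to.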
And this pattern would match all event definitions for a contract, associating the name event_name with the event name and contract_name with the containing contract name:
    [ContractDefinition
        @contract_name name:[Identifier]
        members:[ContractMembers
            [ContractMember
                [EventDefinition @event_name name:[Identifier]]
            ]
        ]
    ]
Quantification
You can surround a sequence of patterns in parentheses (()), followed by a ?, * or + operator. The ? operator matches zero or one repetitions of a pattern, the * operator matches zero or more, and the + operator matches one or more.
For example, this pattern would match a sequence of one or more import directives at the top of the file:
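A sketch of such a pattern, assuming import directives are represented by a node kind named ImportDirective:

```
([ImportDirective])+
```

The + quantifier applies to the whole parenthesized group, so the pattern matches a run of one or more such nodes.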
This pattern would match a structure definition with one or more members, capturing their names:
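A sketch of such a pattern. The node kinds StructDefinition, StructMembers and StructMember, and the name and members labels, are assumptions modeled on the contract example under Capturing Nodes:

```
[StructDefinition
    @name name:[Identifier]
    members:[StructMembers
        ([StructMember @member name:[Identifier]])+
    ]
]
```

Because the quantified group contains the @member capture, each repetition contributes one captured member name.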
This pattern would match all function calls, capturing a string argument if one was present:
    [FunctionCallExpression
        arguments:[ArgumentsDeclaration
            variant:[PositionalArgumentsDeclaration
                arguments:[PositionalArguments
                    (@arg [Expression variant:[StringExpression]])?
                ]
            ]
        ]
    ]
Alternations
An alternation is written as a sequence of patterns separated by | and surrounded by parentheses.
For example, this pattern would match a call to either a variable or an object property. A variable is captured as @function, and a property is captured as @method:
    [FunctionCallExpression
        operand:[Expression
            (@function variant:[Identifier]
            | @method variant:[MemberAccessExpression])
        ]
    ]
This pattern would match a set of possible keyword terminals, capturing them as @keyword:
    @keyword (
        ["break"]
        | ["delete"]
        | ["else"]
        | ["for"]
        | ["function"]
        | ["if"]
        | ["return"]
        | ["try"]
        | ["while"]
    )
Adjacency
By using the adjacency operator (.), you can constrain a pattern to match only the first or the last child nodes.
For example, the following pattern would match only the first parameter declaration in a function definition:
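A sketch of such a pattern. The node kinds FunctionDefinition, ParametersDeclaration, Parameters and Parameter are assumptions about the grammar, not taken from the text above:

```
[FunctionDefinition
    [ParametersDeclaration
        [Parameters
            .
            @first_param [Parameter]
        ]
    ]
]
```

Placing the . before the child pattern anchors that pattern to the first child of Parameters.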
And conversely the following will match only the last parameter:
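A sketch of the converse case, again assuming the node kinds FunctionDefinition, ParametersDeclaration, Parameters and Parameter:

```
[FunctionDefinition
    [ParametersDeclaration
        [Parameters
            @last_param [Parameter]
            .
        ]
    ]
]
```

Here the . follows the child pattern, anchoring it to the last child of Parameters.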
If the adjacency operator is used between two patterns, it constrains the two matches to occur consecutively, i.e. without any other sibling node in between. For example, this pattern matches pairs of consecutive statements:
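A sketch of such a pattern, assuming statements are Statement nodes held by a parent node of kind Statements (both names are assumptions), with hypothetical capture names:

```
[Statements
    @stmt1 [Statement]
    .
    @stmt2 [Statement]
]
```

Without the . between them, the two Statement patterns could match siblings that have other statements in between.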