6.1. Query Syntax
It's often more convenient to use the declarative Query API to traverse the CST, as queries let you express your intent more concisely and can largely replace both internal (cursor) and external (visitor) iterator patterns.
The query engine performs pattern matching, and the execution semantics are closer to unification than to regular expression matching. A query returns all possible matches, not just the longest/shortest/first/last match.
Matching
A query is a pattern that matches a certain set of nodes in a tree. The expression to match a given node consists of a pair of brackets ([]) containing two things: the node's kind, and optionally, a series of other patterns that match the node's children. For example, this pattern would match any MultiplicativeExpression node that has two Expression child nodes with an Asterisk node in between:
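A pattern fitting this description, written only with the node kinds named above, might look like the following sketch:

```text
[MultiplicativeExpression
  [Expression]
  [Asterisk]
  [Expression]
]
```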
The children of a node can optionally be labeled. The label is a property of the edge from the node to the child, and is not a property of the child. For example, this pattern will match a MultiplicativeExpression node with the two Expression children labeled left_operand and right_operand:
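With the two labels named above, such a pattern could be sketched as follows. Each label precedes the child pattern it applies to and is followed by a colon:

```text
[MultiplicativeExpression
  left_operand: [Expression]
  [Asterisk]
  right_operand: [Expression]
]
```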
You can also match a node's textual content using a string literal. For example, this pattern would match a MultiplicativeExpression whose operator is spelled out as the literal * (for clarity):
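A sketch of such a pattern, with the operator matched by its text instead of by the Asterisk kind. The string literal is bracketed like any other node pattern:

```text
[MultiplicativeExpression
  left_operand: [Expression]
  ["*"]
  right_operand: [Expression]
]
```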
If you don't care about the kind of a node, you can use an underscore _, which matches any kind. For example, this pattern will match a MultiplicativeExpression node with any two children of any kind, as long as one of them is labeled left_operand:
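A sketch of a pattern with that shape, where both children use _ and only the first is constrained by a label:

```text
[MultiplicativeExpression
  left_operand: [_]
  [_]
]
```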
Children can be elided. For example, this would produce multiple matches for a MultiplicativeExpression where at least one of the children is an Expression with a StringExpression variant, with one match for each such StringExpression child:
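A sketch of such a pattern follows. It names a single Expression child and leaves out the operator and the other operand:

```text
[MultiplicativeExpression
  [Expression
    [StringExpression]
  ]
]
```

Because the remaining children are elided, the Expression pattern is free to bind to either operand, which is why a node with two string operands would be expected to produce two matches.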
Trivia nodes (whitespace, comments, etc.) will be skipped over when running a query. Furthermore, trivia nodes cannot be explicitly (or implicitly with _) matched by queries.
Capturing
When matching patterns, you may want to process specific nodes within the pattern. Captures allow you to associate names with specific nodes in a pattern, so that you can later refer to those nodes by those names. Capture names are written before the nodes that they refer to, and start with an @ character.
For example, this pattern would match any struct definition and associate the name struct_name with its identifier:
And this pattern would match all event definitions in a contract, associating the name event_name with the event name and contract_name with the name of the containing contract:
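A sketch of such a pattern, assuming the node kind for struct definitions is StructDefinition and that its identifier sits under a name label (both are assumptions, chosen by analogy with the ContractDefinition example):

```text
[StructDefinition
  @struct_name name: [Identifier]
]
```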
[ContractDefinition
  @contract_name name: [Identifier]
  members: [ContractMembers
    [ContractMember
      [EventDefinition
        @event_name name: [Identifier]
      ]
    ]
  ]
]
Quantification
You can surround a sequence of patterns in parentheses (()), followed by a ?, * or + operator. The ? operator matches zero or one repetitions of a pattern, the * operator matches zero or more, and the + operator matches one or more.
For example, this pattern would match a sequence of one or more import directives at the top of the file:
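A sketch of what such a pattern could look like. The node kinds SourceUnit, SourceUnitMembers, SourceUnitMember and ImportDirective are assumptions about the grammar and do not come from the text above:

```text
[SourceUnit
  [SourceUnitMembers
    ([SourceUnitMember [ImportDirective]])+
  ]
]
```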
This pattern would match a structure definition with one or more members, capturing their names:
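A sketch of such a pattern, assuming the node kinds StructDefinition, StructMembers and StructMember and the labels name and members (all hypothetical here, modeled on the contract example in the Capturing section):

```text
[StructDefinition
  @name name: [Identifier]
  members: [StructMembers
    ([StructMember @member name: [Identifier]])+
  ]
]
```

The quantified group is what allows one or more members to be matched, with @member bound to each member's identifier.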
This pattern would match all function calls, capturing a string argument if one is present:
[FunctionCallExpression
  arguments: [ArgumentsDeclaration
    variant: [PositionalArgumentsDeclaration
      arguments: [PositionalArguments
        (@arg [Expression variant: [StringExpression]])?
      ]
    ]
  ]
]
Alternation
An alternation is written as a sequence of patterns separated by | and surrounded by parentheses.
For example, this pattern would match a call to either a variable or an object property. In the case of a variable, capture it as @function, and in the case of a property, capture it as @method:
[FunctionCallExpression
  operand: [Expression
    (
        @function variant: [Identifier]
      | @method variant: [MemberAccessExpression]
    )
  ]
]
This pattern would match a set of possible keyword terminals, capturing them as @keyword:
@keyword (
    ["break"]
  | ["delete"]
  | ["else"]
  | ["for"]
  | ["function"]
  | ["if"]
  | ["return"]
  | ["try"]
  | ["while"]
)
Adjacency
By using the adjacency operator . you can constrain a pattern to match only the first or the last child node.
For example, the following pattern would match only the first parameter declaration in a function definition:
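A sketch of such a pattern, assuming the node kinds FunctionDefinition, ParametersDeclaration, Parameters and Parameter (hypothetical names, not taken from the text above). Placing the . before the child pattern anchors it to the first position:

```text
[FunctionDefinition
  [ParametersDeclaration
    [Parameters
      .
      @first_param [Parameter]
    ]
  ]
]
```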
And conversely the following will match only the last parameter:
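A sketch of the converse, again assuming the node kinds FunctionDefinition, ParametersDeclaration, Parameters and Parameter (hypothetical names). Here the . follows the child pattern, anchoring it to the last position:

```text
[FunctionDefinition
  [ParametersDeclaration
    [Parameters
      @last_param [Parameter]
      .
    ]
  ]
]
```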
If the adjacency operator is used between two patterns, it constrains matches on both patterns to occur consecutively, i.e. without any other sibling node in between. For example, this pattern matches pairs of consecutive statements:
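A sketch of such a pattern, assuming statements are Statement nodes held in a Statements parent (both kinds are assumptions), with the capture names @stmt1 and @stmt2 chosen for illustration:

```text
[Statements
  @stmt1 [Statement]
  .
  @stmt2 [Statement]
]
```

Since a query returns all possible matches, a block of three statements would be expected to yield two overlapping pairs: the first with the second, and the second with the third.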