Using Queries

It's often more convenient to traverse the CST with the declarative Query API, as it lets you express your intent more concisely and can largely replace both internal (cursor) and external (visitor) iterator patterns.

The query language is based on pattern matching, and the execution semantics are closer to unification than to regular expression matching. A query returns all possible matches, not just the longest/shortest/first/last match.
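To make the "all possible matches" behaviour concrete, here is a toy model of it. It is only a sketch under stated assumptions: `ToyNode` and `matchParentChild` are hypothetical names invented for this illustration and are not part of the Query API, and the real engine is considerably more general. The point is that a single pattern of the form "parent containing child" yields one match per matching child, rather than stopping at the first one.

```typescript
// Hypothetical toy node shape, for illustration only (not a Slang type).
interface ToyNode {
  kind: string;
  text: string;
  children: ToyNode[];
}

// Returns one [parent, child] pair for every direct child of kind `childKind`
// under every node of kind `parentKind`, in depth-first pre-order.
function matchParentChild(root: ToyNode, parentKind: string, childKind: string): Array<[ToyNode, ToyNode]> {
  const matches: Array<[ToyNode, ToyNode]> = [];

  const visit = (node: ToyNode): void => {
    if (node.kind === parentKind) {
      for (const child of node.children) {
        if (child.kind === childKind) {
          matches.push([node, child]);
        }
      }
    }
    // Keep descending so that nested parents are matched as well.
    node.children.forEach((child) => visit(child));
  };

  visit(root);
  return matches;
}

const leaf = (kind: string, text: string): ToyNode => ({ kind, text, children: [] });

const toyTree: ToyNode = {
  kind: "Tuple",
  text: "(a, b, c)",
  children: [leaf("Member", "a"), leaf("Comma", ","), leaf("Member", "b"), leaf("Comma", ","), leaf("Member", "c")],
};

// The same parent node takes part in three separate matches.
const matchedTexts = matchParentChild(toyTree, "Tuple", "Member").map(([, child]) => child.text);
console.log(matchedTexts); // [ 'a', 'b', 'c' ]
```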

Unless specified otherwise, let's assume we have already parsed a Solidity source file and have a cursor pointing to the root node of the CST (created with createTreeCursor, see Using the Cursor).

Creating and executing queries

You can create a Query object using Query.parse, which accepts a string value. The resulting object can then be passed to Cursor.query to execute it.

You can pass multiple queries to a cursor to efficiently traverse the tree looking for matches. They will be executed concurrently, returning matches in the order they appear in the input.

// Any `Cursor` can be used to create a query.
const cursor = parseOutput.createTreeCursor();

const query = Query.parse("[ContractDefinition]");
const matches: QueryMatchIterator = cursor.query([query]);

Iterating over node patterns

Queries allow you to iterate over all nodes that match the query pattern, which can replace manual iteration via cursors or visitors. To get a Cursor that points to a matched node, you need to bind a named capture (@capture_name) to a specific node in the query pattern.

Let's use this to list all the contract definitions in the source file:

input.sol
contract Foo {}
contract Bar {}
contract Baz {}

const found = [];

const query = Query.parse("@contract [ContractDefinition]");
const matches = cursor.query([query]);

for (const match of matches) {
  const cursor = match.captures["contract"]![0]!;

  assertIsNonterminalNode(cursor.node);
  found.push(cursor.node.unparse().trim());
}

assert.deepStrictEqual(found, ["contract Foo {}", "contract Bar {}", "contract Baz {}"]);
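The loop above recurs in most of the following examples, so it can be factored into a small helper. The sketch below is an assumption-laden illustration, not part of the library: `collectCaptureText` is a hypothetical name, and `NodeLike`, `CursorLike` and `MatchLike` are minimal structural types that mirror only the members used in the snippet above (`captures`, `node`, `unparse`), so that the example runs on its own with stub data.

```typescript
// Minimal structural types covering only the members used above.
// These are assumptions for this sketch, not the library's exported types.
interface NodeLike {
  unparse(): string;
}
interface CursorLike {
  node: NodeLike;
}
interface MatchLike {
  captures: { [name: string]: CursorLike[] };
}

// Collects the trimmed source text of every node bound to `captureName`.
function collectCaptureText(matches: Iterable<MatchLike>, captureName: string): string[] {
  const texts: string[] = [];
  for (const match of matches) {
    // A capture may be missing from a match, so fall back to an empty list.
    for (const cursor of match.captures[captureName] ?? []) {
      texts.push(cursor.node.unparse().trim());
    }
  }
  return texts;
}

// Stub matches standing in for the result of a real query.
const stubMatch = (text: string): MatchLike => ({
  captures: { contract: [{ node: { unparse: () => text } }] },
});

const contractTexts = collectCaptureText([stubMatch("contract Foo {}\n"), stubMatch("contract Bar {}\n")], "contract");
console.log(contractTexts); // [ 'contract Foo {}', 'contract Bar {}' ]
```

Unlike the non-null assertions in the snippet above, the helper skips matches that lack the capture instead of throwing, which may or may not be what you want in a linter.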

Multiple patterns simultaneously

We can also pass multiple queries in a single Cursor.query call, which returns the matches for all of them; match.queryNumber tells us which query produced each match. This is useful when you want to match multiple types of nodes in a single pass.

const names = [];

const structDefinition = Query.parse("[StructDefinition @name [Identifier]]");
const enumDefinition = Query.parse("[EnumDefinition @name [Identifier]]");
const matches = cursor.query([structDefinition, enumDefinition]);

for (const match of matches) {
  const index = match.queryNumber;
  const cursor = match.captures["name"]![0]!;

  names.push([index, cursor.node.unparse()]);
}

assert.deepStrictEqual(names, [
  [0, "Foo"],
  [1, "Bar"],
  [0, "Baz"],
  [1, "Qux"],
]);
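Since matches from different queries arrive interleaved in source order, it is often handy to bucket them by query afterwards. The following is a minimal sketch of that step, assuming only the `queryNumber` and `captures` members seen in the loop above. `groupNamesByQuery` and the `NumberedMatch` type are hypothetical names introduced for this illustration, and the stub data reproduces the result asserted above.

```typescript
// Structural stand-ins for the members used above (assumptions, not library types).
interface NumberedMatch {
  queryNumber: number;
  captures: { [name: string]: Array<{ node: { unparse(): string } }> };
}

// Buckets the text of a capture by the index of the query that produced the match.
function groupNamesByQuery(matches: Iterable<NumberedMatch>, captureName: string): Map<number, string[]> {
  const groups = new Map<number, string[]>();
  for (const match of matches) {
    const bucket = groups.get(match.queryNumber) ?? [];
    for (const cursor of match.captures[captureName] ?? []) {
      bucket.push(cursor.node.unparse());
    }
    groups.set(match.queryNumber, bucket);
  }
  return groups;
}

// Stub matches in the same order as the result shown above.
const stubNamed = (queryNumber: number, name: string): NumberedMatch => ({
  queryNumber,
  captures: { name: [{ node: { unparse: () => name } }] },
});

const grouped = groupNamesByQuery(
  [stubNamed(0, "Foo"), stubNamed(1, "Bar"), stubNamed(0, "Baz"), stubNamed(1, "Qux")],
  "name",
);
console.log(grouped.get(0)); // [ 'Foo', 'Baz' ]  (struct names, query 0)
console.log(grouped.get(1)); // [ 'Bar', 'Qux' ]  (enum names, query 1)
```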

Matching on node's label

We can match not only on the node's kind, but also on its label. This is useful when two children have the same kind but different labels, or simply to make the query more declarative.

To do so, we use the label: [_] syntax, as in type_name:[_] below. Here, _ matches a node of any kind, as long as it has the given label.

input.sol
contract Example {
    function foo() public {
        (uint a, uint16 b, uint64 c, uint256 d) = (1, 2, 3, 4);
    }
}

const names = [];

const query = Query.parse("[TypedTupleMember @type type_name:[_]]");
const matches = cursor.query([query]);

for (const match of matches) {
  const cursor = match.captures["type"]![0]!;

  names.push(cursor.node.unparse());
}

assert.deepStrictEqual(names, ["uint", " uint16", " uint64", " uint256"]);

Matching on node's literal content

Lastly, we can also match on the node's literal content. This can be useful when you want to match a specific identifier, string, or number.

Let's say we want our code to be explicit and prefer uint256 over uint. To find all instances of the uint alias we could do the following:

input.sol
contract Example {
    function foo() public {
        (uint a, uint16 b, uint64 c, uint256 d) = (1, 2, 3, 4);
    }
}

const names = [];

const query = Query.parse(`[ElementaryType @uint_keyword variant:["uint"]]`);
const matches = cursor.query([query]);

for (const match of matches) {
  const cursor = match.captures["uint_keyword"]![0]!;

  names.push(cursor.node.unparse());
}

assert.deepStrictEqual(names, ["uint"]);
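Once the aliases are found, a linter would typically suggest the explicit spelling. Here is a minimal sketch of that mapping, relying only on the fact that in Solidity `uint` and `int` are aliases for `uint256` and `int256`. The names `EXPLICIT_TYPES` and `toExplicitType` are hypothetical, and the function works on plain strings such as those returned by `unparse()`; it does not touch the CST.

```typescript
// Alias-to-explicit mapping. A Map is used rather than a plain object so that
// lookups such as "constructor" cannot hit inherited properties.
const EXPLICIT_TYPES = new Map<string, string>([
  ["uint", "uint256"],
  ["int", "int256"],
]);

// Returns the explicit spelling for an alias, or the trimmed input unchanged.
function toExplicitType(typeName: string): string {
  const trimmed = typeName.trim();
  return EXPLICIT_TYPES.get(trimmed) ?? trimmed;
}

// Applied to the type names collected in the previous section
// (note the leading whitespace in some of them).
const explicitNames = ["uint", " uint16", " uint64", " uint256"].map((name) => toExplicitType(name));
console.log(explicitNames); // [ 'uint256', 'uint16', 'uint64', 'uint256' ]
```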

Example: Finding tx.origin patterns

As a more realistic example, let's say we want to write a linter that unconditionally lints against all tx.origin accesses.

Let's use the motivating example from https://soliditylang.org:

input.sol
// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.7.0 <0.9.0;
// THIS CONTRACT CONTAINS A BUG - DO NOT USE
contract TxUserWallet {
    address owner;

    constructor() {
        owner = msg.sender;
    }

    function transferTo(address payable dest, uint amount) public {
        // THE BUG IS RIGHT HERE, you must use msg.sender instead of tx.origin
        require(tx.origin == owner);
        dest.transfer(amount);
    }
}

Now, we can use the above features to write a query that matches all tx.origin patterns:

const query = Query.parse(`
@txorigin [MemberAccessExpression
  [Expression @start ["tx"]]
  ["origin"]
]`);

const matches = cursor.query([query]);
const found = [];

for (const match of matches) {
  const cursor = match.captures["txorigin"]![0]!;

  found.push([cursor.textOffset.utf8, cursor.node.unparse()]);
}

assert.deepStrictEqual(found, [[375, "tx.origin"]]);
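A linter usually reports a line and column rather than a raw byte offset. The cursor may well expose such information directly; the sketch below deliberately assumes nothing beyond the UTF-8 offset used above and the original source string. `utf8OffsetToPosition` and `Position` are hypothetical names, the offset is assumed to fall on a character boundary, and columns are counted in UTF-16 code units, starting at 1.

```typescript
interface Position {
  line: number; // 1-based
  column: number; // 1-based, in UTF-16 code units
}

// Converts a UTF-8 byte offset into a line/column pair by decoding the
// bytes that precede the offset and counting the line breaks in them.
function utf8OffsetToPosition(source: string, utf8Offset: number): Position {
  const bytes = new TextEncoder().encode(source);
  if (utf8Offset < 0 || utf8Offset > bytes.length) {
    throw new RangeError(`offset ${utf8Offset} is outside the source`);
  }

  const prefix = new TextDecoder().decode(bytes.subarray(0, utf8Offset));
  const lines = prefix.split("\n");
  const lastLine = lines[lines.length - 1] ?? "";

  return { line: lines.length, column: lastLine.length + 1 };
}

// A two-line sample with a non-ASCII character, so byte and character offsets differ.
const sample = "// é\n  require(tx.origin == owner);\n";
const charIndex = sample.indexOf("tx.origin");
const byteOffset = new TextEncoder().encode(sample.slice(0, charIndex)).length;

const position = utf8OffsetToPosition(sample, byteOffset);
console.log(byteOffset, position); // 16 { line: 2, column: 11 }
```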