Using Queries#

It's often more convenient to use the declarative Query API to traverse the CST, as it lets you express your intent more concisely and can largely replace both the internal (cursor) and external (visitor) iterator patterns.

The query language is based on pattern matching, and the execution semantics are closer to unification than to regular expression matching. A query returns all possible matches, not just the longest/shortest/first/last match.
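To make the "all possible matches" behaviour concrete, here is a minimal sketch on a hand-built toy tree. It uses only the standard library and is not the Slang API: `ToyNode`, `find_all` and `sample_tree` are hypothetical names introduced purely for illustration. Note that a match nested inside another match is still reported.

```rust
// Toy model only: this is NOT the Slang API, just an illustration of
// "return every match" semantics on a hand-built tree.
#[derive(Debug)]
struct ToyNode {
    kind: &'static str,
    children: Vec<ToyNode>,
}

/// Collects every node of the given kind in depth-first (document) order,
/// including matches nested inside other matches.
fn find_all<'a>(root: &'a ToyNode, kind: &str) -> Vec<&'a ToyNode> {
    let mut matches = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        if node.kind == kind {
            matches.push(node);
        }
        // Push children in reverse so the leftmost child is visited first.
        for child in node.children.iter().rev() {
            stack.push(child);
        }
    }
    matches
}

/// Builds a small tree: a Block holding a nested Block (with one Statement)
/// and an Identifier.
fn sample_tree() -> ToyNode {
    ToyNode {
        kind: "Block",
        children: vec![
            ToyNode {
                kind: "Block",
                children: vec![ToyNode { kind: "Statement", children: vec![] }],
            },
            ToyNode { kind: "Identifier", children: vec![] },
        ],
    }
}
```

Calling `find_all(&sample_tree(), "Block")` yields both the outer and the nested block, rather than stopping at the first or outermost one.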

Unless stated otherwise, the examples below assume we have already parsed a Solidity source and have a cursor pointing to the root node of the CST (created with create_tree_cursor, see Using the Cursor).

Creating and executing queries#

You can create a Query struct using Query::parse, which accepts a &str. The resulting query can then be passed to Cursor::query to execute it.

You can pass multiple queries to a cursor to efficiently traverse the tree looking for matches. They will be executed concurrently, returning matches in the order they appear in the input.

// `QueryMatchIterator` is assumed to be exported next to `Query`.
use slang_solidity::cst::{Query, QueryMatchIterator};

// Any `Cursor` can be used to create a query.
let cursor = parse_output.create_tree_cursor();

let query = Query::parse("[ContractDefinition]").unwrap();
let result: QueryMatchIterator = cursor.query(vec![query]);

Iterating over node patterns#

Queries allow you to iterate over all node patterns that match the query, which can replace your need for manual iteration via cursors or visitors. To get a Cursor that points to a matched node, you need to bind a named capture (@capture_name) to a specific node in the query pattern.

Let's use this to list all the contract definitions in the source file:

input.sol

contract Foo {}
contract Bar {}
contract Baz {}

let mut found = vec![];

let query = Query::parse("@contract [ContractDefinition]").unwrap();

for r#match in cursor.query(vec![query]) {
    let captures = r#match.captures;
    let cursors = captures.get("contract").unwrap();

    let cursor = cursors.first().unwrap();

    found.push(cursor.node().unparse().trim().to_owned());
}

assert_eq!(
    found,
    ["contract Foo {}", "contract Bar {}", "contract Baz {}"]
);

Multiple patterns simultaneously#

We can also run multiple queries in a single call, which returns all the matches for each pattern along with the index of the query that produced them. This is useful when you want to match multiple types of nodes in a single pass.

let mut names = vec![];

let struct_def = Query::parse("[StructDefinition @name [Identifier]]").unwrap();
let enum_def = Query::parse("[EnumDefinition @name [Identifier]]").unwrap();

for r#match in cursor.query(vec![struct_def, enum_def]) {
    let index = r#match.query_number;
    let captures = r#match.captures;
    let cursors = captures.get("name").unwrap();

    let cursor = cursors.first().unwrap();

    names.push((index, cursor.node().unparse()));
}

assert_eq!(
    names,
    &[
        (0, "Foo".to_string()),
        (1, "Bar".to_string()),
        (0, "Baz".to_string()),
        (1, "Qux".to_string())
    ]
);
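Since matches from different queries arrive interleaved, it is often handy to route them back to the pattern that produced them. The sketch below assumes that query_number is the position of the query in the vector passed to cursor.query (which is consistent with the assertion above). It uses only the standard library, and `group_by_query` is a hypothetical helper, not part of Slang.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of Slang): groups captured names by the
/// index of the query that produced them, preserving match order per group.
fn group_by_query(matches: &[(usize, String)]) -> BTreeMap<usize, Vec<String>> {
    let mut groups: BTreeMap<usize, Vec<String>> = BTreeMap::new();
    for (query_number, name) in matches {
        groups.entry(*query_number).or_default().push(name.clone());
    }
    groups
}
```

Applied to the (index, name) pairs collected above, group 0 would hold the struct names and group 1 the enum names.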

Matching on node's label#

We can match not only on the node's kind, but also on its label. This can be useful if there may be two children with the same kind but different labels, or simply to be more declarative.

To do so, we use the label:[_] syntax. Here, we also use _ to allow matching any kind of node, as long as it matches the given label.

input.sol

contract Example {
    function foo() public {
        (uint a, uint16 b, uint64 c, uint256 d) = (1, 2, 3, 4);
    }
}

let mut names = vec![];

let query = Query::parse("[TypedTupleMember @type type_name:[_]]").unwrap();

for r#match in cursor.query(vec![query]) {
    let captures = r#match.captures;
    let cursors = captures.get("type").unwrap();

    let cursor = cursors.first().unwrap();

    names.push(cursor.node().unparse());
}

assert_eq!(names, &["uint", " uint16", " uint64", " uint256"]);
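As the assertion above shows, the unparsed text of every member after the first carries leading whitespace. If only the type names are needed, one option is to trim each entry, much like the contract example earlier does with .trim(). A minimal sketch using only the standard library follows; `normalize_names` is a hypothetical helper, not part of Slang.

```rust
/// Hypothetical helper (not part of Slang): strips surrounding whitespace
/// from each unparsed snippet.
fn normalize_names(raw: &[String]) -> Vec<String> {
    raw.iter().map(|name| name.trim().to_owned()).collect()
}
```

Passing the names collected above through this helper would give uint, uint16, uint64 and uint256 without the leading spaces.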

Matching on node's literal content#

Lastly, we can also match on the node's literal content. This can be useful when you want to match a specific identifier, string, or number.

Let's say we want our code to be explicit and prefer uint256 over uint. To find all instances of the uint alias, we could do the following:

input.sol

contract Example {
    function foo() public {
        (uint a, uint16 b, uint64 c, uint256 d) = (1, 2, 3, 4);
    }
}

let mut names = vec![];

let query = Query::parse(r#"[ElementaryType @uint_keyword variant:["uint"]]"#).unwrap();

for r#match in cursor.query(vec![query]) {
    let captures = r#match.captures;
    let cursors = captures.get("uint_keyword").unwrap();

    let cursor = cursors.first().unwrap();

    names.push(cursor.node().unparse());
}

assert_eq!(names, &["uint"]);

Example: Finding `tx.origin` patterns#

As a more realistic example, let's say we want to write a linter that unconditionally lints against all tx.origin accesses.

Let's use the motivating example from https://soliditylang.org:

input.sol

// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.7.0 <0.9.0;
// THIS CONTRACT CONTAINS A BUG - DO NOT USE
contract TxUserWallet {
    address owner;

    constructor() {
        owner = msg.sender;
    }

    function transferTo(address payable dest, uint amount) public {
        // THE BUG IS RIGHT HERE, you must use msg.sender instead of tx.origin
        require(tx.origin == owner);
        dest.transfer(amount);
    }
}

Now, we can use the above features to write a query that matches all tx.origin patterns:

let query = Query::parse(
    r#"@txorigin [MemberAccessExpression
            [Expression @start ["tx"]]
            ["origin"]
        ]"#,
)
.unwrap();

let mut results = vec![];

for r#match in cursor.query(vec![query]) {
    let captures = r#match.captures;
    let cursors = captures.get("txorigin").unwrap();

    let cursor = cursors.first().unwrap();

    results.push((cursor.text_offset().utf8, cursor.node().unparse()));
}

assert_eq!(results, &[(375usize, "tx.origin".to_string())]);
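A linter would typically report a line and column rather than a raw offset. The sketch below assumes that only the UTF-8 byte offset shown above is at hand and converts it using the source text. It relies only on the standard library, and `line_and_column` is a hypothetical helper, not part of Slang.

```rust
/// Hypothetical helper (not part of Slang): converts a UTF-8 byte offset
/// into a 1-based (line, column) pair, counting columns in characters.
/// Returns `None` if the offset is out of bounds or splits a character.
fn line_and_column(source: &str, byte_offset: usize) -> Option<(usize, usize)> {
    // `get` checks both the bounds and the character boundary for us.
    let prefix = source.get(..byte_offset)?;
    let line = prefix.matches('\n').count() + 1;
    // Byte index at which the current line starts within the prefix.
    let line_start = prefix.rfind('\n').map_or(0, |idx| idx + 1);
    let column = prefix[line_start..].chars().count() + 1;
    Some((line, column))
}
```

Given the source string and the offset stored in results, this returns the position to show in a diagnostic, or None if the offset does not belong to that source.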
