Installation

You can start using solx in the following ways:

  1. Use the installation script.

    curl -L https://raw.githubusercontent.com/NomicFoundation/solx/main/install-solx | bash
    

    The script will download the latest stable release of solx and install it in your PATH.

    ⚠️ The script requires curl to be installed on your system.
    This is the recommended way for MacOS users to install solx, as it bypasses Gatekeeper checks.

  2. Download stable releases. See Static Executables.

  3. Build solx from sources. See Building from Source.

System Requirements

It is recommended to have at least 4 GB of RAM to compile large projects. Compilation is parallelized by default, with the number of threads equal to the number of CPU cores.

Large projects can consume a lot of RAM during compilation on machines with a high number of cores. If you encounter memory issues, consider reducing the number of threads using the --threads option.

The table below outlines the supported platforms and architectures:

CPU/OS | MacOS | Linux | Windows
x86_64
arm64

Please avoid using outdated distributions of operating systems, as they may lack the necessary dependencies or include outdated versions of them. solx is only tested on recent versions of popular distributions, such as MacOS 11.0 and Windows 10.

Versioning

The solx version consists of two parts:

  1. solx version itself.
  2. Version of solc libraries solx is statically linked with.

We recommend always using the latest version of solx to benefit from the latest features and bug fixes.

Ethereum Development Toolkits

For large codebases, it is more convenient to use solx via toolkits such as Hardhat. These tools manage compiler input and output at a higher level and provide additional features like incremental compilation and caching.

Static Executables

We ship solx binaries on the releases page of the eponymous repository. This repository maintains intuitive and stable naming for the executables and provides a changelog for each release. Tools using solx must download the binaries from this repository and cache them locally.

All executables are statically linked and should work on all recent platforms without issues.

Building from Source

Please consider using the pre-built executables before building from source. Building from source is only necessary for development, research, and debugging purposes. Deployment and production use cases should rely only on the officially released executables.

  1. Install the necessary system-wide dependencies.

    • For Linux (Debian):
    apt install cmake ninja-build curl git libssl-dev pkg-config clang lld
    
    • For Linux (Arch):
    pacman -Syu which cmake ninja curl git pkg-config clang lld
    
    • For MacOS:

      1. Install the Homebrew package manager by following the instructions at brew.sh.

      2. Install the necessary system-wide dependencies:

        brew install cmake ninja coreutils
        
      3. Install a recent build of the LLVM/Clang compiler using one of the following tools:

  2. Install Rust.

    The easiest way is to follow the latest official instructions.

The Rust version used for building is pinned in the rust-toolchain.toml file at the repository root. cargo will automatically download the pinned version of rustc when you start building the project.

  3. Clone and check out this repository with submodules.

    git clone https://github.com/NomicFoundation/solx --recursive
    

    By default, submodule checkout is disabled to avoid cloning large repositories via cargo. If you are building locally, ensure all submodules are checked out with:

    git submodule update --recursive --checkout
    
  4. Build the development tools.

    cargo build --release --bin solx-dev
    
  5. Build the LLVM framework using solx-dev.

    ./target/release/solx-dev llvm build --enable-mlir
    

    This builds LLVM with the EVM target, MLIR, and LLD projects enabled. The build artifacts will be placed in target-llvm/.

    For more information and available build options, run ./target/release/solx-dev llvm build --help.

  6. Build the solc libraries using solx-dev.

    ./target/release/solx-dev solc build
    

    This will configure and build the solc libraries in solx-solidity/build/. The command automatically detects MLIR and LLD paths if LLVM was built with those projects.

    For more options, run ./target/release/solx-dev solc build --help.

  7. Build the solx executable.

    cargo build --release
    

    The solx executable will appear as ./target/release/solx, where you can run it directly or move it to another location.

    If cargo cannot find the LLVM build artifacts, ensure that the LLVM_SYS_211_PREFIX environment variable is not set in your system, as it may be pointing to a location different from the one expected by solx.

Tuning the LLVM build

  • For more information and available build options, run ./target/release/solx-dev llvm build --help.
  • The --enable-mlir flag enables MLIR support in the LLVM build (required for MLIR-based optimizations). LLD is always built.
  • Use the --ccache-variant ccache option to speed up the build process if you have ccache installed.

Building LLVM manually

If you prefer building the LLVM framework manually, include the following flags in your CMake command:

# We recommend using the latest version of CMake.

-DLLVM_TARGETS_TO_BUILD='EVM'
-DLLVM_ENABLE_PROJECTS='lld;mlir'
-DLLVM_ENABLE_RTTI='On'
-DBUILD_SHARED_LIBS='Off'

For most users, solx-dev is the recommended way to build the framework. This section was added for compiler toolchain developers and researchers with specific requirements and experience with the LLVM framework.

Command Line Interface (CLI)

The CLI of solx is designed to mimic that of solc. The main input/output (I/O) modes of the solx interface are:

  • Basic CLI
  • Standard JSON

The basic CLI is simpler and suited to use from the shell. The standard JSON mode resembles client-server interaction, which makes it better suited to use from other applications.

All toolkits using solx must operate in standard JSON mode and follow its specification. This makes the toolkits more robust and future-proof, as the standard JSON mode is the most versatile and is used by the majority of popular projects.

This page focuses on the basic CLI mode. For more information on the standard JSON mode, see the Standard JSON page.

Basic CLI

Basic CLI mode is the simplest way to compile a file with the source code.

To compile a basic Solidity contract, run the simple example from the --bin section.

The rest of this section describes the available CLI options and their usage. You may also check out solx --help for a quick reference.

--bin

Emits the full bytecode.

solx 'Simple.sol' --bin

Output:

======= Simple.sol:Simple =======
Binary:
34601557630000008480630000001a...

--bin-runtime

Emits the runtime part of the bytecode.

solx 'Simple.sol' --bin-runtime

Output:

======= Simple.sol:Simple =======
Binary of the runtime part:
34600b57600336116016575b5f5ffd...

--asm

Emits the text assembly produced by LLVM.

solx 'Simple.sol' --asm

Output:

======= Simple.sol:Simple =======
Deploy LLVM EVM assembly:
        .text
        .file   "Simple.sol:Simple"
main:
.func_begin0:
        JUMPDEST
        PUSH1 128
        PUSH1 64
...

Runtime LLVM EVM assembly:
        .text
        .file   "Simple.sol:Simple.runtime"
main:
.func_begin0:
        JUMPDEST
        PUSH1 128
        PUSH1 64
...

--metadata

Emits the contract metadata. The metadata is a JSON object that contains information about the contract, such as its name, source code hash, the list of dependencies, compiler versions, and so on.

The solx metadata format is compatible with the Solidity metadata format. This means that the metadata output can be used with other tools that support Solidity metadata. Extra solx data is inserted into solc metadata with this JSON object:

{
  "solx": {
    "llvm_options": [],
    "optimizer_settings": {
      "is_debug_logging_enabled": false,
      "is_fallback_to_size_enabled": false,
      "is_verify_each_enabled": false,
      "level_back_end": "Aggressive",
      "level_middle_end": "Aggressive",
      "level_middle_end_size": "Zero"
    },
    // Optional: only set for Solidity and Yul contracts.
    "solc_version": "0.8.34",
    // Mandatory: current version of solx.
    "solx_version": "0.1.4"
  }
}

Usage:

solx 'Simple.sol' --metadata

Output:

======= Simple.sol:Simple =======
Metadata:
{"compiler":{"version":"0.8.34+commit.e2cbf92c"},"language":"Solidity","output":{"abi":[{"inputs":[],"name":"first","outputs":[{"internalType":"uint64","name":"","type":"uint64"}],"stateMutability":"pure","type":"function"},{"inputs":[],"name":"second","outputs":[{"internalType":"uint256","name":"","type":"uint256"}],"stateMutability":"pure","type":"function"}],"devdoc":{"kind":"dev","methods":{},"version":1},"userdoc":{"kind":"user","methods":{},"version":1}},"settings":{"compilationTarget":{"Simple.sol":"Simple"},"evmVersion":"osaka","libraries":{},"metadata":{"bytecodeHash":"ipfs"},"optimizer":{"enabled":false,"runs":200},"remappings":[]},"solx":{"llvm_options":[],"optimizer_settings":{"is_debug_logging_enabled":false,"is_fallback_to_size_enabled":false,"is_verify_each_enabled":false,"level_back_end":"Aggressive","level_middle_end":"Aggressive","level_middle_end_size":"Zero"},"solc_version":"0.8.34","solx_version":"0.1.4"},"sources":{"Simple.sol":{"keccak256":"0x402fe0b38cc9d81e8c9f6d07854cca27fbb307f06d8a129998026907a10c7ca1","license":"MIT","urls":["bzz-raw://04714cab56c1f931e3cc1ddae4c7ff0c8832d0849e23966c6326028f6783d45a","dweb:/ipfs/QmehmUFKCtytG8WcWQ676KvqwURfkVYK89VHZEvSzyLc2Z"]}},"version":1}

--ast-json

Emits the AST of each Solidity file.

solx 'Simple.sol' --ast-json

Output:

======= Simple.sol:Simple =======
JSON AST:
{"absolutePath":".../Simple.sol","exportedSymbols":{"Simple":[24]},"id":25,"license":"MIT","nodeType":"SourceUnit","nodes":[ ... ],"src":"32:288:0"}

Since solx communicates with solc only via standard JSON under the hood, the full JSON AST is emitted instead of the compact one.

--abi

Emits the contract ABI specification.

solx 'Simple.sol' --abi

Output:

======= Simple.sol:Simple =======
Contract JSON ABI:
[{"inputs":[],"name":"first","outputs":[{"internalType":"uint64","name":"","type":"uint64"}],"stateMutability":"pure","type":"function"},{"inputs":[],"name":"second","outputs":[{"internalType":"uint256","name":"","type":"uint256"}],"stateMutability":"pure","type":"function"}]

--hashes

Emits the contract function signatures.

solx 'Simple.sol' --hashes

Output:

======= Simple.sol:Simple =======
Function signatures:
3df4ddf4: first()
5a8ac02d: second()

--storage-layout

Emits the contract storage layout.

solx 'Simple.sol' --storage-layout

Output:

======= Simple.sol:Simple =======
Contract Storage Layout:
{"storage":[{"astId":3,"contract":"Simple.sol:Simple","label":"field_1","offset":0,"slot":"0","type":"t_uint256"},{"astId":5,"contract":"Simple.sol:Simple","label":"field_2","offset":0,"slot":"1","type":"t_uint256"},{"astId":7,"contract":"Simple.sol:Simple","label":"field_3","offset":0,"slot":"2","type":"t_uint256"}],"types":{"t_uint256":{"encoding":"inplace","label":"uint256","numberOfBytes":"32"}}}

--transient-storage-layout

Emits the contract transient storage layout.

solx 'Simple.sol' --transient-storage-layout

Output:

======= Simple.sol:Simple =======
Contract Transient Storage Layout:
{"storage":[{"astId":3,"contract":"Simple.sol:Simple","label":"field_1","offset":0,"slot":"0","type":"t_uint256"},{"astId":5,"contract":"Simple.sol:Simple","label":"field_2","offset":0,"slot":"1","type":"t_uint256"},{"astId":7,"contract":"Simple.sol:Simple","label":"field_3","offset":0,"slot":"2","type":"t_uint256"}],"types":{"t_uint256":{"encoding":"inplace","label":"uint256","numberOfBytes":"32"}}}

--userdoc

Emits the contract user documentation.

solx 'Simple.sol' --userdoc

Output:

======= Simple.sol:Simple =======
User Documentation:
{"kind":"user","methods":{ ... },"version":1}

--devdoc

Emits the contract developer documentation.

solx 'Simple.sol' --devdoc

Output:

======= Simple.sol:Simple =======
Developer Documentation:
{"kind":"dev","methods":{ ... },"version":1}

--asm-solc-json

Emits the solc EVM assembly parsed from solc's JSON output.

solx 'Simple.sol' --asm-solc-json

Output:

======= Simple.sol:Simple =======
EVM assembly:
000     PUSH                80
001     MEMORYGUARD
002     PUSH                40
003     MSTORE
...

This is the solc EVM assembly output that is translated to LLVM IR by solx. For solx's own EVM assembly output emitted by LLVM, use the --asm option instead.

--ir (or --ir-optimized)

Emits the solc Yul IR.

solx no longer uses the Yul optimizer, so the emitted Yul IR is always unoptimized. It is not possible to emit solc-optimized Yul IR with solx.

solx 'Simple.sol' --ir

Output:

======= Simple.sol:Simple =======
IR:
/// @use-src 0:"Simple.sol"
object "Simple_24" {
    code {
        {
            ...
        }
    }
    /// @use-src 0:"Simple.sol"
    object "Simple_24_deployed" {
        code {
            {
                ...
            }
        }
        data ".metadata" hex"a26469706673582212206c34df79f8cc8ba870a350940cb8623c60d4f6f9c356e2185b812187d9ae55ee64736f6c63430008220033"
    }
}

--debug-info

Emits the ELF-wrapped DWARF debug info of the deploy code.

solx 'Simple.sol' --debug-info

Output:

======= Simple.sol:Simple =======
Debug info:
7f454c46010201ff...

--debug-info-runtime

Emits the ELF-wrapped DWARF debug info of the runtime code.

solx 'Simple.sol' --debug-info-runtime

Output:

======= Simple.sol:Simple =======
Debug info of the runtime part:
7f454c46010201ff

--evmla

Emits EVM legacy assembly (intermediate representation from solc).

When used with --output-dir, writes .evmla files to the output directory. Without --output-dir, outputs to stdout.

Usage with --output-dir:

solx 'Simple.sol' --evmla --output-dir './build/'
ls './build/'

Output:

Compiler run successful.
Simple_sol_Simple.evmla
Simple_sol_Simple.runtime.evmla

Usage with stdout:

solx 'Simple.sol' --evmla --bin

Output:

======= Simple.sol:Simple =======
Binary:
...
Deploy EVM legacy assembly:
000     PUSH                80
...

--ethir

Emits Ethereal IR (intermediate representation between EVM assembly and LLVM IR).

When used with --output-dir, writes .ethir files to the output directory. Without --output-dir, outputs to stdout.

Usage with --output-dir:

solx 'Simple.sol' --ethir --output-dir './build/'
ls './build/'

Output:

Compiler run successful.
Simple_sol_Simple.ethir
Simple_sol_Simple.runtime.ethir

Usage with stdout:

solx 'Simple.sol' --ethir --bin

Output:

======= Simple.sol:Simple =======
Binary:
...
Deploy Ethereal IR:
function main(0, 0, 0, 0, 0) -> 0, 0, 0, 0 {
...

--emit-llvm-ir

Emits LLVM IR (both unoptimized and optimized).

When used with --output-dir, writes .ll files to the output directory. Without --output-dir, outputs to stdout.

Usage with --output-dir:

solx 'Simple.sol' --emit-llvm-ir --output-dir './build/'
ls './build/'

Output:

Compiler run successful.
Simple_sol_Simple.optimized.ll
Simple_sol_Simple.runtime.optimized.ll
Simple_sol_Simple.runtime.unoptimized.ll
Simple_sol_Simple.unoptimized.ll

Usage with stdout:

solx 'Simple.sol' --emit-llvm-ir --bin --via-ir

Output:

======= Simple.sol:Simple =======
Binary:
...
Deploy LLVM IR (unoptimized):
; ModuleID = 'Simple.sol:Simple'
...
Deploy LLVM IR:
; ModuleID = 'Simple.sol:Simple'
...

--benchmarks

Emits benchmarks of the solx LLVM-based pipeline and its underlying call to solc.

solx 'Simple.sol' --benchmarks

Output:

Benchmarks:
solc_Solidity_Standard_JSON: 6ms
solx_Solidity_IR_Analysis: 0ms
solx_Compilation: 75ms

======= Simple.sol:Simple =======
Benchmarks:
    Simple.sol:Simple:deploy/EVMAssemblyToLLVMIR/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple:deploy/InitVerify/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple:deploy/OptimizeVerify/M3B3/SpillArea(0): 1ms
    Simple.sol:Simple:runtime/EVMAssemblyToLLVMIR/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple.runtime:runtime/InitVerify/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple.runtime:runtime/OptimizeVerify/M3B3/SpillArea(0): 5ms

Input Files

solx supports multiple input files. The following command compiles two Solidity files and prints the bytecode:

solx 'Simple.sol' 'Complex.sol' --bin

Solidity import remappings are passed the same way as input files, but they are distinguished by a = symbol between source and destination. The following command compiles a Solidity file with a remapping and prints the bytecode:

solx 'Simple.sol' 'github.com/ethereum/dapp-bin/=/usr/local/lib/dapp-bin/' --bin

solx does not handle remappings itself, but only passes them through to solc. Visit the solc documentation to learn more about the processing of remappings.
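
The rule described above (a positional argument containing = is a remapping, anything else is an input file) can be sketched in shell. The classify_arg helper is hypothetical and only illustrates the distinction; it is not part of solx, and solc still performs the actual remapping resolution.

```shell
#!/bin/sh
# classify_arg ARG
# Hypothetical helper: reports how a positional argument would be
# interpreted according to the '=' rule described in the text.
classify_arg() {
  case "$1" in
    *=*) echo "remapping" ;;
    *)   echo "input-file" ;;
  esac
}

for arg in 'Simple.sol' 'github.com/ethereum/dapp-bin/=/usr/local/lib/dapp-bin/'; do
  printf '%s -> %s\n' "$arg" "$(classify_arg "$arg")"
done
```

A practical consequence of this rule is that an input file whose path contains = would be read as a remapping, so such paths are best avoided.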

--libraries

Specifies the libraries to link with compiled contracts. The option accepts multiple string arguments. The safest way is to wrap each argument in single quotes, and separate them with a space.

The specifier has the following format: <ContractPath>:<ContractName>=<LibraryAddress>.

Usage:

solx 'Simple.sol' --bin --libraries 'Simple.sol:Simple=0x1234567890abcdef1234567890abcdef12345678'
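
As a minimal sketch, a script could check a specifier against the <ContractPath>:<ContractName>=<LibraryAddress> format before passing it to solx. The is_library_spec helper is hypothetical, and it assumes that neither the path nor the name contains : or = and that the address is 0x followed by 40 hexadecimal digits; solx itself may accept or reject inputs differently.

```shell
#!/bin/sh
# is_library_spec SPEC
# Hypothetical helper: succeeds if SPEC looks like
# <ContractPath>:<ContractName>=<LibraryAddress>.
# Assumption: the address is 0x followed by 40 hex digits.
is_library_spec() {
  printf '%s\n' "$1" | grep -Eq '^[^:=]+:[^:=]+=0x[0-9a-fA-F]{40}$'
}

spec='Simple.sol:Simple=0x1234567890abcdef1234567890abcdef12345678'
if is_library_spec "$spec"; then
  echo "valid specifier: $spec"
else
  echo "invalid specifier: $spec"
fi
```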

--base-path, --include-path, --allow-paths

These options specify Solidity import resolution settings. They are not used by solx and are only passed through to solc, like import remappings.

Visit the solc documentation to learn more about the processing of these options.

--output-dir

Specifies the output directory for build artifacts. Can only be used in basic CLI mode.

Usage in basic CLI mode:

solx 'Simple.sol' --bin --asm --metadata --output-dir './build/'
ls './build/'

Output:

Compiler run successful. Artifact(s) can be found in directory "build".
Simple_sol_Simple.asm
Simple_sol_Simple.bin
Simple_sol_Simple.runtime.asm
Simple_sol_Simple_llvm.asm
Simple_sol_Simple_llvm.asm-runtime
Simple_sol_Simple_meta.json

--overwrite

Overwrites the output files if they already exist in the output directory. By default, solx does not overwrite existing files.

Can only be used in combination with the --output-dir option.

Usage:

solx 'Simple.sol' --bin --output-dir './build/' --overwrite

If the --overwrite option is not specified and the output files already exist, solx will print an error message and exit:

Error: Refusing to overwrite an existing file "./build/Simple_sol_Simple.bin" (use --overwrite to force).

--version

Prints the version of solx and the hash of the LLVM commit it was built with.

Usage:

solx --version

--help

Prints the help message.

Usage:

solx --help

Other I/O Modes

The mode-altering CLI options are mutually exclusive. This means that only one of the options below can be enabled at a time:

--standard-json

For the standard JSON mode usage, see the Standard JSON page.

solx Compilation Settings

The options in this section configure only the solx compiler and do not affect the underlying solc compiler.

--threads

Sets the number of threads used for parallel compilation. Each thread compiles a separate translation unit in a child process. By default, the number of threads equals the number of CPU cores.

Large projects can consume a lot of RAM during compilation on machines with a high number of cores. If you encounter memory issues, consider reducing the number of threads.

Usage:

solx 'Simple.sol' --bin --threads 4
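
One way to pick a value for --threads on a memory-constrained machine is to cap the thread count by an assumed memory budget per thread. The sketch below is only an illustration: pick_threads is a hypothetical helper, and the 2 GB budget in the example is an arbitrary assumption, not a figure published for solx.

```shell
#!/bin/sh
# pick_threads CORES MEMORY_GB PER_THREAD_GB
# Hypothetical helper: returns the smaller of the core count and the number
# of threads that fit into memory under the assumed per-thread budget.
pick_threads() {
  cores=$1
  by_memory=$(( $2 / $3 ))
  if [ "$by_memory" -lt 1 ]; then
    by_memory=1
  fi
  if [ "$by_memory" -lt "$cores" ]; then
    echo "$by_memory"
  else
    echo "$cores"
  fi
}

# 16 cores, 8 GB of RAM, assumed 2 GB per thread.
threads=$(pick_threads 16 8 2)
echo "solx 'Simple.sol' --bin --threads $threads"
```

The budget should be tuned by observing the actual memory usage of the project being compiled.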

--optimization / -O

Sets the optimization level of the LLVM optimizer. Available values are:

Level | Meaning                 | Hints
0     | No optimization         | For fast compilation during development (unsupported)
1     | Performance: basic      | For optimization research
2     | Performance: default    | For optimization research
3     | Performance: aggressive | Best performance for production
s     | Size: default           | For optimization research
z     | Size: aggressive        | Best size for contracts with size constraints

For most cases, it is fine to keep the default value of 3. You should only use the level z if you are ready to deliberately sacrifice performance and optimize for size.

Large contracts may hit the EVM bytecode size limit. In this case, it is recommended to use the --optimization-size-fallback option rather than setting the level to z.

Usage:

solx 'Simple.sol' --bin -O3

This option can also be set with the SOLX_OPTIMIZATION environment variable, which is useful for toolkits where arbitrary solx-specific options are not supported:

SOLX_OPTIMIZATION='3' solx 'Simple.sol' --bin

--optimization-size-fallback

Sets the optimization level to z for contracts that failed to compile due to overrunning the bytecode size constraints.

Under the hood, this option automatically triggers recompilation of contracts with level z. Contracts that were successfully compiled with the original --optimization setting are not recompiled.

For deployment, it is recommended to have this option enabled in order to mitigate potential issues with EVM bytecode size constraints on a per-contract basis. If your environment does not have bytecode size limitations, it is better to disable it to prevent unnecessary recompilations. A good example is running forge test.

Usage:

solx 'Simple.sol' --bin -O3 --optimization-size-fallback

This option can also be set with the SOLX_OPTIMIZATION_SIZE_FALLBACK environment variable, which is useful for toolkits where arbitrary solx-specific options are not supported:

SOLX_OPTIMIZATION_SIZE_FALLBACK= solx 'Simple.sol' --bin -O3

--metadata-hash

Specifies the hash format used for contract metadata.

Usage with ipfs:

solx 'Simple.sol' --bin --metadata-hash 'ipfs'

Output with ipfs:

======= Simple.sol:Simple =======
Binary:
34601557630000008480630000001a6080396080f35b5f5ffdfe34600b5760...
a2646970667358221220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df64736f6c637816736f6c783a302e312e343b736f6c633a302e382e33340047

The byte array starting with a2 at the end of the bytecode is CBOR-encoded compiler version data with an optional metadata hash.

The last two bytes of the metadata (0x0047) are not part of the CBOR payload. They encode its length, which must be known to decode the payload correctly.

JSON representation of the CBOR payload:

{
    // Optional: included if `--metadata-hash` is set to `ipfs`.
    "ipfs": "1220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df",

    // Required: consists of semicolon-separated pairs of colon-separated compiler names and versions.
    // `solx:<version>` is always included.
    // `solc:<version>` is only included for Solidity and Yul contracts, but not included for LLVM IR ones.
    "solc": "solx:0.1.4;solc:0.8.34"
}

For more information on these formats, see the CBOR and IPFS documentation.
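
The length suffix described above can be checked with plain shell arithmetic. The sketch below takes the metadata bytes from the example output, reads the last two bytes as a big-endian length, and slices that many bytes off the end of the remaining bytecode. The code_stub prefix is a hypothetical stand-in for the contract code, which is much longer in real bytecode.

```shell
#!/bin/sh
# Metadata bytes copied from the example output above.
trailer='a2646970667358221220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df64736f6c637816736f6c783a302e312e343b736f6c633a302e382e33340047'

# Hypothetical stand-in for the contract code preceding the metadata.
code_stub='5f5ffd'
bytecode="${code_stub}${trailer}"

# Split off the last 4 hex characters (2 bytes): the payload length.
body=${bytecode%????}
length_hex=${bytecode#"$body"}
payload_bytes=$((0x$length_hex))

# The CBOR payload is the last payload_bytes bytes (2 hex characters each)
# of what remains.
payload=$(printf '%s' "$body" | tail -c "$((payload_bytes * 2))")

echo "payload length: $payload_bytes bytes"
echo "payload: $payload"
```

Here the length is 0x47, that is 71 bytes, and the extracted payload starts with a2, the CBOR header of a two-entry map.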

--no-cbor-metadata

Disables the CBOR metadata that is appended at the end of bytecode. This option is useful for debugging and research purposes.

It is not recommended to use this option in production, as it is not possible to verify contracts deployed without metadata.

Usage:

solx 'Simple.sol' --no-cbor-metadata

--llvm-options

Specifies additional options for the LLVM framework. The argument must be a single quoted string following a = separator.

Usage:

solx 'Simple.sol' --bin --llvm-options='-key=value'

The --llvm-options option is experimental and must only be used by experienced users. All supported options will be documented in the future.

solc Compilation Settings

The options in this section configure only solc: they are passed directly to its child process and do not affect the solx compiler.

--via-ir

Switches the solc codegen to Yul a.k.a. IR.

Usage:

solx 'Simple.sol' --bin --via-ir

--evm-version

Specifies the EVM version solx will produce bytecode for. For instance, with version osaka, solx produces clz instructions, whereas for older EVM versions it does not.

Only the following EVM versions are supported:

  • cancun
  • prague
  • osaka (default)

Usage:

solx 'Simple.sol' --bin --evm-version 'osaka'

--metadata-literal

Tells solc to store referenced sources as literal data in the metadata output.

This option only affects the contract metadata output produced by solc, and does not affect artifacts produced by solx.

Usage:

solx 'Simple.sol' --bin --metadata --metadata-literal

--no-import-callback

Disables the default import resolution callback in solc.

This parameter is used by some tooling that resolves all imports by itself, such as Hardhat.

Usage:

solx 'Simple.sol' --no-import-callback

Multi-Language Support

solx supports input in multiple programming languages:

  • Solidity
  • Yul
  • LLVM IR

The following sections outline how to use solx with these languages.

--yul (or --strict-assembly)

Enables the Yul mode. In this mode, input is expected to be in the Yul language. The output works the same way as with Solidity input.

Usage:

solx --yul 'Simple.yul' --bin

Output:

======= Simple.yul =======
Binary:
5b60806040525f341415601c5763...

--llvm-ir

Enables the LLVM IR mode. In this mode, input is expected to be in the LLVM IR language. The output works the same way as with Solidity input.

In this mode, every input file is treated as runtime code, while deploy code will be generated automatically by solx. It is not possible to write deploy code manually yet, but it will be supported in the future.

Unlike solc, solx is an LLVM-based compiler toolchain, so it uses LLVM IR as an intermediate representation. It is not recommended to write LLVM IR manually, but it can be useful for debugging and optimization purposes. LLVM IR is more low-level than Yul and EVM assembly in the solx IR hierarchy.

Usage:

solx --llvm-ir 'Simple.ll' --bin

Output:

======= Simple.ll =======
Binary:
5b60806040525f341415601c5763...

Debugging

IR Output Flags

For selective IR output, use the following flags with --output-dir:

  • --evmla
  • --ethir
  • --emit-llvm-ir

These flags respect the --overwrite option. Without --overwrite, the compiler will refuse to overwrite existing files.

SOLX_OUTPUT_DIR Environment Variable

For debugging purposes, all intermediate build artifacts can be dumped to a directory using the SOLX_OUTPUT_DIR environment variable. This is useful for toolkits where arbitrary solx-specific options are not supported.

When this environment variable is set, solx will output all intermediate representations to the specified directory, always overwriting existing files.

The intermediate build artifacts include:

Name          | Extension
EVM Assembly  | evmla
EthIR         | ethir
Yul           | yul
LLVM IR       | ll
LLVM Assembly | asm

Usage:

SOLX_OUTPUT_DIR='./debug/' solx 'Simple.sol' --bin
ls './debug/'

Output:

Simple_sol_Simple.evmla
Simple_sol_Simple.ethir
Simple_sol_Simple.unoptimized.ll
Simple_sol_Simple.optimized.ll
Simple_sol_Simple.asm
Simple_sol_Simple.runtime.evmla
Simple_sol_Simple.runtime.ethir
Simple_sol_Simple.runtime.unoptimized.ll
Simple_sol_Simple.runtime.optimized.ll
Simple_sol_Simple.runtime.asm

The output file name is constructed as follows: <ContractPath>_<ContractName>.<Modifiers>.<Extension>.
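
The naming scheme can be illustrated with a small shell function that reproduces the file names from the listing above. The artifact_name helper is hypothetical. Replacing . with _ in the contract path is inferred from the example names, while replacing / is an assumption; solx may sanitize paths differently.

```shell
#!/bin/sh
# artifact_name CONTRACT_PATH CONTRACT_NAME EXTENSION [MODIFIERS]
# Hypothetical helper that builds
# <ContractPath>_<ContractName>.<Modifiers>.<Extension>.
artifact_name() {
  # '.' -> '_' matches the examples; '/' -> '_' is an assumption.
  path=$(printf '%s' "$1" | tr './' '__')
  if [ -n "${4:-}" ]; then
    printf '%s_%s.%s.%s\n' "$path" "$2" "$4" "$3"
  else
    printf '%s_%s.%s\n' "$path" "$2" "$3"
  fi
}

artifact_name 'Simple.sol' 'Simple' 'evmla'
artifact_name 'Simple.sol' 'Simple' 'll' 'runtime.optimized'
```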

Additionally, it is possible to dump the standard JSON input file with the SOLX_STANDARD_JSON_DEBUG environment variable:

SOLX_STANDARD_JSON_DEBUG='./debug/input.json' solx 'Simple.sol' --bin
cat './debug/input.json' | jq .

--llvm-verify-each

Enables the verification of the LLVM IR after each optimization pass. This option is useful for debugging and research purposes.

Usage:

solx 'Simple.sol' --bin --llvm-verify-each

--llvm-debug-logging

Enables the debug logging of the LLVM IR optimization passes. This option is useful for debugging and research purposes.

Usage:

solx 'Simple.sol' --bin --llvm-debug-logging

Standard JSON

Standard JSON is a protocol for interaction with the solx and solc compilers. This protocol must be implemented by toolkits such as Hardhat.

The protocol uses two data formats for communication: input JSON and output JSON.

Usage

Input JSON can be provided as a file path via the --standard-json option:

solx --standard-json './input.json'

Alternatively, the input JSON can be fed to solx via stdin:

cat './input.json' | solx --standard-json

You can also run solx without a file argument and paste your standard JSON input directly into the terminal:

solx --standard-json

<paste into stdin here and press Ctrl-D>

For the sake of interface unification, solx always returns exit code 0 and prints its standard JSON output to stdout. This differs from solc, which may return exit code 1 with a free-form error in some cases, such as when the standard JSON input file is missing, even though the solc documentation claims otherwise.
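
Because the exit code is always 0, a calling script has to inspect the output JSON to detect failures. The sketch below assumes that solx reports problems the way solc standard JSON output does, in an errors array whose entries carry a severity field. The has_compile_errors helper and both sample outputs are hypothetical.

```shell
#!/bin/sh
# has_compile_errors: reads standard JSON output on stdin and succeeds if it
# contains an entry with "severity": "error".
# This is a crude text match, not a JSON parser.
has_compile_errors() {
  grep -Eq '"severity"[[:space:]]*:[[:space:]]*"error"'
}

# Hypothetical outputs shaped like solc standard JSON output.
output_with_warning='{"errors":[{"severity":"warning","message":"example warning"}]}'
output_with_error='{"errors":[{"severity":"error","message":"example error"}]}'

for output in "$output_with_warning" "$output_with_error"; do
  if printf '%s\n' "$output" | has_compile_errors; then
    echo "compilation failed"
  else
    echo "compilation succeeded"
  fi
done

# Intended use with a real compiler run:
#   solx --standard-json './input.json' | has_compile_errors
```

A text match can misfire if a source file happens to contain the same string, so a toolkit would normally use a real JSON parser for this check.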

Input JSON

The input JSON provides the compiler with the source code and settings for the compilation. The example below serves as the specification of the input JSON format.

This format introduces several solx-specific parameters such as settings.optimizer.sizeFallback. These parameters are marked as solx-only.

On the other hand, parameters that are part of the solc standard JSON protocol but are not mentioned here have no effect in solx.
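
Before the full specification that follows, a much smaller input may help as a starting point. The sketch below writes a minimal input file using only fields from that specification. The write_minimal_input helper, the Simple.sol file name, and the contract source are hypothetical placeholders.

```shell
#!/bin/sh
# write_minimal_input FILE
# Hypothetical helper: writes a minimal standard JSON input to FILE.
write_minimal_input() {
  cat > "$1" <<'EOF'
{
  "language": "Solidity",
  "sources": {
    "Simple.sol": {
      "content": "contract Simple { function first() public pure returns (uint64) { return 42; } }"
    }
  },
  "settings": {
    "optimizer": { "mode": "3", "sizeFallback": false },
    "evmVersion": "osaka",
    "outputSelection": {
      "Simple.sol": {
        "Simple": [ "abi", "evm.bytecode.object" ]
      }
    }
  }
}
EOF
}

input_file=$(mktemp)
write_minimal_input "$input_file"
cat "$input_file"
# To compile it: solx --standard-json "$input_file"
rm -f "$input_file"
```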

{
  // Required: Source code language.
  // Currently supported: "Solidity", "Yul", "LLVM IR".
  "language": "Solidity",
  // Required: Source code files to compile.
  // The keys here are the "global" names of the source files. Imports can be using other file paths via remappings.
  "sources": {
    // In each source file entry, exactly one of "urls" or "content" must be specified.
    "myFile.sol": {
      // Required (unless "content" is used): URL(s) to the source file.
      "urls": [
        // In Solidity mode, directories must be added to the command-line via "--allow-paths <path>" for imports to work.
        // It is possible to specify multiple URLs for a single source file. In this case the first successfully resolved URL will be used.
        "/tmp/path/to/file.sol"
      ],
      // Required (unless "urls" is used): Literal contents of the source file.
      "content": "contract settable is owned { uint256 private x = 0; function set(uint256 _x) public { if (msg.sender == owner) x = _x; } }"
    }
  },

  // Required: Compilation settings.
  "settings": {
    // Optional: Optimizer settings.
    "optimizer": {
      // Optional, solx-only: Set the LLVM optimizer level.
      // Available options:
      // -0: do not optimize (unsupported)
      // -1: basic optimizations for gas usage
      // -2: advanced optimizations for gas usage
      // -3: all optimizations for gas usage
      // -s: basic optimizations for bytecode size
      // -z: all optimizations for bytecode size
      // Default: 3.
      "mode": "3",
      // Optional, solx-only: Re-run the compilation with "mode": "z" if the initial compilation exceeds the EVM bytecode size limit.
      // Used on a per-contract basis and applied automatically, so some contracts will end up compiled in the initial mode, and others with "mode": "z".
      // Only activated if "mode" is set to "3", which is the default optimization mode.
      // Default: false.
      "sizeFallback": false
    },

    // Optional: Sorted list of remappings.
    // Important: Only used with Solidity input.
    "remappings": [ ":g=/dir" ],
    // Optional: Addresses of the libraries.
    // If not all library addresses are provided here, it will result in unlinked bytecode files that will require post-compile-time linking before deployment.
    // Important: Only used with Solidity, Yul, and LLVM IR input.
    "libraries": {
      // The top level key is the name of the source file where the library is used.
      // If remappings are used, this source file should match the global path after remappings were applied.
      "myFile.sol": {
        // Source code library name and address where it is deployed.
        "MyLib": "0x123123..."
      }
    },

    // Optional: Version of EVM solx will produce bytecode for.
    // Supported EVM versions: "cancun", "prague", "osaka".
    // For instance, with version "osaka", solx will be producing `clz` instructions, whereas for older EVM versions it will not.
    // The oldest supported EVM version is "cancun".
    // Default: "osaka".
    "evmVersion": "osaka",
    // Optional: Select the desired output.
    // Default: no flags are selected, and no output is generated.
    "outputSelection": {
      "<path>": {
        // Available file-level options, must be listed under "<path>"."":
        "": [
          // AST of all source files.
          "ast",
          // Benchmarks of the solx LLVM-based compilation pipeline and its underlying call to solc.
          "benchmarks"
        ],
        // Available contract-level options, must be listed under "<path>"."<name>":
        "<name>": [
          // Solidity ABI.
          "abi",
          // Metadata.
          "metadata",
          // Developer documentation (natspec).
          "devdoc",
          // User documentation (natspec).
          "userdoc",
          // Slots, offsets and types of the contract's state variables in storage.
          "storageLayout",
          // Slots, offsets and types of the contract's state variables in transient storage.
          "transientStorageLayout",
          // Yul produced by solc.
          // An alias "irOptimized" is supported for compatibility, but it will request unoptimized Yul IR anyway.
          "ir",
          // Everything of the below.
          "evm",
          // Solidity function hashes.
          "evm.methodIdentifiers",
          // EVM assembly produced by solc.
          "evm.legacyAssembly",
          // Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.gasEstimates",
          // Everything that starts with "evm.bytecode".
          "evm.bytecode",
          // Deploy bytecode produced by solx/LLVM.
          // As long as the solx bytecode linker is in experimental stage, all contracts will be compiled if this key is enabled for at least one contract.
          "evm.bytecode.object",
          // Deploy code assembly produced by solx/LLVM.
          "evm.bytecode.llvmAssembly",
          // solx-only: EVM legacy assembly IR (internal representation). Only available for non-viaIR mode.
          "evm.bytecode.evmla",
          // solx-only: Ethereal IR (internal representation). Only available for non-viaIR mode.
          "evm.bytecode.ethir",
          // solx-only: Unoptimized LLVM IR (internal representation).
          "evm.bytecode.llvmIrUnoptimized",
          // solx-only: Optimized LLVM IR (internal representation).
          "evm.bytecode.llvmIr",
          // ELF-wrapped DWARF debug info produced by solx/LLVM. Only available for Solidity source code input.
          "evm.bytecode.debugInfo",
          // Link references for linkers that are to resolve library addresses at deploy time.
          "evm.bytecode.linkReferences",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.bytecode.opcodes",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.bytecode.sourceMap",
          // Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.bytecode.functionDebugData",
          // Unsupported, but emitted as an empty array to preserve compatibility with some toolkits.
          "evm.bytecode.generatedSources",
          // Everything that starts with "evm.deployedBytecode".
          "evm.deployedBytecode",
          // Runtime bytecode produced by solx/LLVM.
          // As long as the solx bytecode linker is in experimental stage, all contracts will be compiled if this key is enabled for at least one contract.
          "evm.deployedBytecode.object",
          // Runtime code assembly produced by solx/LLVM.
          "evm.deployedBytecode.llvmAssembly",
          // solx-only: EVM legacy assembly IR (internal representation). Only available for non-viaIR mode.
          "evm.deployedBytecode.evmla",
          // solx-only: Ethereal IR (internal representation). Only available for non-viaIR mode.
          "evm.deployedBytecode.ethir",
          // solx-only: Unoptimized LLVM IR (internal representation).
          "evm.deployedBytecode.llvmIrUnoptimized",
          // solx-only: Optimized LLVM IR (internal representation).
          "evm.deployedBytecode.llvmIr",
          // Link references for linkers that are to resolve library addresses at deploy time.
          "evm.deployedBytecode.linkReferences",
          // Resolved automatically by solx/LLVM, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.deployedBytecode.immutableReferences",
          // ELF-wrapped DWARF debug info produced by solx/LLVM. Only available for Solidity source code input.
          "evm.deployedBytecode.debugInfo",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.deployedBytecode.opcodes",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.deployedBytecode.sourceMap",
          // Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.deployedBytecode.functionDebugData",
          // Unsupported, but emitted as an empty array to preserve compatibility with some toolkits.
          "evm.deployedBytecode.generatedSources"
        ]
      }
    },
    // Optional: Metadata settings.
    "metadata": {
      // Optional: Use the given hash method for the metadata hash that is appended to the bytecode.
      // Available options: "none", "ipfs".
      // Default: "ipfs".
      "bytecodeHash": "ipfs",
      // Optional: Use only literal content and not URLs.
      // Default: false.
      "useLiteralContent": true,
      // Optional: Whether to include CBOR-encoded metadata at the end of bytecode.
      // Default: true.
      "appendCBOR": true
    },
    // Optional: Enables the IR codegen in solc.
    "viaIR": true,

    // Optional, solx-only: Extra LLVM settings.
    "llvmOptions": [
      "-key", "value"
    ]
  }
}
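
The "sizeFallback" rule from the optimizer settings above can be sketched as a small decision function. This is an illustration only: effective_mode is a hypothetical helper, not a solx API, and it assumes the limit being checked is the EIP-170 runtime code limit of 24576 bytes, which the settings description does not spell out.

```rust
/// EIP-170 limit on deployed (runtime) bytecode, in bytes.
/// Assumption: this is the limit that triggers the fallback.
const RUNTIME_CODE_SIZE_LIMIT: usize = 24_576;

/// Returns the optimizer mode a single contract ends up compiled with.
/// Hypothetical helper: the name and signature are illustrative only.
fn effective_mode(mode: &str, size_fallback: bool, runtime_size: usize) -> &str {
    // The fallback only applies to the default mode "3" and only when enabled.
    if size_fallback && mode == "3" && runtime_size > RUNTIME_CODE_SIZE_LIMIT {
        "z"
    } else {
        mode
    }
}
```

Because the decision is made per contract, one compilation run can yield some contracts built with "3" and others with "z".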

Output JSON

The output JSON contains all artifacts produced by solx and solc together. The example below serves as the specification of the output JSON format.

{
  // Required: File-level outputs.
  "sources": {
    "sourceFile.sol": {
      // Required: Identifier of the source.
      "id": 1,
      // Optional: The AST object.
      // Corresponds to "ast" in the outputSelection settings.
      "ast": {/* ... */}
    }
  },

  // Required: Contract-level outputs.
  "contracts": {
    // The source name.
    "sourceFile.sol": {
      // The contract name.
      // If the language only supports one contract per file, this field equals the source name.
      "ContractName": {
        // Optional: The Ethereum Contract ABI (object).
        // See https://docs.soliditylang.org/en/develop/abi-spec.html.
        // Corresponds to "abi" in the outputSelection settings.
        "abi": [/* ... */],
        // Optional: Storage layout (object).
        // Corresponds to "storageLayout" in the outputSelection settings.
        "storageLayout": {/* ... */},
        // Optional: Transient storage layout (object).
        // Corresponds to "transientStorageLayout" in the outputSelection settings.
        "transientStorageLayout": {/* ... */},
        // Optional: Contract metadata (string).
        // Corresponds to "metadata" in the outputSelection settings.
        "metadata": "/* ... */",
        // Optional: Developer documentation (natspec object).
        // Corresponds to "devdoc" in the outputSelection settings.
        "devdoc": {/* ... */},
        // Optional: User documentation (natspec object).
        // Corresponds to "userdoc" in the outputSelection settings.
        "userdoc": {/* ... */},
        // Optional: Yul produced by solc (string).
        // Corresponds to "ir" in the outputSelection settings.
        "ir": "/* ... */",
        // Optional: EVM target outputs.
        // Corresponds to "evm" in the outputSelection settings.
        "evm": {
          // Optional: EVM assembly produced by solc (object).
          // Corresponds to "evm.legacyAssembly" in the outputSelection settings.
          "legacyAssembly": {/* ... */},
          // Optional: List of function hashes (object).
          // Corresponds to "evm.methodIdentifiers" in the outputSelection settings.
          "methodIdentifiers": {
            // Mapping between the function signature and its hash.
            "delegate(address)": "5c19a95c"
          },
          // Optional: Always empty, included only to preserve compatibility with some toolkits (object).
          // Corresponds to "evm.gasEstimates" in the outputSelection settings.
          "gasEstimates": {},
          // Optional: Deploy EVM bytecode.
          // Corresponds to "evm.bytecode" in the outputSelection settings.
          "bytecode": {
            // Optional: Bytecode (string).
            // Corresponds to "evm.bytecode.object" in the outputSelection settings.
            "object": "5b60806040525f341415601c5763...",
            // Optional: LLVM text assembly (string).
            // Corresponds to "evm.bytecode.llvmAssembly" in the outputSelection settings.
            "llvmAssembly": "/* ... */",
            // Optional, solx-only: EVM legacy assembly IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.bytecode.evmla" in the outputSelection settings.
            "evmla": "/* ... */",
            // Optional, solx-only: Ethereal IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.bytecode.ethir" in the outputSelection settings.
            "ethir": "/* ... */",
            // Optional, solx-only: Unoptimized LLVM IR (string).
            // Corresponds to "evm.bytecode.llvmIrUnoptimized" in the outputSelection settings.
            "llvmIrUnoptimized": "/* ... */",
            // Optional, solx-only: Optimized LLVM IR (string).
            // Corresponds to "evm.bytecode.llvmIr" in the outputSelection settings.
            "llvmIr": "/* ... */",
            // Optional: ELF-wrapped DWARF debug info (string).
            // Corresponds to "evm.bytecode.debugInfo" in the outputSelection settings.
            "debugInfo": "/* ... */",
            // Optional: Link references for linkers that are to resolve library addresses at deploy time (object).
            // Corresponds to "evm.bytecode.linkReferences" in the outputSelection settings.
            "linkReferences": {/* ... */},
            // Optional: Benchmarks of each stage of the compilation on a per-translation unit basis (array).
            // Corresponds to "benchmarks" in the outputSelection settings.
            "benchmarks": [/* ... */],
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.bytecode.opcodes" in the outputSelection settings.
            "opcodes": "",
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.bytecode.sourceMap" in the outputSelection settings.
            "sourceMap": "",
            // Optional: Always empty object, included only to preserve compatibility with some toolkits (object).
            // Corresponds to "evm.bytecode.functionDebugData" in the outputSelection settings.
            "functionDebugData": {},
            // Optional: Always empty array, included only to preserve compatibility with some toolkits (array).
            // Corresponds to "evm.bytecode.generatedSources" in the outputSelection settings.
            "generatedSources": []
          },
          // Optional: Runtime EVM bytecode.
          // Corresponds to "evm.deployedBytecode" in the outputSelection settings.
          "deployedBytecode": {
            // Optional: Bytecode (string).
            // Corresponds to "evm.deployedBytecode.object" in the outputSelection settings.
            "object": "5b60806040525f34141560145760...",
            // Optional: LLVM text assembly (string).
            // Corresponds to "evm.deployedBytecode.llvmAssembly" in the outputSelection settings.
            "llvmAssembly": "/* ... */",
            // Optional, solx-only: EVM legacy assembly IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.deployedBytecode.evmla" in the outputSelection settings.
            "evmla": "/* ... */",
            // Optional, solx-only: Ethereal IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.deployedBytecode.ethir" in the outputSelection settings.
            "ethir": "/* ... */",
            // Optional, solx-only: Unoptimized LLVM IR (string).
            // Corresponds to "evm.deployedBytecode.llvmIrUnoptimized" in the outputSelection settings.
            "llvmIrUnoptimized": "/* ... */",
            // Optional, solx-only: Optimized LLVM IR (string).
            // Corresponds to "evm.deployedBytecode.llvmIr" in the outputSelection settings.
            "llvmIr": "/* ... */",
            // Optional: ELF-wrapped DWARF debug info (string).
            // Corresponds to "evm.deployedBytecode.debugInfo" in the outputSelection settings.
            "debugInfo": "/* ... */",
            // Optional: Link references for linkers that are to resolve library addresses at deploy time (object).
            // Corresponds to "evm.deployedBytecode.linkReferences" in the outputSelection settings.
            "linkReferences": {/* ... */},
            // Optional: Benchmarks of each stage of the compilation on a per-translation unit basis (array).
            // Corresponds to "benchmarks" in the outputSelection settings.
            "benchmarks": [/* ... */],
            // Optional: Resolved by LLVM automatically, so always returned as an empty object (object).
            // Included only to preserve compatibility with some toolkits.
            // Corresponds to "evm.deployedBytecode.immutableReferences" in the outputSelection settings.
            "immutableReferences": {},
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.deployedBytecode.opcodes" in the outputSelection settings.
            "opcodes": "",
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.deployedBytecode.sourceMap" in the outputSelection settings.
            "sourceMap": "",
            // Optional: Always empty object, included only to preserve compatibility with some toolkits (object).
            // Corresponds to "evm.deployedBytecode.functionDebugData" in the outputSelection settings.
            "functionDebugData": {},
            // Optional: Always empty array, included only to preserve compatibility with some toolkits (array).
            // Corresponds to "evm.deployedBytecode.generatedSources" in the outputSelection settings.
            "generatedSources": []
          }
        }
      }
    }
  },

  // Optional: Benchmarks of the solx LLVM-based compilation pipeline and its underlying call to solc (array).
  // Corresponds to "benchmarks" in the outputSelection settings.
  "benchmarks": [/* ... */],

  // Optional: Unset if no messages were emitted.
  "errors": [
    {
      // Optional: Location within the source file.
      // Unset if the error is unrelated to input sources.
      "sourceLocation": {
        // Required: The source path.
        "file": "sourceFile.sol",
        // Required: The source location start. Equals -1 if unknown.
        "start": 0,
        // Required: The source location end. Equals -1 if unknown.
        "end": 100
      },
      // Required: Message type.
      // solc errors are listed at https://docs.soliditylang.org/en/latest/using-the-compiler.html#error-types.
      "type": "Error",
      // Required: Component the error originates from.
      "component": "general",
      // Required: Message severity.
      // Possible values: "error", "warning", "info".
      "severity": "error",
      // Optional: Unique code for the cause of the error.
      // Only solc produces error codes for now.
      // solx currently emits errors without codes, but they will be introduced soon.
      "errorCode": "3141",
      // Required: Message.
      "message": "Invalid keyword",
      // Required: Message formatted using the source location.
      "formattedMessage": "sourceFile.sol:100: Invalid keyword"
    }
  ]
}
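
A consumer of the output JSON usually needs to decide whether compilation failed. The sketch below assumes the "errors" array has already been deserialized into a hypothetical Diagnostic struct (only two of the fields are modeled) and treats an unset "errors" key as an empty list.

```rust
/// Minimal stand-in for one entry of the "errors" array.
/// Hypothetical type: only the fields used below are modeled.
struct Diagnostic {
    severity: String,
    formatted_message: String,
}

/// Returns the formatted messages of all entries with severity "error".
/// `errors` is `None` when the "errors" key is unset in the output JSON.
fn fatal_messages(errors: Option<&[Diagnostic]>) -> Vec<&str> {
    errors
        .unwrap_or(&[])
        .iter()
        .filter(|d| d.severity == "error")
        .map(|d| d.formatted_message.as_str())
        .collect()
}
```

Entries with severity "warning" or "info" are skipped, so an empty result means the artifacts under "contracts" can be used.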

Limitations and Differences from Upstream solc

This chapter summarizes where solx differs from upstream solc, and which limitations currently apply.

Compilation Modes

solx supports two codegen pipelines:

  • Yul pipeline: enabled with --via-ir (matching solc's --via-ir flag).
  • Legacy EVM assembly pipeline: the default code generation path.

The --evmla and --ethir debug flags are only available in the legacy (non-via-ir) pipeline.

solc Fork Modifications

The solx-solidity fork includes the following changes relative to upstream solc:

  • extraMetadata output: emits user-defined function metadata (name, entry tag, input/output sizes, AST IDs) used during LLVM lowering.
  • DUPX / SWAPX instructions: extends stack access beyond depth 16 to avoid classic "stack too deep" failures.
  • spillAreaSize setting: configures a memory spill region for values that cannot remain on stack.
  • Function pointer dispatch tables: uses static dispatch through FuncPtrTracker instead of dynamic jump-based dispatch.
  • Simplified try/catch in legacy mode: reduces control-flow complexity for translator compatibility.
  • Bypassed EVM bytecode generation: solx does not use solc's EVM bytecode output; final bytecode is produced by the LLVM backend.
  • Disabled optimizer: the solc optimizer is turned off to preserve function boundaries and metadata validity. All optimization is handled by the LLVM backend.

Behavioral Differences

  • Generated bytecode can differ from upstream solc output because final code generation happens in LLVM.
  • Optimization levels map to LLVM optimization pipelines, not upstream solc optimization heuristics.
  • Final code size can differ from upstream due to LLVM pass behavior.

Unsupported Features

  • CALLCODE is rejected at compile time. Use DELEGATECALL instead.
  • SELFDESTRUCT is rejected at compile time (deprecated by EIP-6049).
  • PC (program counter) is not supported.
  • BLOBHASH and BLOBBASEFEE (EIP-4844/EIP-7516) are rejected at compile time.
  • Inline assembly marked memory-safe can cause errors when spill-area-based lowering is active.
  • Some solc optimizer settings are ignored since the solc optimizer is disabled.

Version Support

  • The solx-solidity fork tracks upstream solc releases.
  • The minimum supported Solidity version matches the forked solc version.

Architecture

solx is an LLVM-based compiler that translates Solidity source code into optimized EVM bytecode.

Components

The compiler consists of three repositories:

  1. solx — The main compiler executable and Rust crates that translate Yul and EVM assembly to LLVM IR.
  2. solx-solidity — An LLVM-friendly fork of the Solidity compiler that emits Yul and EVM assembly.
  3. solx-llvm — A fork of the LLVM framework with an EVM target backend.

Compilation Pipeline

                        ┌─────────────────────────────────────────────┐
                        │                  Frontend                   │
┌──────────┐            │  ┌────────────────┐       ┌──────────────┐  │
│ Solidity │ ────────── │  │ solx-solidity  │ ───── │     solx     │  │
│  source  │            │  │                │       │              │  │
└──────────┘            │  │ Parsing,       │ Yul / │ Yul & EVM    │  │
                        │  │ semantic       │ EVM   │ assembly     │  │
                        │  │ analysis       │ asm   │ translation  │  │
                        │  └────────────────┘       └──────────────┘  │
                        └─────────────────────────────────────────────┘
                                                           │
                                                        LLVM IR
                                                           │
                                                           ▼
                        ┌─────────────────────────────────────────────┐
                        │                 Middle-end                  │
                        │  ┌────────────────────────────────────────┐ │
                        │  │           LLVM Optimizer               │ │
                        │  │                                        │ │
                        │  │  IR transformations and optimizations  │ │
                        │  └────────────────────────────────────────┘ │
                        └─────────────────────────────────────────────┘
                                                           │
                                                     Optimized IR
                                                           │
                                                           ▼
                        ┌─────────────────────────────────────────────┐
                        │                  Backend                    │
                        │  ┌────────────────────────────────────────┐ │
                        │  │         solx-llvm EVM Target           │ │
                        │  │                                        │ │
                        │  │  Instruction selection, register       │ │
                        │  │  allocation, code emission             │ │
                        │  └────────────────────────────────────────┘ │
                        └─────────────────────────────────────────────┘
                                                           │
                                                           ▼
                                                   ┌──────────────┐
                                                   │ EVM bytecode │
                                                   └──────────────┘

Frontend

The frontend transforms Solidity source code into LLVM IR:

  1. solx-solidity parses the Solidity source, performs semantic analysis, and emits either Yul or EVM assembly.
  2. solx reads the Yul or EVM assembly and translates it into LLVM IR.

Middle-end

The LLVM optimizer applies a series of IR transformations to improve code quality and performance. These optimizations are target-independent and work on the LLVM IR representation.

Backend

The solx-llvm EVM target converts optimized LLVM IR into EVM bytecode. This includes:

  • Instruction selection (mapping IR operations to EVM opcodes)
  • Register allocation (managing the EVM stack)
  • Stackification (converting register-based code to stack-based EVM operations)
  • Code emission (generating the final bytecode)

Why a Fork of solc?

The solx-solidity fork includes modifications to make the Solidity compiler output compatible with LLVM IR generation. The upstream solc compiler is designed to emit EVM bytecode directly, but solx needs intermediate representations (Yul or EVM assembly) that can be translated to LLVM IR.

The fork maintains compatibility with upstream solc and tracks its releases.

EVM Assembly Translator

The EVM assembly translator converts legacy EVM assembly (the default solc output) into LLVM IR via an intermediate representation called Ethereal IR (EthIR). The Yul pipeline (--via-ir) bypasses this translator entirely.

Why EthIR?

EVM assembly is stack-based with dynamic jumps, making it difficult to translate directly to LLVM IR, which requires explicit control flow graphs. EthIR bridges this gap by:

  1. Tracking stack state to identify jump destinations at compile time
  2. Cloning blocks reachable from predecessors with different stack states
  3. Reconstructing control flow from stack-based jumps into a static CFG
  4. Resolving function calls using metadata from the solc fork

Translation Pipeline

Solidity source
    │
    ▼
solc (solx-solidity fork)
    │  Emits EVM assembly JSON + extraMetadata
    ▼
Assembly parsing
    │  Parses instructions, resolves dependencies
    ▼
Block construction
    │  Groups instructions between Tag labels
    ▼
EthIR traversal
    │  DFS with stack simulation, block cloning
    ▼
LLVM IR generation
    │  Creates LLVM functions, basic blocks, instructions
    ▼
LLVM optimizer
    │
    ▼
EVM bytecode (via LLVM EVM backend)

Key Data Structures

Assembly

The Assembly struct represents the raw solc output. It contains:

  • code: Flat list of instructions (deploy code)
  • data["0"]: Nested assembly for runtime code
  • data[hex]: Referenced data entries — sub-assemblies, hashes, or resolved contract paths (for CREATE/CREATE2)

Each instruction has a name (opcode), optional value (operand), and optional source location.
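
The layout described above could be modeled roughly as follows. This is a sketch under assumptions, not the actual solx definition: the field types, the Data enum, and the runtime_code helper are illustrative.

```rust
use std::collections::BTreeMap;

/// One instruction: opcode name, optional operand, optional source location.
#[derive(Debug, Clone, PartialEq)]
struct Instruction {
    name: String,
    value: Option<String>,
    /// Hypothetical (begin, end) offsets into the source file.
    source: Option<(i64, i64)>,
}

/// An entry of `data`: a nested assembly, a hash, or a resolved contract path.
#[derive(Debug, Clone, PartialEq)]
enum Data {
    Assembly(Box<Assembly>),
    Hash(String),
    Path(String),
}

/// Raw assembly of one contract: deploy code plus referenced data.
#[derive(Debug, Clone, PartialEq, Default)]
struct Assembly {
    code: Vec<Instruction>,
    data: BTreeMap<String, Data>,
}

impl Assembly {
    /// Returns the runtime code stored under data["0"], if present.
    fn runtime_code(&self) -> Option<&Assembly> {
        match self.data.get("0") {
            Some(Data::Assembly(inner)) => Some(&**inner),
            _ => None,
        }
    }
}
```

Boxing the nested assembly keeps the recursive type finite in size.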

EtherealIR

The top-level container holding:

  • entry_function: The main contract function (deploy + runtime)
  • defined_functions: Internal functions discovered during traversal

Function

The Function struct is the core of the translator. It contains:

  • blocks: BTreeMap<BlockKey, Vec<Block>> — maps each block tag to one or more instances (clones for different stack states)
  • block_hash_index: HashMap<BlockKey, HashSet<u64>> — fast duplicate detection by stack hash
  • stack_size: Maximum stack height observed, used to size LLVM stack allocations

Block

Each Block represents a sequence of instructions between two Tag labels:

  • key: BlockKey (code segment + tag number)
  • instance: Clone index (0, 1, 2... for blocks visited with different stack states)
  • elements: Instructions with full stack state snapshots
  • initial_stack / stack: Stack state at entry and after processing
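
A minimal sketch of how Function and Block might fit together is shown below. The field names follow the descriptions above; CodeSegment, the String placeholders for stack elements, and insert_block are assumptions made for illustration.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

/// Code segment a block belongs to (assumed representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
enum CodeSegment { Deploy, Runtime }

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct BlockKey { segment: CodeSegment, tag: u64 }

#[derive(Debug, Clone)]
struct Block {
    key: BlockKey,
    instance: usize,
    elements: Vec<String>,      // placeholder for instructions with stack snapshots
    initial_stack: Vec<String>, // placeholder for stack elements
    stack: Vec<String>,
}

#[derive(Debug, Default)]
struct Function {
    blocks: BTreeMap<BlockKey, Vec<Block>>,
    block_hash_index: HashMap<BlockKey, HashSet<u64>>,
    stack_size: usize,
}

impl Function {
    /// Registers a new instance of `key` unless one with the same stack hash exists.
    /// Returns the instance index, or `None` for a duplicate.
    fn insert_block(&mut self, key: BlockKey, stack_hash: u64, initial_stack: Vec<String>) -> Option<usize> {
        if !self.block_hash_index.entry(key).or_default().insert(stack_hash) {
            return None;
        }
        self.stack_size = self.stack_size.max(initial_stack.len());
        let instances = self.blocks.entry(key).or_default();
        let instance = instances.len();
        instances.push(Block {
            key,
            instance,
            elements: Vec::new(),
            stack: initial_stack.clone(),
            initial_stack,
        });
        Some(instance)
    }
}
```

Keeping the hash index separate from blocks lets the duplicate check run without scanning the existing instances.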

Stack Elements

The stack tracks six kinds of values:

Variant                Description                     Example
Value(String)          Runtime value (opaque)          Result of ADD, MLOAD
Constant(BigUint)      Compile-time 256-bit constant   0x60, 0xFFFF
Tag(u64)               Block tag (jump target)         Tag 42
Path(String)           Contract dependency path        "SubContract"
Data(String)           Hex data chunk                  "deadbeef"
ReturnAddress(usize)   Function return marker          Return with 2 outputs
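
The six variants map naturally onto a Rust enum. In the sketch below, a 32-byte array stands in for BigUint so the example needs no external crate, and the tags helper is hypothetical.

```rust
/// Stand-in for a 256-bit constant: 32 big-endian bytes.
/// Assumption: the real type is a big integer from an external crate.
type Word = [u8; 32];

#[derive(Debug, Clone, PartialEq, Eq)]
enum Element {
    Value(String),
    Constant(Word),
    Tag(u64),
    Path(String),
    Data(String),
    ReturnAddress(usize),
}

/// Collects the block tags currently on the stack, bottom to top.
/// Hypothetical helper: tags are the only elements that can become jump targets.
fn tags(stack: &[Element]) -> Vec<u64> {
    stack
        .iter()
        .filter_map(|element| match element {
            Element::Tag(tag) => Some(*tag),
            _ => None,
        })
        .collect()
}
```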

Block Cloning and Stack Hashing

The same block may be reached via different code paths with different stack contents. Since the stack determines jump targets (a JUMP pops its destination from the stack), the translator must handle each unique stack state separately.

How It Works

  1. When entering a block, the translator computes a stack hash using XxHash3_64
  2. The hash considers only Tag elements — tags determine control flow, while constants and runtime values affect only data flow
  3. The pair (BlockKey, stack_hash) uniquely identifies a block instance
  4. If this pair has been visited before, the block is skipped (cycle detection)
  5. Otherwise, a new block instance is created

Block "process" reached with stack [T_10, V_x]:  → instance 0
Block "process" reached with stack [T_20, V_y]:  → instance 1 (different tag)
Block "process" reached with stack [T_10, V_z]:  → instance 0 (same hash, reused)

Stack Hash Algorithm

fn hash(&self) -> u64 {
    let mut hasher = XxHash3_64::default();
    for element in self.elements.iter() {
        match element {
            Element::Tag(tag) => hasher.write(&tag.to_le_bytes()),
            _ => hasher.write_u8(0),
        }
    }
    hasher.finish()
}

Only Tag values contribute to the hash. This is intentional: two stack states with the same tags but different runtime values will follow the same control flow path.
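
The property can be checked with a self-contained variant of the function above. To stay within the standard library, this sketch swaps XxHash3_64 for std's DefaultHasher, so the hash values differ from what solx computes. The Element enum is reduced to two variants.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Reduced stack element: a block tag or an opaque runtime value.
#[derive(Debug, Clone)]
enum Element {
    Tag(u64),
    Value(String),
}

/// Hashes a stack so that only tags (and their positions) matter.
fn stack_hash(elements: &[Element]) -> u64 {
    // DefaultHasher is a stand-in for XxHash3_64.
    let mut hasher = DefaultHasher::new();
    for element in elements {
        match element {
            Element::Tag(tag) => hasher.write(&tag.to_le_bytes()),
            // Every non-tag element contributes the same placeholder byte.
            _ => hasher.write_u8(0),
        }
    }
    hasher.finish()
}
```

Two stacks that hold the same tags in the same positions hash identically, whatever runtime values sit between them.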

Traversal Algorithm

The Function::traverse() method performs a depth-first traversal of blocks, simulating EVM execution:

traverse(blocks, extra_metadata):
    worklist ← [(entry_block, empty_stack)]
    visited ← {}

    while worklist is not empty:
        (block_key, stack) ← worklist.pop()    // most recently pushed first, hence depth-first
        hash ← stack.hash()

        if (block_key, hash) in visited:
            continue
        visited.add((block_key, hash))

        block ← blocks[block_key].clone_with(stack)
        for instruction in block:
            simulate_instruction(instruction, stack)
            if instruction is JUMP/JUMPI:
                worklist.push((target_tag, stack))
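
The traversal above can be sketched as runnable Rust on a toy instruction set. Everything here is an assumption made for illustration: the Op enum, the Option<u64> stack encoding, and the use of DefaultHasher in place of XxHash3_64. Execution after a JUMPI continues within the same block to model the fall-through path.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashSet};
use std::hash::Hasher;

/// Toy instruction set (hypothetical).
#[derive(Clone, Copy)]
enum Op { PushTag(u64), PushValue, Jump, JumpI, Stop }

/// Simulated stack: Some(tag) for a block tag, None for an opaque runtime value.
type Stack = Vec<Option<u64>>;

fn stack_hash(stack: &Stack) -> u64 {
    let mut hasher = DefaultHasher::new();
    for element in stack {
        match element {
            Some(tag) => hasher.write(&tag.to_le_bytes()),
            None => hasher.write_u8(0),
        }
    }
    hasher.finish()
}

/// Returns (tag, entry stack) of every block instance in visiting order.
fn traverse(blocks: &BTreeMap<u64, Vec<Op>>, entry: u64) -> Vec<(u64, Stack)> {
    let mut worklist = vec![(entry, Stack::new())];
    let mut visited = HashSet::new();
    let mut instances = Vec::new();
    while let Some((tag, mut stack)) = worklist.pop() {
        // Skip (block, stack hash) pairs that were already processed.
        if !visited.insert((tag, stack_hash(&stack))) {
            continue;
        }
        instances.push((tag, stack.clone()));
        for op in &blocks[&tag] {
            match op {
                Op::PushTag(t) => stack.push(Some(*t)),
                Op::PushValue => stack.push(None),
                Op::Jump | Op::JumpI => {
                    let target = stack.pop().flatten().expect("jump target must be a known tag");
                    if matches!(op, Op::JumpI) {
                        stack.pop(); // the condition
                    }
                    worklist.push((target, stack.clone()));
                    if matches!(op, Op::Jump) {
                        break;
                    }
                }
                Op::Stop => break,
            }
        }
    }
    instances
}
```

When a shared block is reached once with tag 10 and once with tag 20 on the stack, it is visited twice and yields two instances.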

Instruction Simulation

For each instruction, the translator:

  1. Pops the required number of inputs from the simulated stack
  2. Computes the output (compile-time if possible, runtime value otherwise)
  3. Pushes the result onto the stack
  4. For control flow instructions, queues successor blocks

Compile-Time Constant Folding

Arithmetic operations on known values are folded at compile time:

Operands                Result
Constant + Constant     Constant (computed)
Tag + Constant          Tag (if result is valid block)
Tag + Tag               Tag (if result is valid block)
Any other combination   Value (runtime, opaque)

This is critical for resolving jump targets: solc often computes jump destinations via PUSH tag + arithmetic.
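
The folding rules for ADD can be sketched as below. This illustration makes several assumptions: u128 stands in for a 256-bit constant, the operand order Constant + Tag is treated like Tag + Constant, and a sum that does not name a valid block falls back to an opaque value. The names fold_add and tag_or_opaque are hypothetical.

```rust
use std::collections::BTreeSet;

/// Simplified stack element; `u128` stands in for a 256-bit constant.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Element {
    Value(String),
    Constant(u128),
    Tag(u64),
}

/// Keeps `sum` as a tag only if it names an existing block.
fn tag_or_opaque(sum: Option<u128>, valid_tags: &BTreeSet<u64>) -> Element {
    match sum {
        Some(s) if s <= u64::MAX as u128 && valid_tags.contains(&(s as u64)) => {
            Element::Tag(s as u64)
        }
        // Assumption: an invalid tag degrades to an opaque runtime value.
        _ => Element::Value("ADD".to_string()),
    }
}

/// Folds an ADD of two simulated stack elements.
fn fold_add(lhs: &Element, rhs: &Element, valid_tags: &BTreeSet<u64>) -> Element {
    match (lhs, rhs) {
        (Element::Constant(a), Element::Constant(b)) => Element::Constant(a.wrapping_add(*b)),
        (Element::Tag(t), Element::Constant(c)) | (Element::Constant(c), Element::Tag(t)) => {
            tag_or_opaque((*t as u128).checked_add(*c), valid_tags)
        }
        (Element::Tag(a), Element::Tag(b)) => {
            tag_or_opaque(Some(*a as u128 + *b as u128), valid_tags)
        }
        _ => Element::Value("ADD".to_string()),
    }
}
```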

Function Call Detection

The translator identifies function calls using extra metadata from the solc fork. The extraMetadata JSON field lists all user-defined functions with their:

  • Entry tag (in deploy and/or runtime code)
  • Input parameter count
  • Output return value count
  • Function name and AST node ID

When a JUMP targets a known function entry:

  1. The stack is split: return address, arguments, and remaining caller state
  2. A RecursiveCall pseudo-instruction replaces the JUMP
  3. A new Function is created and recursively traversed from the entry block
  4. The caller's stack receives output_size opaque return values

Before JUMP to function "add(uint,uint)":
  Stack: [... | return_tag | arg1 | arg2 | function_entry_tag]

After call detection:
  Instruction: RecursiveCall add(uint,uint), input=2, output=1
  Caller stack: [... | return_value]
  Callee: new Function traversed from entry tag
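
The stack split can be sketched as follows, with the top of the stack at the end of the vector, as in the example above. FunctionInfo mirrors the extraMetadata fields listed earlier. Its shape, the reduced Element enum, and detect_call are assumptions for illustration, and the recursive traversal of the callee is omitted.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Element {
    Tag(u64),
    Value(String),
}

/// Hypothetical view of one extraMetadata entry.
struct FunctionInfo {
    entry_tag: u64,
    input_size: usize,
    output_size: usize,
}

/// Stand-in for the RecursiveCall pseudo-instruction.
#[derive(Debug, PartialEq, Eq)]
struct RecursiveCall {
    return_tag: u64,
    arguments: Vec<Element>,
}

/// If the stack top is a known function entry tag, rewrites the caller stack
/// and returns the call. Leaves the stack untouched otherwise.
fn detect_call(stack: &mut Vec<Element>, functions: &[FunctionInfo]) -> Option<RecursiveCall> {
    let target = match stack.last() {
        Some(Element::Tag(tag)) => *tag,
        _ => return None,
    };
    let info = functions.iter().find(|f| f.entry_tag == target)?;
    let len = stack.len();
    if len < info.input_size + 2 {
        return None;
    }
    // The return tag sits just below the arguments.
    let return_tag = match &stack[len - info.input_size - 2] {
        Element::Tag(tag) => *tag,
        _ => return None,
    };
    stack.pop(); // function entry tag
    let arguments = stack.split_off(stack.len() - info.input_size);
    stack.pop(); // return tag
    // The caller sees output_size opaque return values.
    stack.extend((0..info.output_size).map(|i| Element::Value(format!("return_{}", i))));
    Some(RecursiveCall { return_tag, arguments })
}
```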

LLVM IR Generation

After traversal, the translator generates LLVM IR in several phases:

1. Function Declaration

  • Entry function: Uses the pre-declared contract entry point
  • Defined functions: Creates private LLVM functions with N × i256 parameters and return values (multiple returns use LLVM struct types)

2. Stack Variable Allocation

For each function, stack_size stack slots are allocated as LLVM alloca instructions. These represent the simulated EVM stack as addressable memory:

%stack_0 = alloca i256    ; bottom of stack
%stack_1 = alloca i256
...
%stack_N = alloca i256    ; top of stack

For defined functions, slot 0 is reserved for the return address marker, and input parameters are stored starting from slot 1.

3. Basic Block Creation

Each (BlockKey, instance) pair becomes an LLVM BasicBlock:

block_runtime_42/0:       ; tag 42, first instance
  ...
block_runtime_42/1:       ; tag 42, second instance (different stack state)
  ...

4. Instruction Translation

Each EthIR element calls into_llvm() to generate LLVM instructions. Stack operations map to loads/stores on the allocated stack variables:

EVM Operation   LLVM Translation
PUSH 0x42       store i256 66, ptr %stack_N
DUP2            %v = load i256, ptr %stack_(N-2); store i256 %v, ptr %stack_(N+1)
ADD             %a = load ...; %b = load ...; %r = add i256 %a, %b; store ...
MLOAD           %ptr = load ...; %v = load i256, ptr addrspace(1) %ptr; store ...
JUMP            br label %target_block
JUMPI           %cond = ...; br i1 %cond, label %taken, label %fallthrough

solc Fork Modifications

The EVM assembly translator relies on several modifications in the solx-solidity fork. The most relevant to this pipeline are:

  • extraMetadata output: reports all user-defined functions with entry tags, parameter counts, and AST IDs. Without this, the translator cannot distinguish function calls from arbitrary jumps.
  • Dispatch tables for function pointers: indirect calls are lowered to static dispatch tables instead of dynamic jumps.
  • DUPX / SWAPX instructions: extend stack access beyond depth 16, eliminating "stack too deep" errors.
  • Disabled optimizer: the solc optimizer is disabled to preserve function boundaries and metadata validity. All optimization is handled by the LLVM backend.

For the full list of fork modifications, see Limitations and Differences from solc.

EVM Instructions Reference

This chapter describes how the LLVM EVM backend models EVM instructions and lowers LLVM IR into final opcode sequences.

Instruction Definitions

The EVM instruction set is defined in the LLVM backend via TableGen.

  • It contains opcode definitions, pattern mappings, and EVM-specific pseudo-instructions.
  • It covers roughly 180 instruction forms once TableGen expansions are considered (for example DUP1..16, SWAP1..16, and PUSH families).
  • Instructions are modeled around i256 values, matching the EVM word size.

Address Space Model

The backend uses explicit LLVM address spaces to model EVM memory regions:

| Address space | Value | Meaning |
|---|---|---|
| AS_STACK | 0 | Compiler-managed stack memory model |
| AS_HEAP | 1 | EVM linear memory (MLOAD, MSTORE, MCOPY) |
| AS_CALL_DATA | 2 | Call data region |
| AS_RETURN_DATA | 3 | Return data region |
| AS_CODE | 4 | Code segment |
| AS_STORAGE | 5 | Persistent storage |
| AS_TSTORAGE | 6 | Transient storage |

These constants are defined in the EVM backend header.
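For readers who want to experiment with these numbers outside the backend, here is a small Python sketch that mirrors the table and formats a typed load. The enum and helper names are made up for this example; only the numeric values come from the table above.

```python
from enum import IntEnum


class AddressSpace(IntEnum):
    STACK = 0
    HEAP = 1
    CALL_DATA = 2
    RETURN_DATA = 3
    CODE = 4
    STORAGE = 5
    TSTORAGE = 6


def pointer_type(space):
    # LLVM IR omits the qualifier for the default address space 0.
    if space == AddressSpace.STACK:
        return "ptr"
    return f"ptr addrspace({int(space)})"


def load_i256(dest, space, pointer):
    """Format an i256 load from the given address space as LLVM IR text."""
    return f"{dest} = load i256, {pointer_type(space)} {pointer}"
```

For instance, `load_i256("%v", AddressSpace.HEAP, "%ptr")` yields the same text as the MLOAD load form: a load through `ptr addrspace(1)`.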

Core Instruction Categories

Arithmetic

Arithmetic opcodes map directly to i256 operations or EVM intrinsics:

  • ADD, MUL, SUB, DIV, SDIV, MOD, SMOD
  • ADDMOD, MULMOD, EXP, SIGNEXTEND

For example, ADD is selected from LLVM add i256 patterns.

Memory

Memory instructions operate on the heap address space:

  • MLOAD maps to a load from AS_HEAP
  • MSTORE maps to a store into AS_HEAP
  • MCOPY lowers memory copy operations in heap memory

Storage

Storage instructions map to storage address spaces:

  • SLOAD, SSTORE use AS_STORAGE
  • TLOAD, TSTORE use AS_TSTORAGE

Control Flow

Control flow instructions are selected from LLVM branch forms:

  • JUMP maps from unconditional br
  • JUMPI maps from conditional br i1

The backend also uses helper pseudos (for example JUMP_UNLESS) that are lowered before emission.

Stack

EVM stack manipulation opcodes are emitted as needed:

  • DUP1..DUP16
  • SWAP1..SWAP16
  • POP

They are introduced and optimized by stackification passes rather than directly authored in frontend IR.

Cryptographic

SHA3/KECCAK256 is represented through EVM-specific intrinsic plumbing:

  • Machine instruction: KECCAK256
  • LLVM intrinsic path: llvm.evm.sha3

Runtime Library (evm-stdlib.ll)

The backend links helper wrappers from the EVM runtime standard library:

  • __addmod
  • __mulmod
  • __signextend
  • __exp
  • __byte
  • __sdiv
  • __div
  • __smod
  • __mod
  • __shl
  • __shr
  • __sar
  • __sha3

These wrappers forward to corresponding llvm.evm.* intrinsics.

Stackification Pipeline

The late codegen pipeline converts virtual-register machine IR to valid EVM stack code:

  1. EVMSingleUseExpression: reorders machine instructions into expression-friendly form.
  2. EVMBackwardPropagationStackification: performs backward propagation stackification from register form.
  3. EVMStackSolver and EVMStackShuffler: compute and emit low-cost DUP/SWAP/spill-reload sequences.
  4. EVMPeephole: runs late peephole optimizations before final emission.

Stack Depth Limit

The EVM stack itself can hold up to 1024 items, but DUP and SWAP instructions can only reach the top 16 positions. The backend enforces this 16-slot reach when it emits stack manipulation code.

This limit is exposed by EVMSubtarget::stackDepthLimit().
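The effect of the limit can be sketched in a few lines of Python. This is a simplified model, not the stack solver: the function name and the "spill/reload" label are illustrative, and real decisions also weigh cost.

```python
STACK_DEPTH_LIMIT = 16  # manipulation reach described above


def access_plan(depth):
    """Pick a way to copy a value to the top; depth 1 is the top of the stack."""
    if depth < 1:
        raise ValueError("depth is 1-based")
    if depth <= STACK_DEPTH_LIMIT:
        return f"DUP{depth}"
    # Deeper values are out of reach and must live in memory instead.
    return "spill/reload"
```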

Pseudo-Instructions

Several pseudos are used during lowering and removed or expanded before final bytecode:

  • PUSHDEPLOYADDRESS: materializes deploy-time address usage for libraries.
  • SELECT: models conditional value selection.
  • CONST_I256: represents immediate constants before stackification.
  • COPY_I256: temporary register-copy form before stackification.

Yul Builtins Reference

This chapter lists all Yul builtin functions supported by solx and how each is lowered to LLVM IR for the EVM backend.

Lowering Strategies

Yul builtins are lowered through one of three strategies:

  • Direct LLVM IR: the builtin maps to native LLVM integer or memory operations on i256.
  • LLVM intrinsic: the builtin maps to an llvm.evm.* intrinsic that the EVM backend expands to opcodes.
  • Address space access: the builtin maps to a load or store in a typed LLVM address space (see EVM Instructions: Address Space Model).

Arithmetic

| Builtin | Lowering | Notes |
|---|---|---|
| add | Direct LLVM IR | add i256 |
| sub | Direct LLVM IR | sub i256 |
| mul | Direct LLVM IR | mul i256 |
| div | Direct LLVM IR | Unsigned; returns 0 when divisor is 0 |
| sdiv | Direct LLVM IR | Signed; returns 0 when divisor is 0 |
| mod | Direct LLVM IR | Unsigned; returns 0 when divisor is 0 |
| smod | Direct LLVM IR | Signed; returns 0 when divisor is 0 |
| addmod | Intrinsic llvm.evm.addmod | (x + y) % m without intermediate overflow |
| mulmod | Intrinsic llvm.evm.mulmod | (x * y) % m without intermediate overflow |
| exp | Intrinsic llvm.evm.exp | Exponentiation |
| signextend | Intrinsic llvm.evm.signextend | Sign extend from bit (i*8+7) |
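The notes in the table follow the EVM's arithmetic rules, which differ from plain machine division. A minimal Python reference model is sketched below; it describes the expected results on 256-bit words, not how solx generates code, and all function names are just the builtin names reused for readability.

```python
MASK = (1 << 256) - 1
SIGN_BIT = 1 << 255


def to_signed(x):
    """Interpret a 256-bit word as a two's complement integer."""
    return x - (1 << 256) if x & SIGN_BIT else x


def to_unsigned(x):
    return x & MASK


def div(x, y):
    return 0 if y == 0 else x // y


def sdiv(x, y):
    if y == 0:
        return 0
    a, b = to_signed(x), to_signed(y)
    q = abs(a) // abs(b)                       # truncate toward zero
    return to_unsigned(-q if (a < 0) != (b < 0) else q)


def mod(x, y):
    return 0 if y == 0 else x % y


def smod(x, y):
    if y == 0:
        return 0
    a, b = to_signed(x), to_signed(y)
    r = abs(a) % abs(b)
    return to_unsigned(-r if a < 0 else r)     # sign follows the dividend


def addmod(x, y, m):
    # Python integers do not wrap, so the sum is exact before the reduction.
    return 0 if m == 0 else (x + y) % m


def mulmod(x, y, m):
    return 0 if m == 0 else (x * y) % m


def signextend(i, x):
    if i >= 31:
        return x
    bit = i * 8 + 7
    low = x & ((1 << (bit + 1)) - 1)
    return to_unsigned(low - (1 << (bit + 1))) if low >> bit else low
```

Note how `addmod(MASK, 2, 10)` differs from a wrapped `add` followed by `mod`: the exact sum is reduced, so the 257th bit is not lost.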

Comparison

| Builtin | Lowering | Notes |
|---|---|---|
| lt | Direct LLVM IR | Unsigned less-than |
| gt | Direct LLVM IR | Unsigned greater-than |
| slt | Direct LLVM IR | Signed less-than |
| sgt | Direct LLVM IR | Signed greater-than |
| eq | Direct LLVM IR | Equality |
| iszero | Direct LLVM IR | Check if zero |

Bitwise

| Builtin | Lowering | Notes |
|---|---|---|
| and | Direct LLVM IR | Bitwise AND |
| or | Direct LLVM IR | Bitwise OR |
| xor | Direct LLVM IR | Bitwise XOR |
| not | Direct LLVM IR | Bitwise NOT |
| shl | Direct LLVM IR | Shift left; shift >= 256 yields 0 |
| shr | Direct LLVM IR | Logical shift right; shift >= 256 yields 0 |
| sar | Direct LLVM IR | Arithmetic shift right; shift >= 256 yields sign-extended value |
| byte | Intrinsic llvm.evm.byte | Extract nth byte |
| clz | Intrinsic llvm.ctlz | Count leading zeros (requires Osaka EVM version) |
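The out-of-range shift rules are easy to get wrong, so here is a small Python reference model of the expected results. It is a sketch of the EVM semantics on 256-bit words (inputs are assumed to be in the range 0 to 2^256 - 1), not a description of the generated LLVM IR. Arguments follow the Yul order, with the shift amount or byte index first.

```python
MASK = (1 << 256) - 1


def shl(shift, value):
    return 0 if shift >= 256 else (value << shift) & MASK


def shr(shift, value):
    return 0 if shift >= 256 else value >> shift


def sar(shift, value):
    negative = (value >> 255) == 1
    if shift >= 256:
        return MASK if negative else 0          # all ones or all zeros
    signed = value - (1 << 256) if negative else value
    return (signed >> shift) & MASK


def byte(n, value):
    # Byte 0 is the most significant byte of the 32-byte word.
    return 0 if n >= 32 else (value >> (8 * (31 - n))) & 0xFF


def clz(value):
    # A zero input has 256 leading zero bits.
    return 256 - value.bit_length()
```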

Hashing

| Builtin | Lowering | Notes |
|---|---|---|
| keccak256 | Intrinsic llvm.evm.sha3 | Keccak-256 over memory range |

Memory

| Builtin | Lowering | Notes |
|---|---|---|
| mload | Address space 1 load | Load 32 bytes from heap memory |
| mstore | Address space 1 store | Store 32 bytes to heap memory |
| mstore8 | Intrinsic llvm.evm.mstore8 | Store single byte to memory |
| mcopy | memcpy in address space 1 | EIP-5656 memory copy |
| msize | Intrinsic llvm.evm.msize | Highest accessed memory index |

Storage

| Builtin | Lowering | Notes |
|---|---|---|
| sload | Address space 5 load | Load from persistent storage |
| sstore | Address space 5 store | Store to persistent storage |
| tload | Address space 6 load | Load from transient storage (EIP-1153) |
| tstore | Address space 6 store | Store to transient storage (EIP-1153) |

Immutables

| Builtin | Lowering | Notes |
|---|---|---|
| loadimmutable | Intrinsic llvm.evm.loadimmutable | Load immutable value with metadata identifier |
| setimmutable | Special | Set immutable value during deployment |

Call Data and Return Data

| Builtin | Lowering | Notes |
|---|---|---|
| calldataload | Address space 2 load | Load 32 bytes from calldata |
| calldatasize | Intrinsic llvm.evm.calldatasize | Size of calldata |
| calldatacopy | memcpy from address space 2 to 1 | Copy calldata to memory |
| returndatasize | Intrinsic llvm.evm.returndatasize | Size of return data |
| returndatacopy | memcpy from address space 3 to 1 | Copy return data to memory |

Code Operations

| Builtin | Lowering | Notes |
|---|---|---|
| codesize | Intrinsic llvm.evm.codesize | Current contract code size |
| codecopy | memcpy from address space 4 to 1 | Copy code to memory |
| extcodesize | Intrinsic llvm.evm.extcodesize | External contract code size |
| extcodecopy | Intrinsic llvm.evm.extcodecopy | Copy external code to memory |
| extcodehash | Intrinsic llvm.evm.extcodehash | Hash of external contract code |

Object and Data Operations

| Builtin | Lowering | Notes |
|---|---|---|
| datasize | Intrinsic llvm.evm.datasize | Size of a named data object |
| dataoffset | Intrinsic llvm.evm.dataoffset | Offset of a named data object |
| datacopy | Same as codecopy | Copy data to memory |

These builtins are used by deploy stubs to reference embedded runtime and dependency objects. See Binary Layout for details.

Event Logging

| Builtin | Lowering | Notes |
|---|---|---|
| log0 | Intrinsic llvm.evm.log0 | Log with 0 topics |
| log1 | Intrinsic llvm.evm.log1 | Log with 1 topic |
| log2 | Intrinsic llvm.evm.log2 | Log with 2 topics |
| log3 | Intrinsic llvm.evm.log3 | Log with 3 topics |
| log4 | Intrinsic llvm.evm.log4 | Log with 4 topics |

Contract Calls

| Builtin | Lowering | Notes |
|---|---|---|
| call | Intrinsic llvm.evm.call | Call with value transfer |
| delegatecall | Intrinsic llvm.evm.delegatecall | Call preserving caller and callvalue |
| staticcall | Intrinsic llvm.evm.staticcall | Read-only call |

Note: callcode is rejected at compile time. Use delegatecall instead.

Contract Creation

| Builtin | Lowering | Notes |
|---|---|---|
| create | Intrinsic llvm.evm.create | Create new contract |
| create2 | Intrinsic llvm.evm.create2 | Create at deterministic address |

Control Flow

| Builtin | Lowering | Notes |
|---|---|---|
| return | Intrinsic llvm.evm.return | Return data from execution |
| revert | Intrinsic llvm.evm.revert | Revert with return data |
| stop | Intrinsic llvm.evm.stop | Stop execution |
| invalid | Intrinsic llvm.evm.invalid | Invalid instruction (consumes all gas) |

Note: selfdestruct is rejected at compile time (deprecated by EIP-6049).

Block and Transaction Context

| Builtin | Lowering | Notes |
|---|---|---|
| address | Intrinsic llvm.evm.address | Current contract address |
| caller | Intrinsic llvm.evm.caller | Message sender |
| callvalue | Intrinsic llvm.evm.callvalue | Wei sent with call |
| gas | Intrinsic llvm.evm.gas | Remaining gas |
| gasprice | Intrinsic llvm.evm.gasprice | Gas price of transaction |
| balance | Intrinsic llvm.evm.balance | Balance of address |
| selfbalance | Intrinsic llvm.evm.selfbalance | Current contract balance |
| origin | Intrinsic llvm.evm.origin | Transaction sender |

Block Information

| Builtin | Lowering | Notes |
|---|---|---|
| blockhash | Intrinsic llvm.evm.blockhash | Hash of given block |
| number | Intrinsic llvm.evm.number | Current block number |
| timestamp | Intrinsic llvm.evm.timestamp | Block timestamp |
| coinbase | Intrinsic llvm.evm.coinbase | Block beneficiary |
| difficulty | Intrinsic llvm.evm.difficulty | Block difficulty (pre-merge) |
| prevrandao | Intrinsic llvm.evm.difficulty | Previous RANDAO value (EIP-4399, reuses difficulty) |
| gaslimit | Intrinsic llvm.evm.gaslimit | Block gas limit |
| chainid | Intrinsic llvm.evm.chainid | Chain ID (EIP-1344) |
| basefee | Intrinsic llvm.evm.basefee | Base fee per gas (EIP-1559) |
| blobhash | Rejected at compile time | Versioned hash of transaction's i-th blob (EIP-4844) |
| blobbasefee | Rejected at compile time | Current block's blob base fee (EIP-7516/EIP-4844) |

Note: blobhash and blobbasefee are not yet supported and will produce a compile error.

Special and Meta

| Builtin | Lowering | Notes |
|---|---|---|
| pop | Optimized away | No code generated |
| linkersymbol | Intrinsic llvm.evm.linkersymbol | Library linker placeholder |
| memoryguard | Special | Reserves a memory region; used by solx to configure the spill area for stack-too-deep mitigation |

Binary Layout and Linking

This chapter describes how solx models deploy/runtime bytecode objects, dependency data, and post-compilation linking.

Contract Object Model

EVM contracts have two code segments:

  • Deploy code (init code): runs only during contract creation.
  • Runtime code: returned by deploy code and stored as the contract's permanent code.

Deploy code typically builds runtime bytes in memory and executes RETURN(offset, size).

solc JSON Assembly Layout

In legacy assembly JSON, the object is split into top-level deploy code and nested runtime code:

  • Top-level .code: deploy instruction stream.
  • .data["0"]: runtime object.
  • .data[<hex>]: additional referenced data objects (for example constructor-time dependencies).

Conceptually:

{
  ".code": [ /* deploy instructions */ ],
  ".data": {
    "0": { /* runtime assembly object */ },
    "ab12...": { /* dependency object or hash */ }
  }
}

The EVM assembly layer exposes this as Assembly { code, data }, with runtime_code() reading data["0"].
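To see the layout from a consumer's point of view, the Python sketch below reads such an object and picks out the runtime part. The sample JSON is hand-written for illustration (real instruction entries carry more fields), and the helper names only loosely mirror the accessor mentioned above.

```python
import json


def runtime_code(assembly):
    """Return the runtime object stored under .data["0"], or None if absent."""
    return assembly.get(".data", {}).get("0")


def dependency_keys(assembly):
    """Return the keys of all data entries other than the runtime object."""
    return sorted(k for k in assembly.get(".data", {}) if k != "0")


# Hand-written sample; the instruction fields are simplified.
sample = json.loads("""
{
  ".code": [{"name": "PUSH", "value": "80"}],
  ".data": {
    "0": {".code": [{"name": "STOP"}]},
    "ab12": {".code": []}
  }
}
""")
```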

Dependencies and CREATE / CREATE2

Factory-style deploy code can reference other contract objects. In assembly, this is represented via data entries and push-style aliases:

  • PUSH [$] (PUSH_DataOffset) for object offset
  • PUSH #[$] (PUSH_DataSize) for object size
  • PUSH data (PUSH_Data) for raw dependency chunks

These operands are resolved during assembly preprocessing before LLVM lowering.

Deploy Stub Shape

The minimal deploy stub pattern is:

  1. Load runtime size (datasize).
  2. Load runtime offset (dataoffset).
  3. Copy bytes from code section to memory.
  4. Return copied bytes.

The EVM codegen emits this canonical form in minimal_deploy_code() using:

  • llvm.evm.datasize(metadata !"...")
  • llvm.evm.dataoffset(metadata !"...")
  • llvm.memcpy from addrspace(4) (code) to addrspace(1) (heap)
  • llvm.evm.return
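The four steps can be modeled in Python as a plain byte copy. This is only a mental model of what the stub does at run time: the function name is hypothetical, and copying to memory offset 0 is an assumption made for the example.

```python
def run_deploy_stub(code, runtime_offset, runtime_size):
    """Simulate the minimal deploy stub over a contract's code bytes."""
    # Steps 1-2: the size and offset arrive as arguments (datasize / dataoffset).
    memory = bytearray(runtime_size)                      # heap, address space 1
    chunk = code[runtime_offset:runtime_offset + runtime_size]
    memory[0:len(chunk)] = chunk                          # step 3: copy from code
    return bytes(memory)                                  # step 4: returned bytes
```

The returned bytes are what the EVM stores as the contract's permanent code.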

datasize / dataoffset Builtins

Yul builtins datasize(<object>) and dataoffset(<object>) lower to EVM intrinsics with metadata object names.

In solx, these are translated to LLVM intrinsics:

  • llvm.evm.datasize
  • llvm.evm.dataoffset

This is how deploy stubs reference embedded runtime/dependency objects without hardcoding absolute byte offsets.

Metadata Hash and CBOR Tail

Runtime bytecode may include CBOR metadata appended at the end.

  • The payload can include compiler version info and optional metadata hash fields.
  • Hash behavior is configurable with --metadata-hash (for example ipfs).
  • CBOR appending can be disabled with --no-cbor-metadata.

In the build pipeline, metadata bytes are appended to runtime objects before final assembly/linking.

Library Linking

Library references are resolved at link time:

  • The linker patches linker symbols with final addresses.
  • If a symbol is unresolved, solx records its offsets and emits placeholders in hex output.
  • Placeholder format follows the common pattern __$<keccak-256-digest>$__.

Standard JSON output reports unresolved positions through evm.*.linkReferences so external tooling can link later.
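External tooling that links later typically scans the hex output for placeholders and patches in addresses. The Python sketch below assumes the solc convention of 34 hex digits between the `__$` and `$__` markers (40 characters, the width of a 20-byte address) and hex output without a `0x` prefix; both are assumptions to check against the actual output, and the function names are hypothetical.

```python
import re

# Assumed placeholder shape: __$ + 34 hex digits + $__ (40 characters in total).
PLACEHOLDER = re.compile(r"__\$([0-9a-f]{34})\$__")


def find_link_references(hex_code):
    """Return (byte_offset, digest) for each unresolved placeholder."""
    return [(m.start() // 2, m.group(1)) for m in PLACEHOLDER.finditer(hex_code)]


def link(hex_code, addresses):
    """Patch placeholders; addresses maps a digest to a 40-hex-char address."""
    def replace(match):
        address = addresses.get(match.group(1))
        return match.group(0) if address is None else address

    return PLACEHOLDER.sub(replace, hex_code)
```

Placeholders without a known address are left untouched, so linking can be done in several passes.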

Dependency Resolution and Path Aliasing

The assembly preprocessor performs a normalization pass over all contracts before lowering:

  1. Hash deploy and runtime sub-objects.
  2. Build hash -> full contract path mapping.
  3. Rewrite .data entries from embedded objects to stable path references (Data::Path).
  4. Build index mappings for deploy and runtime dependency tables.
  5. Replace instruction aliases (PUSH_DataOffset, PUSH_DataSize, PUSH_Data) with resolved identifiers.

Two details are important:

  • Entry "0" is always treated as runtime code and mapped to <contract>.runtime.
  • Hex indices are normalized to 32-byte (64 hex char) aliases before lookup, so short keys and padded keys resolve consistently.

This path aliasing step gives deterministic dependency identifiers for later object assembly and linking.
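The two details can be sketched in Python as follows. Left-padding with zeros and lowercasing are assumptions about how the normalization works, and the function names and the `hash_to_path` table are hypothetical.

```python
def normalize_index(key):
    """Pad a hex index to a 64-character (32-byte) alias."""
    digits = key.lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) > 64:
        raise ValueError("index longer than 32 bytes")
    return digits.rjust(64, "0")          # assumed: pad on the left with zeros


def resolve(contract, key, hash_to_path):
    """Map a .data key to a stable path reference."""
    if key == "0":
        return f"{contract}.runtime"      # entry "0" is always the runtime code
    return hash_to_path[normalize_index(key)]
```

With this scheme, a short key such as `ab12` and its fully padded form look up the same entry.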

Testing

This page describes how to run tests for the solx compiler and the format of test files.

Unit and CLI Tests

Run the standard Rust test suite:

# Run all tests (unit + CLI)
cargo test

# Run only unit tests
cargo test --lib

# Run only CLI/integration tests
cargo test --test cli

# Run a specific test
cargo test --test cli -- cli::bin::default

Integration Tests

The solx-tester tool runs integration tests by compiling contracts and executing them with revm.

# Build the compiler and tester
cargo build --release

# Run all integration tests
./target/release/solx-tester --solidity-compiler ./target/release/solx

# Run tests for a specific file
./target/release/solx-tester --solidity-compiler ./target/release/solx --path tests/solidity/simple/default.sol

# Run only Yul IR pipeline tests (excludes EVMLA pipeline)
./target/release/solx-tester --solidity-compiler ./target/release/solx --via-ir

# Run tests with specific optimizer settings
./target/release/solx-tester --solidity-compiler ./target/release/solx --optimizer M3B3

# Combine filters: Yul IR pipeline with M3B3 optimizer
./target/release/solx-tester --solidity-compiler ./target/release/solx --via-ir --optimizer M3B3

Filtering Options

  • --via-ir — Run only tests using the Yul IR pipeline (codegen Y). Without this flag, both Yul IR and EVMLA pipelines are tested.
  • --optimizer <PATTERN> — Filter by optimizer settings. Examples:
    • M3B3 — Match exact optimizer level
    • M^B3 — Match M3 or Mz with B3
    • M*B* — Match any M and B levels
  • --path <PATTERN> — Run only tests whose path contains the pattern.
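The optimizer patterns can be read as character-by-character wildcards. The Python sketch below reproduces the three documented examples; it is an interpretation of those examples, not the tester's implementation, and the function name is hypothetical.

```python
def optimizer_matches(pattern, mode):
    """Check an optimizer mode such as "M3B3" against a pattern such as "M^B3"."""
    if len(pattern) != len(mode):
        return False
    for expected, actual in zip(pattern, mode):
        if expected == "*":
            continue                      # any level
        if expected == "^":
            if actual not in "3z":        # M3 or Mz, per the example above
                return False
        elif expected != actual:
            return False
    return True
```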

Foundry and Hardhat Projects

The solx-dev tool can run tests against real-world Foundry and Hardhat projects:

# Build solx-dev
cargo build --release --bin solx-dev

# Run Foundry project tests
./target/release/solx-dev test foundry --test-config-path solx-dev/foundry-tests.toml

# Run Hardhat project tests
./target/release/solx-dev test hardhat --test-config-path solx-dev/hardhat-tests.toml

The test configurations list projects that are cloned and tested automatically. See foundry-tests.toml and hardhat-tests.toml for the full list of tested projects.

Test Collection

This section describes the format of test files used by solx-tester.

Test Types

The repository contains three types of tests:

  • Upstream — Tests following the Solidity semantic test format.
  • Simple — Single-contract tests.
  • Complex — Multi-contract tests and vendored DeFi projects.

Test data is located in:

  • tests/solidity/ — Solidity test contracts
  • tests/yul/ — Yul test contracts
  • tests/llvm-ir/ — LLVM IR test contracts

Test Format

Each test comprises source code files and metadata. Simple tests have only one source file, and their metadata is written in comments that start with !, for example, //! for Solidity. Complex tests use a test.json file to describe their metadata and refer to source code files.
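As an illustration of the simple-test layout, the Python sketch below collects the `//!` lines of a Solidity file and parses them as JSON. The sample test and the helper name are made up for this example; the metadata fields it uses are described in the sections that follow.

```python
import json


def extract_metadata(source, marker="//!"):
    """Join all comment lines starting with the marker and parse them as JSON."""
    lines = [
        line.strip()[len(marker):]
        for line in source.splitlines()
        if line.strip().startswith(marker)
    ]
    return json.loads("\n".join(lines))


# Hypothetical simple test: metadata in //! comments, then the contract.
SAMPLE = '''//! { "cases": [ {
//!     "name": "default",
//!     "inputs": [ { "method": "main", "calldata": [] } ],
//!     "expected": [ "42" ]
//! } ] }

contract Test {
    function main() external pure returns (uint256) { return 42; }
}
'''
```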

Metadata

Metadata is a JSON object that contains the following fields:

  • cases — An array of test cases (described below).
  • contracts — Used for complex tests to describe the contract instances to deploy. In simple tests, only one Test contract instance is deployed.
"contracts": {
    "Main": "main.sol:Main",
    "Callable": "callable.sol:Callable"
}
  • libraries — An optional field that specifies library addresses for linker:
"libraries": {
    "libraries/UQ112x112.sol": { "UQ112x112": "UQ112x112" },
    "libraries/Math.sol": { "Math": "Math" }
}
  • ignore — An optional flag that disables a test.
  • modes — An optional field that specifies mode filters. Y stands for Yul pipeline, E for EVM assembly pipeline. Compiler versions can be specified as SemVer ranges:
"modes": [
    "Y",
    "E",
    "E >=0.8.30"
]
  • group — An optional string field that specifies a test group for benchmarking.

Test Cases

All test cases are executed in a clean context, making them independent of each other.

Each test case contains the following fields:

  • name — A string name.
  • comment — An optional string comment.
  • inputs — An array of inputs (described below).
  • expected — The expected return data for the last input.
  • ignore, modes — Same as in test metadata.

Inputs

Inputs specify the contract calls in the test case:

  • comment — An optional string comment.
  • instance — The contract instance to call. Default: Test.
  • caller — The caller address. Default: 0xdeadbeef01000000000000000000000000000000.
  • method — The method to call:
    1. #deployer for the deployer call.
    2. #fallback to perform a call with raw calldata.
    3. Any other string is recognized as a function name. The function selector will be prepended to the calldata.
  • calldata — The input calldata:
    1. A hexadecimal string: "calldata": "0x00"
    2. An array of numbers (hex, decimal, or instance addresses). Each number is padded to 32 bytes: "calldata": ["1", "2"]
  • value — An optional msg.value, a decimal number with wei or ETH suffix.
  • storage — Storage values to set before the call:
"storage": {
    "Test.address": ["1", "2", "3", "4"]
}
  • expected — The expected return data:
    1. An array of numbers: "expected": ["1", "2"]
    2. Extended format with return_data, exception, and events:
"expected": {
    "return_data": ["0x01"],
    "events": [
        {
            "topics": [
                "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
            ],
            "values": ["0xff"]
        }
    ],
    "exception": false
}

The expected field can be an array of objects if different expected data is needed for different compiler versions. Use compiler_version as a SemVer range in extended expected format.

Notes:

  • InstanceName.address can be used in expected, calldata, and storage fields to insert a contract instance address.
  • If a deployer call is not specified for an instance, it will be generated automatically with empty calldata.
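The calldata rules above can be sketched in Python as shown below. The helper names and the `instances` mapping from instance name to address are hypothetical. Prepending the function selector is left out because it needs Keccak-256, which the Python standard library does not provide (`hashlib.sha3_256` uses different padding).

```python
def parse_number(item, instances):
    """Turn one calldata item into an integer."""
    if item.endswith(".address"):
        # InstanceName.address resolves to that instance's address.
        return instances[item[: -len(".address")]]
    if item.lower().startswith("0x"):
        return int(item, 16)
    return int(item, 10)


def encode_calldata(calldata, instances=None):
    """Encode either a raw hex string or an array of 32-byte padded numbers."""
    instances = instances or {}
    if isinstance(calldata, str):
        digits = calldata[2:] if calldata.startswith("0x") else calldata
        return bytes.fromhex(digits)
    return b"".join(
        parse_number(item, instances).to_bytes(32, "big") for item in calldata
    )
```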

Upstream Solidity Semantic Tests

These tests follow the Solidity semantic test format. Test descriptions and expected results are embedded as comments in the test file. Lines begin with // for Solidity files. The beginning of the test description is indicated by a comment line containing ----.
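A minimal Python sketch of splitting such a file into source and expectations is shown below. The sample contract and the function name are illustrative, and the sketch ignores details of the upstream format beyond the `----` delimiter.

```python
def split_semantic_test(text):
    """Split a semantic test into (source, expectation lines)."""
    source, expectations, in_description = [], [], False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_description and stripped.startswith("//") and "----" in stripped:
            in_description = True         # everything after this is the description
            continue
        if in_description:
            if stripped.startswith("//"):
                expectations.append(stripped[2:].strip())
        else:
            source.append(line)
    return "\n".join(source), expectations


SAMPLE = '''contract C {
    function f() public pure returns (uint256) { return 7; }
}
// ----
// f() -> 7
'''
```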

Debugging and Inspecting Compiler Output

This guide shows how to use solx debug flags to inspect intermediate representations at each compilation stage.

IR Dump Flags

Each flag writes files to the output directory (-o):

| Flag | Extension | Description |
|---|---|---|
| --evmla | .evmla | EVM legacy assembly from solc (legacy pipeline only) |
| --ethir | .ethir | EthIR (translated from EVM assembly, legacy pipeline only) |
| --ir / --ir-optimized | .yul | Yul IR from solc |
| --emit-llvm-ir | .unoptimized.ll, .optimized.ll | LLVM IR before and after optimization |
| --asm | .asm | Final EVM assembly |

The --debug-info and --debug-info-runtime flags are output selectors that print deploy and runtime debug info to stdout (or to files when -o is used). They are not IR dump flags.

Example:

solx contract.sol -o ./debug --evmla --ethir --emit-llvm-ir --asm --overwrite

This produces one file per contract per stage in ./debug/.

Quick Dump with SOLX_OUTPUT_DIR

Setting the SOLX_OUTPUT_DIR environment variable enables all IR dumps at once without listing individual flags:

export SOLX_OUTPUT_DIR=./ir_dumps
solx contract.sol

This writes all applicable IR files for every contract, with automatic overwrite. Which files are produced depends on the pipeline used: the Yul pipeline dumps Yul and LLVM IR, while the legacy pipeline dumps EVMLA, EthIR, and LLVM IR.

Benchmarking

The --benchmarks flag prints timing information for each pipeline stage:

solx contract.sol --benchmarks

Output includes per-contract compilation timing in milliseconds.

LLVM Diagnostics

Two flags control LLVM-level diagnostics:

  • --llvm-verify-each — runs LLVM IR verification after every optimization pass. Useful for catching miscompilations. Silent on success; only reports errors when verification fails.
  • --llvm-debug-logging — enables detailed LLVM pass execution logging to stderr. Shows which passes and analyses run, with instruction counts.

solx contract.sol --llvm-verify-each --llvm-debug-logging

LLVM Options Pass-Through

Arbitrary LLVM backend options can be passed with --llvm-options:

solx contract.sol --llvm-options='-evm-metadata-size 10'

The value must be a single string following =. See the LLVM Options guide for available options, including EVM backend options and standard LLVM diagnostic options like -time-passes and -stats.

Optimization Levels

solx maps optimization levels to LLVM pipelines:

| Flag | Middle-end | Size level | Back-end |
|---|---|---|---|
| -O1 | Less | Zero | Less |
| -O2 | Default | Zero | Default |
| -O3 (default) | Aggressive | Zero | Aggressive |
| -Os | Default | S | Aggressive |
| -Oz | Default | Z | Aggressive |

The default is -O3, optimizing for runtime performance.

The optimization level can also be set with the SOLX_OPTIMIZATION environment variable (values: 1, 2, 3, s, z).

Size Fallback

The --optimization-size-fallback flag (or SOLX_OPTIMIZATION_SIZE_FALLBACK env var) recompiles with -Oz when bytecode exceeds the 24,576-byte EVM contract size limit (EIP-170). When triggered, output files include a .size_fallback suffix.
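The fallback logic amounts to a size check followed by a second compilation. The Python sketch below models it with a stand-in `compile_fn` callback; the function names are hypothetical, and which bytecode object solx measures is not specified here, so the sketch simply measures whatever the callback returns.

```python
EIP170_LIMIT = 24_576  # maximum contract code size in bytes (EIP-170)


def compile_with_fallback(compile_fn, level="3", size_fallback=False):
    """Return (bytecode, fallback_used) for the given optimization level."""
    code = compile_fn(level)
    if size_fallback and level != "z" and len(code) > EIP170_LIMIT:
        # Too large: recompile for size, as -Oz would.
        return compile_fn("z"), True
    return code, False
```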

Spill Area Suffix

When the compiler uses a memory spill region to mitigate stack-too-deep errors, output files include an .o{offset}s{size} suffix indicating the spill area parameters. For example: MyContract.o256s1024.ethir.
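When post-processing dump directories, the suffix can be parsed with a regular expression. The Python sketch below is based only on the example file name above: it assumes decimal offset and size values, and the function name is hypothetical.

```python
import re

SPILL_SUFFIX = re.compile(
    r"^(?P<name>.+)\.o(?P<offset>\d+)s(?P<size>\d+)\.(?P<ext>[A-Za-z0-9.]+)$"
)


def parse_spill_suffix(filename):
    """Return (contract, offset, size, extension), or None without a spill suffix."""
    match = SPILL_SUFFIX.match(filename)
    if match is None:
        return None
    return (
        match["name"],
        int(match["offset"]),
        int(match["size"]),
        match["ext"],
    )
```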

Typical Debugging Workflow

  1. Reproduce the issue with a minimal Solidity file.
  2. Dump all IRs using SOLX_OUTPUT_DIR:
    SOLX_OUTPUT_DIR=./debug solx contract.sol
    
  3. Inspect stage by stage:
    • Yul pipeline: Yul → LLVM IR (unoptimized) → LLVM IR (optimized) → assembly.
    • Legacy pipeline: EVMLA → EthIR → LLVM IR (unoptimized) → LLVM IR (optimized) → assembly.
  4. Narrow down which stage introduces the problem.
  5. Use LLVM verification if the issue is in the optimizer:
    solx contract.sol --llvm-verify-each --emit-llvm-ir -o ./debug --overwrite
    
  6. Compare with solc using the integration tester:
    cargo run --release --bin solx-tester -- \
      --solidity-compiler ./target/release/solx \
      --path contract.sol
    

LLVM Options

This guide documents LLVM backend options available in solx through the --llvm-options flag.

Usage

Pass options as a single string after =:

solx contract.sol --llvm-options='-option1 value1 -option2 value2'

EVM Backend Options

These options are specific to the custom LLVM EVM backend and affect compilation behavior directly.

-evm-stack-region-size <value>

Sets the stack spill region size in bytes. The compiler uses this region to spill values that cannot remain on the EVM stack (stack-too-deep mitigation). Normally set automatically based on optimizer settings. Requires -evm-stack-region-offset to be set as well.

-evm-stack-region-offset <value>

Sets the stack spill region memory offset. Normally set automatically to match the solc user memory offset.

-evm-metadata-size <value>

Sets the metadata size hint used by the backend for gas and code size tradeoff decisions.

Standard LLVM Diagnostic Options

Standard LLVM diagnostic options can be passed through --llvm-options and their output is printed to stderr. Some options (such as -debug and -debug-only) require LLVM built with assertions enabled (-DLLVM_ENABLE_ASSERTIONS=ON). When building from source, pass --enable-assertions to solx-dev llvm build.

-time-passes

Print timing information for each LLVM pass.

solx contract.sol --bin --llvm-options='-time-passes'

-stats

Print statistics from LLVM passes (number of transformations applied, etc.).

-print-after-all

Print LLVM IR after every optimization pass. This produces very large output (tens of thousands of lines) but is useful for tracing pass behavior.

-print-before-all

Print LLVM IR before every optimization pass.

-debug-only=<pass-name>

Enable debug output for a specific LLVM pass. Note that --llvm-debug-logging controls pass-builder logging specifically, not the general LLVM DEBUG() macro categories.

CLI Debug Flags

These are top-level solx flags (not passed through --llvm-options):

| Flag | Effect |
|---|---|
| --llvm-verify-each | Run IR verifier after each LLVM pass. Silent on success; produces an error if verification fails. |
| --llvm-debug-logging | Enable pass-builder debug logging. Shows which passes and analyses run, with instruction counts. |

See the Debugging guide for the full set of diagnostic flags.

Building with Sanitizers

This guide explains how to build solx with sanitizers enabled.

Introduction

Sanitizers are tools that help find bugs in code. They are used to detect memory corruption, leaks, and undefined behavior. The most common sanitizers are AddressSanitizer, MemorySanitizer, and ThreadSanitizer.

If you are not familiar with sanitizers, see the official documentation.

Who is this guide for?

This guide is for developers who want to debug issues with solx.

Prerequisites

For a sanitizer build to work, the host LLVM compiler used to build LLVM MUST be the same version as the LLVM that rustc uses internally to build solx.

You can check the LLVM version used by rustc by running rustc --version --verbose.

Build steps

A sanitizer-enabled build takes two general steps:

  1. Build the LLVM framework with the required sanitizer enabled.
  2. Build solx with the LLVM build from the previous step.

Please follow the common installation instructions up to the LLVM build step.

This guide assumes a build with AddressSanitizer enabled.

Build LLVM with sanitizer enabled

When building LLVM, use the --sanitizer <sanitizer> option and set the build type to RelWithDebInfo:

./target/release/solx-dev llvm build --sanitizer=Address --build-type=RelWithDebInfo

Please note that the default Apple Clang compiler is not compatible with Rust. You need to install LLVM using Homebrew and specify the path to the LLVM compiler in the --extra-args option. For example:

./target/release/solx-dev llvm build --sanitizer=Address --build-type=RelWithDebInfo --extra-args '-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang' '-DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++'

Build solx with the sanitizer enabled

To build solx with the sanitizer enabled, set the RUSTFLAGS environment variable to -Z sanitizer=address and run the cargo build command. Sanitizer support is available only in the nightly Rust compiler, so it is recommended to set the RUSTC_BOOTSTRAP=1 environment variable before the build.

It is also mandatory to use the --target option to specify the target architecture; otherwise, the build will fail. Check the table below to find the correct target for your platform.

| Platform | LLVM Target Triple |
|---|---|
| MacOS-arm64 | aarch64-apple-darwin |
| MacOS-x86 | x86_64-apple-darwin |
| Linux-arm64 | aarch64-unknown-linux-gnu |
| Linux-x86 | x86_64-unknown-linux-gnu |
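If the build is scripted, the triple can be derived from the host platform. The Python sketch below encodes the table above; the function name is hypothetical, and the inputs are the strings that `platform.system()` and `platform.machine()` return on these hosts.

```python
import platform

TARGETS = {
    ("Darwin", "aarch64"): "aarch64-apple-darwin",
    ("Darwin", "x86_64"): "x86_64-apple-darwin",
    ("Linux", "aarch64"): "aarch64-unknown-linux-gnu",
    ("Linux", "x86_64"): "x86_64-unknown-linux-gnu",
}


def target_triple(system=None, machine=None):
    """Look up the target triple for a host; defaults to the current machine."""
    system = system or platform.system()
    machine = machine or platform.machine()
    if machine == "arm64":                # macOS reports arm64 for aarch64
        machine = "aarch64"
    try:
        return TARGETS[(system, machine)]
    except KeyError:
        raise ValueError(f"no sanitizer target listed for {system}/{machine}")
```

The result can then be passed to cargo through the --target option.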

Additionally, for proper symbolization of reports, it is recommended to set the ASAN_SYMBOLIZER_PATH environment variable. For more information, see the symbolizing reports section of the LLVM documentation.

For example, to build solx for MacOS-arm64 with AddressSanitizer enabled, run the following commands:

export RUSTC_BOOTSTRAP=1
export ASAN_SYMBOLIZER_PATH=$(which llvm-symbolizer) # check the path to llvm-symbolizer
TARGET=aarch64-apple-darwin # Change to your target
RUSTFLAGS="-Z sanitizer=address" cargo test --target=${TARGET}

Congratulations! You have successfully built solx with the sanitizers enabled.

Please refer to the official documentation for more information on how to use sanitizers and their types.