Installation

You can start using solx in the following ways:

Use the installation script.
```
curl -L https://raw.githubusercontent.com/NomicFoundation/solx/main/install-solx | bash
```
The script will download the latest stable release of solx and install it in your PATH.

⚠️ The script requires curl to be installed on your system.
This is the recommended way to install solx for MacOS users, as it bypasses Gatekeeper checks.
Download stable releases. See Static Executables.
Build solx from sources. See Building from Source.

System Requirements

It is recommended to have at least 4 GB of RAM to compile large projects. The compilation process is parallelized by default, so the number of threads used is equal to the number of CPU cores.

Large projects can consume a lot of RAM during compilation on machines with a high number of cores. If you encounter memory issues, consider reducing the number of threads using the --threads option.

The table below outlines the supported platforms and architectures:

CPU/OS	MacOS	Linux	Windows
x86_64	✅	✅	✅
arm64	✅	✅	❌

Please avoid using outdated distributions of operating systems, as they may lack the necessary dependencies or include outdated versions of them. solx is only tested on recent versions of popular distributions, such as MacOS 11.0 and Windows 10.

Versioning

The solx version consists of two parts:

solx version itself.
Version of solc libraries solx is statically linked with.

We recommend always using the latest version of solx to benefit from the latest features and bug fixes.

For large codebases, it is more convenient to use solx via toolkits such as Hardhat. These tools manage compiler input and output at a higher level and provide additional features like incremental compilation and caching.

Static Executables

We ship solx binaries on the releases page of the eponymous repository. This repository maintains intuitive and stable naming for the executables and provides a changelog for each release. Tools using solx must download the binaries from this repository and cache them locally.

All executables are statically linked and should work on all recent platforms without issues.

Building from Source

Please consider using the pre-built executables before building from source. Building from source is only necessary for development, research, and debugging purposes. Deployment and production use cases should rely only on the officially released executables.

Install the necessary system-wide dependencies.
- For Linux (Debian):
```
apt install cmake ninja-build curl git libssl-dev pkg-config clang lld
```
- For Linux (Arch):
```
pacman -Syu which cmake ninja curl git pkg-config clang lld
```
- For MacOS:
  1. Install the Homebrew package manager by following the instructions at brew.sh.
  2. Install the necessary system-wide dependencies:
```
brew install cmake ninja coreutils
```
  3. Install a recent build of the LLVM/Clang compiler using one of the following tools:
    - Xcode
    - Apple’s Command Line Tools
    - Your preferred package manager.
Install Rust.

The easiest way is to follow the latest official instructions.

The Rust version used for building is pinned in the rust-toolchain.toml file at the repository root. cargo will automatically download the pinned version of rustc when you start building the project.

Clone and check out this repository with submodules.
```
git clone https://github.com/NomicFoundation/solx --recursive
```
By default, submodule checkout is disabled to prevent cloning large repositories via cargo. If you're building locally, ensure all submodules are checked out with:
```
git submodule update --recursive --checkout
```
Build the development tools.
```
cargo build --release --bin solx-dev
```
Build the LLVM framework using solx-dev.
```
./target/release/solx-dev llvm build --enable-mlir
```
This builds LLVM with the EVM target, MLIR, and LLD projects enabled. The build artifacts will be placed in target-llvm/.

For more information and available build options, run ./target/release/solx-dev llvm build --help.
Build the solc libraries using solx-dev.
```
./target/release/solx-dev solc build
```
This will configure and build the solc libraries in solx-solidity/build/. The command automatically detects MLIR and LLD paths if LLVM was built with those projects.

For more options, run ./target/release/solx-dev solc build --help.
Build the solx executable.
```
cargo build --release
```
The solx executable will appear as ./target/release/solx, where you can run it directly or move it to another location.

If cargo cannot find the LLVM build artifacts, ensure that the LLVM_SYS_211_PREFIX environment variable is not set in your system, as it may be pointing to a location different from the one expected by solx.

Tuning the LLVM build

For more information and available build options, run ./target/release/solx-dev llvm build --help.
The --enable-mlir flag enables MLIR support in the LLVM build (required for MLIR-based optimizations). LLD is always built.
Use the --ccache-variant ccache option to speed up the build process if you have ccache installed.

Building LLVM manually

If you prefer building the LLVM framework manually, include the following flags in your CMake command:

# We recommend using the latest version of CMake.

-DLLVM_TARGETS_TO_BUILD='EVM'
-DLLVM_ENABLE_PROJECTS='lld;mlir'
-DLLVM_ENABLE_RTTI='On'
-DBUILD_SHARED_LIBS='Off'

For most users, solx-dev is the recommended way to build the framework. This section was added for compiler toolchain developers and researchers with specific requirements and experience with the LLVM framework.

Command Line Interface (CLI)

The CLI of solx is designed to mimic that of solc. There are several main input/output (I/O) modes in the solx interface, including the basic CLI and standard JSON.

The basic CLI is simpler and suitable for use from the shell. The standard JSON mode is similar to client-server interaction, and thus more suitable for use from other applications.

All toolkits using solx should operate in standard JSON mode and follow its specification. This makes the toolkits more robust and future-proof, as standard JSON mode is the most versatile and is used by the majority of popular projects.

This page focuses on the basic CLI mode. For more information on the standard JSON mode, see the Standard JSON page.

Basic CLI

Basic CLI mode is the simplest way to compile a file with the source code.

To compile a basic Solidity contract, run the simple example from the --bin section.

The rest of this section describes the available CLI options and their usage. You may also check out solx --help for a quick reference.

`--bin`

Emits the full bytecode.

solx 'Simple.sol' --bin

Output:

======= Simple.sol:Simple =======
Binary:
34601557630000008480630000001a...
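
When this output needs to be consumed from a script, it can be split on the `=======` header lines. The following is a minimal sketch, assuming the output keeps the exact shape shown in the `--bin` and `--bin-runtime` samples (a header, a label line, then one payload line). The function name `parse_bytecode_output` is hypothetical and not part of solx. For anything beyond quick experiments, standard JSON mode is the more robust interface.

```python
import re

# Header line format as shown in the sample output: "======= <path>:<name> ======="
HEADER = re.compile(r"^=======\s+(.+?)\s+=======$")

# Label lines taken from the --bin and --bin-runtime samples in this document.
LABELS = ("Binary:", "Binary of the runtime part:")


def parse_bytecode_output(text):
    """Map each contract header to {label: payload line} for the known labels."""
    result = {}
    contract = None  # dict for the contract currently being read
    pending = None   # label whose payload is expected on the next non-empty line
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            contract = result.setdefault(header.group(1), {})
            pending = None
        elif contract is not None and line in LABELS:
            pending = line.rstrip(":")
        elif contract is not None and pending and line:
            contract[pending] = line
            pending = None
    return result


# Sample copied from the output above; the hex payload is truncated there as well.
sample = """\
======= Simple.sol:Simple =======
Binary:
34601557630000008480630000001a...
"""
print(parse_bytecode_output(sample))
```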

`--bin-runtime`

Emits the runtime part of the bytecode.

solx 'Simple.sol' --bin-runtime

Output:

======= Simple.sol:Simple =======
Binary of the runtime part:
34600b57600336116016575b5f5ffd...

`--asm`

Emits the text assembly produced by LLVM.

solx 'Simple.sol' --asm

Output:

======= Simple.sol:Simple =======
Deploy LLVM EVM assembly:
        .text
        .file   "Simple.sol:Simple"
main:
.func_begin0:
        JUMPDEST
        PUSH1 128
        PUSH1 64
...

Runtime LLVM EVM assembly:
        .text
        .file   "Simple.sol:Simple.runtime"
main:
.func_begin0:
        JUMPDEST
        PUSH1 128
        PUSH1 64
...

`--metadata`

Emits the contract metadata. The metadata is a JSON object that contains information about the contract, such as its name, source code hash, the list of dependencies, compiler versions, and so on.

The solx metadata format is compatible with the Solidity metadata format. This means that the metadata output can be used with other tools that support Solidity metadata. Extra solx data is inserted into solc metadata with this JSON object:

{
  "solx": {
    "llvm_options": [],
    "optimizer_settings": {
      "is_debug_logging_enabled": false,
      "is_fallback_to_size_enabled": false,
      "is_verify_each_enabled": false,
      "level_back_end": "Aggressive",
      "level_middle_end": "Aggressive",
      "level_middle_end_size": "Zero"
    },
    // Optional: only set for Solidity and Yul contracts.
    "solc_version": "0.8.34",
    // Mandatory: current version of solx.
    "solx_version": "0.1.4"
  }
}

Usage:

solx 'Simple.sol' --metadata

Output:

======= Simple.sol:Simple =======
Metadata:
{"compiler":{"version":"0.8.34+commit.e2cbf92c"},"language":"Solidity","output":{"abi":[{"inputs":[],"name":"first","outputs":[{"internalType":"uint64","name":"","type":"uint64"}],"stateMutability":"pure","type":"function"},{"inputs":[],"name":"second","outputs":[{"internalType":"uint256","name":"","type":"uint256"}],"stateMutability":"pure","type":"function"}],"devdoc":{"kind":"dev","methods":{},"version":1},"userdoc":{"kind":"user","methods":{},"version":1}},"settings":{"compilationTarget":{"Simple.sol":"Simple"},"evmVersion":"osaka","libraries":{},"metadata":{"bytecodeHash":"ipfs"},"optimizer":{"enabled":false,"runs":200},"remappings":[]},"solx":{"llvm_options":[],"optimizer_settings":{"is_debug_logging_enabled":false,"is_fallback_to_size_enabled":false,"is_verify_each_enabled":false,"level_back_end":"Aggressive","level_middle_end":"Aggressive","level_middle_end_size":"Zero"},"solc_version":"0.8.34","solx_version":"0.1.4"},"sources":{"Simple.sol":{"keccak256":"0x402fe0b38cc9d81e8c9f6d07854cca27fbb307f06d8a129998026907a10c7ca1","license":"MIT","urls":["bzz-raw://04714cab56c1f931e3cc1ddae4c7ff0c8832d0849e23966c6326028f6783d45a","dweb:/ipfs/QmehmUFKCtytG8WcWQ676KvqwURfkVYK89VHZEvSzyLc2Z"]}},"version":1}

`--ast-json`

Emits the AST of each Solidity file.

solx 'Simple.sol' --ast-json

Output:

======= Simple.sol:Simple =======
JSON AST:
{"absolutePath":".../Simple.sol","exportedSymbols":{"Simple":[24]},"id":25,"license":"MIT","nodeType":"SourceUnit","nodes":[ ... ],"src":"32:288:0"}

Since solx communicates with solc only via standard JSON under the hood, the full JSON AST is emitted instead of the compact one.

`--abi`

Emits the contract ABI specification.

solx 'Simple.sol' --abi

Output:

======= Simple.sol:Simple =======
Contract JSON ABI:
[{"inputs":[],"name":"first","outputs":[{"internalType":"uint64","name":"","type":"uint64"}],"stateMutability":"pure","type":"function"},{"inputs":[],"name":"second","outputs":[{"internalType":"uint256","name":"","type":"uint256"}],"stateMutability":"pure","type":"function"}]

`--hashes`

Emits the contract function signatures.

solx 'Simple.sol' --hashes

Output:

======= Simple.sol:Simple =======
Function signatures:
3df4ddf4: first()
5a8ac02d: second()

`--storage-layout`

Emits the contract storage layout.

solx 'Simple.sol' --storage-layout

Output:

======= Simple.sol:Simple =======
Contract Storage Layout:
{"storage":[{"astId":3,"contract":"Simple.sol:Simple","label":"field_1","offset":0,"slot":"0","type":"t_uint256"},{"astId":5,"contract":"Simple.sol:Simple","label":"field_2","offset":0,"slot":"1","type":"t_uint256"},{"astId":7,"contract":"Simple.sol:Simple","label":"field_3","offset":0,"slot":"2","type":"t_uint256"}],"types":{"t_uint256":{"encoding":"inplace","label":"uint256","numberOfBytes":"32"}}}

`--transient-storage-layout`

Emits the contract transient storage layout.

solx 'Simple.sol' --transient-storage-layout

Output:

======= Simple.sol:Simple =======
Contract Transient Storage Layout:
{"storage":[{"astId":3,"contract":"Simple.sol:Simple","label":"field_1","offset":0,"slot":"0","type":"t_uint256"},{"astId":5,"contract":"Simple.sol:Simple","label":"field_2","offset":0,"slot":"1","type":"t_uint256"},{"astId":7,"contract":"Simple.sol:Simple","label":"field_3","offset":0,"slot":"2","type":"t_uint256"}],"types":{"t_uint256":{"encoding":"inplace","label":"uint256","numberOfBytes":"32"}}}

`--userdoc`

Emits the contract user documentation.

solx 'Simple.sol' --userdoc

Output:

======= Simple.sol:Simple =======
User Documentation:
{"kind":"user","methods":{ ... },"version":1}

`--devdoc`

Emits the contract developer documentation.

solx 'Simple.sol' --devdoc

Output:

======= Simple.sol:Simple =======
Developer Documentation:
{"kind":"dev","methods":{ ... },"version":1}

`--asm-solc-json`

Emits the solc EVM assembly parsed from solc's JSON output.

solx 'Simple.sol' --asm-solc-json

Output:

======= Simple.sol:Simple =======
EVM assembly:
000     PUSH                80
001     MEMORYGUARD
002     PUSH                40
003     MSTORE
...

This is the solc EVM assembly output that is translated to LLVM IR by solx. For solx's own EVM assembly output emitted by LLVM, use the --asm option instead.

`--ir` (or `--ir-optimized`)

Emits the solc Yul IR.

solx does not use the Yul optimizer anymore, so the Yul IR is always unoptimized, and it is not possible to emit solc-optimized Yul IR with solx.

solx 'Simple.sol' --ir

Output:

======= Simple.sol:Simple =======
IR:
/// @use-src 0:"Simple.sol"
object "Simple_24" {
    code {
        {
            ...
        }
    }
    /// @use-src 0:"Simple.sol"
    object "Simple_24_deployed" {
        code {
            {
                ...
            }
        }
        data ".metadata" hex"a26469706673582212206c34df79f8cc8ba870a350940cb8623c60d4f6f9c356e2185b812187d9ae55ee64736f6c63430008220033"
    }
}

`--debug-info`

Emits the ELF-wrapped DWARF debug info of the deploy code.

solx 'Simple.sol' --debug-info

Output:

======= Simple.sol:Simple =======
Debug info:
7f454c46010201ff...

`--debug-info-runtime`

Emits the ELF-wrapped DWARF debug info of the runtime code.

solx 'Simple.sol' --debug-info-runtime

Output:

======= Simple.sol:Simple =======
Debug info of the runtime part:
7f454c46010201ff...

`--evmla`

Emits EVM legacy assembly (intermediate representation from solc).

When used with --output-dir, writes .evmla files to the output directory. Without --output-dir, outputs to stdout.

Usage with --output-dir:

solx 'Simple.sol' --evmla --output-dir './build/'
ls './build/'

Output:

Compiler run successful.
Simple_sol_Simple.evmla
Simple_sol_Simple.runtime.evmla

Usage with stdout:

solx 'Simple.sol' --evmla --bin

Output:

======= Simple.sol:Simple =======
Binary:
...
Deploy EVM legacy assembly:
000     PUSH                80
...

`--ethir`

Emits Ethereal IR (intermediate representation between EVM assembly and LLVM IR).

When used with --output-dir, writes .ethir files to the output directory. Without --output-dir, outputs to stdout.

Usage with --output-dir:

solx 'Simple.sol' --ethir --output-dir './build/'
ls './build/'

Output:

Compiler run successful.
Simple_sol_Simple.ethir
Simple_sol_Simple.runtime.ethir

Usage with stdout:

solx 'Simple.sol' --ethir --bin

Output:

======= Simple.sol:Simple =======
Binary:
...
Deploy Ethereal IR:
function main(0, 0, 0, 0, 0) -> 0, 0, 0, 0 {
...

`--emit-llvm-ir`

Emits LLVM IR (both unoptimized and optimized).

When used with --output-dir, writes .ll files to the output directory. Without --output-dir, outputs to stdout.

Usage with --output-dir:

solx 'Simple.sol' --emit-llvm-ir --output-dir './build/'
ls './build/'

Output:

Compiler run successful.
Simple_sol_Simple.optimized.ll
Simple_sol_Simple.runtime.optimized.ll
Simple_sol_Simple.runtime.unoptimized.ll
Simple_sol_Simple.unoptimized.ll

Usage with stdout:

solx 'Simple.sol' --emit-llvm-ir --bin --via-ir

Output:

======= Simple.sol:Simple =======
Binary:
...
Deploy LLVM IR (unoptimized):
; ModuleID = 'Simple.sol:Simple'
...
Deploy LLVM IR:
; ModuleID = 'Simple.sol:Simple'
...

`--benchmarks`

Emits benchmarks of the solx LLVM-based pipeline and its underlying call to solc.

solx 'Simple.sol' --benchmarks

Output:

Benchmarks:
solc_Solidity_Standard_JSON: 6ms
solx_Solidity_IR_Analysis: 0ms
solx_Compilation: 75ms

======= Simple.sol:Simple =======
Benchmarks:
    Simple.sol:Simple:deploy/EVMAssemblyToLLVMIR/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple:deploy/InitVerify/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple:deploy/OptimizeVerify/M3B3/SpillArea(0): 1ms
    Simple.sol:Simple:runtime/EVMAssemblyToLLVMIR/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple.runtime:runtime/InitVerify/M3B3/SpillArea(0): 0ms
    Simple.sol:Simple.runtime:runtime/OptimizeVerify/M3B3/SpillArea(0): 5ms

Input Files

solx supports multiple input files. The following command compiles two Solidity files and prints the bytecode:

solx 'Simple.sol' 'Complex.sol' --bin

Solidity import remappings are passed the same way as input files, but they are distinguished by a = symbol between source and destination. The following command compiles a Solidity file with a remapping and prints the bytecode:

solx 'Simple.sol' 'github.com/ethereum/dapp-bin/=/usr/local/lib/dapp-bin/' --bin

solx does not handle remappings itself, but only passes them through to solc. Visit the solc documentation to learn more about the processing of remappings.
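
The rule for telling input files from remappings can be illustrated with a few lines of Python. This is only a sketch of the `=` convention described above, not the logic solx or solc actually runs; `split_inputs` is a hypothetical helper, and a file path that itself contains `=` would be classified as a remapping here.

```python
def split_inputs(args):
    """Separate positional arguments into source paths and remappings.

    An argument containing "=" is treated as a remapping, everything
    else as an input file, mirroring the convention described above.
    """
    sources, remappings = [], []
    for arg in args:
        if "=" in arg:
            remappings.append(arg)
        else:
            sources.append(arg)
    return sources, remappings


sources, remappings = split_inputs([
    "Simple.sol",
    "github.com/ethereum/dapp-bin/=/usr/local/lib/dapp-bin/",
])
print(sources)
print(remappings)
```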

`--libraries`

Specifies the libraries to link with compiled contracts. The option accepts multiple string arguments. The safest way is to wrap each argument in single quotes, and separate them with a space.

The specifier has the following format: <ContractPath>:<ContractName>=<LibraryAddress>.

Usage:

solx 'Simple.sol' --bin --libraries 'Simple.sol:Simple=0x1234567890abcdef1234567890abcdef12345678'
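
Tooling that accepts library specifiers in this format may need to convert them into the nested `libraries` mapping used by standard JSON. The sketch below shows one way to do that. The helper names are hypothetical, and the address check (`0x` followed by 40 hexadecimal digits) is an assumption of this example rather than a documented solx validation rule.

```python
import re

# Assumption: a library address is "0x" followed by 40 hex digits (20 bytes).
ADDRESS = re.compile(r"^0x[0-9a-fA-F]{40}$")


def parse_library(specifier):
    """Split '<ContractPath>:<ContractName>=<LibraryAddress>' into its parts."""
    location, equals, address = specifier.rpartition("=")
    path, colon, name = location.rpartition(":")
    if not equals or not colon or not path or not name:
        raise ValueError(f"malformed library specifier: {specifier!r}")
    if not ADDRESS.match(address):
        raise ValueError(f"malformed library address: {address!r}")
    return path, name, address


def to_libraries_mapping(specifiers):
    """Build the {path: {name: address}} mapping used by standard JSON input."""
    libraries = {}
    for specifier in specifiers:
        path, name, address = parse_library(specifier)
        libraries.setdefault(path, {})[name] = address
    return libraries


mapping = to_libraries_mapping(
    ["Simple.sol:Simple=0x1234567890abcdef1234567890abcdef12345678"]
)
print(mapping)
```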

`--base-path`, `--include-path`, `--allow-paths`

These options specify Solidity import resolution settings. They are not used by solx itself and are only passed through to solc, like import remappings.

Visit the solc documentation to learn more about the processing of these options.

`--output-dir`

Specifies the output directory for build artifacts. Can only be used in basic CLI mode.

Usage in basic CLI mode:

solx 'Simple.sol' --bin --asm --metadata --output-dir './build/'
ls './build/'

Output:

Compiler run successful. Artifact(s) can be found in directory "build".
Simple_sol_Simple.asm
Simple_sol_Simple.bin
Simple_sol_Simple.runtime.asm
Simple_sol_Simple_llvm.asm
Simple_sol_Simple_llvm.asm-runtime
Simple_sol_Simple_meta.json

`--overwrite`

Overwrites the output files if they already exist in the output directory. By default, solx does not overwrite existing files.

Can only be used in combination with the --output-dir option.

Usage:

solx 'Simple.sol' --bin --output-dir './build/' --overwrite

If the --overwrite option is not specified and the output files already exist, solx will print an error message and exit:

Error: Refusing to overwrite an existing file "./build/Simple_sol_Simple.bin" (use --overwrite to force).

`--version`

Prints the version of solx and the hash of the LLVM commit it was built with.

Usage:

solx --version

`--help`

Prints the help message.

Usage:

solx --help

Other I/O Modes

The mode-altering CLI options are mutually exclusive. This means that only one of the options below can be enabled at a time:

`--standard-json`

For the standard JSON mode usage, see the Standard JSON page.

solx Compilation Settings

The options in this section configure only the solx compiler and do not affect the underlying solc compiler.

`--threads`

Sets the number of threads used for parallel compilation. Each thread compiles a separate translation unit in a child process. By default, the number of threads equals the number of CPU cores.

Large projects can consume a lot of RAM during compilation on machines with a high number of cores. If you encounter memory issues, consider reducing the number of threads.

Usage:

solx 'Simple.sol' --bin --threads 4

`--optimization` (or `-O`)

Sets the optimization level of the LLVM optimizer. Available values are:

Level	Meaning	Hints
0	No optimization	For fast compilation during development (unsupported)
1	Performance: basic	For optimization research
2	Performance: default	For optimization research
3	Performance: aggressive	Best performance for production
s	Size: default	For optimization research
z	Size: aggressive	Best size for contracts with size constraints

For most cases, it is fine to keep the default value of 3. You should only use the level z if you are ready to deliberately sacrifice performance and optimize for size.

Large contracts may hit the EVM bytecode size limit. In this case, it is recommended to use the --optimization-size-fallback option rather than setting the level to z.

Usage:

solx 'Simple.sol' --bin -O3

This option can also be set with the SOLX_OPTIMIZATION environment variable, which is useful for toolkits where arbitrary solx-specific options are not supported:

SOLX_OPTIMIZATION='3' solx 'Simple.sol' --bin
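
A toolkit that launches solx as a child process could set this variable programmatically. The following sketch only prepares the environment; `solx_environment` is a hypothetical helper, and the list of accepted levels is copied from the table above.

```python
import os

# Levels copied from the optimization table above.
LEVELS = ("0", "1", "2", "3", "s", "z")


def solx_environment(level, base=None):
    """Return a copy of the environment with SOLX_OPTIMIZATION set."""
    if level not in LEVELS:
        raise ValueError(f"unknown optimization level: {level!r}")
    env = dict(os.environ if base is None else base)
    env["SOLX_OPTIMIZATION"] = level
    return env


env = solx_environment("3", base={"PATH": "/usr/bin"})
print(env["SOLX_OPTIMIZATION"])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when invoking solx.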

`--optimization-size-fallback`

Sets the optimization level to z for contracts that failed to compile because they exceeded the bytecode size limit.

Under the hood, this option automatically triggers recompilation of contracts with level z. Contracts that were successfully compiled with the original --optimization setting are not recompiled.

For deployment, it is recommended to enable this option in order to mitigate potential issues with EVM bytecode size constraints on a per-contract basis. If your environment does not have bytecode size limitations, for example when running forge test, it is better to disable it to prevent unnecessary recompilations.

Usage:

solx 'Simple.sol' --bin -O3 --optimization-size-fallback

This option can also be set with the SOLX_OPTIMIZATION_SIZE_FALLBACK environment variable, which is useful for toolkits where arbitrary solx-specific options are not supported:

SOLX_OPTIMIZATION_SIZE_FALLBACK= solx 'Simple.sol' --bin -O3

`--metadata-hash`

Specifies the hash format used for contract metadata.

Usage with ipfs:

solx 'Simple.sol' --bin --metadata-hash 'ipfs'

Output with ipfs:

======= Simple.sol:Simple =======
Binary:
34601557630000008480630000001a6080396080f35b5f5ffdfe34600b5760...
a2646970667358221220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df64736f6c637816736f6c783a302e312e343b736f6c633a302e382e33340047

The byte array starting with a2 at the end of the bytecode contains the CBOR-encoded compiler version data and an optional metadata hash.

The last two bytes of the metadata (0x0047) are not a part of the CBOR payload, but the length of it, which must be known to correctly decode the payload.

JSON representation of the CBOR payload:

{
    // Optional: included if `--metadata-hash` is set to `ipfs`.
    "ipfs": "1220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df",

    // Required: consists of semicolon-separated pairs of colon-separated compiler names and versions.
    // `solx:<version>` is always included.
    // `solc:<version>` is only included for Solidity and Yul contracts, but not included for LLVM IR ones.
    "solc": "solx:0.1.4;solc:0.8.34"
}

For more information on these formats, see the CBOR and IPFS documentation.
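
The trailer layout described above can be checked with a short script. The sketch below is a minimal decoder, assuming the payload only uses the CBOR features seen in this sample (a map with text keys and byte-string or text-string values, with lengths encoded in at most two bytes). It is not a general CBOR implementation, and the function names are hypothetical. The hex string is the trailer from the sample output, split into parts for readability.

```python
# Trailer from the `--metadata-hash 'ipfs'` sample output, split by field.
TRAILER_HEX = (
    "a2"                                            # map with 2 entries
    "6469706673"                                    # text key "ipfs"
    "5822"                                          # byte string, 34 bytes
    "1220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df"
    "64736f6c63"                                    # text key "solc"
    "7816"                                          # text string, 22 bytes
    "736f6c783a302e312e343b736f6c633a302e382e3334"  # "solx:0.1.4;solc:0.8.34"
    "0047"                                          # payload length, big-endian
)


def read_head(data, pos):
    """Read a CBOR item head: (major type, length or count, next position)."""
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info == 24:
        return major, data[pos], pos + 1
    if info == 25:
        return major, int.from_bytes(data[pos:pos + 2], "big"), pos + 2
    raise ValueError("length encoding not handled by this sketch")


def read_item(data, pos):
    """Decode a byte string (2), text string (3), or map (5)."""
    major, length, pos = read_head(data, pos)
    if major == 2:
        return data[pos:pos + length], pos + length
    if major == 3:
        return data[pos:pos + length].decode("utf-8"), pos + length
    if major == 5:
        items = {}
        for _ in range(length):
            key, pos = read_item(data, pos)
            items[key], pos = read_item(data, pos)
        return items, pos
    raise ValueError("CBOR type not handled by this sketch")


def decode_trailer(bytecode):
    """Use the last two bytes as the payload length, then decode the payload."""
    length = int.from_bytes(bytecode[-2:], "big")
    payload = bytecode[-2 - length:-2]
    decoded, end = read_item(payload, 0)
    if end != len(payload):
        raise ValueError("trailing bytes after the CBOR map")
    return decoded


decoded = decode_trailer(bytes.fromhex(TRAILER_HEX))
print(decoded["solc"])        # solx:0.1.4;solc:0.8.34
print(decoded["ipfs"].hex())
```

Because the length is read from the end, the same function also works when the trailer is preceded by the rest of the bytecode.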

`--no-cbor-metadata`

Disables the CBOR metadata that is appended at the end of bytecode. This option is useful for debugging and research purposes.

It is not recommended to use this option in production, as it is not possible to verify contracts deployed without metadata.

Usage:

solx 'Simple.sol' --no-cbor-metadata

`--llvm-options`

Specifies additional options for the LLVM framework. The argument must be a single quoted string following a = separator.

Usage:

solx 'Simple.sol' --bin --llvm-options='-key=value'

The --llvm-options option is experimental and must only be used by experienced users. All supported options will be documented in the future.

solc Compilation Settings

The options in this section configure only solc: they are passed directly to its child process and do not affect the solx compiler.

`--via-ir`

Switches the solc codegen to Yul a.k.a. IR.

Usage:

solx 'Simple.sol' --bin --via-ir

`--evm-version`

Specifies the EVM version solx will produce bytecode for. For instance, with version osaka, solx produces clz instructions, whereas for older EVM versions it does not.

Only the following EVM versions are supported:

cancun
prague
osaka (default)

Usage:

solx 'Simple.sol' --bin --evm-version 'osaka'

`--metadata-literal`

Tells solc to store referenced sources as literal data in the metadata output.

This option only affects the contract metadata output produced by solc, and does not affect artifacts produced by solx.

Usage:

solx 'Simple.sol' --bin --metadata --metadata-literal

`--no-import-callback`

Disables the default import resolution callback in solc.

This parameter is used by some tooling that resolves all imports by itself, such as Hardhat.

Usage:

solx 'Simple.sol' --no-import-callback

Multi-Language Support

solx supports input in multiple programming languages: Solidity, Yul, and LLVM IR.

The following sections outline how to use solx with these languages.

`--yul` (or `--strict-assembly`)

Enables the Yul mode. In this mode, input is expected to be in the Yul language. The output works the same way as with Solidity input.

Usage:

solx --yul 'Simple.yul' --bin

Output:

======= Simple.yul =======
Binary:
5b60806040525f341415601c5763...

`--llvm-ir`

Enables the LLVM IR mode. In this mode, input is expected to be in the LLVM IR language. The output works the same way as with Solidity input.

In this mode, every input file is treated as runtime code, while deploy code will be generated automatically by solx. It is not possible to write deploy code manually yet, but it will be supported in the future.

Unlike solc, solx is an LLVM-based compiler toolchain, so it uses LLVM IR as an intermediate representation. It is not recommended to write LLVM IR manually, but it can be useful for debugging and optimization purposes. LLVM IR is lower-level than Yul and EVM assembly in the solx IR hierarchy.

Usage:

solx --llvm-ir 'Simple.ll' --bin

Output:

======= Simple.ll =======
Binary:
5b60806040525f341415601c5763...

Debugging

IR Output Flags

For selective IR output, use the following flags with --output-dir:

--evmla - EVM legacy assembly
--ethir - Ethereal IR
--emit-llvm-ir - LLVM IR (unoptimized and optimized)
--asm - LLVM EVM assembly

These flags respect the --overwrite option. Without --overwrite, the compiler will refuse to overwrite existing files.

`SOLX_OUTPUT_DIR` Environment Variable

For debugging purposes, all intermediate build artifacts can be dumped to a directory using the SOLX_OUTPUT_DIR environment variable. This is useful for toolkits where arbitrary solx-specific options are not supported.

When this environment variable is set, solx will output all intermediate representations to the specified directory, always overwriting existing files.

The intermediate build artifacts include:

Name	Extension
EVM Assembly	evmla
EthIR	ethir
Yul	yul
LLVM IR	ll
LLVM Assembly	asm

Usage:

SOLX_OUTPUT_DIR='./debug/' solx 'Simple.sol' --bin
ls './debug/'

Output:

Simple_sol_Simple.evmla
Simple_sol_Simple.ethir
Simple_sol_Simple.unoptimized.ll
Simple_sol_Simple.optimized.ll
Simple_sol_Simple.asm
Simple_sol_Simple.runtime.evmla
Simple_sol_Simple.runtime.ethir
Simple_sol_Simple.runtime.unoptimized.ll
Simple_sol_Simple.runtime.optimized.ll
Simple_sol_Simple.runtime.asm

The output file name is constructed as follows: <ContractPath>_<ContractName>.<Modifiers>.<Extension>.
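
The naming scheme can be reproduced for the file names listed above with a small helper. This is a sketch inferred from those examples only: it assumes that dots in the contract path are replaced with underscores (as in `Simple.sol` becoming `Simple_sol`) and does not cover how other characters, such as directory separators, are handled. `artifact_name` is a hypothetical function.

```python
def artifact_name(contract_path, contract_name, extension, modifiers=()):
    """Build '<ContractPath>_<ContractName>.<Modifiers>.<Extension>'.

    Assumption: dots in the contract path become underscores, as seen
    in the sample file names. Other path characters are left untouched.
    """
    stem = contract_path.replace(".", "_") + "_" + contract_name
    return ".".join([stem, *modifiers, extension])


print(artifact_name("Simple.sol", "Simple", "evmla"))
print(artifact_name("Simple.sol", "Simple", "ll", ("runtime", "optimized")))
```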

Additionally, it is possible to dump the standard JSON input file with the SOLX_STANDARD_JSON_DEBUG environment variable:

SOLX_STANDARD_JSON_DEBUG='./debug/input.json' solx 'Simple.sol' --bin
cat './debug/input.json' | jq .

`--llvm-verify-each`

Enables the verification of the LLVM IR after each optimization pass. This option is useful for debugging and research purposes.

Usage:

solx 'Simple.sol' --bin --llvm-verify-each

`--llvm-debug-logging`

Enables the debug logging of the LLVM IR optimization passes. This option is useful for debugging and research purposes.

Usage:

solx 'Simple.sol' --bin --llvm-debug-logging

Standard JSON

Standard JSON is a protocol for interaction with the solx and solc compilers. This protocol must be implemented by toolkits such as Hardhat.

The protocol uses two data formats for communication: input JSON and output JSON.

Usage

The input JSON can be provided as a file path via the --standard-json option:

solx --standard-json './input.json'

Alternatively, the input JSON can be fed to solx via stdin:

cat './input.json' | solx --standard-json

You can also insert your standard JSON input directly into the command line:

solx --standard-json

<paste into stdin here and press Ctrl-D>

For the sake of interface unification, solx always returns with exit code 0 and prints its standard JSON output to stdout. This differs from solc, which may return with exit code 1 and a free-form error in some cases, such as when the standard JSON input file is missing, even though the solc documentation claims otherwise.
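
The stdin variant is the one a toolkit would typically use from code. The sketch below builds a small input document and defines a function that pipes it to solx. The helper names, the placeholder contract source, and the `solx` executable name on PATH are assumptions of this example; only input keys that appear in this document's input JSON description are used.

```python
import json
import subprocess


def build_input(path, contract_name, content):
    """Build a minimal standard JSON input for a single Solidity source."""
    return {
        "language": "Solidity",
        "sources": {path: {"content": content}},
        "settings": {
            "outputSelection": {path: {contract_name: ["abi", "evm.bytecode"]}},
        },
    }


def run_solx(standard_json, executable="solx"):
    """Feed the input JSON to solx via stdin and parse the JSON it prints."""
    completed = subprocess.run(
        [executable, "--standard-json"],
        input=json.dumps(standard_json),
        capture_output=True,
        text=True,
        check=False,  # the exit code is not used to detect compilation errors
    )
    return json.loads(completed.stdout)


# Placeholder source used only for illustration.
source = "contract Simple { function first() public pure returns (uint64) { return 42; } }"
standard_json = build_input("Simple.sol", "Simple", source)
print(json.dumps(standard_json, indent=2))
```

With solx installed, `run_solx(standard_json)` would return the parsed output JSON. Since the exit code is always 0, a caller has to inspect that JSON for reported problems rather than rely on the process status.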

Input JSON

The input JSON provides the compiler with the source code and settings for the compilation. The example below serves as the specification of the input JSON format.

This format introduces several solx-specific parameters such as settings.optimizer.sizeFallback. These parameters are marked as solx-only.

On the other hand, parameters that are not mentioned here but are part of the solc standard JSON protocol have no effect in solx.

{
  // Required: Source code language.
  // Currently supported: "Solidity", "Yul", "LLVM IR".
  "language": "Solidity",
  // Required: Source code files to compile.
  // The keys here are the "global" names of the source files. Imports can be using other file paths via remappings.
  "sources": {
    // In a source file entry, either "urls" or "content" must be specified, but not both.
    "myFile.sol": {
      // Required (unless "content" is used): URL(s) to the source file.
      "urls": [
        // In Solidity mode, directories must be added to the command-line via "--allow-paths <path>" for imports to work.
        // It is possible to specify multiple URLs for a single source file. In this case the first successfully resolved URL will be used.
        "/tmp/path/to/file.sol"
      ],
      // Required (unless "urls" is used): Literal contents of the source file.
      "content": "contract settable is owned { uint256 private x = 0; function set(uint256 _x) public { if (msg.sender == owner) x = _x; } }"
    }
  },

  // Required: Compilation settings.
  "settings": {
    // Optional: Optimizer settings.
    "optimizer": {
      // Optional, solx-only: Set the LLVM optimizer level.
      // Available options:
      // -0: do not optimize (unsupported)
      // -1: basic optimizations for gas usage
      // -2: advanced optimizations for gas usage
      // -3: all optimizations for gas usage
      // -s: basic optimizations for bytecode size
      // -z: all optimizations for bytecode size
      // Default: 3.
      "mode": "3",
      // Optional, solx-only: Re-run the compilation with "mode": "z" if the initial compilation exceeds the EVM bytecode size limit.
      // Used on a per-contract basis and applied automatically, so some contracts will end up compiled in the initial mode, and others with "mode": "z".
      // Only activated if "mode" is set to "3", which is the default optimization mode.
      // Default: false.
      "sizeFallback": false
    },

    // Optional: Sorted list of remappings.
    // Important: Only used with Solidity input.
    "remappings": [ ":g=/dir" ],
    // Optional: Addresses of the libraries.
    // If not all library addresses are provided here, it will result in unlinked bytecode files that will require post-compile-time linking before deployment.
    // Important: Only used with Solidity, Yul, and LLVM IR input.
    "libraries": {
      // The top level key is the name of the source file where the library is used.
      // If remappings are used, this source file should match the global path after remappings were applied.
      "myFile.sol": {
        // Source code library name and address where it is deployed.
        "MyLib": "0x123123..."
      }
    },

    // Optional: Version of EVM solx will produce bytecode for.
    // Supported EVM versions: "cancun", "prague", "osaka".
    // For instance, with version "osaka", solx will be producing `clz` instructions, whereas for older EVM versions it will not.
    // The oldest supported EVM version is "cancun".
    // Default: "osaka".
    "evmVersion": "osaka",
    // Optional: Select the desired output.
    // Default: no flags are selected, and no output is generated.
    "outputSelection": {
      "<path>": {
        // Available file-level options, must be listed under "<path>"."":
        "": [
          // AST of all source files.
          "ast",
          // Benchmarks of the solx LLVM-based compilation pipeline and its underlying call to solc.
          "benchmarks"
        ],
        // Available contract-level options, must be listed under "<path>"."<name>":
        "<name>": [
          // Solidity ABI.
          "abi",
          // Metadata.
          "metadata",
          // Developer documentation (natspec).
          "devdoc",
          // User documentation (natspec).
          "userdoc",
          // Slots, offsets and types of the contract's state variables in storage.
          "storageLayout",
          // Slots, offsets and types of the contract's state variables in transient storage.
          "transientStorageLayout",
          // Yul produced by solc.
          // An alias "irOptimized" is supported for compatibility, but it will request unoptimized Yul IR anyway.
          "ir",
          // All of the outputs listed below.
          "evm",
          // Solidity function hashes.
          "evm.methodIdentifiers",
          // EVM assembly produced by solc.
          "evm.legacyAssembly",
          // Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.gasEstimates",
          // Everything that starts with "evm.bytecode".
          "evm.bytecode",
          // Deploy bytecode produced by solx/LLVM.
          // While the solx bytecode linker remains experimental, all contracts will be compiled if this key is enabled for at least one contract.
          "evm.bytecode.object",
          // Deploy code assembly produced by solx/LLVM.
          "evm.bytecode.llvmAssembly",
          // solx-only: EVM legacy assembly IR (internal representation). Only available for non-viaIR mode.
          "evm.bytecode.evmla",
          // solx-only: Ethereal IR (internal representation). Only available for non-viaIR mode.
          "evm.bytecode.ethir",
          // solx-only: Unoptimized LLVM IR (internal representation).
          "evm.bytecode.llvmIrUnoptimized",
          // solx-only: Optimized LLVM IR (internal representation).
          "evm.bytecode.llvmIr",
          // ELF-wrapped DWARF debug info produced by solx/LLVM. Only available for Solidity source code input.
          "evm.bytecode.debugInfo",
          // Link references for linkers that are to resolve library addresses at deploy time.
          "evm.bytecode.linkReferences",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.bytecode.opcodes",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.bytecode.sourceMap",
          // Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.bytecode.functionDebugData",
          // Unsupported, but emitted as an empty array to preserve compatibility with some toolkits.
          "evm.bytecode.generatedSources",
          // Everything that starts with "evm.deployedBytecode".
          "evm.deployedBytecode",
          // Runtime bytecode produced by solx/LLVM.
          // While the solx bytecode linker remains experimental, all contracts will be compiled if this key is enabled for at least one contract.
          "evm.deployedBytecode.object",
          // Runtime code assembly produced by solx/LLVM.
          "evm.deployedBytecode.llvmAssembly",
          // solx-only: EVM legacy assembly IR (internal representation). Only available for non-viaIR mode.
          "evm.deployedBytecode.evmla",
          // solx-only: Ethereal IR (internal representation). Only available for non-viaIR mode.
          "evm.deployedBytecode.ethir",
          // solx-only: Unoptimized LLVM IR (internal representation).
          "evm.deployedBytecode.llvmIrUnoptimized",
          // solx-only: Optimized LLVM IR (internal representation).
          "evm.deployedBytecode.llvmIr",
          // Link references for linkers that are to resolve library addresses at deploy time.
          "evm.deployedBytecode.linkReferences",
          // Resolved automatically by solx/LLVM, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.deployedBytecode.immutableReferences",
          // ELF-wrapped DWARF debug info produced by solx/LLVM. Only available for Solidity source code input.
          "evm.deployedBytecode.debugInfo",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.deployedBytecode.opcodes",
          // Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
          "evm.deployedBytecode.sourceMap",
          // Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
          "evm.deployedBytecode.functionDebugData",
          // Unsupported, but emitted as an empty array to preserve compatibility with some toolkits.
          "evm.deployedBytecode.generatedSources"
        ]
      }
    },
    // Optional: Metadata settings.
    "metadata": {
      // Optional: Use the given hash method for the metadata hash that is appended to the bytecode.
      // Available options: "none", "ipfs".
      // Default: "ipfs".
      "bytecodeHash": "ipfs",
      // Optional: Use only literal content and not URLs.
      // Default: false.
      "useLiteralContent": true,
      // Optional: Whether to include CBOR-encoded metadata at the end of bytecode.
      // Default: true.
      "appendCBOR": true
    },
    // Optional: Enables the IR codegen in solc.
    "viaIR": true,

    // Optional, solx-only: Extra LLVM settings.
    "llvmOptions": [
      "-key", "value"
    ]
  }
}
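
The per-contract effect of the "sizeFallback" setting described above can be pictured as a small decision function. The sketch below is only an illustration: `effective_mode` is a hypothetical helper, and the 24576-byte limit (the EIP-170 limit for deployed code) is an assumption, since this chapter does not state which limit solx checks.

```rust
/// Assumed size limit in bytes (the EIP-170 limit for deployed code).
/// Which limit solx actually checks is not stated in this chapter.
const ASSUMED_SIZE_LIMIT: usize = 24_576;

/// Hypothetical helper: the mode a single contract ends up compiled with.
fn effective_mode(mode: &str, size_fallback: bool, initial_size: usize) -> &str {
    // The fallback is described as active only when the initial mode is "3".
    if size_fallback && mode == "3" && initial_size > ASSUMED_SIZE_LIMIT {
        "z"
    } else {
        mode
    }
}

fn main() {
    // (contract name, bytecode size in bytes after the initial compilation); sample values.
    let contracts = [("Small", 10_000), ("Huge", 30_000)];
    for (name, size) in contracts.iter() {
        println!("{}: mode {}", name, effective_mode("3", true, *size));
    }
}
```

With these sample sizes, only the oversized contract would be recompiled with "z", which matches the note that different contracts can end up in different modes.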

Output JSON

The output JSON contains all artifacts produced by solx and solc together. The example below serves as the specification of the output JSON format.

{
  // Required: File-level outputs.
  "sources": {
    "sourceFile.sol": {
      // Required: Identifier of the source.
      "id": 1,
      // Optional: The AST object.
      // Corresponds to "ast" in the outputSelection settings.
      "ast": {/* ... */}
    }
  },

  // Required: Contract-level outputs.
  "contracts": {
    // The source name.
    "sourceFile.sol": {
      // The contract name.
      // If the language only supports one contract per file, this field equals the source name.
      "ContractName": {
        // Optional: The Ethereum Contract ABI (object).
        // See https://docs.soliditylang.org/en/develop/abi-spec.html.
        // Corresponds to "abi" in the outputSelection settings.
        "abi": [/* ... */],
        // Optional: Storage layout (object).
        // Corresponds to "storageLayout" in the outputSelection settings.
        "storageLayout": {/* ... */},
        // Optional: Transient storage layout (object).
        // Corresponds to "transientStorageLayout" in the outputSelection settings.
        "transientStorageLayout": {/* ... */},
        // Optional: Contract metadata (string).
        // Corresponds to "metadata" in the outputSelection settings.
        "metadata": "/* ... */",
        // Optional: Developer documentation (natspec object).
        // Corresponds to "devdoc" in the outputSelection settings.
        "devdoc": {/* ... */},
        // Optional: User documentation (natspec object).
        // Corresponds to "userdoc" in the outputSelection settings.
        "userdoc": {/* ... */},
        // Optional: Yul produced by solc (string).
        // Corresponds to "ir" in the outputSelection settings.
        "ir": "/* ... */",
        // Optional: EVM target outputs.
        // Corresponds to "evm" in the outputSelection settings.
        "evm": {
          // Optional: EVM assembly produced by solc (object).
          // Corresponds to "evm.legacyAssembly" in the outputSelection settings.
          "legacyAssembly": {/* ... */},
          // Optional: List of function hashes (object).
          // Corresponds to "evm.methodIdentifiers" in the outputSelection settings.
          "methodIdentifiers": {
            // Mapping between the function signature and its hash.
            "delegate(address)": "5c19a95c"
          },
          // Optional: Always empty, included only to preserve compatibility with some toolkits (object).
          // Corresponds to "evm.gasEstimates" in the outputSelection settings.
          "gasEstimates": {},
          // Optional: Deploy EVM bytecode.
          // Corresponds to "evm.bytecode" in the outputSelection settings.
          "bytecode": {
            // Optional: Bytecode (string).
            // Corresponds to "evm.bytecode.object" in the outputSelection settings.
            "object": "5b60806040525f341415601c5763...",
            // Optional: LLVM text assembly (string).
            // Corresponds to "evm.bytecode.llvmAssembly" in the outputSelection settings.
            "llvmAssembly": "/* ... */",
            // Optional, solx-only: EVM legacy assembly IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.bytecode.evmla" in the outputSelection settings.
            "evmla": "/* ... */",
            // Optional, solx-only: Ethereal IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.bytecode.ethir" in the outputSelection settings.
            "ethir": "/* ... */",
            // Optional, solx-only: Unoptimized LLVM IR (string).
            // Corresponds to "evm.bytecode.llvmIrUnoptimized" in the outputSelection settings.
            "llvmIrUnoptimized": "/* ... */",
            // Optional, solx-only: Optimized LLVM IR (string).
            // Corresponds to "evm.bytecode.llvmIr" in the outputSelection settings.
            "llvmIr": "/* ... */",
            // Optional: ELF-wrapped DWARF debug info (string).
            // Corresponds to "evm.bytecode.debugInfo" in the outputSelection settings.
            "debugInfo": "/* ... */",
            // Optional: Link references for linkers that are to resolve library addresses at deploy time (object).
            // Corresponds to "evm.bytecode.linkReferences" in the outputSelection settings.
            "linkReferences": {/* ... */},
            // Optional: Benchmarks of each stage of the compilation on a per-translation unit basis (array).
            // Corresponds to "benchmarks" in the outputSelection settings.
            "benchmarks": [/* ... */],
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.bytecode.opcodes" in the outputSelection settings.
            "opcodes": "",
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.bytecode.sourceMap" in the outputSelection settings.
            "sourceMap": "",
            // Optional: Always empty object, included only to preserve compatibility with some toolkits (object).
            // Corresponds to "evm.bytecode.functionDebugData" in the outputSelection settings.
            "functionDebugData": {},
            // Optional: Always empty array, included only to preserve compatibility with some toolkits (array).
            // Corresponds to "evm.bytecode.generatedSources" in the outputSelection settings.
            "generatedSources": []
          },
          // Optional: Runtime EVM bytecode.
          // Corresponds to "evm.deployedBytecode" in the outputSelection settings.
          "deployedBytecode": {
            // Optional: Bytecode (string).
            // Corresponds to "evm.deployedBytecode.object" in the outputSelection settings.
            "object": "5b60806040525f34141560145760...",
            // Optional: LLVM text assembly (string).
            // Corresponds to "evm.deployedBytecode.llvmAssembly" in the outputSelection settings.
            "llvmAssembly": "/* ... */",
            // Optional, solx-only: EVM legacy assembly IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.deployedBytecode.evmla" in the outputSelection settings.
            "evmla": "/* ... */",
            // Optional, solx-only: Ethereal IR (string). Only available for non-viaIR mode.
            // Corresponds to "evm.deployedBytecode.ethir" in the outputSelection settings.
            "ethir": "/* ... */",
            // Optional, solx-only: Unoptimized LLVM IR (string).
            // Corresponds to "evm.deployedBytecode.llvmIrUnoptimized" in the outputSelection settings.
            "llvmIrUnoptimized": "/* ... */",
            // Optional, solx-only: Optimized LLVM IR (string).
            // Corresponds to "evm.deployedBytecode.llvmIr" in the outputSelection settings.
            "llvmIr": "/* ... */",
            // Optional: ELF-wrapped DWARF debug info (string).
            // Corresponds to "evm.deployedBytecode.debugInfo" in the outputSelection settings.
            "debugInfo": "/* ... */",
            // Optional: Link references for linkers that are to resolve library addresses at deploy time (object).
            // Corresponds to "evm.deployedBytecode.linkReferences" in the outputSelection settings.
            "linkReferences": {/* ... */},
            // Optional: Benchmarks of each stage of the compilation on a per-translation unit basis (array).
            // Corresponds to "benchmarks" in the outputSelection settings.
            "benchmarks": [/* ... */],
            // Optional: Resolved by LLVM automatically, so always returned as an empty object (object).
            // Included only to preserve compatibility with some toolkits.
            // Corresponds to "evm.deployedBytecode.immutableReferences" in the outputSelection settings.
            "immutableReferences": {},
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.deployedBytecode.opcodes" in the outputSelection settings.
            "opcodes": "",
            // Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
            // Corresponds to "evm.deployedBytecode.sourceMap" in the outputSelection settings.
            "sourceMap": "",
            // Optional: Always empty object, included only to preserve compatibility with some toolkits (object).
            // Corresponds to "evm.deployedBytecode.functionDebugData" in the outputSelection settings.
            "functionDebugData": {},
            // Optional: Always empty array, included only to preserve compatibility with some toolkits (array).
            // Corresponds to "evm.deployedBytecode.generatedSources" in the outputSelection settings.
            "generatedSources": []
          }
        }
      }
    }
  },

  // Optional: Benchmarks of the solx LLVM-based compilation pipeline and its underlying call to solc (array).
  // Corresponds to "benchmarks" in the outputSelection settings.
  "benchmarks": [/* ... */],

  // Optional: Unset if no messages were emitted.
  "errors": [
    {
      // Optional: Location within the source file.
      // Unset if the error is unrelated to input sources.
      "sourceLocation": {
        // Required: The source path.
        "file": "sourceFile.sol",
        // Required: The source location start. Equals -1 if unknown.
        "start": 0,
        // Required: The source location end. Equals -1 if unknown.
        "end": 100
      },
      // Required: Message type.
      // solc errors are listed at https://docs.soliditylang.org/en/latest/using-the-compiler.html#error-types.
      "type": "Error",
      // Required: Component the error originates from.
      "component": "general",
      // Required: Message severity.
      // Possible values: "error", "warning", "info".
      "severity": "error",
      // Optional: Unique code for the cause of the error.
      // Only solc produces error codes for now.
      // solx currently emits errors without codes, but they will be introduced soon.
      "errorCode": "3141",
      // Required: Message.
      "message": "Invalid keyword",
      // Required: Message formatted using the source location.
      "formattedMessage": "sourceFile.sol:100: Invalid keyword"
    }
  ]
}
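
A consumer of the output JSON typically needs to decide whether the "errors" array contains anything fatal. The sketch below assumes the JSON has already been parsed into a hand-written `Diagnostic` struct (a hypothetical type that mirrors only two of the fields above). It treats entries with severity "error" as fatal, which is an assumption about how a toolkit might use the field.

```rust
/// Hand-written mirror of one entry of the "errors" array.
/// Only the fields used by this sketch are included.
#[derive(Debug, Clone)]
struct Diagnostic {
    severity: String,
    formatted_message: String,
}

/// Collects the formatted messages of all entries whose severity is "error".
fn error_messages(diagnostics: &[Diagnostic]) -> Vec<&str> {
    diagnostics
        .iter()
        .filter(|d| d.severity == "error")
        .map(|d| d.formatted_message.as_str())
        .collect()
}

/// Made-up sample data; the warning text is invented for illustration.
fn sample() -> Vec<Diagnostic> {
    vec![
        Diagnostic {
            severity: "warning".to_string(),
            formatted_message: "sourceFile.sol:10: Unused variable".to_string(),
        },
        Diagnostic {
            severity: "error".to_string(),
            formatted_message: "sourceFile.sol:100: Invalid keyword".to_string(),
        },
    ]
}

fn main() {
    let diagnostics = sample();
    let errors = error_messages(&diagnostics);
    for message in &errors {
        eprintln!("{}", message);
    }
    println!("fatal messages: {}", errors.len());
}
```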

Limitations and Differences from Upstream solc

This chapter summarizes where solx differs from upstream solc, and which limitations currently apply.

Compilation Modes

solx supports two codegen pipelines:

Yul pipeline: enabled with --via-ir (matching solc's --via-ir flag).
Legacy EVM assembly pipeline: the default code generation path.

The --evmla and --ethir debug flags are only available in the legacy (non-via-ir) pipeline.

solc Fork Modifications

The solx-solidity fork includes the following changes relative to upstream solc:

extraMetadata output: emits user-defined function metadata (name, entry tag, input/output sizes, AST IDs) used during LLVM lowering.
DUPX / SWAPX instructions: extends stack access beyond depth 16 to avoid classic "stack too deep" failures.
spillAreaSize setting: configures a memory spill region for values that cannot remain on stack.
Function pointer dispatch tables: uses static dispatch through FuncPtrTracker instead of dynamic jump-based dispatch.
Simplified try/catch in legacy mode: reduces control-flow complexity for translator compatibility.
Bypassed EVM bytecode generation: solx does not use solc's EVM bytecode output; final bytecode is produced by the LLVM backend.
Disabled optimizer: the solc optimizer is turned off to preserve function boundaries and metadata validity. All optimization is handled by the LLVM backend.

Behavioral Differences

Generated bytecode can differ from upstream solc output because final code generation happens in LLVM.
Optimization levels map to LLVM optimization pipelines, not upstream solc optimization heuristics.
Final code size can differ from upstream due to LLVM pass behavior.

Unsupported Features

CALLCODE is rejected at compile time. Use DELEGATECALL instead.
SELFDESTRUCT is rejected at compile time (deprecated by EIP-6049).
PC (program counter) is not supported.
BLOBHASH and BLOBBASEFEE (EIP-4844/EIP-7516) are rejected at compile time.
Inline assembly marked memory-safe can cause errors when spill-area-based lowering is active.
Some solc optimizer settings are ignored since the solc optimizer is disabled.

Version Support

The solx-solidity fork tracks upstream solc releases.
The minimum supported Solidity version matches the forked solc version.

Architecture

solx is an LLVM-based compiler that translates Solidity source code into optimized EVM bytecode.

Components

The compiler consists of three repositories:

solx — The main compiler executable and Rust crates that translate Yul and EVM assembly to LLVM IR.
solx-solidity — An LLVM-friendly fork of the Solidity compiler that emits Yul and EVM assembly.
solx-llvm — A fork of the LLVM framework with an EVM target backend.

Compilation Pipeline

                        ┌─────────────────────────────────────────────┐
                        │                  Frontend                   │
┌──────────┐            │  ┌────────────────┐       ┌──────────────┐  │
│ Solidity │ ────────── │  │ solx-solidity  │ ───── │     solx     │  │
│  source  │            │  │                │       │              │  │
└──────────┘            │  │ Parsing,       │ Yul / │ Yul & EVM    │  │
                        │  │ semantic       │ EVM   │ assembly     │  │
                        │  │ analysis       │ asm   │ translation  │  │
                        │  └────────────────┘       └──────────────┘  │
                        └─────────────────────────────────────────────┘
                                                           │
                                                        LLVM IR
                                                           │
                                                           ▼
                        ┌─────────────────────────────────────────────┐
                        │                 Middle-end                  │
                        │  ┌────────────────────────────────────────┐ │
                        │  │           LLVM Optimizer               │ │
                        │  │                                        │ │
                        │  │  IR transformations and optimizations  │ │
                        │  └────────────────────────────────────────┘ │
                        └─────────────────────────────────────────────┘
                                                           │
                                                     Optimized IR
                                                           │
                                                           ▼
                        ┌─────────────────────────────────────────────┐
                        │                  Backend                    │
                        │  ┌────────────────────────────────────────┐ │
                        │  │         solx-llvm EVM Target           │ │
                        │  │                                        │ │
                        │  │  Instruction selection, register       │ │
                        │  │  allocation, code emission             │ │
                        │  └────────────────────────────────────────┘ │
                        └─────────────────────────────────────────────┘
                                                           │
                                                           ▼
                                                   ┌──────────────┐
                                                   │ EVM bytecode │
                                                   └──────────────┘

Frontend

The frontend transforms Solidity source code into LLVM IR:

solx-solidity parses the Solidity source, performs semantic analysis, and emits either Yul or EVM assembly.
solx reads the Yul or EVM assembly and translates it into LLVM IR.

Middle-end

The LLVM optimizer applies a series of IR transformations to improve code quality and performance. These optimizations are target-independent and work on the LLVM IR representation.

Backend

The solx-llvm EVM target converts optimized LLVM IR into EVM bytecode. This includes:

Instruction selection (mapping IR operations to EVM opcodes)
Register allocation (managing the EVM stack)
Stackification (converting register-based code to stack-based EVM operations)
Code emission (generating the final bytecode)

Why a Fork of solc?

The solx-solidity fork includes modifications to make the Solidity compiler output compatible with LLVM IR generation. The upstream solc compiler is designed to emit EVM bytecode directly, but solx needs intermediate representations (Yul or EVM assembly) that can be translated to LLVM IR.

The fork maintains compatibility with upstream solc and tracks its releases.

EVM Assembly Translator

The EVM assembly translator converts legacy EVM assembly (the default solc output) into LLVM IR via an intermediate representation called Ethereal IR (EthIR). The Yul pipeline (--via-ir) bypasses this translator entirely.

Why EthIR?

EVM assembly is stack-based with dynamic jumps, whereas LLVM IR requires an explicit control flow graph, so a direct translation is difficult. EthIR bridges this gap by:

Tracking stack state to identify jump destinations at compile time
Cloning blocks reachable from predecessors with different stack states
Reconstructing control flow from stack-based jumps into a static CFG
Resolving function calls using metadata from the solc fork

Translation Pipeline

Solidity source
    │
    ▼
solc (solx-solidity fork)
    │  Emits EVM assembly JSON + extraMetadata
    ▼
Assembly parsing
    │  Parses instructions, resolves dependencies
    ▼
Block construction
    │  Groups instructions between Tag labels
    ▼
EthIR traversal
    │  DFS with stack simulation, block cloning
    ▼
LLVM IR generation
    │  Creates LLVM functions, basic blocks, instructions
    ▼
LLVM optimizer
    │
    ▼
EVM bytecode (via LLVM EVM backend)

Key Data Structures

Assembly

The Assembly struct represents the raw solc output. It contains:

code: Flat list of instructions (deploy code)
data["0"]: Nested assembly for runtime code
data[hex]: Referenced data entries — sub-assemblies, hashes, or resolved contract paths (for CREATE/CREATE2)

Each instruction has a name (opcode), optional value (operand), and optional source location.

EtherealIR

The top-level container holding:

entry_function: The main contract function (deploy + runtime)
defined_functions: Internal functions discovered during traversal

Function

The Function struct is the core of the translator. It contains:

blocks: BTreeMap<BlockKey, Vec<Block>> — maps each block tag to one or more instances (clones for different stack states)
block_hash_index: HashMap<BlockKey, HashSet<u64>> — fast duplicate detection by stack hash
stack_size: Maximum stack height observed, used to size LLVM stack allocations

Block

Each Block represents a sequence of instructions between two Tag labels:

key: BlockKey (code segment + tag number)
instance: Clone index (0, 1, 2... for blocks visited with different stack states)
elements: Instructions with full stack state snapshots
initial_stack / stack: Stack state at entry and after processing
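
To make the relationship between these structures concrete, here is a simplified sketch. The field names follow the lists above, but everything else is an assumption: the `CodeSegment` variants, the use of plain strings for elements and stack entries, and the `insert_block` helper are illustrative and are not taken from the solx sources.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

/// Code segment a block belongs to (variant names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
enum CodeSegment {
    Deploy,
    Runtime,
}

/// Block identifier: code segment plus tag number.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct BlockKey {
    segment: CodeSegment,
    tag: u64,
}

/// One instance of a block; elements and stack entries are simplified to strings.
#[derive(Debug, Clone)]
struct Block {
    key: BlockKey,
    instance: usize,
    elements: Vec<String>,
    initial_stack: Vec<String>,
    stack: Vec<String>,
}

#[derive(Debug, Default)]
struct Function {
    blocks: BTreeMap<BlockKey, Vec<Block>>,
    block_hash_index: HashMap<BlockKey, HashSet<u64>>,
    stack_size: usize,
}

impl Function {
    /// Hypothetical helper: registers a block instance unless the same
    /// (key, stack hash) pair was seen before. Returns the new instance index.
    fn insert_block(&mut self, key: BlockKey, stack_hash: u64, initial_stack: Vec<String>) -> Option<usize> {
        if !self.block_hash_index.entry(key).or_default().insert(stack_hash) {
            return None; // duplicate stack state for this block
        }
        self.stack_size = self.stack_size.max(initial_stack.len());
        let instances = self.blocks.entry(key).or_default();
        let instance = instances.len();
        instances.push(Block {
            key,
            instance,
            elements: Vec::new(),
            stack: initial_stack.clone(),
            initial_stack,
        });
        Some(instance)
    }
}

fn main() {
    let mut function = Function::default();
    let key = BlockKey { segment: CodeSegment::Runtime, tag: 42 };
    // The hashes 1 and 2 stand in for real stack hashes.
    println!("{:?}", function.insert_block(key, 1, vec!["T_10".to_string()]));
    println!("{:?}", function.insert_block(key, 2, vec!["T_20".to_string()]));
    println!("{:?}", function.insert_block(key, 1, vec!["T_10".to_string()]));
}
```

The hash index is kept separate from the block map, so the duplicate check does not need to scan the stored instances.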

Stack Elements

The stack tracks six kinds of values:

Variant	Description	Example
`Value(String)`	Runtime value (opaque)	Result of `ADD`, `MLOAD`
`Constant(BigUint)`	Compile-time 256-bit constant	`0x60`, `0xFFFF`
`Tag(u64)`	Block tag (jump target)	Tag 42
`Path(String)`	Contract dependency path	`"SubContract"`
`Data(String)`	Hex data chunk	`"deadbeef"`
`ReturnAddress(usize)`	Function return marker	Return with 2 outputs

Block Cloning and Stack Hashing

The same block may be reached via different code paths with different stack contents. Since the stack determines jump targets (a JUMP pops its destination from the stack), the translator must handle each unique stack state separately.

How It Works

When entering a block, the translator computes a stack hash using XxHash3_64
The hash considers only Tag elements — tags determine control flow, while constants and runtime values affect only data flow
The pair (BlockKey, stack_hash) uniquely identifies a block instance
If this pair has been visited before, the block is skipped (cycle detection)
Otherwise, a new block instance is created

Block "process" reached with stack [T_10, V_x]:  → instance 0
Block "process" reached with stack [T_20, V_y]:  → instance 1 (different tag)
Block "process" reached with stack [T_10, V_z]:  → instance 0 (same hash, reused)

Stack Hash Algorithm

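fn hash(&self) -> u64 {
    let mut hasher = XxHash3_64::default();
    for element in self.elements.iter() {
        match element {
            // Tags determine control flow, so their values feed the hash.
            Element::Tag(tag) => hasher.write(&tag.to_le_bytes()),
            // Every other element contributes a fixed placeholder byte,
            // so the stack depth still affects the hash.
            _ => hasher.write_u8(0),
        }
    }
    hasher.finish()
}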

Only Tag values contribute to the hash. This is intentional: two stack states with the same tags but different runtime values will follow the same control flow path.
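
The property can be checked with a self-contained variant of the function above. This sketch swaps XxHash3_64 for the standard library's `DefaultHasher` so that it needs no external crate, and it uses `u128` as a stand-in for 256-bit constants. Both substitutions are assumptions made for the example, and the resulting hash values differ from what solx computes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

#[derive(Debug, Clone)]
enum Element {
    Tag(u64),
    Constant(u128), // stand-in for a 256-bit constant
    Value(String),
}

struct Stack {
    elements: Vec<Element>,
}

impl Stack {
    fn hash(&self) -> u64 {
        // DefaultHasher replaces XxHash3_64 in this sketch only.
        let mut hasher = DefaultHasher::new();
        for element in self.elements.iter() {
            match element {
                Element::Tag(tag) => hasher.write(&tag.to_le_bytes()),
                _ => hasher.write_u8(0),
            }
        }
        hasher.finish()
    }
}

fn main() {
    let first = Stack { elements: vec![Element::Tag(10), Element::Value("x".to_string())] };
    let second = Stack { elements: vec![Element::Tag(20), Element::Value("y".to_string())] };
    let third = Stack { elements: vec![Element::Tag(10), Element::Constant(96)] };
    println!("first == third: {}", first.hash() == third.hash());
    println!("first == second: {}", first.hash() == second.hash());
}
```

Stacks one and three share the tag `10` in the same position, so they map to the same block instance even though their second elements differ.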

Traversal Algorithm

The Function::traverse() method performs a depth-first traversal of blocks, simulating EVM execution:

traverse(blocks, extra_metadata):
    worklist ← [(entry_block, empty_stack)]    // LIFO, which makes the traversal depth-first
    visited ← {}

    while worklist is not empty:
        (block_key, stack) ← worklist.pop()
        hash ← stack.hash()

        if (block_key, hash) in visited:
            continue
        visited.add((block_key, hash))

        block ← blocks[block_key].clone_with(stack)
        for instruction in block:
            simulate_instruction(instruction, stack)
            if instruction is JUMP/JUMPI:
                worklist.push((target_tag, copy of stack))
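
The loop above can be turned into a small runnable model. The sketch below is a toy under several assumptions: blocks are keyed by tag only, the instruction set is reduced to five hypothetical variants, `DefaultHasher` replaces XxHash3_64, and fallthrough into the next block and function call detection are not modeled.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashSet};
use std::hash::Hasher;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Elem { Tag(u64), Value }

#[derive(Debug, Clone, Copy)]
enum Instr { PushTag(u64), PushValue, Jump, JumpI, Stop }

/// Tags-only stack hash, as described in the previous section.
fn stack_hash(stack: &[Elem]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for elem in stack {
        match elem {
            Elem::Tag(tag) => hasher.write(&tag.to_le_bytes()),
            Elem::Value => hasher.write_u8(0),
        }
    }
    hasher.finish()
}

/// Returns every created block instance as (tag, stack at entry), in visit order.
fn traverse(blocks: &BTreeMap<u64, Vec<Instr>>, entry: u64) -> Vec<(u64, Vec<Elem>)> {
    let mut worklist: Vec<(u64, Vec<Elem>)> = vec![(entry, Vec::new())];
    let mut visited = HashSet::new();
    let mut instances = Vec::new();
    while let Some((tag, mut stack)) = worklist.pop() {
        if !visited.insert((tag, stack_hash(&stack))) {
            continue; // same block with the same tag layout: already handled
        }
        instances.push((tag, stack.clone()));
        for instr in &blocks[&tag] {
            match *instr {
                Instr::PushTag(t) => stack.push(Elem::Tag(t)),
                Instr::PushValue => stack.push(Elem::Value),
                Instr::Jump | Instr::JumpI => {
                    let target = stack.pop(); // the destination is on top of the stack
                    let conditional = matches!(*instr, Instr::JumpI);
                    if conditional {
                        stack.pop(); // the condition
                    }
                    if let Some(Elem::Tag(t)) = target {
                        worklist.push((t, stack.clone()));
                    }
                    if !conditional {
                        break; // an unconditional jump ends the block
                    }
                }
                Instr::Stop => break,
            }
        }
    }
    instances
}

/// Block 2 plays the role of "process": it is entered with two different return tags.
fn example_blocks() -> BTreeMap<u64, Vec<Instr>> {
    let mut blocks = BTreeMap::new();
    blocks.insert(0, vec![Instr::PushValue, Instr::PushTag(10), Instr::PushTag(2), Instr::Jump]);
    blocks.insert(2, vec![Instr::Jump]);
    blocks.insert(10, vec![Instr::PushTag(20), Instr::PushTag(2), Instr::Jump]);
    blocks.insert(20, vec![Instr::PushValue, Instr::PushTag(10), Instr::JumpI, Instr::Stop]);
    blocks
}

fn main() {
    for (tag, stack) in traverse(&example_blocks(), 0) {
        println!("block {} entered with {:?}", tag, stack);
    }
}
```

In this example block 2 yields two instances because it is entered with return tags 10 and 20. The conditional jump from block 20 back to block 10 is skipped because that (block, hash) pair was already visited.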

Instruction Simulation

For each instruction, the translator:

Pops the required number of inputs from the simulated stack
Computes the output (compile-time if possible, runtime value otherwise)
Pushes the result onto the stack
For control flow instructions, queues successor blocks

Compile-Time Constant Folding

Arithmetic operations on known values are folded at compile time:

Operands	Result
`Constant + Constant`	`Constant` (computed)
`Tag + Constant`	`Tag` (if result is valid block)
`Tag + Tag`	`Tag` (if result is valid block)
Any other combination	`Value` (runtime, opaque)

This is critical for resolving jump targets: solc often computes jump destinations via PUSH tag + arithmetic.
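
The folding rules in the table can be sketched as a single function. This is an illustration under stated assumptions: `u128` stands in for 256-bit constants, `valid_tags` is a hypothetical set of known block tags, and the function also accepts `Constant + Tag` because addition is commutative (the table lists only `Tag + Constant`).

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, PartialEq)]
enum Element {
    Value(String),
    Constant(u128), // stand-in for a 256-bit constant
    Tag(u64),
}

/// Folds `a + b` at compile time where the table above allows it.
fn fold_add(a: &Element, b: &Element, valid_tags: &BTreeSet<u64>) -> Element {
    let runtime = || Element::Value("add_result".to_string());
    let sum = match (a, b) {
        (Element::Constant(x), Element::Constant(y)) => {
            return Element::Constant(x.wrapping_add(*y));
        }
        (Element::Tag(t), Element::Constant(c)) | (Element::Constant(c), Element::Tag(t)) => {
            u128::from(*t).wrapping_add(*c)
        }
        (Element::Tag(x), Element::Tag(y)) => u128::from(*x) + u128::from(*y),
        // Any operand that is only known at runtime makes the result opaque.
        _ => return runtime(),
    };
    // A folded tag is kept only if it names an existing block.
    if sum <= u128::from(u64::MAX) && valid_tags.contains(&(sum as u64)) {
        Element::Tag(sum as u64)
    } else {
        runtime()
    }
}

fn main() {
    let valid_tags: BTreeSet<u64> = [42, 50].iter().cloned().collect();
    println!("{:?}", fold_add(&Element::Constant(2), &Element::Constant(3), &valid_tags));
    println!("{:?}", fold_add(&Element::Tag(40), &Element::Constant(2), &valid_tags));
    println!("{:?}", fold_add(&Element::Tag(40), &Element::Constant(3), &valid_tags));
}
```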

Function Call Detection

The translator identifies function calls using extra metadata from the solc fork. The extraMetadata JSON field lists all user-defined functions with their:

Entry tag (in deploy and/or runtime code)
Input parameter count
Output return value count
Function name and AST node ID

When a JUMP targets a known function entry:

The stack is split: return address, arguments, and remaining caller state
A RecursiveCall pseudo-instruction replaces the JUMP
A new Function is created and recursively traversed from the entry block
The caller's stack receives output_size opaque return values

Before JUMP to function "add(uint,uint)":
  Stack: [... | return_tag | arg1 | arg2 | function_entry_tag]

After call detection:
  Instruction: RecursiveCall add(uint,uint), input=2, output=1
  Caller stack: [... | return_value]
  Callee: new Function traversed from entry tag
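
The stack split shown above can be written as a short function. The sketch below is a simplified model: `FunctionInfo`, `RecursiveCall`, and `detect_call` are hypothetical names, the metadata is reduced to four fields, and the recursive traversal of the callee is left out.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Tag(u64),
    Value(String),
}

/// One user-defined function from the extra metadata (field names are assumptions).
struct FunctionInfo {
    name: String,
    entry_tag: u64,
    input_size: usize,
    output_size: usize,
}

/// Pseudo-instruction that replaces the JUMP.
#[derive(Debug, PartialEq)]
struct RecursiveCall {
    name: String,
    return_tag: u64,
    arguments: Vec<Element>,
}

/// Expects the layout [... | return_tag | args... | function_entry_tag].
/// On success the stack becomes [... | return values...]; otherwise it is left untouched.
fn detect_call(stack: &mut Vec<Element>, functions: &[FunctionInfo]) -> Option<RecursiveCall> {
    let entry_tag = match stack.last() {
        Some(Element::Tag(tag)) => *tag,
        _ => return None,
    };
    let info = functions.iter().find(|f| f.entry_tag == entry_tag)?;
    // The return tag sits below the arguments and the entry tag.
    let return_index = stack.len().checked_sub(info.input_size + 2)?;
    let return_tag = match &stack[return_index] {
        Element::Tag(tag) => *tag,
        _ => return None,
    };
    stack.pop(); // function entry tag
    let arguments = stack.split_off(return_index + 1);
    stack.pop(); // return tag
    for index in 0..info.output_size {
        // Return values are opaque to the caller.
        stack.push(Element::Value(format!("{}_ret{}", info.name, index)));
    }
    Some(RecursiveCall { name: info.name.clone(), return_tag, arguments })
}

fn main() {
    let functions = vec![FunctionInfo {
        name: "add(uint,uint)".to_string(),
        entry_tag: 100,
        input_size: 2,
        output_size: 1,
    }];
    let mut stack = vec![
        Element::Value("caller_state".to_string()),
        Element::Tag(7),
        Element::Value("arg1".to_string()),
        Element::Value("arg2".to_string()),
        Element::Tag(100),
    ];
    println!("{:?}", detect_call(&mut stack, &functions));
    println!("caller stack: {:?}", stack);
}
```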

LLVM IR Generation

After traversal, the translator generates LLVM IR in several phases:

1. Function Declaration

Entry function: Uses the pre-declared contract entry point
Defined functions: Creates private LLVM functions with N × i256 parameters and return values (multiple returns use LLVM struct types)

2. Stack Variable Allocation

For each function, stack_size stack slots are allocated as LLVM alloca instructions. These represent the simulated EVM stack as addressable memory:

%stack_0 = alloca i256    ; bottom of stack
%stack_1 = alloca i256
...
%stack_N = alloca i256    ; top of stack

For defined functions, slot 0 is reserved for the return address marker, and input parameters are stored starting from slot 1.
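
As a rough sketch of this phase, the following helpers produce the textual `alloca` lines and the slot index for a given parameter. Both function names are hypothetical. The real translator builds these instructions through the LLVM API rather than as strings, and this sketch numbers slots from 0 to `stack_size - 1`.

```rust
/// Textual `alloca` instructions for a function with `stack_size` slots.
fn stack_allocas(stack_size: usize) -> Vec<String> {
    (0..stack_size)
        .map(|slot| format!("%stack_{} = alloca i256", slot))
        .collect()
}

/// Slot that holds the `index`-th input parameter of a defined function.
/// Slot 0 is reserved for the return address marker.
fn parameter_slot(index: usize) -> usize {
    index + 1
}

fn main() {
    for line in stack_allocas(3) {
        println!("{}", line);
    }
    println!("first parameter lives in %stack_{}", parameter_slot(0));
}
```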

3. Basic Block Creation

Each (BlockKey, instance) pair becomes an LLVM BasicBlock:

block_runtime_42/0:       ; tag 42, first instance
  ...
block_runtime_42/1:       ; tag 42, second instance (different stack state)
  ...

4. Instruction Translation

Each EthIR element calls into_llvm() to generate LLVM instructions. Stack operations map to loads/stores on the allocated stack variables:

EVM Operation	LLVM Translation
`PUSH 0x42`	`store i256 66, ptr %stack_N`
`DUP2`	`%v = load i256, ptr %stack_(N-2); store i256 %v, ptr %stack_N`
`ADD`	`%a = load ...; %b = load ...; %r = add i256 %a, %b; store ...`
`MLOAD`	`%ptr = load ...; %v = load i256, ptr addrspace(1) %ptr; store ...`
`JUMP`	`br label %target_block`
`JUMPI`	`%cond = ...; br i1 %cond, label %taken, label %fallthrough`
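
The stack rows of this table can be reproduced with a tiny text emitter. In this sketch, `sp` is assumed to be the index of the next free slot, so `PUSH` writes to `%stack_sp`. The `Emitter` type, its methods, and the `%tN` temporary names are hypothetical and only mimic the shape of the generated IR.

```rust
/// Emits textual LLVM IR for a few stack operations.
struct Emitter {
    sp: usize,          // index of the next free stack slot
    temporaries: usize, // counter for fresh SSA names
    lines: Vec<String>,
}

impl Emitter {
    fn new() -> Self {
        Emitter { sp: 0, temporaries: 0, lines: Vec::new() }
    }

    fn temporary(&mut self) -> String {
        let name = format!("%t{}", self.temporaries);
        self.temporaries += 1;
        name
    }

    fn load(&mut self, slot: usize) -> String {
        let name = self.temporary();
        self.lines.push(format!("{} = load i256, ptr %stack_{}", name, slot));
        name
    }

    fn store(&mut self, value: &str, slot: usize) {
        self.lines.push(format!("store i256 {}, ptr %stack_{}", value, slot));
    }

    /// PUSH: store the constant into the next free slot.
    fn push(&mut self, constant: u128) {
        let slot = self.sp;
        self.store(&constant.to_string(), slot);
        self.sp += 1;
    }

    /// DUPn: copy the n-th element from the top into the next free slot.
    fn dup(&mut self, n: usize) {
        assert!(n >= 1 && n <= self.sp, "stack underflow");
        let source = self.sp - n;
        let value = self.load(source);
        let target = self.sp;
        self.store(&value, target);
        self.sp += 1;
    }

    /// ADD: pop two operands and store the sum where the lower one was.
    fn add(&mut self) {
        assert!(self.sp >= 2, "stack underflow");
        let (top, below) = (self.sp - 1, self.sp - 2);
        let a = self.load(top);
        let b = self.load(below);
        let result = self.temporary();
        self.lines.push(format!("{} = add i256 {}, {}", result, a, b));
        self.store(&result, below);
        self.sp -= 1;
    }
}

fn main() {
    let mut emitter = Emitter::new();
    emitter.push(0x42);
    emitter.push(1);
    emitter.dup(2);
    emitter.add();
    for line in &emitter.lines {
        println!("{}", line);
    }
}
```

Because every stack slot is a fixed `alloca`, each simulated stack height maps to a known slot at translation time, and no runtime stack pointer is needed.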

solc Fork Modifications

The EVM assembly translator relies on several modifications in the solx-solidity fork. The most relevant to this pipeline are:

extraMetadata output: reports all user-defined functions with entry tags, parameter counts, and AST IDs. Without this, the translator cannot distinguish function calls from arbitrary jumps.
Dispatch tables for function pointers: indirect calls are lowered to static dispatch tables instead of dynamic jumps.
DUPX / SWAPX instructions: extend stack access beyond depth 16, eliminating "stack too deep" errors.
Disabled optimizer: the solc optimizer is disabled to preserve function boundaries and metadata validity. All optimization is handled by the LLVM backend.

For the full list of fork modifications, see Limitations and Differences from solc.

EVM Instructions Reference

This chapter describes how the LLVM EVM backend models EVM instructions and lowers LLVM IR into final opcode sequences.

Instruction Definitions

The EVM instruction set is defined in the LLVM backend via TableGen.

It contains opcode definitions, pattern mappings, and EVM-specific pseudo-instructions.
It covers roughly 180 instruction forms once TableGen expansions are considered (for example DUP1..16, SWAP1..16, and PUSH families).
Instructions are modeled around i256 values, matching the EVM word size.

Address Space Model

The backend uses explicit LLVM address spaces to model EVM memory regions:

Address space	Value	Meaning
`AS_STACK`	`0`	Compiler-managed stack memory model
`AS_HEAP`	`1`	EVM linear memory (`MLOAD`, `MSTORE`, `MCOPY`)
`AS_CALL_DATA`	`2`	Call data region
`AS_RETURN_DATA`	`3`	Return data region
`AS_CODE`	`4`	Code segment
`AS_STORAGE`	`5`	Persistent storage
`AS_TSTORAGE`	`6`	Transient storage

These constants are defined in the EVM backend header.
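
As a quick reference, the table can be mirrored in a small enumeration together with a helper that spells the matching LLVM pointer type. The Python names below (`AddressSpace`, `typed_pointer`, `load_i256`) are illustrative only. The numeric values are taken from the table above.

```python
from enum import IntEnum


class AddressSpace(IntEnum):
    """LLVM address spaces used to model EVM memory regions."""

    STACK = 0
    HEAP = 1
    CALL_DATA = 2
    RETURN_DATA = 3
    CODE = 4
    STORAGE = 5
    TSTORAGE = 6


def typed_pointer(space: AddressSpace) -> str:
    """Spell the LLVM pointer type for an address space."""
    # Address space 0 is LLVM's default and is written as plain `ptr`.
    if space == AddressSpace.STACK:
        return "ptr"
    return f"ptr addrspace({int(space)})"


def load_i256(result: str, pointer: str, space: AddressSpace) -> str:
    """Format a 256-bit load from the given address space."""
    return f"{result} = load i256, {typed_pointer(space)} {pointer}"
```

With this helper, `load_i256("%v", "%ptr", AddressSpace.HEAP)` produces the same text that the MLOAD row of the translation table shows.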

Core Instruction Categories

Arithmetic

Arithmetic opcodes map directly to i256 operations or EVM intrinsics:

ADD, MUL, SUB, DIV, SDIV, MOD, SMOD
ADDMOD, MULMOD, EXP, SIGNEXTEND

For example, ADD is selected from LLVM add i256 patterns.

Memory

Memory instructions operate on the heap address space:

MLOAD maps to a load from AS_HEAP
MSTORE maps to a store into AS_HEAP
MCOPY lowers memory copy operations in heap memory

Storage

Storage instructions map to storage address spaces:

SLOAD, SSTORE use AS_STORAGE
TLOAD, TSTORE use AS_TSTORAGE

Control Flow

Control flow instructions are selected from LLVM branch forms:

JUMP maps from unconditional br
JUMPI maps from conditional br i1

The backend also uses helper pseudos (for example JUMP_UNLESS) that are lowered before emission.

Stack

EVM stack manipulation opcodes are emitted as needed:

DUP1..DUP16
SWAP1..SWAP16
POP

They are introduced and optimized by stackification passes rather than directly authored in frontend IR.

Cryptographic

KECCAK256 (the SHA3 opcode) is represented through an EVM-specific intrinsic:

Machine instruction: KECCAK256
LLVM intrinsic path: llvm.evm.sha3

Runtime Library (`evm-stdlib.ll`)

The backend links helper wrappers from the EVM runtime standard library:

__addmod
__mulmod
__signextend
__exp
__byte
__sdiv
__div
__smod
__mod
__shl
__shr
__sar
__sha3

These wrappers forward to corresponding llvm.evm.* intrinsics.

Stackification Pipeline

The late codegen pipeline converts virtual-register machine IR to valid EVM stack code:

EVMSingleUseExpression: reorders machine instructions into expression-friendly form.
EVMBackwardPropagationStackification: performs backward propagation stackification from register form.
EVMStackSolver and EVMStackShuffler: compute and emit low-cost DUP/SWAP/spill-reload sequences.
EVMPeephole: runs late peephole optimizations before final emission.

Stack Depth Limit

The EVM stack itself can hold up to 1024 items, but DUP and SWAP instructions can only reach the top 16 positions. The backend enforces this 16-position reach.

This limit is exposed by EVMSubtarget::stackDepthLimit().
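
The practical effect of the limit can be illustrated with a reachability check for DUP. This is a minimal sketch: the constants and the function `dup_for_position` are hypothetical names, and positions are counted from 1 at the top of the stack.

```python
from typing import Optional

STACK_DEPTH_LIMIT = 16      # reach of DUP/SWAP, in stack positions
EVM_STACK_CAPACITY = 1024   # maximum number of items on the EVM stack


def dup_for_position(position: int) -> Optional[str]:
    """Return the DUP opcode that copies the item at `position` (1 = top).

    Returns None when the item is out of reach, meaning the value has
    to be made reachable some other way, for example through a spill.
    """
    if position < 1:
        raise ValueError("positions are counted from 1 at the top of the stack")
    if position > STACK_DEPTH_LIMIT:
        return None
    return f"DUP{position}"
```

An item at position 17 is still well within the 1024-item capacity, yet no single DUP can copy it, which is why the stackification passes also produce spill and reload sequences.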

Pseudo-Instructions

Several pseudos are used during lowering and removed or expanded before final bytecode:

PUSHDEPLOYADDRESS: materializes deploy-time address usage for libraries.
SELECT: models conditional value selection.
CONST_I256: represents immediate constants before stackification.
COPY_I256: temporary register-copy form before stackification.

Yul Builtins Reference

This chapter lists all Yul builtin functions supported by solx and how each is lowered to LLVM IR for the EVM backend.

Lowering Strategies

Yul builtins are lowered through one of three strategies:

Direct LLVM IR: the builtin maps to native LLVM integer or memory operations on i256.
LLVM intrinsic: the builtin maps to an llvm.evm.* intrinsic that the EVM backend expands to opcodes.
Address space access: the builtin maps to a load or store in a typed LLVM address space (see EVM Instructions: Address Space Model).

Arithmetic

Builtin	Lowering	Notes
`add`	Direct LLVM IR	`add i256`
`sub`	Direct LLVM IR	`sub i256`
`mul`	Direct LLVM IR	`mul i256`
`div`	Direct LLVM IR	Unsigned; returns 0 when divisor is 0
`sdiv`	Direct LLVM IR	Signed; returns 0 when divisor is 0
`mod`	Direct LLVM IR	Unsigned; returns 0 when divisor is 0
`smod`	Direct LLVM IR	Signed; returns 0 when divisor is 0
`addmod`	Intrinsic `llvm.evm.addmod`	`(x + y) % m` without intermediate overflow
`mulmod`	Intrinsic `llvm.evm.mulmod`	`(x * y) % m` without intermediate overflow
`exp`	Intrinsic `llvm.evm.exp`	Exponentiation
`signextend`	Intrinsic `llvm.evm.signextend`	Sign extend from bit `(i*8+7)`
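
The edge cases in the Notes column follow standard EVM semantics and can be pinned down with a small reference model on 256-bit words. This model only illustrates the expected results. It says nothing about how solx generates code, and the helper names are chosen for this example.

```python
WORD_BITS = 256
MODULUS = 1 << WORD_BITS
SIGN_BIT = 1 << (WORD_BITS - 1)


def to_signed(x: int) -> int:
    """Interpret a 256-bit word as a two's complement integer."""
    return x - MODULUS if x & SIGN_BIT else x


def to_unsigned(x: int) -> int:
    """Wrap a Python integer into a 256-bit word."""
    return x % MODULUS


def div(x: int, y: int) -> int:
    return 0 if y == 0 else x // y


def mod(x: int, y: int) -> int:
    return 0 if y == 0 else x % y


def sdiv(x: int, y: int) -> int:
    if y == 0:
        return 0
    a, b = to_signed(x), to_signed(y)
    quotient = abs(a) // abs(b)          # truncate toward zero
    if (a < 0) != (b < 0):
        quotient = -quotient
    return to_unsigned(quotient)


def smod(x: int, y: int) -> int:
    if y == 0:
        return 0
    a, b = to_signed(x), to_signed(y)
    remainder = abs(a) % abs(b)          # sign follows the dividend
    return to_unsigned(-remainder if a < 0 else remainder)


def addmod(x: int, y: int, m: int) -> int:
    return 0 if m == 0 else (x + y) % m  # no wrap at 2**256


def mulmod(x: int, y: int, m: int) -> int:
    return 0 if m == 0 else (x * y) % m  # no wrap at 2**256


def signextend(i: int, x: int) -> int:
    if i >= 31:
        return x
    sign_position = i * 8 + 7
    low_mask = (1 << (sign_position + 1)) - 1
    if x & (1 << sign_position):
        return x | (MODULUS - 1 - low_mask)  # set every higher bit
    return x & low_mask
```

Python integers are unbounded, so `addmod` and `mulmod` naturally compute the intermediate sum or product without overflow, which is the property the table describes.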

Comparison

Builtin	Lowering	Notes
`lt`	Direct LLVM IR	Unsigned less-than
`gt`	Direct LLVM IR	Unsigned greater-than
`slt`	Direct LLVM IR	Signed less-than
`sgt`	Direct LLVM IR	Signed greater-than
`eq`	Direct LLVM IR	Equality
`iszero`	Direct LLVM IR	Check if zero

Bitwise

Builtin	Lowering	Notes
`and`	Direct LLVM IR	Bitwise AND
`or`	Direct LLVM IR	Bitwise OR
`xor`	Direct LLVM IR	Bitwise XOR
`not`	Direct LLVM IR	Bitwise NOT
`shl`	Direct LLVM IR	Shift left; shift >= 256 yields 0
`shr`	Direct LLVM IR	Logical shift right; shift >= 256 yields 0
`sar`	Direct LLVM IR	Arithmetic shift right; shift >= 256 yields sign-extended value
`byte`	Intrinsic `llvm.evm.byte`	Extract nth byte
`clz`	Intrinsic `llvm.ctlz`	Count leading zeros (requires Osaka EVM version)
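
The out-of-range shift rules above are easy to get wrong, so the following reference model spells them out on 256-bit words. It is a sketch of the expected results under standard EVM semantics, not of the generated code. Argument order follows Yul, where the shift amount or byte index comes first, and `clz(0) == 256` assumes the behavior specified for the CLZ opcode.

```python
WORD_BITS = 256
MODULUS = 1 << WORD_BITS
SIGN_BIT = 1 << (WORD_BITS - 1)


def shl(shift: int, value: int) -> int:
    return 0 if shift >= WORD_BITS else (value << shift) % MODULUS


def shr(shift: int, value: int) -> int:
    return 0 if shift >= WORD_BITS else value >> shift


def sar(shift: int, value: int) -> int:
    negative = bool(value & SIGN_BIT)
    if shift >= WORD_BITS:
        # Every bit becomes a copy of the sign bit.
        return MODULUS - 1 if negative else 0
    signed = value - MODULUS if negative else value
    return (signed >> shift) % MODULUS


def byte(index: int, value: int) -> int:
    # Byte 0 is the most significant byte of the word.
    if index >= 32:
        return 0
    return (value >> (8 * (31 - index))) & 0xFF


def clz(value: int) -> int:
    # Number of leading zero bits; a zero word has 256 of them.
    return WORD_BITS - value.bit_length()
```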

Hashing

Builtin	Lowering	Notes
`keccak256`	Intrinsic `llvm.evm.sha3`	Keccak-256 over memory range

Memory

Builtin	Lowering	Notes
`mload`	Address space 1 load	Load 32 bytes from heap memory
`mstore`	Address space 1 store	Store 32 bytes to heap memory
`mstore8`	Intrinsic `llvm.evm.mstore8`	Store single byte to memory
`mcopy`	memcpy in address space 1	EIP-5656 memory copy
`msize`	Intrinsic `llvm.evm.msize`	Highest accessed memory index

Storage

Builtin	Lowering	Notes
`sload`	Address space 5 load	Load from persistent storage
`sstore`	Address space 5 store	Store to persistent storage
`tload`	Address space 6 load	Load from transient storage (EIP-1153)
`tstore`	Address space 6 store	Store to transient storage (EIP-1153)

Immutables

Builtin	Lowering	Notes
`loadimmutable`	Intrinsic `llvm.evm.loadimmutable`	Load immutable value with metadata identifier
`setimmutable`	Special	Set immutable value during deployment

Call Data and Return Data

Builtin	Lowering	Notes
`calldataload`	Address space 2 load	Load 32 bytes from calldata
`calldatasize`	Intrinsic `llvm.evm.calldatasize`	Size of calldata
`calldatacopy`	memcpy from address space 2 to 1	Copy calldata to memory
`returndatasize`	Intrinsic `llvm.evm.returndatasize`	Size of return data
`returndatacopy`	memcpy from address space 3 to 1	Copy return data to memory

Code Operations

Builtin	Lowering	Notes
`codesize`	Intrinsic `llvm.evm.codesize`	Current contract code size
`codecopy`	memcpy from address space 4 to 1	Copy code to memory
`extcodesize`	Intrinsic `llvm.evm.extcodesize`	External contract code size
`extcodecopy`	Intrinsic `llvm.evm.extcodecopy`	Copy external code to memory
`extcodehash`	Intrinsic `llvm.evm.extcodehash`	Hash of external contract code

Object and Data Operations

Builtin	Lowering	Notes
`datasize`	Intrinsic `llvm.evm.datasize`	Size of a named data object
`dataoffset`	Intrinsic `llvm.evm.dataoffset`	Offset of a named data object
`datacopy`	Same as `codecopy`	Copy data to memory

These builtins are used by deploy stubs to reference embedded runtime and dependency objects. See Binary Layout for details.

Event Logging

Builtin	Lowering	Notes
`log0`	Intrinsic `llvm.evm.log0`	Log with 0 topics
`log1`	Intrinsic `llvm.evm.log1`	Log with 1 topic
`log2`	Intrinsic `llvm.evm.log2`	Log with 2 topics
`log3`	Intrinsic `llvm.evm.log3`	Log with 3 topics
`log4`	Intrinsic `llvm.evm.log4`	Log with 4 topics

Contract Calls

Builtin	Lowering	Notes
`call`	Intrinsic `llvm.evm.call`	Call with value transfer
`delegatecall`	Intrinsic `llvm.evm.delegatecall`	Call preserving caller and callvalue
`staticcall`	Intrinsic `llvm.evm.staticcall`	Read-only call

Note: callcode is rejected at compile time. Use delegatecall instead.

Contract Creation

Builtin	Lowering	Notes
`create`	Intrinsic `llvm.evm.create`	Create new contract
`create2`	Intrinsic `llvm.evm.create2`	Create at deterministic address

Control Flow

Builtin	Lowering	Notes
`return`	Intrinsic `llvm.evm.return`	Return data from execution
`revert`	Intrinsic `llvm.evm.revert`	Revert with return data
`stop`	Intrinsic `llvm.evm.stop`	Stop execution
`invalid`	Intrinsic `llvm.evm.invalid`	Invalid instruction (consumes all gas)

Note: selfdestruct is rejected at compile time (deprecated by EIP-6049).

Block and Transaction Context

Builtin	Lowering	Notes
`address`	Intrinsic `llvm.evm.address`	Current contract address
`caller`	Intrinsic `llvm.evm.caller`	Message sender
`callvalue`	Intrinsic `llvm.evm.callvalue`	Wei sent with call
`gas`	Intrinsic `llvm.evm.gas`	Remaining gas
`gasprice`	Intrinsic `llvm.evm.gasprice`	Gas price of transaction
`balance`	Intrinsic `llvm.evm.balance`	Balance of address
`selfbalance`	Intrinsic `llvm.evm.selfbalance`	Current contract balance
`origin`	Intrinsic `llvm.evm.origin`	Transaction sender

Block Information

Builtin	Lowering	Notes
`blockhash`	Intrinsic `llvm.evm.blockhash`	Hash of given block
`number`	Intrinsic `llvm.evm.number`	Current block number
`timestamp`	Intrinsic `llvm.evm.timestamp`	Block timestamp
`coinbase`	Intrinsic `llvm.evm.coinbase`	Block beneficiary
`difficulty`	Intrinsic `llvm.evm.difficulty`	Block difficulty (pre-merge)
`prevrandao`	Intrinsic `llvm.evm.difficulty`	Previous RANDAO value (EIP-4399, reuses difficulty)
`gaslimit`	Intrinsic `llvm.evm.gaslimit`	Block gas limit
`chainid`	Intrinsic `llvm.evm.chainid`	Chain ID (EIP-1344)
`basefee`	Intrinsic `llvm.evm.basefee`	Base fee per gas (EIP-1559)
`blobhash`	Rejected at compile time	Versioned hash of transaction's i-th blob (EIP-4844)
`blobbasefee`	Rejected at compile time	Current block's blob base fee (EIP-7516/EIP-4844)

Note: blobhash and blobbasefee are not yet supported and will produce a compile error.

Special and Meta

Builtin	Lowering	Notes
`pop`	Optimized away	No code generated
`linkersymbol`	Intrinsic `llvm.evm.linkersymbol`	Library linker placeholder
`memoryguard`	Special	Reserves a memory region; used by solx to configure the spill area for stack-too-deep mitigation

Binary Layout and Linking

This chapter describes how solx models deploy/runtime bytecode objects, dependency data, and post-compilation linking.

Contract Object Model

EVM contracts have two code segments:

Deploy code (init code): runs only during contract creation.
Runtime code: returned by deploy code and stored as the contract's permanent code.

Deploy code typically builds runtime bytes in memory and executes RETURN(offset, size).

`solc` JSON Assembly Layout

In legacy assembly JSON, the object is split into top-level deploy code and nested runtime code:

Top-level .code: deploy instruction stream.
.data["0"]: runtime object.
.data[<hex>]: additional referenced data objects (for example constructor-time dependencies).

Conceptually:

{
  ".code": [ /* deploy instructions */ ],
  ".data": {
    "0": { /* runtime assembly object */ },
    "ab12...": { /* dependency object or hash */ }
  }
}

The EVM assembly layer exposes this as Assembly { code, data }, with runtime_code() reading data["0"].
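
A reader of this JSON shape might look roughly like the following. It is a Python stand-in written for illustration, not the Rust implementation: `runtime_code` mirrors the accessor named above, while `dependency_keys` is a hypothetical helper.

```python
def runtime_code(assembly: dict) -> dict:
    """Return the runtime object, stored under key "0" of `.data`."""
    data = assembly.get(".data", {})
    if "0" not in data:
        raise KeyError('assembly has no runtime object at .data["0"]')
    return data["0"]


def dependency_keys(assembly: dict) -> list[str]:
    """Return the `.data` keys other than the runtime entry, sorted."""
    data = assembly.get(".data", {})
    return sorted(key for key in data if key != "0")
```

Treating `"0"` as a reserved key keeps the runtime object separate from the hex-keyed dependency entries, which are handled by the path aliasing step described later in this chapter.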

Dependencies and `CREATE` / `CREATE2`

Factory-style deploy code can reference other contract objects. In assembly, this is represented via data entries and push-style aliases:

PUSH [$] (PUSH_DataOffset) for object offset
PUSH #[$] (PUSH_DataSize) for object size
PUSH data (PUSH_Data) for raw dependency chunks

These operands are resolved during assembly preprocessing before LLVM lowering.

Deploy Stub Shape

The minimal deploy stub pattern is:

Load runtime size (datasize).
Load runtime offset (dataoffset).
Copy bytes from code section to memory.
Return copied bytes.

The EVM codegen emits this canonical form in minimal_deploy_code() using:

llvm.evm.datasize(metadata !"...")
llvm.evm.dataoffset(metadata !"...")
llvm.memcpy from addrspace(4) (code) to addrspace(1) (heap)
llvm.evm.return
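
The four steps can be simulated on plain byte strings to show what the stub computes. This is a sketch under simplifying assumptions: the object table `objects` (name to offset and size) stands in for what datasize and dataoffset resolve to, the copy targets memory address 0, and the function name `minimal_deploy` is hypothetical.

```python
def minimal_deploy(code: bytes, objects: dict[str, tuple[int, int]], runtime: str) -> bytes:
    """Simulate the deploy stub: copy the runtime object and return it."""
    # Steps 1 and 2: resolve the size and offset of the runtime object.
    offset, size = objects[runtime]
    if offset < 0 or size < 0 or offset + size > len(code):
        raise ValueError("object lies outside the code section")

    # Step 3: copy bytes from the code section into heap memory.
    memory = bytearray(size)
    memory[0:size] = code[offset:offset + size]

    # Step 4: return the copied bytes as the contract's runtime code.
    return bytes(memory[0:size])
```

Because the stub only refers to the object by name, the same deploy code stays valid when the runtime object moves to a different offset in the final binary.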

`datasize` / `dataoffset` Builtins

The Yul builtins datasize(<object>) and dataoffset(<object>) take an object name rather than a numeric operand. In solx, they are translated to LLVM intrinsics that carry the object name as a metadata operand:

llvm.evm.datasize
llvm.evm.dataoffset

This is how deploy stubs reference embedded runtime/dependency objects without hardcoding absolute byte offsets.

Metadata Hash and CBOR Tail

Runtime bytecode may include CBOR metadata appended at the end.

The payload can include compiler version info and optional metadata hash fields.
Hash behavior is configurable with --metadata-hash (for example ipfs).
CBOR appending can be disabled with --no-cbor-metadata.

In the build pipeline, metadata bytes are appended to runtime objects before final assembly/linking.
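
Tools that inspect bytecode often need to separate the CBOR tail from the executable part. The sketch below assumes the solc convention in which the last two bytes of the bytecode hold the big-endian length of the CBOR payload. Whether a particular solx output follows this convention should be checked against the actual bytes, and `split_cbor_tail` is a hypothetical helper name.

```python
def split_cbor_tail(bytecode: bytes) -> tuple[bytes, bytes]:
    """Split bytecode into (code, cbor_payload).

    Assumes the final two bytes encode the payload length, big-endian.
    Returns the input unchanged with an empty payload if the length
    does not fit inside the bytecode.
    """
    if len(bytecode) < 2:
        return bytecode, b""
    payload_length = int.from_bytes(bytecode[-2:], "big")
    tail_length = payload_length + 2
    if tail_length > len(bytecode):
        return bytecode, b""
    return bytecode[:-tail_length], bytecode[-tail_length:-2]
```

Bytecode produced with --no-cbor-metadata has no such tail, so the heuristic above can misread its last two bytes. Callers should only apply it when they know metadata was appended.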

Library Linking

Library references are resolved at link time:

The linker patches linker symbols with final addresses.
If a symbol is unresolved, solx records its offsets and emits placeholders in hex output.
Placeholder format follows the common pattern __$<keccak-256-digest>$__.

Standard JSON output reports unresolved positions through evm.*.linkReferences so external tooling can link later.
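
External tooling that links later has to find the placeholders in the hex output and patch them. The sketch below assumes the solc placeholder convention of 34 hex characters between the `__$` and `$__` markers, so that each placeholder spans 40 characters, the size of a 20-byte address. The function names are hypothetical and this is not the solx linker.

```python
import re

# "__$" + 34 hex characters + "$__" is 40 characters, one address wide.
PLACEHOLDER = re.compile(r"__\$([0-9a-fA-F]{34})\$__")


def find_unresolved(hex_code: str) -> list[tuple[int, str]]:
    """Return (byte offset, digest) for every placeholder in the hex code."""
    return [(m.start() // 2, m.group(1)) for m in PLACEHOLDER.finditer(hex_code)]


def link(hex_code: str, addresses: dict[str, str]) -> str:
    """Replace placeholders whose digest has a known address.

    `addresses` maps a placeholder digest to a 20-byte hex address.
    Unknown digests are left in place so they can be linked later.
    """
    def substitute(match: re.Match) -> str:
        address = addresses.get(match.group(1))
        if address is None:
            return match.group(0)
        address = address.removeprefix("0x").lower()
        if len(address) != 40:
            raise ValueError("library address must be 20 bytes")
        return address

    return PLACEHOLDER.sub(substitute, hex_code)
```

Since a placeholder and an address have the same width, patching never shifts the surrounding bytes, and the offsets reported by `find_unresolved` remain valid after partial linking.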

Dependency Resolution and Path Aliasing

The assembly preprocessor performs a normalization pass over all contracts before lowering:

Hash deploy and runtime sub-objects.
Build hash -> full contract path mapping.
Rewrite .data entries from embedded objects to stable path references (Data::Path).
Build index mappings for deploy and runtime dependency tables.
Replace instruction aliases (PUSH_DataOffset, PUSH_DataSize, PUSH_Data) with resolved identifiers.

Two details are important:

Entry "0" is always treated as runtime code and mapped to <contract>.runtime.
Hex indices are normalized to 32-byte (64 hex char) aliases before lookup, so short keys and padded keys resolve consistently.

This path aliasing step gives deterministic dependency identifiers for later object assembly and linking.
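
The two details above can be illustrated as follows. This is a sketch that assumes normalization means lowercasing and left-padding with zeros to 64 hex characters. The source does not spell out the padding direction, and the names `normalize_index` and `resolve_data_key` are hypothetical.

```python
HEX_DIGITS = set("0123456789abcdef")


def normalize_index(key: str) -> str:
    """Normalize a hex index to a 64-character (32-byte) alias."""
    digits = key.lower().removeprefix("0x")
    if not digits or len(digits) > 64 or not set(digits) <= HEX_DIGITS:
        raise ValueError(f"not a valid hex index: {key!r}")
    return digits.rjust(64, "0")   # assumed: pad on the left with zeros


def resolve_data_key(key: str, contract: str) -> str:
    """Map a `.data` key to a stable identifier."""
    if key == "0":
        # Entry "0" is always the runtime code of the contract.
        return f"{contract}.runtime"
    return normalize_index(key)
```

With this normalization, a short key such as `ab12` and its padded 64-character form produce the same alias, so either spelling finds the same dependency.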

Testing

This page describes how to run tests for the solx compiler and the format of test files.

Unit and CLI Tests

Run the standard Rust test suite:

# Run all tests (unit + CLI)
cargo test

# Run only unit tests
cargo test --lib

# Run only CLI/integration tests
cargo test --test cli

# Run a specific test
cargo test --test cli -- cli::bin::default

Integration Tests

The solx-tester tool runs integration tests by compiling contracts and executing them with revm.

# Build the compiler and tester
cargo build --release

# Run all integration tests
./target/release/solx-tester --solidity-compiler ./target/release/solx

# Run tests for a specific file
./target/release/solx-tester --solidity-compiler ./target/release/solx --path tests/solidity/simple/default.sol

# Run only Yul IR pipeline tests (excludes EVMLA pipeline)
./target/release/solx-tester --solidity-compiler ./target/release/solx --via-ir

# Run tests with specific optimizer settings
./target/release/solx-tester --solidity-compiler ./target/release/solx --optimizer M3B3

# Combine filters: Yul IR pipeline with M3B3 optimizer
./target/release/solx-tester --solidity-compiler ./target/release/solx --via-ir --optimizer M3B3

Filtering Options

--via-ir — Run only tests using the Yul IR pipeline (codegen Y). Without this flag, both Yul IR and EVMLA pipelines are tested.
--optimizer <PATTERN> — Filter by optimizer settings. Examples:
- M3B3 — Match exact optimizer level
- M^B3 — Match M3 or Mz with B3
- M*B* — Match any M and B levels
--path <PATTERN> — Run only tests whose path contains the pattern.
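
The optimizer pattern examples suggest a character-wise match. The following sketch reproduces only the three documented examples: it assumes `^` stands for level 3 or z and `*` for any single level. The real matching rules in solx-tester may differ, and `optimizer_matches` is a hypothetical name.

```python
def optimizer_matches(pattern: str, mode: str) -> bool:
    """Check an optimizer mode such as "MzB3" against a filter pattern."""
    if len(pattern) != len(mode):
        return False
    for expected, actual in zip(pattern, mode):
        if expected == "*":
            continue                      # any level
        if expected == "^":
            if actual not in ("3", "z"):  # level 3 or z
                return False
        elif expected != actual:
            return False                  # literal character
    return True
```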

Foundry and Hardhat Projects

The solx-dev tool can run tests against real-world Foundry and Hardhat projects:

# Build solx-dev
cargo build --release --bin solx-dev

# Run Foundry project tests
./target/release/solx-dev test foundry --test-config-path solx-dev/foundry-tests.toml

# Run Hardhat project tests
./target/release/solx-dev test hardhat --test-config-path solx-dev/hardhat-tests.toml

The test configurations list projects that are cloned and tested automatically. See foundry-tests.toml and hardhat-tests.toml for the full list of tested projects.

Test Collection

This section describes the format of test files used by solx-tester.

Test Types

The repository contains three types of tests:

Upstream — Tests following the Solidity semantic test format.
Simple — Single-contract tests.
Complex — Multi-contract tests and vendored DeFi projects.

Test data is located in:

tests/solidity/ — Solidity test contracts
tests/yul/ — Yul test contracts
tests/llvm-ir/ — LLVM IR test contracts

Test Format

Each test comprises source code files and metadata. Simple tests have only one source file, and their metadata is written in comments that start with !, for example, //! for Solidity. Complex tests use a test.json file to describe their metadata and refer to source code files.

Metadata

Metadata is a JSON object that contains the following fields:

cases — An array of test cases (described below).
contracts — Used for complex tests to describe the contract instances to deploy. In simple tests, only one Test contract instance is deployed.

"contracts": {
    "Main": "main.sol:Main",
    "Callable": "callable.sol:Callable"
}

libraries — An optional field that specifies library addresses for linker:

"libraries": {
    "libraries/UQ112x112.sol": { "UQ112x112": "UQ112x112" },
    "libraries/Math.sol": { "Math": "Math" }
}

ignore — An optional flag that disables a test.
modes — An optional field that specifies mode filters. Y stands for Yul pipeline, E for EVM assembly pipeline. Compiler versions can be specified as SemVer ranges:

"modes": [
    "Y",
    "E",
    "E >=0.8.30"
]

group — An optional string field that specifies a test group for benchmarking.

Test Cases

All test cases are executed in a clean context, making them independent of each other.

Each test case contains the following fields:

name — A string name.
comment — An optional string comment.
inputs — An array of inputs (described below).
expected — The expected return data for the last input.
ignore, modes — Same as in test metadata.

Inputs

Inputs specify the contract calls in the test case:

comment — An optional string comment.
instance — The contract instance to call. Default: Test.
caller — The caller address. Default: 0xdeadbeef01000000000000000000000000000000.
method — The method to call:
1. #deployer for the deployer call.
2. #fallback to perform a call with raw calldata.
3. Any other string is recognized as a function name. The function selector will be prepended to the calldata.
calldata — The input calldata:
1. A hexadecimal string: "calldata": "0x00"
2. A numbers array (hex, decimal, or instance addresses). Each number is padded to 32 bytes: "calldata": ["1", "2"]
value — An optional msg.value, a decimal number with wei or ETH suffix.
storage — Storage values to set before the call:

"storage": {
    "Test.address": ["1", "2", "3", "4"]
}

expected — The expected return data:
1. An array of numbers: "expected": ["1", "2"]
2. Extended format with return_data, exception, and events:

"expected": {
    "return_data": ["0x01"],
    "events": [
        {
            "topics": [
                "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
            ],
            "values": ["0xff"]
        }
    ],
    "exception": false
}

The expected field can also be an array of extended-format objects when different compiler versions produce different results. In that case, give each object a compiler_version field containing a SemVer range.

Notes:

InstanceName.address can be used in expected, calldata, and storage fields to insert a contract instance address.
If a deployer call is not specified for an instance, it will be generated automatically with empty calldata.
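
To show what the numbers-array form of calldata expands to, the sketch below pads each number to a 32-byte word. It assumes big-endian, left-padded words as in the Solidity ABI and handles only decimal and 0x-prefixed hex strings. Instance addresses such as Test.address and the function selector are deliberately left out, and `encode_calldata` is a hypothetical name.

```python
def encode_calldata(values: list[str]) -> str:
    """Encode decimal or hex number strings as 32-byte words."""
    words = []
    for value in values:
        if value.lower().startswith("0x"):
            number = int(value, 16)
        else:
            number = int(value, 10)
        if not 0 <= number < 1 << 256:
            raise ValueError(f"value does not fit in 32 bytes: {value}")
        words.append(number.to_bytes(32, "big").hex())
    return "0x" + "".join(words)
```

For `"calldata": ["1", "2"]` this yields 64 bytes: a word ending in 01 followed by a word ending in 02.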

Upstream Solidity Semantic Tests

These tests follow the Solidity semantic test format. Test descriptions and expected results are embedded as comments in the test file. Lines begin with // for Solidity files. The beginning of the test description is indicated by a comment line containing ----.
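
A test file in this format can be split into source and description at the ---- marker. The sketch below assumes the marker line consists of the comment prefix followed only by ----, and that every description line starts with //. The function name `split_semantic_test` is hypothetical, and the real parser handles more cases.

```python
def split_semantic_test(text: str) -> tuple[str, list[str]]:
    """Split a semantic test into (source, description lines)."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("//") and stripped[2:].strip() == "----":
            source = "\n".join(lines[:index])
            description = [
                entry.strip()[2:].strip()
                for entry in lines[index + 1:]
                if entry.strip().startswith("//")
            ]
            return source, description
    # No marker found: the whole file is source code.
    return text, []
```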

Debugging and Inspecting Compiler Output

This guide shows how to use solx debug flags to inspect intermediate representations at each compilation stage.

IR Dump Flags

Each flag writes files to the output directory (-o):

Flag	Extension	Description
`--evmla`	`.evmla`	EVM legacy assembly from solc (legacy pipeline only)
`--ethir`	`.ethir`	EthIR (translated from EVM assembly, legacy pipeline only)
`--ir` / `--ir-optimized`	`.yul`	Yul IR from solc
`--emit-llvm-ir`	`.unoptimized.ll`, `.optimized.ll`	LLVM IR before and after optimization
`--asm`	`.asm`	Final EVM assembly

The --debug-info and --debug-info-runtime flags are output selectors that print deploy and runtime debug info to stdout (or to files when -o is used). They are not IR dump flags.

Example:

solx contract.sol -o ./debug --evmla --ethir --emit-llvm-ir --asm --overwrite

This produces one file per contract per stage in ./debug/.

Quick Dump with `SOLX_OUTPUT_DIR`

Setting the SOLX_OUTPUT_DIR environment variable enables all IR dumps at once without listing individual flags:

export SOLX_OUTPUT_DIR=./ir_dumps
solx contract.sol

This writes all applicable IR files for every contract, with automatic overwrite. Which files are produced depends on the pipeline used: the Yul pipeline dumps Yul and LLVM IR, while the legacy pipeline dumps EVMLA, EthIR, and LLVM IR.

Benchmarking

The --benchmarks flag prints timing information for each pipeline stage:

solx contract.sol --benchmarks

Output includes per-contract compilation timing in milliseconds.

LLVM Diagnostics

Two flags control LLVM-level diagnostics:

--llvm-verify-each — runs LLVM IR verification after every optimization pass. Useful for catching miscompilations. Silent on success; only reports errors when verification fails.
--llvm-debug-logging — enables detailed LLVM pass execution logging to stderr. Shows which passes and analyses run, with instruction counts.

solx contract.sol --llvm-verify-each --llvm-debug-logging

LLVM Options Pass-Through

Arbitrary LLVM backend options can be passed with --llvm-options:

solx contract.sol --llvm-options='-evm-metadata-size 10'

The value must be a single string following =. See the LLVM Options guide for available options, including EVM backend options and standard LLVM diagnostic options like -time-passes and -stats.

Optimization Levels

solx maps optimization levels to LLVM pipelines:

Flag	Middle-end	Size level	Back-end
`-O1`	Less	Zero	Less
`-O2`	Default	Zero	Default
`-O3` (default)	Aggressive	Zero	Aggressive
`-Os`	Default	S	Aggressive
`-Oz`	Default	Z	Aggressive

The default is -O3, optimizing for runtime performance.

The optimization level can also be set with the SOLX_OPTIMIZATION environment variable (values: 1, 2, 3, s, z).

Size Fallback

The --optimization-size-fallback flag (or SOLX_OPTIMIZATION_SIZE_FALLBACK env var) recompiles with -Oz when bytecode exceeds the 24,576-byte EVM contract size limit (EIP-170). When triggered, output files include a .size_fallback suffix.

Spill Area Suffix

When the compiler uses a memory spill region to mitigate stack-too-deep errors, output files include an .o{offset}s{size} suffix indicating the spill area parameters. For example: MyContract.o256s1024.ethir.
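
Scripts that post-process dump directories can recover the spill parameters from the file name. The sketch below relies only on the .o{offset}s{size} pattern described above. The helper name `parse_spill_area` is hypothetical, and it assumes the suffix is followed by another dot-separated extension.

```python
import re
from typing import Optional

# ".o<offset>s<size>" followed by a further extension such as ".ethir".
SPILL_SUFFIX = re.compile(r"\.o(\d+)s(\d+)(?=\.)")


def parse_spill_area(file_name: str) -> Optional[tuple[int, int]]:
    """Return (offset, size) of the spill area, or None if absent."""
    match = SPILL_SUFFIX.search(file_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For `MyContract.o256s1024.ethir` the result is an offset of 256 and a size of 1024. A name such as `MyContract.optimized.ll` yields None because no digits follow `.o`.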

Typical Debugging Workflow

Reproduce the issue with a minimal Solidity file.

Dump all IRs using SOLX_OUTPUT_DIR:

SOLX_OUTPUT_DIR=./debug solx contract.sol

Inspect stage by stage:
- Yul pipeline: Yul → LLVM IR (unoptimized) → LLVM IR (optimized) → assembly.
- Legacy pipeline: EVMLA → EthIR → LLVM IR (unoptimized) → LLVM IR (optimized) → assembly.
Narrow down which stage introduces the problem.

Use LLVM verification if the issue is in the optimizer:

solx contract.sol --llvm-verify-each --emit-llvm-ir -o ./debug --overwrite

Compare with solc using the integration tester:

cargo run --release --bin solx-tester -- \
  --solidity-compiler ./target/release/solx \
  --path contract.sol

LLVM Options

This guide documents LLVM backend options available in solx through the --llvm-options flag.

Usage

Pass options as a single string after =:

solx contract.sol --llvm-options='-option1 value1 -option2 value2'

EVM Backend Options

These options are specific to the custom LLVM EVM backend and affect compilation behavior directly.

`-evm-stack-region-size <value>`

Sets the stack spill region size in bytes. The compiler uses this region to spill values that cannot remain on the EVM stack (stack-too-deep mitigation). Normally set automatically based on optimizer settings. Requires -evm-stack-region-offset to be set as well.

`-evm-stack-region-offset <value>`

Sets the stack spill region memory offset. Normally set automatically to match the solc user memory offset.

`-evm-metadata-size <value>`

Sets the metadata size hint used by the backend for gas and code size tradeoff decisions.

Standard LLVM Diagnostic Options

Standard LLVM diagnostic options can be passed through --llvm-options and their output is printed to stderr. Some options (such as -debug and -debug-only) require LLVM built with assertions enabled (-DLLVM_ENABLE_ASSERTIONS=ON). When building from source, pass --enable-assertions to solx-dev llvm build.

`-time-passes`

Print timing information for each LLVM pass.

solx contract.sol --bin --llvm-options='-time-passes'

`-stats`

Print statistics from LLVM passes (number of transformations applied, etc.).

`-print-after-all`

Print LLVM IR after every optimization pass. Produces very large output (tens of thousands of lines) but useful for tracing pass behavior.

`-print-before-all`

Print LLVM IR before every optimization pass.

`-debug-only=<pass-name>`

Enable debug output for a specific LLVM pass. Note that --llvm-debug-logging controls pass-builder logging specifically, not the general LLVM DEBUG() macro categories.

CLI Debug Flags

These are top-level solx flags (not passed through --llvm-options):

Flag	Effect
`--llvm-verify-each`	Run IR verifier after each LLVM pass. Silent on success; produces an error if verification fails.
`--llvm-debug-logging`	Enable pass-builder debug logging. Shows which passes and analyses run, with instruction counts.

See the Debugging guide for the full set of diagnostic flags.

Building with Sanitizers

This guide explains how to build solx with sanitizers enabled.

Introduction

Sanitizers are tools that help find bugs in code. They are used to detect memory corruption, leaks, and undefined behavior. The most common sanitizers are AddressSanitizer, MemorySanitizer, and ThreadSanitizer.

If you are not familiar with sanitizers, see the official documentation.

Who is this guide for?

This guide is for developers who want to debug issues with solx.

Prerequisites

For a sanitizer build to work, the host LLVM compiler used to build LLVM MUST have the same version as the LLVM that rustc uses internally to build solx.

You can check the LLVM version used by rustc by running rustc --version --verbose.

Build steps

A sanitizer-enabled build takes two steps:

Build the LLVM framework with the required sanitizer enabled.
Build solx with the LLVM build from the previous step.

Please follow the common installation instructions until the LLVM build step.

This guide assumes a build with AddressSanitizer enabled.

Build LLVM with sanitizer enabled

When building LLVM, use the --sanitizer <sanitizer> option and set the build type to RelWithDebInfo:

./target/release/solx-dev llvm build --sanitizer=Address --build-type=RelWithDebInfo

Please note that the default Apple Clang compiler is not compatible with Rust. On MacOS, install LLVM using Homebrew and pass the paths to its compilers through the --extra-args option. For example:

./target/release/solx-dev llvm build --sanitizer=Address --build-type=RelWithDebInfo --extra-args '-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang' '-DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++'

Build solx with the sanitizer enabled

To build solx with the sanitizer enabled, set the RUSTFLAGS environment variable to -Z sanitizer=address and run the cargo build command. Sanitizer support is available only in the nightly Rust compiler, so it is recommended to set the RUSTC_BOOTSTRAP=1 environment variable before the build, which unlocks nightly features on a stable toolchain.

The --target option is also mandatory: it specifies the target architecture, and the build fails without it. Check the table below to find the correct target for your platform.

Platform	LLVM Target Triple
MacOS-arm64	`aarch64-apple-darwin`
MacOS-x86	`x86_64-apple-darwin`
Linux-arm64	`aarch64-unknown-linux-gnu`
Linux-x86	`x86_64-unknown-linux-gnu`

Additionally, for proper report symbolization it is recommended to set the ASAN_SYMBOLIZER_PATH environment variable. For more information, see the symbolizing reports section of the LLVM documentation.

For example, to build solx for MacOS-arm64 with AddressSanitizer enabled and run its tests, use the following commands:

export RUSTC_BOOTSTRAP=1
export ASAN_SYMBOLIZER_PATH=$(which llvm-symbolizer) # check the path to llvm-symbolizer
TARGET=aarch64-apple-darwin # Change to your target
RUSTFLAGS="-Z sanitizer=address" cargo test --target=${TARGET}

Congratulations! You have successfully built solx with the sanitizers enabled.

Please refer to the official documentation for more information on the available sanitizers and how to use them.
