Installation
You can start using solx in the following ways:
- Use the installation script.
  curl -L https://raw.githubusercontent.com/NomicFoundation/solx/main/install-solx | bash
  The script will download the latest stable release of solx and install it in your PATH.
  ⚠️ The script requires curl to be installed on your system.
  This is the recommended way to install solx for MacOS users to bypass gatekeeper checks.
- Download stable releases. See Static Executables.
- Build solx from sources. See Building from Source.
System Requirements
It is recommended to have at least 4 GB of RAM to compile large projects. The compilation process is parallelized by default, so the number of threads used is equal to the number of CPU cores.
Large projects can consume a lot of RAM during compilation on machines with a high number of cores. If you encounter memory issues, consider reducing the number of threads using the --threads option.
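As a rough illustration of the advice above, the sketch below derives a reduced thread count from the number of CPU cores and passes it to --threads. The halving heuristic and the Simple.sol file name are assumptions made for this example; choose a value that suits the memory available on your machine.

```shell
#!/bin/sh
# Sketch: run solx with half of the available CPU cores (at least one thread).
# Halving is an illustrative assumption, not a solx recommendation.

# Detect the core count; fall back to 1 if neither tool is available.
cores="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

threads=$(( cores / 2 ))
if [ "$threads" -lt 1 ]; then
  threads=1
fi
echo "cores=${cores} threads=${threads}"

# Only invoke the compiler when it is installed and the example source exists.
if command -v solx >/dev/null 2>&1 && [ -f 'Simple.sol' ]; then
  solx 'Simple.sol' --bin --threads "$threads"
fi
```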
The table below outlines the supported platforms and architectures:
| CPU/OS | MacOS | Linux | Windows |
|---|---|---|---|
| x86_64 | ✅ | ✅ | ✅ |
| arm64 | ✅ | ✅ | ❌ |
Please avoid using outdated distributions of operating systems, as they may lack the necessary dependencies or include outdated versions of them. solx is only tested on recent versions of popular distributions, such as MacOS 11.0 and Windows 10.
Versioning
The solx version consists of two parts:
- solx version itself.
- Version of solc libraries solx is statically linked with.
We recommend always using the latest version of solx to benefit from the latest features and bug fixes.
Ethereum Development Toolkits
For large codebases, it is more convenient to use solx via toolkits such as Hardhat. These tools manage compiler input and output at a higher level and provide additional features like incremental compilation and caching.
Static Executables
We ship solx binaries on the releases page of the eponymous repository. This repository maintains intuitive and stable naming for the executables and provides a changelog for each release. Tools using solx must download the binaries from this repository and cache them locally.
All executables are statically linked and must work on all recent platforms without issues.
Building from Source
Please consider using the pre-built executables before building from source. Building from source is only necessary for development, research, and debugging purposes. Deployment and production use cases should rely only on the officially released executables.
- Install the necessary system-wide dependencies.
  - For Linux (Debian):
    apt install cmake ninja-build curl git libssl-dev pkg-config clang lld
  - For Linux (Arch):
    pacman -Syu which cmake ninja curl git pkg-config clang lld
  - For MacOS:
    - Install the Homebrew package manager by following the instructions at brew.sh.
    - Install the necessary system-wide dependencies:
      brew install cmake ninja coreutils
    - Install a recent build of the LLVM/Clang compiler using one of the following tools:
      - Xcode
      - Apple’s Command Line Tools
      - Your preferred package manager.
- Install Rust.
  The easiest way to do it is following the latest official instructions.
  The Rust version used for building is pinned in the rust-toolchain.toml file at the repository root. cargo will automatically download the pinned version of rustc when you start building the project.
- Clone and checkout this repository with submodules.
  git clone https://github.com/NomicFoundation/solx --recursive
  By default, submodules checkout is disabled to prevent cloning large repositories via cargo. If you're building locally, ensure all submodules are checked out with:
  git submodule update --recursive --checkout
- Build the development tools.
  cargo build --release --bin solx-dev
- Build the LLVM framework using solx-dev.
  ./target/release/solx-dev llvm build --enable-mlir
  This builds LLVM with the EVM target, MLIR, and LLD projects enabled. The build artifacts will be placed in target-llvm/.
  For more information and available build options, run ./target/release/solx-dev llvm build --help.
- Build the solc libraries using solx-dev.
  ./target/release/solx-dev solc build
  This will configure and build the solc libraries in solx-solidity/build/. The command automatically detects MLIR and LLD paths if LLVM was built with those projects.
  For more options, run ./target/release/solx-dev solc build --help.
- Build the solx executable.
  cargo build --release
  The solx executable will appear as ./target/release/solx, where you can run it directly or move it to another location.
  If cargo cannot find the LLVM build artifacts, ensure that the LLVM_SYS_211_PREFIX environment variable is not set in your system, as it may be pointing to a location different from the one expected by solx.
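The environment variable check from the last step can be sketched as a small guard that runs before the final build. This is only an illustration: it unsets the variable for the current shell session and assumes it is executed from the repository root.

```shell
#!/bin/sh
# Sketch: make sure LLVM_SYS_211_PREFIX does not redirect the build
# to an LLVM installation other than the one produced by solx-dev.
if [ -n "${LLVM_SYS_211_PREFIX:-}" ]; then
  echo "LLVM_SYS_211_PREFIX is set to '${LLVM_SYS_211_PREFIX}', unsetting it for this shell" >&2
  unset LLVM_SYS_211_PREFIX
fi

# Build only when cargo is installed and we are at a repository root.
if command -v cargo >/dev/null 2>&1 && [ -f 'Cargo.toml' ]; then
  cargo build --release
fi
```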
Tuning the LLVM build
- For more information and available build options, run ./target/release/solx-dev llvm build --help.
- The --enable-mlir flag enables MLIR support in the LLVM build (required for MLIR-based optimizations). LLD is always built.
- Use the --ccache-variant ccache option to speed up the build process if you have ccache installed.
Building LLVM manually
If you prefer building the LLVM framework manually, include the following flags in your CMake command:
# We recommend using the latest version of CMake.
-DLLVM_TARGETS_TO_BUILD='EVM'
-DLLVM_ENABLE_PROJECTS='lld;mlir'
-DLLVM_ENABLE_RTTI='On'
-DBUILD_SHARED_LIBS='Off'
For most users, solx-dev is the recommended way to build the framework. This section was added for compiler toolchain developers and researchers with specific requirements and experience with the LLVM framework.
Command Line Interface (CLI)
The CLI of solx is designed to mimic that of solc. There are several main input/output (I/O) modes in the solx interface:
- Basic CLI
- Standard JSON
The basic CLI is simpler and suitable for using from the shell. The standard JSON mode is similar to client-server interaction, thus more suitable for using from other applications.
All toolkits using solx must operate in standard JSON mode and follow its specification. This makes the toolkits more robust and future-proof, as the standard JSON mode is the most versatile and is used by the majority of popular projects.
This page focuses on the basic CLI mode. For more information on the standard JSON mode, see this page.
Basic CLI
Basic CLI mode is the simplest way to compile a file with the source code.
To compile a basic Solidity contract, run the simple example from the --bin section.
The rest of this section describes the available CLI options and their usage. You may also check out solx --help for a quick reference.
--bin
Emits the full bytecode.
solx 'Simple.sol' --bin
Output:
======= Simple.sol:Simple =======
Binary:
34601557630000008480630000001a...
--bin-runtime
Emits the runtime part of the bytecode.
solx 'Simple.sol' --bin-runtime
Output:
======= Simple.sol:Simple =======
Binary of the runtime part:
34600b57600336116016575b5f5ffd...
--asm
Emits the text assembly produced by LLVM.
solx 'Simple.sol' --asm
Output:
======= Simple.sol:Simple =======
Deploy LLVM EVM assembly:
.text
.file "Simple.sol:Simple"
main:
.func_begin0:
JUMPDEST
PUSH1 128
PUSH1 64
...
Runtime LLVM EVM assembly:
.text
.file "Simple.sol:Simple.runtime"
main:
.func_begin0:
JUMPDEST
PUSH1 128
PUSH1 64
...
--metadata
Emits the contract metadata. The metadata is a JSON object that contains information about the contract, such as its name, source code hash, the list of dependencies, compiler versions, and so on.
The solx metadata format is compatible with the Solidity metadata format. This means that the metadata output can be used with other tools that support Solidity metadata. Extra solx data is inserted into solc metadata with this JSON object:
{
"solx": {
"llvm_options": [],
"optimizer_settings": {
"is_debug_logging_enabled": false,
"is_fallback_to_size_enabled": false,
"is_verify_each_enabled": false,
"level_back_end": "Aggressive",
"level_middle_end": "Aggressive",
"level_middle_end_size": "Zero"
},
// Optional: only set for Solidity and Yul contracts.
"solc_version": "0.8.34",
// Mandatory: current version of solx.
"solx_version": "0.1.4"
}
}
Usage:
solx 'Simple.sol' --metadata
Output:
======= Simple.sol:Simple =======
Metadata:
{"compiler":{"version":"0.8.34+commit.e2cbf92c"},"language":"Solidity","output":{"abi":[{"inputs":[],"name":"first","outputs":[{"internalType":"uint64","name":"","type":"uint64"}],"stateMutability":"pure","type":"function"},{"inputs":[],"name":"second","outputs":[{"internalType":"uint256","name":"","type":"uint256"}],"stateMutability":"pure","type":"function"}],"devdoc":{"kind":"dev","methods":{},"version":1},"userdoc":{"kind":"user","methods":{},"version":1}},"settings":{"compilationTarget":{"Simple.sol":"Simple"},"evmVersion":"osaka","libraries":{},"metadata":{"bytecodeHash":"ipfs"},"optimizer":{"enabled":false,"runs":200},"remappings":[]},"solx":{"llvm_options":[],"optimizer_settings":{"is_debug_logging_enabled":false,"is_fallback_to_size_enabled":false,"is_verify_each_enabled":false,"level_back_end":"Aggressive","level_middle_end":"Aggressive","level_middle_end_size":"Zero"},"solc_version":"0.8.34","solx_version":"0.1.4"},"sources":{"Simple.sol":{"keccak256":"0x402fe0b38cc9d81e8c9f6d07854cca27fbb307f06d8a129998026907a10c7ca1","license":"MIT","urls":["bzz-raw://04714cab56c1f931e3cc1ddae4c7ff0c8832d0849e23966c6326028f6783d45a","dweb:/ipfs/QmehmUFKCtytG8WcWQ676KvqwURfkVYK89VHZEvSzyLc2Z"]}},"version":1}
--ast-json
Emits the AST of each Solidity file.
solx 'Simple.sol' --ast-json
Output:
======= Simple.sol:Simple =======
JSON AST:
{"absolutePath":".../Simple.sol","exportedSymbols":{"Simple":[24]},"id":25,"license":"MIT","nodeType":"SourceUnit","nodes":[ ... ],"src":"32:288:0"}
Since solx communicates with solc only via standard JSON under the hood, the full JSON AST is emitted instead of the compact one.
--abi
Emits the contract ABI specification.
solx 'Simple.sol' --abi
Output:
======= Simple.sol:Simple =======
Contract JSON ABI:
[{"inputs":[],"name":"first","outputs":[{"internalType":"uint64","name":"","type":"uint64"}],"stateMutability":"pure","type":"function"},{"inputs":[],"name":"second","outputs":[{"internalType":"uint256","name":"","type":"uint256"}],"stateMutability":"pure","type":"function"}]
--hashes
Emits the contract function signatures.
solx 'Simple.sol' --hashes
Output:
======= Simple.sol:Simple =======
Function signatures:
3df4ddf4: first()
5a8ac02d: second()
--storage-layout
Emits the contract storage layout.
solx 'Simple.sol' --storage-layout
Output:
======= Simple.sol:Simple =======
Contract Storage Layout:
{"storage":[{"astId":3,"contract":"Simple.sol:Simple","label":"field_1","offset":0,"slot":"0","type":"t_uint256"},{"astId":5,"contract":"Simple.sol:Simple","label":"field_2","offset":0,"slot":"1","type":"t_uint256"},{"astId":7,"contract":"Simple.sol:Simple","label":"field_3","offset":0,"slot":"2","type":"t_uint256"}],"types":{"t_uint256":{"encoding":"inplace","label":"uint256","numberOfBytes":"32"}}}
--transient-storage-layout
Emits the contract transient storage layout.
solx 'Simple.sol' --transient-storage-layout
Output:
======= Simple.sol:Simple =======
Contract Transient Storage Layout:
{"storage":[{"astId":3,"contract":"Simple.sol:Simple","label":"field_1","offset":0,"slot":"0","type":"t_uint256"},{"astId":5,"contract":"Simple.sol:Simple","label":"field_2","offset":0,"slot":"1","type":"t_uint256"},{"astId":7,"contract":"Simple.sol:Simple","label":"field_3","offset":0,"slot":"2","type":"t_uint256"}],"types":{"t_uint256":{"encoding":"inplace","label":"uint256","numberOfBytes":"32"}}}
--userdoc
Emits the contract user documentation.
solx 'Simple.sol' --userdoc
Output:
======= Simple.sol:Simple =======
User Documentation:
{"kind":"user","methods":{ ... },"version":1}
--devdoc
Emits the contract developer documentation.
solx 'Simple.sol' --devdoc
Output:
======= Simple.sol:Simple =======
Developer Documentation:
{"kind":"dev","methods":{ ... },"version":1}
--asm-solc-json
Emits the solc EVM assembly parsed from solc's JSON output.
solx 'Simple.sol' --asm-solc-json
Output:
======= Simple.sol:Simple =======
EVM assembly:
000 PUSH 80
001 MEMORYGUARD
002 PUSH 40
003 MSTORE
...
This is the solc EVM assembly output that is translated to LLVM IR by solx. For solx's own EVM assembly output emitted by LLVM, use the --asm option instead.
--ir (or --ir-optimized)
Emits the solc Yul IR.
solx does not use the Yul optimizer anymore, so the Yul IR is always unoptimized, and it is not possible to emit solc-optimized Yul IR with solx.
solx 'Simple.sol' --ir
Output:
======= Simple.sol:Simple =======
IR:
/// @use-src 0:"Simple.sol"
object "Simple_24" {
code {
{
...
}
}
/// @use-src 0:"Simple.sol"
object "Simple_24_deployed" {
code {
{
...
}
}
data ".metadata" hex"a26469706673582212206c34df79f8cc8ba870a350940cb8623c60d4f6f9c356e2185b812187d9ae55ee64736f6c63430008220033"
}
}
--debug-info
Emits the ELF-wrapped DWARF debug info of the deploy code.
solx 'Simple.sol' --debug-info
Output:
======= Simple.sol:Simple =======
Debug info:
7f454c46010201ff...
--debug-info-runtime
Emits the ELF-wrapped DWARF debug info of the runtime code.
solx 'Simple.sol' --debug-info-runtime
Output:
======= Simple.sol:Simple =======
Debug info of the runtime part:
7f454c46010201ff
--evmla
Emits EVM legacy assembly (intermediate representation from solc).
When used with --output-dir, writes .evmla files to the output directory. Without --output-dir, outputs to stdout.
Usage with --output-dir:
solx 'Simple.sol' --evmla --output-dir './build/'
ls './build/'
Output:
Compiler run successful.
Simple_sol_Simple.evmla
Simple_sol_Simple.runtime.evmla
Usage with stdout:
solx 'Simple.sol' --evmla --bin
Output:
======= Simple.sol:Simple =======
Binary:
...
Deploy EVM legacy assembly:
000 PUSH 80
...
--ethir
Emits Ethereal IR (intermediate representation between EVM assembly and LLVM IR).
When used with --output-dir, writes .ethir files to the output directory. Without --output-dir, outputs to stdout.
Usage with --output-dir:
solx 'Simple.sol' --ethir --output-dir './build/'
ls './build/'
Output:
Compiler run successful.
Simple_sol_Simple.ethir
Simple_sol_Simple.runtime.ethir
Usage with stdout:
solx 'Simple.sol' --ethir --bin
Output:
======= Simple.sol:Simple =======
Binary:
...
Deploy Ethereal IR:
function main(0, 0, 0, 0, 0) -> 0, 0, 0, 0 {
...
--emit-llvm-ir
Emits LLVM IR (both unoptimized and optimized).
When used with --output-dir, writes .ll files to the output directory. Without --output-dir, outputs to stdout.
Usage with --output-dir:
solx 'Simple.sol' --emit-llvm-ir --output-dir './build/'
ls './build/'
Output:
Compiler run successful.
Simple_sol_Simple.optimized.ll
Simple_sol_Simple.runtime.optimized.ll
Simple_sol_Simple.runtime.unoptimized.ll
Simple_sol_Simple.unoptimized.ll
Usage with stdout:
solx 'Simple.sol' --emit-llvm-ir --bin --via-ir
Output:
======= Simple.sol:Simple =======
Binary:
...
Deploy LLVM IR (unoptimized):
; ModuleID = 'Simple.sol:Simple'
...
Deploy LLVM IR:
; ModuleID = 'Simple.sol:Simple'
...
--benchmarks
Emits benchmarks of the solx LLVM-based pipeline and its underlying call to solc.
solx 'Simple.sol' --benchmarks
Output:
Benchmarks:
solc_Solidity_Standard_JSON: 6ms
solx_Solidity_IR_Analysis: 0ms
solx_Compilation: 75ms
======= Simple.sol:Simple =======
Benchmarks:
Simple.sol:Simple:deploy/EVMAssemblyToLLVMIR/M3B3/SpillArea(0): 0ms
Simple.sol:Simple:deploy/InitVerify/M3B3/SpillArea(0): 0ms
Simple.sol:Simple:deploy/OptimizeVerify/M3B3/SpillArea(0): 1ms
Simple.sol:Simple:runtime/EVMAssemblyToLLVMIR/M3B3/SpillArea(0): 0ms
Simple.sol:Simple.runtime:runtime/InitVerify/M3B3/SpillArea(0): 0ms
Simple.sol:Simple.runtime:runtime/OptimizeVerify/M3B3/SpillArea(0): 5ms
Input Files
solx supports multiple input files. The following command compiles two Solidity files and prints the bytecode:
solx 'Simple.sol' 'Complex.sol' --bin
Solidity import remappings are passed the same way as input files, but they are distinguished by a = symbol between source and destination. The following command compiles a Solidity file with a remapping and prints the bytecode:
solx 'Simple.sol' 'github.com/ethereum/dapp-bin/=/usr/local/lib/dapp-bin/' --bin
solx does not handle remappings itself, but only passes them through to solc. Visit the solc documentation to learn more about the processing of remappings.
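The rule for telling remappings apart from input files can be illustrated with a short sketch. The classify_arg helper is hypothetical and only mirrors the rule described above (an argument containing = is a remapping); it is not how solx is implemented.

```shell
#!/bin/sh
# Sketch: classify a positional argument as an input file or a remapping.
# classify_arg is a hypothetical helper used only for illustration.
classify_arg() {
  case "$1" in
    *=*) echo 'remapping' ;;  # contains '=' between source and destination
    *)   echo 'input' ;;      # anything else is treated as an input file
  esac
}

for arg in 'Simple.sol' 'github.com/ethereum/dapp-bin/=/usr/local/lib/dapp-bin/'; do
  printf '%s -> %s\n' "$arg" "$(classify_arg "$arg")"
done
```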
--libraries
Specifies the libraries to link with compiled contracts. The option accepts multiple string arguments. The safest way is to wrap each argument in single quotes, and separate them with a space.
The specifier has the following format: <ContractPath>:<ContractName>=<LibraryAddress>.
Usage:
solx 'Simple.sol' --bin --libraries 'Simple.sol:Simple=0x1234567890abcdef1234567890abcdef12345678'
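To make the specifier format more concrete, the sketch below splits a specifier into its three parts and checks that the address looks like a 20-byte hexadecimal value. The variable names are chosen for this example, and the check is a convenience you might run in a wrapper script before calling solx, not something solx requires.

```shell
#!/bin/sh
# Sketch: split '<ContractPath>:<ContractName>=<LibraryAddress>' and
# sanity-check the address before handing the specifier to --libraries.
spec='Simple.sol:Simple=0x1234567890abcdef1234567890abcdef12345678'

address="${spec##*=}"          # text after the last '='
target="${spec%=*}"            # text before the last '='
contract_name="${target##*:}"  # text after the last ':'
contract_path="${target%:*}"   # text before the last ':'

# An Ethereum address is 20 bytes: '0x' followed by 40 hex digits.
if printf '%s' "$address" | grep -Eq '^0x[0-9a-fA-F]{40}$'; then
  address_ok='yes'
else
  address_ok='no'
fi
echo "path=${contract_path} name=${contract_name} address=${address} valid=${address_ok}"
```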
--base-path, --include-path, --allow-paths
These options are used to specify Solidity import resolution settings. They are not used by solx and only passed through to solc like import remappings.
Visit the solc documentation to learn more about the processing of these options.
--output-dir
Specifies the output directory for build artifacts. Can only be used in basic CLI mode.
Usage in basic CLI mode:
solx 'Simple.sol' --bin --asm --metadata --output-dir './build/'
ls './build/'
Output:
Compiler run successful. Artifact(s) can be found in directory "build".
Simple_sol_Simple.asm
Simple_sol_Simple.bin
Simple_sol_Simple.runtime.asm
Simple_sol_Simple_llvm.asm
Simple_sol_Simple_llvm.asm-runtime
Simple_sol_Simple_meta.json
--overwrite
Overwrites the output files if they already exist in the output directory. By default, solx does not overwrite existing files.
Can only be used in combination with the --output-dir option.
Usage:
solx 'Simple.sol' --bin --output-dir './build/' --overwrite
If the --overwrite option is not specified and the output files already exist, solx will print an error message and exit:
Error: Refusing to overwrite an existing file "./build/Simple_sol_Simple.bin" (use --overwrite to force).
--version
Prints the version of solx and the hash of the LLVM commit it was built with.
Usage:
solx --version
--help
Prints the help message.
Usage:
solx --help
Other I/O Modes
The mode-altering CLI options are mutually exclusive. This means that only one of the options below can be enabled at a time:
--standard-json
For the standard JSON mode usage, see the Standard JSON page.
solx Compilation Settings
The options in this section are only configuring the solx compiler and do not affect the underlying solc compiler.
--threads
Sets the number of threads used for parallel compilation. Each thread compiles a separate translation unit in a child process. By default, the number of threads equals the number of CPU cores.
Large projects can consume a lot of RAM during compilation on machines with a high number of cores. If you encounter memory issues, consider reducing the number of threads.
Usage:
solx 'Simple.sol' --bin --threads 4
--optimization / -O
Sets the optimization level of the LLVM optimizer. Available values are:
| Level | Meaning | Hints |
|---|---|---|
| 0 | No optimization | For fast compilation during development (unsupported) |
| 1 | Performance: basic | For optimization research |
| 2 | Performance: default | For optimization research |
| 3 | Performance: aggressive | Best performance for production |
| s | Size: default | For optimization research |
| z | Size: aggressive | Best size for contracts with size constraints |
For most cases, it is fine to keep the default value of 3. You should only use the level z if you are ready to deliberately sacrifice performance and optimize for size.
Large contracts may hit the EVM bytecode size limit. In this case, it is recommended to use the --optimization-size-fallback option rather than setting the level to z.
Usage:
solx 'Simple.sol' --bin -O3
This option can also be set with an environment variable SOLX_OPTIMIZATION, which is useful for toolkits
where arbitrary solx-specific options are not supported:
SOLX_OPTIMIZATION='3' solx 'Simple.sol' --bin
--optimization-size-fallback
Sets the optimization level to z for contracts that failed to compile due to overrunning the bytecode size constraints.
Under the hood, this option automatically triggers recompilation of contracts with level z. Contracts that were successfully compiled with the original --optimization setting are not recompiled.
For deployment, it is recommended to have this option enabled in order to mitigate potential issues with EVM bytecode size constraints on a per-contract basis. If your environment does not have bytecode size limitations, it is better to disable it to prevent unnecessary recompilations. A good example is running forge test.
Usage:
solx 'Simple.sol' --bin -O3 --optimization-size-fallback
This option can also be set with an environment variable SOLX_OPTIMIZATION_SIZE_FALLBACK, which is useful for toolkits
where arbitrary solx-specific options are not supported:
SOLX_OPTIMIZATION_SIZE_FALLBACK= solx 'Simple.sol' --bin -O3
--metadata-hash
Specifies the hash format used for contract metadata.
Usage with ipfs:
solx 'Simple.sol' --bin --metadata-hash 'ipfs'
Output with ipfs:
======= Simple.sol:Simple =======
Binary:
34601557630000008480630000001a6080396080f35b5f5ffdfe34600b5760...
a2646970667358221220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df64736f6c637816736f6c783a302e312e343b736f6c633a302e382e33340047
The byte array starting with a2 at the end of the bytecode is the CBOR-encoded compiler version data and an optional metadata hash.
The last two bytes of the metadata (0x0047) are not part of the CBOR payload. They encode its length in bytes, which must be known to correctly decode the payload.
JSON representation of the CBOR payload:
{
// Optional: included if `--metadata-hash` is set to `ipfs`.
"ipfs": "1220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df",
// Required: consists of semicolon-separated pairs of colon-separated compiler names and versions.
// `solx:<version>` is always included.
// `solc:<version>` is only included for Solidity and Yul contracts, but not included for LLVM IR ones.
"solc": "solx:0.1.4;solc:0.8.34"
}
For more information on these formats, see the CBOR and IPFS documentation.
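The length suffix described above can be checked by hand. The sketch below takes the metadata tail from the ipfs example, reads the last two bytes as a big-endian length, and compares it with the actual size of the preceding CBOR payload. It only inspects the length and does not decode the CBOR itself.

```shell
#!/bin/sh
# Sketch: verify the two-byte length suffix of the CBOR metadata tail.
# tail_hex is the metadata part of the bytecode from the ipfs example above.
tail_hex='a2646970667358221220579682b419e25ecc4524604eb5f3a8dbe3b15621ca21cc8ada8dcf6196a512df64736f6c637816736f6c783a302e312e343b736f6c633a302e382e33340047'

length_hex="$(printf '%s' "$tail_hex" | tail -c 4)"  # last two bytes, as hex
payload_hex="${tail_hex%????}"                        # everything before them

declared_len=$(( 0x$length_hex ))        # length stated by the suffix
actual_len=$(( ${#payload_hex} / 2 ))    # two hex digits per byte

echo "declared=${declared_len} actual=${actual_len}"
```

For this example both values are 71 (0x47), so the payload spans the 71 bytes immediately before the suffix.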
--no-cbor-metadata
Disables the CBOR metadata that is appended at the end of bytecode. This option is useful for debugging and research purposes.
It is not recommended to use this option in production, as it is not possible to verify contracts deployed without metadata.
Usage:
solx 'Simple.sol' --no-cbor-metadata
--llvm-options
Specifies additional options for the LLVM framework. The argument must be a single quoted string following a = separator.
Usage:
solx 'Simple.sol' --bin --llvm-options='-key=value'
The --llvm-options option is experimental and must only be used by experienced users. All supported options will be documented in the future.
solc Compilation Settings
The options in this section are only configuring solc, so they are passed directly to its child process, and do not affect the solx compiler.
--via-ir
Switches the solc codegen to Yul a.k.a. IR.
Usage:
solx 'Simple.sol' --bin --via-ir
--evm-version
Specifies the EVM version solx will produce bytecode for. For instance, with version osaka, solx will be producing clz instructions, whereas for older EVM versions it will not.
Only the following EVM versions are supported:
- cancun
- prague
- osaka (default)
Usage:
solx 'Simple.sol' --bin --evm-version 'osaka'
--metadata-literal
Tells solc to store referenced sources as literal data in the metadata output.
This option only affects the contract metadata output produced by solc, and does not affect artifacts produced by solx.
Usage:
solx 'Simple.sol' --bin --metadata --metadata-literal
--no-import-callback
Disables the default import resolution callback in solc.
This parameter is used by some tooling that resolves all imports by itself, such as Hardhat.
Usage:
solx 'Simple.sol' --no-import-callback
Multi-Language Support
solx supports input in multiple programming languages:
- Solidity
- Yul
- LLVM IR
The following sections outline how to use solx with these languages.
--yul (or --strict-assembly)
Enables the Yul mode. In this mode, input is expected to be in the Yul language. The output works the same way as with Solidity input.
Usage:
solx --yul 'Simple.yul' --bin
Output:
======= Simple.yul =======
Binary:
5b60806040525f341415601c5763...
--llvm-ir
Enables the LLVM IR mode. In this mode, input is expected to be in the LLVM IR language. The output works the same way as with Solidity input.
In this mode, every input file is treated as runtime code, while deploy code will be generated automatically by solx. It is not possible to write deploy code manually yet, but it will be supported in the future.
Unlike solc, solx is an LLVM-based compiler toolchain, so it uses LLVM IR as an intermediate representation. It is not recommended to write LLVM IR manually, but it can be useful for debugging and optimization purposes. LLVM IR is more low-level than Yul and EVM assembly in the solx IR hierarchy.
Usage:
solx --llvm-ir 'Simple.ll' --bin
Output:
======= Simple.ll =======
Binary:
5b60806040525f341415601c5763...
Debugging
IR Output Flags
For selective IR output, use the following flags with --output-dir:
- --evmla: EVM legacy assembly
- --ethir: Ethereal IR
- --emit-llvm-ir: LLVM IR (unoptimized and optimized)
- --asm: LLVM EVM assembly
These flags respect the --overwrite option. Without --overwrite, the compiler will refuse to overwrite existing files.
SOLX_OUTPUT_DIR Environment Variable
For debugging purposes, all intermediate build artifacts can be dumped to a directory using the SOLX_OUTPUT_DIR environment variable. This is useful for toolkits where arbitrary solx-specific options are not supported.
When this environment variable is set, solx will output all intermediate representations to the specified directory, always overwriting existing files.
The intermediate build artifacts include:
| Name | Extension |
|---|---|
| EVM Assembly | evmla |
| EthIR | ethir |
| Yul | yul |
| LLVM IR | ll |
| LLVM Assembly | asm |
Usage:
SOLX_OUTPUT_DIR='./debug/' solx 'Simple.sol' --bin
ls './debug/'
Output:
Simple_sol_Simple.evmla
Simple_sol_Simple.ethir
Simple_sol_Simple.unoptimized.ll
Simple_sol_Simple.optimized.ll
Simple_sol_Simple.asm
Simple_sol_Simple.runtime.evmla
Simple_sol_Simple.runtime.ethir
Simple_sol_Simple.runtime.unoptimized.ll
Simple_sol_Simple.runtime.optimized.ll
Simple_sol_Simple.runtime.asm
The output file name is constructed as follows: <ContractPath>_<ContractName>.<Modifiers>.<Extension>.
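The naming scheme can be sketched as a small helper, which is handy when a script needs to locate a specific artifact. The artifact_name function is hypothetical, and the replacement of . with _ in the contract path is inferred from the listing above; how other characters such as / are handled is not covered here.

```shell
#!/bin/sh
# Sketch: build '<ContractPath>_<ContractName>.<Modifiers>.<Extension>'.
# Assumption: '.' in the contract path becomes '_', as in the listing above.
artifact_name() {
  path="$(printf '%s' "$1" | tr '.' '_')"
  name="$2"
  modifiers="$3"   # may be empty, e.g. for the deploy code EVM assembly
  extension="$4"
  if [ -n "$modifiers" ]; then
    printf '%s_%s.%s.%s\n' "$path" "$name" "$modifiers" "$extension"
  else
    printf '%s_%s.%s\n' "$path" "$name" "$extension"
  fi
}

artifact_name 'Simple.sol' 'Simple' '' 'evmla'
artifact_name 'Simple.sol' 'Simple' 'runtime.optimized' 'll'
```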
Additionally, it is possible to dump the standard JSON input file with the SOLX_STANDARD_JSON_DEBUG environment variable:
SOLX_STANDARD_JSON_DEBUG='./debug/input.json' solx 'Simple.sol' --bin
cat './debug/input.json' | jq .
--llvm-verify-each
Enables the verification of the LLVM IR after each optimization pass. This option is useful for debugging and research purposes.
Usage:
solx 'Simple.sol' --bin --llvm-verify-each
--llvm-debug-logging
Enables the debug logging of the LLVM IR optimization passes. This option is useful for debugging and research purposes.
Usage:
solx 'Simple.sol' --bin --llvm-debug-logging
Standard JSON
Standard JSON is a protocol for interaction with the solx and solc compilers. This protocol must be implemented by toolkits such as Hardhat.
The protocol uses two data formats for communication: input JSON and output JSON.
Usage
Input JSON can be provided as a file path via the --standard-json option:
solx --standard-json './input.json'
Alternatively, the input JSON can be fed to solx via stdin:
cat './input.json' | solx --standard-json
You can also insert your standard JSON input directly into the command line:
solx --standard-json
<paste into stdin here and press Ctrl-D>
For the sake of interface unification, solx always returns exit code 0 and prints its standard JSON output to stdout. This differs from solc, which may return exit code 1 and a free-form error in some cases, such as when the standard JSON input file is missing, even though the solc documentation claims otherwise.
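Because the exit code does not signal failure in this mode, a caller has to inspect the output JSON instead. The sketch below assumes the output follows the solc convention of an errors array whose entries carry a severity field. The has_compile_errors helper and both sample outputs are hypothetical, and the grep pattern is a crude stand-in for a proper JSON parser such as jq.

```shell
#!/bin/sh
# Sketch: detect compilation errors in standard JSON output instead of
# relying on the exit code. Assumes the solc-style "errors" array layout.
has_compile_errors() {
  printf '%s' "$1" | grep -Eq '"severity"[[:space:]]*:[[:space:]]*"error"'
}

# Hypothetical outputs, shortened for illustration.
ok_output='{"contracts":{},"errors":[{"severity":"warning","message":"..."}]}'
bad_output='{"errors":[{"severity":"error","message":"..."}]}'

for output in "$ok_output" "$bad_output"; do
  if has_compile_errors "$output"; then
    echo 'compilation failed'
  else
    echo 'compilation succeeded'
  fi
done
```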
Input JSON
The input JSON provides the compiler with the source code and settings for the compilation. The example below serves as the specification of the input JSON format.
This format introduces several solx-specific parameters such as settings.optimizer.sizeFallback. These parameters are marked as solx-only.
On the other hand, parameters that are not mentioned here but are parts of solc standard JSON protocol have no effect in solx.
{
// Required: Source code language.
// Currently supported: "Solidity", "Yul", "LLVM IR".
"language": "Solidity",
// Required: Source code files to compile.
// The keys here are the "global" names of the source files. Imports can be using other file paths via remappings.
"sources": {
// In source file entry, either but not both "urls" and "content" must be specified.
"myFile.sol": {
// Required (unless "content" is used): URL(s) to the source file.
"urls": [
// In Solidity mode, directories must be added to the command-line via "--allow-paths <path>" for imports to work.
// It is possible to specify multiple URLs for a single source file. In this case the first successfully resolved URL will be used.
"/tmp/path/to/file.sol"
],
// Required (unless "urls" is used): Literal contents of the source file.
"content": "contract settable is owned { uint256 private x = 0; function set(uint256 _x) public { if (msg.sender == owner) x = _x; } }"
}
},
// Required: Compilation settings.
"settings": {
// Optional: Optimizer settings.
"optimizer": {
// Optional, solx-only: Set the LLVM optimizer level.
// Available options:
// -0: do not optimize (unsupported)
// -1: basic optimizations for gas usage
// -2: advanced optimizations for gas usage
// -3: all optimizations for gas usage
// -s: basic optimizations for bytecode size
// -z: all optimizations for bytecode size
// Default: 3.
"mode": "3",
// Optional, solx-only: Re-run the compilation with "mode": "z" if the initial compilation exceeds the EVM bytecode size limit.
// Used on a per-contract basis and applied automatically, so some contracts will end up compiled in the initial mode, and others with "mode": "z".
// Only activated if "mode" is set to "3", which is the default optimization mode.
// Default: false.
"sizeFallback": false
},
// Optional: Sorted list of remappings.
// Important: Only used with Solidity input.
"remappings": [ ":g=/dir" ],
// Optional: Addresses of the libraries.
// If not all library addresses are provided here, it will result in unlinked bytecode files that will require post-compile-time linking before deployment.
// Important: Only used with Solidity, Yul, and LLVM IR input.
"libraries": {
// The top level key is the name of the source file where the library is used.
// If remappings are used, this source file should match the global path after remappings were applied.
"myFile.sol": {
// Source code library name and address where it is deployed.
"MyLib": "0x123123..."
}
},
// Optional: Version of EVM solx will produce bytecode for.
// Supported EVM versions: "cancun", "prague", "osaka".
// For instance, with version "osaka", solx will be producing `clz` instructions, whereas for older EVM versions it will not.
// The oldest supported EVM version is "cancun".
// Default: "osaka".
"evmVersion": "osaka",
// Optional: Select the desired output.
// Default: no flags are selected, and no output is generated.
"outputSelection": {
"<path>": {
// Available file-level options, must be listed under "<path>"."":
"": [
// AST of all source files.
"ast",
// Benchmarks of the solx LLVM-based compilation pipeline and its underlying call to solc.
"benchmarks"
],
// Available contract-level options, must be listed under "<path>"."<name>":
"<name>": [
// Solidity ABI.
"abi",
// Metadata.
"metadata",
// Developer documentation (natspec).
"devdoc",
// User documentation (natspec).
"userdoc",
// Slots, offsets and types of the contract's state variables in storage.
"storageLayout",
// Slots, offsets and types of the contract's state variables in transient storage.
"transientStorageLayout",
// Yul produced by solc.
// An alias "irOptimized" is supported for compatibility, but it will request unoptimized Yul IR anyway.
"ir",
// Everything of the below.
"evm",
// Solidity function hashes.
"evm.methodIdentifiers",
// EVM assembly produced by solc.
"evm.legacyAssembly",
// Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
"evm.gasEstimates",
// Everything that starts with "evm.bytecode".
"evm.bytecode",
// Deploy bytecode produced by solx/LLVM.
// While the solx bytecode linker is experimental, all contracts are compiled if this key is enabled for at least one contract.
"evm.bytecode.object",
// Deploy code assembly produced by solx/LLVM.
"evm.bytecode.llvmAssembly",
// solx-only: EVM legacy assembly IR (internal representation). Only available for non-viaIR mode.
"evm.bytecode.evmla",
// solx-only: Ethereal IR (internal representation). Only available for non-viaIR mode.
"evm.bytecode.ethir",
// solx-only: Unoptimized LLVM IR (internal representation).
"evm.bytecode.llvmIrUnoptimized",
// solx-only: Optimized LLVM IR (internal representation).
"evm.bytecode.llvmIr",
// ELF-wrapped DWARF debug info produced by solx/LLVM. Only available for Solidity source code input.
"evm.bytecode.debugInfo",
// Link references for linkers that are to resolve library addresses at deploy time.
"evm.bytecode.linkReferences",
// Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
"evm.bytecode.opcodes",
// Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
"evm.bytecode.sourceMap",
// Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
"evm.bytecode.functionDebugData",
// Unsupported, but emitted as an empty array to preserve compatibility with some toolkits.
"evm.bytecode.generatedSources",
// Everything that starts with "evm.deployedBytecode".
"evm.deployedBytecode",
// Runtime bytecode produced by solx/LLVM.
// While the solx bytecode linker is experimental, all contracts are compiled if this key is enabled for at least one contract.
"evm.deployedBytecode.object",
// Runtime code assembly produced by solx/LLVM.
"evm.deployedBytecode.llvmAssembly",
// solx-only: EVM legacy assembly IR (internal representation). Only available for non-viaIR mode.
"evm.deployedBytecode.evmla",
// solx-only: Ethereal IR (internal representation). Only available for non-viaIR mode.
"evm.deployedBytecode.ethir",
// solx-only: Unoptimized LLVM IR (internal representation).
"evm.deployedBytecode.llvmIrUnoptimized",
// solx-only: Optimized LLVM IR (internal representation).
"evm.deployedBytecode.llvmIr",
// Link references for linkers that are to resolve library addresses at deploy time.
"evm.deployedBytecode.linkReferences",
// Resolved automatically by solx/LLVM, but emitted as an empty object to preserve compatibility with some toolkits.
"evm.deployedBytecode.immutableReferences",
// ELF-wrapped DWARF debug info produced by solx/LLVM. Only available for Solidity source code input.
"evm.deployedBytecode.debugInfo",
// Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
"evm.deployedBytecode.opcodes",
// Unsupported, but emitted as an empty string to preserve compatibility with some toolkits.
"evm.deployedBytecode.sourceMap",
// Unsupported, but emitted as an empty object to preserve compatibility with some toolkits.
"evm.deployedBytecode.functionDebugData",
// Unsupported, but emitted as an empty array to preserve compatibility with some toolkits.
"evm.deployedBytecode.generatedSources"
]
}
},
// Optional: Metadata settings.
"metadata": {
// Optional: Use the given hash method for the metadata hash that is appended to the bytecode.
// Available options: "none", "ipfs".
// Default: "ipfs".
"bytecodeHash": "ipfs",
// Optional: Use only literal content and not URLs.
// Default: false.
"useLiteralContent": true,
// Optional: Whether to include CBOR-encoded metadata at the end of bytecode.
// Default: true.
"appendCBOR": true
},
// Optional: Enables the IR codegen in solc.
"viaIR": true,
// Optional, solx-only: Extra LLVM settings.
"llvmOptions": [
"-key", "value"
]
}
}
Output JSON
The output JSON contains all artifacts produced by solx and solc together. The example below serves as the specification of the output JSON format.
{
// Required: File-level outputs.
"sources": {
"sourceFile.sol": {
// Required: Identifier of the source.
"id": 1,
// Optional: The AST object.
// Corresponds to "ast" in the outputSelection settings.
"ast": {/* ... */}
}
},
// Required: Contract-level outputs.
"contracts": {
// The source name.
"sourceFile.sol": {
// The contract name.
// If the language only supports one contract per file, this field equals the source name.
"ContractName": {
// Optional: The Ethereum Contract ABI (object).
// See https://docs.soliditylang.org/en/develop/abi-spec.html.
// Corresponds to "abi" in the outputSelection settings.
"abi": [/* ... */],
// Optional: Storage layout (object).
// Corresponds to "storageLayout" in the outputSelection settings.
"storageLayout": {/* ... */},
// Optional: Transient storage layout (object).
// Corresponds to "transientStorageLayout" in the outputSelection settings.
"transientStorageLayout": {/* ... */},
// Optional: Contract metadata (string).
// Corresponds to "metadata" in the outputSelection settings.
"metadata": "/* ... */",
// Optional: Developer documentation (natspec object).
// Corresponds to "devdoc" in the outputSelection settings.
"devdoc": {/* ... */},
// Optional: User documentation (natspec object).
// Corresponds to "userdoc" in the outputSelection settings.
"userdoc": {/* ... */},
// Optional: Yul produced by solc (string).
// Corresponds to "ir" in the outputSelection settings.
"ir": "/* ... */",
// Optional: EVM target outputs.
// Corresponds to "evm" in the outputSelection settings.
"evm": {
// Optional: EVM assembly produced by solc (object).
// Corresponds to "evm.legacyAssembly" in the outputSelection settings.
"legacyAssembly": {/* ... */},
// Optional: List of function hashes (object).
// Corresponds to "evm.methodIdentifiers" in the outputSelection settings.
"methodIdentifiers": {
// Mapping between the function signature and its hash.
"delegate(address)": "5c19a95c"
},
// Optional: Always empty, included only to preserve compatibility with some toolkits (object).
// Corresponds to "evm.gasEstimates" in the outputSelection settings.
"gasEstimates": {},
// Optional: Deploy EVM bytecode.
// Corresponds to "evm.bytecode" in the outputSelection settings.
"bytecode": {
// Optional: Bytecode (string).
// Corresponds to "evm.bytecode.object" in the outputSelection settings.
"object": "5b60806040525f341415601c5763...",
// Optional: LLVM text assembly (string).
// Corresponds to "evm.bytecode.llvmAssembly" in the outputSelection settings.
"llvmAssembly": "/* ... */",
// Optional, solx-only: EVM legacy assembly IR (string). Only available for non-viaIR mode.
// Corresponds to "evm.bytecode.evmla" in the outputSelection settings.
"evmla": "/* ... */",
// Optional, solx-only: Ethereal IR (string). Only available for non-viaIR mode.
// Corresponds to "evm.bytecode.ethir" in the outputSelection settings.
"ethir": "/* ... */",
// Optional, solx-only: Unoptimized LLVM IR (string).
// Corresponds to "evm.bytecode.llvmIrUnoptimized" in the outputSelection settings.
"llvmIrUnoptimized": "/* ... */",
// Optional, solx-only: Optimized LLVM IR (string).
// Corresponds to "evm.bytecode.llvmIr" in the outputSelection settings.
"llvmIr": "/* ... */",
// Optional: ELF-wrapped DWARF debug info (string).
// Corresponds to "evm.bytecode.debugInfo" in the outputSelection settings.
"debugInfo": "/* ... */",
// Optional: Link references for linkers that are to resolve library addresses at deploy time (object).
// Corresponds to "evm.bytecode.linkReferences" in the outputSelection settings.
"linkReferences": {/* ... */},
// Optional: Benchmarks of each stage of the compilation on a per-translation unit basis (array).
// Corresponds to "benchmarks" in the outputSelection settings.
"benchmarks": [/* ... */],
// Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
// Corresponds to "evm.bytecode.opcodes" in the outputSelection settings.
"opcodes": "",
// Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
// Corresponds to "evm.bytecode.sourceMap" in the outputSelection settings.
"sourceMap": "",
// Optional: Always empty object, included only to preserve compatibility with some toolkits (object).
// Corresponds to "evm.bytecode.functionDebugData" in the outputSelection settings.
"functionDebugData": {},
// Optional: Always empty array, included only to preserve compatibility with some toolkits (array).
// Corresponds to "evm.bytecode.generatedSources" in the outputSelection settings.
"generatedSources": []
},
// Optional: Runtime EVM bytecode.
// Corresponds to "evm.deployedBytecode" in the outputSelection settings.
"deployedBytecode": {
// Optional: Bytecode (string).
// Corresponds to "evm.deployedBytecode.object" in the outputSelection settings.
"object": "5b60806040525f34141560145760...",
// Optional: LLVM text assembly (string).
// Corresponds to "evm.deployedBytecode.llvmAssembly" in the outputSelection settings.
"llvmAssembly": "/* ... */",
// Optional, solx-only: EVM legacy assembly IR (string). Only available for non-viaIR mode.
// Corresponds to "evm.deployedBytecode.evmla" in the outputSelection settings.
"evmla": "/* ... */",
// Optional, solx-only: Ethereal IR (string). Only available for non-viaIR mode.
// Corresponds to "evm.deployedBytecode.ethir" in the outputSelection settings.
"ethir": "/* ... */",
// Optional, solx-only: Unoptimized LLVM IR (string).
// Corresponds to "evm.deployedBytecode.llvmIrUnoptimized" in the outputSelection settings.
"llvmIrUnoptimized": "/* ... */",
// Optional, solx-only: Optimized LLVM IR (string).
// Corresponds to "evm.deployedBytecode.llvmIr" in the outputSelection settings.
"llvmIr": "/* ... */",
// Optional: ELF-wrapped DWARF debug info (string).
// Corresponds to "evm.deployedBytecode.debugInfo" in the outputSelection settings.
"debugInfo": "/* ... */",
// Optional: Link references for linkers that are to resolve library addresses at deploy time (object).
// Corresponds to "evm.deployedBytecode.linkReferences" in the outputSelection settings.
"linkReferences": {/* ... */},
// Optional: Benchmarks of each stage of the compilation on a per-translation unit basis (array).
// Corresponds to "benchmarks" in the outputSelection settings.
"benchmarks": [/* ... */],
// Optional: Resolved by LLVM automatically, so always returned as an empty object (object).
// Included only to preserve compatibility with some toolkits.
// Corresponds to "evm.deployedBytecode.immutableReferences" in the outputSelection settings.
"immutableReferences": {},
// Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
// Corresponds to "evm.deployedBytecode.opcodes" in the outputSelection settings.
"opcodes": "",
// Optional: Always empty string, included only to preserve compatibility with some toolkits (string).
// Corresponds to "evm.deployedBytecode.sourceMap" in the outputSelection settings.
"sourceMap": "",
// Optional: Always empty object, included only to preserve compatibility with some toolkits (object).
// Corresponds to "evm.deployedBytecode.functionDebugData" in the outputSelection settings.
"functionDebugData": {},
// Optional: Always empty array, included only to preserve compatibility with some toolkits (array).
// Corresponds to "evm.deployedBytecode.generatedSources" in the outputSelection settings.
"generatedSources": []
}
}
}
}
},
// Optional: Benchmarks of the solx LLVM-based compilation pipeline and its underlying call to solc (array).
// Corresponds to "benchmarks" in the outputSelection settings.
"benchmarks": [/* ... */],
// Optional: Unset if no messages were emitted.
"errors": [
{
// Optional: Location within the source file.
// Unset if the error is unrelated to input sources.
"sourceLocation": {
// Required: The source path.
"file": "sourceFile.sol",
// Required: The source location start. Equals -1 if unknown.
"start": 0,
// Required: The source location end. Equals -1 if unknown.
"end": 100
},
// Required: Message type.
// solc errors are listed at https://docs.soliditylang.org/en/latest/using-the-compiler.html#error-types.
"type": "Error",
// Required: Component the error originates from.
"component": "general",
// Required: Message severity.
// Possible values: "error", "warning", "info".
"severity": "error",
// Optional: Unique code for the cause of the error.
// Only solc produces error codes for now.
// solx currently emits errors without codes, but they will be introduced soon.
"errorCode": "3141",
// Required: Message.
"message": "Invalid keyword",
// Required: Message formatted using the source location.
"formattedMessage": "sourceFile.sol:100: Invalid keyword"
}
]
}
Limitations and Differences from Upstream solc
This chapter summarizes where solx differs from upstream solc, and which limitations currently apply.
Compilation Modes
solx supports two codegen pipelines:
- Yul pipeline: enabled with `--via-ir` (matching solc's `--via-ir` flag).
- Legacy EVM assembly pipeline: the default code generation path.
The --evmla and --ethir debug flags are only available in the legacy (non-via-ir) pipeline.
solc Fork Modifications
The solx-solidity fork includes the following changes relative to upstream solc:
- `extraMetadata` output: emits user-defined function metadata (name, entry tag, input/output sizes, AST IDs) used during LLVM lowering.
- `DUPX`/`SWAPX` instructions: extends stack access beyond depth 16 to avoid classic "stack too deep" failures.
- `spillAreaSize` setting: configures a memory spill region for values that cannot remain on stack.
- Function pointer dispatch tables: uses static dispatch through `FuncPtrTracker` instead of dynamic jump-based dispatch.
- Simplified `try/catch` in legacy mode: reduces control-flow complexity for translator compatibility.
- Bypassed EVM bytecode generation: solx does not use solc's EVM bytecode output; final bytecode is produced by the LLVM backend.
- Disabled optimizer: the solc optimizer is turned off to preserve function boundaries and metadata validity. All optimization is handled by the LLVM backend.
Behavioral Differences
- Generated bytecode can differ from upstream solc output because final code generation happens in LLVM.
- Optimization levels map to LLVM optimization pipelines, not upstream `solc` optimization heuristics.
- Final code size can differ from upstream due to LLVM pass behavior.
Unsupported Features
- `CALLCODE` is rejected at compile time. Use `DELEGATECALL` instead.
- `SELFDESTRUCT` is rejected at compile time (deprecated by EIP-6049).
- `PC` (program counter) is not supported.
- `BLOBHASH` and `BLOBBASEFEE` (EIP-4844/EIP-7516) are rejected at compile time.
- Inline assembly marked `memory-safe` can cause errors when spill-area-based lowering is active.
- Some `solc` optimizer settings are ignored since the solc optimizer is disabled.
Version Support
- The solx-solidity fork tracks upstream solc releases.
- The minimum supported Solidity version matches the forked solc version.
Architecture
solx is an LLVM-based compiler that translates Solidity source code into optimized EVM bytecode.
Components
The compiler consists of three repositories:
- solx — The main compiler executable and Rust crates that translate Yul and EVM assembly to LLVM IR.
- solx-solidity — An LLVM-friendly fork of the Solidity compiler that emits Yul and EVM assembly.
- solx-llvm — A fork of the LLVM framework with an EVM target backend.
Compilation Pipeline
                          ┌─────────────────────────────────────────────┐
                          │                  Frontend                   │
  ┌──────────┐            │  ┌────────────────┐       ┌──────────────┐  │
  │ Solidity │ ────────── │  │ solx-solidity  │ ───── │     solx     │  │
  │  source  │            │  │                │       │              │  │
  └──────────┘            │  │ Parsing,       │ Yul / │ Yul & EVM    │  │
                          │  │ semantic       │  EVM  │ assembly     │  │
                          │  │ analysis       │  asm  │ translation  │  │
                          │  └────────────────┘       └──────────────┘  │
                          └─────────────────────────────────────────────┘
                                                 │
                                              LLVM IR
                                                 │
                                                 ▼
                          ┌─────────────────────────────────────────────┐
                          │                 Middle-end                  │
                          │  ┌────────────────────────────────────────┐ │
                          │  │             LLVM Optimizer             │ │
                          │  │                                        │ │
                          │  │  IR transformations and optimizations  │ │
                          │  └────────────────────────────────────────┘ │
                          └─────────────────────────────────────────────┘
                                                 │
                                           Optimized IR
                                                 │
                                                 ▼
                          ┌─────────────────────────────────────────────┐
                          │                   Backend                   │
                          │  ┌────────────────────────────────────────┐ │
                          │  │          solx-llvm EVM Target          │ │
                          │  │                                        │ │
                          │  │    Instruction selection, register     │ │
                          │  │       allocation, code emission        │ │
                          │  └────────────────────────────────────────┘ │
                          └─────────────────────────────────────────────┘
                                                 │
                                                 ▼
                                          ┌──────────────┐
                                          │ EVM bytecode │
                                          └──────────────┘
Frontend
The frontend transforms Solidity source code into LLVM IR:
- solx-solidity parses the Solidity source, performs semantic analysis, and emits either Yul or EVM assembly.
- solx reads the Yul or EVM assembly and translates it into LLVM IR.
Middle-end
The LLVM optimizer applies a series of IR transformations to improve code quality and performance. These optimizations are target-independent and work on the LLVM IR representation.
Backend
The solx-llvm EVM target converts optimized LLVM IR into EVM bytecode. This includes:
- Instruction selection (mapping IR operations to EVM opcodes)
- Register allocation (managing the EVM stack)
- Stackification (converting register-based code to stack-based EVM operations)
- Code emission (generating the final bytecode)
Why a Fork of solc?
The solx-solidity fork includes modifications to make the Solidity compiler output compatible with LLVM IR generation. The upstream solc compiler is designed to emit EVM bytecode directly, but solx needs intermediate representations (Yul or EVM assembly) that can be translated to LLVM IR.
The fork maintains compatibility with upstream solc and tracks its releases.
EVM Assembly Translator
The EVM assembly translator converts legacy EVM assembly (the default solc output) into LLVM IR via an intermediate representation called Ethereal IR (EthIR). The Yul pipeline (--via-ir) bypasses this translator entirely.
Why EthIR?
EVM assembly is stack-based with dynamic jumps, making it difficult to translate directly to LLVM IR which requires explicit control flow graphs. EthIR bridges this gap by:
- Tracking stack state to identify jump destinations at compile time
- Cloning blocks reachable from predecessors with different stack states
- Reconstructing control flow from stack-based jumps into a static CFG
- Resolving function calls using metadata from the solc fork
Translation Pipeline
Solidity source
│
▼
solc (solx-solidity fork)
│ Emits EVM assembly JSON + extraMetadata
▼
Assembly parsing
│ Parses instructions, resolves dependencies
▼
Block construction
│ Groups instructions between Tag labels
▼
EthIR traversal
│ DFS with stack simulation, block cloning
▼
LLVM IR generation
│ Creates LLVM functions, basic blocks, instructions
▼
LLVM optimizer
│
▼
EVM bytecode (via LLVM EVM backend)
Key Data Structures
Assembly
The Assembly struct represents the raw solc output. It contains:
- code: Flat list of instructions (deploy code)
- data["0"]: Nested assembly for runtime code
- data[hex]: Referenced data entries — sub-assemblies, hashes, or resolved contract paths (for CREATE/CREATE2)
Each instruction has a name (opcode), optional value (operand), and optional source location.
EtherealIR
The top-level container holding:
- entry_function: The main contract function (deploy + runtime)
- defined_functions: Internal functions discovered during traversal
Function
The Function struct is the core of the translator. It contains:
- blocks: `BTreeMap<BlockKey, Vec<Block>>`, mapping each block tag to one or more instances (clones for different stack states)
- block_hash_index: `HashMap<BlockKey, HashSet<u64>>`, used for fast duplicate detection by stack hash
- stack_size: Maximum stack height observed, used to size LLVM stack allocations
Block
Each Block represents a sequence of instructions between two Tag labels:
- key: `BlockKey` (code segment + tag number)
- instance: Clone index (0, 1, 2... for blocks visited with different stack states)
- elements: Instructions with full stack state snapshots
- initial_stack / stack: Stack state at entry and after processing
Stack Elements
The stack tracks six kinds of values:
| Variant | Description | Example |
|---|---|---|
| `Value(String)` | Runtime value (opaque) | Result of `ADD`, `MLOAD` |
| `Constant(BigUint)` | Compile-time 256-bit constant | `0x60`, `0xFFFF` |
| `Tag(u64)` | Block tag (jump target) | Tag 42 |
| `Path(String)` | Contract dependency path | `"SubContract"` |
| `Data(String)` | Hex data chunk | `"deadbeef"` |
| `ReturnAddress(usize)` | Function return marker | Return with 2 outputs |
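The six variants in the table can be pictured as a Rust enum. This is only a sketch under assumptions: a 32-byte array stands in for `BigUint` so that no external crate is needed, the payload of `ReturnAddress` is read as the number of outputs (as the example column suggests), and the `is_jump_target` helper is hypothetical.

```rust
/// Simplified model of the six stack element kinds.
/// Assumption: `[u8; 32]` replaces `BigUint` to keep the sketch std-only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Element {
    /// Opaque runtime value, e.g. the result of ADD or MLOAD.
    Value(String),
    /// Compile-time 256-bit constant (big-endian bytes in this sketch).
    Constant([u8; 32]),
    /// Block tag, i.e. a potential jump target.
    Tag(u64),
    /// Contract dependency path.
    Path(String),
    /// Hex data chunk.
    Data(String),
    /// Function return marker (assumed to carry the number of outputs).
    ReturnAddress(usize),
}

impl Element {
    /// Hypothetical helper: only tags can act as jump destinations.
    pub fn is_jump_target(&self) -> bool {
        matches!(self, Element::Tag(_))
    }
}
```

Keeping tags as a separate variant is what lets the translator tell control-flow-relevant stack slots apart from plain data.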
Block Cloning and Stack Hashing
The same block may be reached via different code paths with different stack contents. Since the stack determines jump targets (a JUMP pops its destination from the stack), the translator must handle each unique stack state separately.
How It Works
- When entering a block, the translator computes a stack hash using `XxHash3_64`
- The hash considers only `Tag` elements: tags determine control flow, while constants and runtime values affect only data flow
- The pair `(BlockKey, stack_hash)` uniquely identifies a block instance
- If this pair has been visited before, the block is skipped (cycle detection)
- Otherwise, a new block instance is created
Block "process" reached with stack [T_10, V_x]: → instance 0
Block "process" reached with stack [T_20, V_y]: → instance 1 (different tag)
Block "process" reached with stack [T_10, V_z]: → instance 0 (same hash, reused)
Stack Hash Algorithm
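fn hash(&self) -> u64 {
    let mut hasher = XxHash3_64::default();
    for element in self.elements.iter() {
        match element {
            // Tags select jump targets, so they are hashed by value.
            Element::Tag(tag) => hasher.write(&tag.to_le_bytes()),
            // Any other element contributes the same placeholder byte.
            _ => hasher.write_u8(0),
        }
    }
    hasher.finish()
}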
Only Tag values contribute to the hash. This is intentional: two stack states with the same tags but different runtime values will follow the same control flow path.
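The effect of tag-only hashing can be reproduced with a small standalone sketch. It is not the solx implementation: `DefaultHasher` from the standard library stands in for `XxHash3_64`, and the two-variant `Element` enum and the free function `stack_hash` are illustrative names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Reduced element model: only "tag" versus "anything else" matters here.
#[derive(Debug, Clone)]
pub enum Element {
    Tag(u64),
    Value(String),
}

/// Tag-only stack hash.
/// Assumption: `DefaultHasher` replaces `XxHash3_64` so the sketch is std-only.
pub fn stack_hash(elements: &[Element]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for element in elements {
        match element {
            // Tags decide where a JUMP lands, so their value feeds the hash.
            Element::Tag(tag) => hasher.write(&tag.to_le_bytes()),
            // Everything else contributes one fixed byte, keeping positions distinct.
            _ => hasher.write_u8(0),
        }
    }
    hasher.finish()
}
```

With this function, the stacks `[T_10, V_x]` and `[T_10, V_z]` hash identically, while `[T_20, V_y]` gets a different hash, which matches the instance assignment in the "process" example.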
Traversal Algorithm
The Function::traverse() method performs a depth-first traversal of blocks, simulating EVM execution:
traverse(blocks, extra_metadata):
    queue ← [(entry_block, empty_stack)]
    visited ← {}
    while queue is not empty:
        (block_key, stack) ← queue.pop()
        hash ← stack.hash()
        if (block_key, hash) in visited:
            continue
        visited.add((block_key, hash))
        block ← blocks[block_key].clone_with(stack)
        for instruction in block:
            simulate_instruction(instruction, stack)
            if instruction is JUMP/JUMPI:
                queue.push((target_tag, stack))
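The pseudocode can be turned into a small runnable toy. The sketch below makes several simplifying assumptions that are not taken from solx: the instruction set is reduced to four hypothetical `Op` variants, `JUMPI` and function calls are left out, and a tag-only fingerprint (`Vec<Option<u64>>`) replaces the numeric stack hash so that collisions cannot occur.

```rust
use std::collections::{BTreeMap, HashSet};

/// Toy instruction set, just enough to drive the traversal (hypothetical).
#[derive(Debug, Clone)]
pub enum Op {
    PushTag(u64), // push a block tag
    PushValue,    // push an opaque runtime value
    Jump,         // pop a tag and continue at that block
    Stop,         // end of this path
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Slot {
    Tag(u64),
    Value,
}

/// Tag-only view of a stack; stands in for the stack hash.
pub type Fingerprint = Vec<Option<u64>>;

/// Visits every (block, tag fingerprint) pair once, depth-first.
/// Returns the visited pairs in visit order.
pub fn traverse(blocks: &BTreeMap<u64, Vec<Op>>, entry: u64) -> Vec<(u64, Fingerprint)> {
    let mut worklist: Vec<(u64, Vec<Slot>)> = vec![(entry, Vec::new())];
    let mut visited: HashSet<(u64, Fingerprint)> = HashSet::new();
    let mut order = Vec::new();

    while let Some((key, mut stack)) = worklist.pop() {
        let fingerprint: Fingerprint = stack
            .iter()
            .map(|slot| match slot {
                Slot::Tag(tag) => Some(*tag),
                Slot::Value => None,
            })
            .collect();
        // Same block with the same tags on the stack: already handled.
        if !visited.insert((key, fingerprint.clone())) {
            continue;
        }
        order.push((key, fingerprint));

        if let Some(ops) = blocks.get(&key) {
            for op in ops {
                match op {
                    Op::PushTag(tag) => stack.push(Slot::Tag(*tag)),
                    Op::PushValue => stack.push(Slot::Value),
                    Op::Jump => {
                        // The destination is whatever tag sits on top of the stack.
                        if let Some(Slot::Tag(target)) = stack.pop() {
                            worklist.push((target, stack.clone()));
                        }
                        break;
                    }
                    Op::Stop => break,
                }
            }
        }
    }
    order
}
```

If a shared block consisting of a single `Jump` is entered once with return tag 1 and once with return tag 2 on the stack, `traverse` reports it twice, once per fingerprint, which is the block cloning behaviour described earlier.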
Instruction Simulation
For each instruction, the translator:
- Pops the required number of inputs from the simulated stack
- Computes the output (compile-time if possible, runtime value otherwise)
- Pushes the result onto the stack
- For control flow instructions, queues successor blocks
Compile-Time Constant Folding
Arithmetic operations on known values are folded at compile time:
| Operands | Result |
|---|---|
| `Constant + Constant` | `Constant` (computed) |
| `Tag + Constant` | `Tag` (if result is valid block) |
| `Tag + Tag` | `Tag` (if result is valid block) |
| Any other combination | Value (runtime, opaque) |
This is critical for resolving jump targets: solc often computes jump destinations via PUSH tag + arithmetic.
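A minimal sketch of the folding table for `ADD` might look as follows. The assumptions are loud ones: `u64` with wrapping arithmetic stands in for 256-bit words, `valid_tags` is a hypothetical set of tags that name existing blocks, `Constant + Tag` is treated like `Tag + Constant`, and an invalid tag result is assumed to degrade to an opaque value.

```rust
use std::collections::HashSet;

/// Reduced element model for the folding example.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Element {
    Constant(u64),
    Tag(u64),
    Value(String),
}

/// Folds an ADD of two simulated stack elements.
/// Assumption: `u64` replaces the 256-bit word; `valid_tags` is hypothetical.
pub fn fold_add(a: &Element, b: &Element, valid_tags: &HashSet<u64>) -> Element {
    // Result used whenever nothing is known at compile time.
    let opaque = || Element::Value("add".to_string());
    match (a, b) {
        // Two constants fold to a constant.
        (Element::Constant(x), Element::Constant(y)) => Element::Constant(x.wrapping_add(*y)),
        // Tag arithmetic stays a tag only if it lands on an existing block.
        (Element::Tag(x), Element::Constant(y))
        | (Element::Constant(y), Element::Tag(x))
        | (Element::Tag(x), Element::Tag(y)) => {
            let sum = x.wrapping_add(*y);
            if valid_tags.contains(&sum) {
                Element::Tag(sum)
            } else {
                opaque()
            }
        }
        // Any other combination is a runtime value.
        _ => opaque(),
    }
}
```

For example, with block 12 present, `Tag(10) + Constant(2)` folds to `Tag(12)`, so a later `JUMP` on that slot has a statically known destination.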
Function Call Detection
The translator identifies function calls using extra metadata from the solc fork. The extraMetadata JSON field lists all user-defined functions with their:
- Entry tag (in deploy and/or runtime code)
- Input parameter count
- Output return value count
- Function name and AST node ID
When a JUMP targets a known function entry:
- The stack is split: return address, arguments, and remaining caller state
- A `RecursiveCall` pseudo-instruction replaces the JUMP
- A new `Function` is created and recursively traversed from the entry block
- The caller's stack receives `output_size` opaque return values
Before JUMP to function "add(uint,uint)":
Stack: [... | return_tag | arg1 | arg2 | function_entry_tag]
After call detection:
Instruction: RecursiveCall add(uint,uint), input=2, output=1
Caller stack: [... | return_value]
Callee: new Function traversed from entry tag
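The stack split shown above can be sketched with plain string slots. Both function names are hypothetical, and the layout (top of stack on the right, entry tag on top, then the arguments, then the return tag) is taken from the example rather than from the solx sources.

```rust
/// Splits a simulated caller stack at a detected call site.
/// Layout, top on the right: [rest.., return_tag, args.., entry_tag].
/// Returns (rest, return_tag, args), or None if the stack is too short.
pub fn split_call_stack(
    stack: &[String],
    input_size: usize,
) -> Option<(Vec<String>, String, Vec<String>)> {
    // Entry tag, arguments, and return tag must all be present.
    let needed = input_size + 2;
    if stack.len() < needed {
        return None;
    }
    let rest_len = stack.len() - needed;
    let rest = stack[..rest_len].to_vec();
    let return_tag = stack[rest_len].clone();
    let args = stack[rest_len + 1..rest_len + 1 + input_size].to_vec();
    Some((rest, return_tag, args))
}

/// Caller stack after the call: remaining state plus opaque return values.
pub fn after_call(mut rest: Vec<String>, output_size: usize) -> Vec<String> {
    for index in 0..output_size {
        rest.push(format!("return_value_{}", index));
    }
    rest
}
```

For a two-argument, one-result function, a five-slot stack shrinks to the untouched caller state plus a single opaque slot, as in the `add(uint,uint)` example.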
LLVM IR Generation
After traversal, the translator generates LLVM IR in several phases:
1. Function Declaration
- Entry function: Uses the pre-declared contract entry point
- Defined functions: Creates private LLVM functions with `N × i256` parameters and return values (multiple returns use LLVM struct types)
2. Stack Variable Allocation
For each function, stack_size stack slots are allocated as LLVM alloca instructions. These represent the simulated EVM stack as addressable memory:
%stack_0 = alloca i256 ; bottom of stack
%stack_1 = alloca i256
...
%stack_N = alloca i256 ; top of stack
For defined functions, slot 0 is reserved for the return address marker, and input parameters are stored starting from slot 1.
3. Basic Block Creation
Each (BlockKey, instance) pair becomes an LLVM BasicBlock:
block_runtime_42/0: ; tag 42, first instance
...
block_runtime_42/1: ; tag 42, second instance (different stack state)
...
4. Instruction Translation
Each EthIR element calls into_llvm() to generate LLVM instructions. Stack operations map to loads/stores on the allocated stack variables:
| EVM Operation | LLVM Translation |
|---|---|
| `PUSH 0x42` | `store i256 66, ptr %stack_N` |
| `DUP2` | `%v = load i256, ptr %stack_(N-2); store i256 %v, ptr %stack_(N+1)` |
| `ADD` | `%a = load ...; %b = load ...; %r = add i256 %a, %b; store ...` |
| `MLOAD` | `%ptr = load ...; %v = load i256, ptr addrspace(1) %ptr; store ...` |
| `JUMP` | `br label %target_block` |
| `JUMPI` | `%cond = ...; br i1 %cond, label %taken, label %fallthrough` |
solc Fork Modifications
The EVM assembly translator relies on several modifications in the solx-solidity fork. The most relevant to this pipeline are:
- `extraMetadata` output: reports all user-defined functions with entry tags, parameter counts, and AST IDs. Without this, the translator cannot distinguish function calls from arbitrary jumps.
- Dispatch tables for function pointers: indirect calls are lowered to static dispatch tables instead of dynamic jumps.
- `DUPX`/`SWAPX` instructions: extend stack access beyond depth 16, eliminating "stack too deep" errors.
- Disabled optimizer: the solc optimizer is disabled to preserve function boundaries and metadata validity. All optimization is handled by the LLVM backend.
For the full list of fork modifications, see Limitations and Differences from Upstream solc.
EVM Instructions Reference
This chapter describes how the LLVM EVM backend models EVM instructions and lowers LLVM IR into final opcode sequences.
Instruction Definitions
The EVM instruction set is defined in the LLVM backend via TableGen.
- It contains opcode definitions, pattern mappings, and EVM-specific pseudo-instructions.
- It covers roughly 180 instruction forms once TableGen expansions are considered (for example `DUP1..16`, `SWAP1..16`, and `PUSH` families).
- Instructions are modeled around `i256` values, matching the EVM word size.
Address Space Model
The backend uses explicit LLVM address spaces to model EVM memory regions:
| Address space | Value | Meaning |
|---|---|---|
| `AS_STACK` | 0 | Compiler-managed stack memory model |
| `AS_HEAP` | 1 | EVM linear memory (`MLOAD`, `MSTORE`, `MCOPY`) |
| `AS_CALL_DATA` | 2 | Call data region |
| `AS_RETURN_DATA` | 3 | Return data region |
| `AS_CODE` | 4 | Code segment |
| `AS_STORAGE` | 5 | Persistent storage |
| `AS_TSTORAGE` | 6 | Transient storage |
These constants are defined in the EVM backend header.
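On the frontend side, the same numbering could be mirrored in a Rust enum along these lines. The type and method names are hypothetical; only the numeric values come from the table, and the fact that LLVM prints address space 0 as a plain `ptr` is standard LLVM IR syntax.

```rust
/// Mirror of the backend address space numbering (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum AddressSpace {
    Stack = 0,
    Heap = 1,
    CallData = 2,
    ReturnData = 3,
    Code = 4,
    Storage = 5,
    TransientStorage = 6,
}

impl AddressSpace {
    /// LLVM IR spelling of an opaque pointer type in this address space.
    pub fn pointer_type(self) -> String {
        match self {
            // Address space 0 is the default and is written without a qualifier.
            AddressSpace::Stack => "ptr".to_string(),
            other => format!("ptr addrspace({})", other as u32),
        }
    }
}
```

`AddressSpace::Heap.pointer_type()` yields `ptr addrspace(1)`, the form that appears in the `MLOAD` translation of the EVM assembly translator.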
Core Instruction Categories
Arithmetic
Arithmetic opcodes map directly to i256 operations or EVM intrinsics:
- `ADD`, `MUL`, `SUB`, `DIV`, `SDIV`, `MOD`, `SMOD`
- `ADDMOD`, `MULMOD`, `EXP`, `SIGNEXTEND`
For example, ADD is selected from LLVM add i256 patterns.
Memory
Memory instructions operate on the heap address space:
- `MLOAD` maps to a load from `AS_HEAP`
- `MSTORE` maps to a store into `AS_HEAP`
- `MCOPY` lowers memory copy operations in heap memory
Storage
Storage instructions map to storage address spaces:
- `SLOAD`, `SSTORE` use `AS_STORAGE`
- `TLOAD`, `TSTORE` use `AS_TSTORAGE`
Control Flow
Control flow instructions are selected from LLVM branch forms:
- `JUMP` maps from unconditional `br`
- `JUMPI` maps from conditional `br i1`
The backend also uses helper pseudos (for example JUMP_UNLESS) that are lowered before emission.
Stack
EVM stack manipulation opcodes are emitted as needed:
- `DUP1..DUP16`
- `SWAP1..SWAP16`
- `POP`
They are introduced and optimized by stackification passes rather than directly authored in frontend IR.
Cryptographic
SHA3/KECCAK256 is represented through EVM-specific intrinsic plumbing:
- Machine instruction: `KECCAK256`
- LLVM intrinsic path: `llvm.evm.sha3`
Runtime Library (evm-stdlib.ll)
The backend links helper wrappers from the EVM runtime standard library:
- `__addmod`
- `__mulmod`
- `__signextend`
- `__exp`
- `__byte`
- `__sdiv`
- `__div`
- `__smod`
- `__mod`
- `__shl`
- `__shr`
- `__sar`
- `__sha3`
These wrappers forward to corresponding llvm.evm.* intrinsics.
Stackification Pipeline
The late codegen pipeline converts virtual-register machine IR to valid EVM stack code:
- `EVMSingleUseExpression`: reorders machine instructions into expression-friendly form.
- `EVMBackwardPropagationStackification`: performs backward propagation stackification from register form.
- `EVMStackSolver` and `EVMStackShuffler`: compute and emit low-cost `DUP`/`SWAP`/spill-reload sequences.
- `EVMPeephole`: runs late peephole optimizations before final emission.
Stack Depth Limit
The EVM stack itself can hold up to 1024 items, but DUP and SWAP instructions can only reach the top 16 positions. The backend enforces this 16-position limit on stack manipulation.
This limit is exposed by EVMSubtarget::stackDepthLimit().
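The reach constraint can be illustrated for the `DUP` family with a short sketch. The constant, the `Access` enum, and `plan_copy` are hypothetical names for illustration, not backend APIs; what happens to an out-of-reach value (for example a spill and reload) is left to the caller.

```rust
/// Number of stack positions reachable by DUP (DUP1..DUP16).
pub const STACK_DEPTH_LIMIT: usize = 16;

/// Outcome of trying to copy a value to the top of the stack.
#[derive(Debug, PartialEq, Eq)]
pub enum Access {
    /// The value can be copied with `DUP<n>`.
    Dup(usize),
    /// The value sits too deep (or the position is invalid).
    OutOfReach,
}

/// `position` counts from the top of the stack, starting at 1.
pub fn plan_copy(position: usize) -> Access {
    if (1..=STACK_DEPTH_LIMIT).contains(&position) {
        Access::Dup(position)
    } else {
        Access::OutOfReach
    }
}
```

A value at position 16 is still reachable with `DUP16`, while position 17 is not, even though the stack may legally hold far more items.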
Pseudo-Instructions
Several pseudos are used during lowering and removed or expanded before final bytecode:
- `PUSHDEPLOYADDRESS`: materializes deploy-time address usage for libraries.
- `SELECT`: models conditional value selection.
- `CONST_I256`: represents immediate constants before stackification.
- `COPY_I256`: temporary register-copy form before stackification.
Yul Builtins Reference
This chapter lists all Yul builtin functions supported by solx and how each is lowered to LLVM IR for the EVM backend.
Lowering Strategies
Yul builtins are lowered through one of three strategies:
- Direct LLVM IR: the builtin maps to native LLVM integer or memory operations on `i256`.
- LLVM intrinsic: the builtin maps to an `llvm.evm.*` intrinsic that the EVM backend expands to opcodes.
- Address space access: the builtin maps to a load or store in a typed LLVM address space (see EVM Instructions: Address Space Model).
Arithmetic
| Builtin | Lowering | Notes |
|---|---|---|
| `add` | Direct LLVM IR | `add i256` |
| `sub` | Direct LLVM IR | `sub i256` |
| `mul` | Direct LLVM IR | `mul i256` |
| `div` | Direct LLVM IR | Unsigned; returns 0 when divisor is 0 |
| `sdiv` | Direct LLVM IR | Signed; returns 0 when divisor is 0 |
| `mod` | Direct LLVM IR | Unsigned; returns 0 when divisor is 0 |
| `smod` | Direct LLVM IR | Signed; returns 0 when divisor is 0 |
| `addmod` | Intrinsic `llvm.evm.addmod` | `(x + y) % m` without intermediate overflow |
| `mulmod` | Intrinsic `llvm.evm.mulmod` | `(x * y) % m` without intermediate overflow |
| `exp` | Intrinsic `llvm.evm.exp` | Exponentiation |
| `signextend` | Intrinsic `llvm.evm.signextend` | Sign extend from bit `(i*8+7)` |
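The two less obvious rules in this table, division by zero and modular arithmetic without intermediate overflow, can be shown on a scaled-down model. In the sketch below `u64` stands in for the 256-bit EVM word and `u128` for the wider intermediate; the function names are illustrative and this is not how solx lowers the builtins.

```rust
/// `div`: unsigned division that yields 0 for a zero divisor.
pub fn evm_div(x: u64, y: u64) -> u64 {
    if y == 0 { 0 } else { x / y }
}

/// `mod`: unsigned remainder that yields 0 for a zero divisor.
pub fn evm_mod(x: u64, y: u64) -> u64 {
    if y == 0 { 0 } else { x % y }
}

/// `addmod`: the sum is computed in a wider type, so it cannot wrap.
pub fn evm_addmod(x: u64, y: u64, m: u64) -> u64 {
    if m == 0 {
        return 0; // the EVM defines a zero modulus to give 0
    }
    ((x as u128 + y as u128) % m as u128) as u64
}

/// `mulmod`: the product is computed in a wider type, so it cannot wrap.
pub fn evm_mulmod(x: u64, y: u64, m: u64) -> u64 {
    if m == 0 {
        return 0;
    }
    ((x as u128 * y as u128) % m as u128) as u64
}
```

In this model `evm_addmod(u64::MAX, 2, 10)` is 7, whereas adding with wraparound first and reducing afterwards would give 1; the same distinction holds for the real 256-bit `addmod`.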
Comparison
| Builtin | Lowering | Notes |
|---|---|---|
| `lt` | Direct LLVM IR | Unsigned less-than |
| `gt` | Direct LLVM IR | Unsigned greater-than |
| `slt` | Direct LLVM IR | Signed less-than |
| `sgt` | Direct LLVM IR | Signed greater-than |
| `eq` | Direct LLVM IR | Equality |
| `iszero` | Direct LLVM IR | Check if zero |
Bitwise
| Builtin | Lowering | Notes |
|---|---|---|
and | Direct LLVM IR | Bitwise AND |
or | Direct LLVM IR | Bitwise OR |
xor | Direct LLVM IR | Bitwise XOR |
not | Direct LLVM IR | Bitwise NOT |
shl | Direct LLVM IR | Shift left; shift >= 256 yields 0 |
shr | Direct LLVM IR | Logical shift right; shift >= 256 yields 0 |
sar | Direct LLVM IR | Arithmetic shift right; shift >= 256 yields sign-extended value |
byte | Intrinsic llvm.evm.byte | Extract nth byte |
clz | Intrinsic llvm.ctlz | Count leading zeros (requires Osaka EVM version) |
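The out-of-range shift rules in the table can be expressed as a short reference model. This is a hedged sketch of the EVM semantics only, using Yul's argument order (shift amount first, value second); the function names mirror the builtins but the code is not taken from solx.

```python
# Reference model of EVM shift and byte-extraction semantics on 256-bit words.
# Illustrative only: not solx's lowering code.
WORD = 1 << 256
MASK = WORD - 1

def shl(shift: int, value: int) -> int:
    return 0 if shift >= 256 else (value << shift) & MASK

def shr(shift: int, value: int) -> int:
    return 0 if shift >= 256 else value >> shift

def sar(shift: int, value: int) -> int:
    negative = (value >> 255) == 1
    if shift >= 256:
        return MASK if negative else 0      # all ones or all zeros
    signed = value - WORD if negative else value
    return (signed >> shift) & MASK         # Python's >> is arithmetic for negatives

def byte(n: int, value: int) -> int:
    """Extract the nth byte, counting from the most significant byte."""
    return 0 if n >= 32 else (value >> (8 * (31 - n))) & 0xFF
```

The explicit `shift >= 256` branches are the part worth noticing: the result for oversized shifts is defined by the EVM rather than left to the host language.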
Hashing
| Builtin | Lowering | Notes |
|---|---|---|
keccak256 | Intrinsic llvm.evm.sha3 | Keccak-256 over memory range |
Memory
| Builtin | Lowering | Notes |
|---|---|---|
mload | Address space 1 load | Load 32 bytes from heap memory |
mstore | Address space 1 store | Store 32 bytes to heap memory |
mstore8 | Intrinsic llvm.evm.mstore8 | Store single byte to memory |
mcopy | memcpy in address space 1 | EIP-5656 memory copy |
msize | Intrinsic llvm.evm.msize | Highest accessed memory index |
Storage
| Builtin | Lowering | Notes |
|---|---|---|
sload | Address space 5 load | Load from persistent storage |
sstore | Address space 5 store | Store to persistent storage |
tload | Address space 6 load | Load from transient storage (EIP-1153) |
tstore | Address space 6 store | Store to transient storage (EIP-1153) |
Immutables
| Builtin | Lowering | Notes |
|---|---|---|
loadimmutable | Intrinsic llvm.evm.loadimmutable | Load immutable value with metadata identifier |
setimmutable | Special | Set immutable value during deployment |
Call Data and Return Data
| Builtin | Lowering | Notes |
|---|---|---|
calldataload | Address space 2 load | Load 32 bytes from calldata |
calldatasize | Intrinsic llvm.evm.calldatasize | Size of calldata |
calldatacopy | memcpy from address space 2 to 1 | Copy calldata to memory |
returndatasize | Intrinsic llvm.evm.returndatasize | Size of return data |
returndatacopy | memcpy from address space 3 to 1 | Copy return data to memory |
Code Operations
| Builtin | Lowering | Notes |
|---|---|---|
codesize | Intrinsic llvm.evm.codesize | Current contract code size |
codecopy | memcpy from address space 4 to 1 | Copy code to memory |
extcodesize | Intrinsic llvm.evm.extcodesize | External contract code size |
extcodecopy | Intrinsic llvm.evm.extcodecopy | Copy external code to memory |
extcodehash | Intrinsic llvm.evm.extcodehash | Hash of external contract code |
Object and Data Operations
| Builtin | Lowering | Notes |
|---|---|---|
datasize | Intrinsic llvm.evm.datasize | Size of a named data object |
dataoffset | Intrinsic llvm.evm.dataoffset | Offset of a named data object |
datacopy | Same as codecopy | Copy data to memory |
These builtins are used by deploy stubs to reference embedded runtime and dependency objects. See Binary Layout for details.
Event Logging
| Builtin | Lowering | Notes |
|---|---|---|
log0 | Intrinsic llvm.evm.log0 | Log with 0 topics |
log1 | Intrinsic llvm.evm.log1 | Log with 1 topic |
log2 | Intrinsic llvm.evm.log2 | Log with 2 topics |
log3 | Intrinsic llvm.evm.log3 | Log with 3 topics |
log4 | Intrinsic llvm.evm.log4 | Log with 4 topics |
Contract Calls
| Builtin | Lowering | Notes |
|---|---|---|
call | Intrinsic llvm.evm.call | Call with value transfer |
delegatecall | Intrinsic llvm.evm.delegatecall | Call preserving caller and callvalue |
staticcall | Intrinsic llvm.evm.staticcall | Read-only call |
Note: callcode is rejected at compile time. Use delegatecall instead.
Contract Creation
| Builtin | Lowering | Notes |
|---|---|---|
create | Intrinsic llvm.evm.create | Create new contract |
create2 | Intrinsic llvm.evm.create2 | Create at deterministic address |
Control Flow
| Builtin | Lowering | Notes |
|---|---|---|
return | Intrinsic llvm.evm.return | Return data from execution |
revert | Intrinsic llvm.evm.revert | Revert with return data |
stop | Intrinsic llvm.evm.stop | Stop execution |
invalid | Intrinsic llvm.evm.invalid | Invalid instruction (consumes all gas) |
Note: selfdestruct is rejected at compile time (deprecated by EIP-6049).
Block and Transaction Context
| Builtin | Lowering | Notes |
|---|---|---|
address | Intrinsic llvm.evm.address | Current contract address |
caller | Intrinsic llvm.evm.caller | Message sender |
callvalue | Intrinsic llvm.evm.callvalue | Wei sent with call |
gas | Intrinsic llvm.evm.gas | Remaining gas |
gasprice | Intrinsic llvm.evm.gasprice | Gas price of transaction |
balance | Intrinsic llvm.evm.balance | Balance of address |
selfbalance | Intrinsic llvm.evm.selfbalance | Current contract balance |
origin | Intrinsic llvm.evm.origin | Transaction sender |
Block Information
| Builtin | Lowering | Notes |
|---|---|---|
blockhash | Intrinsic llvm.evm.blockhash | Hash of given block |
number | Intrinsic llvm.evm.number | Current block number |
timestamp | Intrinsic llvm.evm.timestamp | Block timestamp |
coinbase | Intrinsic llvm.evm.coinbase | Block beneficiary |
difficulty | Intrinsic llvm.evm.difficulty | Block difficulty (pre-merge) |
prevrandao | Intrinsic llvm.evm.difficulty | Previous RANDAO value (EIP-4399, reuses difficulty) |
gaslimit | Intrinsic llvm.evm.gaslimit | Block gas limit |
chainid | Intrinsic llvm.evm.chainid | Chain ID (EIP-1344) |
basefee | Intrinsic llvm.evm.basefee | Base fee per gas (EIP-1559) |
blobhash | Rejected at compile time | Versioned hash of transaction's i-th blob (EIP-4844) |
blobbasefee | Rejected at compile time | Current block's blob base fee (EIP-7516/EIP-4844) |
Note: blobhash and blobbasefee are not yet supported and will produce a compile error.
Special and Meta
| Builtin | Lowering | Notes |
|---|---|---|
pop | Optimized away | No code generated |
linkersymbol | Intrinsic llvm.evm.linkersymbol | Library linker placeholder |
memoryguard | Special | Reserves a memory region; used by solx to configure the spill area for stack-too-deep mitigation |
Binary Layout and Linking
This chapter describes how solx models deploy/runtime bytecode objects, dependency data, and post-compilation linking.
Contract Object Model
EVM contracts have two code segments:
- Deploy code (init code): runs only during contract creation.
- Runtime code: returned by deploy code and stored as the contract's permanent code.
Deploy code typically builds runtime bytes in memory and executes RETURN(offset, size).
solc JSON Assembly Layout
In legacy assembly JSON, the object is split into top-level deploy code and nested runtime code:
- Top-level .code: deploy instruction stream.
- .data["0"]: runtime object.
- .data[<hex>]: additional referenced data objects (for example constructor-time dependencies).
Conceptually:
{
".code": [ /* deploy instructions */ ],
".data": {
"0": { /* runtime assembly object */ },
"ab12...": { /* dependency object or hash */ }
}
}
The EVM assembly layer exposes this as Assembly { code, data }, with runtime_code() reading data["0"].
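To make the layout concrete, the sketch below walks a toy object shaped like the conceptual JSON above. It is a Python illustration with placeholder instruction entries and a placeholder dependency key; the runtime_code helper only imitates the accessor named in the text and is not the actual implementation.

```python
import json

# A toy legacy-assembly object following the conceptual layout.
# Instruction entries and the "ab12" key are placeholders.
ASSEMBLY = json.loads("""
{
  ".code": [{"name": "PUSH", "value": "80"}],
  ".data": {
    "0": {".code": [{"name": "STOP"}]},
    "ab12": "dependency placeholder"
  }
}
""")

def runtime_code(assembly: dict):
    """Return the runtime object stored under .data["0"], or None if absent."""
    return assembly.get(".data", {}).get("0")

def dependency_keys(assembly: dict) -> list:
    """Return every .data key except the runtime entry "0"."""
    return sorted(key for key in assembly.get(".data", {}) if key != "0")
```

With this shape, the top-level `.code` is the deploy stream, `runtime_code(ASSEMBLY)` yields the nested runtime object, and `dependency_keys(ASSEMBLY)` lists the remaining data entries.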
Dependencies and CREATE / CREATE2
Factory-style deploy code can reference other contract objects. In assembly, this is represented via data entries and push-style aliases:
- PUSH [$] (PUSH_DataOffset) for object offset
- PUSH #[$] (PUSH_DataSize) for object size
- PUSH data (PUSH_Data) for raw dependency chunks
These operands are resolved during assembly preprocessing before LLVM lowering.
Deploy Stub Shape
The minimal deploy stub pattern is:
- Load runtime size (datasize).
- Load runtime offset (dataoffset).
- Copy bytes from code section to memory.
- Return copied bytes.
The EVM codegen emits this canonical form in minimal_deploy_code() using:
- llvm.evm.datasize(metadata !"...")
- llvm.evm.dataoffset(metadata !"...")
- llvm.memcpy from addrspace(4) (code) to addrspace(1) (heap)
- llvm.evm.return
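The effect of the stub can be simulated outside the EVM. The following Python sketch models the four steps on plain byte strings; the stub and runtime bytes are placeholders, and the function name is hypothetical rather than part of solx.

```python
def run_minimal_deploy(code: bytes, dataoffset: int, datasize: int) -> bytes:
    """Model the stub: take size and offset, copy code to memory, return it."""
    memory = bytearray(datasize)                     # zero-initialised heap region
    chunk = code[dataoffset:dataoffset + datasize]   # bytes read from the code section
    memory[0:len(chunk)] = chunk                     # reads past the end stay zero
    return bytes(memory)                             # data handed to the return step

STUB = bytes(10)                         # placeholder for the deploy instructions
RUNTIME = bytes.fromhex("6001600055")    # placeholder runtime bytes
CODE = STUB + RUNTIME                    # runtime object embedded after the stub

deployed = run_minimal_deploy(CODE, dataoffset=len(STUB), datasize=len(RUNTIME))
```

Here `deployed` equals `RUNTIME`: the size and offset are the only facts the stub needs about the embedded object, which is why they are supplied symbolically instead of being hardcoded.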
datasize / dataoffset Builtins
Yul builtins datasize(<object>) and dataoffset(<object>) lower to EVM intrinsics with metadata object names.
In solx, these are translated to LLVM intrinsics:
- llvm.evm.datasize
- llvm.evm.dataoffset
This is how deploy stubs reference embedded runtime/dependency objects without hardcoding absolute byte offsets.
Metadata Hash and CBOR Tail
Runtime bytecode may include CBOR metadata appended at the end.
- The payload can include compiler version info and optional metadata hash fields.
- Hash behavior is configurable with --metadata-hash (for example ipfs).
- CBOR appending can be disabled with --no-cbor-metadata.
In the build pipeline, metadata bytes are appended to runtime objects before final assembly/linking.
Library Linking
Library references are resolved at link time:
- The linker patches linker symbols with final addresses.
- If a symbol is unresolved, solx records its offsets and emits placeholders in hex output.
- Placeholder format follows the common pattern __$<keccak-256-digest>$__.
Standard JSON output reports unresolved positions through evm.*.linkReferences so external tooling can link later.
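External tooling that links later has to find the placeholders and substitute addresses. The sketch below shows one way to do that in Python. It assumes the solc convention in which the digest inside the placeholder is 34 hex characters, so that the whole placeholder occupies the same 40 characters as a 20-byte address; the link function and its argument shape are hypothetical.

```python
import re

# Assumption: 34 hex digits between the markers, 40 characters in total.
PLACEHOLDER = re.compile(r"__\$([0-9a-fA-F]{34})\$__")

def link(hex_bytecode: str, addresses: dict) -> tuple:
    """Patch known placeholders; return (linked hex, unresolved digests)."""
    unresolved = []

    def patch(match):
        digest = match.group(1).lower()
        address = addresses.get(digest)
        if address is None:
            unresolved.append(digest)
            return match.group(0)            # leave the placeholder in place
        digits = address[2:] if address.lower().startswith("0x") else address
        return digits.lower().rjust(40, "0")

    return PLACEHOLDER.sub(patch, hex_bytecode), unresolved

DIGEST = "ab" * 17                            # placeholder digest, not a real hash
UNLINKED = "73" + "__$" + DIGEST + "$__" + "63"
linked, missing = link(UNLINKED, {DIGEST: "0x" + "11" * 20})
```

Because the placeholder and the address have the same length, patching never shifts the offsets recorded in linkReferences.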
Dependency Resolution and Path Aliasing
The assembly preprocessor performs a normalization pass over all contracts before lowering:
- Hash deploy and runtime sub-objects.
- Build hash -> full contract path mapping.
- Rewrite .data entries from embedded objects to stable path references (Data::Path).
- Build index mappings for deploy and runtime dependency tables.
- Replace instruction aliases (PUSH_DataOffset, PUSH_DataSize, PUSH_Data) with resolved identifiers.
Two details are important:
- Entry "0" is always treated as runtime code and mapped to <contract>.runtime.
- Hex indices are normalized to 32-byte (64 hex char) aliases before lookup, so short keys and padded keys resolve consistently.
This path aliasing step gives deterministic dependency identifiers for later object assembly and linking.
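The index normalization can be illustrated in a couple of lines. This Python sketch assumes that normalization means left-padding with zeros and lowercasing, which the text implies but does not spell out; normalize_index is a hypothetical name.

```python
def normalize_index(key: str) -> str:
    """Pad a hex index to a 64-character (32-byte) alias. Assumes left zero-padding."""
    digits = key[2:] if key.lower().startswith("0x") else key
    return digits.lower().rjust(64, "0")

short_key = normalize_index("1")
padded_key = normalize_index("0" * 63 + "1")
```

Both calls produce the same 64-character string, so a lookup table keyed by the normalized form resolves short and padded spellings to the same entry.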
Testing
This page describes how to run tests for the solx compiler and the format of test files.
Unit and CLI Tests
Run the standard Rust test suite:
# Run all tests (unit + CLI)
cargo test
# Run only unit tests
cargo test --lib
# Run only CLI/integration tests
cargo test --test cli
# Run a specific test
cargo test --test cli -- cli::bin::default
Integration Tests
The solx-tester tool runs integration tests by compiling contracts and executing them with revm.
# Build the compiler and tester
cargo build --release
# Run all integration tests
./target/release/solx-tester --solidity-compiler ./target/release/solx
# Run tests for a specific file
./target/release/solx-tester --solidity-compiler ./target/release/solx --path tests/solidity/simple/default.sol
# Run only Yul IR pipeline tests (excludes EVMLA pipeline)
./target/release/solx-tester --solidity-compiler ./target/release/solx --via-ir
# Run tests with specific optimizer settings
./target/release/solx-tester --solidity-compiler ./target/release/solx --optimizer M3B3
# Combine filters: Yul IR pipeline with M3B3 optimizer
./target/release/solx-tester --solidity-compiler ./target/release/solx --via-ir --optimizer M3B3
Filtering Options
- --via-ir: Run only tests using the Yul IR pipeline (codegen Y). Without this flag, both Yul IR and EVMLA pipelines are tested.
- --optimizer <PATTERN>: Filter by optimizer settings. Examples:
  - M3B3: Match exact optimizer level
  - M^B3: Match M3 or Mz with B3
  - M*B*: Match any M and B levels
- --path <PATTERN>: Run only tests whose path contains the pattern.
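The optimizer patterns can be read as a tiny positional wildcard language. The Python sketch below is a guess at those rules based only on the three examples listed: `*` matches any level, `^` matches 3 or z, and every other character must match literally. It is not the matcher used by solx-tester.

```python
def matches(pattern: str, mode: str) -> bool:
    """Compare an optimizer pattern such as M^B3 against a mode such as MzB3."""
    if len(pattern) != len(mode):
        return False
    for expected, actual in zip(pattern, mode):
        if expected == "*":
            continue                      # any level
        if expected == "^":
            if actual not in ("3", "z"):  # per the M^B3 example
                return False
        elif expected != actual:
            return False
    return True
```

Under these assumed rules, `matches("M^B3", "MzB3")` holds while `matches("M^B3", "M2B3")` does not.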
Foundry and Hardhat Projects
The solx-dev tool can run tests against real-world Foundry and Hardhat projects:
# Build solx-dev
cargo build --release --bin solx-dev
# Run Foundry project tests
./target/release/solx-dev test foundry --test-config-path solx-dev/foundry-tests.toml
# Run Hardhat project tests
./target/release/solx-dev test hardhat --test-config-path solx-dev/hardhat-tests.toml
The test configurations list projects that are cloned and tested automatically. See foundry-tests.toml and hardhat-tests.toml for the full list of tested projects.
Test Collection
This section describes the format of test files used by solx-tester.
Test Types
The repository contains three types of tests:
- Upstream — Tests following the Solidity semantic test format.
- Simple — Single-contract tests.
- Complex — Multi-contract tests and vendored DeFi projects.
Test data is located in:
- tests/solidity/: Solidity test contracts
- tests/yul/: Yul test contracts
- tests/llvm-ir/: LLVM IR test contracts
Test Format
Each test comprises source code files and metadata.
Simple tests have only one source file, and their metadata is written in comments that start with !, for example, //! for Solidity.
Complex tests use a test.json file to describe their metadata and refer to source code files.
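As an illustration of the simple-test convention, the sketch below collects the `//!` comment lines of a Solidity file and parses them as one JSON object. The example contract and the read_simple_metadata helper are made up for this purpose, and the assumption that the marked lines concatenate into a single JSON document is inferred from the description rather than confirmed.

```python
import json

EXAMPLE = """\
//! { "cases": [ {
//!     "name": "default",
//!     "inputs": [ { "method": "main", "calldata": [] } ],
//!     "expected": [ "42" ]
//! } ] }

contract Test {
    function main() external pure returns (uint256) { return 42; }
}
"""

def read_simple_metadata(source: str, marker: str = "//!") -> dict:
    """Join all lines that start with the marker and parse them as JSON."""
    body = [
        line.strip()[len(marker):]
        for line in source.splitlines()
        if line.strip().startswith(marker)
    ]
    return json.loads("\n".join(body))

metadata = read_simple_metadata(EXAMPLE)
```

Other source languages would pass their own comment marker through the `marker` parameter.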
Metadata
Metadata is a JSON object that contains the following fields:
- cases: An array of test cases (described below).
- contracts: Used for complex tests to describe the contract instances to deploy. In simple tests, only one Test contract instance is deployed.
"contracts": {
"Main": "main.sol:Main",
"Callable": "callable.sol:Callable"
}
- libraries: An optional field that specifies library addresses for the linker:
"libraries": {
"libraries/UQ112x112.sol": { "UQ112x112": "UQ112x112" },
"libraries/Math.sol": { "Math": "Math" }
}
- ignore: An optional flag that disables a test.
- modes: An optional field that specifies mode filters. Y stands for Yul pipeline, E for EVM assembly pipeline. Compiler versions can be specified as SemVer ranges:
"modes": [
"Y",
"E",
"E >=0.8.30"
]
- group: An optional string field that specifies a test group for benchmarking.
Test Cases
All test cases are executed in a clean context, making them independent of each other.
Each test case contains the following fields:
- name: A string name.
- comment: An optional string comment.
- inputs: An array of inputs (described below).
- expected: The expected return data for the last input.
- ignore, modes: Same as in test metadata.
Inputs
Inputs specify the contract calls in the test case:
- comment: An optional string comment.
- instance: The contract instance to call. Default: Test.
- caller: The caller address. Default: 0xdeadbeef01000000000000000000000000000000.
- method: The method to call:
  - #deployer for the deployer call.
  - #fallback to perform a call with raw calldata.
  - Any other string is recognized as a function name. The function selector will be prepended to the calldata.
- calldata: The input calldata:
  - A hexadecimal string:
"calldata": "0x00"
  - A numbers array (hex, decimal, or instance addresses). Each number is padded to 32 bytes:
"calldata": ["1", "2"]
- value: An optional msg.value, a decimal number with wei or ETH suffix.
- storage: Storage values to set before the call:
"storage": {
"Test.address": ["1", "2", "3", "4"]
}
- expected: The expected return data:
  - An array of numbers:
"expected": ["1", "2"]
  - Extended format with return_data, exception, and events:
"expected": {
"return_data": ["0x01"],
"events": [
{
"topics": [
"0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
],
"values": ["0xff"]
}
],
"exception": false
}
The expected field can be an array of objects if different expected data is needed for different compiler versions. Use compiler_version as a SemVer range in extended expected format.
Notes:
- InstanceName.address can be used in expected, calldata, and storage fields to insert a contract instance address.
- If a deployer call is not specified for an instance, it will be generated automatically with empty calldata.
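The padding rule for number arrays in calldata can be sketched as follows. This Python helper assumes big-endian, left-padded 32-byte words (the usual ABI layout) and handles only hex and decimal entries; InstanceName.address substitution and selector prepending are left out. The name encode_words is hypothetical.

```python
def encode_words(values: list) -> bytes:
    """Encode hex or decimal number strings as consecutive 32-byte words."""
    encoded = b""
    for value in values:
        number = int(value, 16) if value.lower().startswith("0x") else int(value)
        encoded += number.to_bytes(32, "big")   # assumed big-endian left padding
    return encoded

calldata = encode_words(["1", "2"])
```

For the `["1", "2"]` example above this yields 64 bytes: two words whose last bytes are 0x01 and 0x02.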
Upstream Solidity Semantic Tests
These tests follow the Solidity semantic test format.
Test descriptions and expected results are embedded as comments in the test file. Lines begin with // for Solidity files. The beginning of the test description is indicated by a comment line containing ----.
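A reader for this layout only needs to find the delimiter line. The Python sketch below separates the Solidity source from the expectation lines; the example test is made up, and the helper ignores any other sections the upstream format may define.

```python
EXAMPLE = """\
contract C {
    function f() public pure returns (uint256) { return 42; }
}
// ----
// f() -> 42
"""

def split_semantic_test(text: str) -> tuple:
    """Return (source, expectation lines) split at the '// ----' comment."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("//") and "----" in stripped:
            source = "\n".join(lines[:index])
            expectations = [
                entry.strip()[2:].strip()
                for entry in lines[index + 1:]
                if entry.strip().startswith("//")
            ]
            return source, expectations
    return text, []                     # no description block found

source, expectations = split_semantic_test(EXAMPLE)
```

Everything before the delimiter is compiled as the contract; everything after it describes calls and their expected results.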
Debugging and Inspecting Compiler Output
This guide shows how to use solx debug flags to inspect intermediate representations at each compilation stage.
IR Dump Flags
Each flag writes files to the output directory (-o):
| Flag | Extension | Description |
|---|---|---|
--evmla | .evmla | EVM legacy assembly from solc (legacy pipeline only) |
--ethir | .ethir | EthIR (translated from EVM assembly, legacy pipeline only) |
--ir / --ir-optimized | .yul | Yul IR from solc |
--emit-llvm-ir | .unoptimized.ll, .optimized.ll | LLVM IR before and after optimization |
--asm | .asm | Final EVM assembly |
The --debug-info and --debug-info-runtime flags are output selectors that print deploy and runtime debug info to stdout (or to files when -o is used). They are not IR dump flags.
Example:
solx contract.sol -o ./debug --evmla --ethir --emit-llvm-ir --asm --overwrite
This produces one file per contract per stage in ./debug/.
Quick Dump with SOLX_OUTPUT_DIR
Setting the SOLX_OUTPUT_DIR environment variable enables all IR dumps at once without listing individual flags:
export SOLX_OUTPUT_DIR=./ir_dumps
solx contract.sol
This writes all applicable IR files for every contract, with automatic overwrite. Which files are produced depends on the pipeline used: the Yul pipeline dumps Yul and LLVM IR, while the legacy pipeline dumps EVMLA, EthIR, and LLVM IR.
Benchmarking
The --benchmarks flag prints timing information for each pipeline stage:
solx contract.sol --benchmarks
Output includes per-contract compilation timing in milliseconds.
LLVM Diagnostics
Two flags control LLVM-level diagnostics:
- --llvm-verify-each: runs LLVM IR verification after every optimization pass. Useful for catching miscompilations. Silent on success; only reports errors when verification fails.
- --llvm-debug-logging: enables detailed LLVM pass execution logging to stderr. Shows which passes and analyses run, with instruction counts.
solx contract.sol --llvm-verify-each --llvm-debug-logging
LLVM Options Pass-Through
Arbitrary LLVM backend options can be passed with --llvm-options:
solx contract.sol --llvm-options='-evm-metadata-size 10'
The value must be a single string following =. See the LLVM Options guide for available options, including EVM backend options and standard LLVM diagnostic options like -time-passes and -stats.
Optimization Levels
solx maps optimization levels to LLVM pipelines:
| Flag | Middle-end | Size level | Back-end |
|---|---|---|---|
-O1 | Less | Zero | Less |
-O2 | Default | Zero | Default |
-O3 (default) | Aggressive | Zero | Aggressive |
-Os | Default | S | Aggressive |
-Oz | Default | Z | Aggressive |
The default is -O3, optimizing for runtime performance.
The optimization level can also be set with the SOLX_OPTIMIZATION environment variable (values: 1, 2, 3, s, z).
Size Fallback
The --optimization-size-fallback flag (or SOLX_OPTIMIZATION_SIZE_FALLBACK env var) recompiles with -Oz when bytecode exceeds the 24,576-byte EVM contract size limit (EIP-170). When triggered, output files include a .size_fallback suffix.
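The fallback decision can be pictured as a compile, check, recompile loop. The Python sketch below is only a model of the behavior described above: compile_fn is a hypothetical callback standing in for a compiler invocation, and the real flag is handled inside solx.

```python
EIP170_LIMIT = 24_576   # contract size limit in bytes (EIP-170)

def compile_with_fallback(compile_fn, level: str = "3") -> tuple:
    """Compile at the requested level; retry with 'z' if the result is too large.

    Returns (bytecode, fallback_used).
    """
    bytecode = compile_fn(level)
    if len(bytecode) > EIP170_LIMIT:
        return compile_fn("z"), True
    return bytecode, False
```

A caller would treat the returned flag the way the .size_fallback suffix is used: as a marker that the output came from the size-optimized recompilation.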
Spill Area Suffix
When the compiler uses a memory spill region to mitigate stack-too-deep errors, output files include an .o{offset}s{size} suffix indicating the spill area parameters. For example: MyContract.o256s1024.ethir.
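When post-processing dump files, the spill parameters can be recovered from the file name. The Python sketch below assumes decimal offset and size values, as in the MyContract.o256s1024.ethir example; parse_spill_suffix is a hypothetical helper.

```python
import re

# <name>.o<offset>s<size>.<extension>, with decimal offset and size (assumed).
SPILL_SUFFIX = re.compile(r"^(?P<name>.+)\.o(?P<offset>\d+)s(?P<size>\d+)\.(?P<ext>.+)$")

def parse_spill_suffix(filename: str):
    """Return (name, offset, size, extension), or None without a spill suffix."""
    match = SPILL_SUFFIX.match(filename)
    if match is None:
        return None
    return match["name"], int(match["offset"]), int(match["size"]), match["ext"]

parsed = parse_spill_suffix("MyContract.o256s1024.ethir")
```

File names without the suffix, such as MyContract.ethir, return None, which makes it easy to separate spill-area builds from regular ones.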
Typical Debugging Workflow
- Reproduce the issue with a minimal Solidity file.
- Dump all IRs using SOLX_OUTPUT_DIR:
SOLX_OUTPUT_DIR=./debug solx contract.sol
- Inspect stage by stage:
  - Yul pipeline: Yul → LLVM IR (unoptimized) → LLVM IR (optimized) → assembly.
  - Legacy pipeline: EVMLA → EthIR → LLVM IR (unoptimized) → LLVM IR (optimized) → assembly.
- Narrow down which stage introduces the problem.
- Use LLVM verification if the issue is in the optimizer:
solx contract.sol --llvm-verify-each --emit-llvm-ir -o ./debug --overwrite
- Compare with solc using the integration tester:
cargo run --release --bin solx-tester -- \
  --solidity-compiler ./target/release/solx \
  --path contract.sol
LLVM Options
This guide documents LLVM backend options available in solx through the --llvm-options flag.
Usage
Pass options as a single string after =:
solx contract.sol --llvm-options='-option1 value1 -option2 value2'
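When solx is launched from a script instead of a shell, the single-string requirement means all backend options travel in one argument. The Python sketch below builds such an argument vector; it only constructs the list and does not run solx, and the helper name is hypothetical.

```python
def llvm_options_arg(options: list) -> str:
    """Join backend options into the single --llvm-options=<string> argument."""
    return "--llvm-options=" + " ".join(options)

# One argv element carries every backend option; no shell quoting is needed
# because the list is passed to the process as-is.
argv = ["solx", "contract.sol", llvm_options_arg(["-evm-metadata-size", "10"])]
```

The resulting list could be handed to subprocess.run(argv) on a machine where solx is installed.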
EVM Backend Options
These options are specific to the custom LLVM EVM backend and affect compilation behavior directly.
-evm-stack-region-size <value>
Sets the stack spill region size in bytes. The compiler uses this region to spill values that cannot remain on the EVM stack (stack-too-deep mitigation). Normally set automatically based on optimizer settings. Requires -evm-stack-region-offset to be set as well.
-evm-stack-region-offset <value>
Sets the stack spill region memory offset. Normally set automatically to match the solc user memory offset.
-evm-metadata-size <value>
Sets the metadata size hint used by the backend for gas and code size tradeoff decisions.
Standard LLVM Diagnostic Options
Standard LLVM diagnostic options can be passed through --llvm-options and their output is printed to stderr. Some options (such as -debug and -debug-only) require LLVM built with assertions enabled (-DLLVM_ENABLE_ASSERTIONS=ON). When building from source, pass --enable-assertions to solx-dev llvm build.
-time-passes
Print timing information for each LLVM pass.
solx contract.sol --bin --llvm-options='-time-passes'
-stats
Print statistics from LLVM passes (number of transformations applied, etc.).
-print-after-all
Print LLVM IR after every optimization pass. Produces very large output (tens of thousands of lines) but useful for tracing pass behavior.
-print-before-all
Print LLVM IR before every optimization pass.
-debug-only=<pass-name>
Enable debug output for a specific LLVM pass. Note that --llvm-debug-logging controls pass-builder logging specifically, not the general LLVM DEBUG() macro categories.
CLI Debug Flags
These are top-level solx flags (not passed through --llvm-options):
| Flag | Effect |
|---|---|
--llvm-verify-each | Run IR verifier after each LLVM pass. Silent on success; produces an error if verification fails. |
--llvm-debug-logging | Enable pass-builder debug logging. Shows which passes and analyses run, with instruction counts. |
See the Debugging guide for the full set of diagnostic flags.
Building with Sanitizers
This guide describes how to build solx with sanitizers enabled.
Introduction
Sanitizers are tools that help find bugs in code. They are used to detect memory corruption, leaks, and undefined behavior.
The most common sanitizers are AddressSanitizer, MemorySanitizer, and ThreadSanitizer.
If you are not familiar with sanitizers, see the official documentation.
Who is this guide for?
This guide is for developers who want to debug issues with solx.
Prerequisites
To check the LLVM version used by rustc, run rustc --version --verbose.
Build steps
A sanitizer-enabled build takes two general steps:
- Build the LLVM framework with the required sanitizer enabled.
- Build solx with the LLVM build from the previous step.
Follow the common installation instructions up to the LLVM build step.
This guide assumes a build with AddressSanitizer enabled.
Build LLVM with sanitizer enabled
When building LLVM, use the --sanitizer <sanitizer> option and set the build type to RelWithDebInfo:
./target/release/solx-dev llvm build --sanitizer=Address --build-type=RelWithDebInfo
./target/release/solx-dev llvm build --sanitizer=Address --build-type=RelWithDebInfo --extra-args '-DCMAKE_C_COMPILER=/opt/homebrew/opt/llvm/bin/clang' '-DCMAKE_CXX_COMPILER=/opt/homebrew/opt/llvm/bin/clang++'
Build solx with the sanitizer enabled
To build solx with the sanitizer enabled, set the RUSTFLAGS environment variable
to -Z sanitizer=address and run the cargo build command.
Sanitizer support is available only in the nightly Rust compiler, so it is recommended
to set the RUSTC_BOOTSTRAP=1 environment variable before the build.
The --target option is also mandatory: it specifies the target architecture, and the build fails without it.
Check the table below to find the correct target for your platform.
| Platform | LLVM Target Triple |
|---|---|
| MacOS-arm64 | aarch64-apple-darwin |
| MacOS-x86 | x86_64-apple-darwin |
| Linux-arm64 | aarch64-unknown-linux-gnu |
| Linux-x86 | x86_64-unknown-linux-gnu |
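For scripted builds, the table can be turned into a lookup. The Python sketch below maps an operating system name and machine name to the triples listed above; the input spellings (for example arm64 versus aarch64) follow what Python's platform module typically reports, and the helper itself is hypothetical.

```python
ARCHITECTURES = {"arm64": "aarch64", "aarch64": "aarch64", "x86_64": "x86_64", "amd64": "x86_64"}
SYSTEMS = {"Darwin": "apple-darwin", "Linux": "unknown-linux-gnu"}

def target_triple(system: str, machine: str) -> str:
    """Return the triple from the table for a given OS and CPU name."""
    arch = ARCHITECTURES.get(machine.lower())
    suffix = SYSTEMS.get(system)
    if arch is None or suffix is None:
        raise ValueError(f"no sanitizer target listed for {system}/{machine}")
    return f"{arch}-{suffix}"
```

On the build machine itself, the inputs would come from platform.system() and platform.machine(), and the result would be passed to cargo through --target.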
Additionally, for proper symbolization of reports, it is recommended to set the ASAN_SYMBOLIZER_PATH environment variable.
For more information, see the symbolizing reports section of the LLVM documentation.
For example, to build and test solx for MacOS-arm64 with AddressSanitizer enabled, run the following commands:
export RUSTC_BOOTSTRAP=1
export ASAN_SYMBOLIZER_PATH=$(which llvm-symbolizer) # check the path to llvm-symbolizer
TARGET=aarch64-apple-darwin # Change to your target
RUSTFLAGS="-Z sanitizer=address" cargo test --target=${TARGET}
Congratulations! You have successfully built solx with sanitizers enabled.
Refer to the official documentation for more information on the available sanitizers and how to use them.