OpenVM Rust Frontend
OpenVM supports a Rust frontend via compilation to a 32-bit RISC-V target which is then transpiled into an OpenVM executable with instructions from the RV32IM VM extension. We implement this by cross-compiling for a platform which differs from the machine performing the build. This involves the following:
- Host: The target and compiler toolchain used to run the program build and proving binaries.
- Guest: The target and compiler toolchain used to build the program to be proven.
We detail the host and guest target and toolchain as well as the guest runtime for the OpenVM Rust frontend below.
Host and Guest Target and Toolchain
The OpenVM Rust frontend supports the following host and guest targets and toolchains:
- Host: We support aarch64-apple-darwin and x86_64-unknown-linux-gnu with Rust 1.86.0: rustc 1.86.0 (05f9846f8 2025-03-31). For reproducible builds, we recommend using the x86_64-unknown-linux-gnu platform.
- Guest: riscv32im-risc0-zkvm-elf with Rust nightly-2025-02-14: rustc 1.86.0-nightly (a567209da 2025-02-13).
The riscv32im-risc0-zkvm-elf guest target incorporates special support for zkVMs and is officially supported by the Rust toolchain. We anticipate upstreaming an OpenVM-specific target to Rust in the future.
Guest Runtime
The OpenVM Rust runtime supports no_std Rust by default, with optional std support available through the "std" feature. This section documents the different features of the runtime.
Memory Allocator
OpenVM supports 512MB of guest memory, with the stack growing down from STACK_TOP = 0x0020_0400, program loading starting at TEXT_START = 0x0020_0800, and the heap starting right afterwards.
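The layout above can be written down as constants. The following is a minimal host-side sketch, not OpenVM code: the Region enum and the classify helper are hypothetical names, and it assumes that 512MB means 512 * 2^20 bytes addressed from 0 and that every address below STACK_TOP is available to the stack.

```rust
/// Addresses quoted in the text above.
pub const STACK_TOP: u32 = 0x0020_0400;
pub const TEXT_START: u32 = 0x0020_0800;
/// Assumption: 512MB is 512 * 2^20 bytes, addressed from 0.
pub const GUEST_MEMORY_BYTES: u32 = 512 * 1024 * 1024;

/// Coarse classification of a guest address (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
pub enum Region {
    /// Below STACK_TOP: reachable by a stack that grows downwards.
    Stack,
    /// Between STACK_TOP and TEXT_START; its use is not described in the text.
    Gap,
    /// Program image, followed by the heap.
    TextOrHeap,
    /// At or beyond the end of guest memory.
    OutOfRange,
}

/// Maps an address to the region it falls in under the assumptions above.
pub fn classify(addr: u32) -> Region {
    if addr >= GUEST_MEMORY_BYTES {
        Region::OutOfRange
    } else if addr >= TEXT_START {
        Region::TextOrHeap
    } else if addr >= STACK_TOP {
        Region::Gap
    } else {
        Region::Stack
    }
}
```

Because the stack grows down from STACK_TOP, the first 4-byte push would land at STACK_TOP - 4, which this sketch classifies as Stack.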
We support two allocators:
- A bump allocator which increments a heap pointer for each successive allocation without deallocating. This is the default allocator.
- A linked-list allocator from the embedded-alloc crate which supports deallocation at the cost of additional allocation overhead.
The linked-list allocator can be selected by enabling the heap-embedded-alloc feature on the openvm crate.
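To illustrate the trade-off between the two allocators, here is a minimal sketch of the bump-allocation technique over a simulated address range. It is not the openvm crate's implementation; the BumpAllocator type and its methods are hypothetical names used only for this example.

```rust
/// A bump allocator over a simulated heap range (illustrative only).
pub struct BumpAllocator {
    /// Next free address.
    next: usize,
    /// One past the last usable address.
    end: usize,
}

impl BumpAllocator {
    pub fn new(heap_start: usize, heap_end: usize) -> Self {
        Self { next: heap_start, end: heap_end }
    }

    /// Returns the start address of a block of `size` bytes aligned to
    /// `align` (which must be a power of two), or `None` if the heap is
    /// exhausted.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        assert!(align.is_power_of_two());
        // Round the heap pointer up to the requested alignment.
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let new_next = start.checked_add(size)?;
        if new_next > self.end {
            return None;
        }
        // Allocation is just advancing the pointer.
        self.next = new_next;
        Some(start)
    }

    /// Deallocation is a no-op: a bump allocator never reclaims memory.
    pub fn dealloc(&mut self, _addr: usize, _size: usize) {}
}
```

Allocation costs a few arithmetic operations, but freed memory is never reused, so a program that allocates and frees repeatedly can exhaust the heap. That is the case where a linked-list allocator with real deallocation is worth its extra overhead.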
System Calls
OpenVM currently does not support any system calls via the RISC-V ecall instruction. Instead, OpenVM supports custom RISC-V instruction set extensions directly via VM extensions. In particular, support for the Rust std library is implemented via custom RISC-V instructions.
OpenVM Intrinsics
OpenVM supports custom RISC-V instructions, known as intrinsic instructions, within the RISC-V ELF binary. These instructions may be inserted directly from the Rust program code using the Rust asm! macro and the .insn directive.
For convenience, we define two procedural macros custom_insn_i! and custom_insn_r! that provide more streamlined interfaces for calling intrinsic instructions within Rust code. These macros are defined in the openvm-custom-insn crate and are re-exported in the openvm-platform crate. They may be accessed from the openvm crate via openvm::platform::custom_insn_i! and openvm::platform::custom_insn_r!.
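The macro names suggest the standard RISC-V I-type and R-type instruction formats. As background, the sketch below packs the fields of each format into a 32-bit instruction word following the RISC-V base encoding. It runs on the host and does not use the OpenVM macros. The function names are hypothetical, and the choice of the standard custom-0 opcode (0x0b) is only an illustration, not a statement about which opcodes OpenVM extensions use.

```rust
/// The standard RISC-V "custom-0" major opcode, used here for illustration.
pub const CUSTOM_0: u32 = 0x0b;

/// Packs an R-type instruction:
/// funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0].
pub fn encode_r_type(opcode: u32, funct3: u32, funct7: u32, rd: u32, rs1: u32, rs2: u32) -> u32 {
    assert!(opcode < 0x80 && funct3 < 0x8 && funct7 < 0x80);
    assert!(rd < 32 && rs1 < 32 && rs2 < 32);
    (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode
}

/// Packs an I-type instruction:
/// imm[31:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0].
/// `imm` is the raw 12-bit immediate field.
pub fn encode_i_type(opcode: u32, funct3: u32, rd: u32, rs1: u32, imm: u32) -> u32 {
    assert!(opcode < 0x80 && funct3 < 0x8 && imm < 0x1000);
    assert!(rd < 32 && rs1 < 32);
    (imm << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode
}
```

As a sanity check against the base ISA, encode_r_type(0x33, 0, 0, 1, 2, 3) yields the encoding of add x1, x2, x3, and encode_i_type(0x13, 0, 1, 2, 5) yields the encoding of addi x1, x2, 5.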
OpenVM Kernels
OpenVM also supports insertion of custom kernel code into the RISC-V ELF. Kernel code is used as a means to statically link foreign OpenVM assembly code into the ELF without a custom linker. Custom kernel code should be inserted from Rust using the asm! macro and the .insn directive. We recommend generating the kernel code in a separate build script and then including it directly in the asm! macro invocation using include_str!.
std Support
OpenVM supports compilation of guest code using the Rust std library when the openvm crate is imported with the "std" feature enabled. When the "std" feature is enabled, the openvm crate defines extern "C" functions that are compatible with the platform abstraction layer (PAL) ABI defined in the Rust standard library for the guest target. These ABI definitions are statically linked with the Rust standard library at compile time.
Users should be aware of the limitations of the Rust std support within OpenVM, which are documented in the book.
The PAL ABI is implemented in openvm using Rust and direct calls to OpenVM intrinsics without the use of system calls. We list below the PAL ABI and how the openvm crate handles each of the ABI functions:
pub extern "C" fn sys_halt(_user_exit: u8, _out_state: *const [u32; DIGEST_WORDS]);
Calls the special terminate intrinsic with exit code HALT = 4.
pub extern "C" fn sys_output(_output_id: u32, _output_value: u32);
This is not used by OpenVM. The function will call the special terminate intrinsic with exit code UNIMP = 2.
pub unsafe extern "C" fn sys_sha_compress(
    _out_state: *mut [u32; DIGEST_WORDS],
    _in_state: *const [u32; DIGEST_WORDS],
    _block1_ptr: *const [u32; DIGEST_WORDS],
    _block2_ptr: *const [u32; DIGEST_WORDS],
);
This is not used by OpenVM and is implemented as unreachable!(). The function panics through the Rust panic handler, which calls sys_panic below.
pub unsafe extern "C" fn sys_sha_buffer(
    _out_state: *mut [u32; DIGEST_WORDS],
    _in_state: *const [u32; DIGEST_WORDS],
    _buf: *const u8,
    _count: u32,
);
This is not used by OpenVM and is implemented as unreachable!().
pub unsafe extern "C" fn sys_rand(recv_buf: *mut u32, words: usize);
Calls the hintrandom intrinsic to generate random u32 values, as many as given by words, using a fixed-seed random number generator on the host machine. Then calls the hintbuffer intrinsic to write the random values into recv_buf.
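The practical consequence of a fixed seed is that the values are deterministic: every execution sees the same sequence. The host-side sketch below models the two steps described above. The actual generator and seed used by OpenVM are not specified in this document, so xorshift32 and the SEED constant are stand-ins, and FixedSeedRng and model_sys_rand are hypothetical names.

```rust
/// Hypothetical stand-in for the host's fixed-seed generator (xorshift32).
pub struct FixedSeedRng {
    state: u32,
}

impl FixedSeedRng {
    /// Arbitrary nonzero seed chosen for this sketch only.
    pub const SEED: u32 = 0x9E37_79B9;

    pub fn new() -> Self {
        Self { state: Self::SEED }
    }

    /// One xorshift32 step; never returns 0 for a nonzero state.
    pub fn next_u32(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }
}

/// Models sys_rand: the host first prepares one value per requested word
/// (the hintrandom step), then the values are copied into the receive
/// buffer (the hintbuffer step).
pub fn model_sys_rand(rng: &mut FixedSeedRng, recv_buf: &mut [u32]) {
    let hint: Vec<u32> = (0..recv_buf.len()).map(|_| rng.next_u32()).collect();
    recv_buf.copy_from_slice(&hint);
}
```

Two runs that start from a fresh FixedSeedRng fill their buffers with identical values, which is why output from such a generator should not be treated as unpredictable.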
unsafe extern "C" fn sys_panic(msg_ptr: *const u8, len: usize);
Calls the printstr intrinsic to print the panic message, represented in UTF-8 format, stored at msg_ptr of length len. Then calls the special terminate intrinsic with exit code PANIC = 1.
pub unsafe extern "C" fn sys_log(msg_ptr: *const u8, len: usize);
Calls the printstr intrinsic to print the log message, represented in UTF-8 format, stored at msg_ptr of length len.
pub extern "C" fn sys_cycle_count() -> u64;
This function is currently unimplemented and calls the terminate intrinsic with exit code UNIMP = 2. However, we plan to introduce a new intrinsic for recording the number of instructions executed by the VM in the near future.
pub unsafe extern "C" fn sys_read(_fd: u32, _recv_ptr: *mut u8, _nread: usize);
This function is currently unimplemented and calls the terminate intrinsic with exit code UNIMP = 2. We plan to add an implementation using hint intrinsics for compatibility with std::io::Read in the future.
pub unsafe extern "C" fn sys_read_words(_fd: u32, _recv_ptr: *mut u32, _nwords: usize) -> usize;
This function is not used by OpenVM and calls the terminate intrinsic with exit code UNIMP = 2.
pub unsafe extern "C" fn sys_write(fd: u32, write_ptr: *const u8, nbytes: usize);
This function is only supported when fd equals STDOUT = 1 or STDERR = 2. In both cases, the function calls the printstr intrinsic to print the message, represented in UTF-8 format, stored at write_ptr of length nbytes, to stdout on the host machine. For any other fd, the function calls the terminate intrinsic with exit code UNIMP = 2.
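The dispatch on fd can be modelled on the host as follows. This is a sketch of the behaviour described above, not the openvm implementation: model_sys_write and ExitCode are hypothetical names, the host's stdout is represented by a String, termination is represented by returning an error, and the lossy UTF-8 conversion is an assumption because the handling of invalid UTF-8 is not described here.

```rust
pub const STDOUT: u32 = 1;
pub const STDERR: u32 = 2;

/// Exit codes named in this section.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ExitCode {
    Panic = 1,
    Unimp = 2,
    Halt = 4,
}

/// Models sys_write: both STDOUT and STDERR are printed to the host's
/// stdout (here, appended to `host_stdout`); any other descriptor
/// terminates with UNIMP (here, returned as an error).
pub fn model_sys_write(fd: u32, bytes: &[u8], host_stdout: &mut String) -> Result<(), ExitCode> {
    match fd {
        STDOUT | STDERR => {
            // Assumption: invalid UTF-8 is replaced rather than rejected.
            host_stdout.push_str(&String::from_utf8_lossy(bytes));
            Ok(())
        }
        _ => Err(ExitCode::Unimp),
    }
}
```

Note that in this model, as in the description above, writes to STDERR are not kept separate from STDOUT on the host.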
pub unsafe extern "C" fn sys_getenv(
    _out_words: *mut u32,
    _out_nwords: usize,
    _varname: *const u8,
    _varname_len: usize,
) -> usize;
This function always returns 0, which is equivalent to std::env::var always returning None.
pub extern "C" fn sys_argc() -> usize;
This function always returns 0, which is equivalent to calls to get argc returning nothing.
pub unsafe extern "C" fn sys_argv(
    _out_words: *mut u32,
    _out_nwords: usize,
    _arg_index: usize,
) -> usize;
This function always returns 0, which is equivalent to calls to get argv returning nothing.
pub extern "C" fn sys_alloc_words(nwords: usize) -> *mut u32;
This function allocates nwords * 4 bytes by advancing the heap pointer.