Tuning Wasmtime for Fast Instantiation

Before a WebAssembly module can begin execution, it must first be compiled and then instantiated. Compilation can happen ahead of time, which removes it from the critical path and leaves only instantiation. This page documents methods for tuning Wasmtime for fast instantiation.
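For example, a module can be compiled and serialized ahead of time (the `wasmtime compile` CLI subcommand does the equivalent), so that only deserialization of the precompiled artifact remains at startup. A minimal sketch, where the `example.cwasm` path and the trivial module are placeholders for illustration:

use wasmtime::{Config, Engine, Module, Result};

fn main() -> Result<()> {
    let engine = Engine::new(&Config::new())?;

    // Ahead of time (e.g. at build or deploy time): compile the module and
    // write the serialized artifact to disk.
    let module = Module::new(&engine, r#"(module)"#)?;
    std::fs::write("example.cwasm", module.serialize()?)?;

    // On the critical path: deserialize the precompiled artifact instead of
    // compiling from scratch. This is `unsafe` because Wasmtime must trust
    // that the bytes were produced by a compatible Wasmtime compilation.
    let _module = unsafe { Module::deserialize_file(&engine, "example.cwasm")? };

    Ok(())
}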

Enable the Pooling Allocator

By enabling the pooling allocator, you configure Wasmtime to allocate, up front and ahead of time, a single large pool containing all the resources necessary to run the configured maximum number of concurrent instances. Creating a new instance then doesn't require allocating new Wasm memories and tables on demand; it simply takes pre-allocated memories and tables from the pool, which is generally much faster. Deallocating an instance returns its memories and tables to the pool.

See wasmtime::PoolingAllocationConfig, wasmtime::InstanceAllocationStrategy, and wasmtime::Config::allocation_strategy for more details.
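A minimal sketch of enabling the pooling allocator, with a placeholder limit of 100 concurrent instances (the full example at the end of this page shows a more complete configuration):

use wasmtime::{Config, Engine, InstanceAllocationStrategy, PoolingAllocationConfig, Result};

fn main() -> Result<()> {
    // Pre-allocate a pool with room for up to 100 concurrent instances.
    let mut pool = PoolingAllocationConfig::new();
    pool.total_core_instances(100);

    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pool));

    let _engine = Engine::new(&config)?;
    Ok(())
}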

Enable Copy-on-Write Heap Images

Initializing a WebAssembly linear memory via a copy-on-write mapping can drastically reduce instantiation costs because copying the memory is deferred from instantiation time to the first time the data is mutated. If the Wasm module only ever reads the initial data and never overwrites it, the copy is avoided entirely.

See the API docs for wasmtime::Config::memory_init_cow for more details.
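Copy-on-write heap images are enabled by default in recent Wasmtime releases, so this is typically only needed to make the setting explicit or re-enable it. A minimal sketch:

use wasmtime::{Config, Engine, Result};

fn main() -> Result<()> {
    let mut config = Config::new();
    config.memory_init_cow(true);
    let _engine = Engine::new(&config)?;
    Ok(())
}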

Use InstancePre

To instantiate a WebAssembly module or component, Wasmtime must look up each of the module's imports and check that they have the expected types. If the imports are always the same, this work can be done ahead of time, before instantiation. A wasmtime::InstancePre represents an instance just before it is instantiated, after all of its imports have been resolved and type-checked. The only work left for instantiation is to allocate the instance's memories, tables, and internal runtime context, initialize its state, and run its start function, if any.

See the API docs for wasmtime::InstancePre, wasmtime::Linker::instantiate_pre, wasmtime::component::InstancePre, and wasmtime::component::Linker::instantiate_pre for more details.
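A minimal sketch for a core module; here the module has no imports, so the ahead-of-time work is just validating that fact, and the full example below shows the same pattern with a real import:

use wasmtime::{Engine, Linker, Module, Result, Store};

fn main() -> Result<()> {
    let engine = Engine::default();
    let module = Module::new(&engine, r#"(module (func (export "run")))"#)?;
    let linker = Linker::<()>::new(&engine);

    // Resolve and type-check the (empty) import list once, ahead of time...
    let instance_pre = linker.instantiate_pre(&module)?;

    // ...so that each instantiation only allocates and initializes the
    // instance itself.
    let mut store = Store::new(&engine, ());
    let _instance = instance_pre.instantiate(&mut store)?;
    Ok(())
}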

Putting It All Together

//! Tuning Wasmtime for fast instantiation.

use anyhow::anyhow;
use wasmtime::{
    Config, Engine, InstanceAllocationStrategy, Linker, Module, PoolingAllocationConfig, Result,
    Store,
};

fn main() -> Result<()> {
    let mut config = Config::new();

    // Configure and enable the pooling allocator with space for 100 memories of
    // up to 2GiB in size, 100 tables holding up to 5000 elements, and with a
    // limit of no more than 100 concurrent instances.
    let mut pool = PoolingAllocationConfig::new();
    pool.total_memories(100);
    pool.max_memory_size(1 << 31); // 2 GiB
    pool.total_tables(100);
    pool.table_elements(5000);
    pool.total_core_instances(100);
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pool));

    // Enable copy-on-write heap images.
    config.memory_init_cow(true);

    // Create an engine with our configuration.
    let engine = Engine::new(&config)?;

    // Create a linker and populate it with all the imports needed for the Wasm
    // programs we will run. In a more realistic Wasmtime embedding, this would
    // probably involve adding WASI functions to the linker, for example.
    let mut linker = Linker::<()>::new(&engine);
    linker.func_wrap("math", "add", |a: u32, b: u32| -> u32 { a + b })?;

    // Create a new module, load a pre-compiled module from disk, etc.
    let module = Module::new(
        &engine,
        r#"
            (module
                (import "math" "add" (func $add (param i32 i32) (result i32)))
                (func (export "run") (result i32)
                    (call $add (i32.const 29) (i32.const 13))
                )
            )
        "#,
    )?;

    // Create an `InstancePre` for our module, doing import resolution and
    // type-checking ahead-of-time and removing it from the instantiation
    // critical path.
    let instance_pre = linker.instantiate_pre(&module)?;

    // Now we can very quickly instantiate our module, so long as we have no
    // more than 100 concurrent instances at a time!
    //
    // For example, we can spawn 100 threads and have each of them instantiate
    // and run our Wasm module in a loop.
    //
    // In a real Wasmtime embedding, this would be doing something like handling
    // new HTTP requests, game events, etc., instead of just calling the
    // same function. A production embedding would likely also be using async,
    // in which case it would want some sort of back-pressure mechanism (like a
    // semaphore) on incoming tasks to avoid attempting to allocate more than
    // the pool's maximum-supported concurrent instances (at which point,
    // instantiation will start returning errors).
    let handles: Vec<std::thread::JoinHandle<Result<()>>> = (0..100)
        .map(|_| {
            let engine = engine.clone();
            let instance_pre = instance_pre.clone();
            std::thread::spawn(move || -> Result<()> {
                for _ in 0..999 {
                    // Create a new store for this instance.
                    let mut store = Store::new(&engine, ());
                    // Instantiate our module in this store.
                    let instance = instance_pre.instantiate(&mut store)?;
                    // Call the instance's `run` function!
                    let _result = instance
                        .get_typed_func::<(), i32>(&mut store, "run")?
                        .call(&mut store, ())?;
                }
                Ok(())
            })
        })
        .collect();

    // Wait for the threads to finish.
    for h in handles.into_iter() {
        h.join().map_err(|_| anyhow!("thread panicked!"))??;
    }

    Ok(())
}

See Also