Struct PoolingAllocationConfig
pub struct PoolingAllocationConfig { /* private fields */ }
Configuration options used with InstanceAllocationStrategy::Pooling to change the behavior of the pooling instance allocator.
This structure has a builder-style API in the same manner as Config and is configured with Config::allocation_strategy.
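The builder-style flow described above might look like the following minimal sketch. It assumes the wasmtime crate is available as a dependency; the function name pooling_engine and every numeric limit are illustrative choices, not recommendations.

```rust
use wasmtime::{Config, Engine, InstanceAllocationStrategy, PoolingAllocationConfig};

/// Builds an engine backed by the pooling allocator.
/// `pooling_engine` is a hypothetical helper; the limits are arbitrary examples.
pub fn pooling_engine() -> wasmtime::Result<Engine> {
    // Start from the defaults and override a few limits, builder-style.
    let mut pool = PoolingAllocationConfig::new();
    pool.total_core_instances(100)
        .total_memories(100)
        .total_tables(100)
        .max_memory_size(64 << 20); // 64 MiB per linear memory

    // Hand the pooling configuration to the engine-wide `Config`.
    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pool));
    Engine::new(&config)
}
```

Because the totals are fixed when the engine is created, they need to be chosen before any module is instantiated.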
Note that usage of the pooling allocator does not affect compiled WebAssembly code. Compiled *.cwasm files, for example, are usable both with and without the pooling allocator.
§Advantages of Pooled Allocation
The main benefit of the pooling allocator is to make WebAssembly instantiation both faster and more scalable in terms of parallelism. Allocation is faster because virtual memory is already configured and ready to go within the pool; there’s no need to mmap (for example on Unix) a new region and configure it with guard pages. Avoiding mmap also avoids whole-process virtual memory locks, which can improve scalability and performance.
Additionally with pooled allocation it’s possible to create “affine slots” to a particular WebAssembly module or component over time. For example if the same module is instantiated multiple times over time the pooling allocator will, by default, attempt to reuse the same slot. This means that the slot has been pre-configured and can retain virtual memory mappings for a copy-on-write image, for example (see Config::memory_init_cow for more information).
This means that in a steady state instance deallocation is a single madvise to reset linear memory to its original contents followed by a single (optional) mprotect during the next instantiation to shrink memory back to its original size. Compared to non-pooled allocation this avoids the need to mmap a new region of memory, munmap it, and mprotect regions too.
Another benefit of pooled allocation is that it’s possible to configure things such that no virtual memory management is required at all in a steady state. For example a pooling allocator can be configured with:
- Config::memory_init_cow disabled
- Config::memory_guard_size disabled
- Config::memory_reservation shrunk to minimal size
- PoolingAllocationConfig::table_keep_resident sufficiently large
- PoolingAllocationConfig::linear_memory_keep_resident sufficiently large
With all these options in place no virtual memory tricks are used at all and everything is manually managed by Wasmtime (for example resetting memory is a memset(0)). This is not as fast in a single-threaded scenario but can provide benefits in high-parallelism situations as no virtual memory locks or IPIs need to happen.
§Disadvantages of Pooled Allocation
Despite the above advantages to instantiation performance, the pooling allocator is not enabled by default in Wasmtime. One reason is that the performance advantages are not necessarily portable: for example, while the pooling allocator works on Windows, it has not been tuned for performance on Windows in the same way it has on Linux.
Additionally the main cost of the pooling allocator is that it requires a very large reservation of virtual memory (on the order of most of the addressable virtual address space). WebAssembly 32-bit linear memories in Wasmtime are, by default 4G address space reservations with a small guard region both before and after the linear memory. Memories in the pooling allocator are contiguous which means that we only need a guard after linear memory because the previous linear memory’s slot post-guard is our own pre-guard. This means that, by default, the pooling allocator uses roughly 4G of virtual memory per WebAssembly linear memory slot. 4G of virtual memory is 32 bits of a 64-bit address. Many 64-bit systems can only actually use 48-bit addresses by default (although this can be extended on architectures nowadays too), and of those 48 bits one of them is reserved to indicate kernel-vs-userspace. This leaves 47-32=15 bits left, meaning you can only have at most 32k slots of linear memories on many systems by default. This is a relatively small number and shows how the pooling allocator can quickly exhaust all of virtual memory.
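The bit arithmetic in the paragraph above can be checked with a small sketch. The function name max_slots is hypothetical; the inputs (47 usable user-space bits, 32 bits per slot) are the figures quoted in the text.

```rust
/// Number of equally sized slots that fit when each slot consumes
/// `slot_bits` bits of a user address space that is `user_bits` bits wide.
/// `max_slots` is an illustrative helper, not a Wasmtime API.
pub fn max_slots(user_bits: u32, slot_bits: u32) -> u64 {
    match user_bits.checked_sub(slot_bits) {
        // 2^(user_bits - slot_bits) slots; saturate if the shift would overflow.
        Some(bits) => 1u64.checked_shl(bits).unwrap_or(u64::MAX),
        // A slot larger than the whole address space cannot fit at all.
        None => 0,
    }
}
```

With 47 usable bits and 4G (32-bit) slots this yields 2^15 slots, which is the "32k" figure mentioned above.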
Another disadvantage of the pooling allocator is that it may keep memory alive when nothing is using it. A previously used slot for an instance might have paged-in memory that will not get paged out until the Engine owning the pooling allocator is dropped. While suitable for some applications this behavior may not be suitable for all applications.
The last disadvantage of the pooling allocator is that the configuration values for the maximum number of instances, memories, tables, etc., must all be fixed up-front. There’s not always a clear answer as to what these values should be, so not all applications may be able to work with this constraint.
Implementations§
impl PoolingAllocationConfig
pub fn new() -> PoolingAllocationConfig
Returns a new configuration builder with all default settings configured.
pub fn max_unused_warm_slots(
&mut self,
max: u32,
) -> &mut PoolingAllocationConfig
Configures the maximum number of “unused warm slots” to retain in the pooling allocator.
The pooling allocator operates over slots to allocate from, and each slot is considered “cold” if it’s never been used before or “warm” if it’s been used by some module in the past. Slots in the pooling allocator additionally track an “affinity” flag to a particular core wasm module. When a module is instantiated into a slot then the slot is considered affine to that module, even after the instance has been deallocated.
When a new instance is created then a slot must be chosen, and the current algorithm for selecting a slot is:
- If there are slots that are affine to the module being instantiated, then the most recently used slot is selected to be allocated from. This is done to improve reuse of resources such as memory mappings and additionally try to benefit from temporal locality for things like caches.
- Otherwise if there are more than N affine slots to other modules, then one of those affine slots is chosen to be allocated. The slot chosen is picked on a least-recently-used basis.
- Finally, if there are fewer than N affine slots to other modules, then the non-affine slots are allocated from.
This setting, max_unused_warm_slots, is the value for N in the above algorithm. The purpose of this setting is to have a knob over the RSS impact of “unused slots” for a long-running wasm server.
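The selection steps above can be sketched as a small model. This is not Wasmtime's implementation: the Slot type, the logical last_used timestamp and the pick_slot function are all hypothetical, and the behavior when exactly N slots are affine to other modules (treated here like "fewer than N") and the fallback when no cold slot exists are assumptions.

```rust
/// State of one slot in a hypothetical pool (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Slot {
    /// Never used before.
    Cold,
    /// Free, previously used by `module` at logical time `last_used`.
    Warm { module: u32, last_used: u64 },
    /// Currently holding a live instance; not a candidate.
    InUse,
}

/// Picks a slot index for `module`, with `max_unused_warm_slots` playing the role of N.
pub fn pick_slot(slots: &[Slot], module: u32, max_unused_warm_slots: usize) -> Option<usize> {
    let mut affine: Option<(usize, u64)> = None; // most recently used, same module
    let mut lru_other: Option<(usize, u64)> = None; // least recently used, other module
    let mut other_count = 0usize;
    let mut first_cold: Option<usize> = None;

    for (i, slot) in slots.iter().enumerate() {
        match *slot {
            Slot::Warm { module: m, last_used } if m == module => {
                if affine.map_or(true, |(_, t)| last_used > t) {
                    affine = Some((i, last_used));
                }
            }
            Slot::Warm { last_used, .. } => {
                other_count += 1;
                if lru_other.map_or(true, |(_, t)| last_used < t) {
                    lru_other = Some((i, last_used));
                }
            }
            Slot::Cold => {
                if first_cold.is_none() {
                    first_cold = Some(i);
                }
            }
            Slot::InUse => {}
        }
    }

    // Step 1: prefer the most recently used slot affine to this module.
    if let Some((i, _)) = affine {
        return Some(i);
    }
    let lru_other = lru_other.map(|(i, _)| i);
    // Step 2: more than N warm slots belong to other modules, so reuse the LRU one.
    if other_count > max_unused_warm_slots {
        return lru_other;
    }
    // Step 3: otherwise take a cold slot (assumption: fall back to LRU reuse if none is left).
    first_cold.or(lru_other)
}
```

In this model a setting of 0 reuses warm slots whenever one exists, while a very large setting keeps drawing from cold slots until they run out.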
If this setting is set to 0, for example, then affine slots are aggressively reused on a least-recently-used basis. A “cold” slot is only used if there are no affine slots available to allocate from. This means that the set of slots used over the lifetime of a program is the same as the maximum concurrent number of wasm instances.
If this setting is set to infinity, however, then cold slots are prioritized to be allocated from. This means that the set of slots used over the lifetime of a program will approach PoolingAllocationConfig::total_memories, or the maximum number of slots in the pooling allocator.
Wasmtime does not aggressively decommit all resources associated with a slot when the slot is not in use. For example the PoolingAllocationConfig::linear_memory_keep_resident option can be used to keep memory associated with a slot, even when it’s not in use. This means that the total set of used slots in the pooling instance allocator can impact the overall RSS usage of a program.
The default value for this option is 100.
pub fn decommit_batch_size(
&mut self,
batch_size: usize,
) -> &mut PoolingAllocationConfig
The target number of decommits to do per batch.
This is not precise, as we can queue up decommits at times when we aren’t prepared to immediately flush them, and so we may go over this target size occasionally.
A batch size of one effectively disables batching.
Defaults to 1.
pub fn async_stack_zeroing(
&mut self,
enable: bool,
) -> &mut PoolingAllocationConfig
Configures whether or not stacks used for async futures are reset to zero after usage.
When the async_support method is enabled for Wasmtime and the call_async variant of calling WebAssembly is used then Wasmtime will create a separate runtime execution stack for each future produced by call_async. During the deallocation process Wasmtime won’t by default reset the contents of the stack back to zero.
When this option is enabled it can be seen as a defense-in-depth mechanism to reset a stack back to zero. This is not required for correctness and can be a costly operation in highly concurrent environments due to modifications of the virtual address space requiring process-wide synchronization.
This option defaults to false.
pub fn async_stack_keep_resident(
&mut self,
size: usize,
) -> &mut PoolingAllocationConfig
How much memory, in bytes, to keep resident for async stacks allocated with the pooling allocator.
When PoolingAllocationConfig::async_stack_zeroing is enabled then Wasmtime will reset the contents of async stacks back to zero upon deallocation. This option can be used to perform the zeroing operation with memset up to a certain threshold of bytes instead of using system calls to reset the stack to zero.
Note that when using this option the memory with async stacks will never be decommitted.
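One way to picture the "memset up to a threshold" behavior is as a split of the region into a part zeroed in place and a remainder handed to the operating system. The sketch below only computes that split; the function name zeroing_split is hypothetical, and which end of the stack is kept resident is not stated here, so the sketch makes no claim about it.

```rust
/// Splits a region of `len` bytes into (bytes zeroed with memset,
/// bytes reset through a system call), given a `keep_resident` threshold.
/// Illustrative helper only; not part of Wasmtime's API.
pub fn zeroing_split(len: usize, keep_resident: usize) -> (usize, usize) {
    // At most `keep_resident` bytes are zeroed in place and stay resident.
    let memset_len = len.min(keep_resident);
    // Whatever remains is reset by the operating system instead.
    (memset_len, len - memset_len)
}
```

A threshold of 0 sends the whole region through the system call path, while a threshold at or above the region size means no system call is needed.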
pub fn linear_memory_keep_resident(
&mut self,
size: usize,
) -> &mut PoolingAllocationConfig
How much memory, in bytes, to keep resident for each linear memory after deallocation.
This option is only applicable on Linux and has no effect on other platforms.
By default Wasmtime will use madvise to reset the entire contents of linear memory back to zero when a linear memory is deallocated. With this option, memset is used instead to set memory back to zero, which can, in some configurations, reduce the number of page faults taken when a slot is reused.
pub fn table_keep_resident(
&mut self,
size: usize,
) -> &mut PoolingAllocationConfig
How much memory, in bytes, to keep resident for each table after deallocation.
This option is only applicable on Linux and has no effect on other platforms.
This option is the same as PoolingAllocationConfig::linear_memory_keep_resident except that it is applicable to tables instead.
pub fn total_component_instances(
&mut self,
count: u32,
) -> &mut PoolingAllocationConfig
The maximum number of concurrent component instances supported (default is 1000).
This provides an upper-bound on the total size of component metadata-related allocations, along with PoolingAllocationConfig::max_component_instance_size. The upper bound is total_component_instances * max_component_instance_size where max_component_instance_size is rounded up to the size and alignment of the internal representation of the metadata.
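The upper bound described above can be worked through numerically. In this sketch the alignment is a parameter because the real size and alignment of the internal metadata representation are not given here; the value 16 used in the usage note is purely an assumption, and both function names are hypothetical.

```rust
/// Rounds `size` up to the next multiple of `align` (a power of two).
pub fn round_up(size: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (size + align - 1) & !(align - 1)
}

/// Upper bound on metadata allocations: count * rounded per-instance size.
/// `align` stands in for the (unspecified) internal alignment requirement.
pub fn metadata_upper_bound(total_instances: u32, max_instance_size: usize, align: usize) -> usize {
    total_instances as usize * round_up(max_instance_size, align)
}
```

For example, with the documented defaults of 1000 instances and 1MiB per instance, and an assumed alignment of 16, the bound is 1000 * 1048576 bytes, or roughly 1 GB. The same formula applies to core instances with total_core_instances and max_core_instance_size.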
pub fn max_component_instance_size(
&mut self,
size: usize,
) -> &mut PoolingAllocationConfig
The maximum size, in bytes, allocated for a component instance’s VMComponentContext metadata.
The wasmtime::component::Instance type has a static size but its internal VMComponentContext is dynamically sized depending on the component being instantiated. This size limit loosely correlates to the size of the component, taking into account factors such as:
- number of lifted and lowered functions
- number of memories
- number of inner instances
- number of resources
If the allocated size per instance is too small then instantiation of a module will fail at runtime with an error indicating how many bytes were needed.
The default value for this is 1MiB.
This provides an upper-bound on the total size of component metadata-related allocations, along with PoolingAllocationConfig::total_component_instances. The upper bound is total_component_instances * max_component_instance_size where max_component_instance_size is rounded up to the size and alignment of the internal representation of the metadata.
pub fn max_core_instances_per_component(
&mut self,
count: u32,
) -> &mut PoolingAllocationConfig
The maximum number of core instances a single component may contain (default is unlimited).
This method (along with PoolingAllocationConfig::max_memories_per_component, PoolingAllocationConfig::max_tables_per_component, and PoolingAllocationConfig::max_component_instance_size) allows you to cap the amount of resources a single component allocation consumes.
If a component will instantiate more core instances than count, then the component will fail to instantiate.
pub fn max_memories_per_component(
&mut self,
count: u32,
) -> &mut PoolingAllocationConfig
The maximum number of Wasm linear memories that a single component may transitively contain (default is unlimited).
This method (along with PoolingAllocationConfig::max_core_instances_per_component, PoolingAllocationConfig::max_tables_per_component, and PoolingAllocationConfig::max_component_instance_size) allows you to cap the amount of resources a single component allocation consumes.
If a component transitively contains more linear memories than count, then the component will fail to instantiate.
pub fn max_tables_per_component(
&mut self,
count: u32,
) -> &mut PoolingAllocationConfig
The maximum number of tables that a single component may transitively contain (default is unlimited).
This method (along with PoolingAllocationConfig::max_core_instances_per_component, PoolingAllocationConfig::max_memories_per_component, and PoolingAllocationConfig::max_component_instance_size) allows you to cap the amount of resources a single component allocation consumes.
If a component transitively contains more tables than count, then the component will fail to instantiate.
pub fn total_memories(&mut self, count: u32) -> &mut PoolingAllocationConfig
The maximum number of concurrent Wasm linear memories supported (default is 1000).
This value has a direct impact on the amount of memory allocated by the pooling instance allocator.
The pooling instance allocator allocates a memory pool, where each entry in the pool contains the reserved address space for each linear memory supported by an instance.
The memory pool will reserve a large quantity of host process address space to elide the bounds checks required for correct WebAssembly memory semantics. Even with 64-bit address spaces, the address space is limited when dealing with a large number of linear memories.
For example, on Linux x86_64, the userland address space limit is 128 TiB. That might seem like a lot, but each linear memory will reserve 6 GiB of space by default.
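To make the scale concrete, the figures in the example above can be divided out. This is a rough sketch that ignores every other mapping in the process; the function name max_linear_memories and the constants are illustrative.

```rust
pub const GIB: u64 = 1 << 30;
pub const TIB: u64 = 1 << 40;

/// How many reservations of `per_memory` bytes fit in `address_space` bytes.
/// Returns `None` if `per_memory` is zero. Illustrative helper only.
pub fn max_linear_memories(address_space: u64, per_memory: u64) -> Option<u64> {
    address_space.checked_div(per_memory)
}
```

With a 128 TiB userland limit and 6 GiB per linear memory, max_linear_memories(128 * TIB, 6 * GIB) gives about 21845 memories before the address space is exhausted.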
pub fn total_tables(&mut self, count: u32) -> &mut PoolingAllocationConfig
The maximum number of concurrent tables supported (default is 1000).
This value has a direct impact on the amount of memory allocated by the pooling instance allocator.
The pooling instance allocator allocates a table pool, where each entry in the pool contains the space needed for each WebAssembly table supported by an instance (see table_elements to control the size of each table).
pub fn total_stacks(&mut self, count: u32) -> &mut PoolingAllocationConfig
The maximum number of execution stacks allowed for asynchronous execution, when enabled (default is 1000).
This value has a direct impact on the amount of memory allocated by the pooling instance allocator.
pub fn total_core_instances(
&mut self,
count: u32,
) -> &mut PoolingAllocationConfig
The maximum number of concurrent core instances supported (default is 1000).
This provides an upper-bound on the total size of core instance metadata-related allocations, along with PoolingAllocationConfig::max_core_instance_size. The upper bound is total_core_instances * max_core_instance_size where max_core_instance_size is rounded up to the size and alignment of the internal representation of the metadata.
pub fn max_core_instance_size(
&mut self,
size: usize,
) -> &mut PoolingAllocationConfig
The maximum size, in bytes, allocated for a core instance’s VMContext metadata.
The Instance type has a static size but its VMContext metadata is dynamically sized depending on the module being instantiated. This size limit loosely correlates to the size of the Wasm module, taking into account factors such as:
- number of functions
- number of globals
- number of memories
- number of tables
- number of function types
If the allocated size per instance is too small then instantiation of a module will fail at runtime with an error indicating how many bytes were needed.
The default value for this is 1MiB.
This provides an upper-bound on the total size of core instance metadata-related allocations, along with PoolingAllocationConfig::total_core_instances. The upper bound is total_core_instances * max_core_instance_size where max_core_instance_size is rounded up to the size and alignment of the internal representation of the metadata.
pub fn max_tables_per_module(
&mut self,
tables: u32,
) -> &mut PoolingAllocationConfig
The maximum number of defined tables for a core module (default is 1).
This value controls the capacity of the VMTableDefinition table in each instance’s VMContext structure.
The allocated size of the table will be tables * sizeof(VMTableDefinition) for each instance regardless of how many tables are defined by an instance’s module.
pub fn table_elements(
&mut self,
elements: usize,
) -> &mut PoolingAllocationConfig
The maximum table elements for any table defined in a module (default is 20000).
If a table’s minimum element limit is greater than this value, the module will fail to instantiate.
If a table’s maximum element limit is unbounded or greater than this value, the maximum will be table_elements for the purpose of any table.grow instruction.
This value is used to reserve the maximum space for each supported table; table elements are pointer-sized in the Wasmtime runtime. Therefore, the space reserved for each instance is tables * table_elements * sizeof::<*const ()>.
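The reservation formula above is simple enough to evaluate directly. A minimal sketch, with a hypothetical function name and overflow-checked multiplication:

```rust
use std::mem::size_of;

/// Space reserved for one instance's tables, following the formula
/// tables * table_elements * pointer size. Returns `None` on overflow.
/// Illustrative helper only; not part of Wasmtime's API.
pub fn table_reservation_bytes(tables: usize, table_elements: usize) -> Option<usize> {
    tables
        .checked_mul(table_elements)?
        .checked_mul(size_of::<*const ()>())
}
```

With the documented defaults of 1 table per module and 20000 elements, this comes to 20000 pointers per instance, which is 160000 bytes on a host with 8-byte pointers.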
pub fn max_memories_per_module(
&mut self,
memories: u32,
) -> &mut PoolingAllocationConfig
The maximum number of defined linear memories for a module (default is 1).
This value controls the capacity of the VMMemoryDefinition table in each core instance’s VMContext structure.
The allocated size of the table will be memories * sizeof(VMMemoryDefinition) for each core instance regardless of how many memories are defined by the core instance’s module.
pub fn max_memory_size(&mut self, bytes: usize) -> &mut PoolingAllocationConfig
The maximum byte size that any WebAssembly linear memory may grow to.
This option defaults to 4 GiB, meaning that for 32-bit linear memories there are no restrictions. 64-bit linear memories will not be allowed to grow beyond 4 GiB by default.
If a memory’s minimum size is greater than this value, the module will fail to instantiate.
If a memory’s maximum size is unbounded or greater than this value, the maximum will be max_memory_size for the purpose of any memory.grow instruction.
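The two rules above (reject a too-large minimum, clamp the maximum) can be modeled as follows. This is a sketch of the documented behavior, not Wasmtime's code; the function name, the byte-based parameters and the error message are assumptions.

```rust
/// Returns the limit, in bytes, that `memory.grow` would be held to,
/// or an error if the memory's minimum already exceeds `max_memory_size`.
/// `declared_max` is `None` when the module declares no maximum.
pub fn effective_memory_limit(
    minimum: u64,
    declared_max: Option<u64>,
    max_memory_size: u64,
) -> Result<u64, String> {
    if minimum > max_memory_size {
        // Mirrors "the module will fail to instantiate".
        return Err(format!(
            "memory minimum of {minimum} bytes exceeds the limit of {max_memory_size} bytes"
        ));
    }
    // An unbounded or larger declared maximum is clamped to `max_memory_size`.
    Ok(declared_max.map_or(max_memory_size, |m| m.min(max_memory_size)))
}
```

A module that declares a smaller maximum keeps it; only unbounded or larger maximums are clamped.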
This value is used to control the maximum accessible space for each linear memory of a core instance. This can be thought of as a simple mechanism like Store::limiter to limit memory at runtime. This value can also affect striping/coloring behavior when used in conjunction with memory_protection_keys.
The virtual memory reservation size of each linear memory is controlled by the Config::memory_reservation setting and this method’s configuration cannot exceed Config::memory_reservation.
pub fn memory_protection_keys(
&mut self,
enable: MpkEnabled,
) -> &mut PoolingAllocationConfig
Configures whether memory protection keys (MPK) should be used for more efficient layout of pool-allocated memories.
When using the pooling allocator (see Config::allocation_strategy, InstanceAllocationStrategy::Pooling), memory protection keys can reduce the total amount of allocated virtual memory by eliminating guard regions between WebAssembly memories in the pool. It does so by “coloring” memory regions with different memory keys and setting which regions are accessible each time execution switches from host to guest (or vice versa).
Leveraging MPK requires configuring a smaller-than-default max_memory_size to enable this coloring/striping behavior. For example embeddings might want to reduce the default 4G allowance to 128M.
MPK is only available on Linux (called pku there) and recent x86 systems; we check for MPK support at runtime by examining the CPUID register. This configuration setting can be in three states:
- auto: if MPK support is available the guard regions are removed; if not, the guard regions remain
- enable: use MPK to eliminate guard regions; fail if MPK is not supported
- disable: never use MPK
By default this value is disabled, but may become auto in future releases.
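The three states can be read as a small decision table. The sketch below models it with a stand-in enum named MpkMode (hypothetical, defined here so the example is self-contained; it is not the crate's MpkEnabled type) and a hypothetical use_mpk function that takes host availability as an input.

```rust
/// Stand-in for the three documented states (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MpkMode {
    Auto,
    Enable,
    Disable,
}

/// Decides whether guard regions are replaced by MPK coloring.
/// `available` represents the result of the runtime support check.
pub fn use_mpk(mode: MpkMode, available: bool) -> Result<bool, &'static str> {
    match (mode, available) {
        // disable: never use MPK, regardless of host support.
        (MpkMode::Disable, _) => Ok(false),
        // auto: use MPK only if the host supports it; otherwise keep guard regions.
        (MpkMode::Auto, avail) => Ok(avail),
        // enable: use MPK, and fail if the host lacks support.
        (MpkMode::Enable, true) => Ok(true),
        (MpkMode::Enable, false) => Err("MPK was requested but is not supported on this host"),
    }
}
```

Only the enable state can produce an error; auto degrades silently to the guard-region layout.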
WARNING: this configuration option is still experimental; use at your own risk! MPK uses kernel and CPU features to protect memory regions; you may observe segmentation faults if anything is misconfigured.
pub fn max_memory_protection_keys(
&mut self,
max: usize,
) -> &mut PoolingAllocationConfig
Sets an upper limit on how many memory protection keys (MPK) Wasmtime will use.
This setting is only applicable when PoolingAllocationConfig::memory_protection_keys is set to enable or auto. Configuring this above the HW and OS limits (typically 15) has no effect.
If multiple Wasmtime engines are used in the same process, note that all engines will share the same set of allocated keys; this setting will limit how many keys are allocated initially and thus available to all other engines.
pub fn are_memory_protection_keys_available() -> bool
Check if memory protection keys (MPK) are available on the current host.
This is a convenience method for determining MPK availability using the same method that MpkEnabled::Auto does. See PoolingAllocationConfig::memory_protection_keys for more information.
pub fn total_gc_heaps(&mut self, count: u32) -> &mut PoolingAllocationConfig
The maximum number of concurrent GC heaps supported (default is 1000).
This value has a direct impact on the amount of memory allocated by the pooling instance allocator.
The pooling instance allocator allocates a GC heap pool, where each entry in the pool contains the space needed for each GC heap used by a store.
Trait Implementations§
impl Clone for PoolingAllocationConfig
fn clone(&self) -> PoolingAllocationConfig
fn clone_from(&mut self, source: &Self)
impl Debug for PoolingAllocationConfig
impl Default for PoolingAllocationConfig
fn default() -> PoolingAllocationConfig
impl From<PoolingAllocationConfig> for InstanceAllocationStrategy
fn from(cfg: PoolingAllocationConfig) -> InstanceAllocationStrategy
Auto Trait Implementations§
impl Freeze for PoolingAllocationConfig
impl RefUnwindSafe for PoolingAllocationConfig
impl Send for PoolingAllocationConfig
impl Sync for PoolingAllocationConfig
impl Unpin for PoolingAllocationConfig
impl UnwindSafe for PoolingAllocationConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.