wasmtime_environ/component/translate/adapt.rs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455
//! Identification and creation of fused adapter modules in Wasmtime.
//!
//! A major piece of the component model is the ability for core wasm modules to
//! talk to each other through the use of lifted and lowered functions. For
//! example one core wasm module can export a function which is lifted. Another
//! component could import that lifted function, lower it, and pass it as the
//! import to another core wasm module. This is what Wasmtime calls "adapter
//! fusion" where two core wasm functions are coming together through the
//! component model.
//!
//! There are a few ingredients during adapter fusion:
//!
//! * A core wasm function which is "lifted".
//! * A "lift type" which is the type that the component model function had in
//! the original component
//! * A "lower type" which is the type that the component model function has
//! in the destination component (the one the uses `canon lower`)
//! * Configuration options for both the lift and the lower operations such as
//! memories, reallocs, etc.
//!
//! With these ingredients combined Wasmtime must produce a function which
//! connects the two components through the options specified. The fused adapter
//! performs tasks such as validation of passed values, copying data between
//! linear memories, etc.
//!
//! Wasmtime's current implementation of fused adapters is designed to reduce
//! complexity elsewhere as much as possible while also being suitable for being
//! used as a polyfill for the component model in JS environments as well. To
//! that end Wasmtime implements a fused adapter with another wasm module that
//! it itself generates on the fly. The usage of WebAssembly for fused adapters
//! has a number of advantages:
//!
//! * There is no need to create a raw Cranelift-based compiler. This is where
//! majority of "unsafety" lives in Wasmtime so reducing the need to lean on
//! this or audit another compiler is predicted to weed out a whole class of
//! bugs in the fused adapter compiler.
//!
//! * As mentioned above generation of WebAssembly modules means that this is
//! suitable for use in JS environments. For example a hypothetical tool which
//! polyfills a component onto the web today would need to do something for
//! adapter modules, and ideally the adapters themselves are speedy. While
//! this could all be written in JS the adapting process is quite nontrivial
//! so sharing code with Wasmtime would be ideal.
//!
//! * Using WebAssembly insulates the implementation to bugs to a certain
//! degree. While logic bugs are still possible it should be much more
//! difficult to have segfaults or things like that. With adapters exclusively
//! executing inside a WebAssembly sandbox like everything else the failure
//! modes to the host at least should be minimized.
//!
//! * Integration into the runtime is relatively simple, the adapter modules are
//! just another kind of wasm module to instantiate and wire up at runtime.
//! The goal is that the `GlobalInitializer` list that is processed at runtime
//! will have all of its `Adapter`-using variants erased by the time it makes
//! its way all the way up to Wasmtime. This means that the support in
//! Wasmtime prior to adapter modules is actually the same as the support
//! after adapter modules are added, keeping the runtime fiddly bits quite
//! minimal.
//!
//! This isn't to say that this approach isn't without its disadvantages of
//! course. For now though this seems to be a reasonable set of tradeoffs for
//! the development stage of the component model proposal.
//!
//! ## Creating adapter modules
//!
//! With WebAssembly itself being used to implement fused adapters, Wasmtime
//! still has the question of how to organize the adapter functions into actual
//! wasm modules.
//!
//! The first thing you might reach for is to put all the adapters into the same
//! wasm module. This cannot be done, however, because some adapters may depend
//! on other adapters (transitively) to be created. This means that if
//! everything were in the same module there would be no way to instantiate the
//! module. An example of this dependency is an adapter (A) used to create a
//! core wasm instance (M) whose exported memory is then referenced by another
//! adapter (B). In this situation the adapter B cannot be in the same module
//! as adapter A because B needs the memory of M but M is created with A which
//! would otherwise create a circular dependency.
//!
//! The second possibility of organizing adapter modules would be to place each
//! fused adapter into its own module. Each `canon lower` would effectively
//! become a core wasm module instantiation at that point. While this works it's
//! currently believed to be a bit too fine-grained. For example it would mean
//! that importing a dozen lowered functions into a module could possibly result
//! in up to a dozen different adapter modules. While this possibility could
//! work it has been ruled out as "probably too expensive at runtime".
//!
//! Thus the purpose and existence of this module is now evident -- this module
//! exists to identify what exactly goes into which adapter module. This will
//! evaluate the `GlobalInitializer` lists coming out of the `inline` pass and
//! insert `InstantiateModule` entries for where adapter modules should be
//! created.
//!
//! ## Partitioning adapter modules
//!
//! Currently this module does not attempt to be really all that fancy about
//! grouping adapters into adapter modules. The main idea is that most items
//! within an adapter module are likely to be close together since they're
//! theoretically going to be used for an instantiation of a core wasm module
//! just after the fused adapter was declared. With that in mind the current
//! algorithm is a one-pass approach to partitioning everything into adapter
//! modules.
//!
//! Adapters were identified in-order as part of the inlining phase of
//! translation where we're guaranteed that once an adapter is identified
//! it can't depend on anything identified later. The pass implemented here is
//! to visit all transitive dependencies of an adapter. If one of the
//! dependencies of an adapter is an adapter in the current adapter module
//! being built then the current module is finished and a new adapter module is
//! started. This should quickly partition adapters into contiugous chunks of
//! their index space which can be in adapter modules together.
//!
//! There's probably more general algorithms for this but for now this should be
//! fast enough as it's "just" a linear pass. As we get more components over
//! time this may want to be revisited if too many adapter modules are being
//! created.
use crate::component::translate::*;
use crate::fact;
use crate::EntityType;
use std::collections::HashSet;
/// Metadata information about a fused adapter.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct Adapter {
/// The type used when the original core wasm function was lifted.
///
/// Note that this could be different than `lower_ty` (but still matches
/// according to subtyping rules).
pub lift_ty: TypeFuncIndex,
/// Canonical ABI options used when the function was lifted.
pub lift_options: AdapterOptions,
/// The type used when the function was lowered back into a core wasm
/// function.
///
/// Note that this could be different than `lift_ty` (but still matches
/// according to subtyping rules).
pub lower_ty: TypeFuncIndex,
/// Canonical ABI options used when the function was lowered.
pub lower_options: AdapterOptions,
/// The original core wasm function which was lifted.
pub func: dfg::CoreDef,
}
/// Configuration options which can be specified as part of the canonical ABI
/// in the component model.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct AdapterOptions {
/// The Wasmtime-assigned component instance index where the options were
/// originally specified.
pub instance: RuntimeComponentInstanceIndex,
/// How strings are encoded.
pub string_encoding: StringEncoding,
/// An optional memory definition supplied.
pub memory: Option<dfg::CoreExport<MemoryIndex>>,
/// If `memory` is specified, whether it's a 64-bit memory.
pub memory64: bool,
/// An optional definition of `realloc` to used.
pub realloc: Option<dfg::CoreDef>,
/// An optional definition of a `post-return` to use.
pub post_return: Option<dfg::CoreDef>,
}
impl<'data> Translator<'_, 'data> {
/// This is the entrypoint of functionality within this module which
/// performs all the work of identifying adapter usages and organizing
/// everything into adapter modules.
///
/// This will mutate the provided `component` in-place and fill out the dfg
/// metadata for adapter modules.
pub(super) fn partition_adapter_modules(&mut self, component: &mut dfg::ComponentDfg) {
// Visit each adapter, in order of its original definition, during the
// partitioning. This allows for the guarantee that dependencies are
// visited in a topological fashion ideally.
let mut state = PartitionAdapterModules::default();
for (id, adapter) in component.adapters.iter() {
state.adapter(component, id, adapter);
}
state.finish_adapter_module();
// Now that all adapters have been partitioned into modules this loop
// generates a core wasm module for each adapter module, translates
// the module using standard core wasm translation, and then fills out
// the dfg metadata for each adapter.
for (module_id, adapter_module) in state.adapter_modules.iter() {
let mut module =
fact::Module::new(self.types.types(), self.tunables.debug_adapter_modules);
let mut names = Vec::with_capacity(adapter_module.adapters.len());
for adapter in adapter_module.adapters.iter() {
let name = format!("adapter{}", adapter.as_u32());
module.adapt(&name, &component.adapters[*adapter]);
names.push(name);
}
let wasm = module.encode();
let imports = module.imports().to_vec();
// Extend the lifetime of the owned `wasm: Vec<u8>` on the stack to
// a higher scope defined by our original caller. That allows to
// transform `wasm` into `&'data [u8]` which is much easier to work
// with here.
let wasm = &*self.scope_vec.push(wasm);
if log::log_enabled!(log::Level::Trace) {
match wasmprinter::print_bytes(wasm) {
Ok(s) => log::trace!("generated adapter module:\n{}", s),
Err(e) => log::trace!("failed to print adapter module: {}", e),
}
}
// With the wasm binary this is then pushed through general
// translation, validation, etc. Note that multi-memory is
// specifically enabled here since the adapter module is highly
// likely to use that if anything is actually indirected through
// memory.
self.validator.reset();
let translation = ModuleEnvironment::new(
self.tunables,
&mut self.validator,
self.types.module_types_builder(),
)
.translate(Parser::new(0), wasm)
.expect("invalid adapter module generated");
// Record, for each adapter in this adapter module, the module that
// the adapter was placed within as well as the function index of
// the adapter in the wasm module generated. Note that adapters are
// paritioned in-order so we're guaranteed to push the adapters
// in-order here as well. (with an assert to double-check)
for (adapter, name) in adapter_module.adapters.iter().zip(&names) {
let index = translation.module.exports[name];
let i = component.adapter_paritionings.push((module_id, index));
assert_eq!(i, *adapter);
}
// Finally the metadata necessary to instantiate this adapter
// module is also recorded in the dfg. This metadata will be used
// to generate `GlobalInitializer` entries during the linearization
// final phase.
assert_eq!(imports.len(), translation.module.imports().len());
let args = imports
.iter()
.zip(translation.module.imports())
.map(|(arg, (_, _, ty))| fact_import_to_core_def(component, arg, ty))
.collect::<Vec<_>>();
let static_index = self.static_modules.push(translation);
let id = component.adapter_modules.push((static_index, args.into()));
assert_eq!(id, module_id);
}
}
}
fn fact_import_to_core_def(
dfg: &mut dfg::ComponentDfg,
import: &fact::Import,
ty: EntityType,
) -> dfg::CoreDef {
let mut simple_intrinsic = |trampoline: dfg::Trampoline| {
let signature = ty.unwrap_func();
let index = dfg
.trampolines
.push((signature.unwrap_module_type_index(), trampoline));
dfg::CoreDef::Trampoline(index)
};
match import {
fact::Import::CoreDef(def) => def.clone(),
fact::Import::Transcode {
op,
from,
from64,
to,
to64,
} => {
fn unwrap_memory(def: &dfg::CoreDef) -> dfg::CoreExport<MemoryIndex> {
match def {
dfg::CoreDef::Export(e) => e.clone().map_index(|i| match i {
EntityIndex::Memory(i) => i,
_ => unreachable!(),
}),
_ => unreachable!(),
}
}
let from = dfg.memories.push(unwrap_memory(from));
let to = dfg.memories.push(unwrap_memory(to));
let signature = ty.unwrap_func();
let index = dfg.trampolines.push((
signature.unwrap_module_type_index(),
dfg::Trampoline::Transcoder {
op: *op,
from,
from64: *from64,
to,
to64: *to64,
},
));
dfg::CoreDef::Trampoline(index)
}
fact::Import::ResourceTransferOwn => simple_intrinsic(dfg::Trampoline::ResourceTransferOwn),
fact::Import::ResourceTransferBorrow => {
simple_intrinsic(dfg::Trampoline::ResourceTransferBorrow)
}
fact::Import::ResourceEnterCall => simple_intrinsic(dfg::Trampoline::ResourceEnterCall),
fact::Import::ResourceExitCall => simple_intrinsic(dfg::Trampoline::ResourceExitCall),
}
}
#[derive(Default)]
struct PartitionAdapterModules {
/// The next adapter module that's being created. This may be empty.
next_module: AdapterModuleInProgress,
/// The set of items which are known to be defined which the adapter module
/// in progress is allowed to depend on.
defined_items: HashSet<Def>,
/// Finished adapter modules that won't be added to.
///
/// In theory items could be added to preexisting modules here but to keep
/// this pass linear this is never modified after insertion.
adapter_modules: PrimaryMap<dfg::AdapterModuleId, AdapterModuleInProgress>,
}
#[derive(Default)]
struct AdapterModuleInProgress {
/// The adapters which have been placed into this module.
adapters: Vec<dfg::AdapterId>,
}
/// Items that adapters can depend on.
///
/// Note that this is somewhat of a flat list and is intended to mostly model
/// core wasm instances which are side-effectful unlike other host items like
/// lowerings or always-trapping functions.
#[derive(Copy, Clone, Hash, Eq, PartialEq)]
enum Def {
Adapter(dfg::AdapterId),
Instance(dfg::InstanceId),
}
impl PartitionAdapterModules {
fn adapter(&mut self, dfg: &dfg::ComponentDfg, id: dfg::AdapterId, adapter: &Adapter) {
// Visit all dependencies of this adapter and if anything depends on
// the current adapter module in progress then a new adapter module is
// started.
self.adapter_options(dfg, &adapter.lift_options);
self.adapter_options(dfg, &adapter.lower_options);
self.core_def(dfg, &adapter.func);
// With all dependencies visited this adapter is added to the next
// module.
//
// This will either get added the preexisting module if this adapter
// didn't depend on anything in that module itself or it will be added
// to a fresh module if this adapter depended on something that the
// current adapter module created.
log::debug!("adding {id:?} to adapter module");
self.next_module.adapters.push(id);
}
fn adapter_options(&mut self, dfg: &dfg::ComponentDfg, options: &AdapterOptions) {
if let Some(memory) = &options.memory {
self.core_export(dfg, memory);
}
if let Some(def) = &options.realloc {
self.core_def(dfg, def);
}
if let Some(def) = &options.post_return {
self.core_def(dfg, def);
}
}
fn core_def(&mut self, dfg: &dfg::ComponentDfg, def: &dfg::CoreDef) {
match def {
dfg::CoreDef::Export(e) => self.core_export(dfg, e),
dfg::CoreDef::Adapter(id) => {
// If this adapter is already defined then we can safely depend
// on it with no consequences.
if self.defined_items.contains(&Def::Adapter(*id)) {
log::debug!("using existing adapter {id:?} ");
return;
}
log::debug!("splitting module needing {id:?} ");
// .. otherwise we found a case of an adapter depending on an
// adapter-module-in-progress meaning that the current adapter
// module must be completed and then a new one is started.
self.finish_adapter_module();
assert!(self.defined_items.contains(&Def::Adapter(*id)));
}
// These items can't transitively depend on an adapter
dfg::CoreDef::Trampoline(_) | dfg::CoreDef::InstanceFlags(_) => {}
}
}
fn core_export<T>(&mut self, dfg: &dfg::ComponentDfg, export: &dfg::CoreExport<T>) {
// When an adapter depends on an exported item it actually depends on
// the instance of that exported item. The caveat here is that the
// adapter not only depends on that particular instance, but also all
// prior instances to that instance as well because instance
// instantiation order is fixed and cannot change.
//
// To model this the instance index space is looped over here and while
// an instance hasn't been visited it's visited. Note that if an
// instance has already been visited then all prior instances have
// already been visited so there's no need to continue.
let mut instance = export.instance;
while self.defined_items.insert(Def::Instance(instance)) {
self.instance(dfg, instance);
if instance.as_u32() == 0 {
break;
}
instance = dfg::InstanceId::from_u32(instance.as_u32() - 1);
}
}
fn instance(&mut self, dfg: &dfg::ComponentDfg, instance: dfg::InstanceId) {
log::debug!("visiting instance {instance:?}");
// ... otherwise if this is the first timet he instance has been seen
// then the instances own arguments are recursively visited to find
// transitive dependencies on adapters.
match &dfg.instances[instance] {
dfg::Instance::Static(_, args) => {
for arg in args.iter() {
self.core_def(dfg, arg);
}
}
dfg::Instance::Import(_, args) => {
for (_, values) in args {
for (_, def) in values {
self.core_def(dfg, def);
}
}
}
}
}
fn finish_adapter_module(&mut self) {
if self.next_module.adapters.is_empty() {
return;
}
// Reset the state of the current module-in-progress and then flag all
// pending adapters as now defined since the current module is being
// committed.
let module = mem::take(&mut self.next_module);
for adapter in module.adapters.iter() {
let inserted = self.defined_items.insert(Def::Adapter(*adapter));
assert!(inserted);
}
let idx = self.adapter_modules.push(module);
log::debug!("finishing adapter module {idx:?}");
}
}