wasmtime_environ/component/translate/
adapt.rs

//! Identification and creation of fused adapter modules in Wasmtime.
//!
//! A major piece of the component model is the ability for core wasm modules to
//! talk to each other through the use of lifted and lowered functions. For
//! example one core wasm module can export a function which is lifted. Another
//! component could import that lifted function, lower it, and pass it as the
//! import to another core wasm module. This is what Wasmtime calls "adapter
//! fusion" where two core wasm functions are coming together through the
//! component model.
//!
//! There are a few ingredients during adapter fusion:
//!
//! * A core wasm function which is "lifted".
//! * A "lift type" which is the type that the component model function had in
//!   the original component
//! * A "lower type" which is the type that the component model function has
//!   in the destination component (the one that uses `canon lower`)
//! * Configuration options for both the lift and the lower operations such as
//!   memories, reallocs, etc.
//!
//! With these ingredients combined Wasmtime must produce a function which
//! connects the two components through the options specified. The fused adapter
//! performs tasks such as validation of passed values, copying data between
//! linear memories, etc.
//!
//! Wasmtime's current implementation of fused adapters is designed to reduce
//! complexity elsewhere as much as possible while also being suitable for being
//! used as a polyfill for the component model in JS environments as well. To
//! that end Wasmtime implements a fused adapter with another wasm module that
//! it itself generates on the fly. The usage of WebAssembly for fused adapters
//! has a number of advantages:
//!
//! * There is no need to create a raw Cranelift-based compiler. This is where
//!   the majority of "unsafety" lives in Wasmtime so reducing the need to lean
//!   on this or audit another compiler is predicted to weed out a whole class
//!   of bugs in the fused adapter compiler.
//!
//! * As mentioned above generation of WebAssembly modules means that this is
//!   suitable for use in JS environments. For example a hypothetical tool which
//!   polyfills a component onto the web today would need to do something for
//!   adapter modules, and ideally the adapters themselves are speedy. While
//!   this could all be written in JS the adapting process is quite nontrivial
//!   so sharing code with Wasmtime would be ideal.
//!
//! * Using WebAssembly insulates the implementation from bugs to a certain
//!   degree. While logic bugs are still possible it should be much more
//!   difficult to have segfaults or things like that. With adapters exclusively
//!   executing inside a WebAssembly sandbox like everything else the failure
//!   modes to the host at least should be minimized.
//!
//! * Integration into the runtime is relatively simple: the adapter modules are
//!   just another kind of wasm module to instantiate and wire up at runtime.
//!   The goal is that the `GlobalInitializer` list that is processed at runtime
//!   will have all of its `Adapter`-using variants erased by the time it makes
//!   its way all the way up to Wasmtime. This means that the support in
//!   Wasmtime prior to adapter modules is actually the same as the support
//!   after adapter modules are added, keeping the runtime fiddly bits quite
//!   minimal.
//!
//! This isn't to say that this approach is without its disadvantages of
//! course. For now though this seems to be a reasonable set of tradeoffs for
//! the development stage of the component model proposal.
//!
//! ## Creating adapter modules
//!
//! With WebAssembly itself being used to implement fused adapters, Wasmtime
//! still has the question of how to organize the adapter functions into actual
//! wasm modules.
//!
//! The first thing you might reach for is to put all the adapters into the same
//! wasm module. This cannot be done, however, because some adapters may depend
//! on other adapters (transitively) to be created. This means that if
//! everything were in the same module there would be no way to instantiate the
//! module. An example of this dependency is an adapter (A) used to create a
//! core wasm instance (M) whose exported memory is then referenced by another
//! adapter (B). In this situation the adapter B cannot be in the same module
//! as adapter A because B needs the memory of M but M is created with A which
//! would otherwise create a circular dependency.
//!
//! The second possibility of organizing adapter modules would be to place each
//! fused adapter into its own module. Each `canon lower` would effectively
//! become a core wasm module instantiation at that point. While this works it's
//! currently believed to be a bit too fine-grained. For example it would mean
//! that importing a dozen lowered functions into a module could possibly result
//! in up to a dozen different adapter modules. While this possibility could
//! work it has been ruled out as "probably too expensive at runtime".
//!
//! Thus the purpose and existence of this module is now evident -- this module
//! exists to identify what exactly goes into which adapter module. This will
//! evaluate the `GlobalInitializer` lists coming out of the `inline` pass and
//! insert `InstantiateModule` entries for where adapter modules should be
//! created.
//!
//! ## Partitioning adapter modules
//!
//! Currently this module does not attempt to be really all that fancy about
//! grouping adapters into adapter modules. The main idea is that most items
//! within an adapter module are likely to be close together since they're
//! theoretically going to be used for an instantiation of a core wasm module
//! just after the fused adapter was declared. With that in mind the current
//! algorithm is a one-pass approach to partitioning everything into adapter
//! modules.
//!
//! Adapters were identified in-order as part of the inlining phase of
//! translation where we're guaranteed that once an adapter is identified
//! it can't depend on anything identified later. The pass implemented here is
//! to visit all transitive dependencies of an adapter. If one of the
//! dependencies of an adapter is an adapter in the current adapter module
//! being built then the current module is finished and a new adapter module is
//! started. This should quickly partition adapters into contiguous chunks of
//! their index space which can be in adapter modules together.
//!
//! There are probably more general algorithms for this but for now this should
//! be fast enough as it's "just" a linear pass. As we get more components over
//! time this may want to be revisited if too many adapter modules are being
//! created.
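
The one-pass partitioning described above can be sketched in isolation. This is a simplified model and not the code in this file: adapters are plain `usize` indices, each adapter's transitive dependencies are already flattened into a list of earlier adapter indices (the real pass discovers them by walking `CoreDef`s and instances), and the function name `partition` is hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified model of the one-pass partitioning.
///
/// `deps[i]` lists the (earlier) adapters that adapter `i` transitively
/// depends on. The result is the list of adapter modules, each a contiguous
/// run of adapter indices.
pub fn partition(deps: &[Vec<usize>]) -> Vec<Vec<usize>> {
    // Finished adapter modules; never modified after being pushed.
    let mut modules: Vec<Vec<usize>> = Vec::new();
    // The adapter module currently being built.
    let mut current: Vec<usize> = Vec::new();
    // Adapters that live in an already-finished module.
    let mut defined: HashSet<usize> = HashSet::new();

    for (id, adapter_deps) in deps.iter().enumerate() {
        for dep in adapter_deps {
            assert!(*dep < id, "adapters may only depend on earlier adapters");
            if !defined.contains(dep) {
                // `dep` is in the module in progress, so that module must be
                // finished before `id` can be placed anywhere.
                defined.extend(current.iter().copied());
                modules.push(std::mem::take(&mut current));
            }
        }
        current.push(id);
    }
    if !current.is_empty() {
        modules.push(current);
    }
    modules
}
```

For example, with five adapters where adapter 2 depends on adapter 0 and adapter 4 depends on adapters 2 and 3, this sketch produces three modules: `[0, 1]`, `[2, 3]`, and `[4]`.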

use crate::EntityType;
use crate::component::translate::*;
use crate::fact;
use std::collections::HashSet;

/// Metadata information about a fused adapter.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct Adapter {
    /// The type used when the original core wasm function was lifted.
    ///
    /// Note that this could be different than `lower_ty` (but still matches
    /// according to subtyping rules).
    pub lift_ty: TypeFuncIndex,
    /// Canonical ABI options used when the function was lifted.
    pub lift_options: AdapterOptions,
    /// The type used when the function was lowered back into a core wasm
    /// function.
    ///
    /// Note that this could be different than `lift_ty` (but still matches
    /// according to subtyping rules).
    pub lower_ty: TypeFuncIndex,
    /// Canonical ABI options used when the function was lowered.
    pub lower_options: AdapterOptions,
    /// The original core wasm function which was lifted.
    pub func: dfg::CoreDef,
}

/// The data model for objects that are not unboxed in locals.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub enum DataModel {
    /// Data is stored in GC objects.
    Gc {},

    /// Data is stored in a linear memory.
    LinearMemory {
        /// An optional memory definition supplied.
        memory: Option<dfg::CoreExport<MemoryIndex>>,
        /// If `memory` is specified, whether it's a 64-bit memory.
        memory64: bool,
        /// An optional definition of `realloc` to use.
        realloc: Option<dfg::CoreDef>,
    },
}

/// Configuration options which can be specified as part of the canonical ABI
/// in the component model.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct AdapterOptions {
    /// The Wasmtime-assigned component instance index where the options were
    /// originally specified.
    pub instance: RuntimeComponentInstanceIndex,
    /// The ancestors (i.e. chain of instantiating instances) of the instance
    /// specified in the `instance` field.
    pub ancestors: Vec<RuntimeComponentInstanceIndex>,
    /// How strings are encoded.
    pub string_encoding: StringEncoding,
    /// The async callback function used by these options, if specified.
    pub callback: Option<dfg::CoreDef>,
    /// An optional definition of a `post-return` to use.
    pub post_return: Option<dfg::CoreDef>,
    /// Whether to use the async ABI for lifting or lowering.
    pub async_: bool,
    /// Whether or not this intrinsic can consume a task cancellation
    /// notification.
    pub cancellable: bool,
    /// The core function type that is being lifted from / lowered to.
    pub core_type: ModuleInternedTypeIndex,
    /// The data model used by this adapter: linear memory or GC objects.
    pub data_model: DataModel,
}

impl<'data> Translator<'_, 'data> {
    /// This is the entrypoint of functionality within this module which
    /// performs all the work of identifying adapter usages and organizing
    /// everything into adapter modules.
    ///
    /// This will mutate the provided `component` in-place and fill out the dfg
    /// metadata for adapter modules.
    pub(super) fn partition_adapter_modules(&mut self, component: &mut dfg::ComponentDfg) {
        // Visit each adapter, in order of its original definition, during the
        // partitioning. This allows for the guarantee that dependencies are
        // visited in a topological fashion ideally.
        let mut state = PartitionAdapterModules::default();
        for (id, adapter) in component.adapters.iter() {
            state.adapter(component, id, adapter);
        }
        state.finish_adapter_module();

        // Now that all adapters have been partitioned into modules this loop
        // generates a core wasm module for each adapter module, translates
        // the module using standard core wasm translation, and then fills out
        // the dfg metadata for each adapter.
        for (module_id, adapter_module) in state.adapter_modules.iter() {
            let mut module = fact::Module::new(self.types.types(), self.tunables);
            let mut names = Vec::with_capacity(adapter_module.adapters.len());
            for adapter in adapter_module.adapters.iter() {
                let name = format!("adapter{}", adapter.as_u32());
                module.adapt(&name, &component.adapters[*adapter]);
                names.push(name);
            }
            let wasm = module.encode();
            let imports = module.imports().to_vec();

            // Extend the lifetime of the owned `wasm: Vec<u8>` on the stack to
            // a higher scope defined by our original caller. That allows
            // transforming `wasm` into `&'data [u8]`, which is much easier to
            // work with here.
            let wasm = &*self.scope_vec.push(wasm);
            if log::log_enabled!(log::Level::Trace) {
                match wasmprinter::print_bytes(wasm) {
                    Ok(s) => log::trace!("generated adapter module:\n{s}"),
                    Err(e) => log::trace!("failed to print adapter module: {e}"),
                }
            }

            // With the wasm binary this is then pushed through general
            // translation, validation, etc. Note that multi-memory is
            // specifically enabled here since the adapter module is highly
            // likely to use that if anything is actually indirected through
            // memory.
            self.validator.reset();
            let static_module_index = self.static_modules.next_key();
            let translation = ModuleEnvironment::new(
                self.tunables,
                &mut self.validator,
                self.types.module_types_builder(),
                static_module_index,
            )
            .translate(Parser::new(0), wasm)
            .expect("invalid adapter module generated");

            // Record, for each adapter in this adapter module, the module that
            // the adapter was placed within as well as the function index of
            // the adapter in the wasm module generated. Note that adapters are
            // partitioned in-order so we're guaranteed to push the adapters
            // in-order here as well. (with an assert to double-check)
            for (adapter, name) in adapter_module.adapters.iter().zip(&names) {
                let index = translation.module.exports[name];
                let i = component.adapter_partitionings.push((module_id, index));
                assert_eq!(i, *adapter);
            }

            // Finally the metadata necessary to instantiate this adapter
            // module is also recorded in the dfg. This metadata will be used
            // to generate `GlobalInitializer` entries during the final
            // linearization phase.
            assert_eq!(imports.len(), translation.module.imports().len());
            let args = imports
                .iter()
                .zip(translation.module.imports())
                .map(|(arg, (_, _, ty))| fact_import_to_core_def(component, arg, ty))
                .collect::<Vec<_>>();
            let static_module_index2 = self.static_modules.push(translation);
            assert_eq!(static_module_index, static_module_index2);
            let id = component.adapter_modules.push((static_module_index, args));
            assert_eq!(id, module_id);
        }
    }
}

fn fact_import_to_core_def(
    dfg: &mut dfg::ComponentDfg,
    import: &fact::Import,
    ty: EntityType,
) -> dfg::CoreDef {
    fn unwrap_memory(def: &dfg::CoreDef) -> dfg::CoreExport<MemoryIndex> {
        match def {
            dfg::CoreDef::Export(e) => e.clone().map_index(|i| match i {
                EntityIndex::Memory(i) => i,
                _ => unreachable!(),
            }),
            _ => unreachable!(),
        }
    }

    let mut simple_intrinsic = |trampoline: dfg::Trampoline| {
        let signature = ty.unwrap_func();
        let index = dfg
            .trampolines
            .push((signature.unwrap_module_type_index(), trampoline));
        dfg::CoreDef::Trampoline(index)
    };
    match import {
        fact::Import::CoreDef(def) => def.clone(),
        fact::Import::Transcode {
            op,
            from,
            from64,
            to,
            to64,
        } => {
            let from = dfg.memories.push(unwrap_memory(from));
            let to = dfg.memories.push(unwrap_memory(to));
            let signature = ty.unwrap_func();
            let index = dfg.trampolines.push((
                signature.unwrap_module_type_index(),
                dfg::Trampoline::Transcoder {
                    op: *op,
                    from,
                    from64: *from64,
                    to,
                    to64: *to64,
                },
            ));
            dfg::CoreDef::Trampoline(index)
        }
        fact::Import::ResourceTransferOwn => simple_intrinsic(dfg::Trampoline::ResourceTransferOwn),
        fact::Import::ResourceTransferBorrow => {
            simple_intrinsic(dfg::Trampoline::ResourceTransferBorrow)
        }
        fact::Import::ResourceEnterCall => simple_intrinsic(dfg::Trampoline::ResourceEnterCall),
        fact::Import::ResourceExitCall => simple_intrinsic(dfg::Trampoline::ResourceExitCall),
        fact::Import::PrepareCall { memory } => simple_intrinsic(dfg::Trampoline::PrepareCall {
            memory: memory.as_ref().map(|v| dfg.memories.push(unwrap_memory(v))),
        }),
        fact::Import::SyncStartCall { callback } => {
            simple_intrinsic(dfg::Trampoline::SyncStartCall {
                callback: callback.clone().map(|v| dfg.callbacks.push(v)),
            })
        }
        fact::Import::AsyncStartCall {
            callback,
            post_return,
        } => simple_intrinsic(dfg::Trampoline::AsyncStartCall {
            callback: callback.clone().map(|v| dfg.callbacks.push(v)),
            post_return: post_return.clone().map(|v| dfg.post_returns.push(v)),
        }),
        fact::Import::FutureTransfer => simple_intrinsic(dfg::Trampoline::FutureTransfer),
        fact::Import::StreamTransfer => simple_intrinsic(dfg::Trampoline::StreamTransfer),
        fact::Import::ErrorContextTransfer => {
            simple_intrinsic(dfg::Trampoline::ErrorContextTransfer)
        }
        fact::Import::Trap => simple_intrinsic(dfg::Trampoline::Trap),
        fact::Import::EnterSyncCall => simple_intrinsic(dfg::Trampoline::EnterSyncCall),
        fact::Import::ExitSyncCall => simple_intrinsic(dfg::Trampoline::ExitSyncCall),
    }
}

#[derive(Default)]
struct PartitionAdapterModules {
    /// The next adapter module that's being created. This may be empty.
    next_module: AdapterModuleInProgress,

    /// The set of items which are known to be defined which the adapter module
    /// in progress is allowed to depend on.
    defined_items: HashSet<Def>,

    /// Finished adapter modules that won't be added to.
    ///
    /// In theory items could be added to preexisting modules here but to keep
    /// this pass linear this is never modified after insertion.
    adapter_modules: PrimaryMap<dfg::AdapterModuleId, AdapterModuleInProgress>,
}

#[derive(Default)]
struct AdapterModuleInProgress {
    /// The adapters which have been placed into this module.
    adapters: Vec<dfg::AdapterId>,
}

/// Items that adapters can depend on.
///
/// Note that this is somewhat of a flat list and is intended to mostly model
/// core wasm instances which are side-effectful unlike other host items like
/// lowerings or always-trapping functions.
#[derive(Copy, Clone, Hash, Eq, PartialEq)]
enum Def {
    Adapter(dfg::AdapterId),
    Instance(dfg::InstanceId),
}

impl PartitionAdapterModules {
    fn adapter(&mut self, dfg: &dfg::ComponentDfg, id: dfg::AdapterId, adapter: &Adapter) {
        // Visit all dependencies of this adapter and if anything depends on
        // the current adapter module in progress then a new adapter module is
        // started.
        self.adapter_options(dfg, &adapter.lift_options);
        self.adapter_options(dfg, &adapter.lower_options);
        self.core_def(dfg, &adapter.func);

        // With all dependencies visited this adapter is added to the next
        // module.
        //
        // This will either get added to the preexisting module if this adapter
        // didn't depend on anything in that module itself or it will be added
        // to a fresh module if this adapter depended on something that the
        // current adapter module created.
        log::debug!("adding {id:?} to adapter module");
        self.next_module.adapters.push(id);
    }

    fn adapter_options(&mut self, dfg: &dfg::ComponentDfg, options: &AdapterOptions) {
        if let Some(def) = &options.callback {
            self.core_def(dfg, def);
        }
        if let Some(def) = &options.post_return {
            self.core_def(dfg, def);
        }
        match &options.data_model {
            DataModel::Gc {} => {
                // Nothing to do here yet.
            }
            DataModel::LinearMemory {
                memory,
                memory64: _,
                realloc,
            } => {
                if let Some(memory) = memory {
                    self.core_export(dfg, memory);
                }
                if let Some(def) = realloc {
                    self.core_def(dfg, def);
                }
            }
        }
    }

    fn core_def(&mut self, dfg: &dfg::ComponentDfg, def: &dfg::CoreDef) {
        match def {
            dfg::CoreDef::Export(e) => self.core_export(dfg, e),
            dfg::CoreDef::Adapter(id) => {
                // If this adapter is already defined then we can safely depend
                // on it with no consequences.
                if self.defined_items.contains(&Def::Adapter(*id)) {
                    log::debug!("using existing adapter {id:?} ");
                    return;
                }

                log::debug!("splitting module needing {id:?} ");

                // .. otherwise we found a case of an adapter depending on an
                // adapter-module-in-progress meaning that the current adapter
                // module must be completed and then a new one is started.
                self.finish_adapter_module();
                assert!(self.defined_items.contains(&Def::Adapter(*id)));
            }

            // These items can't transitively depend on an adapter
            dfg::CoreDef::Trampoline(_)
            | dfg::CoreDef::InstanceFlags(_)
            | dfg::CoreDef::UnsafeIntrinsic(..)
            | dfg::CoreDef::TaskMayBlock => {}
        }
    }

    fn core_export<T>(&mut self, dfg: &dfg::ComponentDfg, export: &dfg::CoreExport<T>) {
        // When an adapter depends on an exported item it actually depends on
        // the instance of that exported item. The caveat here is that the
        // adapter not only depends on that particular instance, but also all
        // prior instances to that instance as well because instance
        // instantiation order is fixed and cannot change.
        //
        // To model this the instance index space is looped over here and while
        // an instance hasn't been visited it's visited. Note that if an
        // instance has already been visited then all prior instances have
        // already been visited so there's no need to continue.
        let mut instance = export.instance;
        while self.defined_items.insert(Def::Instance(instance)) {
            self.instance(dfg, instance);
            if instance.as_u32() == 0 {
                break;
            }
            instance = dfg::InstanceId::from_u32(instance.as_u32() - 1);
        }
    }

    fn instance(&mut self, dfg: &dfg::ComponentDfg, instance: dfg::InstanceId) {
        log::debug!("visiting instance {instance:?}");

        // This is the first time the instance has been seen, so the
        // instance's own arguments are recursively visited to find
        // transitive dependencies on adapters.
        match &dfg.instances[instance] {
            dfg::Instance::Static(_, args) => {
                for arg in args.iter() {
                    self.core_def(dfg, arg);
                }
            }
            dfg::Instance::Import(_, args) => {
                for (_, values) in args {
                    for (_, def) in values {
                        self.core_def(dfg, def);
                    }
                }
            }
        }
    }

    fn finish_adapter_module(&mut self) {
        if self.next_module.adapters.is_empty() {
            return;
        }

        // Reset the state of the current module-in-progress and then flag all
        // pending adapters as now defined since the current module is being
        // committed.
        let module = mem::take(&mut self.next_module);
        for adapter in module.adapters.iter() {
            let inserted = self.defined_items.insert(Def::Adapter(*adapter));
            assert!(inserted);
        }
        let idx = self.adapter_modules.push(module);
        log::debug!("finishing adapter module {idx:?}");
    }
}
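
The backwards walk over the instance index space in `core_export` depends on `HashSet::insert` returning `false` when the value is already present, which is what stops the loop early. The sketch below isolates that pattern under simplifying assumptions: instances are bare `u32` indices rather than `dfg::InstanceId`, the recursive visit of instance arguments is left out, and the function name `visit_instance_and_predecessors` is hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified model of the walk in `core_export`.
///
/// Marks `start` and every earlier instance as visited, stopping at the first
/// instance that was already visited. Returns the newly visited instances in
/// the order they were visited.
pub fn visit_instance_and_predecessors(visited: &mut HashSet<u32>, start: u32) -> Vec<u32> {
    let mut newly_visited = Vec::new();
    let mut instance = start;
    // `insert` returns `true` only for values not yet in the set, so the loop
    // ends as soon as a previously visited instance is reached.
    while visited.insert(instance) {
        newly_visited.push(instance);
        if instance == 0 {
            break;
        }
        instance -= 1;
    }
    newly_visited
}
```

Because every visit also covers all lower indices, the visited set always forms a prefix of the index space. A later call starting from a higher index therefore only touches the instances not seen before, and the total work across all calls stays linear in the number of instances.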