wasmtime_environ/component/translate/
adapt.rs

//! Identification and creation of fused adapter modules in Wasmtime.
//!
//! A major piece of the component model is the ability for core wasm modules to
//! talk to each other through the use of lifted and lowered functions. For
//! example one core wasm module can export a function which is lifted. Another
//! component could import that lifted function, lower it, and pass it as the
//! import to another core wasm module. This is what Wasmtime calls "adapter
//! fusion" where two core wasm functions are coming together through the
//! component model.
//!
//! There are a few ingredients during adapter fusion:
//!
//! * A core wasm function which is "lifted".
//! * A "lift type" which is the type that the component model function had in
//!   the original component
//! * A "lower type" which is the type that the component model function has
//!   in the destination component (the one that uses `canon lower`)
//! * Configuration options for both the lift and the lower operations such as
//!   memories, reallocs, etc.
//!
//! With these ingredients combined Wasmtime must produce a function which
//! connects the two components through the options specified. The fused adapter
//! performs tasks such as validation of passed values, copying data between
//! linear memories, etc.
//!
//! Wasmtime's current implementation of fused adapters is designed to reduce
//! complexity elsewhere as much as possible while also being suitable for being
//! used as a polyfill for the component model in JS environments as well. To
//! that end Wasmtime implements a fused adapter with another wasm module that
//! it itself generates on the fly. The usage of WebAssembly for fused adapters
//! has a number of advantages:
//!
//! * There is no need to create a raw Cranelift-based compiler. This is where
//!   the majority of "unsafety" lives in Wasmtime, so reducing the need to
//!   lean on this or audit another compiler is predicted to weed out a whole
//!   class of bugs in the fused adapter compiler.
//!
//! * As mentioned above generation of WebAssembly modules means that this is
//!   suitable for use in JS environments. For example a hypothetical tool which
//!   polyfills a component onto the web today would need to do something for
//!   adapter modules, and ideally the adapters themselves are speedy. While
//!   this could all be written in JS the adapting process is quite nontrivial
//!   so sharing code with Wasmtime would be ideal.
//!
//! * Using WebAssembly insulates the implementation from bugs to a certain
//!   degree. While logic bugs are still possible it should be much more
//!   difficult to have segfaults or things like that. With adapters exclusively
//!   executing inside a WebAssembly sandbox like everything else the failure
//!   modes to the host at least should be minimized.
//!
//! * Integration into the runtime is relatively simple: the adapter modules are
//!   just another kind of wasm module to instantiate and wire up at runtime.
//!   The goal is that the `GlobalInitializer` list that is processed at runtime
//!   will have all of its `Adapter`-using variants erased by the time it makes
//!   its way all the way up to Wasmtime. This means that the support in
//!   Wasmtime prior to adapter modules is actually the same as the support
//!   after adapter modules are added, keeping the runtime fiddly bits quite
//!   minimal.
//!
//! This isn't to say that this approach is without its disadvantages of
//! course. For now though this seems to be a reasonable set of tradeoffs for
//! the development stage of the component model proposal.
//!
//! ## Creating adapter modules
//!
//! With WebAssembly itself being used to implement fused adapters, Wasmtime
//! still has the question of how to organize the adapter functions into actual
//! wasm modules.
//!
//! The first thing you might reach for is to put all the adapters into the same
//! wasm module. This cannot be done, however, because some adapters may depend
//! on other adapters (transitively) to be created. This means that if
//! everything were in the same module there would be no way to instantiate the
//! module. An example of this dependency is an adapter (A) used to create a
//! core wasm instance (M) whose exported memory is then referenced by another
//! adapter (B). In this situation the adapter B cannot be in the same module
//! as adapter A because B needs the memory of M but M is created with A which
//! would otherwise create a circular dependency.
//!
//! The second possibility of organizing adapter modules would be to place each
//! fused adapter into its own module. Each `canon lower` would effectively
//! become a core wasm module instantiation at that point. While this works it's
//! currently believed to be a bit too fine-grained. For example it would mean
//! that importing a dozen lowered functions into a module could possibly result
//! in up to a dozen different adapter modules. While this possibility could
//! work it has been ruled out as "probably too expensive at runtime".
//!
//! Thus the purpose and existence of this module is now evident -- this module
//! exists to identify what exactly goes into which adapter module. This will
//! evaluate the `GlobalInitializer` lists coming out of the `inline` pass and
//! insert `InstantiateModule` entries for where adapter modules should be
//! created.
//!
//! ## Partitioning adapter modules
//!
//! Currently this module does not attempt to be really all that fancy about
//! grouping adapters into adapter modules. The main idea is that most items
//! within an adapter module are likely to be close together since they're
//! theoretically going to be used for an instantiation of a core wasm module
//! just after the fused adapter was declared. With that in mind the current
//! algorithm is a one-pass approach to partitioning everything into adapter
//! modules.
//!
//! Adapters were identified in-order as part of the inlining phase of
//! translation where we're guaranteed that once an adapter is identified
//! it can't depend on anything identified later. The pass implemented here is
//! to visit all transitive dependencies of an adapter. If one of the
//! dependencies of an adapter is an adapter in the current adapter module
//! being built then the current module is finished and a new adapter module is
//! started. This should quickly partition adapters into contiguous chunks of
//! their index space which can be in adapter modules together.
//!
//! There are probably more general algorithms for this but for now this should
//! be fast enough as it's "just" a linear pass. As we get more components over
//! time this may want to be revisited if too many adapter modules are being
//! created.
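//!
//! As an illustration, the one-pass approach described above can be sketched
//! in isolation. The sketch below is a simplified model and not the
//! implementation in this module: the `partition` function and its `deps`
//! input are hypothetical, adapters are plain `usize` indices, and the
//! transitive adapter dependencies of each adapter (which the real pass
//! discovers by walking options, core definitions, and instances) are assumed
//! to be given up front.
//!
//! ```rust
//! use std::collections::HashSet;
//!
//! /// Hypothetical model: `deps[i]` lists the earlier adapters that adapter
//! /// `i` transitively depends on. Returns the adapters grouped into modules.
//! fn partition(deps: &[Vec<usize>]) -> Vec<Vec<usize>> {
//!     let mut finished: Vec<Vec<usize>> = Vec::new();
//!     let mut current: Vec<usize> = Vec::new();
//!     let mut defined: HashSet<usize> = HashSet::new();
//!
//!     for (id, adapter_deps) in deps.iter().enumerate() {
//!         for dep in adapter_deps {
//!             // Adapters are identified in order, so dependencies always
//!             // point backwards in the index space.
//!             assert!(*dep < id, "adapters may only depend on earlier ones");
//!             if !defined.contains(dep) {
//!                 // `dep` lives in the module in progress, so that module
//!                 // is finished and its adapters become "defined".
//!                 defined.extend(current.iter().copied());
//!                 finished.push(std::mem::take(&mut current));
//!             }
//!         }
//!         // With all dependencies defined, the adapter joins the module in
//!         // progress (possibly a fresh one).
//!         current.push(id);
//!     }
//!     if !current.is_empty() {
//!         finished.push(current);
//!     }
//!     finished
//! }
//! ```
//!
//! With the dependencies `[[], [], [0], [2], []]` this sketch groups the
//! adapters as `[[0, 1], [2], [3, 4]]`: adapter 2 needs adapter 0, which is
//! still in the module in progress, so that module is finished first, and the
//! same happens again when adapter 3 needs adapter 2.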

use crate::component::translate::*;
use crate::fact;
use crate::EntityType;
use std::collections::HashSet;

/// Metadata information about a fused adapter.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct Adapter {
    /// The type used when the original core wasm function was lifted.
    ///
    /// Note that this could be different than `lower_ty` (but still matches
    /// according to subtyping rules).
    pub lift_ty: TypeFuncIndex,
    /// Canonical ABI options used when the function was lifted.
    pub lift_options: AdapterOptions,
    /// The type used when the function was lowered back into a core wasm
    /// function.
    ///
    /// Note that this could be different than `lift_ty` (but still matches
    /// according to subtyping rules).
    pub lower_ty: TypeFuncIndex,
    /// Canonical ABI options used when the function was lowered.
    pub lower_options: AdapterOptions,
    /// The original core wasm function which was lifted.
    pub func: dfg::CoreDef,
}

/// Configuration options which can be specified as part of the canonical ABI
/// in the component model.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct AdapterOptions {
    /// The Wasmtime-assigned component instance index where the options were
    /// originally specified.
    pub instance: RuntimeComponentInstanceIndex,
    /// How strings are encoded.
    pub string_encoding: StringEncoding,
    /// An optional memory definition supplied.
    pub memory: Option<dfg::CoreExport<MemoryIndex>>,
    /// If `memory` is specified, whether it's a 64-bit memory.
    pub memory64: bool,
    /// An optional definition of `realloc` to use.
    pub realloc: Option<dfg::CoreDef>,
    /// An optional definition of a `post-return` to use.
    pub post_return: Option<dfg::CoreDef>,
}

impl<'data> Translator<'_, 'data> {
    /// This is the entrypoint of functionality within this module which
    /// performs all the work of identifying adapter usages and organizing
    /// everything into adapter modules.
    ///
    /// This will mutate the provided `component` in-place and fill out the dfg
    /// metadata for adapter modules.
    pub(super) fn partition_adapter_modules(&mut self, component: &mut dfg::ComponentDfg) {
        // Visit each adapter, in order of its original definition, during the
        // partitioning. This allows for the guarantee that dependencies are
        // visited in a topological fashion ideally.
        let mut state = PartitionAdapterModules::default();
        for (id, adapter) in component.adapters.iter() {
            state.adapter(component, id, adapter);
        }
        state.finish_adapter_module();

        // Now that all adapters have been partitioned into modules this loop
        // generates a core wasm module for each adapter module, translates
        // the module using standard core wasm translation, and then fills out
        // the dfg metadata for each adapter.
        for (module_id, adapter_module) in state.adapter_modules.iter() {
            let mut module =
                fact::Module::new(self.types.types(), self.tunables.debug_adapter_modules);
            let mut names = Vec::with_capacity(adapter_module.adapters.len());
            for adapter in adapter_module.adapters.iter() {
                let name = format!("adapter{}", adapter.as_u32());
                module.adapt(&name, &component.adapters[*adapter]);
                names.push(name);
            }
            let wasm = module.encode();
            let imports = module.imports().to_vec();

            // Extend the lifetime of the owned `wasm: Vec<u8>` on the stack to
            // a higher scope defined by our original caller. That allows
            // transforming `wasm` into `&'data [u8]` which is much easier to
            // work with here.
            let wasm = &*self.scope_vec.push(wasm);
            if log::log_enabled!(log::Level::Trace) {
                match wasmprinter::print_bytes(wasm) {
                    Ok(s) => log::trace!("generated adapter module:\n{}", s),
                    Err(e) => log::trace!("failed to print adapter module: {}", e),
                }
            }

            // With the wasm binary this is then pushed through general
            // translation, validation, etc. Note that multi-memory is
            // specifically enabled here since the adapter module is highly
            // likely to use that if anything is actually indirected through
            // memory.
            self.validator.reset();
            let translation = ModuleEnvironment::new(
                self.tunables,
                &mut self.validator,
                self.types.module_types_builder(),
            )
            .translate(Parser::new(0), wasm)
            .expect("invalid adapter module generated");

            // Record, for each adapter in this adapter module, the module that
            // the adapter was placed within as well as the function index of
            // the adapter in the wasm module generated. Note that adapters are
            // partitioned in-order so we're guaranteed to push the adapters
            // in-order here as well. (with an assert to double-check)
            for (adapter, name) in adapter_module.adapters.iter().zip(&names) {
                let index = translation.module.exports[name];
                let i = component.adapter_paritionings.push((module_id, index));
                assert_eq!(i, *adapter);
            }

            // Finally the metadata necessary to instantiate this adapter
            // module is also recorded in the dfg. This metadata will be used
            // to generate `GlobalInitializer` entries during the final
            // linearization phase.
            assert_eq!(imports.len(), translation.module.imports().len());
            let args = imports
                .iter()
                .zip(translation.module.imports())
                .map(|(arg, (_, _, ty))| fact_import_to_core_def(component, arg, ty))
                .collect::<Vec<_>>();
            let static_index = self.static_modules.push(translation);
            let id = component.adapter_modules.push((static_index, args.into()));
            assert_eq!(id, module_id);
        }
    }
}

fn fact_import_to_core_def(
    dfg: &mut dfg::ComponentDfg,
    import: &fact::Import,
    ty: EntityType,
) -> dfg::CoreDef {
    let mut simple_intrinsic = |trampoline: dfg::Trampoline| {
        let signature = ty.unwrap_func();
        let index = dfg
            .trampolines
            .push((signature.unwrap_module_type_index(), trampoline));
        dfg::CoreDef::Trampoline(index)
    };
    match import {
        fact::Import::CoreDef(def) => def.clone(),
        fact::Import::Transcode {
            op,
            from,
            from64,
            to,
            to64,
        } => {
            fn unwrap_memory(def: &dfg::CoreDef) -> dfg::CoreExport<MemoryIndex> {
                match def {
                    dfg::CoreDef::Export(e) => e.clone().map_index(|i| match i {
                        EntityIndex::Memory(i) => i,
                        _ => unreachable!(),
                    }),
                    _ => unreachable!(),
                }
            }

            let from = dfg.memories.push(unwrap_memory(from));
            let to = dfg.memories.push(unwrap_memory(to));
            let signature = ty.unwrap_func();
            let index = dfg.trampolines.push((
                signature.unwrap_module_type_index(),
                dfg::Trampoline::Transcoder {
                    op: *op,
                    from,
                    from64: *from64,
                    to,
                    to64: *to64,
                },
            ));
            dfg::CoreDef::Trampoline(index)
        }
        fact::Import::ResourceTransferOwn => simple_intrinsic(dfg::Trampoline::ResourceTransferOwn),
        fact::Import::ResourceTransferBorrow => {
            simple_intrinsic(dfg::Trampoline::ResourceTransferBorrow)
        }
        fact::Import::ResourceEnterCall => simple_intrinsic(dfg::Trampoline::ResourceEnterCall),
        fact::Import::ResourceExitCall => simple_intrinsic(dfg::Trampoline::ResourceExitCall),
    }
}

#[derive(Default)]
struct PartitionAdapterModules {
    /// The next adapter module that's being created. This may be empty.
    next_module: AdapterModuleInProgress,

    /// The set of items which are known to be defined which the adapter module
    /// in progress is allowed to depend on.
    defined_items: HashSet<Def>,

    /// Finished adapter modules that won't be added to.
    ///
    /// In theory items could be added to preexisting modules here but to keep
    /// this pass linear this is never modified after insertion.
    adapter_modules: PrimaryMap<dfg::AdapterModuleId, AdapterModuleInProgress>,
}

#[derive(Default)]
struct AdapterModuleInProgress {
    /// The adapters which have been placed into this module.
    adapters: Vec<dfg::AdapterId>,
}

/// Items that adapters can depend on.
///
/// Note that this is somewhat of a flat list and is intended to mostly model
/// core wasm instances which are side-effectful unlike other host items like
/// lowerings or always-trapping functions.
#[derive(Copy, Clone, Hash, Eq, PartialEq)]
enum Def {
    Adapter(dfg::AdapterId),
    Instance(dfg::InstanceId),
}

impl PartitionAdapterModules {
    fn adapter(&mut self, dfg: &dfg::ComponentDfg, id: dfg::AdapterId, adapter: &Adapter) {
        // Visit all dependencies of this adapter and if anything depends on
        // the current adapter module in progress then a new adapter module is
        // started.
        self.adapter_options(dfg, &adapter.lift_options);
        self.adapter_options(dfg, &adapter.lower_options);
        self.core_def(dfg, &adapter.func);

        // With all dependencies visited this adapter is added to the next
        // module.
        //
        // This will either get added to the preexisting module if this adapter
        // didn't depend on anything in that module itself or it will be added
        // to a fresh module if this adapter depended on something that the
        // current adapter module created.
        log::debug!("adding {id:?} to adapter module");
        self.next_module.adapters.push(id);
    }

    fn adapter_options(&mut self, dfg: &dfg::ComponentDfg, options: &AdapterOptions) {
        if let Some(memory) = &options.memory {
            self.core_export(dfg, memory);
        }
        if let Some(def) = &options.realloc {
            self.core_def(dfg, def);
        }
        if let Some(def) = &options.post_return {
            self.core_def(dfg, def);
        }
    }

    fn core_def(&mut self, dfg: &dfg::ComponentDfg, def: &dfg::CoreDef) {
        match def {
            dfg::CoreDef::Export(e) => self.core_export(dfg, e),
            dfg::CoreDef::Adapter(id) => {
                // If this adapter is already defined then we can safely depend
                // on it with no consequences.
                if self.defined_items.contains(&Def::Adapter(*id)) {
                    log::debug!("using existing adapter {id:?} ");
                    return;
                }

                log::debug!("splitting module needing {id:?} ");

                // .. otherwise we found a case of an adapter depending on an
                // adapter-module-in-progress meaning that the current adapter
                // module must be completed and then a new one is started.
                self.finish_adapter_module();
                assert!(self.defined_items.contains(&Def::Adapter(*id)));
            }

            // These items can't transitively depend on an adapter
            dfg::CoreDef::Trampoline(_) | dfg::CoreDef::InstanceFlags(_) => {}
        }
    }

    fn core_export<T>(&mut self, dfg: &dfg::ComponentDfg, export: &dfg::CoreExport<T>) {
        // When an adapter depends on an exported item it actually depends on
        // the instance of that exported item. The caveat here is that the
        // adapter not only depends on that particular instance, but also all
        // prior instances to that instance as well because instance
        // instantiation order is fixed and cannot change.
        //
        // To model this the instance index space is looped over here and while
        // an instance hasn't been visited it's visited. Note that if an
        // instance has already been visited then all prior instances have
        // already been visited so there's no need to continue.
        let mut instance = export.instance;
        while self.defined_items.insert(Def::Instance(instance)) {
            self.instance(dfg, instance);
            if instance.as_u32() == 0 {
                break;
            }
            instance = dfg::InstanceId::from_u32(instance.as_u32() - 1);
        }
    }

    fn instance(&mut self, dfg: &dfg::ComponentDfg, instance: dfg::InstanceId) {
        log::debug!("visiting instance {instance:?}");

        // ... otherwise if this is the first time the instance has been seen
        // then the instance's own arguments are recursively visited to find
        // transitive dependencies on adapters.
        match &dfg.instances[instance] {
            dfg::Instance::Static(_, args) => {
                for arg in args.iter() {
                    self.core_def(dfg, arg);
                }
            }
            dfg::Instance::Import(_, args) => {
                for (_, values) in args {
                    for (_, def) in values {
                        self.core_def(dfg, def);
                    }
                }
            }
        }
    }

    fn finish_adapter_module(&mut self) {
        if self.next_module.adapters.is_empty() {
            return;
        }

        // Reset the state of the current module-in-progress and then flag all
        // pending adapters as now defined since the current module is being
        // committed.
        let module = mem::take(&mut self.next_module);
        for adapter in module.adapters.iter() {
            let inserted = self.defined_items.insert(Def::Adapter(*adapter));
            assert!(inserted);
        }
        let idx = self.adapter_modules.push(module);
        log::debug!("finishing adapter module {idx:?}");
    }
}