// cranelift_codegen/machinst/buffer.rs

//! In-memory representation of compiled machine code, with labels and fixups to
//! refer to those labels. Handles constant-pool island insertion and also
//! veneer insertion for out-of-range jumps.
//!
//! This code exists to solve three problems:
//!
//! - Branch targets for forward branches are not known until later, when we
//!   emit code in a single pass through the instruction structs.
//!
//! - On many architectures, address references or offsets have limited range.
//!   For example, on AArch64, conditional branches can only target code +/- 1MB
//!   from the branch itself.
//!
//! - The lowering of control flow from the CFG-with-edges produced by
//!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
//!   edge blocks when the register allocator does not need to insert any
//!   spills/reloads/moves in edge blocks, results in many suboptimal branch
//!   patterns. The lowering also pays no attention to block order, and so
//!   two-target conditional forms (cond-br followed by uncond-br) can often be
//!   avoided because one of the targets is the fallthrough. There are several
//!   cases here where we can simplify to use fewer branches.
//!
//! This "buffer" implements a single-pass code emission strategy (with a later
//! "fixup" pass, but only through recorded fixups, not all instructions). The
//! basic idea is:
//!
//! - Emit branches as they are, including two-target (cond/uncond) compound
//!   forms, but with zero offsets and optimistically assuming the target will be
//!   in range. Record the "fixup" for later. Targets are denoted instead by
//!   symbolic "labels" that are then bound to certain offsets in the buffer as
//!   we emit code. (Nominally, there is a label at the start of every basic
//!   block.)
//!
//! - As we do this, track the offset in the buffer at which the first label
//!   reference "goes out of range". We call this the "deadline". If we reach the
//!   deadline and we still have not bound the label to which an unresolved branch
//!   refers, we have a problem!
//!
//! - To solve this problem, we emit "islands" full of "veneers". An island is
//!   simply a chunk of code inserted in the middle of the code actually produced
//!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
//!   has some awareness of this: it either asks for an island between blocks, so
//!   it is not accidentally executed, or else it emits a branch around the island
//!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
//!
46//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47//!   that implements a longer-range reference to a label. The idea is that, for
48//!   example, a branch with a limited range can branch to a "veneer" instead,
49//!   which is simply a branch in a form that can use a longer-range reference. On
50//!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51//!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52//!   conditional branch's label reference can be fixed up with a "veneer" to
53//!   achieve a longer range.
//!
//! - To implement all of this, we require the backend to provide a `LabelUse`
//!   type that implements a trait. This is nominally an enum that records one of
//!   several kinds of references to an offset in code -- basically, a relocation
//!   type -- and will usually correspond to different instruction formats. The
//!   `LabelUse` implementation specifies the maximum range, how to patch in the
//!   actual label location when known, and how to generate a veneer to extend the
//!   range.
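//!
//! As a sketch of the veneer mechanism described above (the AArch64 ranges are
//! as stated; the layout and label names are illustrative):
//!
//! ```text
//!     b.cond veneer    ; +/- 1 MB reach: cannot reach `target`, but can
//!     ...              ; reach the island below
//!     ; --- island (placed between blocks, or branched around) ---
//! veneer:
//!     b target         ; +/- 128 MB reach: covers the real target
//! ```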
//!
//! That satisfies label references, but we still may have suboptimal branch
//! patterns. To clean up the branches, we do a simple "peephole"-style
//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
//! informs the buffer of branches in the code and, in the case of conditionals,
//! the code that would have been emitted to invert this branch's condition. We
//! track the "latest branches": these are branches that are contiguous up to
//! the current offset. (If any code is emitted after a branch, that branch or
//! run of contiguous branches is no longer "latest".) The latest branches are
//! those that we can edit by simply truncating the buffer and doing something
//! else instead.
//!
//! To optimize branches, we implement several simple rules, and try to apply
//! them to the "latest branches" when possible:
//!
//! - A branch with a label target, when that label is bound to the ending
//!   offset of the branch (the fallthrough location), can be removed altogether,
//!   because the branch would have no effect.
//!
//! - An unconditional branch that starts at a label location, and branches to
//!   another label, results in a "label alias": all references to the label bound
//!   *to* this branch instruction are instead resolved to the *target* of the
//!   branch instruction. This effectively removes empty blocks that just
//!   unconditionally branch to the next block. We call this "branch threading".
//!
//! - A conditional followed by an unconditional, when the conditional branches
//!   to the unconditional's fallthrough, results in (i) the truncation of the
//!   unconditional, (ii) the inversion of the conditional's condition, and
//!   (iii) replacement of the conditional's target (using the original target
//!   of the unconditional). This is a fancy way of saying "we can flip a
//!   two-target conditional branch's taken/not-taken targets if it works better
//!   with our fallthrough" (see the worked example after this list). To make
//!   this work, the emitter actually gives the buffer *both* forms of every
//!   conditional branch: the true form is emitted into the buffer, and the
//!   "inverted" machine-code bytes are provided as part of the branch-fixup
//!   metadata.
//!
//! - An unconditional branch B preceded by another unconditional branch P, when
//!   B's label(s) have been redirected to target(B), can be removed entirely.
//!   This is an extension of the branch-threading optimization, and is valid
//!   because if we know there will be no fallthrough into this branch
//!   instruction (the prior instruction is an unconditional jump), and if we
//!   know we have successfully redirected all labels, then this branch
//!   instruction is unreachable. Note that this works because the redirection
//!   happens before the label is ever resolved (fixups happen at island
//!   emission time, at which point latest-branches are cleared, or at the end
//!   of emission), so we are sure to catch and redirect all possible paths to
//!   this instruction.
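//!
//! As a worked example of the conditional/unconditional rule (block and label
//! names are illustrative):
//!
//! ```text
//!   ; before:               ; after:
//!     cond_br L_then          cond_br.inverted L_else
//!     b L_else              L_then:
//!   L_then:                   ...
//!     ...
//! ```
//!
//! The unconditional branch is truncated, the conditional's bytes are
//! overwritten with their pre-recorded inverted form, and the conditional's
//! fixup is re-pointed at `L_else`.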
//!
//! # Branch-optimization Correctness
//!
//! The branch-optimization mechanism depends on a few data structures with
//! invariants, which are always held outside the scope of top-level public
//! methods:
//!
//! - The latest-branches list. Each entry describes a span of the buffer
//!   (start/end offsets), the label target, the corresponding fixup-list entry
//!   index, and the bytes (must be the same length) for the inverted form, if
//!   conditional. The list of labels that are bound to the start-offset of this
//!   branch is *complete* (if any label has a resolved offset equal to `start`
//!   and is not an alias, it must appear in this list) and *precise* (no label
//!   in this list can be bound to another offset). No label in this list should
//!   be an alias. No two branch ranges can overlap, and branches are in
//!   ascending-offset order.
//!
//! - The labels-at-tail list. This contains all MachLabels that have been bound
//!   to (whose resolved offsets are equal to) the tail offset of the buffer.
//!   No label in this list should be an alias.
//!
//! - The label_offsets array, containing the bound offset of a label or
//!   UNKNOWN. No label can be bound at an offset greater than the current
//!   buffer tail.
//!
//! - The label_aliases array, containing another label to which a label is
//!   bound or UNKNOWN. A label's resolved offset is the resolved offset
//!   of the label it is aliased to, if this is set.
//!
//! We argue below, at each method, how the invariants in these data structures
//! are maintained (grep for "Post-invariant").
//!
//! Given these invariants, we argue why each optimization preserves execution
//! semantics below (grep for "Preserves execution semantics").
//!
//! # Avoiding Quadratic Behavior
//!
//! There are two cases where we've had to take some care to avoid
//! quadratic worst-case behavior:
//!
//! - The "labels at this branch" list can grow unboundedly if the
//!   code generator binds many labels at one location. If the count
//!   gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
//!   simply abort an optimization early in a way that is always correct
//!   but is conservative.
//!
//! - The fixup list can interact with island emission to create
//!   "quadratic island behavior". In a little more detail, one can hit
//!   this behavior by having some pending fixups (forward label
//!   references) with long-range label-use kinds, and some others
//!   with shorter-range references that nonetheless still are pending
//!   long enough to trigger island generation. In such a case, we
//!   process the fixup list, generate veneers to extend some forward
//!   references' ranges, but leave the other (longer-range) ones
//!   alone. The way this was implemented put them back on a list and
//!   resulted in quadratic behavior.
//!
//!   To avoid this, fixups are split into two lists: one "pending" list and one
//!   final list. The pending list is kept around for handling fixups related to
//!   branches so it can be edited/truncated. When an island is reached, which
//!   starts processing fixups, all pending fixups are flushed into the final
//!   list. The final list is a `BinaryHeap` which enables fixup processing to
//!   only process those which are required during island emission, deferring
//!   all longer-range fixups to later.
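//!
//! A minimal sketch of the heap ordering this relies on (illustrative; the
//! actual `Ord` implementation for `MachLabelFixup` lives later in this file):
//!
//! ```ignore
//! // `BinaryHeap` is a max-heap, so compare by *reversed* deadline: popping
//! // then yields the most urgent (smallest-deadline) fixup first, leaving
//! // longer-range fixups buried in the heap until they are actually needed.
//! impl<I: VCodeInst> Ord for MachLabelFixup<I> {
//!     fn cmp(&self, other: &Self) -> Ordering {
//!         other.deadline().cmp(&self.deadline())
//!     }
//! }
//! ```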

use crate::binemit::{Addend, CodeOffset, Reloc};
use crate::ir::function::FunctionParameters;
use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
use crate::isa::unwind::UnwindInst;
use crate::machinst::{
    BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
};
use crate::trace;
use crate::{MachInstEmitState, ir};
use crate::{VCodeConstantData, timing};
use core::ops::Range;
use cranelift_control::ControlPlane;
use cranelift_entity::{PrimaryMap, entity_impl};
use smallvec::SmallVec;
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::mem;
use std::string::String;
use std::vec::Vec;

#[cfg(feature = "enable-serde")]
use serde::{Deserialize, Serialize};

#[cfg(feature = "enable-serde")]
pub trait CompilePhase {
    type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
}

#[cfg(not(feature = "enable-serde"))]
pub trait CompilePhase {
    type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: core::fmt::Debug + PartialEq + Clone;
}

/// Status of a compiled artifact that needs patching before being used.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
pub struct Stencil;

/// Status of a compiled artifact ready to use.
#[derive(Clone, Debug, PartialEq)]
pub struct Final;

impl CompilePhase for Stencil {
    type MachSrcLocType = MachSrcLoc<Stencil>;
    type SourceLocType = RelSourceLoc;
}

impl CompilePhase for Final {
    type MachSrcLocType = MachSrcLoc<Final>;
    type SourceLocType = SourceLoc;
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ForceVeneers {
    Yes,
    No,
}

/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
/// in bulk.
///
/// This struct uses `SmallVec`s to support small-ish function bodies without
/// any heap allocation. As such, it will be several kilobytes large. This is
/// likely fine as long as it is stack-allocated for function emission then
/// thrown away; but beware if many buffer objects are retained persistently.
pub struct MachBuffer<I: VCodeInst> {
    /// The buffer contents, as raw bytes.
    data: SmallVec<[u8; 1024]>,
    /// The required alignment of this buffer.
    min_alignment: u32,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    relocs: SmallVec<[MachReloc; 16]>,
    /// Any trap records referring to this code.
    traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any exception-handler records referred to at call sites.
    exception_handlers: SmallVec<[MachExceptionHandler; 16]>,
    /// Any source location mappings referring to this code.
    srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
    /// Any user stack maps for this code.
    ///
    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
    /// by code offset, and each stack map covers `span` bytes on the stack.
    user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
    /// Any unwind info at a given location.
    unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The current source location in progress (after `start_srcloc()` and
    /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
    cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
    /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
    label_offsets: SmallVec<[CodeOffset; 16]>,
    /// Label aliases: when one label points to an unconditional jump, and that
    /// jump points to another label, we can redirect references to the first
    /// label immediately to the second.
    ///
    /// Invariant: we don't have label-alias cycles. We ensure this by,
    /// before setting label A to alias label B, resolving B's alias
    /// target (iteratively until a non-aliased label); if B is already
    /// aliased to A, then we cannot alias A back to B.
    label_aliases: SmallVec<[MachLabel; 16]>,
    /// Constants that must be emitted at some point.
    pending_constants: SmallVec<[VCodeConstant; 16]>,
    /// Byte size of all constants in `pending_constants`.
    pending_constants_size: CodeOffset,
    /// Traps that must be emitted at some point.
    pending_traps: SmallVec<[MachLabelTrap; 16]>,
    /// Fixups that haven't yet been flushed into `fixup_records` below and may
    /// be related to branches that are chomped. These all get added to
    /// `fixup_records` during island emission.
    pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
    /// The nearest upcoming deadline for entries in `pending_fixup_records`.
    pending_fixup_deadline: CodeOffset,
    /// Fixups that must be performed after all code is emitted.
    fixup_records: BinaryHeap<MachLabelFixup<I>>,
    /// Latest branches, to facilitate in-place editing for better fallthrough
    /// behavior and empty-block removal.
    latest_branches: SmallVec<[MachBranch; 4]>,
    /// All labels at the current offset (emission tail). This is lazily
    /// cleared: it is actually accurate as long as the current offset is
    /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
    /// be considered as empty.
    ///
    /// For correctness, this *must* be complete (i.e., the vector must contain
    /// all labels whose offsets are resolved to the current tail), because we
    /// rely on it to update labels when we truncate branches.
    labels_at_tail: SmallVec<[MachLabel; 4]>,
    /// The last offset at which `labels_at_tail` is valid. It is conceptually
    /// always describing the tail of the buffer, but we do not clear
    /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
    /// when the offset has grown past this (`labels_at_tail_off`) point.
    /// Always <= `cur_offset()`.
    labels_at_tail_off: CodeOffset,
    /// Metadata about all constants that this function has access to.
    ///
    /// This records the size/alignment of all constants (not the actual data)
    /// along with the last available label generated for the constant. This map
    /// is consulted when constants are referred to and the label assigned to a
    /// constant may change over time as well.
    constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
    /// All recorded usages of constants as pairs of the constant and where the
    /// constant needs to be placed within `self.data`. Note that the same
    /// constant may appear in this array multiple times if it was emitted
    /// multiple times.
    used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
    /// Indicates when a patchable region is currently open, to guard that it's
    /// not possible to nest patchable regions.
    open_patchable: bool,
}

impl MachBufferFinalized<Stencil> {
    /// Get a finalized machine buffer by applying the function's base source location.
    pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
        MachBufferFinalized {
            data: self.data,
            relocs: self.relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            exception_handlers: self.exception_handlers,
            srclocs: self
                .srclocs
                .into_iter()
                .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
                .collect(),
            user_stack_maps: self.user_stack_maps,
            unwind_info: self.unwind_info,
            alignment: self.alignment,
        }
    }
}

/// A `MachBuffer` once emission is completed: holds generated code and records,
/// without fixups. This allows the type to be independent of the backend.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachBufferFinalized<T: CompilePhase> {
    /// The buffer contents, as raw bytes.
    pub(crate) data: SmallVec<[u8; 1024]>,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
    /// Any trap records referring to this code.
    pub(crate) traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any exception-handler records referred to at call sites.
    pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,
    /// Any source location mappings referring to this code.
    pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
    /// Any user stack maps for this code.
    ///
    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
    /// by code offset, and each stack map covers `span` bytes on the stack.
    pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
    /// Any unwind info at a given location.
    pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The required alignment of this buffer.
    pub alignment: u32,
}

const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);

/// Threshold on max length of `labels_at_this_branch` list to avoid
/// unbounded quadratic behavior (see comment below at use-site).
const LABEL_LIST_THRESHOLD: usize = 100;

/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
/// the point at which it is used by emitted code; the buffer records "fixups"
/// for references to the label, and will come back and patch the code
/// appropriately when the label's location is eventually known.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct MachLabel(u32);
entity_impl!(MachLabel);

impl MachLabel {
    /// Get a label for a block. (The first N MachLabels are always reserved for
    /// the N blocks in the vcode.)
    pub fn from_block(bindex: BlockIndex) -> MachLabel {
        MachLabel(bindex.index() as u32)
    }

    /// Creates a string representing this label, for convenience.
    pub fn to_string(&self) -> String {
        format!("label{}", self.0)
    }
}

impl Default for MachLabel {
    fn default() -> Self {
        UNKNOWN_LABEL
    }
}

/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
/// the [`OpenPatchRegion`] token in the process.
pub struct OpenPatchRegion(usize);

/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
/// of where you might want to use this is for patching instructions that mention constants that
/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
/// bytes, and the constant's uses can be updated directly.
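///
/// A sketch of the intended workflow (the 4-byte placeholder and `value` are
/// illustrative):
///
/// ```ignore
/// let open = buffer.start_patchable();
/// // ... emit instructions that embed a placeholder constant ...
/// let region = buffer.end_patchable(open);
/// // ... later, once the real value is known ...
/// let bytes = region.patch(&mut buffer);
/// bytes[0..4].copy_from_slice(&value.to_le_bytes());
/// ```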
pub struct PatchRegion {
    range: Range<usize>,
}

impl PatchRegion {
    /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
    pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
        &mut buffer.data[self.range]
    }
}

impl<I: VCodeInst> MachBuffer<I> {
    /// Create a new, empty `MachBuffer`.
    pub fn new() -> MachBuffer<I> {
        MachBuffer {
            data: SmallVec::new(),
            min_alignment: I::function_alignment().minimum,
            relocs: SmallVec::new(),
            traps: SmallVec::new(),
            call_sites: SmallVec::new(),
            exception_handlers: SmallVec::new(),
            srclocs: SmallVec::new(),
            user_stack_maps: SmallVec::new(),
            unwind_info: SmallVec::new(),
            cur_srcloc: None,
            label_offsets: SmallVec::new(),
            label_aliases: SmallVec::new(),
            pending_constants: SmallVec::new(),
            pending_constants_size: 0,
            pending_traps: SmallVec::new(),
            pending_fixup_records: SmallVec::new(),
            pending_fixup_deadline: u32::MAX,
            fixup_records: Default::default(),
            latest_branches: SmallVec::new(),
            labels_at_tail: SmallVec::new(),
            labels_at_tail_off: 0,
            constants: Default::default(),
            used_constants: Default::default(),
            open_patchable: false,
        }
    }

    /// Current offset from start of buffer.
    pub fn cur_offset(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Add a byte.
    pub fn put1(&mut self, value: u8) {
        self.data.push(value);

        // Post-invariant: conceptual-labels_at_tail contains a complete and
        // precise list of labels bound at `cur_offset()`. We have advanced
        // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
        // before, it is not anymore (and it cannot become equal, because
        // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
        // conceptually empty (even though it is only lazily cleared). No labels
        // can be bound at this new offset (by invariant on `label_offsets`).
        // Hence the invariant holds.
    }

    /// Add 2 bytes.
    pub fn put2(&mut self, value: u16) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 4 bytes.
    pub fn put4(&mut self, value: u32) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 8 bytes.
    pub fn put8(&mut self, value: u64) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add a slice of bytes.
    pub fn put_data(&mut self, data: &[u8]) {
        self.data.extend_from_slice(data);

        // Post-invariant: as for `put1()`.
    }

    /// Reserve appended space and return a mutable slice referring to it.
    pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
        let off = self.data.len();
        let new_len = self.data.len() + len;
        self.data.resize(new_len, 0);
        &mut self.data[off..]

        // Post-invariant: as for `put1()`.
    }

    /// Align up to the given alignment.
    pub fn align_to(&mut self, align_to: CodeOffset) {
        trace!("MachBuffer: align to {}", align_to);
        assert!(
            align_to.is_power_of_two(),
            "{align_to} is not a power of two"
        );
        while self.cur_offset() & (align_to - 1) != 0 {
            self.put1(0);
        }

        // Post-invariant: as for `put1()`.
    }

    /// Begin a region of patchable code. There is one requirement for the
    /// code that is emitted: It must not introduce any instructions that
    /// could be chomped (branches are an example of this). In other words,
    /// you must not call [`MachBuffer::add_cond_branch`] or
    /// [`MachBuffer::add_uncond_branch`] between calls to this method and
    /// [`MachBuffer::end_patchable`].
    pub fn start_patchable(&mut self) -> OpenPatchRegion {
        assert!(!self.open_patchable, "Patchable regions may not be nested");
        self.open_patchable = true;
        OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
    }

    /// End a region of patchable code, yielding a [`PatchRegion`] value that
    /// can be consumed later to produce a one-off mutable slice to the
    /// associated region of the data buffer.
    pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
        // No need to assert the state of `open_patchable` here, as we take
        // ownership of the only `OpenPatchRegion` value.
        self.open_patchable = false;
        let end = usize::try_from(self.cur_offset()).unwrap();
        PatchRegion { range: open.0..end }
    }

    /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
    /// offset yet.
    pub fn get_label(&mut self) -> MachLabel {
        let l = self.label_offsets.len() as u32;
        self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
        self.label_aliases.push(UNKNOWN_LABEL);
        trace!("MachBuffer: new label -> {:?}", MachLabel(l));
        MachLabel(l)

        // Post-invariant: the only mutation is to add a new label; it has no
        // bound offset yet, so it trivially satisfies all invariants.
    }

    /// Reserve the first N MachLabels for blocks.
    pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
        trace!("MachBuffer: first {} labels are for blocks", blocks);
        debug_assert!(self.label_offsets.is_empty());
        self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
        self.label_aliases.resize(blocks, UNKNOWN_LABEL);

        // Post-invariant: as for `get_label()`.
    }

    /// Registers metadata in this `MachBuffer` about the `constants` provided.
    ///
    /// This will record the size/alignment of all constants which will prepare
    /// them for emission later on.
    pub fn register_constants(&mut self, constants: &VCodeConstants) {
        for (c, val) in constants.iter() {
            self.register_constant(&c, val);
        }
    }

    /// Similar to [`MachBuffer::register_constants`] but registers
    /// metadata for a single constant. This function is useful in
    /// situations where not all constants are known at the time of
    /// emission.
    pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
        let c2 = self.constants.push(MachBufferConstant {
            upcoming_label: None,
            align: data.alignment(),
            size: data.as_slice().len(),
        });
        assert_eq!(*constant, c2);
    }

    /// Completes constant emission by iterating over `self.used_constants` and
    /// filling in the "holes" with the constant values provided by `constants`.
    ///
    /// Returns the alignment required for this entire buffer. Alignment starts
    /// at the ISA's minimum function alignment and can be increased due to
    /// constant requirements.
    fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
        let mut alignment = self.min_alignment;
        for (constant, offset) in mem::take(&mut self.used_constants) {
            let constant = constants.get(constant);
            let data = constant.as_slice();
            self.data[offset as usize..][..data.len()].copy_from_slice(data);
            alignment = constant.alignment().max(alignment);
        }
        alignment
    }

    /// Returns a label that can be used to refer to the `constant` provided.
    ///
    /// This will automatically defer a new constant to be emitted for
    /// `constant` if it has not been previously emitted. Note that this
    /// function may return a different label for the same constant at
    /// different points in time. The label is valid to use only from the
    /// current location; the MachBuffer takes care to emit the same constant
    /// multiple times if needed so the constant is always in range.
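    ///
    /// A sketch of typical use from a backend's emission code (the
    /// `LabelUse::Ldr19` kind named here is illustrative):
    ///
    /// ```ignore
    /// let label = buffer.get_label_for_constant(constant);
    /// // Record a fixup for the load's PC-relative field; the constant data
    /// // itself is emitted later, in an island.
    /// buffer.use_label_at_offset(buffer.cur_offset(), label, LabelUse::Ldr19);
    /// // ... emit the load instruction referencing `label` ...
    /// ```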
    pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
        let MachBufferConstant {
            align,
            size,
            upcoming_label,
        } = self.constants[constant];
        if let Some(label) = upcoming_label {
            return label;
        }

        let label = self.get_label();
        trace!(
            "defer constant: eventually emit {size} bytes aligned \
             to {align} at label {label:?}",
        );
        self.pending_constants.push(constant);
        self.pending_constants_size += size as u32;
        self.constants[constant].upcoming_label = Some(label);
        label
    }

    /// Bind a label to the current offset. A label can only be bound once.
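    ///
    /// A sketch of the label life-cycle (`off`, `kind`, and `ctrl_plane` come
    /// from the surrounding emission context):
    ///
    /// ```ignore
    /// let label = buffer.get_label();               // allocated, not yet bound
    /// buffer.use_label_at_offset(off, label, kind); // forward reference (fixup)
    /// // ... emit more code ...
    /// buffer.bind_label(label, ctrl_plane);         // resolves to cur_offset()
    /// ```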
    pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
        trace!(
            "MachBuffer: bind label {:?} at offset {}",
            label,
            self.cur_offset()
        );
        debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
        debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
        let offset = self.cur_offset();
        self.label_offsets[label.0 as usize] = offset;
        self.lazily_clear_labels_at_tail();
        self.labels_at_tail.push(label);

        // Invariants hold: bound offset of label is <= cur_offset (in fact it
        // is equal). If the `labels_at_tail` list was complete and precise
        // before, it is still, because we have bound this label to the current
        // offset and added it to the list (which contains all labels at the
        // current offset).

        self.optimize_branches(ctrl_plane);

        // Post-invariant: by `optimize_branches()` (see argument there).
    }

    /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
    /// offset that it applies to.
    fn lazily_clear_labels_at_tail(&mut self) {
        let offset = self.cur_offset();
        if offset > self.labels_at_tail_off {
            self.labels_at_tail_off = offset;
            self.labels_at_tail.clear();
        }

        // Post-invariant: either labels_at_tail_off was at cur_offset, and
        // state is untouched, or was less than cur_offset, in which case the
        // labels_at_tail list was conceptually empty, and is now actually
        // empty.
    }

    /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
    pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
        let mut iters = 0;
        while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
            label = self.label_aliases[label.0 as usize];
            // To protect against an infinite loop (despite our assurances to
            // ourselves that the invariants make this impossible), assert out
            // after 1M iterations. The number of basic blocks is limited
            // in most contexts anyway so this should be impossible to hit with
            // a legitimate input.
            iters += 1;
            assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
        }
        self.label_offsets[label.0 as usize]

        // Post-invariant: no mutations.
    }

    /// Emit a reference to the given label with the given reference type (i.e.,
    /// branch-instruction format) at the current offset. This is like a
    /// relocation, but handled internally.
    ///
    /// This can be called before the branch is actually emitted; fixups will
    /// not happen until an island is emitted or the buffer is finished.
    pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
        trace!(
            "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
            offset, label, kind
        );

        // Add the fixup, and update the worst-case island size based on a
        // veneer for this label use.
        let fixup = MachLabelFixup {
            label,
            offset,
            kind,
        };
        self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
        self.pending_fixup_records.push(fixup);

        // Post-invariant: no mutations to branches/labels data structures.
    }

    /// Inform the buffer of an unconditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    /// This must be called when the current offset is equal to `start`; i.e.,
    /// before actually emitting the branch. This implies that for a branch that
    /// uses a label and is eligible for optimizations by the MachBuffer, the
    /// proper sequence is:
    ///
    /// - Call `use_label_at_offset()` to emit the fixup record.
    /// - Call `add_uncond_branch()` to make note of the branch.
    /// - Emit the bytes for the branch's machine code.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
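    ///
    /// A sketch of that sequence (hypothetical 4-byte branch encoding;
    /// `enc_jump` and `LabelUse::Branch26` are illustrative names, not a real
    /// API):
    ///
    /// ```ignore
    /// let start = buffer.cur_offset();
    /// buffer.use_label_at_offset(start, target, LabelUse::Branch26);
    /// buffer.add_uncond_branch(start, start + 4, target);
    /// buffer.put4(enc_jump(0)); // zero offset; patched later via the fixup
    /// ```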
    pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction inserted within a patchable region"
        );
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.pending_fixup_records.is_empty());
        let fixup = self.pending_fixup_records.len() - 1;
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted: None,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; the list of
        // labels at branch is cloned from list of labels at current tail.
    }

    /// Inform the buffer of a conditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
    pub fn add_cond_branch(
        &mut self,
        start: CodeOffset,
        end: CodeOffset,
        target: MachLabel,
        inverted: &[u8],
    ) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction inserted within a patchable region"
        );
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.pending_fixup_records.is_empty());
        debug_assert!(
            inverted.len() == (end - start) as usize,
            "branch length = {}, but inverted length = {}",
            end - start,
            inverted.len()
        );
        let fixup = self.pending_fixup_records.len() - 1;
        let inverted = Some(SmallVec::from(inverted));
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; labels at
        // branch list is cloned from list of labels at current tail.
    }

    fn truncate_last_branch(&mut self) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction truncated within a patchable region"
        );

        self.lazily_clear_labels_at_tail();
        // Invariants hold at this point.

        let b = self.latest_branches.pop().unwrap();
        assert!(b.end == self.cur_offset());

        // State:
        //    [PRE CODE]
        //  Offset b.start, b.labels_at_this_branch:
        //    [BRANCH CODE]
        //  cur_off, self.labels_at_tail -->
        //    (end of buffer)
        self.data.truncate(b.start as usize);
        self.pending_fixup_records.truncate(b.fixup);
        while let Some(last_srcloc) = self.srclocs.last_mut() {
            if last_srcloc.end <= b.start {
                break;
            }
            if last_srcloc.start < b.start {
                last_srcloc.end = b.start;
                break;
            }
            self.srclocs.pop();
        }
        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch:
        //    (end of buffer)
        //
        //  self.labels_at_tail -->  (past end of buffer)
        let cur_off = self.cur_offset();
        self.labels_at_tail_off = cur_off;
        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch,
        //  self.labels_at_tail:
        //    (end of buffer)
        //
        // resolve_label_offset(l) for l in labels_at_tail:
        //    (past end of buffer)

        trace!(
            "truncate_last_branch: truncated {:?}; off now {}",
            b, cur_off
        );

        // Fix up resolved label offsets for labels at tail.
        for &l in &self.labels_at_tail {
            self.label_offsets[l.0 as usize] = cur_off;
        }
        // Old labels_at_this_branch are now at cur_off.
        self.labels_at_tail.extend(b.labels_at_this_branch);

        // Post-invariant: this operation is defined to truncate the buffer,
        // which moves cur_off backward, and to move labels at the end of the
        // buffer back to the start-of-branch offset.
        //
        // latest_branches satisfies all invariants:
        // - it has no branches past the end of the buffer (branches are in
        //   order, we removed the last one, and we truncated the buffer to just
        //   before the start of that branch)
        // - no labels were moved to lower offsets than the (new) cur_off, so
        //   the labels_at_this_branch list for any other branch need not change.
        //
        // labels_at_tail satisfies all invariants:
        // - all labels that were at the tail after the truncated branch are
        //   moved backward to just before the branch, which becomes the new tail;
        //   thus every element in the list should remain (ensured by `.extend()`
        //   above).
        // - all labels that refer to the new tail, which is the start-offset of
        //   the truncated branch, must be present. The `labels_at_this_branch`
        //   list in the truncated branch's record is a complete and precise list
        //   of exactly these labels; we append these to labels_at_tail.
        // - labels_at_tail_off is at cur_off after truncation occurs, so the
        //   list is valid (not to be lazily cleared).
        //
        // The stated operation was performed:
        // - For each label at the end of the buffer prior to this method, it
        //   now resolves to the new (truncated) end of the buffer: it must have
        //   been in `labels_at_tail` (this list is precise and complete, and
        //   the tail was at the end of the truncated branch on entry), and we
        //   iterate over this list and set `label_offsets` to the new tail.
        //   None of these labels could have been an alias (by invariant), so
        //   `label_offsets` is authoritative for each.
        // - No other labels will be past the end of the buffer, because of the
        //   requirement that no labels be bound to the middle of branch ranges
        //   (see comments to `add_{cond,uncond}_branch()`).
        // - The buffer is truncated to just before the last branch, and the
        //   fixup record referring to that last branch is removed.
    }

    /// Performs various optimizations on branches pointing at the current label.
    pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
        if ctrl_plane.get_decision() {
            return;
        }

        self.lazily_clear_labels_at_tail();
        // Invariants valid at this point.

        trace!(
            "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
        );

        // We continue to munch on branches at the tail of the buffer until no
        // more rules apply. Note that the loop only continues if a branch is
        // actually truncated (or if labels are redirected away from a branch),
        // so this always makes progress.
        while let Some(b) = self.latest_branches.last() {
            let cur_off = self.cur_offset();
            trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
            // If there has been any code emission since the end of the last branch or
            // label definition, then there's nothing we can edit (because we
            // don't move code once placed, only back up and overwrite), so
            // clear the records and finish.
            if b.end < cur_off {
                break;
            }

            // If the "labels at this branch" list on this branch is
            // longer than a threshold, don't do any simplification,
            // and let the branch remain to separate those labels from
            // the current tail. This avoids quadratic behavior (see
            // #3468): otherwise, if a long string of "goto next;
            // next:" patterns are emitted, all of the labels will
            // coalesce into a long list of aliases for the current
            // buffer tail. We must track all aliases of the current
            // tail for correctness, but we are also allowed to skip
            // optimization (removal) of any branch, so we take the
            // escape hatch here and let it stand. In effect this
            // "spreads" the many thousands of labels in the
            // pathological case among an actual (harmless but
            // suboptimal) instruction once per N labels.
            if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
                break;
            }

            // Invariant: we are looking at a branch that ends at the tail of
            // the buffer.

            // For any branch, conditional or unconditional:
            // - If the target is a label at the current offset, then remove
            //   the branch, and reset all labels that targeted the current
            //   offset (end of branch) to the truncated end-of-code.
            //
            // Preserves execution semantics: a branch to its own fallthrough
            // address is equivalent to a no-op; in both cases, nextPC is the
            // fallthrough.
            if self.resolve_label_offset(b.target) == cur_off {
                trace!("branch with target == cur off; truncating");
                self.truncate_last_branch();
                continue;
            }

            // If latest is an unconditional branch:
            //
            // - If the branch's target is not its own start address, then for
            //   each label at the start of branch, make the label an alias of the
            //   branch target, and remove the label from the "labels at this
            //   branch" list.
            //
            //   - Preserves execution semantics: an unconditional branch's
            //     only effect is to set PC to a new PC; this change simply
            //     collapses one step in the step-semantics.
            //
            //   - Post-invariant: the labels that were bound to the start of
            //     this branch become aliases, so they must not be present in any
            //     labels-at-this-branch list or the labels-at-tail list. The
            //     labels are removed from the latest-branch record's
            //     labels-at-this-branch list, and are never placed in the
            //     labels-at-tail list. Furthermore, it is correct that they are
            //     not in either list, because they are now aliases, and labels
            //     that are aliases remain aliases forever.
            //
            // - If there is a prior unconditional branch that ends just before
            //   this one begins, and this branch has no labels bound to its
            //   start, then we can truncate this branch, because it is entirely
            //   unreachable (we have redirected all labels that make it
            //   reachable otherwise). Do so and continue around the loop.
            //
            //   - Preserves execution semantics: the branch is unreachable,
            //     because execution can only flow into an instruction from the
            //     prior instruction's fallthrough or from a branch bound to that
            //     instruction's start offset. Unconditional branches have no
            //     fallthrough, so if the prior instruction is an unconditional
            //     branch, no fallthrough entry can happen. The
            //     labels-at-this-branch list is complete (by invariant), so if it
            //     is empty, then the instruction is entirely unreachable. Thus,
            //     it can be removed.
            //
            //   - Post-invariant: ensured by truncate_last_branch().
            //
            // - If there is a prior conditional branch whose target label
            //   resolves to the current offset (branches around the
            //   unconditional branch), then remove the unconditional branch,
            //   and make the target of the unconditional the target of the
            //   conditional instead.
            //
            //   - Preserves execution semantics: previously we had:
            //
            //         L1:
            //            cond_br L2
            //            br L3
            //         L2:
            //            (end of buffer)
            //
            //     by removing the last branch, we have:
            //
            //         L1:
            //            cond_br L2
            //         L2:
            //            (end of buffer)
            //
            //     we then fix up the records for the conditional branch to
            //     have:
            //
            //         L1:
            //           cond_br.inverted L3
            //         L2:
            //
            //     In the original code, control flow reaches L2 when the
            //     conditional branch's predicate is true, and L3 otherwise. In
            //     the optimized code, the same is true.
            //
            //   - Post-invariant: all edits to latest_branches and
            //     labels_at_tail are performed by `truncate_last_branch()`,
            //     which maintains the invariants at each step.

            if b.is_uncond() {
                // Set any label equal to current branch's start as an alias of
                // the branch's target, if the target is not the branch itself
                // (i.e., an infinite loop).
                //
                // We cannot perform this aliasing if the target of this branch
                // ultimately aliases back here; if so, we need to keep this
                // branch, so break out of this loop entirely (and clear the
                // latest-branches list below).
                //
                // Note that this check is what prevents cycles from forming in
                // `self.label_aliases`. To see why, consider an arbitrary start
                // state:
                //
                // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
                // Ln, which is not aliased.
                //
                // We would create a cycle if we assigned label_aliases[Ln]
                // = L1. Note that the below assignment is the only write
                // to label_aliases.
                //
                // By our other invariants, we have that Ln (`l` below)
                // resolves to the offset `b.start`, because it is in the
                // set `b.labels_at_this_branch`.
                //
                // If L1 were already aliased, through some arbitrarily deep
                // chain, to Ln, then it must also resolve to this offset
                // `b.start`.
                //
                // By checking the resolution of `L1` against this offset,
                // and aborting this branch-simplification if they are
                // equal, we prevent the below assignment from ever creating
                // a cycle.
                if self.resolve_label_offset(b.target) != b.start {
                    let redirected = b.labels_at_this_branch.len();
                    for &l in &b.labels_at_this_branch {
                        trace!(
                            " -> label at start of branch {:?} redirected to target {:?}",
                            l, b.target
                        );
                        self.label_aliases[l.0 as usize] = b.target;
                        // NOTE: we continue to ensure the invariant that labels
                        // pointing to tail of buffer are in `labels_at_tail`
                        // because we already ensured above that the last branch
                        // cannot have a target of `cur_off`; so we never have
                        // to put the label into `labels_at_tail` when moving it
                        // here.
                    }
                    // Maintain invariant: all labels have been redirected
                    // and are no longer pointing at the start of this branch.
                    let mut_b = self.latest_branches.last_mut().unwrap();
                    mut_b.labels_at_this_branch.clear();

                    if redirected > 0 {
                        trace!(" -> after label redirects, restarting loop");
                        continue;
                    }
                } else {
                    break;
                }

                let b = self.latest_branches.last().unwrap();

                // Examine any immediately preceding branch.
                if self.latest_branches.len() > 1 {
                    let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
                    trace!(" -> more than one branch; prev_b = {:?}", prev_b);
                    // This uncond is immediately after another uncond; we
                    // should have already redirected labels to this uncond away
                    // (but check to be sure); so we can truncate this uncond.
                    if prev_b.is_uncond()
                        && prev_b.end == b.start
                        && b.labels_at_this_branch.is_empty()
                    {
                        trace!(" -> uncond follows another uncond; truncating");
                        self.truncate_last_branch();
                        continue;
                    }

                    // This uncond is immediately after a conditional, and the
                    // conditional's target is the end of this uncond, and we've
                    // already redirected labels to this uncond away; so we can
                    // truncate this uncond, flip the sense of the conditional, and
                    // set the conditional's target (in `latest_branches` and in
                    // `fixup_records`) to the uncond's target.
                    if prev_b.is_cond()
                        && prev_b.end == b.start
                        && self.resolve_label_offset(prev_b.target) == cur_off
                    {
                        trace!(
                            " -> uncond follows a conditional, and conditional's target resolves to current offset"
                        );
                        // Save the target of the uncond (this becomes the
                        // target of the cond), and truncate the uncond.
                        let target = b.target;
                        let data = prev_b.inverted.clone().unwrap();
                        self.truncate_last_branch();

                        // Mutate the code and cond branch.
                        let off_before_edit = self.cur_offset();
                        let prev_b = self.latest_branches.last_mut().unwrap();
                        let not_inverted = SmallVec::from(
                            &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
                        );

                        // Low-level edit: replaces bytes of branch with
                        // inverted form. cur_off remains the same afterward, so
                        // we do not need to modify label data structures.
                        self.data.truncate(prev_b.start as usize);
                        self.data.extend_from_slice(&data[..]);

                        // Save the original code as the inversion of the
                        // inverted branch, in case we later edit this branch
                        // again.
                        prev_b.inverted = Some(not_inverted);
                        self.pending_fixup_records[prev_b.fixup].label = target;
                        trace!(" -> reassigning target of condbr to {:?}", target);
                        prev_b.target = target;
                        debug_assert_eq!(off_before_edit, self.cur_offset());
                        continue;
                    }
                }
            }

            // If we couldn't do anything with the last branch, then break.
            break;
        }

        self.purge_latest_branches();

        trace!(
            "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
        );
    }
1192
1193    fn purge_latest_branches(&mut self) {
1194        // All of our branch simplification rules work only if a branch ends at
1195        // the tail of the buffer, with no following code; and branches are in
1196        // order in latest_branches; so if the last entry ends prior to
1197        // cur_offset, then clear all entries.
1198        let cur_off = self.cur_offset();
1199        if let Some(l) = self.latest_branches.last() {
1200            if l.end < cur_off {
1201                trace!("purge_latest_branches: removing branch {:?}", l);
1202                self.latest_branches.clear();
1203            }
1204        }
1205
1206        // Post-invariant: no invariant requires any branch to appear in
1207        // `latest_branches`; it is always optional. The list-clear above thus
1208        // preserves all semantics.
1209    }
1210
1211    /// Emit a trap at some point in the future with the specified trap
1212    /// code.
1213    ///
1214    /// This function returns a [`MachLabel`] which will be the future address
1215    /// of the trap. Jumps should refer to this label, likely by using the
1216    /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1217    /// patched in once the address of the trap is known.
1218    ///
1219    /// This will batch all traps into the end of the function.
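    ///
    /// A minimal sketch of intended use; `fixup_off` and `label_use` are
    /// hypothetical stand-ins for a branch's patchable offset and its
    /// backend-specific `LabelUse` kind (not compiled here):
    ///
    /// ```ignore
    /// // Reserve a trap to be emitted in a later island (or batched at the
    /// // end of the function).
    /// let trap_label = buf.defer_trap(TrapCode::STACK_OVERFLOW);
    /// // ... emit a branch whose target field lives at `fixup_off` ...
    /// buf.use_label_at_offset(fixup_off, trap_label, label_use);
    /// ```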
1220    pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1221        let label = self.get_label();
1222        self.pending_traps.push(MachLabelTrap {
1223            label,
1224            code,
1225            loc: self.cur_srcloc.map(|(_start, loc)| loc),
1226        });
1227        label
1228    }
1229
1230    /// Is an island needed within the next N bytes?
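    ///
    /// Callers typically pair this with [`MachBuffer::emit_island`] during
    /// emission; a sketch of that pattern, mirroring the tests at the bottom
    /// of this file (`space_needed` is whatever the caller is about to emit):
    ///
    /// ```ignore
    /// if buf.island_needed(space_needed) {
    ///     buf.emit_island(space_needed, ctrl_plane);
    /// }
    /// ```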
1231    pub fn island_needed(&self, distance: CodeOffset) -> bool {
1232        let deadline = match self.fixup_records.peek() {
1233            Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1234            None => self.pending_fixup_deadline,
1235        };
1236        deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
1237    }
1238
1239    /// Returns the worst-case offset at which an island could end if
1240    /// `distance` more bytes are appended.
1241    ///
1242    /// This is used to determine whether veneer insertion is needed, since jumps that
1243    /// can't reach past this point must get a veneer of some form.
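    ///
    /// As a worked example with the AArch64 numbers used in `test_island`
    /// below: two pending fixups, a 20-byte worst-case veneer size, and no
    /// pending constants or traps give `cur_offset() + distance + 2 * 20`
    /// (saturating) as the worst-case end.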
1244    fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1245        // Assume that all fixups will require veneers and that the veneers are
1246        // the worst-case size for each platform. This is an over-approximation
1247        // to avoid iterating over the `fixup_records` list or maintaining
1248        // information about it as we go along.
1249        let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
1250            as u32)
1251            * (I::LabelUse::worst_case_veneer_size())
1252            + self.pending_constants_size
1253            + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1254        self.cur_offset()
1255            .saturating_add(distance)
1256            .saturating_add(island_worst_case_size)
1257    }
1258
1259    /// Emit all pending constants and required pending veneers.
1260    ///
1261    /// Should only be called if `island_needed()` returns true, i.e., if we
1262    /// actually reach a deadline. It's not necessarily a problem to do so
1263    /// otherwise, but it may result in unnecessary work during emission.
1264    pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1265        self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1266    }
1267
1268    /// Same as `emit_island`, but an internal API with a `force_veneers`
1269    /// argument to force all veneers to always get emitted for debugging.
1270    fn emit_island_maybe_forced(
1271        &mut self,
1272        force_veneers: ForceVeneers,
1273        distance: CodeOffset,
1274        ctrl_plane: &mut ControlPlane,
1275    ) {
1276        // We're going to purge fixups, so no latest-branch editing can happen
1277        // anymore.
1278        self.latest_branches.clear();
1279
1280        // End the current location tracking since anything emitted during this
1281        // function shouldn't be attributed to whatever the current source
1282        // location is.
1283        //
1284        // Note that the current source location, if it's set right now, will be
1285        // restored at the end of this island emission.
1286        let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1287        if cur_loc.is_some() {
1288            self.end_srcloc();
1289        }
1290
1291        let forced_threshold = self.worst_case_end_of_island(distance);
1292
1293        // First flush out all traps/constants so that their labels are
1294        // defined, in case fixups are applied against these labels.
1295        //
1296        // Note that traps are placed first since this typically happens at the
1297        // end of the function, and for disassemblers we try to keep all the
1298        // code contiguous.
1299        for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1300            // If this trap has source information associated with it then
1301            // emit this information for the trap instruction going out now too.
1302            if let Some(loc) = loc {
1303                self.start_srcloc(loc);
1304            }
1305            self.align_to(I::LabelUse::ALIGN);
1306            self.bind_label(label, ctrl_plane);
1307            self.add_trap(code);
1308            self.put_data(I::TRAP_OPCODE);
1309            if loc.is_some() {
1310                self.end_srcloc();
1311            }
1312        }
1313
1314        for constant in mem::take(&mut self.pending_constants) {
1315            let MachBufferConstant { align, size, .. } = self.constants[constant];
1316            let label = self.constants[constant].upcoming_label.take().unwrap();
1317            self.align_to(align);
1318            self.bind_label(label, ctrl_plane);
1319            self.used_constants.push((constant, self.cur_offset()));
1320            self.get_appended_space(size);
1321        }
1322
1323        // Either handle all pending fixups because they're ready or move them
1324        // onto the `BinaryHeap` tracking all pending fixups if they aren't
1325        // ready.
1326        assert!(self.latest_branches.is_empty());
1327        for fixup in mem::take(&mut self.pending_fixup_records) {
1328            if self.should_apply_fixup(&fixup, forced_threshold) {
1329                self.handle_fixup(fixup, force_veneers, forced_threshold);
1330            } else {
1331                self.fixup_records.push(fixup);
1332            }
1333        }
1334        self.pending_fixup_deadline = u32::MAX;
1335        while let Some(fixup) = self.fixup_records.peek() {
1336            trace!("emit_island: fixup {:?}", fixup);
1337
1338            // If this fixup shouldn't be applied, that means its label isn't
1339            // defined yet and there'll be remaining space to apply a veneer if
1340            // necessary in the future after this island. In that situation,
1341            // because `fixup_records` is sorted by deadline, this loop can
1342            // exit.
1343            if !self.should_apply_fixup(fixup, forced_threshold) {
1344                break;
1345            }
1346
1347            let fixup = self.fixup_records.pop().unwrap();
1348            self.handle_fixup(fixup, force_veneers, forced_threshold);
1349        }
1350
1351        if let Some(loc) = cur_loc {
1352            self.start_srcloc(loc);
1353        }
1354    }
1355
1356    fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1357        let label_offset = self.resolve_label_offset(fixup.label);
1358        label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1359    }
1360
1361    fn handle_fixup(
1362        &mut self,
1363        fixup: MachLabelFixup<I>,
1364        force_veneers: ForceVeneers,
1365        forced_threshold: CodeOffset,
1366    ) {
1367        let MachLabelFixup {
1368            label,
1369            offset,
1370            kind,
1371        } = fixup;
1372        let start = offset as usize;
1373        let end = (offset + kind.patch_size()) as usize;
1374        let label_offset = self.resolve_label_offset(label);
1375
1376        if label_offset != UNKNOWN_LABEL_OFFSET {
1377            // If the offset of the label for this fixup is known then
1378            // we're going to do something here-and-now. We're either going
1379            // to patch the original offset because it's an in-bounds jump,
1380            // or we're going to generate a veneer, patch the fixup to jump
1381            // to the veneer, and then keep going.
1382            //
1383            // If the label comes after the original fixup, then we should
1384            // be guaranteed that the jump is in-bounds. Otherwise there's
1385            // a bug somewhere because this method wasn't called soon
1386            // enough. All forward-jumps are tracked and should get veneers
1387            // before their deadline comes and they're unable to jump
1388            // further.
1389            //
1390            // Otherwise if the label is before the fixup, then that's a
1391            // backwards jump. If it's past the maximum negative range
1392            // then we'll emit a veneer to jump forward to, which can
1393            // then jump backwards.
1394            let veneer_required = if label_offset >= offset {
1395                assert!((label_offset - offset) <= kind.max_pos_range());
1396                false
1397            } else {
1398                (offset - label_offset) > kind.max_neg_range()
1399            };
1400            trace!(
1401                " -> label_offset = {}, known, required = {} (pos {} neg {})",
1402                label_offset,
1403                veneer_required,
1404                kind.max_pos_range(),
1405                kind.max_neg_range()
1406            );
1407
1408            if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1409                self.emit_veneer(label, offset, kind);
1410            } else {
1411                let slice = &mut self.data[start..end];
1412                trace!(
1413                    "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
1414                );
1415                kind.patch(slice, offset, label_offset);
1416            }
1417        } else {
1418            // If the offset of this label is not known at this time, then
1419            // a veneer is required, because after this island the target
1420            // can't be in range of the original branch.
1421            assert!(forced_threshold - offset > kind.max_pos_range());
1422            self.emit_veneer(label, offset, kind);
1423        }
1424    }
1425
1426    /// Emits a "veneer" so that the `kind` reference at `offset` can jump to `label`.
1427    ///
1428    /// This will generate extra machine code, using `kind`, to get a
1429    /// larger-jump-kind than `kind` allows. The code at `offset` is then
1430    /// patched to jump to our new code, and then the new code is enqueued for
1431    /// a fixup to get processed at some later time.
1432    fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1433        // If this `kind` doesn't support a veneer then that's a bug in the
1434        // backend because we need to implement support for such a veneer.
1435        assert!(
1436            kind.supports_veneer(),
1437            "jump beyond the range of {kind:?} but a veneer isn't supported",
1438        );
1439
1440        // Allocate space for a veneer in the island.
1441        self.align_to(I::LabelUse::ALIGN);
1442        let veneer_offset = self.cur_offset();
1443        trace!("making a veneer at {}", veneer_offset);
1444        let start = offset as usize;
1445        let end = (offset + kind.patch_size()) as usize;
1446        let slice = &mut self.data[start..end];
1447        // Patch the original label use to refer to the veneer.
1448        trace!(
1449            "patching original at offset {} to veneer offset {}",
1450            offset, veneer_offset
1451        );
1452        kind.patch(slice, offset, veneer_offset);
1453        // Generate the veneer.
1454        let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1455        let (veneer_fixup_off, veneer_label_use) =
1456            kind.generate_veneer(veneer_slice, veneer_offset);
1457        trace!(
1458            "generated veneer; fixup offset {}, label_use {:?}",
1459            veneer_fixup_off, veneer_label_use
1460        );
1461        // Register a new use of `label` with our new veneer fixup and
1462        // offset. This'll recalculate deadlines accordingly and
1463        // enqueue this fixup to get processed at some later
1464        // time.
1465        self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1466    }
1467
1468    fn finish_emission_maybe_forcing_veneers(
1469        &mut self,
1470        force_veneers: ForceVeneers,
1471        ctrl_plane: &mut ControlPlane,
1472    ) {
1473        while !self.pending_constants.is_empty()
1474            || !self.pending_traps.is_empty()
1475            || !self.fixup_records.is_empty()
1476            || !self.pending_fixup_records.is_empty()
1477        {
1478            // `emit_island()` will emit any pending veneers and constants, and
1479            // as a side-effect, will also take care of any fixups with resolved
1480            // labels eagerly.
1481            self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1482        }
1483
1484        // Ensure that all labels have been fixed up after the last island is emitted. This is a
1485        // full (release-mode) assert because an unresolved label means the emitted code is
1486        // incorrect.
1487        assert!(self.fixup_records.is_empty());
1488        assert!(self.pending_fixup_records.is_empty());
1489    }
1490
1491    /// Finish any deferred emissions and/or fixups.
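    ///
    /// A sketch of the final step of emission, with `constants` and
    /// `ctrl_plane` assumed in scope as in the tests below:
    ///
    /// ```ignore
    /// let finalized = buf.finish(&constants, ctrl_plane);
    /// let code: &[u8] = finalized.data();
    /// ```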
1492    pub fn finish(
1493        mut self,
1494        constants: &VCodeConstants,
1495        ctrl_plane: &mut ControlPlane,
1496    ) -> MachBufferFinalized<Stencil> {
1497        let _tt = timing::vcode_emit_finish();
1498
1499        self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1500
1501        let alignment = self.finish_constants(constants);
1502
1503        // Resolve all labels to their offsets.
1504        let finalized_relocs = self
1505            .relocs
1506            .iter()
1507            .map(|reloc| FinalizedMachReloc {
1508                offset: reloc.offset,
1509                kind: reloc.kind,
1510                addend: reloc.addend,
1511                target: match &reloc.target {
1512                    RelocTarget::ExternalName(name) => {
1513                        FinalizedRelocTarget::ExternalName(name.clone())
1514                    }
1515                    RelocTarget::Label(label) => {
1516                        FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1517                    }
1518                },
1519            })
1520            .collect();
1521
1522        let finalized_exception_handlers = self
1523            .exception_handlers
1524            .iter()
1525            .map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))
1526            .collect();
1527
1528        let mut srclocs = self.srclocs;
1529        srclocs.sort_by_key(|entry| entry.start);
1530
1531        MachBufferFinalized {
1532            data: self.data,
1533            relocs: finalized_relocs,
1534            traps: self.traps,
1535            call_sites: self.call_sites,
1536            exception_handlers: finalized_exception_handlers,
1537            srclocs,
1538            user_stack_maps: self.user_stack_maps,
1539            unwind_info: self.unwind_info,
1540            alignment,
1541        }
1542    }
1543
1544    /// Add an external relocation at the given offset.
1545    pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1546        &mut self,
1547        offset: CodeOffset,
1548        kind: Reloc,
1549        target: &T,
1550        addend: Addend,
1551    ) {
1552        let target: RelocTarget = target.clone().into();
1553        // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1554        // generate a label-use statement to track whether an island is possibly
1555        // needed to escape this function to actually get to the external name.
1556        // This is most likely to come up on AArch64 where calls between
1557        // functions use a 26-bit signed offset which gives +/- 64MB. This means
1558        // that if a function is 128MB in size and there's a call in the middle
1559        // it's impossible to reach the actual target. Also, while it's
1560        // technically possible to jump to the start of a function and then jump
1561        // further, island insertion below always inserts islands after
1562        // previously appended code, so for Cranelift's own implementation this
1563        // is also a problem for 64MB functions on AArch64 that start with a
1564        // call instruction: those won't be able to escape.
1565        //
1566        // Ideally what needs to happen here is that a `LabelUse` is
1567        // transparently generated (or call-sites of this function are audited
1568        // to generate a `LabelUse` instead) and tracked internally. The actual
1569        // relocation would then change over time if and when a veneer is
1570        // inserted, where the relocation here would be patched by this
1571        // `MachBuffer` to jump to the veneer. The problem, though, is that in
1572        // the case of a single function, all this still needs to end up
1573        // generating a final relocation pointing either to this particular
1574        // relocation or to the inserted veneer. Additionally,
1575        // `MachBuffer` needs the concept of a label which will never be
1576        // resolved, so `emit_island` doesn't trip over never actually
1577        // knowing what some labels are. Currently the loop in
1578        // `finish_emission_maybe_forcing_veneers` would otherwise loop
1579        // infinitely.
1580        //
1581        // For now this means that, because relocs aren't tracked at all,
1582        // AArch64 functions have a rough size limit of 64MB. For now that's
1583        // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1584        // when a relocation can't otherwise be resolved later, so it shouldn't
1585        // actually result in any memory unsafety or anything like that.
1586        self.relocs.push(MachReloc {
1587            offset,
1588            kind,
1589            target,
1590            addend,
1591        });
1592    }
1593
1594    /// Add an external relocation at the current offset.
1595    pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1596        &mut self,
1597        kind: Reloc,
1598        target: &T,
1599        addend: Addend,
1600    ) {
1601        self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1602    }
1603
1604    /// Add a trap record at the current offset.
1605    pub fn add_trap(&mut self, code: TrapCode) {
1606        self.traps.push(MachTrap {
1607            offset: self.data.len() as CodeOffset,
1608            code,
1609        });
1610    }
1611
1612    /// Add a call-site record at the current offset.
1613    pub fn add_call_site(&mut self) {
1614        self.add_try_call_site(None, core::iter::empty());
1615    }
1616
1617    /// Add a call-site record at the current offset with exception
1618    /// handlers.
1619    pub fn add_try_call_site(
1620        &mut self,
1621        frame_offset: Option<u32>,
1622        exception_handlers: impl Iterator<Item = MachExceptionHandler>,
1623    ) {
1624        let start = u32::try_from(self.exception_handlers.len()).unwrap();
1625        self.exception_handlers.extend(exception_handlers);
1626        let end = u32::try_from(self.exception_handlers.len()).unwrap();
1627        let exception_handler_range = start..end;
1628
1629        self.call_sites.push(MachCallSite {
1630            ret_addr: self.data.len() as CodeOffset,
1631            frame_offset,
1632            exception_handler_range,
1633        });
1634    }
1635
1636    /// Add an unwind record at the current offset.
1637    pub fn add_unwind(&mut self, unwind: UnwindInst) {
1638        self.unwind_info.push((self.cur_offset(), unwind));
1639    }
1640
1641    /// Set the `SourceLoc` for code from this offset until the offset at the
1642    /// next call to `end_srcloc()`.
1643    /// Returns the current [CodeOffset] and [RelSourceLoc].
1644    pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1645        let cur = (self.cur_offset(), loc);
1646        self.cur_srcloc = Some(cur);
1647        cur
1648    }
1649
1650    /// Mark the end of the `SourceLoc` segment started at the last
1651    /// `start_srcloc()` call.
1652    pub fn end_srcloc(&mut self) {
1653        let (start, loc) = self
1654            .cur_srcloc
1655            .take()
1656            .expect("end_srcloc() called without start_srcloc()");
1657        let end = self.cur_offset();
1658        // Skip zero-length extents.
1659        debug_assert!(end >= start);
1660        if end > start {
1661            self.srclocs.push(MachSrcLoc { start, end, loc });
1662        }
1663    }
1664
1665    /// Push a user stack map onto this buffer.
1666    ///
1667    /// The stack map is associated with the given `return_addr` code
1668    /// offset. This must be the PC for the instruction just *after* this stack
1669    /// map's associated instruction. For example in the sequence `call $foo;
1670    /// add r8, rax`, the `return_addr` must be the offset of the start of the
1671    /// `add` instruction.
1672    ///
1673    /// Stack maps must be pushed in sorted `return_addr` order.
1674    pub fn push_user_stack_map(
1675        &mut self,
1676        emit_state: &I::State,
1677        return_addr: CodeOffset,
1678        mut stack_map: ir::UserStackMap,
1679    ) {
1680        let span = emit_state.frame_layout().active_size();
1681        trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1682
1683        debug_assert!(
1684            self.user_stack_maps
1685                .last()
1686                .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1687            "pushed stack maps out of order: {} is not less than {}",
1688            self.user_stack_maps.last().unwrap().0,
1689            return_addr,
1690        );
1691
1692        stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1693        self.user_stack_maps.push((return_addr, span, stack_map));
1694    }
1695
1696    /// Increase the minimum alignment of the buffer to the given power of
1697    /// two, if larger than the current minimum alignment.
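    ///
    /// For example, `align_to == 4` requests `1 << 4 == 16`-byte alignment.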
1698    pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1699        self.min_alignment = self.min_alignment.max(
1700            1u32.checked_shl(u32::from(align_to))
1701                .expect("log2_min_function_alignment too large"),
1702        );
1703    }
1704}
1705
1706impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1707    fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1708        for b in iter {
1709            self.put1(b);
1710        }
1711    }
1712}
1713
1714impl<T: CompilePhase> MachBufferFinalized<T> {
1715    /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1716    pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1717        &self.srclocs[..]
1718    }
1719
1720    /// Get the total required size for the code.
1721    pub fn total_size(&self) -> CodeOffset {
1722        self.data.len() as CodeOffset
1723    }
1724
1725    /// Return the code in this mach buffer as a hex string for testing purposes.
1726    pub fn stringify_code_bytes(&self) -> String {
1727        // This is pretty lame, but whatever ..
1728        use std::fmt::Write;
1729        let mut s = String::with_capacity(self.data.len() * 2);
1730        for b in &self.data {
1731            write!(&mut s, "{b:02X}").unwrap();
1732        }
1733        s
1734    }
1735
1736    /// Get the code bytes.
1737    pub fn data(&self) -> &[u8] {
1738        // N.B.: we emit every section into the .text section as far as
1739        // the `CodeSink` is concerned; we do not bother to segregate
1740        // the contents into the actual program text, the jumptable and the
1741        // rodata (constant pool). This allows us to generate code assuming
1742        // that these will not be relocated relative to each other, and avoids
1743        // having to designate each section as belonging in one of the three
1744        // fixed categories defined by `CodeSink`. If this becomes a problem
1745        // later (e.g. because of memory permissions or similar), we can
1746        // add this designation and segregate the output; take care, however,
1747        // to add the appropriate relocations in this case.
1748
1749        &self.data[..]
1750    }
1751
1752    /// Get the list of external relocations for this code.
1753    pub fn relocs(&self) -> &[FinalizedMachReloc] {
1754        &self.relocs[..]
1755    }
1756
1757    /// Get the list of trap records for this code.
1758    pub fn traps(&self) -> &[MachTrap] {
1759        &self.traps[..]
1760    }
1761
1762    /// Get the user stack map metadata for this code.
1763    pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1764        &self.user_stack_maps
1765    }
1766
1767    /// Take this buffer's user stack map metadata.
1768    pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1769        mem::take(&mut self.user_stack_maps)
1770    }
1771
1772    /// Get the list of call sites for this code, along with
1773    /// associated exception handlers.
1774    ///
1775    /// Each item yielded by the returned iterator is a struct with:
1776    ///
1777    /// - The call site metadata record, with a `ret_addr` field
1778    ///   directly accessible and denoting the offset of the return
1779    ///   address into this buffer's code.
1780    /// - The slice of pairs of exception tags and code offsets
1781    ///   denoting exception-handler entry points associated with this
1782    ///   call site.
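    ///
    /// A sketch of iterating the result (field names as defined on
    /// [`FinalizedMachCallSite`] below):
    ///
    /// ```ignore
    /// for site in finalized.call_sites() {
    ///     let _ret_addr = site.ret_addr;
    ///     for handler in site.exception_handlers {
    ///         // Each handler is a `FinalizedMachExceptionHandler`: `Tag`,
    ///         // `Default`, or `Context`, matched left-to-right.
    ///     }
    /// }
    /// ```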
1783    pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1784        self.call_sites.iter().map(|call_site| {
1785            let handler_range = call_site.exception_handler_range.clone();
1786            let handler_range = usize::try_from(handler_range.start).unwrap()
1787                ..usize::try_from(handler_range.end).unwrap();
1788            FinalizedMachCallSite {
1789                ret_addr: call_site.ret_addr,
1790                frame_offset: call_site.frame_offset,
1791                exception_handlers: &self.exception_handlers[handler_range],
1792            }
1793        })
1794    }
1795}
1796
1797/// An item in the exception-handler list for a callsite, with label
1798/// references.  Items are interpreted in left-to-right order and the
1799/// first match wins.
1800#[derive(Clone, Copy, Debug, PartialEq, Eq)]
1801pub enum MachExceptionHandler {
1802    /// A specific tag (in the current dynamic context) should be
1803    /// handled by the code at the given offset.
1804    Tag(ExceptionTag, MachLabel),
1805    /// All exceptions should be handled by the code at the given
1806    /// offset.
1807    Default(MachLabel),
1808    /// The dynamic context for interpreting tags is updated to the
1809    /// value stored in the given machine location (in this frame's
1810    /// context).
1811    Context(ExceptionContextLoc),
1812}
1813
1814impl MachExceptionHandler {
1815    fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {
1816        match self {
1817            Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),
1818            Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),
1819            Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),
1820        }
1821    }
1822}
1823
1824/// An item in the exception-handler list for a callsite, with final
1825/// (lowered) code offsets. Items are interpreted in left-to-right
1826/// order and the first match wins.
1827#[derive(Clone, Copy, Debug, PartialEq, Eq)]
1828#[cfg_attr(
1829    feature = "enable-serde",
1830    derive(serde_derive::Serialize, serde_derive::Deserialize)
1831)]
1832pub enum FinalizedMachExceptionHandler {
1833    /// A specific tag (in the current dynamic context) should be
1834    /// handled by the code at the given offset.
1835    Tag(ExceptionTag, CodeOffset),
1836    /// All exceptions should be handled by the code at the given
1837    /// offset.
1838    Default(CodeOffset),
1839    /// The dynamic context for interpreting tags is updated to the
1840    /// value stored in the given machine location (in this frame's
1841    /// context).
1842    Context(ExceptionContextLoc),
1843}
1844
1845/// A location for a dynamic exception context value.
1846#[derive(Clone, Copy, Debug, PartialEq, Eq)]
1847#[cfg_attr(
1848    feature = "enable-serde",
1849    derive(serde_derive::Serialize, serde_derive::Deserialize)
1850)]
1851pub enum ExceptionContextLoc {
1852    /// An offset from SP at the callsite.
1853    SPOffset(u32),
1854    /// A GPR at the callsite. The physical register number for the
1855    /// GPR register file on the target architecture is used.
1856    GPR(u8),
1857}
1858
1859/// Metadata about a constant.
1860struct MachBufferConstant {
1861    /// A label which has not yet been bound which can be used for this
1862    /// constant.
1863    ///
1864    /// This is lazily created when a label is requested for a constant and is
1865    /// cleared when a constant is emitted.
1866    upcoming_label: Option<MachLabel>,
1867    /// Required alignment.
1868    align: CodeOffset,
1869    /// The byte size of this constant.
1870    size: usize,
1871}
1872
1873/// A trap that is deferred to the next time an island is emitted for either
1874/// traps, constants, or fixups.
1875struct MachLabelTrap {
1876    /// This label will refer to the trap's offset.
1877    label: MachLabel,
1878    /// The code associated with this trap.
1879    code: TrapCode,
1880    /// An optional source location to assign for this trap.
1881    loc: Option<RelSourceLoc>,
1882}
1883
1884/// A fixup to perform on the buffer once code is emitted. Fixups always refer
1885/// to labels and patch the code based on label offsets. Hence, they are like
1886/// relocations, but internal to one buffer.
1887#[derive(Debug)]
1888struct MachLabelFixup<I: VCodeInst> {
1889    /// The label whose offset controls this fixup.
1890    label: MachLabel,
1891    /// The offset to fix up / patch to refer to this label.
1892    offset: CodeOffset,
1893    /// The kind of fixup. This is architecture-specific; each architecture may have,
1894    /// e.g., several types of branch instructions, each with differently-sized
1895    /// offset fields and different places within the instruction to place the
1896    /// bits.
1897    kind: I::LabelUse,
1898}
1899
1900impl<I: VCodeInst> MachLabelFixup<I> {
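    /// The last buffer offset that this fixup's reference can reach in the
    /// positive direction; if its label is not bound by then, a veneer must
    /// be emitted.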
1901    fn deadline(&self) -> CodeOffset {
1902        self.offset.saturating_add(self.kind.max_pos_range())
1903    }
1904}
1905
1906impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
1907    fn eq(&self, other: &Self) -> bool {
1908        self.deadline() == other.deadline()
1909    }
1910}
1911
1912impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
1913
1914impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
1915    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
1916        Some(self.cmp(other))
1917    }
1918}
1919
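// `fixup_records` is a `BinaryHeap`, i.e. a max-heap, so the ordering below
// is deliberately reversed: the fixup with the *earliest* deadline compares
// greatest and is therefore the first one yielded by `peek()`/`pop()`.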
1920impl<I: VCodeInst> Ord for MachLabelFixup<I> {
1921    fn cmp(&self, other: &Self) -> Ordering {
1922        other.deadline().cmp(&self.deadline())
1923    }
1924}
1925
1926/// A relocation resulting from a compilation.
1927#[derive(Clone, Debug, PartialEq)]
1928#[cfg_attr(
1929    feature = "enable-serde",
1930    derive(serde_derive::Serialize, serde_derive::Deserialize)
1931)]
1932pub struct MachRelocBase<T> {
1933    /// The offset at which the relocation applies, *relative to the
1934    /// containing section*.
1935    pub offset: CodeOffset,
1936    /// The kind of relocation.
1937    pub kind: Reloc,
1938    /// The external name or label to which this relocation refers.
1939    pub target: T,
1940    /// The addend to add to the symbol value.
1941    pub addend: i64,
1942}
1943
1944type MachReloc = MachRelocBase<RelocTarget>;
1945
1946/// A relocation resulting from a compilation.
1947pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
1948
1949/// A relocation target.
1950#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1951pub enum RelocTarget {
1952    /// Points to an [ExternalName] outside the current function.
1953    ExternalName(ExternalName),
1954    /// Points to a [MachLabel] inside this function.
1955    /// This is different from [MachLabelFixup] in that both the relocation and the
1956    /// label will be emitted and are only resolved at link time.
1957    ///
1958    /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
1959    Label(MachLabel),
1960}
1961
1962impl From<ExternalName> for RelocTarget {
1963    fn from(name: ExternalName) -> Self {
1964        Self::ExternalName(name)
1965    }
1966}
1967
1968impl From<MachLabel> for RelocTarget {
1969    fn from(label: MachLabel) -> Self {
1970        Self::Label(label)
1971    }
1972}
1973
1974/// A finalized relocation target.
1975#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1976#[cfg_attr(
1977    feature = "enable-serde",
1978    derive(serde_derive::Serialize, serde_derive::Deserialize)
1979)]
1980pub enum FinalizedRelocTarget {
1981    /// Points to an [ExternalName] outside the current function.
1982    ExternalName(ExternalName),
1983    /// Points to a [CodeOffset] from the start of the current function.
1984    Func(CodeOffset),
1985}
1986
1987impl FinalizedRelocTarget {
1988    /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
1989    /// output.
1990    pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
1991        match self {
1992            FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
1993            FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
1994        }
1995    }
1996}
1997
1998/// A trap record resulting from a compilation.
1999#[derive(Clone, Debug, PartialEq)]
2000#[cfg_attr(
2001    feature = "enable-serde",
2002    derive(serde_derive::Serialize, serde_derive::Deserialize)
2003)]
2004pub struct MachTrap {
2005    /// The offset at which the trap instruction occurs, *relative to the
2006    /// containing section*.
2007    pub offset: CodeOffset,
2008    /// The trap code.
2009    pub code: TrapCode,
2010}
2011
2012/// A call site record resulting from a compilation.
2013#[derive(Clone, Debug, PartialEq)]
2014#[cfg_attr(
2015    feature = "enable-serde",
2016    derive(serde_derive::Serialize, serde_derive::Deserialize)
2017)]
2018pub struct MachCallSite {
2019    /// The offset of the call's return address, *relative to the
2020    /// start of the buffer*.
2021    pub ret_addr: CodeOffset,
2022
2023    /// The offset from the FP at this callsite down to the SP when
2024    /// the call occurs, if known. In other words, the size of the
2025    /// stack frame up to the saved FP slot. Useful to recover the
2026    /// start of the stack frame and to look up dynamic contexts
2027    /// stored in [`ExceptionContextLoc::SPOffset`].
2028    ///
2029    /// If `None`, the compiler backend did not specify a frame
2030    /// offset. The runtime in use with the compiled code may require
2031    /// the frame offset if exception handlers are present or dynamic
2032    /// context is used, but that is not Cranelift's concern: the
2033    /// frame offset is optional at this level.
2034    pub frame_offset: Option<u32>,
2035
2036    /// Range in `exception_handlers` corresponding to the exception
2037    /// handlers for this callsite.
2038    exception_handler_range: Range<u32>,
2039}
2040
2041/// A call site record resulting from a compilation.
2042#[derive(Clone, Debug, PartialEq)]
2043pub struct FinalizedMachCallSite<'a> {
2044    /// The offset of the call's return address, *relative to the
2045    /// start of the buffer*.
2046    pub ret_addr: CodeOffset,
2047
2048    /// The offset from the FP at this callsite down to the SP when
2049    /// the call occurs, if known. In other words, the size of the
2050    /// stack frame up to the saved FP slot. Useful to recover the
2051    /// start of the stack frame and to look up dynamic contexts
2052    /// stored in [`ExceptionContextLoc::SPOffset`].
2053    ///
2054    /// If `None`, the compiler backend did not specify a frame
2055    /// offset. The runtime in use with the compiled code may require
2056    /// the frame offset if exception handlers are present or dynamic
2057    /// context is used, but that is not Cranelift's concern: the
2058    /// frame offset is optional at this level.
2059    pub frame_offset: Option<u32>,
2060
2061    /// Exception handlers at this callsite, with target offsets
2062    /// *relative to the start of the buffer*.
2063    pub exception_handlers: &'a [FinalizedMachExceptionHandler],
2064}
2065
2066/// A source-location mapping resulting from a compilation.
2067#[derive(PartialEq, Debug, Clone)]
2068#[cfg_attr(
2069    feature = "enable-serde",
2070    derive(serde_derive::Serialize, serde_derive::Deserialize)
2071)]
2072pub struct MachSrcLoc<T: CompilePhase> {
2073    /// The start of the region of code corresponding to a source location.
2074    /// This is relative to the start of the function, not to the start of the
2075    /// section.
2076    pub start: CodeOffset,
2077    /// The end of the region of code corresponding to a source location.
2078    /// This is relative to the start of the function, not to the start of the
2079    /// section.
2080    pub end: CodeOffset,
2081    /// The source location.
2082    pub loc: T::SourceLocType,
2083}
2084
2085impl MachSrcLoc<Stencil> {
2086    fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
2087        MachSrcLoc {
2088            start: self.start,
2089            end: self.end,
2090            loc: self.loc.expand(base_srcloc),
2091        }
2092    }
2093}
2094
2095/// Record of a branch instruction in the buffer, to facilitate editing.
2096#[derive(Clone, Debug)]
2097struct MachBranch {
2098    start: CodeOffset,
2099    end: CodeOffset,
2100    target: MachLabel,
2101    fixup: usize,
2102    inverted: Option<SmallVec<[u8; 8]>>,
2103    /// All labels pointing to the start of this branch. For correctness, this
2104    /// *must* be complete (i.e., must contain all labels whose resolved offsets
2105    /// are at the start of this branch): we rely on being able to redirect all
2106    /// labels that could jump to this branch before removing it, if it is
2107    /// otherwise unreachable.
2108    labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2109}
2110
2111impl MachBranch {
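    /// Whether this branch is conditional: only conditional branches carry an
    /// `inverted` form (the same branch with its condition flipped), which
    /// `optimize_branches` uses to edit the branch in place.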
2112    fn is_cond(&self) -> bool {
2113        self.inverted.is_some()
2114    }
2115    fn is_uncond(&self) -> bool {
2116        self.inverted.is_none()
2117    }
2118}
2119
2120/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2121///
2122/// Note that `MachBuffer` was primarily written for intra-function references
2123/// (jumps between basic blocks), but it's also quite usable for entire text
2124/// sections and resolving references between functions themselves. This
2125/// builder interprets "blocks" as labeled functions for the purposes of
2126/// resolving labels internally in the buffer.
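///
/// A sketch of intended use; `func_a`/`func_b` are hypothetical byte slices
/// of already-emitted functions, and `append`/`finish` are the
/// [`TextSectionBuilder`] entry points implemented below:
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<Inst>::new(2);
/// builder.append(true, &func_a, 16, ctrl_plane);
/// builder.append(true, &func_b, 16, ctrl_plane);
/// let text: Vec<u8> = builder.finish(ctrl_plane);
/// ```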
2127pub struct MachTextSectionBuilder<I: VCodeInst> {
2128    buf: MachBuffer<I>,
2129    next_func: usize,
2130    force_veneers: ForceVeneers,
2131}
2132
2133impl<I: VCodeInst> MachTextSectionBuilder<I> {
2134    /// Creates a new text section builder which will have `num_funcs` functions
2135    /// pushed into it.
2136    pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2137        let mut buf = MachBuffer::new();
2138        buf.reserve_labels_for_blocks(num_funcs);
2139        MachTextSectionBuilder {
2140            buf,
2141            next_func: 0,
2142            force_veneers: ForceVeneers::No,
2143        }
2144    }
2145}
2146
2147impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2148    fn append(
2149        &mut self,
2150        labeled: bool,
2151        func: &[u8],
2152        align: u32,
2153        ctrl_plane: &mut ControlPlane,
2154    ) -> u64 {
2155        // Conditionally emit an island if it's necessary to resolve jumps
2156        // between functions which are too far away.
2157        let size = func.len() as u32;
2158        if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2159            self.buf
2160                .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2161        }
2162
2163        self.buf.align_to(align);
2164        let pos = self.buf.cur_offset();
2165        if labeled {
2166            self.buf.bind_label(
2167                MachLabel::from_block(BlockIndex::new(self.next_func)),
2168                ctrl_plane,
2169            );
2170            self.next_func += 1;
2171        }
2172        self.buf.put_data(func);
2173        u64::from(pos)
2174    }
2175
2176    fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2177        crate::trace!(
2178            "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2179        );
2180        let label = MachLabel::from_block(BlockIndex::new(target));
2181        let offset = u32::try_from(offset).unwrap();
2182        match I::LabelUse::from_reloc(reloc, addend) {
2183            Some(label_use) => {
2184                self.buf.use_label_at_offset(offset, label, label_use);
2185                true
2186            }
2187            None => false,
2188        }
2189    }
2190
2191    fn force_veneers(&mut self) {
2192        self.force_veneers = ForceVeneers::Yes;
2193    }
2194
2195    fn write(&mut self, offset: u64, data: &[u8]) {
2196        self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2197    }
2198
2199    fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2200        // Double-check all functions were pushed.
2201        assert_eq!(self.next_func, self.buf.label_offsets.len());
2202
2203        // Finish up any veneers, if necessary.
2204        self.buf
2205            .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2206
2207        // We don't need the data any more, so return it to the caller.
2208        mem::take(&mut self.buf.data).into_vec()
2209    }
2210}
2211
2212// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2213#[cfg(all(test, feature = "arm64"))]
2214mod test {
2215    use cranelift_entity::EntityRef as _;
2216
2217    use super::*;
2218    use crate::ir::UserExternalNameRef;
2219    use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2220    use crate::isa::aarch64::inst::{OperandSize, xreg};
2221    use crate::machinst::{MachInstEmit, MachInstEmitState};
2222    use crate::settings;
2223
2224    fn label(n: u32) -> MachLabel {
2225        MachLabel::from_block(BlockIndex::new(n as usize))
2226    }
2227    fn target(n: u32) -> BranchTarget {
2228        BranchTarget::Label(label(n))
2229    }
2230
2231    #[test]
2232    fn test_elide_jump_to_next() {
2233        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2234        let mut buf = MachBuffer::new();
2235        let mut state = <Inst as MachInstEmit>::State::default();
2236        let constants = Default::default();
2237
2238        buf.reserve_labels_for_blocks(2);
2239        buf.bind_label(label(0), state.ctrl_plane_mut());
2240        let inst = Inst::Jump { dest: target(1) };
2241        inst.emit(&mut buf, &info, &mut state);
2242        buf.bind_label(label(1), state.ctrl_plane_mut());
2243        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2244        assert_eq!(0, buf.total_size());
2245    }
2246
2247    #[test]
2248    fn test_elide_trivial_jump_blocks() {
2249        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2250        let mut buf = MachBuffer::new();
2251        let mut state = <Inst as MachInstEmit>::State::default();
2252        let constants = Default::default();
2253
2254        buf.reserve_labels_for_blocks(4);
2255
2256        buf.bind_label(label(0), state.ctrl_plane_mut());
2257        let inst = Inst::CondBr {
2258            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2259            taken: target(1),
2260            not_taken: target(2),
2261        };
2262        inst.emit(&mut buf, &info, &mut state);
2263
2264        buf.bind_label(label(1), state.ctrl_plane_mut());
2265        let inst = Inst::Jump { dest: target(3) };
2266        inst.emit(&mut buf, &info, &mut state);
2267
2268        buf.bind_label(label(2), state.ctrl_plane_mut());
2269        let inst = Inst::Jump { dest: target(3) };
2270        inst.emit(&mut buf, &info, &mut state);
2271
2272        buf.bind_label(label(3), state.ctrl_plane_mut());
2273
2274        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2275        assert_eq!(0, buf.total_size());
2276    }
2277
2278    #[test]
2279    fn test_flip_cond() {
2280        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2281        let mut buf = MachBuffer::new();
2282        let mut state = <Inst as MachInstEmit>::State::default();
2283        let constants = Default::default();
2284
2285        buf.reserve_labels_for_blocks(4);
2286
2287        buf.bind_label(label(0), state.ctrl_plane_mut());
2288        let inst = Inst::CondBr {
2289            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2290            taken: target(1),
2291            not_taken: target(2),
2292        };
2293        inst.emit(&mut buf, &info, &mut state);
2294
2295        buf.bind_label(label(1), state.ctrl_plane_mut());
2296        let inst = Inst::Nop4;
2297        inst.emit(&mut buf, &info, &mut state);
2298
2299        buf.bind_label(label(2), state.ctrl_plane_mut());
2300        let inst = Inst::Udf {
2301            trap_code: TrapCode::STACK_OVERFLOW,
2302        };
2303        inst.emit(&mut buf, &info, &mut state);
2304
2305        buf.bind_label(label(3), state.ctrl_plane_mut());
2306
2307        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2308
2309        let mut buf2 = MachBuffer::new();
2310        let mut state = Default::default();
2311        let inst = Inst::TrapIf {
2312            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2313            trap_code: TrapCode::STACK_OVERFLOW,
2314        };
2315        inst.emit(&mut buf2, &info, &mut state);
2316        let inst = Inst::Nop4;
2317        inst.emit(&mut buf2, &info, &mut state);
2318
2319        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2320
2321        assert_eq!(buf.data, buf2.data);
2322    }
2323
2324    #[test]
2325    fn test_island() {
2326        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2327        let mut buf = MachBuffer::new();
2328        let mut state = <Inst as MachInstEmit>::State::default();
2329        let constants = Default::default();
2330
2331        buf.reserve_labels_for_blocks(4);
2332
2333        buf.bind_label(label(0), state.ctrl_plane_mut());
2334        let inst = Inst::CondBr {
2335            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2336            taken: target(2),
2337            not_taken: target(3),
2338        };
2339        inst.emit(&mut buf, &info, &mut state);
2340
2341        buf.bind_label(label(1), state.ctrl_plane_mut());
2342        while buf.cur_offset() < 2000000 {
2343            if buf.island_needed(0) {
2344                buf.emit_island(0, state.ctrl_plane_mut());
2345            }
2346            let inst = Inst::Nop4;
2347            inst.emit(&mut buf, &info, &mut state);
2348        }
2349
2350        buf.bind_label(label(2), state.ctrl_plane_mut());
2351        let inst = Inst::Nop4;
2352        inst.emit(&mut buf, &info, &mut state);
2353
2354        buf.bind_label(label(3), state.ctrl_plane_mut());
2355        let inst = Inst::Nop4;
2356        inst.emit(&mut buf, &info, &mut state);
2357
2358        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2359
2360        assert_eq!(2000000 + 8, buf.total_size());
2361
2362        let mut buf2 = MachBuffer::new();
2363        let mut state = Default::default();
2364        let inst = Inst::CondBr {
2365            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2366
2367            // This conditionally taken branch has a 19-bit constant, shifted
2368            // to the left by two, giving us a 21-bit range in total. Half of
2369            // this range is positive, so we should be around 1 << 20 bytes
2370            // away from our jump target.
2371            //
2372            // There are two pending fixups by the time we reach this point,
2373            // one for this 19-bit jump and one for the unconditional 26-bit
2374            // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2375            // veneer is 20 bytes large, and we pessimistically assume that
2376            // both fixups will need a veneer. Each veneer is additionally
2377            // assumed to be the maximal size, which means we need 40 bytes
2378            // of extra space, meaning that the actual island should come
2379            // 40 bytes before the deadline.
2380            taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2381
2382            // This branch is in-range so no veneers should be needed, it should
2383            // go directly to the target.
2384            not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2385        };
2386        inst.emit(&mut buf2, &info, &mut state);
2387
2388        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2389
2390        assert_eq!(&buf.data[0..8], &buf2.data[..]);
2391    }
2392
2393    #[test]
2394    fn test_island_backward() {
2395        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2396        let mut buf = MachBuffer::new();
2397        let mut state = <Inst as MachInstEmit>::State::default();
2398        let constants = Default::default();
2399
2400        buf.reserve_labels_for_blocks(4);
2401
2402        buf.bind_label(label(0), state.ctrl_plane_mut());
2403        let inst = Inst::Nop4;
2404        inst.emit(&mut buf, &info, &mut state);
2405
2406        buf.bind_label(label(1), state.ctrl_plane_mut());
2407        let inst = Inst::Nop4;
2408        inst.emit(&mut buf, &info, &mut state);
2409
2410        buf.bind_label(label(2), state.ctrl_plane_mut());
2411        while buf.cur_offset() < 2000000 {
2412            let inst = Inst::Nop4;
2413            inst.emit(&mut buf, &info, &mut state);
2414        }
2415
2416        buf.bind_label(label(3), state.ctrl_plane_mut());
2417        let inst = Inst::CondBr {
2418            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2419            taken: target(0),
2420            not_taken: target(1),
2421        };
2422        inst.emit(&mut buf, &info, &mut state);
2423
2424        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2425
2426        assert_eq!(2000000 + 12, buf.total_size());
2427
2428        let mut buf2 = MachBuffer::new();
2429        let mut state = Default::default();
2430        let inst = Inst::CondBr {
2431            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2432            taken: BranchTarget::ResolvedOffset(8),
2433            not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2434        };
2435        inst.emit(&mut buf2, &info, &mut state);
2436        let inst = Inst::Jump {
2437            dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2438        };
2439        inst.emit(&mut buf2, &info, &mut state);
2440
2441        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2442
2443        assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2444    }
2445
2446    #[test]
2447    fn test_multiple_redirect() {
2448        // label0:
2449        //   cbz x0, label1
2450        //   b label2
2451        // label1:
2452        //   b label3
2453        // label2:
2454        //   nop
2455        //   nop
2456        //   b label0
2457        // label3:
2458        //   b label4
2459        // label4:
2460        //   b label5
2461        // label5:
2462        //   b label7
2463        // label6:
2464        //   nop
2465        // label7:
2466        //   ret
2467        //
2468        // -- should become:
2469        //
2470        // label0:
2471        //   cbz x0, label7
2472        // label2:
2473        //   nop
2474        //   nop
2475        //   b label0
2476        // label6:
2477        //   nop
2478        // label7:
2479        //   ret

        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(8);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);
        inst.emit(&mut buf, &info, &mut state);
        let inst = Inst::Jump { dest: target(0) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(4), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(5) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(5), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(7) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(6), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(7), state.ctrl_plane_mut());
        let inst = Inst::Ret {};
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let golden_data = vec![
            0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
            0x1f, 0x20, 0x03, 0xd5, // nop
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xfd, 0xff, 0xff, 0x17, // b 0
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xc0, 0x03, 0x5f, 0xd6, // ret
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn test_handle_branch_cycle() {
        // label0:
        //   b label1
        // label1:
        //   b label2
        // label2:
        //   b label3
        // label3:
        //   b label4
        // label4:
        //   b label1  // note: not label0 (to make it interesting).
        //
        // -- should become:
        //
        // label0, label1, ..., label4:
        //   b label0
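        //
        // Every label here is bound at a branch-only block, so redirect
        // chasing would loop forever if it followed the chain naively; the
        // buffer must detect the cycle and collapse all five labels onto a
        // single self-branch at offset 0.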
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(5);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(2) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(4), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let golden_data = vec![
            0x00, 0x00, 0x00, 0x14, // b 0
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn metadata_records() {
        let mut buf = MachBuffer::<Inst>::new();
        let ctrl_plane = &mut Default::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(3);

        buf.bind_label(label(0), ctrl_plane);
        buf.put1(1);
        buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
        buf.put1(2);
        buf.add_trap(TrapCode::INTEGER_OVERFLOW);
        buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
        buf.add_try_call_site(
            Some(0x10),
            [
                MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),
                MachExceptionHandler::Default(label(1)),
            ]
            .into_iter(),
        );
        buf.add_reloc(
            Reloc::Abs4,
            &ExternalName::User(UserExternalNameRef::new(0)),
            0,
        );
        buf.put1(3);
        buf.add_reloc(
            Reloc::Abs8,
            &ExternalName::User(UserExternalNameRef::new(1)),
            1,
        );
        buf.put1(4);
        buf.bind_label(label(1), ctrl_plane);
        buf.put1(0xff);
        buf.bind_label(label(2), ctrl_plane);
        buf.put1(0xff);

        let buf = buf.finish(&constants, ctrl_plane);

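        // Each metadata record is tied to the buffer offset current when it
        // was added: the traps land at offsets 1 and 2, the try-call site's
        // return address is 2, and the relocs sit at 2 and 3. Labels 1 and 2
        // bind at offsets 4 and 5, so the exception handlers resolve to
        // those finalized offsets.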
        assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
        assert_eq!(
            buf.traps()
                .iter()
                .map(|trap| (trap.offset, trap.code))
                .collect::<Vec<_>>(),
            vec![
                (1, TrapCode::HEAP_OUT_OF_BOUNDS),
                (2, TrapCode::INTEGER_OVERFLOW),
                (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
            ]
        );
        let call_sites: Vec<_> = buf.call_sites().collect();
        assert_eq!(call_sites[0].ret_addr, 2);
        assert_eq!(call_sites[0].frame_offset, Some(0x10));
        assert_eq!(
            call_sites[0].exception_handlers,
            &[
                FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),
                FinalizedMachExceptionHandler::Default(4)
            ],
        );
        assert_eq!(
            buf.relocs()
                .iter()
                .map(|reloc| (reloc.offset, reloc.kind))
                .collect::<Vec<_>>(),
            vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
        );
    }
}