Skip to main content

cranelift_codegen/machinst/
buffer.rs

1//! In-memory representation of compiled machine code, with labels and fixups to
2//! refer to those labels. Handles constant-pool island insertion and also
3//! veneer insertion for out-of-range jumps.
4//!
5//! This code exists to solve three problems:
6//!
7//! - Branch targets for forward branches are not known until later, when we
8//!   emit code in a single pass through the instruction structs.
9//!
10//! - On many architectures, address references or offsets have limited range.
11//!   For example, on AArch64, conditional branches can only target code +/- 1MB
12//!   from the branch itself.
13//!
14//! - The lowering of control flow from the CFG-with-edges produced by
15//!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16//!   edge blocks when the register allocator does not need to insert any
17//!   spills/reloads/moves in edge blocks, results in many suboptimal branch
18//!   patterns. The lowering also pays no attention to block order, and so
19//!   two-target conditional forms (cond-br followed by uncond-br) can often by
20//!   avoided because one of the targets is the fallthrough. There are several
21//!   cases here where we can simplify to use fewer branches.
22//!
23//! This "buffer" implements a single-pass code emission strategy (with a later
24//! "fixup" pass, but only through recorded fixups, not all instructions). The
25//! basic idea is:
26//!
27//! - Emit branches as they are, including two-target (cond/uncond) compound
28//!   forms, but with zero offsets and optimistically assuming the target will be
29//!   in range. Record the "fixup" for later. Targets are denoted instead by
30//!   symbolic "labels" that are then bound to certain offsets in the buffer as
31//!   we emit code. (Nominally, there is a label at the start of every basic
32//!   block.)
33//!
34//! - As we do this, track the offset in the buffer at which the first label
35//!   reference "goes out of range". We call this the "deadline". If we reach the
36//!   deadline and we still have not bound the label to which an unresolved branch
37//!   refers, we have a problem!
38//!
39//! - To solve this problem, we emit "islands" full of "veneers". An island is
40//!   simply a chunk of code inserted in the middle of the code actually produced
41//!   by the emitter (e.g., VCode iterating over instruction structs). Islands
42//!   are emitted at "safe" points (no fall-through into the island contents):
43//!   between basic blocks during emission, or via a jump around the island.
44//!
45//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
46//!   that implements a longer-range reference to a label. The idea is that, for
47//!   example, a branch with a limited range can branch to a "veneer" instead,
48//!   which is simply a branch in a form that can use a longer-range reference. On
49//!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
50//!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
51//!   conditional branch's label reference can be fixed up with a "veneer" to
52//!   achieve a longer range.
53//!
54//! - To implement all of this, we require the backend to provide a `LabelUse`
55//!   type that implements a trait. This is nominally an enum that records one of
56//!   several kinds of references to an offset in code -- basically, a relocation
57//!   type -- and will usually correspond to different instruction formats. The
58//!   `LabelUse` implementation specifies the maximum range, how to patch in the
59//!   actual label location when known, and how to generate a veneer to extend the
60//!   range.
61//!
62//! That satisfies label references, but we still may have suboptimal branch
63//! patterns. To clean up the branches, we do a simple "peephole"-style
64//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
65//! informs the buffer of branches in the code and, in the case of conditionals,
66//! the code that would have been emitted to invert this branch's condition. We
67//! track the "latest branches": these are branches that are contiguous up to
68//! the current offset. (If any code is emitted after a branch, that branch or
69//! run of contiguous branches is no longer "latest".) The latest branches are
70//! those that we can edit by simply truncating the buffer and doing something
71//! else instead.
72//!
73//! To optimize branches, we implement several simple rules, and try to apply
74//! them to the "latest branches" when possible:
75//!
76//! - A branch with a label target, when that label is bound to the ending
77//!   offset of the branch (the fallthrough location), can be removed altogether,
78//!   because the branch would have no effect).
79//!
80//! - An unconditional branch that starts at a label location, and branches to
81//!   another label, results in a "label alias": all references to the label bound
82//!   *to* this branch instruction are instead resolved to the *target* of the
83//!   branch instruction. This effectively removes empty blocks that just
84//!   unconditionally branch to the next block. We call this "branch threading".
85//!
86//! - A conditional followed by an unconditional, when the conditional branches
87//!   to the unconditional's fallthrough, results in (i) the truncation of the
88//!   unconditional, (ii) the inversion of the condition's condition, and (iii)
89//!   replacement of the conditional's target (using the original target of the
90//!   unconditional). This is a fancy way of saying "we can flip a two-target
91//!   conditional branch's taken/not-taken targets if it works better with our
92//!   fallthrough". To make this work, the emitter actually gives the buffer
93//!   *both* forms of every conditional branch: the true form is emitted into the
94//!   buffer, and the "inverted" machine-code bytes are provided as part of the
95//!   branch-fixup metadata.
96//!
97//! - An unconditional B preceded by another unconditional P, when B's label(s) have
98//!   been redirected to target(B), can be removed entirely. This is an extension
99//!   of the branch-threading optimization, and is valid because if we know there
100//!   will be no fallthrough into this branch instruction (the prior instruction
101//!   is an unconditional jump), and if we know we have successfully redirected
102//!   all labels, then this branch instruction is unreachable. Note that this
103//!   works because the redirection happens before the label is ever resolved
104//!   (fixups happen at island emission time, at which point latest-branches are
105//!   cleared, or at the end of emission), so we are sure to catch and redirect
106//!   all possible paths to this instruction.
107//!
108//! # Branch-optimization Correctness
109//!
110//! The branch-optimization mechanism depends on a few data structures with
111//! invariants, which are always held outside the scope of top-level public
112//! methods:
113//!
114//! - The latest-branches list. Each entry describes a span of the buffer
115//!   (start/end offsets), the label target, the corresponding fixup-list entry
116//!   index, and the bytes (must be the same length) for the inverted form, if
117//!   conditional. The list of labels that are bound to the start-offset of this
118//!   branch is *complete* (if any label has a resolved offset equal to `start`
119//!   and is not an alias, it must appear in this list) and *precise* (no label
120//!   in this list can be bound to another offset). No label in this list should
121//!   be an alias.  No two branch ranges can overlap, and branches are in
122//!   ascending-offset order.
123//!
124//! - The labels-at-tail list. This contains all MachLabels that have been bound
125//!   to (whose resolved offsets are equal to) the tail offset of the buffer.
126//!   No label in this list should be an alias.
127//!
128//! - The label_offsets array, containing the bound offset of a label or
129//!   UNKNOWN. No label can be bound at an offset greater than the current
130//!   buffer tail.
131//!
132//! - The label_aliases array, containing another label to which a label is
133//!   bound or UNKNOWN. A label's resolved offset is the resolved offset
134//!   of the label it is aliased to, if this is set.
135//!
136//! We argue below, at each method, how the invariants in these data structures
137//! are maintained (grep for "Post-invariant").
138//!
139//! Given these invariants, we argue why each optimization preserves execution
140//! semantics below (grep for "Preserves execution semantics").
141//!
142//! # Deadline-correctness for islands
143//!
144//! Every label-use (and indirectly every pending constant/trap, since
145//! each is referred to by a fixup) imposes a *deadline*: the maximum
146//! offset at which the use's target may be bound while still
147//! remaining in range. Each item that may be emitted into an island
148//! (a veneer, a pending constant, or a pending trap) also contributes
149//! a bounded number of bytes to a worst-case island size. The
150//! buffer's central invariant is:
151//!
152//! > `worst_case_end_of_island(0) <= soonest_deadline`
153//!
154//! Equivalently, "if we emitted an island right now, its end offset
155//! would land before the closest expiring deadline." Given this
156//! invariant, an island is always *feasible*: items can be laid out
157//! in any order and each one lands at an offset no later than the
158//! soonest deadline, which is no later than each individual item's
159//! deadline.
160//!
161//! To maintain the invariant, the buffer's user is expected to treat
162//! *one `MachInst` emission* as the atomic commit unit. After each
163//! instruction, the worst-case end-of-island and the soonest deadline
164//! can shift by no more than `worst_case_size() +
165//! worst_case_island_growth()` and one "smallest label-use range"
166//! worth of new deadline, respectively. The user (in VCode emission)
167//! consults [`MachBuffer::island_needed`] after each instruction and
168//! if one is needed, emits a jump-around branch followed by
169//! [`MachBuffer::emit_island`].
170//!
171//! # Avoiding Quadratic Behavior
172//!
173//! There are two cases where we've had to take some care to avoid
174//! quadratic worst-case behavior:
175//!
176//! - The "labels at this branch" list can grow unboundedly if the
177//!   code generator binds many labels at one location. If the count
178//!   gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
179//!   simply abort an optimization early in a way that is always correct
180//!   but is conservative.
181//!
182//! - The fixup list can interact with island emission to create
183//!   "quadratic island behavior". In a little more detail, one can hit
184//!   this behavior by having some pending fixups (forward label
185//!   references) with long-range label-use kinds, and some others
186//!   with shorter-range references that nonetheless still are pending
187//!   long enough to trigger island generation. In such a case, we
188//!   process the fixup list, generate veneers to extend some forward
189//!   references' ranges, but leave the other (longer-range) ones
190//!   alone. The way this was implemented put them back on a list and
191//!   resulted in quadratic behavior.
192//!
193//!   To avoid this fixups are split into two lists: one "pending" list and one
194//!   final list. The pending list is kept around for handling fixups related to
195//!   branches so it can be edited/truncated. When an island is reached, which
196//!   starts processing fixups, all pending fixups are flushed into the final
197//!   list. The final list is a `BinaryHeap` which enables fixup processing to
198//!   only process those which are required during island emission, deferring
199//!   all longer-range fixups to later.
200
201use crate::binemit::{Addend, CodeOffset, Reloc};
202use crate::ir::function::FunctionParameters;
203use crate::ir::{DebugTag, ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
204use crate::isa::unwind::UnwindInst;
205use crate::machinst::{
206    BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
207};
208use crate::trace;
209use crate::{MachInstEmitState, ir};
210use crate::{VCodeConstantData, timing};
211use alloc::collections::BinaryHeap;
212use alloc::string::String;
213use alloc::vec::Vec;
214use core::cmp::Ordering;
215use core::mem;
216use core::ops::Range;
217use cranelift_control::ControlPlane;
218use cranelift_entity::{PrimaryMap, SecondaryMap, entity_impl};
219use smallvec::SmallVec;
220
221#[cfg(feature = "enable-serde")]
222use serde::{Deserialize, Serialize};
223
224#[cfg(feature = "enable-serde")]
225pub trait CompilePhase {
226    type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
227    type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
228}
229
230#[cfg(not(feature = "enable-serde"))]
231pub trait CompilePhase {
232    type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
233    type SourceLocType: core::fmt::Debug + PartialEq + Clone;
234}
235
236/// Status of a compiled artifact that needs patching before being used.
237#[derive(Clone, Debug, PartialEq)]
238#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
239pub struct Stencil;
240
241/// Status of a compiled artifact ready to use.
242#[derive(Clone, Debug, PartialEq)]
243pub struct Final;
244
245impl CompilePhase for Stencil {
246    type MachSrcLocType = MachSrcLoc<Stencil>;
247    type SourceLocType = RelSourceLoc;
248}
249
250impl CompilePhase for Final {
251    type MachSrcLocType = MachSrcLoc<Final>;
252    type SourceLocType = SourceLoc;
253}
254
255#[derive(Clone, Copy, Debug, PartialEq, Eq)]
256enum ForceVeneers {
257    Yes,
258    No,
259}
260
261/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
262/// in bulk.
263///
264/// This struct uses `SmallVec`s to support small-ish function bodies without
265/// any heap allocation. As such, it will be several kilobytes large. This is
266/// likely fine as long as it is stack-allocated for function emission then
267/// thrown away; but beware if many buffer objects are retained persistently.
268pub struct MachBuffer<I: VCodeInst> {
269    /// The buffer contents, as raw bytes.
270    data: SmallVec<[u8; 1024]>,
271    /// The required alignment of this buffer.
272    min_alignment: u32,
273    /// Any relocations referring to this code. Note that only *external*
274    /// relocations are tracked here; references to labels within the buffer are
275    /// resolved before emission.
276    relocs: SmallVec<[MachReloc; 16]>,
277    /// Any trap records referring to this code.
278    traps: SmallVec<[MachTrap; 16]>,
279    /// Any call site records referring to this code.
280    call_sites: SmallVec<[MachCallSite; 16]>,
281    /// Any patchable call site locations.
282    patchable_call_sites: SmallVec<[MachPatchableCallSite; 16]>,
283    /// Any exception-handler records referred to at call sites.
284    exception_handlers: SmallVec<[MachExceptionHandler; 16]>,
285    /// Any source location mappings referring to this code.
286    srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
287    /// Any debug tags referring to this code.
288    debug_tags: Vec<MachDebugTags>,
289    /// Pool of debug tags referenced by `MachDebugTags` entries.
290    debug_tag_pool: Vec<DebugTag>,
291    /// Any user stack maps for this code.
292    ///
293    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
294    /// by code offset, and each stack map covers `span` bytes on the stack.
295    user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
296    /// Any unwind info at a given location.
297    unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
298    /// The current source location in progress (after `start_srcloc()` and
299    /// before `end_srcloc()`).  This is a (start_offset, src_loc) tuple.
300    cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
301    /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
302    label_offsets: SmallVec<[CodeOffset; 16]>,
303    /// Label aliases: when one label points to an unconditional jump, and that
304    /// jump points to another label, we can redirect references to the first
305    /// label immediately to the second.
306    ///
307    /// Invariant: we don't have label-alias cycles. We ensure this by,
308    /// before setting label A to alias label B, resolving B's alias
309    /// target (iteratively until a non-aliased label); if B is already
310    /// aliased to A, then we cannot alias A back to B.
311    label_aliases: SmallVec<[MachLabel; 16]>,
312    /// Constants that must be emitted at some point.
313    pending_constants: SmallVec<[VCodeConstant; 16]>,
314    /// Byte size of all constants in `pending_constants`.
315    pending_constants_size: CodeOffset,
316    /// Traps that must be emitted at some point.
317    pending_traps: SmallVec<[MachLabelTrap; 16]>,
318    /// Fixups that haven't yet been flushed into `fixup_records` below and may
319    /// be related to branches that are chomped. These all get added to
320    /// `fixup_records` during island emission.
321    pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
322    /// The nearest upcoming deadline for entries in `pending_fixup_records`.
323    pending_fixup_deadline: CodeOffset,
324    /// Fixups that must be performed after all code is emitted.
325    fixup_records: BinaryHeap<MachLabelFixup<I>>,
326    /// Latest branches, to facilitate in-place editing for better fallthrough
327    /// behavior and empty-block removal.
328    latest_branches: SmallVec<[MachBranch; 4]>,
329    /// All labels at the current offset (emission tail). This is lazily
330    /// cleared: it is actually accurate as long as the current offset is
331    /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
332    /// be considered as empty.
333    ///
334    /// For correctness, this *must* be complete (i.e., the vector must contain
335    /// all labels whose offsets are resolved to the current tail), because we
336    /// rely on it to update labels when we truncate branches.
337    labels_at_tail: SmallVec<[MachLabel; 4]>,
338    /// The last offset at which `labels_at_tail` is valid. It is conceptually
339    /// always describing the tail of the buffer, but we do not clear
340    /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
341    /// when the offset has grown past this (`labels_at_tail_off`) point.
342    /// Always <= `cur_offset()`.
343    labels_at_tail_off: CodeOffset,
344    /// Metadata about all constants that this function has access to.
345    ///
346    /// This records the size/alignment of all constants (not the actual data)
347    /// along with the last available label generated for the constant. This map
348    /// is consulted when constants are referred to and the label assigned to a
349    /// constant may change over time as well.
350    constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
351    /// All recorded usages of constants as pairs of the constant and where the
352    /// constant needs to be placed within `self.data`. Note that the same
353    /// constant may appear in this array multiple times if it was emitted
354    /// multiple times.
355    used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
356    /// Indicates when a patchable region is currently open, to guard that it's
357    /// not possible to nest patchable regions.
358    open_patchable: bool,
359    /// Stack frame layout metadata. If provided for a MachBuffer
360    /// containing a function body, this allows interpretation of
361    /// runtime state given a view of an active stack frame.
362    frame_layout: Option<MachBufferFrameLayout>,
363}
364
365impl MachBufferFinalized<Stencil> {
366    /// Get a finalized machine buffer by applying the function's base source location.
367    pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
368        MachBufferFinalized {
369            data: self.data,
370            relocs: self.relocs,
371            traps: self.traps,
372            call_sites: self.call_sites,
373            patchable_call_sites: self.patchable_call_sites,
374            exception_handlers: self.exception_handlers,
375            srclocs: self
376                .srclocs
377                .into_iter()
378                .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
379                .collect(),
380            debug_tags: self.debug_tags,
381            debug_tag_pool: self.debug_tag_pool,
382            user_stack_maps: self.user_stack_maps,
383            unwind_info: self.unwind_info,
384            alignment: self.alignment,
385            frame_layout: self.frame_layout,
386            nop_units: self.nop_units,
387        }
388    }
389}
390
391/// A `MachBuffer` once emission is completed: holds generated code and records,
392/// without fixups. This allows the type to be independent of the backend.
393#[derive(PartialEq, Debug, Clone)]
394#[cfg_attr(
395    feature = "enable-serde",
396    derive(serde_derive::Serialize, serde_derive::Deserialize)
397)]
398pub struct MachBufferFinalized<T: CompilePhase> {
399    /// The buffer contents, as raw bytes.
400    pub(crate) data: SmallVec<[u8; 1024]>,
401    /// Any relocations referring to this code. Note that only *external*
402    /// relocations are tracked here; references to labels within the buffer are
403    /// resolved before emission.
404    pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
405    /// Any trap records referring to this code.
406    pub(crate) traps: SmallVec<[MachTrap; 16]>,
407    /// Any call site records referring to this code.
408    pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
409    /// Any patchable call site locations referring to this code.
410    pub(crate) patchable_call_sites: SmallVec<[MachPatchableCallSite; 16]>,
411    /// Any exception-handler records referred to at call sites.
412    pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,
413    /// Any source location mappings referring to this code.
414    pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
415    /// Any debug tags referring to this code.
416    pub(crate) debug_tags: Vec<MachDebugTags>,
417    /// Pool of debug tags referenced by `MachDebugTags` entries.
418    pub(crate) debug_tag_pool: Vec<DebugTag>,
419    /// Any user stack maps for this code.
420    ///
421    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
422    /// by code offset, and each stack map covers `span` bytes on the stack.
423    pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
424    /// Stack frame layout metadata. If provided for a MachBuffer
425    /// containing a function body, this allows interpretation of
426    /// runtime state given a view of an active stack frame.
427    pub(crate) frame_layout: Option<MachBufferFrameLayout>,
428    /// Any unwind info at a given location.
429    pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
430    /// The required alignment of this buffer.
431    pub alignment: u32,
432    /// The means by which to NOP out patchable call sites.
433    ///
434    /// This allows a consumer of a `MachBufferFinalized` to disable
435    /// patchable call sites (which are enabled by default) without
436    /// specific knowledge of the target ISA.
437    ///
438    /// Each entry is one form of nop, and these are required to be
439    /// sorted in ascending-size order.
440    pub nop_units: Vec<Vec<u8>>,
441}
442
443const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
444const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
445
446/// Threshold on max length of `labels_at_this_branch` list to avoid
447/// unbounded quadratic behavior (see comment below at use-site).
448const LABEL_LIST_THRESHOLD: usize = 100;
449
450/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
451/// the point at which it is used by emitted code; the buffer records "fixups"
452/// for references to the label, and will come back and patch the code
453/// appropriately when the label's location is eventually known.
454#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
455pub struct MachLabel(u32);
456entity_impl!(MachLabel);
457
458impl MachLabel {
459    /// Get a label for a block. (The first N MachLabels are always reserved for
460    /// the N blocks in the vcode.)
461    pub fn from_block(bindex: BlockIndex) -> MachLabel {
462        MachLabel(bindex.index() as u32)
463    }
464
465    /// Creates a string representing this label, for convenience.
466    pub fn to_string(&self) -> String {
467        format!("label{}", self.0)
468    }
469}
470
471impl Default for MachLabel {
472    fn default() -> Self {
473        UNKNOWN_LABEL
474    }
475}
476
477/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
478/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
479/// the [`OpenPatchRegion`] token in the process.
480pub struct OpenPatchRegion(usize);
481
482/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
483/// of where you might want to use this is for patching instructions that mention constants that
484/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
485/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
486/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
487/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
488/// bytes, and the constants uses can be updated directly.
489pub struct PatchRegion {
490    range: Range<usize>,
491}
492
493impl PatchRegion {
494    /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
495    pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
496        &mut buffer.data[self.range]
497    }
498}
499
500impl<I: VCodeInst> MachBuffer<I> {
501    /// Create a new section, known to start at `start_offset` and with a size limited to
502    /// `length_limit`.
503    pub fn new() -> MachBuffer<I> {
504        MachBuffer {
505            data: SmallVec::new(),
506            min_alignment: I::function_alignment().minimum,
507            relocs: SmallVec::new(),
508            traps: SmallVec::new(),
509            call_sites: SmallVec::new(),
510            patchable_call_sites: SmallVec::new(),
511            exception_handlers: SmallVec::new(),
512            srclocs: SmallVec::new(),
513            debug_tags: vec![],
514            debug_tag_pool: vec![],
515            user_stack_maps: SmallVec::new(),
516            unwind_info: SmallVec::new(),
517            cur_srcloc: None,
518            label_offsets: SmallVec::new(),
519            label_aliases: SmallVec::new(),
520            pending_constants: SmallVec::new(),
521            pending_constants_size: 0,
522            pending_traps: SmallVec::new(),
523            pending_fixup_records: SmallVec::new(),
524            pending_fixup_deadline: u32::MAX,
525            fixup_records: Default::default(),
526            latest_branches: SmallVec::new(),
527            labels_at_tail: SmallVec::new(),
528            labels_at_tail_off: 0,
529            constants: Default::default(),
530            used_constants: Default::default(),
531            open_patchable: false,
532            frame_layout: None,
533        }
534    }
535
536    /// Current offset from start of buffer.
537    pub fn cur_offset(&self) -> CodeOffset {
538        self.data.len() as CodeOffset
539    }
540
541    /// Add a byte.
542    pub fn put1(&mut self, value: u8) {
543        self.data.push(value);
544
545        // Post-invariant: conceptual-labels_at_tail contains a complete and
546        // precise list of labels bound at `cur_offset()`. We have advanced
547        // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
548        // before, it is not anymore (and it cannot become equal, because
549        // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
550        // conceptually empty (even though it is only lazily cleared). No labels
551        // can be bound at this new offset (by invariant on `label_offsets`).
552        // Hence the invariant holds.
553    }
554
555    /// Add 2 bytes.
556    pub fn put2(&mut self, value: u16) {
557        let bytes = value.to_le_bytes();
558        self.data.extend_from_slice(&bytes[..]);
559
560        // Post-invariant: as for `put1()`.
561    }
562
563    /// Add 4 bytes.
564    pub fn put4(&mut self, value: u32) {
565        let bytes = value.to_le_bytes();
566        self.data.extend_from_slice(&bytes[..]);
567
568        // Post-invariant: as for `put1()`.
569    }
570
571    /// Add 8 bytes.
572    pub fn put8(&mut self, value: u64) {
573        let bytes = value.to_le_bytes();
574        self.data.extend_from_slice(&bytes[..]);
575
576        // Post-invariant: as for `put1()`.
577    }
578
579    /// Add a slice of bytes.
580    pub fn put_data(&mut self, data: &[u8]) {
581        self.data.extend_from_slice(data);
582
583        // Post-invariant: as for `put1()`.
584    }
585
586    /// Reserve appended space and return a mutable slice referring to it.
587    pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
588        let off = self.data.len();
589        let new_len = self.data.len() + len;
590        self.data.resize(new_len, 0);
591        &mut self.data[off..]
592
593        // Post-invariant: as for `put1()`.
594    }
595
596    /// Align up to the given alignment.
597    pub fn align_to(&mut self, align_to: CodeOffset) {
598        trace!("MachBuffer: align to {}", align_to);
599        assert!(
600            align_to.is_power_of_two(),
601            "{align_to} is not a power of two"
602        );
603        while self.cur_offset() & (align_to - 1) != 0 {
604            self.put1(0);
605        }
606
607        // Post-invariant: as for `put1()`.
608    }
609
610    /// Begin a region of patchable code. There is one requirement for the
611    /// code that is emitted: It must not introduce any instructions that
612    /// could be chomped (branches are an example of this). In other words,
613    /// you must not call [`MachBuffer::add_cond_branch`] or
614    /// [`MachBuffer::add_uncond_branch`] between calls to this method and
615    /// [`MachBuffer::end_patchable`].
616    pub fn start_patchable(&mut self) -> OpenPatchRegion {
617        assert!(!self.open_patchable, "Patchable regions may not be nested");
618        self.open_patchable = true;
619        OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
620    }
621
622    /// End a region of patchable code, yielding a [`PatchRegion`] value that
623    /// can be consumed later to produce a one-off mutable slice to the
624    /// associated region of the data buffer.
625    pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
626        // No need to assert the state of `open_patchable` here, as we take
627        // ownership of the only `OpenPatchable` value.
628        self.open_patchable = false;
629        let end = usize::try_from(self.cur_offset()).unwrap();
630        PatchRegion { range: open.0..end }
631    }
632
633    /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
634    /// offset yet.
635    pub fn get_label(&mut self) -> MachLabel {
636        let l = self.label_offsets.len() as u32;
637        self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
638        self.label_aliases.push(UNKNOWN_LABEL);
639        trace!("MachBuffer: new label -> {:?}", MachLabel(l));
640        MachLabel(l)
641
642        // Post-invariant: the only mutation is to add a new label; it has no
643        // bound offset yet, so it trivially satisfies all invariants.
644    }
645
646    /// Reserve the first N MachLabels for blocks.
647    pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
648        trace!("MachBuffer: first {} labels are for blocks", blocks);
649        debug_assert!(self.label_offsets.is_empty());
650        self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
651        self.label_aliases.resize(blocks, UNKNOWN_LABEL);
652
653        // Post-invariant: as for `get_label()`.
654    }
655
656    /// Registers metadata in this `MachBuffer` about the `constants` provided.
657    ///
658    /// This will record the size/alignment of all constants which will prepare
659    /// them for emission later on.
660    pub fn register_constants(&mut self, constants: &VCodeConstants) {
661        for (c, val) in constants.iter() {
662            self.register_constant(&c, val);
663        }
664    }
665
666    /// Similar to [`MachBuffer::register_constants`] but registers a
667    /// single constant metadata. This function is useful in
668    /// situations where not all constants are known at the time of
669    /// emission.
670    pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
671        let c2 = self.constants.push(MachBufferConstant {
672            upcoming_label: None,
673            align: data.alignment(),
674            size: data.as_slice().len(),
675        });
676        assert_eq!(*constant, c2);
677    }
678
679    /// Completes constant emission by iterating over `self.used_constants` and
680    /// filling in the "holes" with the constant values provided by `constants`.
681    ///
682    /// Returns the alignment required for this entire buffer. Alignment starts
683    /// at the ISA's minimum function alignment and can be increased due to
684    /// constant requirements.
685    fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
686        let mut alignment = self.min_alignment;
687        for (constant, offset) in mem::take(&mut self.used_constants) {
688            let constant = constants.get(constant);
689            let data = constant.as_slice();
690            self.data[offset as usize..][..data.len()].copy_from_slice(data);
691            alignment = constant.alignment().max(alignment);
692        }
693        alignment
694    }
695
696    /// Returns a label that can be used to refer to the `constant` provided.
697    ///
698    /// This will automatically defer a new constant to be emitted for
699    /// `constant` if it has not been previously emitted. Note that this
700    /// function may return a different label for the same constant at
701    /// different points in time. The label is valid to use only from the
702    /// current location; the MachBuffer takes care to emit the same constant
703    /// multiple times if needed so the constant is always in range.
704    pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
705        let MachBufferConstant {
706            align,
707            size,
708            upcoming_label,
709        } = self.constants[constant];
710        if let Some(label) = upcoming_label {
711            return label;
712        }
713
714        let label = self.get_label();
715        trace!(
716            "defer constant: eventually emit {size} bytes aligned \
717             to {align} at label {label:?}",
718        );
719        self.pending_constants.push(constant);
720        self.pending_constants_size += size as u32;
721        self.constants[constant].upcoming_label = Some(label);
722        label
723    }
724
725    /// Bind a label to the current offset. A label can only be bound once.
726    pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
727        trace!(
728            "MachBuffer: bind label {:?} at offset {}",
729            label,
730            self.cur_offset()
731        );
732        debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
733        debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
734        let offset = self.cur_offset();
735        self.label_offsets[label.0 as usize] = offset;
736        self.lazily_clear_labels_at_tail();
737        self.labels_at_tail.push(label);
738
739        // Invariants hold: bound offset of label is <= cur_offset (in fact it
740        // is equal). If the `labels_at_tail` list was complete and precise
741        // before, it is still, because we have bound this label to the current
742        // offset and added it to the list (which contains all labels at the
743        // current offset).
744
745        self.optimize_branches(ctrl_plane);
746
747        // Post-invariant: by `optimize_branches()` (see argument there).
748    }
749
750    /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
751    /// offset that it applies to.
752    fn lazily_clear_labels_at_tail(&mut self) {
753        let offset = self.cur_offset();
754        if offset > self.labels_at_tail_off {
755            self.labels_at_tail_off = offset;
756            self.labels_at_tail.clear();
757        }
758
759        // Post-invariant: either labels_at_tail_off was at cur_offset, and
760        // state is untouched, or was less than cur_offset, in which case the
761        // labels_at_tail list was conceptually empty, and is now actually
762        // empty.
763    }
764
765    /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
766    pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
767        let mut iters = 0;
768        while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
769            label = self.label_aliases[label.0 as usize];
770            // To protect against an infinite loop (despite our assurances to
771            // ourselves that the invariants make this impossible), assert out
772            // after 1M iterations. The number of basic blocks is limited
773            // in most contexts anyway so this should be impossible to hit with
774            // a legitimate input.
775            iters += 1;
776            assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
777        }
778        self.label_offsets[label.0 as usize]
779
780        // Post-invariant: no mutations.
781    }
782
783    /// Emit a reference to the given label with the given reference type (i.e.,
784    /// branch-instruction format) at the current offset.  This is like a
785    /// relocation, but handled internally.
786    ///
787    /// This can be called before the branch is actually emitted; fixups will
788    /// not happen until an island is emitted or the buffer is finished.
789    pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
790        trace!(
791            "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
792            offset, label, kind
793        );
794
795        // Add the fixup, and update the worst-case island size based on a
796        // veneer for this label use.
797        let fixup = MachLabelFixup {
798            label,
799            offset,
800            kind,
801        };
802        self.pending_fixup_deadline = self
803            .pending_fixup_deadline
804            // Subtract one alignment here to the deadline to account for
805            // extra space taken by aligning an island.
806            .min(fixup.deadline() - I::LabelUse::ALIGN);
807        trace!("pending_fixup_deadline = {}", self.pending_fixup_deadline);
808        self.pending_fixup_records.push(fixup);
809
810        // Post-invariant: no mutations to branches/labels data structures.
811    }
812
813    /// Inform the buffer of an unconditional branch at the given offset,
814    /// targeting the given label. May be used to optimize branches.
815    /// The last added label-use must correspond to this branch.
816    /// This must be called when the current offset is equal to `start`; i.e.,
817    /// before actually emitting the branch. This implies that for a branch that
818    /// uses a label and is eligible for optimizations by the MachBuffer, the
819    /// proper sequence is:
820    ///
821    /// - Call `use_label_at_offset()` to emit the fixup record.
822    /// - Call `add_uncond_branch()` to make note of the branch.
823    /// - Emit the bytes for the branch's machine code.
824    ///
825    /// Additional requirement: no labels may be bound between `start` and `end`
826    /// (exclusive on both ends).
827    pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
828        debug_assert!(
829            !self.open_patchable,
830            "Branch instruction inserted within a patchable region"
831        );
832        assert!(self.cur_offset() == start);
833        debug_assert!(end > start);
834        assert!(!self.pending_fixup_records.is_empty());
835        let fixup = self.pending_fixup_records.len() - 1;
836        self.lazily_clear_labels_at_tail();
837        self.latest_branches.push(MachBranch {
838            start,
839            end,
840            target,
841            fixup,
842            inverted: None,
843            labels_at_this_branch: self.labels_at_tail.clone(),
844        });
845
846        // Post-invariant: we asserted branch start is current tail; the list of
847        // labels at branch is cloned from list of labels at current tail.
848    }
849
850    /// Inform the buffer of a conditional branch at the given offset,
851    /// targeting the given label. May be used to optimize branches.
852    /// The last added label-use must correspond to this branch.
853    ///
854    /// Additional requirement: no labels may be bound between `start` and `end`
855    /// (exclusive on both ends).
856    pub fn add_cond_branch(
857        &mut self,
858        start: CodeOffset,
859        end: CodeOffset,
860        target: MachLabel,
861        inverted: &[u8],
862    ) {
863        debug_assert!(
864            !self.open_patchable,
865            "Branch instruction inserted within a patchable region"
866        );
867        assert!(self.cur_offset() == start);
868        debug_assert!(end > start);
869        assert!(!self.pending_fixup_records.is_empty());
870        debug_assert!(
871            inverted.len() == (end - start) as usize,
872            "branch length = {}, but inverted length = {}",
873            end - start,
874            inverted.len()
875        );
876        let fixup = self.pending_fixup_records.len() - 1;
877        let inverted = Some(SmallVec::from(inverted));
878        self.lazily_clear_labels_at_tail();
879        self.latest_branches.push(MachBranch {
880            start,
881            end,
882            target,
883            fixup,
884            inverted,
885            labels_at_this_branch: self.labels_at_tail.clone(),
886        });
887
888        // Post-invariant: we asserted branch start is current tail; labels at
889        // branch list is cloned from list of labels at current tail.
890    }
891
892    fn truncate_last_branch(&mut self) {
893        debug_assert!(
894            !self.open_patchable,
895            "Branch instruction truncated within a patchable region"
896        );
897
898        self.lazily_clear_labels_at_tail();
899        // Invariants hold at this point.
900
901        let b = self.latest_branches.pop().unwrap();
902        assert!(b.end == self.cur_offset());
903
904        // State:
905        //    [PRE CODE]
906        //  Offset b.start, b.labels_at_this_branch:
907        //    [BRANCH CODE]
908        //  cur_off, self.labels_at_tail -->
909        //    (end of buffer)
910        self.data.truncate(b.start as usize);
911        self.pending_fixup_records.truncate(b.fixup);
912
913        // Trim srclocs and debug tags now past the end of the buffer.
914        while let Some(last_srcloc) = self.srclocs.last_mut() {
915            if last_srcloc.end <= b.start {
916                break;
917            }
918            if last_srcloc.start < b.start {
919                last_srcloc.end = b.start;
920                break;
921            }
922            self.srclocs.pop();
923        }
924        while let Some(last_debug_tag) = self.debug_tags.last() {
925            if last_debug_tag.offset <= b.start {
926                break;
927            }
928            self.debug_tags.pop();
929        }
930
931        // State:
932        //    [PRE CODE]
933        //  cur_off, Offset b.start, b.labels_at_this_branch:
934        //    (end of buffer)
935        //
936        //  self.labels_at_tail -->  (past end of buffer)
937        let cur_off = self.cur_offset();
938        self.labels_at_tail_off = cur_off;
939        // State:
940        //    [PRE CODE]
941        //  cur_off, Offset b.start, b.labels_at_this_branch,
942        //  self.labels_at_tail:
943        //    (end of buffer)
944        //
945        // resolve_label_offset(l) for l in labels_at_tail:
946        //    (past end of buffer)
947
948        trace!(
949            "truncate_last_branch: truncated {:?}; off now {}",
950            b, cur_off
951        );
952
953        // Fix up resolved label offsets for labels at tail.
954        for &l in &self.labels_at_tail {
955            self.label_offsets[l.0 as usize] = cur_off;
956        }
957        // Old labels_at_this_branch are now at cur_off.
958        self.labels_at_tail.extend(b.labels_at_this_branch);
959
960        // Post-invariant: this operation is defined to truncate the buffer,
961        // which moves cur_off backward, and to move labels at the end of the
962        // buffer back to the start-of-branch offset.
963        //
964        // latest_branches satisfies all invariants:
965        // - it has no branches past the end of the buffer (branches are in
966        //   order, we removed the last one, and we truncated the buffer to just
967        //   before the start of that branch)
968        // - no labels were moved to lower offsets than the (new) cur_off, so
969        //   the labels_at_this_branch list for any other branch need not change.
970        //
971        // labels_at_tail satisfies all invariants:
972        // - all labels that were at the tail after the truncated branch are
973        //   moved backward to just before the branch, which becomes the new tail;
974        //   thus every element in the list should remain (ensured by `.extend()`
975        //   above).
976        // - all labels that refer to the new tail, which is the start-offset of
977        //   the truncated branch, must be present. The `labels_at_this_branch`
978        //   list in the truncated branch's record is a complete and precise list
979        //   of exactly these labels; we append these to labels_at_tail.
980        // - labels_at_tail_off is at cur_off after truncation occurs, so the
981        //   list is valid (not to be lazily cleared).
982        //
983        // The stated operation was performed:
984        // - For each label at the end of the buffer prior to this method, it
985        //   now resolves to the new (truncated) end of the buffer: it must have
986        //   been in `labels_at_tail` (this list is precise and complete, and
987        //   the tail was at the end of the truncated branch on entry), and we
988        //   iterate over this list and set `label_offsets` to the new tail.
989        //   None of these labels could have been an alias (by invariant), so
990        //   `label_offsets` is authoritative for each.
991        // - No other labels will be past the end of the buffer, because of the
992        //   requirement that no labels be bound to the middle of branch ranges
993        //   (see comments to `add_{cond,uncond}_branch()`).
994        // - The buffer is truncated to just before the last branch, and the
995        //   fixup record referring to that last branch is removed.
996    }
997
998    /// Performs various optimizations on branches pointing at the current label.
999    pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
1000        if ctrl_plane.get_decision() {
1001            return;
1002        }
1003
1004        self.lazily_clear_labels_at_tail();
1005        // Invariants valid at this point.
1006
1007        trace!(
1008            "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1009            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1010        );
1011
1012        // We continue to munch on branches at the tail of the buffer until no
1013        // more rules apply. Note that the loop only continues if a branch is
1014        // actually truncated (or if labels are redirected away from a branch),
1015        // so this always makes progress.
1016        while let Some(b) = self.latest_branches.last() {
1017            let cur_off = self.cur_offset();
1018            trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
1019            // If there has been any code emission since the end of the last branch or
1020            // label definition, then there's nothing we can edit (because we
1021            // don't move code once placed, only back up and overwrite), so
1022            // clear the records and finish.
1023            if b.end < cur_off {
1024                break;
1025            }
1026
1027            // If the "labels at this branch" list on this branch is
1028            // longer than a threshold, don't do any simplification,
1029            // and let the branch remain to separate those labels from
1030            // the current tail. This avoids quadratic behavior (see
1031            // #3468): otherwise, if a long string of "goto next;
1032            // next:" patterns are emitted, all of the labels will
1033            // coalesce into a long list of aliases for the current
1034            // buffer tail. We must track all aliases of the current
1035            // tail for correctness, but we are also allowed to skip
1036            // optimization (removal) of any branch, so we take the
1037            // escape hatch here and let it stand. In effect this
1038            // "spreads" the many thousands of labels in the
1039            // pathological case among an actual (harmless but
1040            // suboptimal) instruction once per N labels.
1041            if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
1042                break;
1043            }
1044
1045            // Invariant: we are looking at a branch that ends at the tail of
1046            // the buffer.
1047
1048            // For any branch, conditional or unconditional:
1049            // - If the target is a label at the current offset, then remove
1050            //   the conditional branch, and reset all labels that targeted
1051            //   the current offset (end of branch) to the truncated
1052            //   end-of-code.
1053            //
1054            // Preserves execution semantics: a branch to its own fallthrough
1055            // address is equivalent to a no-op; in both cases, nextPC is the
1056            // fallthrough.
1057            if self.resolve_label_offset(b.target) == cur_off {
1058                trace!("branch with target == cur off; truncating");
1059                self.truncate_last_branch();
1060                continue;
1061            }
1062
1063            // If latest is an unconditional branch:
1064            //
1065            // - If the branch's target is not its own start address, then for
1066            //   each label at the start of branch, make the label an alias of the
1067            //   branch target, and remove the label from the "labels at this
1068            //   branch" list.
1069            //
1070            //   - Preserves execution semantics: an unconditional branch's
1071            //     only effect is to set PC to a new PC; this change simply
1072            //     collapses one step in the step-semantics.
1073            //
1074            //   - Post-invariant: the labels that were bound to the start of
1075            //     this branch become aliases, so they must not be present in any
1076            //     labels-at-this-branch list or the labels-at-tail list. The
1077            //     labels are removed form the latest-branch record's
1078            //     labels-at-this-branch list, and are never placed in the
1079            //     labels-at-tail list. Furthermore, it is correct that they are
1080            //     not in either list, because they are now aliases, and labels
1081            //     that are aliases remain aliases forever.
1082            //
1083            // - If there is a prior unconditional branch that ends just before
1084            //   this one begins, and this branch has no labels bound to its
1085            //   start, then we can truncate this branch, because it is entirely
1086            //   unreachable (we have redirected all labels that make it
1087            //   reachable otherwise). Do so and continue around the loop.
1088            //
1089            //   - Preserves execution semantics: the branch is unreachable,
1090            //     because execution can only flow into an instruction from the
1091            //     prior instruction's fallthrough or from a branch bound to that
1092            //     instruction's start offset. Unconditional branches have no
1093            //     fallthrough, so if the prior instruction is an unconditional
1094            //     branch, no fallthrough entry can happen. The
1095            //     labels-at-this-branch list is complete (by invariant), so if it
1096            //     is empty, then the instruction is entirely unreachable. Thus,
1097            //     it can be removed.
1098            //
1099            //   - Post-invariant: ensured by truncate_last_branch().
1100            //
1101            // - If there is a prior conditional branch whose target label
1102            //   resolves to the current offset (branches around the
1103            //   unconditional branch), then remove the unconditional branch,
1104            //   and make the target of the unconditional the target of the
1105            //   conditional instead.
1106            //
1107            //   - Preserves execution semantics: previously we had:
1108            //
1109            //         L1:
1110            //            cond_br L2
1111            //            br L3
1112            //         L2:
1113            //            (end of buffer)
1114            //
1115            //     by removing the last branch, we have:
1116            //
1117            //         L1:
1118            //            cond_br L2
1119            //         L2:
1120            //            (end of buffer)
1121            //
1122            //     we then fix up the records for the conditional branch to
1123            //     have:
1124            //
1125            //         L1:
1126            //           cond_br.inverted L3
1127            //         L2:
1128            //
1129            //     In the original code, control flow reaches L2 when the
1130            //     conditional branch's predicate is true, and L3 otherwise. In
1131            //     the optimized code, the same is true.
1132            //
1133            //   - Post-invariant: all edits to latest_branches and
1134            //     labels_at_tail are performed by `truncate_last_branch()`,
1135            //     which maintains the invariants at each step.
1136
1137            if b.is_uncond() {
1138                // Set any label equal to current branch's start as an alias of
1139                // the branch's target, if the target is not the branch itself
1140                // (i.e., an infinite loop).
1141                //
1142                // We cannot perform this aliasing if the target of this branch
1143                // ultimately aliases back here; if so, we need to keep this
1144                // branch, so break out of this loop entirely (and clear the
1145                // latest-branches list below).
1146                //
1147                // Note that this check is what prevents cycles from forming in
1148                // `self.label_aliases`. To see why, consider an arbitrary start
1149                // state:
1150                //
1151                // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1152                // Ln, which is not aliased.
1153                //
1154                // We would create a cycle if we assigned label_aliases[Ln]
1155                // = L1.  Note that the below assignment is the only write
1156                // to label_aliases.
1157                //
1158                // By our other invariants, we have that Ln (`l` below)
1159                // resolves to the offset `b.start`, because it is in the
1160                // set `b.labels_at_this_branch`.
1161                //
1162                // If L1 were already aliased, through some arbitrarily deep
1163                // chain, to Ln, then it must also resolve to this offset
1164                // `b.start`.
1165                //
1166                // By checking the resolution of `L1` against this offset,
1167                // and aborting this branch-simplification if they are
1168                // equal, we prevent the below assignment from ever creating
1169                // a cycle.
1170                if self.resolve_label_offset(b.target) != b.start {
1171                    let redirected = b.labels_at_this_branch.len();
1172                    for &l in &b.labels_at_this_branch {
1173                        trace!(
1174                            " -> label at start of branch {:?} redirected to target {:?}",
1175                            l, b.target
1176                        );
1177                        self.label_aliases[l.0 as usize] = b.target;
1178                        // NOTE: we continue to ensure the invariant that labels
1179                        // pointing to tail of buffer are in `labels_at_tail`
1180                        // because we already ensured above that the last branch
1181                        // cannot have a target of `cur_off`; so we never have
1182                        // to put the label into `labels_at_tail` when moving it
1183                        // here.
1184                    }
1185                    // Maintain invariant: all branches have been redirected
1186                    // and are no longer pointing at the start of this branch.
1187                    let mut_b = self.latest_branches.last_mut().unwrap();
1188                    mut_b.labels_at_this_branch.clear();
1189
1190                    if redirected > 0 {
1191                        trace!(" -> after label redirects, restarting loop");
1192                        continue;
1193                    }
1194                } else {
1195                    break;
1196                }
1197
1198                let b = self.latest_branches.last().unwrap();
1199
1200                // Examine any immediately preceding branch.
1201                if self.latest_branches.len() > 1 {
1202                    let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1203                    trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1204                    // This uncond is immediately after another uncond; we
1205                    // should have already redirected labels to this uncond away
1206                    // (but check to be sure); so we can truncate this uncond.
1207                    if prev_b.is_uncond()
1208                        && prev_b.end == b.start
1209                        && b.labels_at_this_branch.is_empty()
1210                    {
1211                        trace!(" -> uncond follows another uncond; truncating");
1212                        self.truncate_last_branch();
1213                        continue;
1214                    }
1215
1216                    // This uncond is immediately after a conditional, and the
1217                    // conditional's target is the end of this uncond, and we've
1218                    // already redirected labels to this uncond away; so we can
1219                    // truncate this uncond, flip the sense of the conditional, and
1220                    // set the conditional's target (in `latest_branches` and in
1221                    // `fixup_records`) to the uncond's target.
1222                    if prev_b.is_cond()
1223                        && prev_b.end == b.start
1224                        && self.resolve_label_offset(prev_b.target) == cur_off
1225                    {
1226                        trace!(
1227                            " -> uncond follows a conditional, and conditional's target resolves to current offset"
1228                        );
1229                        // Save the target of the uncond (this becomes the
1230                        // target of the cond), and truncate the uncond.
1231                        let target = b.target;
1232                        let data = prev_b.inverted.clone().unwrap();
1233                        self.truncate_last_branch();
1234
1235                        // Mutate the code and cond branch.
1236                        let off_before_edit = self.cur_offset();
1237                        let prev_b = self.latest_branches.last_mut().unwrap();
1238                        let not_inverted = SmallVec::from(
1239                            &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1240                        );
1241
1242                        // Low-level edit: replaces bytes of branch with
1243                        // inverted form. cur_off remains the same afterward, so
1244                        // we do not need to modify label data structures.
1245                        self.data.truncate(prev_b.start as usize);
1246                        self.data.extend_from_slice(&data[..]);
1247
1248                        // Save the original code as the inversion of the
1249                        // inverted branch, in case we later edit this branch
1250                        // again.
1251                        prev_b.inverted = Some(not_inverted);
1252                        self.pending_fixup_records[prev_b.fixup].label = target;
1253                        trace!(" -> reassigning target of condbr to {:?}", target);
1254                        prev_b.target = target;
1255                        debug_assert_eq!(off_before_edit, self.cur_offset());
1256                        continue;
1257                    }
1258                }
1259            }
1260
1261            // If we couldn't do anything with the last branch, then break.
1262            break;
1263        }
1264
1265        self.purge_latest_branches();
1266
1267        trace!(
1268            "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1269            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1270        );
1271    }
1272
1273    fn purge_latest_branches(&mut self) {
1274        // All of our branch simplification rules work only if a branch ends at
1275        // the tail of the buffer, with no following code; and branches are in
1276        // order in latest_branches; so if the last entry ends prior to
1277        // cur_offset, then clear all entries.
1278        let cur_off = self.cur_offset();
1279        if let Some(l) = self.latest_branches.last() {
1280            if l.end < cur_off {
1281                trace!("purge_latest_branches: removing branch {:?}", l);
1282                self.latest_branches.clear();
1283            }
1284        }
1285
1286        // Post-invariant: no invariant requires any branch to appear in
1287        // `latest_branches`; it is always optional. The list-clear above thus
1288        // preserves all semantics.
1289    }
1290
1291    /// Emit a trap at some point in the future with the specified code and
1292    /// stack map.
1293    ///
1294    /// This function returns a [`MachLabel`] which will be the future address
1295    /// of the trap. Jumps should refer to this label, likely by using the
1296    /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1297    /// patched in once the address of the trap is known.
1298    ///
1299    /// This will batch all traps into the end of the function.
1300    pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1301        let label = self.get_label();
1302        self.pending_traps.push(MachLabelTrap {
1303            label,
1304            code,
1305            loc: self.cur_srcloc.map(|(_start, loc)| loc),
1306        });
1307        label
1308    }
1309
1310    /// Is an island needed within the next N bytes?
1311    pub fn island_needed(&self, distance: CodeOffset) -> bool {
1312        let deadline = match self.fixup_records.peek() {
1313            Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1314            None => self.pending_fixup_deadline,
1315        };
1316        trace!(
1317            "checking island_needed: cur_offset = {} deadline = {} worst_case_end_of_island = {}",
1318            self.cur_offset(),
1319            deadline,
1320            self.worst_case_end_of_island(distance)
1321        );
1322        let needed = deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline;
1323        trace!(" -> needed = {needed}");
1324        needed
1325    }
1326
1327    /// Returns the maximal offset that islands can reach if `distance` more
1328    /// bytes are appended.
1329    ///
1330    /// This is used to determine if veneers need insertions since jumps that
1331    /// can't reach past this point must get a veneer of some form.
1332    fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1333        // Assume that all fixups will require veneers and that the veneers are
1334        // the worst-case size for each platform. This is an over-generalization
1335        // to avoid iterating over the `fixup_records` list or maintaining
1336        // information about it as we go along.
1337        let max_veneer_count =
1338            u32::try_from(self.fixup_records.len() + self.pending_fixup_records.len()).unwrap();
1339        let island_worst_case_size = max_veneer_count
1340            .saturating_mul(I::LabelUse::worst_case_veneer_size())
1341            + self.pending_constants_size
1342            + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1343        self.cur_offset()
1344            .saturating_add(distance)
1345            .saturating_add(island_worst_case_size)
1346    }
1347
1348    /// Emit all pending constants and required pending veneers.
1349    ///
1350    /// Should only be called if `island_needed()` returns true, i.e., if we
1351    /// actually reach a deadline. It's not necessarily a problem to do so
1352    /// otherwise but it may result in unnecessary work during emission.
1353    ///
1354    /// The current code-emission position must be a "safe" location for an
1355    /// island: i.e., no fallthrough into the island contents from the
1356    /// previous instruction is possible. Callers emitting inside a basic
1357    /// block should emit a jump-around branch.
1358    pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1359        self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1360    }
1361
1362    /// Same as `emit_island`, but an internal API with a `force_veneers`
1363    /// argument to force all veneers to always get emitted for debugging.
1364    fn emit_island_maybe_forced(
1365        &mut self,
1366        force_veneers: ForceVeneers,
1367        distance: CodeOffset,
1368        ctrl_plane: &mut ControlPlane,
1369    ) {
1370        trace!(
1371            "emitting island at {}, distance = {distance}",
1372            self.cur_offset()
1373        );
1374
1375        // We're going to purge fixups, so no latest-branch editing can happen
1376        // anymore.
1377        self.latest_branches.clear();
1378
1379        // End the current location tracking since anything emitted during this
1380        // function shouldn't be attributed to whatever the current source
1381        // location is.
1382        //
1383        // Note that the current source location, if it's set right now, will be
1384        // restored at the end of this island emission.
1385        let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1386        if cur_loc.is_some() {
1387            self.end_srcloc();
1388        }
1389
1390        let forced_threshold = self.worst_case_end_of_island(distance);
1391        trace!("forced_threshold = {forced_threshold}");
1392
1393        // Emit traps/constants after the island: with potentially
1394        // unbounded pending constants/traps and potentially small
1395        // deadlines, it would otherwise be possible to emit a
1396        // small-range jump, have a nearby deadline *before* the end
1397        // of pending constants/traps, and not be able to emit a
1398        // veneer in time.
1399        //
1400        // Fixups whose labels aren't yet defined (e.g. references to
1401        // pending constants/traps) are simply deferred here; they'll
1402        // be resolved in the next island or in the final fixup pass
1403        // at the end of emission.
1404
1405        // Either handle all pending fixups because they're ready or move them
1406        // onto the `BinaryHeap` tracking all pending fixups if they aren't
1407        // ready.
1408        assert!(self.latest_branches.is_empty());
1409        trace!(
1410            "About to handle fixups at offset {}: {:?}",
1411            self.cur_offset(),
1412            self.pending_fixup_records
1413        );
1414        for fixup in mem::take(&mut self.pending_fixup_records) {
1415            if self.should_apply_fixup(&fixup, forced_threshold) {
1416                self.handle_fixup(fixup, force_veneers, forced_threshold);
1417            } else {
1418                self.fixup_records.push(fixup);
1419            }
1420        }
1421        self.pending_fixup_deadline = u32::MAX;
1422        while let Some(fixup) = self.fixup_records.peek() {
1423            trace!(
1424                "emit_island: fixup {:?} deadline {}",
1425                fixup,
1426                fixup.deadline()
1427            );
1428
1429            // If this fixup shouldn't be applied, that means its label isn't
1430            // defined yet and there'll be remaining space to apply a veneer if
1431            // necessary in the future after this island. In that situation
1432            // because `fixup_records` is sorted by deadline this loop can
1433            // exit.
1434            if !self.should_apply_fixup(fixup, forced_threshold) {
1435                break;
1436            }
1437
1438            let fixup = self.fixup_records.pop().unwrap();
1439            self.handle_fixup(fixup, force_veneers, forced_threshold);
1440        }
1441
1442        // Now emit pending traps and constants.
1443        //
1444        // Note that traps are placed first since this typically happens at the
1445        // end of the function and for disassemblers we try to keep all the code
1446        // contiguously together.
1447        trace!("emitting pending traps: {:?}", self.pending_traps);
1448        for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1449            // If this trap has source information associated with it then
1450            // emit this information for the trap instruction going out now too.
1451            if let Some(loc) = loc {
1452                self.start_srcloc(loc);
1453            }
1454            self.align_to(I::LabelUse::ALIGN);
1455            self.bind_label(label, ctrl_plane);
1456            self.add_trap(code);
1457            self.put_data(I::TRAP_OPCODE);
1458            if loc.is_some() {
1459                self.end_srcloc();
1460            }
1461        }
1462
1463        trace!("emitting pending constants: {:?}", self.pending_constants);
1464        for constant in mem::take(&mut self.pending_constants) {
1465            let MachBufferConstant { align, size, .. } = self.constants[constant];
1466            let label = self.constants[constant].upcoming_label.take().unwrap();
1467            self.align_to(align);
1468            self.bind_label(label, ctrl_plane);
1469            self.used_constants.push((constant, self.cur_offset()));
1470            self.get_appended_space(size);
1471        }
1472
1473        if let Some(loc) = cur_loc {
1474            self.start_srcloc(loc);
1475        }
1476    }
1477
1478    fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1479        let label_offset = self.resolve_label_offset(fixup.label);
1480        trace!(
1481            "should_apply_fixup: fixup {fixup:?} label_offset {label_offset} deadline {} forced_threshold {forced_threshold} supports_veneer {}",
1482            fixup.deadline(),
1483            fixup.kind.supports_veneer()
1484        );
1485        let result = (label_offset != UNKNOWN_LABEL_OFFSET)
1486            || ((fixup.deadline() < forced_threshold) && fixup.kind.supports_veneer());
1487        trace!(
1488            " -> {}, {}, {} -> {result}",
1489            label_offset != UNKNOWN_LABEL_OFFSET,
1490            fixup.deadline() < forced_threshold,
1491            fixup.kind.supports_veneer()
1492        );
1493        result
1494    }
1495
1496    fn handle_fixup(
1497        &mut self,
1498        fixup: MachLabelFixup<I>,
1499        force_veneers: ForceVeneers,
1500        forced_threshold: CodeOffset,
1501    ) {
1502        let MachLabelFixup {
1503            label,
1504            offset,
1505            kind,
1506        } = fixup;
1507        let start = offset as usize;
1508        let end = (offset + kind.patch_size()) as usize;
1509        let label_offset = self.resolve_label_offset(label);
1510
1511        if label_offset != UNKNOWN_LABEL_OFFSET {
1512            // If the offset of the label for this fixup is known then
1513            // we're going to do something here-and-now. We're either going
1514            // to patch the original offset because it's an in-bounds jump,
1515            // or we're going to generate a veneer, patch the fixup to jump
1516            // to the veneer, and then keep going.
1517            //
1518            // If the label comes after the original fixup, then we should
1519            // be guaranteed that the jump is in-bounds. Otherwise there's
1520            // a bug somewhere because this method wasn't called soon
1521            // enough. All forward-jumps are tracked and should get veneers
1522            // before their deadline comes and they're unable to jump
1523            // further.
1524            //
1525            // Otherwise if the label is before the fixup, then that's a
1526            // backwards jump. If it's past the maximum negative range
1527            // then we'll emit a veneer that to jump forward to which can
1528            // then jump backwards.
1529            let veneer_required = if label_offset >= offset {
1530                assert!((label_offset - offset) <= kind.max_pos_range());
1531                false
1532            } else {
1533                (offset - label_offset) > kind.max_neg_range()
1534            };
1535            trace!(
1536                " -> label_offset = {}, known, required = {} (pos {} neg {})",
1537                label_offset,
1538                veneer_required,
1539                kind.max_pos_range(),
1540                kind.max_neg_range()
1541            );
1542
1543            if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1544                self.emit_veneer(label, offset, kind);
1545            } else {
1546                let slice = &mut self.data[start..end];
1547                trace!(
1548                    "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
1549                );
1550                kind.patch(slice, offset, label_offset);
1551            }
1552        } else {
1553            // If the offset of this label is not known at this time then
1554            // that means that a veneer is required because after this
1555            // island the target can't be in range of the original target.
1556            assert!(forced_threshold - offset > kind.max_pos_range());
1557            self.emit_veneer(label, offset, kind);
1558        }
1559    }
1560
1561    /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1562    ///
1563    /// This will generate extra machine code, using `kind`, to get a
1564    /// larger-jump-kind than `kind` allows. The code at `offset` is then
1565    /// patched to jump to our new code, and then the new code is enqueued for
1566    /// a fixup to get processed at some later time.
1567    fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1568        // If this `kind` doesn't support a veneer then that's a bug in the
1569        // backend because we need to implement support for such a veneer.
1570        assert!(
1571            kind.supports_veneer(),
1572            "jump beyond the range of {kind:?} but a veneer isn't supported",
1573        );
1574
1575        // Allocate space for a veneer in the island.
1576        self.align_to(I::LabelUse::ALIGN);
1577        let veneer_offset = self.cur_offset();
1578        trace!("making a veneer at {}", veneer_offset);
1579        let start = offset as usize;
1580        let end = (offset + kind.patch_size()) as usize;
1581        let slice = &mut self.data[start..end];
1582        // Patch the original label use to refer to the veneer.
1583        trace!(
1584            "patching original at offset {} to veneer offset {}",
1585            offset, veneer_offset
1586        );
1587        kind.patch(slice, offset, veneer_offset);
1588        // Generate the veneer.
1589        let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1590        let (veneer_fixup_off, veneer_label_use) =
1591            kind.generate_veneer(veneer_slice, veneer_offset);
1592        trace!(
1593            "generated veneer; fixup offset {}, label_use {:?}",
1594            veneer_fixup_off, veneer_label_use
1595        );
1596        // Register a new use of `label` with our new veneer fixup and
1597        // offset. This'll recalculate deadlines accordingly and
1598        // enqueue this fixup to get processed at some later
1599        // time.
1600        self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1601    }
1602
1603    fn finish_emission_maybe_forcing_veneers(
1604        &mut self,
1605        force_veneers: ForceVeneers,
1606        ctrl_plane: &mut ControlPlane,
1607    ) {
1608        while !self.pending_constants.is_empty()
1609            || !self.pending_traps.is_empty()
1610            || !self.fixup_records.is_empty()
1611            || !self.pending_fixup_records.is_empty()
1612        {
1613            // `emit_island()` will emit any pending veneers and constants, and
1614            // as a side-effect, will also take care of any fixups with resolved
1615            // labels eagerly.
1616            self.emit_island_maybe_forced(force_veneers, 0, ctrl_plane);
1617        }
1618
1619        // Ensure that all labels have been fixed up after the last island is emitted. This is a
1620        // full (release-mode) assert because an unresolved label means the emitted code is
1621        // incorrect.
1622        assert!(self.fixup_records.is_empty());
1623        assert!(self.pending_fixup_records.is_empty());
1624    }
1625
1626    /// Finish any deferred emissions and/or fixups.
1627    pub fn finish(
1628        mut self,
1629        constants: &VCodeConstants,
1630        ctrl_plane: &mut ControlPlane,
1631    ) -> MachBufferFinalized<Stencil> {
1632        let _tt = timing::vcode_emit_finish();
1633
1634        self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1635
1636        let alignment = self.finish_constants(constants);
1637
1638        // Resolve all labels to their offsets.
1639        let finalized_relocs = self
1640            .relocs
1641            .iter()
1642            .map(|reloc| FinalizedMachReloc {
1643                offset: reloc.offset,
1644                kind: reloc.kind,
1645                addend: reloc.addend,
1646                target: match &reloc.target {
1647                    RelocTarget::ExternalName(name) => {
1648                        FinalizedRelocTarget::ExternalName(name.clone())
1649                    }
1650                    RelocTarget::Label(label) => {
1651                        FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1652                    }
1653                },
1654            })
1655            .collect();
1656
1657        let finalized_exception_handlers = self
1658            .exception_handlers
1659            .iter()
1660            .map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))
1661            .collect();
1662
1663        let mut srclocs = self.srclocs;
1664        srclocs.sort_by_key(|entry| entry.start);
1665
1666        MachBufferFinalized {
1667            data: self.data,
1668            relocs: finalized_relocs,
1669            traps: self.traps,
1670            call_sites: self.call_sites,
1671            patchable_call_sites: self.patchable_call_sites,
1672            exception_handlers: finalized_exception_handlers,
1673            srclocs,
1674            debug_tags: self.debug_tags,
1675            debug_tag_pool: self.debug_tag_pool,
1676            user_stack_maps: self.user_stack_maps,
1677            unwind_info: self.unwind_info,
1678            alignment,
1679            frame_layout: self.frame_layout,
1680            nop_units: I::gen_nop_units(),
1681        }
1682    }
1683
1684    /// Add an external relocation at the given offset.
1685    pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1686        &mut self,
1687        offset: CodeOffset,
1688        kind: Reloc,
1689        target: &T,
1690        addend: Addend,
1691    ) {
1692        let target: RelocTarget = target.clone().into();
1693        // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1694        // generate a label-use statement to track whether an island is possibly
1695        // needed to escape this function to actually get to the external name.
1696        // This is most likely to come up on AArch64 where calls between
1697        // functions use a 26-bit signed offset which gives +/- 64MB. This means
1698        // that if a function is 128MB in size and there's a call in the middle
1699        // it's impossible to reach the actual target. Also, while it's
1700        // technically possible to jump to the start of a function and then jump
1701        // further, island insertion below always inserts islands after
1702        // previously appended code so for Cranelift's own implementation this
1703        // is also a problem for 64MB functions on AArch64 which start with a
1704        // call instruction, those won't be able to escape.
1705        //
1706        // Ideally what needs to happen here is that a `LabelUse` is
1707        // transparently generated (or call-sites of this function are audited
1708        // to generate a `LabelUse` instead) and tracked internally. The actual
1709        // relocation would then change over time if and when a veneer is
1710        // inserted, where the relocation here would be patched by this
1711        // `MachBuffer` to jump to the veneer. The problem, though, is that all
1712        // this still needs to end up, in the case of a singular function,
1713        // generating a final relocation pointing either to this particular
1714        // relocation or to the veneer inserted. Additionally
1715        // `MachBuffer` needs the concept of a label which will never be
1716        // resolved, so `emit_island` doesn't trip over not actually ever
1717        // knowing what some labels are. Currently the loop in
1718        // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1719        // loop.
1720        //
1721        // For now this means that because relocs aren't tracked at all that
1722        // AArch64 functions have a rough size limits of 64MB. For now that's
1723        // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1724        // when a relocation can't otherwise be resolved later, so it shouldn't
1725        // actually result in any memory unsafety or anything like that.
1726        self.relocs.push(MachReloc {
1727            offset,
1728            kind,
1729            target,
1730            addend,
1731        });
1732    }
1733
1734    /// Add an external relocation at the current offset.
1735    pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1736        &mut self,
1737        kind: Reloc,
1738        target: &T,
1739        addend: Addend,
1740    ) {
1741        self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1742    }
1743
1744    /// Add a trap record at the current offset.
1745    pub fn add_trap(&mut self, code: TrapCode) {
1746        self.traps.push(MachTrap {
1747            offset: self.data.len() as CodeOffset,
1748            code,
1749        });
1750    }
1751
1752    /// Add a call-site record at the current offset.
1753    pub fn add_call_site(&mut self) {
1754        self.add_try_call_site(None, core::iter::empty());
1755    }
1756
1757    /// Add a call-site record at the current offset with exception
1758    /// handlers.
1759    pub fn add_try_call_site(
1760        &mut self,
1761        frame_offset: Option<u32>,
1762        exception_handlers: impl Iterator<Item = MachExceptionHandler>,
1763    ) {
1764        let start = u32::try_from(self.exception_handlers.len()).unwrap();
1765        self.exception_handlers.extend(exception_handlers);
1766        let end = u32::try_from(self.exception_handlers.len()).unwrap();
1767        let exception_handler_range = start..end;
1768
1769        self.call_sites.push(MachCallSite {
1770            ret_addr: self.data.len() as CodeOffset,
1771            frame_offset,
1772            exception_handler_range,
1773        });
1774    }
1775
1776    /// Add a patchable call record at the current offset The actual
1777    /// call is expected to have been emitted; the VCodeInst trait
1778    /// specifies how to NOP it out, and we carry that information to
1779    /// the finalized Machbuffer.
1780    pub fn add_patchable_call_site(&mut self, len: u32) {
1781        self.patchable_call_sites.push(MachPatchableCallSite {
1782            ret_addr: self.cur_offset(),
1783            len,
1784        });
1785    }
1786
1787    /// Add an unwind record at the current offset.
1788    pub fn add_unwind(&mut self, unwind: UnwindInst) {
1789        self.unwind_info.push((self.cur_offset(), unwind));
1790    }
1791
1792    /// Set the `SourceLoc` for code from this offset until the offset at the
1793    /// next call to `end_srcloc()`.
1794    /// Returns the current [CodeOffset] and [RelSourceLoc].
1795    pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1796        let cur = (self.cur_offset(), loc);
1797        self.cur_srcloc = Some(cur);
1798        cur
1799    }
1800
1801    /// Mark the end of the `SourceLoc` segment started at the last
1802    /// `start_srcloc()` call.
1803    pub fn end_srcloc(&mut self) {
1804        let (start, loc) = self
1805            .cur_srcloc
1806            .take()
1807            .expect("end_srcloc() called without start_srcloc()");
1808        let end = self.cur_offset();
1809        // Skip zero-length extends.
1810        debug_assert!(end >= start);
1811        if end > start {
1812            self.srclocs.push(MachSrcLoc { start, end, loc });
1813        }
1814    }
1815
1816    /// Push a user stack map onto this buffer.
1817    ///
1818    /// The stack map is associated with the given `return_addr` code
1819    /// offset. This must be the PC for the instruction just *after* this stack
1820    /// map's associated instruction. For example in the sequence `call $foo;
1821    /// add r8, rax`, the `return_addr` must be the offset of the start of the
1822    /// `add` instruction.
1823    ///
1824    /// Stack maps must be pushed in sorted `return_addr` order.
1825    pub fn push_user_stack_map(
1826        &mut self,
1827        emit_state: &I::State,
1828        return_addr: CodeOffset,
1829        mut stack_map: ir::UserStackMap,
1830    ) {
1831        let span = emit_state.frame_layout().active_size();
1832        trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1833
1834        debug_assert!(
1835            self.user_stack_maps
1836                .last()
1837                .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1838            "pushed stack maps out of order: {} is not less than {}",
1839            self.user_stack_maps.last().unwrap().0,
1840            return_addr,
1841        );
1842
1843        stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1844        self.user_stack_maps.push((return_addr, span, stack_map));
1845    }
1846
1847    /// Push a debug tag associated with the current buffer offset.
1848    pub fn push_debug_tags(&mut self, pos: MachDebugTagPos, tags: &[DebugTag]) {
1849        trace!("debug tags at offset {}: {tags:?}", self.cur_offset());
1850        let start = u32::try_from(self.debug_tag_pool.len()).unwrap();
1851        self.debug_tag_pool.extend(tags.iter().cloned());
1852        let end = u32::try_from(self.debug_tag_pool.len()).unwrap();
1853        self.debug_tags.push(MachDebugTags {
1854            offset: self.cur_offset(),
1855            pos,
1856            range: start..end,
1857        });
1858    }
1859
1860    /// Increase the alignment of the buffer to the given alignment if bigger
1861    /// than the current alignment.
1862    pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1863        self.min_alignment = self.min_alignment.max(
1864            1u32.checked_shl(u32::from(align_to))
1865                .expect("log2_min_function_alignment too large"),
1866        );
1867    }
1868
1869    /// Set the frame layout metadata.
1870    pub fn set_frame_layout(&mut self, frame_layout: MachBufferFrameLayout) {
1871        debug_assert!(self.frame_layout.is_none());
1872        self.frame_layout = Some(frame_layout);
1873    }
1874}
1875
1876impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1877    fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1878        for b in iter {
1879            self.put1(b);
1880        }
1881    }
1882}
1883
1884impl<T: CompilePhase> MachBufferFinalized<T> {
1885    /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1886    pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1887        &self.srclocs[..]
1888    }
1889
1890    /// Get all debug tags, sorted by associated offset.
1891    pub fn debug_tags(&self) -> impl Iterator<Item = MachBufferDebugTagList<'_>> {
1892        self.debug_tags.iter().map(|tags| {
1893            let start = usize::try_from(tags.range.start).unwrap();
1894            let end = usize::try_from(tags.range.end).unwrap();
1895            MachBufferDebugTagList {
1896                offset: tags.offset,
1897                pos: tags.pos,
1898                tags: &self.debug_tag_pool[start..end],
1899            }
1900        })
1901    }
1902
1903    /// Get the total required size for the code.
1904    pub fn total_size(&self) -> CodeOffset {
1905        self.data.len() as CodeOffset
1906    }
1907
1908    /// Return the code in this mach buffer as a hex string for testing purposes.
1909    pub fn stringify_code_bytes(&self) -> String {
1910        // This is pretty lame, but whatever ..
1911        use core::fmt::Write;
1912        let mut s = String::with_capacity(self.data.len() * 2);
1913        for b in &self.data {
1914            write!(&mut s, "{b:02X}").unwrap();
1915        }
1916        s
1917    }
1918
1919    /// Get the code bytes.
1920    pub fn data(&self) -> &[u8] {
1921        // N.B.: we emit every section into the .text section as far as
1922        // the `CodeSink` is concerned; we do not bother to segregate
1923        // the contents into the actual program text, the jumptable and the
1924        // rodata (constant pool). This allows us to generate code assuming
1925        // that these will not be relocated relative to each other, and avoids
1926        // having to designate each section as belonging in one of the three
1927        // fixed categories defined by `CodeSink`. If this becomes a problem
1928        // later (e.g. because of memory permissions or similar), we can
1929        // add this designation and segregate the output; take care, however,
1930        // to add the appropriate relocations in this case.
1931
1932        &self.data[..]
1933    }
1934
1935    /// Get a mutable slice of the code bytes, allowing patching
1936    /// post-passes.
1937    pub fn data_mut(&mut self) -> &mut [u8] {
1938        &mut self.data[..]
1939    }
1940
1941    /// Get the list of external relocations for this code.
1942    pub fn relocs(&self) -> &[FinalizedMachReloc] {
1943        &self.relocs[..]
1944    }
1945
1946    /// Get the list of trap records for this code.
1947    pub fn traps(&self) -> &[MachTrap] {
1948        &self.traps[..]
1949    }
1950
1951    /// Get the user stack map metadata for this code.
1952    pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1953        &self.user_stack_maps
1954    }
1955
1956    /// Take this buffer's user stack map metadata.
1957    pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1958        mem::take(&mut self.user_stack_maps)
1959    }
1960
1961    /// Get the list of call sites for this code, along with
1962    /// associated exception handlers.
1963    ///
1964    /// Each item yielded by the returned iterator is a struct with:
1965    ///
1966    /// - The call site metadata record, with a `ret_addr` field
1967    ///   directly accessible and denoting the offset of the return
1968    ///   address into this buffer's code.
1969    /// - The slice of pairs of exception tags and code offsets
1970    ///   denoting exception-handler entry points associated with this
1971    ///   call site.
1972    pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1973        self.call_sites.iter().map(|call_site| {
1974            let handler_range = call_site.exception_handler_range.clone();
1975            let handler_range = usize::try_from(handler_range.start).unwrap()
1976                ..usize::try_from(handler_range.end).unwrap();
1977            FinalizedMachCallSite {
1978                ret_addr: call_site.ret_addr,
1979                frame_offset: call_site.frame_offset,
1980                exception_handlers: &self.exception_handlers[handler_range],
1981            }
1982        })
1983    }
1984
1985    /// Get the frame layout, if known.
1986    pub fn frame_layout(&self) -> Option<&MachBufferFrameLayout> {
1987        self.frame_layout.as_ref()
1988    }
1989
1990    /// Get the list of patchable call sites for this code.
1991    ///
1992    /// Each location in the buffer contains the bytes for a call
1993    /// instruction to the specified target. If the call is to be
1994    /// patched out, the bytes in the region should be replaced with
1995    /// those given in the `MachBufferFinalized::nop` array, repeated
1996    /// as many times as necessary. (The length of the patchable
1997    /// region is guaranteed to be an integer multiple of that NOP
1998    /// unit size.)
1999    pub fn patchable_call_sites(&self) -> impl Iterator<Item = &MachPatchableCallSite> + '_ {
2000        self.patchable_call_sites.iter()
2001    }
2002}
2003
2004/// An item in the exception-handler list for a callsite, with label
2005/// references.  Items are interpreted in left-to-right order and the
2006/// first match wins.
2007#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2008pub enum MachExceptionHandler {
2009    /// A specific tag (in the current dynamic context) should be
2010    /// handled by the code at the given offset.
2011    Tag(ExceptionTag, MachLabel),
2012    /// All exceptions should be handled by the code at the given
2013    /// offset.
2014    Default(MachLabel),
2015    /// The dynamic context for interpreting tags is updated to the
2016    /// value stored in the given machine location (in this frame's
2017    /// context).
2018    Context(ExceptionContextLoc),
2019}
2020
2021impl MachExceptionHandler {
2022    fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {
2023        match self {
2024            Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),
2025            Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),
2026            Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),
2027        }
2028    }
2029}
2030
2031/// An item in the exception-handler list for a callsite, with final
2032/// (lowered) code offsets. Items are interpreted in left-to-right
2033/// order and the first match wins.
2034#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2035#[cfg_attr(
2036    feature = "enable-serde",
2037    derive(serde_derive::Serialize, serde_derive::Deserialize)
2038)]
2039pub enum FinalizedMachExceptionHandler {
2040    /// A specific tag (in the current dynamic context) should be
2041    /// handled by the code at the given offset.
2042    Tag(ExceptionTag, CodeOffset),
2043    /// All exceptions should be handled by the code at the given
2044    /// offset.
2045    Default(CodeOffset),
2046    /// The dynamic context for interpreting tags is updated to the
2047    /// value stored in the given machine location (in this frame's
2048    /// context).
2049    Context(ExceptionContextLoc),
2050}
2051
2052/// A location for a dynamic exception context value.
2053#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2054#[cfg_attr(
2055    feature = "enable-serde",
2056    derive(serde_derive::Serialize, serde_derive::Deserialize)
2057)]
2058pub enum ExceptionContextLoc {
2059    /// An offset from SP at the callsite.
2060    SPOffset(u32),
2061    /// A GPR at the callsite. The physical register number for the
2062    /// GPR register file on the target architecture is used.
2063    GPR(u8),
2064}
2065
2066/// Metadata about a constant.
2067struct MachBufferConstant {
2068    /// A label which has not yet been bound which can be used for this
2069    /// constant.
2070    ///
2071    /// This is lazily created when a label is requested for a constant and is
2072    /// cleared when a constant is emitted.
2073    upcoming_label: Option<MachLabel>,
2074    /// Required alignment.
2075    align: CodeOffset,
2076    /// The byte size of this constant.
2077    size: usize,
2078}
2079
2080/// A trap that is deferred to the next time an island is emitted for either
2081/// traps, constants, or fixups.
2082#[derive(Debug)]
2083struct MachLabelTrap {
2084    /// This label will refer to the trap's offset.
2085    label: MachLabel,
2086    /// The code associated with this trap.
2087    code: TrapCode,
2088    /// An optional source location to assign for this trap.
2089    loc: Option<RelSourceLoc>,
2090}
2091
2092/// A fixup to perform on the buffer once code is emitted. Fixups always refer
2093/// to labels and patch the code based on label offsets. Hence, they are like
2094/// relocations, but internal to one buffer.
2095#[derive(Debug)]
2096struct MachLabelFixup<I: VCodeInst> {
2097    /// The label whose offset controls this fixup.
2098    label: MachLabel,
2099    /// The offset to fix up / patch to refer to this label.
2100    offset: CodeOffset,
2101    /// The kind of fixup. This is architecture-specific; each architecture may have,
2102    /// e.g., several types of branch instructions, each with differently-sized
2103    /// offset fields and different places within the instruction to place the
2104    /// bits.
2105    kind: I::LabelUse,
2106}
2107
2108impl<I: VCodeInst> MachLabelFixup<I> {
2109    fn deadline(&self) -> CodeOffset {
2110        self.offset.saturating_add(self.kind.max_pos_range())
2111    }
2112}
2113
2114impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
2115    fn eq(&self, other: &Self) -> bool {
2116        self.deadline() == other.deadline()
2117    }
2118}
2119
2120impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
2121
2122impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
2123    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
2124        Some(self.cmp(other))
2125    }
2126}
2127
2128impl<I: VCodeInst> Ord for MachLabelFixup<I> {
2129    fn cmp(&self, other: &Self) -> Ordering {
2130        other.deadline().cmp(&self.deadline())
2131    }
2132}
2133
2134/// A relocation resulting from a compilation.
2135#[derive(Clone, Debug, PartialEq)]
2136#[cfg_attr(
2137    feature = "enable-serde",
2138    derive(serde_derive::Serialize, serde_derive::Deserialize)
2139)]
2140pub struct MachRelocBase<T> {
2141    /// The offset at which the relocation applies, *relative to the
2142    /// containing section*.
2143    pub offset: CodeOffset,
2144    /// The kind of relocation.
2145    pub kind: Reloc,
2146    /// The external symbol / name to which this relocation refers.
2147    pub target: T,
2148    /// The addend to add to the symbol value.
2149    pub addend: i64,
2150}
2151
2152type MachReloc = MachRelocBase<RelocTarget>;
2153
2154/// A relocation resulting from a compilation.
2155pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
2156
2157/// A Relocation target
2158#[derive(Debug, Clone, PartialEq, Eq, Hash)]
2159pub enum RelocTarget {
2160    /// Points to an [ExternalName] outside the current function.
2161    ExternalName(ExternalName),
2162    /// Points to a [MachLabel] inside this function.
2163    /// This is different from [MachLabelFixup] in that both the relocation and the
2164    /// label will be emitted and are only resolved at link time.
2165    ///
2166    /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
2167    Label(MachLabel),
2168}
2169
2170impl From<ExternalName> for RelocTarget {
2171    fn from(name: ExternalName) -> Self {
2172        Self::ExternalName(name)
2173    }
2174}
2175
2176impl From<MachLabel> for RelocTarget {
2177    fn from(label: MachLabel) -> Self {
2178        Self::Label(label)
2179    }
2180}
2181
2182/// A Relocation target
2183#[derive(Debug, Clone, PartialEq, Eq, Hash)]
2184#[cfg_attr(
2185    feature = "enable-serde",
2186    derive(serde_derive::Serialize, serde_derive::Deserialize)
2187)]
2188pub enum FinalizedRelocTarget {
2189    /// Points to an [ExternalName] outside the current function.
2190    ExternalName(ExternalName),
2191    /// Points to a [CodeOffset] from the start of the current function.
2192    Func(CodeOffset),
2193}
2194
2195impl FinalizedRelocTarget {
2196    /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
2197    /// output.
2198    pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
2199        match self {
2200            FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
2201            FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
2202        }
2203    }
2204}
2205
2206/// A trap record resulting from a compilation.
2207#[derive(Clone, Debug, PartialEq)]
2208#[cfg_attr(
2209    feature = "enable-serde",
2210    derive(serde_derive::Serialize, serde_derive::Deserialize)
2211)]
2212pub struct MachTrap {
2213    /// The offset at which the trap instruction occurs, *relative to the
2214    /// containing section*.
2215    pub offset: CodeOffset,
2216    /// The trap code.
2217    pub code: TrapCode,
2218}
2219
2220/// A call site record resulting from a compilation.
2221#[derive(Clone, Debug, PartialEq)]
2222#[cfg_attr(
2223    feature = "enable-serde",
2224    derive(serde_derive::Serialize, serde_derive::Deserialize)
2225)]
2226pub struct MachCallSite {
2227    /// The offset of the call's return address, *relative to the
2228    /// start of the buffer*.
2229    pub ret_addr: CodeOffset,
2230
2231    /// The offset from the FP at this callsite down to the SP when
2232    /// the call occurs, if known. In other words, the size of the
2233    /// stack frame up to the saved FP slot. Useful to recover the
2234    /// start of the stack frame and to look up dynamic contexts
2235    /// stored in [`ExceptionContextLoc::SPOffset`].
2236    ///
2237    /// If `None`, the compiler backend did not specify a frame
2238    /// offset. The runtime in use with the compiled code may require
2239    /// the frame offset if exception handlers are present or dynamic
2240    /// context is used, but that is not Cranelift's concern: the
2241    /// frame offset is optional at this level.
2242    pub frame_offset: Option<u32>,
2243
2244    /// Range in `exception_handlers` corresponding to the exception
2245    /// handlers for this callsite.
2246    exception_handler_range: Range<u32>,
2247}
2248
2249/// A call site record resulting from a compilation.
2250#[derive(Clone, Debug, PartialEq)]
2251pub struct FinalizedMachCallSite<'a> {
2252    /// The offset of the call's return address, *relative to the
2253    /// start of the buffer*.
2254    pub ret_addr: CodeOffset,
2255
2256    /// The offset from the FP at this callsite down to the SP when
2257    /// the call occurs, if known. In other words, the size of the
2258    /// stack frame up to the saved FP slot. Useful to recover the
2259    /// start of the stack frame and to look up dynamic contexts
2260    /// stored in [`ExceptionContextLoc::SPOffset`].
2261    ///
2262    /// If `None`, the compiler backend did not specify a frame
2263    /// offset. The runtime in use with the compiled code may require
2264    /// the frame offset if exception handlers are present or dynamic
2265    /// context is used, but that is not Cranelift's concern: the
2266    /// frame offset is optional at this level.
2267    pub frame_offset: Option<u32>,
2268
2269    /// Exception handlers at this callsite, with target offsets
2270    /// *relative to the start of the buffer*.
2271    pub exception_handlers: &'a [FinalizedMachExceptionHandler],
2272}
2273
2274/// A patchable call site record resulting from a compilation.
2275#[derive(Clone, Debug, PartialEq)]
2276#[cfg_attr(
2277    feature = "enable-serde",
2278    derive(serde_derive::Serialize, serde_derive::Deserialize)
2279)]
2280pub struct MachPatchableCallSite {
2281    /// The offset of the call's return address (i.e., the address
2282    /// after the end of the patchable region), *relative to the start
2283    /// of the buffer*.
2284    pub ret_addr: CodeOffset,
2285
2286    /// The length of the region to be patched by NOP bytes.
2287    pub len: u32,
2288}
2289
2290/// A source-location mapping resulting from a compilation.
2291#[derive(PartialEq, Debug, Clone)]
2292#[cfg_attr(
2293    feature = "enable-serde",
2294    derive(serde_derive::Serialize, serde_derive::Deserialize)
2295)]
2296pub struct MachSrcLoc<T: CompilePhase> {
2297    /// The start of the region of code corresponding to a source location.
2298    /// This is relative to the start of the function, not to the start of the
2299    /// section.
2300    pub start: CodeOffset,
2301    /// The end of the region of code corresponding to a source location.
2302    /// This is relative to the start of the function, not to the start of the
2303    /// section.
2304    pub end: CodeOffset,
2305    /// The source location.
2306    pub loc: T::SourceLocType,
2307}
2308
2309impl MachSrcLoc<Stencil> {
2310    fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
2311        MachSrcLoc {
2312            start: self.start,
2313            end: self.end,
2314            loc: self.loc.expand(base_srcloc),
2315        }
2316    }
2317}
2318
2319/// Record of branch instruction in the buffer, to facilitate editing.
2320#[derive(Clone, Debug)]
2321struct MachBranch {
2322    start: CodeOffset,
2323    end: CodeOffset,
2324    target: MachLabel,
2325    fixup: usize,
2326    inverted: Option<SmallVec<[u8; 8]>>,
2327    /// All labels pointing to the start of this branch. For correctness, this
2328    /// *must* be complete (i.e., must contain all labels whose resolved offsets
2329    /// are at the start of this branch): we rely on being able to redirect all
2330    /// labels that could jump to this branch before removing it, if it is
2331    /// otherwise unreachable.
2332    labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2333}
2334
2335impl MachBranch {
2336    fn is_cond(&self) -> bool {
2337        self.inverted.is_some()
2338    }
2339    fn is_uncond(&self) -> bool {
2340        self.inverted.is_none()
2341    }
2342}
2343
2344/// Stack-frame layout information carried through to machine
2345/// code. This provides sufficient information to interpret an active
2346/// stack frame from a running function, if provided.
2347#[derive(Clone, Debug, PartialEq)]
2348#[cfg_attr(
2349    feature = "enable-serde",
2350    derive(serde_derive::Serialize, serde_derive::Deserialize)
2351)]
2352pub struct MachBufferFrameLayout {
2353    /// Offset from bottom of frame to FP (near top of frame). This
2354    /// allows reading the frame given only FP.
2355    pub frame_to_fp_offset: u32,
2356    /// Offset from bottom of frame for each StackSlot,
2357    pub stackslots: SecondaryMap<ir::StackSlot, MachBufferStackSlot>,
2358}
2359
2360/// Descriptor for a single stack slot in the compiled function.
2361#[derive(Clone, Debug, PartialEq, Default)]
2362#[cfg_attr(
2363    feature = "enable-serde",
2364    derive(serde_derive::Serialize, serde_derive::Deserialize)
2365)]
2366pub struct MachBufferStackSlot {
2367    /// Offset from the bottom of the stack frame.
2368    pub offset: u32,
2369
2370    /// User-provided key to describe this stack slot.
2371    pub key: Option<ir::StackSlotKey>,
2372}
2373
2374/// Debug tags: a sequence of references to a stack slot, or a
2375/// user-defined value, at a particular PC.
2376#[derive(Clone, Debug, PartialEq)]
2377#[cfg_attr(
2378    feature = "enable-serde",
2379    derive(serde_derive::Serialize, serde_derive::Deserialize)
2380)]
2381pub(crate) struct MachDebugTags {
2382    /// Offset at which this tag applies.
2383    pub offset: CodeOffset,
2384
2385    /// Position on the attached instruction. This indicates whether
2386    /// the tags attach to the prior instruction (i.e., as a return
2387    /// point from a call) or the current instruction (i.e., as a PC
2388    /// seen during a trap).
2389    pub pos: MachDebugTagPos,
2390
2391    /// The range in the tag pool.
2392    pub range: Range<u32>,
2393}
2394
2395/// Debug tag position on an instruction.
2396///
2397/// We need to distinguish position on an instruction, and not just
2398/// use offsets, because of the following case:
2399///
2400/// ```plain
2401/// <tag1, tag2> call ...
2402/// <tag3, tag4> trapping_store ...
2403/// ```
2404///
2405/// If the stack is walked and interpreted with debug tags while
2406/// within the call, the PC seen will be the return point, i.e. the
2407/// address after the call. If the stack is walked and interpreted
2408/// with debug tags upon a trap of the following instruction, it will
2409/// be the PC of that instruction -- which is the same PC! Thus to
2410/// disambiguate which tags we want, we attach a "pre/post" flag to
2411/// every group of tags at an offset; and when we look up tags, we
2412/// look them up for an offset and "position" at that offset.
2413///
2414/// Thus there are logically two positions at every offset -- so the
2415/// above will be emitted as
2416///
2417/// ```plain
2418/// 0: call ...
2419///                          4, post: <tag1, tag2>
2420///                          4, pre: <tag3, tag4>
2421/// 4: trapping_store ...
2422/// ```
2423#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2424#[cfg_attr(
2425    feature = "enable-serde",
2426    derive(serde_derive::Serialize, serde_derive::Deserialize)
2427)]
2428pub enum MachDebugTagPos {
2429    /// Tags attached after the instruction that ends at this offset.
2430    ///
2431    /// This is used to attach tags to a call, because the PC we see
2432    /// when walking the stack is the *return point*.
2433    Post,
2434    /// Tags attached before the instruction that starts at this offset.
2435    ///
2436    /// This is used to attach tags to every other kind of
2437    /// instruction, because the PC we see when processing a trap of
2438    /// that instruction is the PC of that instruction, not the
2439    /// following one.
2440    Pre,
2441}
2442
2443/// Iterator item for visiting debug tags.
2444pub struct MachBufferDebugTagList<'a> {
2445    /// Offset at which this tag applies.
2446    pub offset: CodeOffset,
2447
2448    /// Position at this offset ("post", attaching to prior
2449    /// instruction, or "pre", attaching to next instruction).
2450    pub pos: MachDebugTagPos,
2451
2452    /// The underlying tags.
2453    pub tags: &'a [DebugTag],
2454}
2455
2456/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2457///
2458/// Note that `MachBuffer` was primarily written for intra-function references
2459/// of jumps between basic blocks, but it's also quite usable for entire text
2460/// sections and resolving references between functions themselves. This
2461/// builder interprets "blocks" as labeled functions for the purposes of
2462/// resolving labels internally in the buffer.
2463pub struct MachTextSectionBuilder<I: VCodeInst> {
2464    buf: MachBuffer<I>,
2465    next_func: usize,
2466    force_veneers: ForceVeneers,
2467}
2468
2469impl<I: VCodeInst> MachTextSectionBuilder<I> {
2470    /// Creates a new text section builder which will have `num_funcs` functions
2471    /// pushed into it.
2472    pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2473        let mut buf = MachBuffer::new();
2474        buf.reserve_labels_for_blocks(num_funcs);
2475        MachTextSectionBuilder {
2476            buf,
2477            next_func: 0,
2478            force_veneers: ForceVeneers::No,
2479        }
2480    }
2481}
2482
2483impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2484    fn append(
2485        &mut self,
2486        labeled: bool,
2487        func: &[u8],
2488        align: u32,
2489        ctrl_plane: &mut ControlPlane,
2490    ) -> u64 {
2491        // Conditionally emit an island if it's necessary to resolve jumps
2492        // between functions which are too far away.
2493        let size = func.len() as u32;
2494        if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2495            self.buf
2496                .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2497        }
2498
2499        self.buf.align_to(align);
2500        let pos = self.buf.cur_offset();
2501        if labeled {
2502            self.buf.bind_label(
2503                MachLabel::from_block(BlockIndex::new(self.next_func)),
2504                ctrl_plane,
2505            );
2506            self.next_func += 1;
2507        }
2508        self.buf.put_data(func);
2509        u64::from(pos)
2510    }
2511
2512    fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2513        crate::trace!(
2514            "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2515        );
2516        let label = MachLabel::from_block(BlockIndex::new(target));
2517        let offset = u32::try_from(offset).unwrap();
2518        match I::LabelUse::from_reloc(reloc, addend) {
2519            Some(label_use) => {
2520                self.buf.use_label_at_offset(offset, label, label_use);
2521                true
2522            }
2523            None => false,
2524        }
2525    }
2526
2527    fn force_veneers(&mut self) {
2528        self.force_veneers = ForceVeneers::Yes;
2529    }
2530
2531    fn write(&mut self, offset: u64, data: &[u8]) {
2532        self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2533    }
2534
2535    fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2536        // Double-check all functions were pushed.
2537        assert_eq!(self.next_func, self.buf.label_offsets.len());
2538
2539        // Finish up any veneers, if necessary.
2540        self.buf
2541            .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2542
2543        // We don't need the data any more, so return it to the caller.
2544        mem::take(&mut self.buf.data).into_vec()
2545    }
2546}
2547
2548// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2549#[cfg(all(test, feature = "arm64"))]
2550mod test {
2551    use cranelift_entity::EntityRef as _;
2552
2553    use super::*;
2554    use crate::ir::UserExternalNameRef;
2555    use crate::isa::aarch64;
2556    use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2557    use crate::isa::aarch64::inst::{OperandSize, xreg};
2558    use crate::machinst::{MachInst, MachInstEmit, MachInstEmitState};
2559    use crate::settings;
2560
2561    fn label(n: u32) -> MachLabel {
2562        MachLabel::from_block(BlockIndex::new(n as usize))
2563    }
2564    fn target(n: u32) -> BranchTarget {
2565        BranchTarget::Label(label(n))
2566    }
2567
2568    fn emit_info() -> EmitInfo {
2569        let flags = settings::Flags::new(settings::builder());
2570        let isa_flags = aarch64::settings::Flags::new(&flags, &aarch64::settings::builder());
2571        EmitInfo::new(flags, isa_flags)
2572    }
2573
2574    #[test]
2575    fn test_elide_jump_to_next() {
2576        let info = emit_info();
2577        let mut buf = MachBuffer::new();
2578        let mut state = <Inst as MachInstEmit>::State::default();
2579        let constants = Default::default();
2580
2581        buf.reserve_labels_for_blocks(2);
2582        buf.bind_label(label(0), state.ctrl_plane_mut());
2583        let inst = Inst::Jump { dest: target(1) };
2584        inst.emit(&mut buf, &info, &mut state);
2585        buf.bind_label(label(1), state.ctrl_plane_mut());
2586        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2587        assert_eq!(0, buf.total_size());
2588    }
2589
2590    #[test]
2591    fn test_elide_trivial_jump_blocks() {
2592        let info = emit_info();
2593        let mut buf = MachBuffer::new();
2594        let mut state = <Inst as MachInstEmit>::State::default();
2595        let constants = Default::default();
2596
2597        buf.reserve_labels_for_blocks(4);
2598
2599        buf.bind_label(label(0), state.ctrl_plane_mut());
2600        let inst = Inst::CondBr {
2601            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2602            taken: target(1),
2603            not_taken: target(2),
2604        };
2605        inst.emit(&mut buf, &info, &mut state);
2606
2607        buf.bind_label(label(1), state.ctrl_plane_mut());
2608        let inst = Inst::Jump { dest: target(3) };
2609        inst.emit(&mut buf, &info, &mut state);
2610
2611        buf.bind_label(label(2), state.ctrl_plane_mut());
2612        let inst = Inst::Jump { dest: target(3) };
2613        inst.emit(&mut buf, &info, &mut state);
2614
2615        buf.bind_label(label(3), state.ctrl_plane_mut());
2616
2617        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2618        assert_eq!(0, buf.total_size());
2619    }
2620
2621    #[test]
2622    fn test_flip_cond() {
2623        let info = emit_info();
2624        let mut buf = MachBuffer::new();
2625        let mut state = <Inst as MachInstEmit>::State::default();
2626        let constants = Default::default();
2627
2628        buf.reserve_labels_for_blocks(4);
2629
2630        buf.bind_label(label(0), state.ctrl_plane_mut());
2631        let inst = Inst::CondBr {
2632            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2633            taken: target(1),
2634            not_taken: target(2),
2635        };
2636        inst.emit(&mut buf, &info, &mut state);
2637
2638        buf.bind_label(label(1), state.ctrl_plane_mut());
2639        let inst = Inst::Nop4;
2640        inst.emit(&mut buf, &info, &mut state);
2641
2642        buf.bind_label(label(2), state.ctrl_plane_mut());
2643        let inst = Inst::Udf {
2644            trap_code: TrapCode::STACK_OVERFLOW,
2645        };
2646        inst.emit(&mut buf, &info, &mut state);
2647
2648        buf.bind_label(label(3), state.ctrl_plane_mut());
2649
2650        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2651
2652        let mut buf2 = MachBuffer::new();
2653        let mut state = Default::default();
2654        let inst = Inst::TrapIf {
2655            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2656            trap_code: TrapCode::STACK_OVERFLOW,
2657        };
2658        inst.emit(&mut buf2, &info, &mut state);
2659        let inst = Inst::Nop4;
2660        inst.emit(&mut buf2, &info, &mut state);
2661
2662        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2663
2664        assert_eq!(buf.data, buf2.data);
2665    }
2666
2667    #[test]
2668    fn test_island() {
2669        let info = emit_info();
2670        let mut buf = MachBuffer::new();
2671        let mut state = <Inst as MachInstEmit>::State::default();
2672        let constants = Default::default();
2673
2674        buf.reserve_labels_for_blocks(4);
2675
2676        buf.bind_label(label(0), state.ctrl_plane_mut());
2677        let inst = Inst::CondBr {
2678            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2679            taken: target(2),
2680            not_taken: target(3),
2681        };
2682        inst.emit(&mut buf, &info, &mut state);
2683
2684        buf.bind_label(label(1), state.ctrl_plane_mut());
2685        while buf.cur_offset() < 2000000 {
2686            if buf.island_needed(0) {
2687                buf.emit_island(0, state.ctrl_plane_mut());
2688            }
2689            let inst = Inst::Nop4;
2690            inst.emit(&mut buf, &info, &mut state);
2691        }
2692
2693        buf.bind_label(label(2), state.ctrl_plane_mut());
2694        let inst = Inst::Nop4;
2695        inst.emit(&mut buf, &info, &mut state);
2696
2697        buf.bind_label(label(3), state.ctrl_plane_mut());
2698        let inst = Inst::Nop4;
2699        inst.emit(&mut buf, &info, &mut state);
2700
2701        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2702
2703        assert_eq!(2000000 + 8, buf.total_size());
2704
2705        let mut buf2 = MachBuffer::new();
2706        let mut state = Default::default();
2707        let inst = Inst::CondBr {
2708            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2709
2710            // This conditionally taken branch has a 19-bit constant, shifted
2711            // to the left by two, giving us a 21-bit range in total. Half of
2712            // this range positive so the we should be around 1 << 20 bytes
2713            // away for our jump target.
2714            //
2715            // There are two pending fixups by the time we reach this point,
2716            // one for this 19-bit jump and one for the unconditional 26-bit
2717            // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2718            // veneer is 20 bytes large, which means that pessimistically
2719            // assuming we'll need two veneers. Currently each veneer is
2720            // pessimistically assumed to be the maximal size which means we
2721            // need 40 bytes of extra space, meaning that the actual island
2722            // should come 40-bytes before the deadline.
2723            taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2724
2725            // This branch is in-range so no veneers should be needed, it should
2726            // go directly to the target.
2727            not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2728        };
2729        inst.emit(&mut buf2, &info, &mut state);
2730
2731        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2732
2733        assert_eq!(&buf.data[0..8], &buf2.data[..]);
2734    }
2735
2736    #[test]
2737    fn test_island_backward() {
2738        let info = emit_info();
2739        let mut buf = MachBuffer::new();
2740        let mut state = <Inst as MachInstEmit>::State::default();
2741        let constants = Default::default();
2742
2743        buf.reserve_labels_for_blocks(4);
2744
2745        buf.bind_label(label(0), state.ctrl_plane_mut());
2746        let inst = Inst::Nop4;
2747        inst.emit(&mut buf, &info, &mut state);
2748
2749        buf.bind_label(label(1), state.ctrl_plane_mut());
2750        let inst = Inst::Nop4;
2751        inst.emit(&mut buf, &info, &mut state);
2752
2753        buf.bind_label(label(2), state.ctrl_plane_mut());
2754        while buf.cur_offset() < 2000000 {
2755            let inst = Inst::Nop4;
2756            inst.emit(&mut buf, &info, &mut state);
2757        }
2758
2759        buf.bind_label(label(3), state.ctrl_plane_mut());
2760        let inst = Inst::CondBr {
2761            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2762            taken: target(0),
2763            not_taken: target(1),
2764        };
2765        inst.emit(&mut buf, &info, &mut state);
2766
2767        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2768
2769        assert_eq!(2000000 + 12, buf.total_size());
2770
2771        let mut buf2 = MachBuffer::new();
2772        let mut state = Default::default();
2773        let inst = Inst::CondBr {
2774            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2775            taken: BranchTarget::ResolvedOffset(8),
2776            not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2777        };
2778        inst.emit(&mut buf2, &info, &mut state);
2779        let inst = Inst::Jump {
2780            dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2781        };
2782        inst.emit(&mut buf2, &info, &mut state);
2783
2784        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2785
2786        assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2787    }
2788
2789    #[test]
2790    fn test_multiple_redirect() {
2791        // label0:
2792        //   cbz x0, label1
2793        //   b label2
2794        // label1:
2795        //   b label3
2796        // label2:
2797        //   nop
2798        //   nop
2799        //   b label0
2800        // label3:
2801        //   b label4
2802        // label4:
2803        //   b label5
2804        // label5:
2805        //   b label7
2806        // label6:
2807        //   nop
2808        // label7:
2809        //   ret
2810        //
2811        // -- should become:
2812        //
2813        // label0:
2814        //   cbz x0, label7
2815        // label2:
2816        //   nop
2817        //   nop
2818        //   b label0
2819        // label6:
2820        //   nop
2821        // label7:
2822        //   ret
2823
2824        let info = emit_info();
2825        let mut buf = MachBuffer::new();
2826        let mut state = <Inst as MachInstEmit>::State::default();
2827        let constants = Default::default();
2828
2829        buf.reserve_labels_for_blocks(8);
2830
2831        buf.bind_label(label(0), state.ctrl_plane_mut());
2832        let inst = Inst::CondBr {
2833            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2834            taken: target(1),
2835            not_taken: target(2),
2836        };
2837        inst.emit(&mut buf, &info, &mut state);
2838
2839        buf.bind_label(label(1), state.ctrl_plane_mut());
2840        let inst = Inst::Jump { dest: target(3) };
2841        inst.emit(&mut buf, &info, &mut state);
2842
2843        buf.bind_label(label(2), state.ctrl_plane_mut());
2844        let inst = Inst::Nop4;
2845        inst.emit(&mut buf, &info, &mut state);
2846        inst.emit(&mut buf, &info, &mut state);
2847        let inst = Inst::Jump { dest: target(0) };
2848        inst.emit(&mut buf, &info, &mut state);
2849
2850        buf.bind_label(label(3), state.ctrl_plane_mut());
2851        let inst = Inst::Jump { dest: target(4) };
2852        inst.emit(&mut buf, &info, &mut state);
2853
2854        buf.bind_label(label(4), state.ctrl_plane_mut());
2855        let inst = Inst::Jump { dest: target(5) };
2856        inst.emit(&mut buf, &info, &mut state);
2857
2858        buf.bind_label(label(5), state.ctrl_plane_mut());
2859        let inst = Inst::Jump { dest: target(7) };
2860        inst.emit(&mut buf, &info, &mut state);
2861
2862        buf.bind_label(label(6), state.ctrl_plane_mut());
2863        let inst = Inst::Nop4;
2864        inst.emit(&mut buf, &info, &mut state);
2865
2866        buf.bind_label(label(7), state.ctrl_plane_mut());
2867        let inst = Inst::Ret {};
2868        inst.emit(&mut buf, &info, &mut state);
2869
2870        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2871
2872        let golden_data = vec![
2873            0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2874            0x1f, 0x20, 0x03, 0xd5, // nop
2875            0x1f, 0x20, 0x03, 0xd5, // nop
2876            0xfd, 0xff, 0xff, 0x17, // b 0
2877            0x1f, 0x20, 0x03, 0xd5, // nop
2878            0xc0, 0x03, 0x5f, 0xd6, // ret
2879        ];
2880
2881        assert_eq!(&golden_data[..], &buf.data[..]);
2882    }
2883
2884    #[test]
2885    fn test_handle_branch_cycle() {
2886        // label0:
2887        //   b label1
2888        // label1:
2889        //   b label2
2890        // label2:
2891        //   b label3
2892        // label3:
2893        //   b label4
2894        // label4:
2895        //   b label1  // note: not label0 (to make it interesting).
2896        //
2897        // -- should become:
2898        //
2899        // label0, label1, ..., label4:
2900        //   b label0
2901        let info = emit_info();
2902        let mut buf = MachBuffer::new();
2903        let mut state = <Inst as MachInstEmit>::State::default();
2904        let constants = Default::default();
2905
2906        buf.reserve_labels_for_blocks(5);
2907
2908        buf.bind_label(label(0), state.ctrl_plane_mut());
2909        let inst = Inst::Jump { dest: target(1) };
2910        inst.emit(&mut buf, &info, &mut state);
2911
2912        buf.bind_label(label(1), state.ctrl_plane_mut());
2913        let inst = Inst::Jump { dest: target(2) };
2914        inst.emit(&mut buf, &info, &mut state);
2915
2916        buf.bind_label(label(2), state.ctrl_plane_mut());
2917        let inst = Inst::Jump { dest: target(3) };
2918        inst.emit(&mut buf, &info, &mut state);
2919
2920        buf.bind_label(label(3), state.ctrl_plane_mut());
2921        let inst = Inst::Jump { dest: target(4) };
2922        inst.emit(&mut buf, &info, &mut state);
2923
2924        buf.bind_label(label(4), state.ctrl_plane_mut());
2925        let inst = Inst::Jump { dest: target(1) };
2926        inst.emit(&mut buf, &info, &mut state);
2927
2928        let buf = buf.finish(&constants, state.ctrl_plane_mut());
2929
2930        let golden_data = vec![
2931            0x00, 0x00, 0x00, 0x14, // b 0
2932        ];
2933
2934        assert_eq!(&golden_data[..], &buf.data[..]);
2935    }
2936
2937    #[test]
2938    fn metadata_records() {
2939        let mut buf = MachBuffer::<Inst>::new();
2940        let ctrl_plane = &mut Default::default();
2941        let constants = Default::default();
2942
2943        buf.reserve_labels_for_blocks(3);
2944
2945        buf.bind_label(label(0), ctrl_plane);
2946        buf.put1(1);
2947        buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
2948        buf.put1(2);
2949        buf.add_trap(TrapCode::INTEGER_OVERFLOW);
2950        buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
2951        buf.add_try_call_site(
2952            Some(0x10),
2953            [
2954                MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),
2955                MachExceptionHandler::Default(label(1)),
2956            ]
2957            .into_iter(),
2958        );
2959        buf.add_reloc(
2960            Reloc::Abs4,
2961            &ExternalName::User(UserExternalNameRef::new(0)),
2962            0,
2963        );
2964        buf.put1(3);
2965        buf.add_reloc(
2966            Reloc::Abs8,
2967            &ExternalName::User(UserExternalNameRef::new(1)),
2968            1,
2969        );
2970        buf.put1(4);
2971        buf.bind_label(label(1), ctrl_plane);
2972        buf.put1(0xff);
2973        buf.bind_label(label(2), ctrl_plane);
2974        buf.put1(0xff);
2975
2976        let buf = buf.finish(&constants, ctrl_plane);
2977
2978        assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
2979        assert_eq!(
2980            buf.traps()
2981                .iter()
2982                .map(|trap| (trap.offset, trap.code))
2983                .collect::<Vec<_>>(),
2984            vec![
2985                (1, TrapCode::HEAP_OUT_OF_BOUNDS),
2986                (2, TrapCode::INTEGER_OVERFLOW),
2987                (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
2988            ]
2989        );
2990        let call_sites: Vec<_> = buf.call_sites().collect();
2991        assert_eq!(call_sites[0].ret_addr, 2);
2992        assert_eq!(call_sites[0].frame_offset, Some(0x10));
2993        assert_eq!(
2994            call_sites[0].exception_handlers,
2995            &[
2996                FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),
2997                FinalizedMachExceptionHandler::Default(4)
2998            ],
2999        );
3000        assert_eq!(
3001            buf.relocs()
3002                .iter()
3003                .map(|reloc| (reloc.offset, reloc.kind))
3004                .collect::<Vec<_>>(),
3005            vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
3006        );
3007    }
3008
3009    /// Drive the buffer in the same idiom that VCode emission uses:
3010    /// emit instruction bytes via the given closure, then run the
3011    /// per-instruction island check.
3012    fn emit_with_island_check(
3013        buf: &mut MachBuffer<Inst>,
3014        state: &mut <Inst as MachInstEmit>::State,
3015        f: impl FnOnce(&mut MachBuffer<Inst>, &mut <Inst as MachInstEmit>::State),
3016    ) {
3017        f(buf, state);
3018        let lookahead = Inst::worst_case_size() + Inst::worst_case_island_growth();
3019        if buf.island_needed(lookahead) {
3020            let jump_around = buf.get_label();
3021            Inst::gen_jump(jump_around).emit(buf, &emit_info(), state);
3022            buf.emit_island(0, state.ctrl_plane_mut());
3023            buf.bind_label(jump_around, state.ctrl_plane_mut());
3024        }
3025    }
3026
3027    /// Many constant loads in a single basic block: each emits an
3028    /// `Ldr19` reference (+/- 1 MiB range, no veneer support) and
3029    /// adds a 16-byte constant to the pool. Without the
3030    /// per-instruction island check, the pool would grow past the
3031    /// reach of earlier `ldr`s and the buffer would panic during the
3032    /// final fixup pass (see issue #12968).
3033    #[test]
3034    fn test_many_constants_in_one_block() {
3035        use crate::ir::constant::ConstantData;
3036
3037        let mut buf = MachBuffer::<Inst>::new();
3038        let mut state = <Inst as MachInstEmit>::State::default();
3039        let mut constants = VCodeConstants::default();
3040
3041        let n = 200_000;
3042        let mut handles = Vec::with_capacity(n);
3043        for i in 0..n {
3044            let bytes: Vec<u8> = (0..16).map(|b| ((i + b) & 0xff) as u8).collect();
3045            let data = VCodeConstantData::Generated(ConstantData::from(&bytes[..]));
3046            handles.push(constants.insert(data));
3047        }
3048        buf.register_constants(&constants);
3049
3050        buf.reserve_labels_for_blocks(1);
3051        buf.bind_label(label(0), state.ctrl_plane_mut());
3052
3053        for &handle in &handles {
3054            emit_with_island_check(&mut buf, &mut state, |buf, _state| {
3055                let off = buf.cur_offset();
3056                let const_label = buf.get_label_for_constant(handle);
3057                buf.use_label_at_offset(off, const_label, aarch64::inst::LabelUse::Ldr19);
3058                // Placeholder ldr literal instruction.
3059                buf.put4(0);
3060            });
3061        }
3062
3063        let _ = buf.finish(&constants, state.ctrl_plane_mut());
3064    }
3065
3066    /// Mix conditional branches with short ranges (`Branch19`, +/- 1
3067    /// MiB) and many constant loads. The branches' veneers must
3068    /// remain in range as the constant pool grows.
3069    #[test]
3070    fn test_short_branch_amid_constants() {
3071        use crate::ir::constant::ConstantData;
3072
3073        let mut buf = MachBuffer::<Inst>::new();
3074        let mut state = <Inst as MachInstEmit>::State::default();
3075        let mut constants = VCodeConstants::default();
3076
3077        let n = 100_000;
3078        let mut handles = Vec::with_capacity(n);
3079        for i in 0..n {
3080            let bytes: Vec<u8> = (0..16).map(|b| ((i ^ b) & 0xff) as u8).collect();
3081            handles.push(
3082                constants.insert(VCodeConstantData::Generated(ConstantData::from(&bytes[..]))),
3083            );
3084        }
3085        buf.register_constants(&constants);
3086
3087        buf.reserve_labels_for_blocks(2);
3088        buf.bind_label(label(0), state.ctrl_plane_mut());
3089
3090        emit_with_island_check(&mut buf, &mut state, |buf, state| {
3091            let inst = Inst::CondBr {
3092                kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
3093                taken: target(1),
3094                not_taken: target(0),
3095            };
3096            inst.emit(buf, &emit_info(), state);
3097        });
3098
3099        for &handle in &handles {
3100            emit_with_island_check(&mut buf, &mut state, |buf, _state| {
3101                let off = buf.cur_offset();
3102                let const_label = buf.get_label_for_constant(handle);
3103                buf.use_label_at_offset(off, const_label, aarch64::inst::LabelUse::Ldr19);
3104                buf.put4(0);
3105            });
3106        }
3107
3108        buf.bind_label(label(1), state.ctrl_plane_mut());
3109        let _ = buf.finish(&constants, state.ctrl_plane_mut());
3110    }
3111
3112    /// Driving an island in the middle of a block via the
3113    /// jump-around-plus-`emit_island` idiom must emit a branch followed
3114    /// by the island contents, such that fall-through reaches the
3115    /// post-island code.
3116    #[test]
3117    fn test_mid_block_island_via_gen_jump() {
3118        let mut buf = MachBuffer::<Inst>::new();
3119        let mut state = <Inst as MachInstEmit>::State::default();
3120        let constants = VCodeConstants::default();
3121
3122        buf.reserve_labels_for_blocks(1);
3123        buf.bind_label(label(0), state.ctrl_plane_mut());
3124
3125        // Place a trap which will need to be emitted in the next island.
3126        let trap_label = buf.defer_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
3127        let off = buf.cur_offset();
3128        buf.use_label_at_offset(off, trap_label, aarch64::inst::LabelUse::Branch19);
3129        buf.put4(0); // placeholder for cbnz-like reference
3130
3131        let before = buf.cur_offset();
3132        let jump_around = buf.get_label();
3133        Inst::gen_jump(jump_around).emit(&mut buf, &emit_info(), &mut state);
3134        buf.emit_island(0, state.ctrl_plane_mut());
3135        buf.bind_label(jump_around, state.ctrl_plane_mut());
3136        let after = buf.cur_offset();
3137
3138        // The jump-around branch (4 bytes on AArch64) plus at least the
3139        // deferred trap (4 bytes) gives a minimum island growth of 8 bytes.
3140        assert!(
3141            after - before >= 8,
3142            "island grew too little: {}",
3143            after - before
3144        );
3145
3146        let buf = buf.finish(&constants, state.ctrl_plane_mut());
3147        let _ = buf.total_size();
3148    }
3149}