cranelift_codegen/machinst/buffer.rs
1//! In-memory representation of compiled machine code, with labels and fixups to
2//! refer to those labels. Handles constant-pool island insertion and also
3//! veneer insertion for out-of-range jumps.
4//!
5//! This code exists to solve three problems:
6//!
7//! - Branch targets for forward branches are not known until later, when we
8//! emit code in a single pass through the instruction structs.
9//!
10//! - On many architectures, address references or offsets have limited range.
11//! For example, on AArch64, conditional branches can only target code +/- 1MB
12//! from the branch itself.
13//!
14//! - The lowering of control flow from the CFG-with-edges produced by
15//! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16//! edge blocks when the register allocator does not need to insert any
17//! spills/reloads/moves in edge blocks, results in many suboptimal branch
18//! patterns. The lowering also pays no attention to block order, and so
19//! two-target conditional forms (cond-br followed by uncond-br) can often by
20//! avoided because one of the targets is the fallthrough. There are several
21//! cases here where we can simplify to use fewer branches.
22//!
23//! This "buffer" implements a single-pass code emission strategy (with a later
24//! "fixup" pass, but only through recorded fixups, not all instructions). The
25//! basic idea is:
26//!
27//! - Emit branches as they are, including two-target (cond/uncond) compound
28//! forms, but with zero offsets and optimistically assuming the target will be
29//! in range. Record the "fixup" for later. Targets are denoted instead by
30//! symbolic "labels" that are then bound to certain offsets in the buffer as
31//! we emit code. (Nominally, there is a label at the start of every basic
32//! block.)
33//!
34//! - As we do this, track the offset in the buffer at which the first label
35//! reference "goes out of range". We call this the "deadline". If we reach the
36//! deadline and we still have not bound the label to which an unresolved branch
37//! refers, we have a problem!
38//!
39//! - To solve this problem, we emit "islands" full of "veneers". An island is
40//! simply a chunk of code inserted in the middle of the code actually produced
41//! by the emitter (e.g., VCode iterating over instruction structs). Islands
42//! are emitted at "safe" points (no fall-through into the island contents):
43//! between basic blocks during emission, or via a jump around the island.
44//!
45//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
46//! that implements a longer-range reference to a label. The idea is that, for
47//! example, a branch with a limited range can branch to a "veneer" instead,
48//! which is simply a branch in a form that can use a longer-range reference. On
49//! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
50//! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
51//! conditional branch's label reference can be fixed up with a "veneer" to
52//! achieve a longer range.
53//!
54//! - To implement all of this, we require the backend to provide a `LabelUse`
55//! type that implements a trait. This is nominally an enum that records one of
56//! several kinds of references to an offset in code -- basically, a relocation
57//! type -- and will usually correspond to different instruction formats. The
58//! `LabelUse` implementation specifies the maximum range, how to patch in the
59//! actual label location when known, and how to generate a veneer to extend the
60//! range.
61//!
62//! That satisfies label references, but we still may have suboptimal branch
63//! patterns. To clean up the branches, we do a simple "peephole"-style
64//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
65//! informs the buffer of branches in the code and, in the case of conditionals,
66//! the code that would have been emitted to invert this branch's condition. We
67//! track the "latest branches": these are branches that are contiguous up to
68//! the current offset. (If any code is emitted after a branch, that branch or
69//! run of contiguous branches is no longer "latest".) The latest branches are
70//! those that we can edit by simply truncating the buffer and doing something
71//! else instead.
72//!
73//! To optimize branches, we implement several simple rules, and try to apply
74//! them to the "latest branches" when possible:
75//!
76//! - A branch with a label target, when that label is bound to the ending
77//! offset of the branch (the fallthrough location), can be removed altogether,
78//! because the branch would have no effect).
79//!
80//! - An unconditional branch that starts at a label location, and branches to
81//! another label, results in a "label alias": all references to the label bound
82//! *to* this branch instruction are instead resolved to the *target* of the
83//! branch instruction. This effectively removes empty blocks that just
84//! unconditionally branch to the next block. We call this "branch threading".
85//!
86//! - A conditional followed by an unconditional, when the conditional branches
87//! to the unconditional's fallthrough, results in (i) the truncation of the
88//! unconditional, (ii) the inversion of the condition's condition, and (iii)
89//! replacement of the conditional's target (using the original target of the
90//! unconditional). This is a fancy way of saying "we can flip a two-target
91//! conditional branch's taken/not-taken targets if it works better with our
92//! fallthrough". To make this work, the emitter actually gives the buffer
93//! *both* forms of every conditional branch: the true form is emitted into the
94//! buffer, and the "inverted" machine-code bytes are provided as part of the
95//! branch-fixup metadata.
96//!
97//! - An unconditional B preceded by another unconditional P, when B's label(s) have
98//! been redirected to target(B), can be removed entirely. This is an extension
99//! of the branch-threading optimization, and is valid because if we know there
100//! will be no fallthrough into this branch instruction (the prior instruction
101//! is an unconditional jump), and if we know we have successfully redirected
102//! all labels, then this branch instruction is unreachable. Note that this
103//! works because the redirection happens before the label is ever resolved
104//! (fixups happen at island emission time, at which point latest-branches are
105//! cleared, or at the end of emission), so we are sure to catch and redirect
106//! all possible paths to this instruction.
107//!
108//! # Branch-optimization Correctness
109//!
110//! The branch-optimization mechanism depends on a few data structures with
111//! invariants, which are always held outside the scope of top-level public
112//! methods:
113//!
114//! - The latest-branches list. Each entry describes a span of the buffer
115//! (start/end offsets), the label target, the corresponding fixup-list entry
116//! index, and the bytes (must be the same length) for the inverted form, if
117//! conditional. The list of labels that are bound to the start-offset of this
118//! branch is *complete* (if any label has a resolved offset equal to `start`
119//! and is not an alias, it must appear in this list) and *precise* (no label
120//! in this list can be bound to another offset). No label in this list should
121//! be an alias. No two branch ranges can overlap, and branches are in
122//! ascending-offset order.
123//!
124//! - The labels-at-tail list. This contains all MachLabels that have been bound
125//! to (whose resolved offsets are equal to) the tail offset of the buffer.
126//! No label in this list should be an alias.
127//!
128//! - The label_offsets array, containing the bound offset of a label or
129//! UNKNOWN. No label can be bound at an offset greater than the current
130//! buffer tail.
131//!
132//! - The label_aliases array, containing another label to which a label is
133//! bound or UNKNOWN. A label's resolved offset is the resolved offset
134//! of the label it is aliased to, if this is set.
135//!
136//! We argue below, at each method, how the invariants in these data structures
137//! are maintained (grep for "Post-invariant").
138//!
139//! Given these invariants, we argue why each optimization preserves execution
140//! semantics below (grep for "Preserves execution semantics").
141//!
142//! # Deadline-correctness for islands
143//!
144//! Every label-use (and indirectly every pending constant/trap, since
145//! each is referred to by a fixup) imposes a *deadline*: the maximum
146//! offset at which the use's target may be bound while still
147//! remaining in range. Each item that may be emitted into an island
148//! (a veneer, a pending constant, or a pending trap) also contributes
149//! a bounded number of bytes to a worst-case island size. The
150//! buffer's central invariant is:
151//!
152//! > `worst_case_end_of_island(0) <= soonest_deadline`
153//!
154//! Equivalently, "if we emitted an island right now, its end offset
155//! would land before the closest expiring deadline." Given this
156//! invariant, an island is always *feasible*: items can be laid out
157//! in any order and each one lands at an offset no later than the
158//! soonest deadline, which is no later than each individual item's
159//! deadline.
160//!
161//! To maintain the invariant, the buffer's user is expected to treat
162//! *one `MachInst` emission* as the atomic commit unit. After each
163//! instruction, the worst-case end-of-island and the soonest deadline
164//! can shift by no more than `worst_case_size() +
165//! worst_case_island_growth()` and one "smallest label-use range"
166//! worth of new deadline, respectively. The user (in VCode emission)
167//! consults [`MachBuffer::island_needed`] after each instruction and
168//! if one is needed, emits a jump-around branch followed by
169//! [`MachBuffer::emit_island`].
170//!
171//! # Avoiding Quadratic Behavior
172//!
173//! There are two cases where we've had to take some care to avoid
174//! quadratic worst-case behavior:
175//!
176//! - The "labels at this branch" list can grow unboundedly if the
177//! code generator binds many labels at one location. If the count
178//! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
179//! simply abort an optimization early in a way that is always correct
180//! but is conservative.
181//!
182//! - The fixup list can interact with island emission to create
183//! "quadratic island behavior". In a little more detail, one can hit
184//! this behavior by having some pending fixups (forward label
185//! references) with long-range label-use kinds, and some others
186//! with shorter-range references that nonetheless still are pending
187//! long enough to trigger island generation. In such a case, we
188//! process the fixup list, generate veneers to extend some forward
189//! references' ranges, but leave the other (longer-range) ones
190//! alone. The way this was implemented put them back on a list and
191//! resulted in quadratic behavior.
192//!
193//! To avoid this fixups are split into two lists: one "pending" list and one
194//! final list. The pending list is kept around for handling fixups related to
195//! branches so it can be edited/truncated. When an island is reached, which
196//! starts processing fixups, all pending fixups are flushed into the final
197//! list. The final list is a `BinaryHeap` which enables fixup processing to
198//! only process those which are required during island emission, deferring
199//! all longer-range fixups to later.
200
201use crate::binemit::{Addend, CodeOffset, Reloc};
202use crate::ir::function::FunctionParameters;
203use crate::ir::{DebugTag, ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
204use crate::isa::unwind::UnwindInst;
205use crate::machinst::{
206 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
207};
208use crate::trace;
209use crate::{MachInstEmitState, ir};
210use crate::{VCodeConstantData, timing};
211use alloc::collections::BinaryHeap;
212use alloc::string::String;
213use alloc::vec::Vec;
214use core::cmp::Ordering;
215use core::mem;
216use core::ops::Range;
217use cranelift_control::ControlPlane;
218use cranelift_entity::{PrimaryMap, SecondaryMap, entity_impl};
219use smallvec::SmallVec;
220
221#[cfg(feature = "enable-serde")]
222use serde::{Deserialize, Serialize};
223
224#[cfg(feature = "enable-serde")]
225pub trait CompilePhase {
226 type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
227 type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
228}
229
230#[cfg(not(feature = "enable-serde"))]
231pub trait CompilePhase {
232 type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
233 type SourceLocType: core::fmt::Debug + PartialEq + Clone;
234}
235
236/// Status of a compiled artifact that needs patching before being used.
237#[derive(Clone, Debug, PartialEq)]
238#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
239pub struct Stencil;
240
241/// Status of a compiled artifact ready to use.
242#[derive(Clone, Debug, PartialEq)]
243pub struct Final;
244
245impl CompilePhase for Stencil {
246 type MachSrcLocType = MachSrcLoc<Stencil>;
247 type SourceLocType = RelSourceLoc;
248}
249
250impl CompilePhase for Final {
251 type MachSrcLocType = MachSrcLoc<Final>;
252 type SourceLocType = SourceLoc;
253}
254
255#[derive(Clone, Copy, Debug, PartialEq, Eq)]
256enum ForceVeneers {
257 Yes,
258 No,
259}
260
261/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
262/// in bulk.
263///
264/// This struct uses `SmallVec`s to support small-ish function bodies without
265/// any heap allocation. As such, it will be several kilobytes large. This is
266/// likely fine as long as it is stack-allocated for function emission then
267/// thrown away; but beware if many buffer objects are retained persistently.
268pub struct MachBuffer<I: VCodeInst> {
269 /// The buffer contents, as raw bytes.
270 data: SmallVec<[u8; 1024]>,
271 /// The required alignment of this buffer.
272 min_alignment: u32,
273 /// Any relocations referring to this code. Note that only *external*
274 /// relocations are tracked here; references to labels within the buffer are
275 /// resolved before emission.
276 relocs: SmallVec<[MachReloc; 16]>,
277 /// Any trap records referring to this code.
278 traps: SmallVec<[MachTrap; 16]>,
279 /// Any call site records referring to this code.
280 call_sites: SmallVec<[MachCallSite; 16]>,
281 /// Any patchable call site locations.
282 patchable_call_sites: SmallVec<[MachPatchableCallSite; 16]>,
283 /// Any exception-handler records referred to at call sites.
284 exception_handlers: SmallVec<[MachExceptionHandler; 16]>,
285 /// Any source location mappings referring to this code.
286 srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
287 /// Any debug tags referring to this code.
288 debug_tags: Vec<MachDebugTags>,
289 /// Pool of debug tags referenced by `MachDebugTags` entries.
290 debug_tag_pool: Vec<DebugTag>,
291 /// Any user stack maps for this code.
292 ///
293 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
294 /// by code offset, and each stack map covers `span` bytes on the stack.
295 user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
296 /// Any unwind info at a given location.
297 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
298 /// The current source location in progress (after `start_srcloc()` and
299 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
300 cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
301 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
302 label_offsets: SmallVec<[CodeOffset; 16]>,
303 /// Label aliases: when one label points to an unconditional jump, and that
304 /// jump points to another label, we can redirect references to the first
305 /// label immediately to the second.
306 ///
307 /// Invariant: we don't have label-alias cycles. We ensure this by,
308 /// before setting label A to alias label B, resolving B's alias
309 /// target (iteratively until a non-aliased label); if B is already
310 /// aliased to A, then we cannot alias A back to B.
311 label_aliases: SmallVec<[MachLabel; 16]>,
312 /// Constants that must be emitted at some point.
313 pending_constants: SmallVec<[VCodeConstant; 16]>,
314 /// Byte size of all constants in `pending_constants`.
315 pending_constants_size: CodeOffset,
316 /// Traps that must be emitted at some point.
317 pending_traps: SmallVec<[MachLabelTrap; 16]>,
318 /// Fixups that haven't yet been flushed into `fixup_records` below and may
319 /// be related to branches that are chomped. These all get added to
320 /// `fixup_records` during island emission.
321 pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
322 /// The nearest upcoming deadline for entries in `pending_fixup_records`.
323 pending_fixup_deadline: CodeOffset,
324 /// Fixups that must be performed after all code is emitted.
325 fixup_records: BinaryHeap<MachLabelFixup<I>>,
326 /// Latest branches, to facilitate in-place editing for better fallthrough
327 /// behavior and empty-block removal.
328 latest_branches: SmallVec<[MachBranch; 4]>,
329 /// All labels at the current offset (emission tail). This is lazily
330 /// cleared: it is actually accurate as long as the current offset is
331 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
332 /// be considered as empty.
333 ///
334 /// For correctness, this *must* be complete (i.e., the vector must contain
335 /// all labels whose offsets are resolved to the current tail), because we
336 /// rely on it to update labels when we truncate branches.
337 labels_at_tail: SmallVec<[MachLabel; 4]>,
338 /// The last offset at which `labels_at_tail` is valid. It is conceptually
339 /// always describing the tail of the buffer, but we do not clear
340 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
341 /// when the offset has grown past this (`labels_at_tail_off`) point.
342 /// Always <= `cur_offset()`.
343 labels_at_tail_off: CodeOffset,
344 /// Metadata about all constants that this function has access to.
345 ///
346 /// This records the size/alignment of all constants (not the actual data)
347 /// along with the last available label generated for the constant. This map
348 /// is consulted when constants are referred to and the label assigned to a
349 /// constant may change over time as well.
350 constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
351 /// All recorded usages of constants as pairs of the constant and where the
352 /// constant needs to be placed within `self.data`. Note that the same
353 /// constant may appear in this array multiple times if it was emitted
354 /// multiple times.
355 used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
356 /// Indicates when a patchable region is currently open, to guard that it's
357 /// not possible to nest patchable regions.
358 open_patchable: bool,
359 /// Stack frame layout metadata. If provided for a MachBuffer
360 /// containing a function body, this allows interpretation of
361 /// runtime state given a view of an active stack frame.
362 frame_layout: Option<MachBufferFrameLayout>,
363}
364
365impl MachBufferFinalized<Stencil> {
366 /// Get a finalized machine buffer by applying the function's base source location.
367 pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
368 MachBufferFinalized {
369 data: self.data,
370 relocs: self.relocs,
371 traps: self.traps,
372 call_sites: self.call_sites,
373 patchable_call_sites: self.patchable_call_sites,
374 exception_handlers: self.exception_handlers,
375 srclocs: self
376 .srclocs
377 .into_iter()
378 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
379 .collect(),
380 debug_tags: self.debug_tags,
381 debug_tag_pool: self.debug_tag_pool,
382 user_stack_maps: self.user_stack_maps,
383 unwind_info: self.unwind_info,
384 alignment: self.alignment,
385 frame_layout: self.frame_layout,
386 nop_units: self.nop_units,
387 }
388 }
389}
390
391/// A `MachBuffer` once emission is completed: holds generated code and records,
392/// without fixups. This allows the type to be independent of the backend.
393#[derive(PartialEq, Debug, Clone)]
394#[cfg_attr(
395 feature = "enable-serde",
396 derive(serde_derive::Serialize, serde_derive::Deserialize)
397)]
398pub struct MachBufferFinalized<T: CompilePhase> {
399 /// The buffer contents, as raw bytes.
400 pub(crate) data: SmallVec<[u8; 1024]>,
401 /// Any relocations referring to this code. Note that only *external*
402 /// relocations are tracked here; references to labels within the buffer are
403 /// resolved before emission.
404 pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
405 /// Any trap records referring to this code.
406 pub(crate) traps: SmallVec<[MachTrap; 16]>,
407 /// Any call site records referring to this code.
408 pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
409 /// Any patchable call site locations referring to this code.
410 pub(crate) patchable_call_sites: SmallVec<[MachPatchableCallSite; 16]>,
411 /// Any exception-handler records referred to at call sites.
412 pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,
413 /// Any source location mappings referring to this code.
414 pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
415 /// Any debug tags referring to this code.
416 pub(crate) debug_tags: Vec<MachDebugTags>,
417 /// Pool of debug tags referenced by `MachDebugTags` entries.
418 pub(crate) debug_tag_pool: Vec<DebugTag>,
419 /// Any user stack maps for this code.
420 ///
421 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
422 /// by code offset, and each stack map covers `span` bytes on the stack.
423 pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
424 /// Stack frame layout metadata. If provided for a MachBuffer
425 /// containing a function body, this allows interpretation of
426 /// runtime state given a view of an active stack frame.
427 pub(crate) frame_layout: Option<MachBufferFrameLayout>,
428 /// Any unwind info at a given location.
429 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
430 /// The required alignment of this buffer.
431 pub alignment: u32,
432 /// The means by which to NOP out patchable call sites.
433 ///
434 /// This allows a consumer of a `MachBufferFinalized` to disable
435 /// patchable call sites (which are enabled by default) without
436 /// specific knowledge of the target ISA.
437 ///
438 /// Each entry is one form of nop, and these are required to be
439 /// sorted in ascending-size order.
440 pub nop_units: Vec<Vec<u8>>,
441}
442
443const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
444const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
445
446/// Threshold on max length of `labels_at_this_branch` list to avoid
447/// unbounded quadratic behavior (see comment below at use-site).
448const LABEL_LIST_THRESHOLD: usize = 100;
449
450/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
451/// the point at which it is used by emitted code; the buffer records "fixups"
452/// for references to the label, and will come back and patch the code
453/// appropriately when the label's location is eventually known.
454#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
455pub struct MachLabel(u32);
456entity_impl!(MachLabel);
457
458impl MachLabel {
459 /// Get a label for a block. (The first N MachLabels are always reserved for
460 /// the N blocks in the vcode.)
461 pub fn from_block(bindex: BlockIndex) -> MachLabel {
462 MachLabel(bindex.index() as u32)
463 }
464
465 /// Creates a string representing this label, for convenience.
466 pub fn to_string(&self) -> String {
467 format!("label{}", self.0)
468 }
469}
470
471impl Default for MachLabel {
472 fn default() -> Self {
473 UNKNOWN_LABEL
474 }
475}
476
477/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
478/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
479/// the [`OpenPatchRegion`] token in the process.
480pub struct OpenPatchRegion(usize);
481
482/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
483/// of where you might want to use this is for patching instructions that mention constants that
484/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
485/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
486/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
487/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
488/// bytes, and the constants uses can be updated directly.
489pub struct PatchRegion {
490 range: Range<usize>,
491}
492
493impl PatchRegion {
494 /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
495 pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
496 &mut buffer.data[self.range]
497 }
498}
499
500impl<I: VCodeInst> MachBuffer<I> {
501 /// Create a new section, known to start at `start_offset` and with a size limited to
502 /// `length_limit`.
503 pub fn new() -> MachBuffer<I> {
504 MachBuffer {
505 data: SmallVec::new(),
506 min_alignment: I::function_alignment().minimum,
507 relocs: SmallVec::new(),
508 traps: SmallVec::new(),
509 call_sites: SmallVec::new(),
510 patchable_call_sites: SmallVec::new(),
511 exception_handlers: SmallVec::new(),
512 srclocs: SmallVec::new(),
513 debug_tags: vec![],
514 debug_tag_pool: vec![],
515 user_stack_maps: SmallVec::new(),
516 unwind_info: SmallVec::new(),
517 cur_srcloc: None,
518 label_offsets: SmallVec::new(),
519 label_aliases: SmallVec::new(),
520 pending_constants: SmallVec::new(),
521 pending_constants_size: 0,
522 pending_traps: SmallVec::new(),
523 pending_fixup_records: SmallVec::new(),
524 pending_fixup_deadline: u32::MAX,
525 fixup_records: Default::default(),
526 latest_branches: SmallVec::new(),
527 labels_at_tail: SmallVec::new(),
528 labels_at_tail_off: 0,
529 constants: Default::default(),
530 used_constants: Default::default(),
531 open_patchable: false,
532 frame_layout: None,
533 }
534 }
535
536 /// Current offset from start of buffer.
537 pub fn cur_offset(&self) -> CodeOffset {
538 self.data.len() as CodeOffset
539 }
540
541 /// Add a byte.
542 pub fn put1(&mut self, value: u8) {
543 self.data.push(value);
544
545 // Post-invariant: conceptual-labels_at_tail contains a complete and
546 // precise list of labels bound at `cur_offset()`. We have advanced
547 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
548 // before, it is not anymore (and it cannot become equal, because
549 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
550 // conceptually empty (even though it is only lazily cleared). No labels
551 // can be bound at this new offset (by invariant on `label_offsets`).
552 // Hence the invariant holds.
553 }
554
555 /// Add 2 bytes.
556 pub fn put2(&mut self, value: u16) {
557 let bytes = value.to_le_bytes();
558 self.data.extend_from_slice(&bytes[..]);
559
560 // Post-invariant: as for `put1()`.
561 }
562
563 /// Add 4 bytes.
564 pub fn put4(&mut self, value: u32) {
565 let bytes = value.to_le_bytes();
566 self.data.extend_from_slice(&bytes[..]);
567
568 // Post-invariant: as for `put1()`.
569 }
570
571 /// Add 8 bytes.
572 pub fn put8(&mut self, value: u64) {
573 let bytes = value.to_le_bytes();
574 self.data.extend_from_slice(&bytes[..]);
575
576 // Post-invariant: as for `put1()`.
577 }
578
579 /// Add a slice of bytes.
580 pub fn put_data(&mut self, data: &[u8]) {
581 self.data.extend_from_slice(data);
582
583 // Post-invariant: as for `put1()`.
584 }
585
586 /// Reserve appended space and return a mutable slice referring to it.
587 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
588 let off = self.data.len();
589 let new_len = self.data.len() + len;
590 self.data.resize(new_len, 0);
591 &mut self.data[off..]
592
593 // Post-invariant: as for `put1()`.
594 }
595
596 /// Align up to the given alignment.
597 pub fn align_to(&mut self, align_to: CodeOffset) {
598 trace!("MachBuffer: align to {}", align_to);
599 assert!(
600 align_to.is_power_of_two(),
601 "{align_to} is not a power of two"
602 );
603 while self.cur_offset() & (align_to - 1) != 0 {
604 self.put1(0);
605 }
606
607 // Post-invariant: as for `put1()`.
608 }
609
610 /// Begin a region of patchable code. There is one requirement for the
611 /// code that is emitted: It must not introduce any instructions that
612 /// could be chomped (branches are an example of this). In other words,
613 /// you must not call [`MachBuffer::add_cond_branch`] or
614 /// [`MachBuffer::add_uncond_branch`] between calls to this method and
615 /// [`MachBuffer::end_patchable`].
616 pub fn start_patchable(&mut self) -> OpenPatchRegion {
617 assert!(!self.open_patchable, "Patchable regions may not be nested");
618 self.open_patchable = true;
619 OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
620 }
621
622 /// End a region of patchable code, yielding a [`PatchRegion`] value that
623 /// can be consumed later to produce a one-off mutable slice to the
624 /// associated region of the data buffer.
625 pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
626 // No need to assert the state of `open_patchable` here, as we take
627 // ownership of the only `OpenPatchable` value.
628 self.open_patchable = false;
629 let end = usize::try_from(self.cur_offset()).unwrap();
630 PatchRegion { range: open.0..end }
631 }
632
633 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
634 /// offset yet.
635 pub fn get_label(&mut self) -> MachLabel {
636 let l = self.label_offsets.len() as u32;
637 self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
638 self.label_aliases.push(UNKNOWN_LABEL);
639 trace!("MachBuffer: new label -> {:?}", MachLabel(l));
640 MachLabel(l)
641
642 // Post-invariant: the only mutation is to add a new label; it has no
643 // bound offset yet, so it trivially satisfies all invariants.
644 }
645
646 /// Reserve the first N MachLabels for blocks.
647 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
648 trace!("MachBuffer: first {} labels are for blocks", blocks);
649 debug_assert!(self.label_offsets.is_empty());
650 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
651 self.label_aliases.resize(blocks, UNKNOWN_LABEL);
652
653 // Post-invariant: as for `get_label()`.
654 }
655
656 /// Registers metadata in this `MachBuffer` about the `constants` provided.
657 ///
658 /// This will record the size/alignment of all constants which will prepare
659 /// them for emission later on.
660 pub fn register_constants(&mut self, constants: &VCodeConstants) {
661 for (c, val) in constants.iter() {
662 self.register_constant(&c, val);
663 }
664 }
665
666 /// Similar to [`MachBuffer::register_constants`] but registers a
667 /// single constant metadata. This function is useful in
668 /// situations where not all constants are known at the time of
669 /// emission.
670 pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
671 let c2 = self.constants.push(MachBufferConstant {
672 upcoming_label: None,
673 align: data.alignment(),
674 size: data.as_slice().len(),
675 });
676 assert_eq!(*constant, c2);
677 }
678
679 /// Completes constant emission by iterating over `self.used_constants` and
680 /// filling in the "holes" with the constant values provided by `constants`.
681 ///
682 /// Returns the alignment required for this entire buffer. Alignment starts
683 /// at the ISA's minimum function alignment and can be increased due to
684 /// constant requirements.
685 fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
686 let mut alignment = self.min_alignment;
687 for (constant, offset) in mem::take(&mut self.used_constants) {
688 let constant = constants.get(constant);
689 let data = constant.as_slice();
690 self.data[offset as usize..][..data.len()].copy_from_slice(data);
691 alignment = constant.alignment().max(alignment);
692 }
693 alignment
694 }
695
696 /// Returns a label that can be used to refer to the `constant` provided.
697 ///
698 /// This will automatically defer a new constant to be emitted for
699 /// `constant` if it has not been previously emitted. Note that this
700 /// function may return a different label for the same constant at
701 /// different points in time. The label is valid to use only from the
702 /// current location; the MachBuffer takes care to emit the same constant
703 /// multiple times if needed so the constant is always in range.
704 pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
705 let MachBufferConstant {
706 align,
707 size,
708 upcoming_label,
709 } = self.constants[constant];
710 if let Some(label) = upcoming_label {
711 return label;
712 }
713
714 let label = self.get_label();
715 trace!(
716 "defer constant: eventually emit {size} bytes aligned \
717 to {align} at label {label:?}",
718 );
719 self.pending_constants.push(constant);
720 self.pending_constants_size += size as u32;
721 self.constants[constant].upcoming_label = Some(label);
722 label
723 }
724
725 /// Bind a label to the current offset. A label can only be bound once.
726 pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
727 trace!(
728 "MachBuffer: bind label {:?} at offset {}",
729 label,
730 self.cur_offset()
731 );
732 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
733 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
734 let offset = self.cur_offset();
735 self.label_offsets[label.0 as usize] = offset;
736 self.lazily_clear_labels_at_tail();
737 self.labels_at_tail.push(label);
738
739 // Invariants hold: bound offset of label is <= cur_offset (in fact it
740 // is equal). If the `labels_at_tail` list was complete and precise
741 // before, it is still, because we have bound this label to the current
742 // offset and added it to the list (which contains all labels at the
743 // current offset).
744
745 self.optimize_branches(ctrl_plane);
746
747 // Post-invariant: by `optimize_branches()` (see argument there).
748 }
749
750 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
751 /// offset that it applies to.
752 fn lazily_clear_labels_at_tail(&mut self) {
753 let offset = self.cur_offset();
754 if offset > self.labels_at_tail_off {
755 self.labels_at_tail_off = offset;
756 self.labels_at_tail.clear();
757 }
758
759 // Post-invariant: either labels_at_tail_off was at cur_offset, and
760 // state is untouched, or was less than cur_offset, in which case the
761 // labels_at_tail list was conceptually empty, and is now actually
762 // empty.
763 }
764
765 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
766 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
767 let mut iters = 0;
768 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
769 label = self.label_aliases[label.0 as usize];
770 // To protect against an infinite loop (despite our assurances to
771 // ourselves that the invariants make this impossible), assert out
772 // after 1M iterations. The number of basic blocks is limited
773 // in most contexts anyway so this should be impossible to hit with
774 // a legitimate input.
775 iters += 1;
776 assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
777 }
778 self.label_offsets[label.0 as usize]
779
780 // Post-invariant: no mutations.
781 }
782
783 /// Emit a reference to the given label with the given reference type (i.e.,
784 /// branch-instruction format) at the current offset. This is like a
785 /// relocation, but handled internally.
786 ///
787 /// This can be called before the branch is actually emitted; fixups will
788 /// not happen until an island is emitted or the buffer is finished.
789 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
790 trace!(
791 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
792 offset, label, kind
793 );
794
795 // Add the fixup, and update the worst-case island size based on a
796 // veneer for this label use.
797 let fixup = MachLabelFixup {
798 label,
799 offset,
800 kind,
801 };
802 self.pending_fixup_deadline = self
803 .pending_fixup_deadline
804 // Subtract one alignment here to the deadline to account for
805 // extra space taken by aligning an island.
806 .min(fixup.deadline() - I::LabelUse::ALIGN);
807 trace!("pending_fixup_deadline = {}", self.pending_fixup_deadline);
808 self.pending_fixup_records.push(fixup);
809
810 // Post-invariant: no mutations to branches/labels data structures.
811 }
812
813 /// Inform the buffer of an unconditional branch at the given offset,
814 /// targeting the given label. May be used to optimize branches.
815 /// The last added label-use must correspond to this branch.
816 /// This must be called when the current offset is equal to `start`; i.e.,
817 /// before actually emitting the branch. This implies that for a branch that
818 /// uses a label and is eligible for optimizations by the MachBuffer, the
819 /// proper sequence is:
820 ///
821 /// - Call `use_label_at_offset()` to emit the fixup record.
822 /// - Call `add_uncond_branch()` to make note of the branch.
823 /// - Emit the bytes for the branch's machine code.
824 ///
825 /// Additional requirement: no labels may be bound between `start` and `end`
826 /// (exclusive on both ends).
827 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
828 debug_assert!(
829 !self.open_patchable,
830 "Branch instruction inserted within a patchable region"
831 );
832 assert!(self.cur_offset() == start);
833 debug_assert!(end > start);
834 assert!(!self.pending_fixup_records.is_empty());
835 let fixup = self.pending_fixup_records.len() - 1;
836 self.lazily_clear_labels_at_tail();
837 self.latest_branches.push(MachBranch {
838 start,
839 end,
840 target,
841 fixup,
842 inverted: None,
843 labels_at_this_branch: self.labels_at_tail.clone(),
844 });
845
846 // Post-invariant: we asserted branch start is current tail; the list of
847 // labels at branch is cloned from list of labels at current tail.
848 }
849
850 /// Inform the buffer of a conditional branch at the given offset,
851 /// targeting the given label. May be used to optimize branches.
852 /// The last added label-use must correspond to this branch.
853 ///
854 /// Additional requirement: no labels may be bound between `start` and `end`
855 /// (exclusive on both ends).
856 pub fn add_cond_branch(
857 &mut self,
858 start: CodeOffset,
859 end: CodeOffset,
860 target: MachLabel,
861 inverted: &[u8],
862 ) {
863 debug_assert!(
864 !self.open_patchable,
865 "Branch instruction inserted within a patchable region"
866 );
867 assert!(self.cur_offset() == start);
868 debug_assert!(end > start);
869 assert!(!self.pending_fixup_records.is_empty());
870 debug_assert!(
871 inverted.len() == (end - start) as usize,
872 "branch length = {}, but inverted length = {}",
873 end - start,
874 inverted.len()
875 );
876 let fixup = self.pending_fixup_records.len() - 1;
877 let inverted = Some(SmallVec::from(inverted));
878 self.lazily_clear_labels_at_tail();
879 self.latest_branches.push(MachBranch {
880 start,
881 end,
882 target,
883 fixup,
884 inverted,
885 labels_at_this_branch: self.labels_at_tail.clone(),
886 });
887
888 // Post-invariant: we asserted branch start is current tail; labels at
889 // branch list is cloned from list of labels at current tail.
890 }
891
892 fn truncate_last_branch(&mut self) {
893 debug_assert!(
894 !self.open_patchable,
895 "Branch instruction truncated within a patchable region"
896 );
897
898 self.lazily_clear_labels_at_tail();
899 // Invariants hold at this point.
900
901 let b = self.latest_branches.pop().unwrap();
902 assert!(b.end == self.cur_offset());
903
904 // State:
905 // [PRE CODE]
906 // Offset b.start, b.labels_at_this_branch:
907 // [BRANCH CODE]
908 // cur_off, self.labels_at_tail -->
909 // (end of buffer)
910 self.data.truncate(b.start as usize);
911 self.pending_fixup_records.truncate(b.fixup);
912
913 // Trim srclocs and debug tags now past the end of the buffer.
914 while let Some(last_srcloc) = self.srclocs.last_mut() {
915 if last_srcloc.end <= b.start {
916 break;
917 }
918 if last_srcloc.start < b.start {
919 last_srcloc.end = b.start;
920 break;
921 }
922 self.srclocs.pop();
923 }
924 while let Some(last_debug_tag) = self.debug_tags.last() {
925 if last_debug_tag.offset <= b.start {
926 break;
927 }
928 self.debug_tags.pop();
929 }
930
931 // State:
932 // [PRE CODE]
933 // cur_off, Offset b.start, b.labels_at_this_branch:
934 // (end of buffer)
935 //
936 // self.labels_at_tail --> (past end of buffer)
937 let cur_off = self.cur_offset();
938 self.labels_at_tail_off = cur_off;
939 // State:
940 // [PRE CODE]
941 // cur_off, Offset b.start, b.labels_at_this_branch,
942 // self.labels_at_tail:
943 // (end of buffer)
944 //
945 // resolve_label_offset(l) for l in labels_at_tail:
946 // (past end of buffer)
947
948 trace!(
949 "truncate_last_branch: truncated {:?}; off now {}",
950 b, cur_off
951 );
952
953 // Fix up resolved label offsets for labels at tail.
954 for &l in &self.labels_at_tail {
955 self.label_offsets[l.0 as usize] = cur_off;
956 }
957 // Old labels_at_this_branch are now at cur_off.
958 self.labels_at_tail.extend(b.labels_at_this_branch);
959
960 // Post-invariant: this operation is defined to truncate the buffer,
961 // which moves cur_off backward, and to move labels at the end of the
962 // buffer back to the start-of-branch offset.
963 //
964 // latest_branches satisfies all invariants:
965 // - it has no branches past the end of the buffer (branches are in
966 // order, we removed the last one, and we truncated the buffer to just
967 // before the start of that branch)
968 // - no labels were moved to lower offsets than the (new) cur_off, so
969 // the labels_at_this_branch list for any other branch need not change.
970 //
971 // labels_at_tail satisfies all invariants:
972 // - all labels that were at the tail after the truncated branch are
973 // moved backward to just before the branch, which becomes the new tail;
974 // thus every element in the list should remain (ensured by `.extend()`
975 // above).
976 // - all labels that refer to the new tail, which is the start-offset of
977 // the truncated branch, must be present. The `labels_at_this_branch`
978 // list in the truncated branch's record is a complete and precise list
979 // of exactly these labels; we append these to labels_at_tail.
980 // - labels_at_tail_off is at cur_off after truncation occurs, so the
981 // list is valid (not to be lazily cleared).
982 //
983 // The stated operation was performed:
984 // - For each label at the end of the buffer prior to this method, it
985 // now resolves to the new (truncated) end of the buffer: it must have
986 // been in `labels_at_tail` (this list is precise and complete, and
987 // the tail was at the end of the truncated branch on entry), and we
988 // iterate over this list and set `label_offsets` to the new tail.
989 // None of these labels could have been an alias (by invariant), so
990 // `label_offsets` is authoritative for each.
991 // - No other labels will be past the end of the buffer, because of the
992 // requirement that no labels be bound to the middle of branch ranges
993 // (see comments to `add_{cond,uncond}_branch()`).
994 // - The buffer is truncated to just before the last branch, and the
995 // fixup record referring to that last branch is removed.
996 }
997
998 /// Performs various optimizations on branches pointing at the current label.
999 pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
1000 if ctrl_plane.get_decision() {
1001 return;
1002 }
1003
1004 self.lazily_clear_labels_at_tail();
1005 // Invariants valid at this point.
1006
1007 trace!(
1008 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1009 self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1010 );
1011
1012 // We continue to munch on branches at the tail of the buffer until no
1013 // more rules apply. Note that the loop only continues if a branch is
1014 // actually truncated (or if labels are redirected away from a branch),
1015 // so this always makes progress.
1016 while let Some(b) = self.latest_branches.last() {
1017 let cur_off = self.cur_offset();
1018 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
1019 // If there has been any code emission since the end of the last branch or
1020 // label definition, then there's nothing we can edit (because we
1021 // don't move code once placed, only back up and overwrite), so
1022 // clear the records and finish.
1023 if b.end < cur_off {
1024 break;
1025 }
1026
1027 // If the "labels at this branch" list on this branch is
1028 // longer than a threshold, don't do any simplification,
1029 // and let the branch remain to separate those labels from
1030 // the current tail. This avoids quadratic behavior (see
1031 // #3468): otherwise, if a long string of "goto next;
1032 // next:" patterns are emitted, all of the labels will
1033 // coalesce into a long list of aliases for the current
1034 // buffer tail. We must track all aliases of the current
1035 // tail for correctness, but we are also allowed to skip
1036 // optimization (removal) of any branch, so we take the
1037 // escape hatch here and let it stand. In effect this
1038 // "spreads" the many thousands of labels in the
1039 // pathological case among an actual (harmless but
1040 // suboptimal) instruction once per N labels.
1041 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
1042 break;
1043 }
1044
1045 // Invariant: we are looking at a branch that ends at the tail of
1046 // the buffer.
1047
1048 // For any branch, conditional or unconditional:
1049 // - If the target is a label at the current offset, then remove
1050 // the conditional branch, and reset all labels that targeted
1051 // the current offset (end of branch) to the truncated
1052 // end-of-code.
1053 //
1054 // Preserves execution semantics: a branch to its own fallthrough
1055 // address is equivalent to a no-op; in both cases, nextPC is the
1056 // fallthrough.
1057 if self.resolve_label_offset(b.target) == cur_off {
1058 trace!("branch with target == cur off; truncating");
1059 self.truncate_last_branch();
1060 continue;
1061 }
1062
1063 // If latest is an unconditional branch:
1064 //
1065 // - If the branch's target is not its own start address, then for
1066 // each label at the start of branch, make the label an alias of the
1067 // branch target, and remove the label from the "labels at this
1068 // branch" list.
1069 //
1070 // - Preserves execution semantics: an unconditional branch's
1071 // only effect is to set PC to a new PC; this change simply
1072 // collapses one step in the step-semantics.
1073 //
1074 // - Post-invariant: the labels that were bound to the start of
1075 // this branch become aliases, so they must not be present in any
1076 // labels-at-this-branch list or the labels-at-tail list. The
1077 // labels are removed form the latest-branch record's
1078 // labels-at-this-branch list, and are never placed in the
1079 // labels-at-tail list. Furthermore, it is correct that they are
1080 // not in either list, because they are now aliases, and labels
1081 // that are aliases remain aliases forever.
1082 //
1083 // - If there is a prior unconditional branch that ends just before
1084 // this one begins, and this branch has no labels bound to its
1085 // start, then we can truncate this branch, because it is entirely
1086 // unreachable (we have redirected all labels that make it
1087 // reachable otherwise). Do so and continue around the loop.
1088 //
1089 // - Preserves execution semantics: the branch is unreachable,
1090 // because execution can only flow into an instruction from the
1091 // prior instruction's fallthrough or from a branch bound to that
1092 // instruction's start offset. Unconditional branches have no
1093 // fallthrough, so if the prior instruction is an unconditional
1094 // branch, no fallthrough entry can happen. The
1095 // labels-at-this-branch list is complete (by invariant), so if it
1096 // is empty, then the instruction is entirely unreachable. Thus,
1097 // it can be removed.
1098 //
1099 // - Post-invariant: ensured by truncate_last_branch().
1100 //
1101 // - If there is a prior conditional branch whose target label
1102 // resolves to the current offset (branches around the
1103 // unconditional branch), then remove the unconditional branch,
1104 // and make the target of the unconditional the target of the
1105 // conditional instead.
1106 //
1107 // - Preserves execution semantics: previously we had:
1108 //
1109 // L1:
1110 // cond_br L2
1111 // br L3
1112 // L2:
1113 // (end of buffer)
1114 //
1115 // by removing the last branch, we have:
1116 //
1117 // L1:
1118 // cond_br L2
1119 // L2:
1120 // (end of buffer)
1121 //
1122 // we then fix up the records for the conditional branch to
1123 // have:
1124 //
1125 // L1:
1126 // cond_br.inverted L3
1127 // L2:
1128 //
1129 // In the original code, control flow reaches L2 when the
1130 // conditional branch's predicate is true, and L3 otherwise. In
1131 // the optimized code, the same is true.
1132 //
1133 // - Post-invariant: all edits to latest_branches and
1134 // labels_at_tail are performed by `truncate_last_branch()`,
1135 // which maintains the invariants at each step.
1136
1137 if b.is_uncond() {
1138 // Set any label equal to current branch's start as an alias of
1139 // the branch's target, if the target is not the branch itself
1140 // (i.e., an infinite loop).
1141 //
1142 // We cannot perform this aliasing if the target of this branch
1143 // ultimately aliases back here; if so, we need to keep this
1144 // branch, so break out of this loop entirely (and clear the
1145 // latest-branches list below).
1146 //
1147 // Note that this check is what prevents cycles from forming in
1148 // `self.label_aliases`. To see why, consider an arbitrary start
1149 // state:
1150 //
1151 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1152 // Ln, which is not aliased.
1153 //
1154 // We would create a cycle if we assigned label_aliases[Ln]
1155 // = L1. Note that the below assignment is the only write
1156 // to label_aliases.
1157 //
1158 // By our other invariants, we have that Ln (`l` below)
1159 // resolves to the offset `b.start`, because it is in the
1160 // set `b.labels_at_this_branch`.
1161 //
1162 // If L1 were already aliased, through some arbitrarily deep
1163 // chain, to Ln, then it must also resolve to this offset
1164 // `b.start`.
1165 //
1166 // By checking the resolution of `L1` against this offset,
1167 // and aborting this branch-simplification if they are
1168 // equal, we prevent the below assignment from ever creating
1169 // a cycle.
1170 if self.resolve_label_offset(b.target) != b.start {
1171 let redirected = b.labels_at_this_branch.len();
1172 for &l in &b.labels_at_this_branch {
1173 trace!(
1174 " -> label at start of branch {:?} redirected to target {:?}",
1175 l, b.target
1176 );
1177 self.label_aliases[l.0 as usize] = b.target;
1178 // NOTE: we continue to ensure the invariant that labels
1179 // pointing to tail of buffer are in `labels_at_tail`
1180 // because we already ensured above that the last branch
1181 // cannot have a target of `cur_off`; so we never have
1182 // to put the label into `labels_at_tail` when moving it
1183 // here.
1184 }
1185 // Maintain invariant: all branches have been redirected
1186 // and are no longer pointing at the start of this branch.
1187 let mut_b = self.latest_branches.last_mut().unwrap();
1188 mut_b.labels_at_this_branch.clear();
1189
1190 if redirected > 0 {
1191 trace!(" -> after label redirects, restarting loop");
1192 continue;
1193 }
1194 } else {
1195 break;
1196 }
1197
1198 let b = self.latest_branches.last().unwrap();
1199
1200 // Examine any immediately preceding branch.
1201 if self.latest_branches.len() > 1 {
1202 let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1203 trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1204 // This uncond is immediately after another uncond; we
1205 // should have already redirected labels to this uncond away
1206 // (but check to be sure); so we can truncate this uncond.
1207 if prev_b.is_uncond()
1208 && prev_b.end == b.start
1209 && b.labels_at_this_branch.is_empty()
1210 {
1211 trace!(" -> uncond follows another uncond; truncating");
1212 self.truncate_last_branch();
1213 continue;
1214 }
1215
1216 // This uncond is immediately after a conditional, and the
1217 // conditional's target is the end of this uncond, and we've
1218 // already redirected labels to this uncond away; so we can
1219 // truncate this uncond, flip the sense of the conditional, and
1220 // set the conditional's target (in `latest_branches` and in
1221 // `fixup_records`) to the uncond's target.
1222 if prev_b.is_cond()
1223 && prev_b.end == b.start
1224 && self.resolve_label_offset(prev_b.target) == cur_off
1225 {
1226 trace!(
1227 " -> uncond follows a conditional, and conditional's target resolves to current offset"
1228 );
1229 // Save the target of the uncond (this becomes the
1230 // target of the cond), and truncate the uncond.
1231 let target = b.target;
1232 let data = prev_b.inverted.clone().unwrap();
1233 self.truncate_last_branch();
1234
1235 // Mutate the code and cond branch.
1236 let off_before_edit = self.cur_offset();
1237 let prev_b = self.latest_branches.last_mut().unwrap();
1238 let not_inverted = SmallVec::from(
1239 &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1240 );
1241
1242 // Low-level edit: replaces bytes of branch with
1243 // inverted form. cur_off remains the same afterward, so
1244 // we do not need to modify label data structures.
1245 self.data.truncate(prev_b.start as usize);
1246 self.data.extend_from_slice(&data[..]);
1247
1248 // Save the original code as the inversion of the
1249 // inverted branch, in case we later edit this branch
1250 // again.
1251 prev_b.inverted = Some(not_inverted);
1252 self.pending_fixup_records[prev_b.fixup].label = target;
1253 trace!(" -> reassigning target of condbr to {:?}", target);
1254 prev_b.target = target;
1255 debug_assert_eq!(off_before_edit, self.cur_offset());
1256 continue;
1257 }
1258 }
1259 }
1260
1261 // If we couldn't do anything with the last branch, then break.
1262 break;
1263 }
1264
1265 self.purge_latest_branches();
1266
1267 trace!(
1268 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1269 self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1270 );
1271 }
1272
1273 fn purge_latest_branches(&mut self) {
1274 // All of our branch simplification rules work only if a branch ends at
1275 // the tail of the buffer, with no following code; and branches are in
1276 // order in latest_branches; so if the last entry ends prior to
1277 // cur_offset, then clear all entries.
1278 let cur_off = self.cur_offset();
1279 if let Some(l) = self.latest_branches.last() {
1280 if l.end < cur_off {
1281 trace!("purge_latest_branches: removing branch {:?}", l);
1282 self.latest_branches.clear();
1283 }
1284 }
1285
1286 // Post-invariant: no invariant requires any branch to appear in
1287 // `latest_branches`; it is always optional. The list-clear above thus
1288 // preserves all semantics.
1289 }
1290
1291 /// Emit a trap at some point in the future with the specified code and
1292 /// stack map.
1293 ///
1294 /// This function returns a [`MachLabel`] which will be the future address
1295 /// of the trap. Jumps should refer to this label, likely by using the
1296 /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1297 /// patched in once the address of the trap is known.
1298 ///
1299 /// This will batch all traps into the end of the function.
1300 pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1301 let label = self.get_label();
1302 self.pending_traps.push(MachLabelTrap {
1303 label,
1304 code,
1305 loc: self.cur_srcloc.map(|(_start, loc)| loc),
1306 });
1307 label
1308 }
1309
1310 /// Is an island needed within the next N bytes?
1311 pub fn island_needed(&self, distance: CodeOffset) -> bool {
1312 let deadline = match self.fixup_records.peek() {
1313 Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1314 None => self.pending_fixup_deadline,
1315 };
1316 trace!(
1317 "checking island_needed: cur_offset = {} deadline = {} worst_case_end_of_island = {}",
1318 self.cur_offset(),
1319 deadline,
1320 self.worst_case_end_of_island(distance)
1321 );
1322 let needed = deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline;
1323 trace!(" -> needed = {needed}");
1324 needed
1325 }
1326
1327 /// Returns the maximal offset that islands can reach if `distance` more
1328 /// bytes are appended.
1329 ///
1330 /// This is used to determine if veneers need insertions since jumps that
1331 /// can't reach past this point must get a veneer of some form.
1332 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1333 // Assume that all fixups will require veneers and that the veneers are
1334 // the worst-case size for each platform. This is an over-generalization
1335 // to avoid iterating over the `fixup_records` list or maintaining
1336 // information about it as we go along.
1337 let max_veneer_count =
1338 u32::try_from(self.fixup_records.len() + self.pending_fixup_records.len()).unwrap();
1339 let island_worst_case_size = max_veneer_count
1340 .saturating_mul(I::LabelUse::worst_case_veneer_size())
1341 + self.pending_constants_size
1342 + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1343 self.cur_offset()
1344 .saturating_add(distance)
1345 .saturating_add(island_worst_case_size)
1346 }
1347
1348 /// Emit all pending constants and required pending veneers.
1349 ///
1350 /// Should only be called if `island_needed()` returns true, i.e., if we
1351 /// actually reach a deadline. It's not necessarily a problem to do so
1352 /// otherwise but it may result in unnecessary work during emission.
1353 ///
1354 /// The current code-emission position must be a "safe" location for an
1355 /// island: i.e., no fallthrough into the island contents from the
1356 /// previous instruction is possible. Callers emitting inside a basic
1357 /// block should emit a jump-around branch.
1358 pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1359 self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1360 }
1361
1362 /// Same as `emit_island`, but an internal API with a `force_veneers`
1363 /// argument to force all veneers to always get emitted for debugging.
1364 fn emit_island_maybe_forced(
1365 &mut self,
1366 force_veneers: ForceVeneers,
1367 distance: CodeOffset,
1368 ctrl_plane: &mut ControlPlane,
1369 ) {
1370 trace!(
1371 "emitting island at {}, distance = {distance}",
1372 self.cur_offset()
1373 );
1374
1375 // We're going to purge fixups, so no latest-branch editing can happen
1376 // anymore.
1377 self.latest_branches.clear();
1378
1379 // End the current location tracking since anything emitted during this
1380 // function shouldn't be attributed to whatever the current source
1381 // location is.
1382 //
1383 // Note that the current source location, if it's set right now, will be
1384 // restored at the end of this island emission.
1385 let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1386 if cur_loc.is_some() {
1387 self.end_srcloc();
1388 }
1389
1390 let forced_threshold = self.worst_case_end_of_island(distance);
1391 trace!("forced_threshold = {forced_threshold}");
1392
1393 // Emit traps/constants after the island: with potentially
1394 // unbounded pending constants/traps and potentially small
1395 // deadlines, it would otherwise be possible to emit a
1396 // small-range jump, have a nearby deadline *before* the end
1397 // of pending constants/traps, and not be able to emit a
1398 // veneer in time.
1399 //
1400 // Fixups whose labels aren't yet defined (e.g. references to
1401 // pending constants/traps) are simply deferred here; they'll
1402 // be resolved in the next island or in the final fixup pass
1403 // at the end of emission.
1404
1405 // Either handle all pending fixups because they're ready or move them
1406 // onto the `BinaryHeap` tracking all pending fixups if they aren't
1407 // ready.
1408 assert!(self.latest_branches.is_empty());
1409 trace!(
1410 "About to handle fixups at offset {}: {:?}",
1411 self.cur_offset(),
1412 self.pending_fixup_records
1413 );
1414 for fixup in mem::take(&mut self.pending_fixup_records) {
1415 if self.should_apply_fixup(&fixup, forced_threshold) {
1416 self.handle_fixup(fixup, force_veneers, forced_threshold);
1417 } else {
1418 self.fixup_records.push(fixup);
1419 }
1420 }
1421 self.pending_fixup_deadline = u32::MAX;
1422 while let Some(fixup) = self.fixup_records.peek() {
1423 trace!(
1424 "emit_island: fixup {:?} deadline {}",
1425 fixup,
1426 fixup.deadline()
1427 );
1428
1429 // If this fixup shouldn't be applied, that means its label isn't
1430 // defined yet and there'll be remaining space to apply a veneer if
1431 // necessary in the future after this island. In that situation
1432 // because `fixup_records` is sorted by deadline this loop can
1433 // exit.
1434 if !self.should_apply_fixup(fixup, forced_threshold) {
1435 break;
1436 }
1437
1438 let fixup = self.fixup_records.pop().unwrap();
1439 self.handle_fixup(fixup, force_veneers, forced_threshold);
1440 }
1441
1442 // Now emit pending traps and constants.
1443 //
1444 // Note that traps are placed first since this typically happens at the
1445 // end of the function and for disassemblers we try to keep all the code
1446 // contiguously together.
1447 trace!("emitting pending traps: {:?}", self.pending_traps);
1448 for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1449 // If this trap has source information associated with it then
1450 // emit this information for the trap instruction going out now too.
1451 if let Some(loc) = loc {
1452 self.start_srcloc(loc);
1453 }
1454 self.align_to(I::LabelUse::ALIGN);
1455 self.bind_label(label, ctrl_plane);
1456 self.add_trap(code);
1457 self.put_data(I::TRAP_OPCODE);
1458 if loc.is_some() {
1459 self.end_srcloc();
1460 }
1461 }
1462
1463 trace!("emitting pending constants: {:?}", self.pending_constants);
1464 for constant in mem::take(&mut self.pending_constants) {
1465 let MachBufferConstant { align, size, .. } = self.constants[constant];
1466 let label = self.constants[constant].upcoming_label.take().unwrap();
1467 self.align_to(align);
1468 self.bind_label(label, ctrl_plane);
1469 self.used_constants.push((constant, self.cur_offset()));
1470 self.get_appended_space(size);
1471 }
1472
1473 if let Some(loc) = cur_loc {
1474 self.start_srcloc(loc);
1475 }
1476 }
1477
1478 fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1479 let label_offset = self.resolve_label_offset(fixup.label);
1480 trace!(
1481 "should_apply_fixup: fixup {fixup:?} label_offset {label_offset} deadline {} forced_threshold {forced_threshold} supports_veneer {}",
1482 fixup.deadline(),
1483 fixup.kind.supports_veneer()
1484 );
1485 let result = (label_offset != UNKNOWN_LABEL_OFFSET)
1486 || ((fixup.deadline() < forced_threshold) && fixup.kind.supports_veneer());
1487 trace!(
1488 " -> {}, {}, {} -> {result}",
1489 label_offset != UNKNOWN_LABEL_OFFSET,
1490 fixup.deadline() < forced_threshold,
1491 fixup.kind.supports_veneer()
1492 );
1493 result
1494 }
1495
1496 fn handle_fixup(
1497 &mut self,
1498 fixup: MachLabelFixup<I>,
1499 force_veneers: ForceVeneers,
1500 forced_threshold: CodeOffset,
1501 ) {
1502 let MachLabelFixup {
1503 label,
1504 offset,
1505 kind,
1506 } = fixup;
1507 let start = offset as usize;
1508 let end = (offset + kind.patch_size()) as usize;
1509 let label_offset = self.resolve_label_offset(label);
1510
1511 if label_offset != UNKNOWN_LABEL_OFFSET {
1512 // If the offset of the label for this fixup is known then
1513 // we're going to do something here-and-now. We're either going
1514 // to patch the original offset because it's an in-bounds jump,
1515 // or we're going to generate a veneer, patch the fixup to jump
1516 // to the veneer, and then keep going.
1517 //
1518 // If the label comes after the original fixup, then we should
1519 // be guaranteed that the jump is in-bounds. Otherwise there's
1520 // a bug somewhere because this method wasn't called soon
1521 // enough. All forward-jumps are tracked and should get veneers
1522 // before their deadline comes and they're unable to jump
1523 // further.
1524 //
1525 // Otherwise if the label is before the fixup, then that's a
1526 // backwards jump. If it's past the maximum negative range
1527 // then we'll emit a veneer that to jump forward to which can
1528 // then jump backwards.
1529 let veneer_required = if label_offset >= offset {
1530 assert!((label_offset - offset) <= kind.max_pos_range());
1531 false
1532 } else {
1533 (offset - label_offset) > kind.max_neg_range()
1534 };
1535 trace!(
1536 " -> label_offset = {}, known, required = {} (pos {} neg {})",
1537 label_offset,
1538 veneer_required,
1539 kind.max_pos_range(),
1540 kind.max_neg_range()
1541 );
1542
1543 if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1544 self.emit_veneer(label, offset, kind);
1545 } else {
1546 let slice = &mut self.data[start..end];
1547 trace!(
1548 "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
1549 );
1550 kind.patch(slice, offset, label_offset);
1551 }
1552 } else {
1553 // If the offset of this label is not known at this time then
1554 // that means that a veneer is required because after this
1555 // island the target can't be in range of the original target.
1556 assert!(forced_threshold - offset > kind.max_pos_range());
1557 self.emit_veneer(label, offset, kind);
1558 }
1559 }
1560
1561 /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1562 ///
1563 /// This will generate extra machine code, using `kind`, to get a
1564 /// larger-jump-kind than `kind` allows. The code at `offset` is then
1565 /// patched to jump to our new code, and then the new code is enqueued for
1566 /// a fixup to get processed at some later time.
1567 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1568 // If this `kind` doesn't support a veneer then that's a bug in the
1569 // backend because we need to implement support for such a veneer.
1570 assert!(
1571 kind.supports_veneer(),
1572 "jump beyond the range of {kind:?} but a veneer isn't supported",
1573 );
1574
1575 // Allocate space for a veneer in the island.
1576 self.align_to(I::LabelUse::ALIGN);
1577 let veneer_offset = self.cur_offset();
1578 trace!("making a veneer at {}", veneer_offset);
1579 let start = offset as usize;
1580 let end = (offset + kind.patch_size()) as usize;
1581 let slice = &mut self.data[start..end];
1582 // Patch the original label use to refer to the veneer.
1583 trace!(
1584 "patching original at offset {} to veneer offset {}",
1585 offset, veneer_offset
1586 );
1587 kind.patch(slice, offset, veneer_offset);
1588 // Generate the veneer.
1589 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1590 let (veneer_fixup_off, veneer_label_use) =
1591 kind.generate_veneer(veneer_slice, veneer_offset);
1592 trace!(
1593 "generated veneer; fixup offset {}, label_use {:?}",
1594 veneer_fixup_off, veneer_label_use
1595 );
1596 // Register a new use of `label` with our new veneer fixup and
1597 // offset. This'll recalculate deadlines accordingly and
1598 // enqueue this fixup to get processed at some later
1599 // time.
1600 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1601 }
1602
1603 fn finish_emission_maybe_forcing_veneers(
1604 &mut self,
1605 force_veneers: ForceVeneers,
1606 ctrl_plane: &mut ControlPlane,
1607 ) {
1608 while !self.pending_constants.is_empty()
1609 || !self.pending_traps.is_empty()
1610 || !self.fixup_records.is_empty()
1611 || !self.pending_fixup_records.is_empty()
1612 {
1613 // `emit_island()` will emit any pending veneers and constants, and
1614 // as a side-effect, will also take care of any fixups with resolved
1615 // labels eagerly.
1616 self.emit_island_maybe_forced(force_veneers, 0, ctrl_plane);
1617 }
1618
1619 // Ensure that all labels have been fixed up after the last island is emitted. This is a
1620 // full (release-mode) assert because an unresolved label means the emitted code is
1621 // incorrect.
1622 assert!(self.fixup_records.is_empty());
1623 assert!(self.pending_fixup_records.is_empty());
1624 }
1625
1626 /// Finish any deferred emissions and/or fixups.
1627 pub fn finish(
1628 mut self,
1629 constants: &VCodeConstants,
1630 ctrl_plane: &mut ControlPlane,
1631 ) -> MachBufferFinalized<Stencil> {
1632 let _tt = timing::vcode_emit_finish();
1633
1634 self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1635
1636 let alignment = self.finish_constants(constants);
1637
1638 // Resolve all labels to their offsets.
1639 let finalized_relocs = self
1640 .relocs
1641 .iter()
1642 .map(|reloc| FinalizedMachReloc {
1643 offset: reloc.offset,
1644 kind: reloc.kind,
1645 addend: reloc.addend,
1646 target: match &reloc.target {
1647 RelocTarget::ExternalName(name) => {
1648 FinalizedRelocTarget::ExternalName(name.clone())
1649 }
1650 RelocTarget::Label(label) => {
1651 FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1652 }
1653 },
1654 })
1655 .collect();
1656
1657 let finalized_exception_handlers = self
1658 .exception_handlers
1659 .iter()
1660 .map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))
1661 .collect();
1662
1663 let mut srclocs = self.srclocs;
1664 srclocs.sort_by_key(|entry| entry.start);
1665
1666 MachBufferFinalized {
1667 data: self.data,
1668 relocs: finalized_relocs,
1669 traps: self.traps,
1670 call_sites: self.call_sites,
1671 patchable_call_sites: self.patchable_call_sites,
1672 exception_handlers: finalized_exception_handlers,
1673 srclocs,
1674 debug_tags: self.debug_tags,
1675 debug_tag_pool: self.debug_tag_pool,
1676 user_stack_maps: self.user_stack_maps,
1677 unwind_info: self.unwind_info,
1678 alignment,
1679 frame_layout: self.frame_layout,
1680 nop_units: I::gen_nop_units(),
1681 }
1682 }
1683
1684 /// Add an external relocation at the given offset.
1685 pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1686 &mut self,
1687 offset: CodeOffset,
1688 kind: Reloc,
1689 target: &T,
1690 addend: Addend,
1691 ) {
1692 let target: RelocTarget = target.clone().into();
1693 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1694 // generate a label-use statement to track whether an island is possibly
1695 // needed to escape this function to actually get to the external name.
1696 // This is most likely to come up on AArch64 where calls between
1697 // functions use a 26-bit signed offset which gives +/- 64MB. This means
1698 // that if a function is 128MB in size and there's a call in the middle
1699 // it's impossible to reach the actual target. Also, while it's
1700 // technically possible to jump to the start of a function and then jump
1701 // further, island insertion below always inserts islands after
1702 // previously appended code so for Cranelift's own implementation this
1703 // is also a problem for 64MB functions on AArch64 which start with a
1704 // call instruction, those won't be able to escape.
1705 //
1706 // Ideally what needs to happen here is that a `LabelUse` is
1707 // transparently generated (or call-sites of this function are audited
1708 // to generate a `LabelUse` instead) and tracked internally. The actual
1709 // relocation would then change over time if and when a veneer is
1710 // inserted, where the relocation here would be patched by this
1711 // `MachBuffer` to jump to the veneer. The problem, though, is that all
1712 // this still needs to end up, in the case of a singular function,
1713 // generating a final relocation pointing either to this particular
1714 // relocation or to the veneer inserted. Additionally
1715 // `MachBuffer` needs the concept of a label which will never be
1716 // resolved, so `emit_island` doesn't trip over not actually ever
1717 // knowing what some labels are. Currently the loop in
1718 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1719 // loop.
1720 //
1721 // For now this means that because relocs aren't tracked at all that
1722 // AArch64 functions have a rough size limits of 64MB. For now that's
1723 // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1724 // when a relocation can't otherwise be resolved later, so it shouldn't
1725 // actually result in any memory unsafety or anything like that.
1726 self.relocs.push(MachReloc {
1727 offset,
1728 kind,
1729 target,
1730 addend,
1731 });
1732 }
1733
1734 /// Add an external relocation at the current offset.
1735 pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1736 &mut self,
1737 kind: Reloc,
1738 target: &T,
1739 addend: Addend,
1740 ) {
1741 self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1742 }
1743
1744 /// Add a trap record at the current offset.
1745 pub fn add_trap(&mut self, code: TrapCode) {
1746 self.traps.push(MachTrap {
1747 offset: self.data.len() as CodeOffset,
1748 code,
1749 });
1750 }
1751
1752 /// Add a call-site record at the current offset.
1753 pub fn add_call_site(&mut self) {
1754 self.add_try_call_site(None, core::iter::empty());
1755 }
1756
1757 /// Add a call-site record at the current offset with exception
1758 /// handlers.
1759 pub fn add_try_call_site(
1760 &mut self,
1761 frame_offset: Option<u32>,
1762 exception_handlers: impl Iterator<Item = MachExceptionHandler>,
1763 ) {
1764 let start = u32::try_from(self.exception_handlers.len()).unwrap();
1765 self.exception_handlers.extend(exception_handlers);
1766 let end = u32::try_from(self.exception_handlers.len()).unwrap();
1767 let exception_handler_range = start..end;
1768
1769 self.call_sites.push(MachCallSite {
1770 ret_addr: self.data.len() as CodeOffset,
1771 frame_offset,
1772 exception_handler_range,
1773 });
1774 }
1775
1776 /// Add a patchable call record at the current offset The actual
1777 /// call is expected to have been emitted; the VCodeInst trait
1778 /// specifies how to NOP it out, and we carry that information to
1779 /// the finalized Machbuffer.
1780 pub fn add_patchable_call_site(&mut self, len: u32) {
1781 self.patchable_call_sites.push(MachPatchableCallSite {
1782 ret_addr: self.cur_offset(),
1783 len,
1784 });
1785 }
1786
1787 /// Add an unwind record at the current offset.
1788 pub fn add_unwind(&mut self, unwind: UnwindInst) {
1789 self.unwind_info.push((self.cur_offset(), unwind));
1790 }
1791
1792 /// Set the `SourceLoc` for code from this offset until the offset at the
1793 /// next call to `end_srcloc()`.
1794 /// Returns the current [CodeOffset] and [RelSourceLoc].
1795 pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1796 let cur = (self.cur_offset(), loc);
1797 self.cur_srcloc = Some(cur);
1798 cur
1799 }
1800
1801 /// Mark the end of the `SourceLoc` segment started at the last
1802 /// `start_srcloc()` call.
1803 pub fn end_srcloc(&mut self) {
1804 let (start, loc) = self
1805 .cur_srcloc
1806 .take()
1807 .expect("end_srcloc() called without start_srcloc()");
1808 let end = self.cur_offset();
1809 // Skip zero-length extends.
1810 debug_assert!(end >= start);
1811 if end > start {
1812 self.srclocs.push(MachSrcLoc { start, end, loc });
1813 }
1814 }
1815
1816 /// Push a user stack map onto this buffer.
1817 ///
1818 /// The stack map is associated with the given `return_addr` code
1819 /// offset. This must be the PC for the instruction just *after* this stack
1820 /// map's associated instruction. For example in the sequence `call $foo;
1821 /// add r8, rax`, the `return_addr` must be the offset of the start of the
1822 /// `add` instruction.
1823 ///
1824 /// Stack maps must be pushed in sorted `return_addr` order.
1825 pub fn push_user_stack_map(
1826 &mut self,
1827 emit_state: &I::State,
1828 return_addr: CodeOffset,
1829 mut stack_map: ir::UserStackMap,
1830 ) {
1831 let span = emit_state.frame_layout().active_size();
1832 trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1833
1834 debug_assert!(
1835 self.user_stack_maps
1836 .last()
1837 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1838 "pushed stack maps out of order: {} is not less than {}",
1839 self.user_stack_maps.last().unwrap().0,
1840 return_addr,
1841 );
1842
1843 stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1844 self.user_stack_maps.push((return_addr, span, stack_map));
1845 }
1846
1847 /// Push a debug tag associated with the current buffer offset.
1848 pub fn push_debug_tags(&mut self, pos: MachDebugTagPos, tags: &[DebugTag]) {
1849 trace!("debug tags at offset {}: {tags:?}", self.cur_offset());
1850 let start = u32::try_from(self.debug_tag_pool.len()).unwrap();
1851 self.debug_tag_pool.extend(tags.iter().cloned());
1852 let end = u32::try_from(self.debug_tag_pool.len()).unwrap();
1853 self.debug_tags.push(MachDebugTags {
1854 offset: self.cur_offset(),
1855 pos,
1856 range: start..end,
1857 });
1858 }
1859
1860 /// Increase the alignment of the buffer to the given alignment if bigger
1861 /// than the current alignment.
1862 pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1863 self.min_alignment = self.min_alignment.max(
1864 1u32.checked_shl(u32::from(align_to))
1865 .expect("log2_min_function_alignment too large"),
1866 );
1867 }
1868
1869 /// Set the frame layout metadata.
1870 pub fn set_frame_layout(&mut self, frame_layout: MachBufferFrameLayout) {
1871 debug_assert!(self.frame_layout.is_none());
1872 self.frame_layout = Some(frame_layout);
1873 }
1874}
1875
1876impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1877 fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1878 for b in iter {
1879 self.put1(b);
1880 }
1881 }
1882}
1883
1884impl<T: CompilePhase> MachBufferFinalized<T> {
1885 /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1886 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1887 &self.srclocs[..]
1888 }
1889
1890 /// Get all debug tags, sorted by associated offset.
1891 pub fn debug_tags(&self) -> impl Iterator<Item = MachBufferDebugTagList<'_>> {
1892 self.debug_tags.iter().map(|tags| {
1893 let start = usize::try_from(tags.range.start).unwrap();
1894 let end = usize::try_from(tags.range.end).unwrap();
1895 MachBufferDebugTagList {
1896 offset: tags.offset,
1897 pos: tags.pos,
1898 tags: &self.debug_tag_pool[start..end],
1899 }
1900 })
1901 }
1902
1903 /// Get the total required size for the code.
1904 pub fn total_size(&self) -> CodeOffset {
1905 self.data.len() as CodeOffset
1906 }
1907
1908 /// Return the code in this mach buffer as a hex string for testing purposes.
1909 pub fn stringify_code_bytes(&self) -> String {
1910 // This is pretty lame, but whatever ..
1911 use core::fmt::Write;
1912 let mut s = String::with_capacity(self.data.len() * 2);
1913 for b in &self.data {
1914 write!(&mut s, "{b:02X}").unwrap();
1915 }
1916 s
1917 }
1918
1919 /// Get the code bytes.
1920 pub fn data(&self) -> &[u8] {
1921 // N.B.: we emit every section into the .text section as far as
1922 // the `CodeSink` is concerned; we do not bother to segregate
1923 // the contents into the actual program text, the jumptable and the
1924 // rodata (constant pool). This allows us to generate code assuming
1925 // that these will not be relocated relative to each other, and avoids
1926 // having to designate each section as belonging in one of the three
1927 // fixed categories defined by `CodeSink`. If this becomes a problem
1928 // later (e.g. because of memory permissions or similar), we can
1929 // add this designation and segregate the output; take care, however,
1930 // to add the appropriate relocations in this case.
1931
1932 &self.data[..]
1933 }
1934
1935 /// Get a mutable slice of the code bytes, allowing patching
1936 /// post-passes.
1937 pub fn data_mut(&mut self) -> &mut [u8] {
1938 &mut self.data[..]
1939 }
1940
1941 /// Get the list of external relocations for this code.
1942 pub fn relocs(&self) -> &[FinalizedMachReloc] {
1943 &self.relocs[..]
1944 }
1945
1946 /// Get the list of trap records for this code.
1947 pub fn traps(&self) -> &[MachTrap] {
1948 &self.traps[..]
1949 }
1950
1951 /// Get the user stack map metadata for this code.
1952 pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1953 &self.user_stack_maps
1954 }
1955
1956 /// Take this buffer's user stack map metadata.
1957 pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1958 mem::take(&mut self.user_stack_maps)
1959 }
1960
1961 /// Get the list of call sites for this code, along with
1962 /// associated exception handlers.
1963 ///
1964 /// Each item yielded by the returned iterator is a struct with:
1965 ///
1966 /// - The call site metadata record, with a `ret_addr` field
1967 /// directly accessible and denoting the offset of the return
1968 /// address into this buffer's code.
1969 /// - The slice of pairs of exception tags and code offsets
1970 /// denoting exception-handler entry points associated with this
1971 /// call site.
1972 pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1973 self.call_sites.iter().map(|call_site| {
1974 let handler_range = call_site.exception_handler_range.clone();
1975 let handler_range = usize::try_from(handler_range.start).unwrap()
1976 ..usize::try_from(handler_range.end).unwrap();
1977 FinalizedMachCallSite {
1978 ret_addr: call_site.ret_addr,
1979 frame_offset: call_site.frame_offset,
1980 exception_handlers: &self.exception_handlers[handler_range],
1981 }
1982 })
1983 }
1984
1985 /// Get the frame layout, if known.
1986 pub fn frame_layout(&self) -> Option<&MachBufferFrameLayout> {
1987 self.frame_layout.as_ref()
1988 }
1989
1990 /// Get the list of patchable call sites for this code.
1991 ///
1992 /// Each location in the buffer contains the bytes for a call
1993 /// instruction to the specified target. If the call is to be
1994 /// patched out, the bytes in the region should be replaced with
1995 /// those given in the `MachBufferFinalized::nop` array, repeated
1996 /// as many times as necessary. (The length of the patchable
1997 /// region is guaranteed to be an integer multiple of that NOP
1998 /// unit size.)
1999 pub fn patchable_call_sites(&self) -> impl Iterator<Item = &MachPatchableCallSite> + '_ {
2000 self.patchable_call_sites.iter()
2001 }
2002}
2003
2004/// An item in the exception-handler list for a callsite, with label
2005/// references. Items are interpreted in left-to-right order and the
2006/// first match wins.
2007#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2008pub enum MachExceptionHandler {
2009 /// A specific tag (in the current dynamic context) should be
2010 /// handled by the code at the given offset.
2011 Tag(ExceptionTag, MachLabel),
2012 /// All exceptions should be handled by the code at the given
2013 /// offset.
2014 Default(MachLabel),
2015 /// The dynamic context for interpreting tags is updated to the
2016 /// value stored in the given machine location (in this frame's
2017 /// context).
2018 Context(ExceptionContextLoc),
2019}
2020
2021impl MachExceptionHandler {
2022 fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {
2023 match self {
2024 Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),
2025 Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),
2026 Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),
2027 }
2028 }
2029}
2030
2031/// An item in the exception-handler list for a callsite, with final
2032/// (lowered) code offsets. Items are interpreted in left-to-right
2033/// order and the first match wins.
2034#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2035#[cfg_attr(
2036 feature = "enable-serde",
2037 derive(serde_derive::Serialize, serde_derive::Deserialize)
2038)]
2039pub enum FinalizedMachExceptionHandler {
2040 /// A specific tag (in the current dynamic context) should be
2041 /// handled by the code at the given offset.
2042 Tag(ExceptionTag, CodeOffset),
2043 /// All exceptions should be handled by the code at the given
2044 /// offset.
2045 Default(CodeOffset),
2046 /// The dynamic context for interpreting tags is updated to the
2047 /// value stored in the given machine location (in this frame's
2048 /// context).
2049 Context(ExceptionContextLoc),
2050}
2051
2052/// A location for a dynamic exception context value.
2053#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2054#[cfg_attr(
2055 feature = "enable-serde",
2056 derive(serde_derive::Serialize, serde_derive::Deserialize)
2057)]
2058pub enum ExceptionContextLoc {
2059 /// An offset from SP at the callsite.
2060 SPOffset(u32),
2061 /// A GPR at the callsite. The physical register number for the
2062 /// GPR register file on the target architecture is used.
2063 GPR(u8),
2064}
2065
2066/// Metadata about a constant.
2067struct MachBufferConstant {
2068 /// A label which has not yet been bound which can be used for this
2069 /// constant.
2070 ///
2071 /// This is lazily created when a label is requested for a constant and is
2072 /// cleared when a constant is emitted.
2073 upcoming_label: Option<MachLabel>,
2074 /// Required alignment.
2075 align: CodeOffset,
2076 /// The byte size of this constant.
2077 size: usize,
2078}
2079
2080/// A trap that is deferred to the next time an island is emitted for either
2081/// traps, constants, or fixups.
2082#[derive(Debug)]
2083struct MachLabelTrap {
2084 /// This label will refer to the trap's offset.
2085 label: MachLabel,
2086 /// The code associated with this trap.
2087 code: TrapCode,
2088 /// An optional source location to assign for this trap.
2089 loc: Option<RelSourceLoc>,
2090}
2091
2092/// A fixup to perform on the buffer once code is emitted. Fixups always refer
2093/// to labels and patch the code based on label offsets. Hence, they are like
2094/// relocations, but internal to one buffer.
2095#[derive(Debug)]
2096struct MachLabelFixup<I: VCodeInst> {
2097 /// The label whose offset controls this fixup.
2098 label: MachLabel,
2099 /// The offset to fix up / patch to refer to this label.
2100 offset: CodeOffset,
2101 /// The kind of fixup. This is architecture-specific; each architecture may have,
2102 /// e.g., several types of branch instructions, each with differently-sized
2103 /// offset fields and different places within the instruction to place the
2104 /// bits.
2105 kind: I::LabelUse,
2106}
2107
2108impl<I: VCodeInst> MachLabelFixup<I> {
2109 fn deadline(&self) -> CodeOffset {
2110 self.offset.saturating_add(self.kind.max_pos_range())
2111 }
2112}
2113
2114impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
2115 fn eq(&self, other: &Self) -> bool {
2116 self.deadline() == other.deadline()
2117 }
2118}
2119
2120impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
2121
2122impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
2123 fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
2124 Some(self.cmp(other))
2125 }
2126}
2127
2128impl<I: VCodeInst> Ord for MachLabelFixup<I> {
2129 fn cmp(&self, other: &Self) -> Ordering {
2130 other.deadline().cmp(&self.deadline())
2131 }
2132}
2133
2134/// A relocation resulting from a compilation.
2135#[derive(Clone, Debug, PartialEq)]
2136#[cfg_attr(
2137 feature = "enable-serde",
2138 derive(serde_derive::Serialize, serde_derive::Deserialize)
2139)]
2140pub struct MachRelocBase<T> {
2141 /// The offset at which the relocation applies, *relative to the
2142 /// containing section*.
2143 pub offset: CodeOffset,
2144 /// The kind of relocation.
2145 pub kind: Reloc,
2146 /// The external symbol / name to which this relocation refers.
2147 pub target: T,
2148 /// The addend to add to the symbol value.
2149 pub addend: i64,
2150}
2151
2152type MachReloc = MachRelocBase<RelocTarget>;
2153
2154/// A relocation resulting from a compilation.
2155pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
2156
2157/// A Relocation target
2158#[derive(Debug, Clone, PartialEq, Eq, Hash)]
2159pub enum RelocTarget {
2160 /// Points to an [ExternalName] outside the current function.
2161 ExternalName(ExternalName),
2162 /// Points to a [MachLabel] inside this function.
2163 /// This is different from [MachLabelFixup] in that both the relocation and the
2164 /// label will be emitted and are only resolved at link time.
2165 ///
2166 /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
2167 Label(MachLabel),
2168}
2169
2170impl From<ExternalName> for RelocTarget {
2171 fn from(name: ExternalName) -> Self {
2172 Self::ExternalName(name)
2173 }
2174}
2175
2176impl From<MachLabel> for RelocTarget {
2177 fn from(label: MachLabel) -> Self {
2178 Self::Label(label)
2179 }
2180}
2181
2182/// A Relocation target
2183#[derive(Debug, Clone, PartialEq, Eq, Hash)]
2184#[cfg_attr(
2185 feature = "enable-serde",
2186 derive(serde_derive::Serialize, serde_derive::Deserialize)
2187)]
2188pub enum FinalizedRelocTarget {
2189 /// Points to an [ExternalName] outside the current function.
2190 ExternalName(ExternalName),
2191 /// Points to a [CodeOffset] from the start of the current function.
2192 Func(CodeOffset),
2193}
2194
2195impl FinalizedRelocTarget {
2196 /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
2197 /// output.
2198 pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
2199 match self {
2200 FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
2201 FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
2202 }
2203 }
2204}
2205
2206/// A trap record resulting from a compilation.
2207#[derive(Clone, Debug, PartialEq)]
2208#[cfg_attr(
2209 feature = "enable-serde",
2210 derive(serde_derive::Serialize, serde_derive::Deserialize)
2211)]
2212pub struct MachTrap {
2213 /// The offset at which the trap instruction occurs, *relative to the
2214 /// containing section*.
2215 pub offset: CodeOffset,
2216 /// The trap code.
2217 pub code: TrapCode,
2218}
2219
2220/// A call site record resulting from a compilation.
2221#[derive(Clone, Debug, PartialEq)]
2222#[cfg_attr(
2223 feature = "enable-serde",
2224 derive(serde_derive::Serialize, serde_derive::Deserialize)
2225)]
2226pub struct MachCallSite {
2227 /// The offset of the call's return address, *relative to the
2228 /// start of the buffer*.
2229 pub ret_addr: CodeOffset,
2230
2231 /// The offset from the FP at this callsite down to the SP when
2232 /// the call occurs, if known. In other words, the size of the
2233 /// stack frame up to the saved FP slot. Useful to recover the
2234 /// start of the stack frame and to look up dynamic contexts
2235 /// stored in [`ExceptionContextLoc::SPOffset`].
2236 ///
2237 /// If `None`, the compiler backend did not specify a frame
2238 /// offset. The runtime in use with the compiled code may require
2239 /// the frame offset if exception handlers are present or dynamic
2240 /// context is used, but that is not Cranelift's concern: the
2241 /// frame offset is optional at this level.
2242 pub frame_offset: Option<u32>,
2243
2244 /// Range in `exception_handlers` corresponding to the exception
2245 /// handlers for this callsite.
2246 exception_handler_range: Range<u32>,
2247}
2248
2249/// A call site record resulting from a compilation.
2250#[derive(Clone, Debug, PartialEq)]
2251pub struct FinalizedMachCallSite<'a> {
2252 /// The offset of the call's return address, *relative to the
2253 /// start of the buffer*.
2254 pub ret_addr: CodeOffset,
2255
2256 /// The offset from the FP at this callsite down to the SP when
2257 /// the call occurs, if known. In other words, the size of the
2258 /// stack frame up to the saved FP slot. Useful to recover the
2259 /// start of the stack frame and to look up dynamic contexts
2260 /// stored in [`ExceptionContextLoc::SPOffset`].
2261 ///
2262 /// If `None`, the compiler backend did not specify a frame
2263 /// offset. The runtime in use with the compiled code may require
2264 /// the frame offset if exception handlers are present or dynamic
2265 /// context is used, but that is not Cranelift's concern: the
2266 /// frame offset is optional at this level.
2267 pub frame_offset: Option<u32>,
2268
2269 /// Exception handlers at this callsite, with target offsets
2270 /// *relative to the start of the buffer*.
2271 pub exception_handlers: &'a [FinalizedMachExceptionHandler],
2272}
2273
2274/// A patchable call site record resulting from a compilation.
2275#[derive(Clone, Debug, PartialEq)]
2276#[cfg_attr(
2277 feature = "enable-serde",
2278 derive(serde_derive::Serialize, serde_derive::Deserialize)
2279)]
2280pub struct MachPatchableCallSite {
2281 /// The offset of the call's return address (i.e., the address
2282 /// after the end of the patchable region), *relative to the start
2283 /// of the buffer*.
2284 pub ret_addr: CodeOffset,
2285
2286 /// The length of the region to be patched by NOP bytes.
2287 pub len: u32,
2288}
2289
2290/// A source-location mapping resulting from a compilation.
2291#[derive(PartialEq, Debug, Clone)]
2292#[cfg_attr(
2293 feature = "enable-serde",
2294 derive(serde_derive::Serialize, serde_derive::Deserialize)
2295)]
2296pub struct MachSrcLoc<T: CompilePhase> {
2297 /// The start of the region of code corresponding to a source location.
2298 /// This is relative to the start of the function, not to the start of the
2299 /// section.
2300 pub start: CodeOffset,
2301 /// The end of the region of code corresponding to a source location.
2302 /// This is relative to the start of the function, not to the start of the
2303 /// section.
2304 pub end: CodeOffset,
2305 /// The source location.
2306 pub loc: T::SourceLocType,
2307}
2308
2309impl MachSrcLoc<Stencil> {
2310 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
2311 MachSrcLoc {
2312 start: self.start,
2313 end: self.end,
2314 loc: self.loc.expand(base_srcloc),
2315 }
2316 }
2317}
2318
2319/// Record of branch instruction in the buffer, to facilitate editing.
2320#[derive(Clone, Debug)]
2321struct MachBranch {
2322 start: CodeOffset,
2323 end: CodeOffset,
2324 target: MachLabel,
2325 fixup: usize,
2326 inverted: Option<SmallVec<[u8; 8]>>,
2327 /// All labels pointing to the start of this branch. For correctness, this
2328 /// *must* be complete (i.e., must contain all labels whose resolved offsets
2329 /// are at the start of this branch): we rely on being able to redirect all
2330 /// labels that could jump to this branch before removing it, if it is
2331 /// otherwise unreachable.
2332 labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2333}
2334
2335impl MachBranch {
2336 fn is_cond(&self) -> bool {
2337 self.inverted.is_some()
2338 }
2339 fn is_uncond(&self) -> bool {
2340 self.inverted.is_none()
2341 }
2342}
2343
2344/// Stack-frame layout information carried through to machine
2345/// code. This provides sufficient information to interpret an active
2346/// stack frame from a running function, if provided.
2347#[derive(Clone, Debug, PartialEq)]
2348#[cfg_attr(
2349 feature = "enable-serde",
2350 derive(serde_derive::Serialize, serde_derive::Deserialize)
2351)]
2352pub struct MachBufferFrameLayout {
2353 /// Offset from bottom of frame to FP (near top of frame). This
2354 /// allows reading the frame given only FP.
2355 pub frame_to_fp_offset: u32,
2356 /// Offset from bottom of frame for each StackSlot,
2357 pub stackslots: SecondaryMap<ir::StackSlot, MachBufferStackSlot>,
2358}
2359
2360/// Descriptor for a single stack slot in the compiled function.
2361#[derive(Clone, Debug, PartialEq, Default)]
2362#[cfg_attr(
2363 feature = "enable-serde",
2364 derive(serde_derive::Serialize, serde_derive::Deserialize)
2365)]
2366pub struct MachBufferStackSlot {
2367 /// Offset from the bottom of the stack frame.
2368 pub offset: u32,
2369
2370 /// User-provided key to describe this stack slot.
2371 pub key: Option<ir::StackSlotKey>,
2372}
2373
2374/// Debug tags: a sequence of references to a stack slot, or a
2375/// user-defined value, at a particular PC.
2376#[derive(Clone, Debug, PartialEq)]
2377#[cfg_attr(
2378 feature = "enable-serde",
2379 derive(serde_derive::Serialize, serde_derive::Deserialize)
2380)]
2381pub(crate) struct MachDebugTags {
2382 /// Offset at which this tag applies.
2383 pub offset: CodeOffset,
2384
2385 /// Position on the attached instruction. This indicates whether
2386 /// the tags attach to the prior instruction (i.e., as a return
2387 /// point from a call) or the current instruction (i.e., as a PC
2388 /// seen during a trap).
2389 pub pos: MachDebugTagPos,
2390
2391 /// The range in the tag pool.
2392 pub range: Range<u32>,
2393}
2394
2395/// Debug tag position on an instruction.
2396///
2397/// We need to distinguish position on an instruction, and not just
2398/// use offsets, because of the following case:
2399///
2400/// ```plain
2401/// <tag1, tag2> call ...
2402/// <tag3, tag4> trapping_store ...
2403/// ```
2404///
2405/// If the stack is walked and interpreted with debug tags while
2406/// within the call, the PC seen will be the return point, i.e. the
2407/// address after the call. If the stack is walked and interpreted
2408/// with debug tags upon a trap of the following instruction, it will
2409/// be the PC of that instruction -- which is the same PC! Thus to
2410/// disambiguate which tags we want, we attach a "pre/post" flag to
2411/// every group of tags at an offset; and when we look up tags, we
2412/// look them up for an offset and "position" at that offset.
2413///
2414/// Thus there are logically two positions at every offset -- so the
2415/// above will be emitted as
2416///
2417/// ```plain
2418/// 0: call ...
2419/// 4, post: <tag1, tag2>
2420/// 4, pre: <tag3, tag4>
2421/// 4: trapping_store ...
2422/// ```
2423#[derive(Clone, Copy, Debug, PartialEq, Eq)]
2424#[cfg_attr(
2425 feature = "enable-serde",
2426 derive(serde_derive::Serialize, serde_derive::Deserialize)
2427)]
2428pub enum MachDebugTagPos {
2429 /// Tags attached after the instruction that ends at this offset.
2430 ///
2431 /// This is used to attach tags to a call, because the PC we see
2432 /// when walking the stack is the *return point*.
2433 Post,
2434 /// Tags attached before the instruction that starts at this offset.
2435 ///
2436 /// This is used to attach tags to every other kind of
2437 /// instruction, because the PC we see when processing a trap of
2438 /// that instruction is the PC of that instruction, not the
2439 /// following one.
2440 Pre,
2441}
2442
2443/// Iterator item for visiting debug tags.
2444pub struct MachBufferDebugTagList<'a> {
2445 /// Offset at which this tag applies.
2446 pub offset: CodeOffset,
2447
2448 /// Position at this offset ("post", attaching to prior
2449 /// instruction, or "pre", attaching to next instruction).
2450 pub pos: MachDebugTagPos,
2451
2452 /// The underlying tags.
2453 pub tags: &'a [DebugTag],
2454}
2455
2456/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2457///
2458/// Note that `MachBuffer` was primarily written for intra-function references
2459/// of jumps between basic blocks, but it's also quite usable for entire text
2460/// sections and resolving references between functions themselves. This
2461/// builder interprets "blocks" as labeled functions for the purposes of
2462/// resolving labels internally in the buffer.
2463pub struct MachTextSectionBuilder<I: VCodeInst> {
2464 buf: MachBuffer<I>,
2465 next_func: usize,
2466 force_veneers: ForceVeneers,
2467}
2468
2469impl<I: VCodeInst> MachTextSectionBuilder<I> {
2470 /// Creates a new text section builder which will have `num_funcs` functions
2471 /// pushed into it.
2472 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2473 let mut buf = MachBuffer::new();
2474 buf.reserve_labels_for_blocks(num_funcs);
2475 MachTextSectionBuilder {
2476 buf,
2477 next_func: 0,
2478 force_veneers: ForceVeneers::No,
2479 }
2480 }
2481}
2482
2483impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2484 fn append(
2485 &mut self,
2486 labeled: bool,
2487 func: &[u8],
2488 align: u32,
2489 ctrl_plane: &mut ControlPlane,
2490 ) -> u64 {
2491 // Conditionally emit an island if it's necessary to resolve jumps
2492 // between functions which are too far away.
2493 let size = func.len() as u32;
2494 if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2495 self.buf
2496 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2497 }
2498
2499 self.buf.align_to(align);
2500 let pos = self.buf.cur_offset();
2501 if labeled {
2502 self.buf.bind_label(
2503 MachLabel::from_block(BlockIndex::new(self.next_func)),
2504 ctrl_plane,
2505 );
2506 self.next_func += 1;
2507 }
2508 self.buf.put_data(func);
2509 u64::from(pos)
2510 }
2511
2512 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2513 crate::trace!(
2514 "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2515 );
2516 let label = MachLabel::from_block(BlockIndex::new(target));
2517 let offset = u32::try_from(offset).unwrap();
2518 match I::LabelUse::from_reloc(reloc, addend) {
2519 Some(label_use) => {
2520 self.buf.use_label_at_offset(offset, label, label_use);
2521 true
2522 }
2523 None => false,
2524 }
2525 }
2526
2527 fn force_veneers(&mut self) {
2528 self.force_veneers = ForceVeneers::Yes;
2529 }
2530
2531 fn write(&mut self, offset: u64, data: &[u8]) {
2532 self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2533 }
2534
2535 fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2536 // Double-check all functions were pushed.
2537 assert_eq!(self.next_func, self.buf.label_offsets.len());
2538
2539 // Finish up any veneers, if necessary.
2540 self.buf
2541 .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2542
2543 // We don't need the data any more, so return it to the caller.
2544 mem::take(&mut self.buf.data).into_vec()
2545 }
2546}
2547
2548// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2549#[cfg(all(test, feature = "arm64"))]
2550mod test {
2551 use cranelift_entity::EntityRef as _;
2552
2553 use super::*;
2554 use crate::ir::UserExternalNameRef;
2555 use crate::isa::aarch64;
2556 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2557 use crate::isa::aarch64::inst::{OperandSize, xreg};
2558 use crate::machinst::{MachInst, MachInstEmit, MachInstEmitState};
2559 use crate::settings;
2560
2561 fn label(n: u32) -> MachLabel {
2562 MachLabel::from_block(BlockIndex::new(n as usize))
2563 }
2564 fn target(n: u32) -> BranchTarget {
2565 BranchTarget::Label(label(n))
2566 }
2567
2568 fn emit_info() -> EmitInfo {
2569 let flags = settings::Flags::new(settings::builder());
2570 let isa_flags = aarch64::settings::Flags::new(&flags, &aarch64::settings::builder());
2571 EmitInfo::new(flags, isa_flags)
2572 }
2573
2574 #[test]
2575 fn test_elide_jump_to_next() {
2576 let info = emit_info();
2577 let mut buf = MachBuffer::new();
2578 let mut state = <Inst as MachInstEmit>::State::default();
2579 let constants = Default::default();
2580
2581 buf.reserve_labels_for_blocks(2);
2582 buf.bind_label(label(0), state.ctrl_plane_mut());
2583 let inst = Inst::Jump { dest: target(1) };
2584 inst.emit(&mut buf, &info, &mut state);
2585 buf.bind_label(label(1), state.ctrl_plane_mut());
2586 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2587 assert_eq!(0, buf.total_size());
2588 }
2589
2590 #[test]
2591 fn test_elide_trivial_jump_blocks() {
2592 let info = emit_info();
2593 let mut buf = MachBuffer::new();
2594 let mut state = <Inst as MachInstEmit>::State::default();
2595 let constants = Default::default();
2596
2597 buf.reserve_labels_for_blocks(4);
2598
2599 buf.bind_label(label(0), state.ctrl_plane_mut());
2600 let inst = Inst::CondBr {
2601 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2602 taken: target(1),
2603 not_taken: target(2),
2604 };
2605 inst.emit(&mut buf, &info, &mut state);
2606
2607 buf.bind_label(label(1), state.ctrl_plane_mut());
2608 let inst = Inst::Jump { dest: target(3) };
2609 inst.emit(&mut buf, &info, &mut state);
2610
2611 buf.bind_label(label(2), state.ctrl_plane_mut());
2612 let inst = Inst::Jump { dest: target(3) };
2613 inst.emit(&mut buf, &info, &mut state);
2614
2615 buf.bind_label(label(3), state.ctrl_plane_mut());
2616
2617 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2618 assert_eq!(0, buf.total_size());
2619 }
2620
2621 #[test]
2622 fn test_flip_cond() {
2623 let info = emit_info();
2624 let mut buf = MachBuffer::new();
2625 let mut state = <Inst as MachInstEmit>::State::default();
2626 let constants = Default::default();
2627
2628 buf.reserve_labels_for_blocks(4);
2629
2630 buf.bind_label(label(0), state.ctrl_plane_mut());
2631 let inst = Inst::CondBr {
2632 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2633 taken: target(1),
2634 not_taken: target(2),
2635 };
2636 inst.emit(&mut buf, &info, &mut state);
2637
2638 buf.bind_label(label(1), state.ctrl_plane_mut());
2639 let inst = Inst::Nop4;
2640 inst.emit(&mut buf, &info, &mut state);
2641
2642 buf.bind_label(label(2), state.ctrl_plane_mut());
2643 let inst = Inst::Udf {
2644 trap_code: TrapCode::STACK_OVERFLOW,
2645 };
2646 inst.emit(&mut buf, &info, &mut state);
2647
2648 buf.bind_label(label(3), state.ctrl_plane_mut());
2649
2650 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2651
2652 let mut buf2 = MachBuffer::new();
2653 let mut state = Default::default();
2654 let inst = Inst::TrapIf {
2655 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2656 trap_code: TrapCode::STACK_OVERFLOW,
2657 };
2658 inst.emit(&mut buf2, &info, &mut state);
2659 let inst = Inst::Nop4;
2660 inst.emit(&mut buf2, &info, &mut state);
2661
2662 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2663
2664 assert_eq!(buf.data, buf2.data);
2665 }
2666
2667 #[test]
2668 fn test_island() {
2669 let info = emit_info();
2670 let mut buf = MachBuffer::new();
2671 let mut state = <Inst as MachInstEmit>::State::default();
2672 let constants = Default::default();
2673
2674 buf.reserve_labels_for_blocks(4);
2675
2676 buf.bind_label(label(0), state.ctrl_plane_mut());
2677 let inst = Inst::CondBr {
2678 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2679 taken: target(2),
2680 not_taken: target(3),
2681 };
2682 inst.emit(&mut buf, &info, &mut state);
2683
2684 buf.bind_label(label(1), state.ctrl_plane_mut());
2685 while buf.cur_offset() < 2000000 {
2686 if buf.island_needed(0) {
2687 buf.emit_island(0, state.ctrl_plane_mut());
2688 }
2689 let inst = Inst::Nop4;
2690 inst.emit(&mut buf, &info, &mut state);
2691 }
2692
2693 buf.bind_label(label(2), state.ctrl_plane_mut());
2694 let inst = Inst::Nop4;
2695 inst.emit(&mut buf, &info, &mut state);
2696
2697 buf.bind_label(label(3), state.ctrl_plane_mut());
2698 let inst = Inst::Nop4;
2699 inst.emit(&mut buf, &info, &mut state);
2700
2701 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2702
2703 assert_eq!(2000000 + 8, buf.total_size());
2704
2705 let mut buf2 = MachBuffer::new();
2706 let mut state = Default::default();
2707 let inst = Inst::CondBr {
2708 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2709
2710 // This conditionally taken branch has a 19-bit constant, shifted
2711 // to the left by two, giving us a 21-bit range in total. Half of
2712 // this range positive so the we should be around 1 << 20 bytes
2713 // away for our jump target.
2714 //
2715 // There are two pending fixups by the time we reach this point,
2716 // one for this 19-bit jump and one for the unconditional 26-bit
2717 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2718 // veneer is 20 bytes large, which means that pessimistically
2719 // assuming we'll need two veneers. Currently each veneer is
2720 // pessimistically assumed to be the maximal size which means we
2721 // need 40 bytes of extra space, meaning that the actual island
2722 // should come 40-bytes before the deadline.
2723 taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2724
2725 // This branch is in-range so no veneers should be needed, it should
2726 // go directly to the target.
2727 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2728 };
2729 inst.emit(&mut buf2, &info, &mut state);
2730
2731 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2732
2733 assert_eq!(&buf.data[0..8], &buf2.data[..]);
2734 }
2735
2736 #[test]
2737 fn test_island_backward() {
2738 let info = emit_info();
2739 let mut buf = MachBuffer::new();
2740 let mut state = <Inst as MachInstEmit>::State::default();
2741 let constants = Default::default();
2742
2743 buf.reserve_labels_for_blocks(4);
2744
2745 buf.bind_label(label(0), state.ctrl_plane_mut());
2746 let inst = Inst::Nop4;
2747 inst.emit(&mut buf, &info, &mut state);
2748
2749 buf.bind_label(label(1), state.ctrl_plane_mut());
2750 let inst = Inst::Nop4;
2751 inst.emit(&mut buf, &info, &mut state);
2752
2753 buf.bind_label(label(2), state.ctrl_plane_mut());
2754 while buf.cur_offset() < 2000000 {
2755 let inst = Inst::Nop4;
2756 inst.emit(&mut buf, &info, &mut state);
2757 }
2758
2759 buf.bind_label(label(3), state.ctrl_plane_mut());
2760 let inst = Inst::CondBr {
2761 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2762 taken: target(0),
2763 not_taken: target(1),
2764 };
2765 inst.emit(&mut buf, &info, &mut state);
2766
2767 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2768
2769 assert_eq!(2000000 + 12, buf.total_size());
2770
2771 let mut buf2 = MachBuffer::new();
2772 let mut state = Default::default();
2773 let inst = Inst::CondBr {
2774 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2775 taken: BranchTarget::ResolvedOffset(8),
2776 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2777 };
2778 inst.emit(&mut buf2, &info, &mut state);
2779 let inst = Inst::Jump {
2780 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2781 };
2782 inst.emit(&mut buf2, &info, &mut state);
2783
2784 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2785
2786 assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2787 }
2788
2789 #[test]
2790 fn test_multiple_redirect() {
2791 // label0:
2792 // cbz x0, label1
2793 // b label2
2794 // label1:
2795 // b label3
2796 // label2:
2797 // nop
2798 // nop
2799 // b label0
2800 // label3:
2801 // b label4
2802 // label4:
2803 // b label5
2804 // label5:
2805 // b label7
2806 // label6:
2807 // nop
2808 // label7:
2809 // ret
2810 //
2811 // -- should become:
2812 //
2813 // label0:
2814 // cbz x0, label7
2815 // label2:
2816 // nop
2817 // nop
2818 // b label0
2819 // label6:
2820 // nop
2821 // label7:
2822 // ret
2823
2824 let info = emit_info();
2825 let mut buf = MachBuffer::new();
2826 let mut state = <Inst as MachInstEmit>::State::default();
2827 let constants = Default::default();
2828
2829 buf.reserve_labels_for_blocks(8);
2830
2831 buf.bind_label(label(0), state.ctrl_plane_mut());
2832 let inst = Inst::CondBr {
2833 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2834 taken: target(1),
2835 not_taken: target(2),
2836 };
2837 inst.emit(&mut buf, &info, &mut state);
2838
2839 buf.bind_label(label(1), state.ctrl_plane_mut());
2840 let inst = Inst::Jump { dest: target(3) };
2841 inst.emit(&mut buf, &info, &mut state);
2842
2843 buf.bind_label(label(2), state.ctrl_plane_mut());
2844 let inst = Inst::Nop4;
2845 inst.emit(&mut buf, &info, &mut state);
2846 inst.emit(&mut buf, &info, &mut state);
2847 let inst = Inst::Jump { dest: target(0) };
2848 inst.emit(&mut buf, &info, &mut state);
2849
2850 buf.bind_label(label(3), state.ctrl_plane_mut());
2851 let inst = Inst::Jump { dest: target(4) };
2852 inst.emit(&mut buf, &info, &mut state);
2853
2854 buf.bind_label(label(4), state.ctrl_plane_mut());
2855 let inst = Inst::Jump { dest: target(5) };
2856 inst.emit(&mut buf, &info, &mut state);
2857
2858 buf.bind_label(label(5), state.ctrl_plane_mut());
2859 let inst = Inst::Jump { dest: target(7) };
2860 inst.emit(&mut buf, &info, &mut state);
2861
2862 buf.bind_label(label(6), state.ctrl_plane_mut());
2863 let inst = Inst::Nop4;
2864 inst.emit(&mut buf, &info, &mut state);
2865
2866 buf.bind_label(label(7), state.ctrl_plane_mut());
2867 let inst = Inst::Ret {};
2868 inst.emit(&mut buf, &info, &mut state);
2869
2870 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2871
2872 let golden_data = vec![
2873 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2874 0x1f, 0x20, 0x03, 0xd5, // nop
2875 0x1f, 0x20, 0x03, 0xd5, // nop
2876 0xfd, 0xff, 0xff, 0x17, // b 0
2877 0x1f, 0x20, 0x03, 0xd5, // nop
2878 0xc0, 0x03, 0x5f, 0xd6, // ret
2879 ];
2880
2881 assert_eq!(&golden_data[..], &buf.data[..]);
2882 }
2883
2884 #[test]
2885 fn test_handle_branch_cycle() {
2886 // label0:
2887 // b label1
2888 // label1:
2889 // b label2
2890 // label2:
2891 // b label3
2892 // label3:
2893 // b label4
2894 // label4:
2895 // b label1 // note: not label0 (to make it interesting).
2896 //
2897 // -- should become:
2898 //
2899 // label0, label1, ..., label4:
2900 // b label0
2901 let info = emit_info();
2902 let mut buf = MachBuffer::new();
2903 let mut state = <Inst as MachInstEmit>::State::default();
2904 let constants = Default::default();
2905
2906 buf.reserve_labels_for_blocks(5);
2907
2908 buf.bind_label(label(0), state.ctrl_plane_mut());
2909 let inst = Inst::Jump { dest: target(1) };
2910 inst.emit(&mut buf, &info, &mut state);
2911
2912 buf.bind_label(label(1), state.ctrl_plane_mut());
2913 let inst = Inst::Jump { dest: target(2) };
2914 inst.emit(&mut buf, &info, &mut state);
2915
2916 buf.bind_label(label(2), state.ctrl_plane_mut());
2917 let inst = Inst::Jump { dest: target(3) };
2918 inst.emit(&mut buf, &info, &mut state);
2919
2920 buf.bind_label(label(3), state.ctrl_plane_mut());
2921 let inst = Inst::Jump { dest: target(4) };
2922 inst.emit(&mut buf, &info, &mut state);
2923
2924 buf.bind_label(label(4), state.ctrl_plane_mut());
2925 let inst = Inst::Jump { dest: target(1) };
2926 inst.emit(&mut buf, &info, &mut state);
2927
2928 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2929
2930 let golden_data = vec![
2931 0x00, 0x00, 0x00, 0x14, // b 0
2932 ];
2933
2934 assert_eq!(&golden_data[..], &buf.data[..]);
2935 }
2936
2937 #[test]
2938 fn metadata_records() {
2939 let mut buf = MachBuffer::<Inst>::new();
2940 let ctrl_plane = &mut Default::default();
2941 let constants = Default::default();
2942
2943 buf.reserve_labels_for_blocks(3);
2944
2945 buf.bind_label(label(0), ctrl_plane);
2946 buf.put1(1);
2947 buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
2948 buf.put1(2);
2949 buf.add_trap(TrapCode::INTEGER_OVERFLOW);
2950 buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
2951 buf.add_try_call_site(
2952 Some(0x10),
2953 [
2954 MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),
2955 MachExceptionHandler::Default(label(1)),
2956 ]
2957 .into_iter(),
2958 );
2959 buf.add_reloc(
2960 Reloc::Abs4,
2961 &ExternalName::User(UserExternalNameRef::new(0)),
2962 0,
2963 );
2964 buf.put1(3);
2965 buf.add_reloc(
2966 Reloc::Abs8,
2967 &ExternalName::User(UserExternalNameRef::new(1)),
2968 1,
2969 );
2970 buf.put1(4);
2971 buf.bind_label(label(1), ctrl_plane);
2972 buf.put1(0xff);
2973 buf.bind_label(label(2), ctrl_plane);
2974 buf.put1(0xff);
2975
2976 let buf = buf.finish(&constants, ctrl_plane);
2977
2978 assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
2979 assert_eq!(
2980 buf.traps()
2981 .iter()
2982 .map(|trap| (trap.offset, trap.code))
2983 .collect::<Vec<_>>(),
2984 vec![
2985 (1, TrapCode::HEAP_OUT_OF_BOUNDS),
2986 (2, TrapCode::INTEGER_OVERFLOW),
2987 (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
2988 ]
2989 );
2990 let call_sites: Vec<_> = buf.call_sites().collect();
2991 assert_eq!(call_sites[0].ret_addr, 2);
2992 assert_eq!(call_sites[0].frame_offset, Some(0x10));
2993 assert_eq!(
2994 call_sites[0].exception_handlers,
2995 &[
2996 FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),
2997 FinalizedMachExceptionHandler::Default(4)
2998 ],
2999 );
3000 assert_eq!(
3001 buf.relocs()
3002 .iter()
3003 .map(|reloc| (reloc.offset, reloc.kind))
3004 .collect::<Vec<_>>(),
3005 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
3006 );
3007 }
3008
3009 /// Drive the buffer in the same idiom that VCode emission uses:
3010 /// emit instruction bytes via the given closure, then run the
3011 /// per-instruction island check.
3012 fn emit_with_island_check(
3013 buf: &mut MachBuffer<Inst>,
3014 state: &mut <Inst as MachInstEmit>::State,
3015 f: impl FnOnce(&mut MachBuffer<Inst>, &mut <Inst as MachInstEmit>::State),
3016 ) {
3017 f(buf, state);
3018 let lookahead = Inst::worst_case_size() + Inst::worst_case_island_growth();
3019 if buf.island_needed(lookahead) {
3020 let jump_around = buf.get_label();
3021 Inst::gen_jump(jump_around).emit(buf, &emit_info(), state);
3022 buf.emit_island(0, state.ctrl_plane_mut());
3023 buf.bind_label(jump_around, state.ctrl_plane_mut());
3024 }
3025 }
3026
3027 /// Many constant loads in a single basic block: each emits an
3028 /// `Ldr19` reference (+/- 1 MiB range, no veneer support) and
3029 /// adds a 16-byte constant to the pool. Without the
3030 /// per-instruction island check, the pool would grow past the
3031 /// reach of earlier `ldr`s and the buffer would panic during the
3032 /// final fixup pass (see issue #12968).
3033 #[test]
3034 fn test_many_constants_in_one_block() {
3035 use crate::ir::constant::ConstantData;
3036
3037 let mut buf = MachBuffer::<Inst>::new();
3038 let mut state = <Inst as MachInstEmit>::State::default();
3039 let mut constants = VCodeConstants::default();
3040
3041 let n = 200_000;
3042 let mut handles = Vec::with_capacity(n);
3043 for i in 0..n {
3044 let bytes: Vec<u8> = (0..16).map(|b| ((i + b) & 0xff) as u8).collect();
3045 let data = VCodeConstantData::Generated(ConstantData::from(&bytes[..]));
3046 handles.push(constants.insert(data));
3047 }
3048 buf.register_constants(&constants);
3049
3050 buf.reserve_labels_for_blocks(1);
3051 buf.bind_label(label(0), state.ctrl_plane_mut());
3052
3053 for &handle in &handles {
3054 emit_with_island_check(&mut buf, &mut state, |buf, _state| {
3055 let off = buf.cur_offset();
3056 let const_label = buf.get_label_for_constant(handle);
3057 buf.use_label_at_offset(off, const_label, aarch64::inst::LabelUse::Ldr19);
3058 // Placeholder ldr literal instruction.
3059 buf.put4(0);
3060 });
3061 }
3062
3063 let _ = buf.finish(&constants, state.ctrl_plane_mut());
3064 }
3065
3066 /// Mix conditional branches with short ranges (`Branch19`, +/- 1
3067 /// MiB) and many constant loads. The branches' veneers must
3068 /// remain in range as the constant pool grows.
3069 #[test]
3070 fn test_short_branch_amid_constants() {
3071 use crate::ir::constant::ConstantData;
3072
3073 let mut buf = MachBuffer::<Inst>::new();
3074 let mut state = <Inst as MachInstEmit>::State::default();
3075 let mut constants = VCodeConstants::default();
3076
3077 let n = 100_000;
3078 let mut handles = Vec::with_capacity(n);
3079 for i in 0..n {
3080 let bytes: Vec<u8> = (0..16).map(|b| ((i ^ b) & 0xff) as u8).collect();
3081 handles.push(
3082 constants.insert(VCodeConstantData::Generated(ConstantData::from(&bytes[..]))),
3083 );
3084 }
3085 buf.register_constants(&constants);
3086
3087 buf.reserve_labels_for_blocks(2);
3088 buf.bind_label(label(0), state.ctrl_plane_mut());
3089
3090 emit_with_island_check(&mut buf, &mut state, |buf, state| {
3091 let inst = Inst::CondBr {
3092 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
3093 taken: target(1),
3094 not_taken: target(0),
3095 };
3096 inst.emit(buf, &emit_info(), state);
3097 });
3098
3099 for &handle in &handles {
3100 emit_with_island_check(&mut buf, &mut state, |buf, _state| {
3101 let off = buf.cur_offset();
3102 let const_label = buf.get_label_for_constant(handle);
3103 buf.use_label_at_offset(off, const_label, aarch64::inst::LabelUse::Ldr19);
3104 buf.put4(0);
3105 });
3106 }
3107
3108 buf.bind_label(label(1), state.ctrl_plane_mut());
3109 let _ = buf.finish(&constants, state.ctrl_plane_mut());
3110 }
3111
3112 /// Driving an island in the middle of a block via the
3113 /// jump-around-plus-`emit_island` idiom must emit a branch followed
3114 /// by the island contents, such that fall-through reaches the
3115 /// post-island code.
3116 #[test]
3117 fn test_mid_block_island_via_gen_jump() {
3118 let mut buf = MachBuffer::<Inst>::new();
3119 let mut state = <Inst as MachInstEmit>::State::default();
3120 let constants = VCodeConstants::default();
3121
3122 buf.reserve_labels_for_blocks(1);
3123 buf.bind_label(label(0), state.ctrl_plane_mut());
3124
3125 // Place a trap which will need to be emitted in the next island.
3126 let trap_label = buf.defer_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
3127 let off = buf.cur_offset();
3128 buf.use_label_at_offset(off, trap_label, aarch64::inst::LabelUse::Branch19);
3129 buf.put4(0); // placeholder for cbnz-like reference
3130
3131 let before = buf.cur_offset();
3132 let jump_around = buf.get_label();
3133 Inst::gen_jump(jump_around).emit(&mut buf, &emit_info(), &mut state);
3134 buf.emit_island(0, state.ctrl_plane_mut());
3135 buf.bind_label(jump_around, state.ctrl_plane_mut());
3136 let after = buf.cur_offset();
3137
3138 // The jump-around branch (4 bytes on AArch64) plus at least the
3139 // deferred trap (4 bytes) gives a minimum island growth of 8 bytes.
3140 assert!(
3141 after - before >= 8,
3142 "island grew too little: {}",
3143 after - before
3144 );
3145
3146 let buf = buf.finish(&constants, state.ctrl_plane_mut());
3147 let _ = buf.total_size();
3148 }
3149}