cranelift_codegen/machinst/buffer.rs
1//! In-memory representation of compiled machine code, with labels and fixups to
2//! refer to those labels. Handles constant-pool island insertion and also
3//! veneer insertion for out-of-range jumps.
4//!
5//! This code exists to solve three problems:
6//!
7//! - Branch targets for forward branches are not known until later, when we
8//! emit code in a single pass through the instruction structs.
9//!
10//! - On many architectures, address references or offsets have limited range.
11//! For example, on AArch64, conditional branches can only target code +/- 1MB
12//! from the branch itself.
13//!
14//! - The lowering of control flow from the CFG-with-edges produced by
15//! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16//! edge blocks when the register allocator does not need to insert any
17//! spills/reloads/moves in edge blocks, results in many suboptimal branch
18//! patterns. The lowering also pays no attention to block order, and so
19//! two-target conditional forms (cond-br followed by uncond-br) can often by
20//! avoided because one of the targets is the fallthrough. There are several
21//! cases here where we can simplify to use fewer branches.
22//!
23//! This "buffer" implements a single-pass code emission strategy (with a later
24//! "fixup" pass, but only through recorded fixups, not all instructions). The
25//! basic idea is:
26//!
27//! - Emit branches as they are, including two-target (cond/uncond) compound
28//! forms, but with zero offsets and optimistically assuming the target will be
29//! in range. Record the "fixup" for later. Targets are denoted instead by
30//! symbolic "labels" that are then bound to certain offsets in the buffer as
31//! we emit code. (Nominally, there is a label at the start of every basic
32//! block.)
33//!
34//! - As we do this, track the offset in the buffer at which the first label
35//! reference "goes out of range". We call this the "deadline". If we reach the
36//! deadline and we still have not bound the label to which an unresolved branch
37//! refers, we have a problem!
38//!
39//! - To solve this problem, we emit "islands" full of "veneers". An island is
40//! simply a chunk of code inserted in the middle of the code actually produced
41//! by the emitter (e.g., vcode iterating over instruction structs). The emitter
42//! has some awareness of this: it either asks for an island between blocks, so
43//! it is not accidentally executed, or else it emits a branch around the island
44//! when all other options fail (see `Inst::EmitIsland` meta-instruction).
45//!
46//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47//! that implements a longer-range reference to a label. The idea is that, for
48//! example, a branch with a limited range can branch to a "veneer" instead,
49//! which is simply a branch in a form that can use a longer-range reference. On
50//! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51//! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52//! conditional branch's label reference can be fixed up with a "veneer" to
53//! achieve a longer range.
54//!
55//! - To implement all of this, we require the backend to provide a `LabelUse`
56//! type that implements a trait. This is nominally an enum that records one of
57//! several kinds of references to an offset in code -- basically, a relocation
58//! type -- and will usually correspond to different instruction formats. The
59//! `LabelUse` implementation specifies the maximum range, how to patch in the
60//! actual label location when known, and how to generate a veneer to extend the
61//! range.
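//!
//! As a rough sketch (not the literal trait definition), these are the kinds
//! of queries the buffer makes of a backend's `LabelUse`; every method named
//! below is used by the fixup-handling code later in this file:
//!
//! ```ignore
//! // `kind` is some backend-specific label-use value attached to a fixup.
//! let max_fwd = kind.max_pos_range();  // farthest forward target reachable
//! let max_back = kind.max_neg_range(); // farthest backward target reachable
//! let patched = kind.patch_size();     // bytes rewritten at the use site
//! if kind.supports_veneer() {
//!     // Worst-case island sizing is derived from this per-fixup bound.
//!     let bound = I::LabelUse::worst_case_veneer_size();
//! }
//! ```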
62//!
63//! That satisfies label references, but we still may have suboptimal branch
64//! patterns. To clean up the branches, we do a simple "peephole"-style
65//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66//! informs the buffer of branches in the code and, in the case of conditionals,
67//! the code that would have been emitted to invert this branch's condition. We
68//! track the "latest branches": these are branches that are contiguous up to
69//! the current offset. (If any code is emitted after a branch, that branch or
70//! run of contiguous branches is no longer "latest".) The latest branches are
71//! those that we can edit by simply truncating the buffer and doing something
72//! else instead.
73//!
74//! To optimize branches, we implement several simple rules, and try to apply
75//! them to the "latest branches" when possible:
76//!
77//! - A branch with a label target, when that label is bound to the ending
78//! offset of the branch (the fallthrough location), can be removed altogether,
79//! because the branch would have no effect.
80//!
81//! - An unconditional branch that starts at a label location, and branches to
82//! another label, results in a "label alias": all references to the label bound
83//! *to* this branch instruction are instead resolved to the *target* of the
84//! branch instruction. This effectively removes empty blocks that just
85//! unconditionally branch to the next block. We call this "branch threading".
86//!
87//! - A conditional followed by an unconditional, when the conditional branches
88//! to the unconditional's fallthrough, results in (i) the truncation of the
89//! unconditional, (ii) the inversion of the condition's condition, and (iii)
90//! replacement of the conditional's target (using the original target of the
91//! unconditional). This is a fancy way of saying "we can flip a two-target
92//! conditional branch's taken/not-taken targets if it works better with our
93//! fallthrough". To make this work, the emitter actually gives the buffer
94//! *both* forms of every conditional branch: the true form is emitted into the
95//! buffer, and the "inverted" machine-code bytes are provided as part of the
96//! branch-fixup metadata (see the example after this list).
97//!
98//! - An unconditional B preceded by another unconditional P, when B's label(s) have
99//! been redirected to target(B), can be removed entirely. This is an extension
100//! of the branch-threading optimization, and is valid because if we know there
101//! will be no fallthrough into this branch instruction (the prior instruction
102//! is an unconditional jump), and if we know we have successfully redirected
103//! all labels, then this branch instruction is unreachable. Note that this
104//! works because the redirection happens before the label is ever resolved
105//! (fixups happen at island emission time, at which point latest-branches are
106//! cleared, or at the end of emission), so we are sure to catch and redirect
107//! all possible paths to this instruction.
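//!
//! As a concrete example of the conditional-plus-unconditional rule above (in
//! the same pseudo-assembly style used in the comments of
//! `optimize_branches()` below), the sequence
//!
//! ```text
//!     cond_br taken
//!     br      not_taken
//! taken:
//!     ;; (end of buffer)
//! ```
//!
//! becomes, once `taken` is observed to be bound to the offset just past the
//! unconditional branch,
//!
//! ```text
//!     cond_br.inverted not_taken
//! taken:
//!     ;; (end of buffer)
//! ```
//!
//! using the pre-recorded inverted bytes for the conditional and retargeting
//! its fixup at the unconditional's original target.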
108//!
109//! # Branch-optimization Correctness
110//!
111//! The branch-optimization mechanism depends on a few data structures with
112//! invariants, which are always held outside the scope of top-level public
113//! methods:
114//!
115//! - The latest-branches list. Each entry describes a span of the buffer
116//! (start/end offsets), the label target, the corresponding fixup-list entry
117//! index, and the bytes (must be the same length) for the inverted form, if
118//! conditional. The list of labels that are bound to the start-offset of this
119//! branch is *complete* (if any label has a resolved offset equal to `start`
120//! and is not an alias, it must appear in this list) and *precise* (no label
121//! in this list can be bound to another offset). No label in this list should
122//! be an alias. No two branch ranges can overlap, and branches are in
123//! ascending-offset order.
124//!
125//! - The labels-at-tail list. This contains all MachLabels that have been bound
126//! to (whose resolved offsets are equal to) the tail offset of the buffer.
127//! No label in this list should be an alias.
128//!
129//! - The label_offsets array, containing the bound offset of a label or
130//! UNKNOWN. No label can be bound at an offset greater than the current
131//! buffer tail.
132//!
133//! - The label_aliases array, containing another label to which a label is
134//! bound or UNKNOWN. A label's resolved offset is the resolved offset
135//! of the label it is aliased to, if this is set.
136//!
137//! We argue below, at each method, how the invariants in these data structures
138//! are maintained (grep for "Post-invariant").
139//!
140//! Given these invariants, we argue why each optimization preserves execution
141//! semantics below (grep for "Preserves execution semantics").
142//!
143//! # Avoiding Quadratic Behavior
144//!
145//! There are two cases where we've had to take some care to avoid
146//! quadratic worst-case behavior:
147//!
148//! - The "labels at this branch" list can grow unboundedly if the
149//! code generator binds many labels at one location. If the count
150//! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
151//! simply abort an optimization early in a way that is always correct
152//! but is conservative.
153//!
154//! - The fixup list can interact with island emission to create
155//! "quadratic island behavior". In a little more detail, one can hit
156//! this behavior by having some pending fixups (forward label
157//! references) with long-range label-use kinds, and some others
158//! with shorter-range references that nonetheless still are pending
159//! long enough to trigger island generation. In such a case, we
160//! process the fixup list, generate veneers to extend some forward
161//! references' ranges, but leave the other (longer-range) ones
162//! alone. The way this was implemented put them back on a list and
163//! resulted in quadratic behavior.
164//!
165//! To avoid this, fixups are split into two lists: one "pending" list and one
166//! final list. The pending list is kept around for handling fixups related to
167//! branches so it can be edited/truncated. When an island is reached, which
168//! starts processing fixups, all pending fixups are flushed into the final
169//! list. The final list is a `BinaryHeap` which enables fixup processing to
170//! only process those which are required during island emission, deferring
171//! all longer-range fixups to later.
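//!
//! A minimal sketch of that flow, using the field and method names defined in
//! this file (details elided):
//!
//! ```ignore
//! // While emitting branches, new fixups go onto the editable pending list.
//! self.pending_fixup_records.push(fixup);
//!
//! // At island time, pending fixups are flushed into the deadline-ordered
//! // heap; only those whose deadline forces action now are processed.
//! for fixup in mem::take(&mut self.pending_fixup_records) {
//!     if self.should_apply_fixup(&fixup, forced_threshold) {
//!         self.handle_fixup(fixup, force_veneers, forced_threshold);
//!     } else {
//!         self.fixup_records.push(fixup);
//!     }
//! }
//! ```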
172
173use crate::binemit::{Addend, CodeOffset, Reloc};
174use crate::ir::function::FunctionParameters;
175use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
176use crate::isa::unwind::UnwindInst;
177use crate::machinst::{
178 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
179};
180use crate::trace;
181use crate::{MachInstEmitState, ir};
182use crate::{VCodeConstantData, timing};
183use core::ops::Range;
184use cranelift_control::ControlPlane;
185use cranelift_entity::{PrimaryMap, entity_impl};
186use smallvec::SmallVec;
187use std::cmp::Ordering;
188use std::collections::BinaryHeap;
189use std::mem;
190use std::string::String;
191use std::vec::Vec;
192
193#[cfg(feature = "enable-serde")]
194use serde::{Deserialize, Serialize};
195
196#[cfg(feature = "enable-serde")]
197pub trait CompilePhase {
198 type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
199 type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
200}
201
202#[cfg(not(feature = "enable-serde"))]
203pub trait CompilePhase {
204 type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
205 type SourceLocType: core::fmt::Debug + PartialEq + Clone;
206}
207
208/// Status of a compiled artifact that needs patching before being used.
209#[derive(Clone, Debug, PartialEq)]
210#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
211pub struct Stencil;
212
213/// Status of a compiled artifact ready to use.
214#[derive(Clone, Debug, PartialEq)]
215pub struct Final;
216
217impl CompilePhase for Stencil {
218 type MachSrcLocType = MachSrcLoc<Stencil>;
219 type SourceLocType = RelSourceLoc;
220}
221
222impl CompilePhase for Final {
223 type MachSrcLocType = MachSrcLoc<Final>;
224 type SourceLocType = SourceLoc;
225}
226
227#[derive(Clone, Copy, Debug, PartialEq, Eq)]
228enum ForceVeneers {
229 Yes,
230 No,
231}
232
233/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
234/// in bulk.
235///
236/// This struct uses `SmallVec`s to support small-ish function bodies without
237/// any heap allocation. As such, it will be several kilobytes large. This is
238/// likely fine as long as it is stack-allocated for function emission then
239/// thrown away; but beware if many buffer objects are retained persistently.
240pub struct MachBuffer<I: VCodeInst> {
241 /// The buffer contents, as raw bytes.
242 data: SmallVec<[u8; 1024]>,
243 /// The required alignment of this buffer.
244 min_alignment: u32,
245 /// Any relocations referring to this code. Note that only *external*
246 /// relocations are tracked here; references to labels within the buffer are
247 /// resolved before emission.
248 relocs: SmallVec<[MachReloc; 16]>,
249 /// Any trap records referring to this code.
250 traps: SmallVec<[MachTrap; 16]>,
251 /// Any call site records referring to this code.
252 call_sites: SmallVec<[MachCallSite; 16]>,
253 /// Any exception-handler records referred to at call sites.
254 exception_handlers: SmallVec<[MachExceptionHandler; 16]>,
255 /// Any source location mappings referring to this code.
256 srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
257 /// Any user stack maps for this code.
258 ///
259 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
260 /// by code offset, and each stack map covers `span` bytes on the stack.
261 user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
262 /// Any unwind info at a given location.
263 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
264 /// The current source location in progress (after `start_srcloc()` and
265 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
266 cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
267 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
268 label_offsets: SmallVec<[CodeOffset; 16]>,
269 /// Label aliases: when one label points to an unconditional jump, and that
270 /// jump points to another label, we can redirect references to the first
271 /// label immediately to the second.
272 ///
273 /// Invariant: we don't have label-alias cycles. We ensure this by,
274 /// before setting label A to alias label B, resolving B's alias
275 /// target (iteratively until a non-aliased label); if B is already
276 /// aliased to A, then we cannot alias A back to B.
277 label_aliases: SmallVec<[MachLabel; 16]>,
278 /// Constants that must be emitted at some point.
279 pending_constants: SmallVec<[VCodeConstant; 16]>,
280 /// Byte size of all constants in `pending_constants`.
281 pending_constants_size: CodeOffset,
282 /// Traps that must be emitted at some point.
283 pending_traps: SmallVec<[MachLabelTrap; 16]>,
284 /// Fixups that haven't yet been flushed into `fixup_records` below and may
285 /// be related to branches that are chomped. These all get added to
286 /// `fixup_records` during island emission.
287 pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
288 /// The nearest upcoming deadline for entries in `pending_fixup_records`.
289 pending_fixup_deadline: CodeOffset,
290 /// Fixups that must be performed after all code is emitted.
291 fixup_records: BinaryHeap<MachLabelFixup<I>>,
292 /// Latest branches, to facilitate in-place editing for better fallthrough
293 /// behavior and empty-block removal.
294 latest_branches: SmallVec<[MachBranch; 4]>,
295 /// All labels at the current offset (emission tail). This is lazily
296 /// cleared: it is actually accurate as long as the current offset is
297 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
298 /// be considered as empty.
299 ///
300 /// For correctness, this *must* be complete (i.e., the vector must contain
301 /// all labels whose offsets are resolved to the current tail), because we
302 /// rely on it to update labels when we truncate branches.
303 labels_at_tail: SmallVec<[MachLabel; 4]>,
304 /// The last offset at which `labels_at_tail` is valid. It is conceptually
305 /// always describing the tail of the buffer, but we do not clear
306 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
307 /// when the offset has grown past this (`labels_at_tail_off`) point.
308 /// Always <= `cur_offset()`.
309 labels_at_tail_off: CodeOffset,
310 /// Metadata about all constants that this function has access to.
311 ///
312 /// This records the size/alignment of all constants (not the actual data)
313 /// along with the last available label generated for the constant. This map
314 /// is consulted when constants are referred to, and the label assigned to a
315 /// constant may change over time as well.
316 constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
317 /// All recorded usages of constants as pairs of the constant and where the
318 /// constant needs to be placed within `self.data`. Note that the same
319 /// constant may appear in this array multiple times if it was emitted
320 /// multiple times.
321 used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
322 /// Indicates when a patchable region is currently open, to guard that it's
323 /// not possible to nest patchable regions.
324 open_patchable: bool,
325}
326
327impl MachBufferFinalized<Stencil> {
328 /// Get a finalized machine buffer by applying the function's base source location.
329 pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
330 MachBufferFinalized {
331 data: self.data,
332 relocs: self.relocs,
333 traps: self.traps,
334 call_sites: self.call_sites,
335 exception_handlers: self.exception_handlers,
336 srclocs: self
337 .srclocs
338 .into_iter()
339 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
340 .collect(),
341 user_stack_maps: self.user_stack_maps,
342 unwind_info: self.unwind_info,
343 alignment: self.alignment,
344 }
345 }
346}
347
348/// A `MachBuffer` once emission is completed: holds generated code and records,
349/// without fixups. This allows the type to be independent of the backend.
350#[derive(PartialEq, Debug, Clone)]
351#[cfg_attr(
352 feature = "enable-serde",
353 derive(serde_derive::Serialize, serde_derive::Deserialize)
354)]
355pub struct MachBufferFinalized<T: CompilePhase> {
356 /// The buffer contents, as raw bytes.
357 pub(crate) data: SmallVec<[u8; 1024]>,
358 /// Any relocations referring to this code. Note that only *external*
359 /// relocations are tracked here; references to labels within the buffer are
360 /// resolved before emission.
361 pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
362 /// Any trap records referring to this code.
363 pub(crate) traps: SmallVec<[MachTrap; 16]>,
364 /// Any call site records referring to this code.
365 pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
366 /// Any exception-handler records referred to at call sites.
367 pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,
368 /// Any source location mappings referring to this code.
369 pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
370 /// Any user stack maps for this code.
371 ///
372 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
373 /// by code offset, and each stack map covers `span` bytes on the stack.
374 pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
375 /// Any unwind info at a given location.
376 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
377 /// The required alignment of this buffer.
378 pub alignment: u32,
379}
380
381const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
382const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
383
384/// Threshold on max length of `labels_at_this_branch` list to avoid
385/// unbounded quadratic behavior (see comment below at use-site).
386const LABEL_LIST_THRESHOLD: usize = 100;
387
388/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
389/// the point at which it is used by emitted code; the buffer records "fixups"
390/// for references to the label, and will come back and patch the code
391/// appropriately when the label's location is eventually known.
392#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
393pub struct MachLabel(u32);
394entity_impl!(MachLabel);
395
396impl MachLabel {
397 /// Get a label for a block. (The first N MachLabels are always reserved for
398 /// the N blocks in the vcode.)
399 pub fn from_block(bindex: BlockIndex) -> MachLabel {
400 MachLabel(bindex.index() as u32)
401 }
402
403 /// Creates a string representing this label, for convenience.
404 pub fn to_string(&self) -> String {
405 format!("label{}", self.0)
406 }
407}
408
409impl Default for MachLabel {
410 fn default() -> Self {
411 UNKNOWN_LABEL
412 }
413}
414
415/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
416/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
417/// the [`OpenPatchRegion`] token in the process.
418pub struct OpenPatchRegion(usize);
419
420/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
421/// of where you might want to use this is for patching instructions that mention constants that
422/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
423/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
424/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
425/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
426/// bytes, and the constant uses can be updated directly.
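///
/// A minimal sketch of that flow; the placeholder bytes and the final
/// constant value (`real_value`) are illustrative only:
///
/// ```ignore
/// let open = buffer.start_patchable();
/// // ... emit instructions that embed a placeholder constant ...
/// buffer.put4(0xdead_beef); // placeholder immediate
/// let region = buffer.end_patchable(open);
///
/// // Later, once the real constant is known, rewrite the bytes in place.
/// let bytes = region.patch(&mut buffer);
/// bytes[0..4].copy_from_slice(&real_value.to_le_bytes());
/// ```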
427pub struct PatchRegion {
428 range: Range<usize>,
429}
430
431impl PatchRegion {
432 /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
433 pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
434 &mut buffer.data[self.range]
435 }
436}
437
438impl<I: VCodeInst> MachBuffer<I> {
439 /// Create a new, empty `MachBuffer`.
441 pub fn new() -> MachBuffer<I> {
442 MachBuffer {
443 data: SmallVec::new(),
444 min_alignment: I::function_alignment().minimum,
445 relocs: SmallVec::new(),
446 traps: SmallVec::new(),
447 call_sites: SmallVec::new(),
448 exception_handlers: SmallVec::new(),
449 srclocs: SmallVec::new(),
450 user_stack_maps: SmallVec::new(),
451 unwind_info: SmallVec::new(),
452 cur_srcloc: None,
453 label_offsets: SmallVec::new(),
454 label_aliases: SmallVec::new(),
455 pending_constants: SmallVec::new(),
456 pending_constants_size: 0,
457 pending_traps: SmallVec::new(),
458 pending_fixup_records: SmallVec::new(),
459 pending_fixup_deadline: u32::MAX,
460 fixup_records: Default::default(),
461 latest_branches: SmallVec::new(),
462 labels_at_tail: SmallVec::new(),
463 labels_at_tail_off: 0,
464 constants: Default::default(),
465 used_constants: Default::default(),
466 open_patchable: false,
467 }
468 }
469
470 /// Current offset from start of buffer.
471 pub fn cur_offset(&self) -> CodeOffset {
472 self.data.len() as CodeOffset
473 }
474
475 /// Add a byte.
476 pub fn put1(&mut self, value: u8) {
477 self.data.push(value);
478
479 // Post-invariant: conceptual-labels_at_tail contains a complete and
480 // precise list of labels bound at `cur_offset()`. We have advanced
481 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
482 // before, it is not anymore (and it cannot become equal, because
483 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
484 // conceptually empty (even though it is only lazily cleared). No labels
485 // can be bound at this new offset (by invariant on `label_offsets`).
486 // Hence the invariant holds.
487 }
488
489 /// Add 2 bytes.
490 pub fn put2(&mut self, value: u16) {
491 let bytes = value.to_le_bytes();
492 self.data.extend_from_slice(&bytes[..]);
493
494 // Post-invariant: as for `put1()`.
495 }
496
497 /// Add 4 bytes.
498 pub fn put4(&mut self, value: u32) {
499 let bytes = value.to_le_bytes();
500 self.data.extend_from_slice(&bytes[..]);
501
502 // Post-invariant: as for `put1()`.
503 }
504
505 /// Add 8 bytes.
506 pub fn put8(&mut self, value: u64) {
507 let bytes = value.to_le_bytes();
508 self.data.extend_from_slice(&bytes[..]);
509
510 // Post-invariant: as for `put1()`.
511 }
512
513 /// Add a slice of bytes.
514 pub fn put_data(&mut self, data: &[u8]) {
515 self.data.extend_from_slice(data);
516
517 // Post-invariant: as for `put1()`.
518 }
519
520 /// Reserve appended space and return a mutable slice referring to it.
521 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
522 let off = self.data.len();
523 let new_len = self.data.len() + len;
524 self.data.resize(new_len, 0);
525 &mut self.data[off..]
526
527 // Post-invariant: as for `put1()`.
528 }
529
530 /// Align up to the given alignment.
531 pub fn align_to(&mut self, align_to: CodeOffset) {
532 trace!("MachBuffer: align to {}", align_to);
533 assert!(
534 align_to.is_power_of_two(),
535 "{align_to} is not a power of two"
536 );
537 while self.cur_offset() & (align_to - 1) != 0 {
538 self.put1(0);
539 }
540
541 // Post-invariant: as for `put1()`.
542 }
543
544 /// Begin a region of patchable code. There is one requirement for the
545 /// code that is emitted: It must not introduce any instructions that
546 /// could be chomped (branches are an example of this). In other words,
547 /// you must not call [`MachBuffer::add_cond_branch`] or
548 /// [`MachBuffer::add_uncond_branch`] between calls to this method and
549 /// [`MachBuffer::end_patchable`].
550 pub fn start_patchable(&mut self) -> OpenPatchRegion {
551 assert!(!self.open_patchable, "Patchable regions may not be nested");
552 self.open_patchable = true;
553 OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
554 }
555
556 /// End a region of patchable code, yielding a [`PatchRegion`] value that
557 /// can be consumed later to produce a one-off mutable slice to the
558 /// associated region of the data buffer.
559 pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
560 // No need to assert the state of `open_patchable` here, as we take
561 // ownership of the only `OpenPatchable` value.
562 self.open_patchable = false;
563 let end = usize::try_from(self.cur_offset()).unwrap();
564 PatchRegion { range: open.0..end }
565 }
566
567 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
568 /// offset yet.
569 pub fn get_label(&mut self) -> MachLabel {
570 let l = self.label_offsets.len() as u32;
571 self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
572 self.label_aliases.push(UNKNOWN_LABEL);
573 trace!("MachBuffer: new label -> {:?}", MachLabel(l));
574 MachLabel(l)
575
576 // Post-invariant: the only mutation is to add a new label; it has no
577 // bound offset yet, so it trivially satisfies all invariants.
578 }
579
580 /// Reserve the first N MachLabels for blocks.
581 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
582 trace!("MachBuffer: first {} labels are for blocks", blocks);
583 debug_assert!(self.label_offsets.is_empty());
584 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
585 self.label_aliases.resize(blocks, UNKNOWN_LABEL);
586
587 // Post-invariant: as for `get_label()`.
588 }
589
590 /// Registers metadata in this `MachBuffer` about the `constants` provided.
591 ///
592 /// This will record the size/alignment of all constants which will prepare
593 /// them for emission later on.
594 pub fn register_constants(&mut self, constants: &VCodeConstants) {
595 for (c, val) in constants.iter() {
596 self.register_constant(&c, val);
597 }
598 }
599
600 /// Similar to [`MachBuffer::register_constants`] but registers a
601 /// single constant metadata. This function is useful in
602 /// situations where not all constants are known at the time of
603 /// emission.
604 pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
605 let c2 = self.constants.push(MachBufferConstant {
606 upcoming_label: None,
607 align: data.alignment(),
608 size: data.as_slice().len(),
609 });
610 assert_eq!(*constant, c2);
611 }
612
613 /// Completes constant emission by iterating over `self.used_constants` and
614 /// filling in the "holes" with the constant values provided by `constants`.
615 ///
616 /// Returns the alignment required for this entire buffer. Alignment starts
617 /// at the ISA's minimum function alignment and can be increased due to
618 /// constant requirements.
619 fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
620 let mut alignment = self.min_alignment;
621 for (constant, offset) in mem::take(&mut self.used_constants) {
622 let constant = constants.get(constant);
623 let data = constant.as_slice();
624 self.data[offset as usize..][..data.len()].copy_from_slice(data);
625 alignment = constant.alignment().max(alignment);
626 }
627 alignment
628 }
629
630 /// Returns a label that can be used to refer to the `constant` provided.
631 ///
632 /// This will automatically defer a new constant to be emitted for
633 /// `constant` if it has not been previously emitted. Note that this
634 /// function may return a different label for the same constant at
635 /// different points in time. The label is valid to use only from the
636 /// current location; the MachBuffer takes care to emit the same constant
637 /// multiple times if needed so the constant is always in range.
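    ///
    /// A minimal usage sketch; the label-use kind named here is a placeholder
    /// for whatever PC-relative load form the backend actually uses:
    ///
    /// ```ignore
    /// let label = buffer.get_label_for_constant(constant);
    /// // Record the fixup for the PC-relative reference in the load we are
    /// // about to emit; island emission keeps the constant in range.
    /// buffer.use_label_at_offset(buffer.cur_offset(), label, I::LabelUse::PcRelLoad);
    /// // ... emit the load instruction bytes ...
    /// ```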
638 pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
639 let MachBufferConstant {
640 align,
641 size,
642 upcoming_label,
643 } = self.constants[constant];
644 if let Some(label) = upcoming_label {
645 return label;
646 }
647
648 let label = self.get_label();
649 trace!(
650 "defer constant: eventually emit {size} bytes aligned \
651 to {align} at label {label:?}",
652 );
653 self.pending_constants.push(constant);
654 self.pending_constants_size += size as u32;
655 self.constants[constant].upcoming_label = Some(label);
656 label
657 }
658
659 /// Bind a label to the current offset. A label can only be bound once.
660 pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
661 trace!(
662 "MachBuffer: bind label {:?} at offset {}",
663 label,
664 self.cur_offset()
665 );
666 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
667 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
668 let offset = self.cur_offset();
669 self.label_offsets[label.0 as usize] = offset;
670 self.lazily_clear_labels_at_tail();
671 self.labels_at_tail.push(label);
672
673 // Invariants hold: bound offset of label is <= cur_offset (in fact it
674 // is equal). If the `labels_at_tail` list was complete and precise
675 // before, it is still, because we have bound this label to the current
676 // offset and added it to the list (which contains all labels at the
677 // current offset).
678
679 self.optimize_branches(ctrl_plane);
680
681 // Post-invariant: by `optimize_branches()` (see argument there).
682 }
683
684 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
685 /// offset that it applies to.
686 fn lazily_clear_labels_at_tail(&mut self) {
687 let offset = self.cur_offset();
688 if offset > self.labels_at_tail_off {
689 self.labels_at_tail_off = offset;
690 self.labels_at_tail.clear();
691 }
692
693 // Post-invariant: either labels_at_tail_off was at cur_offset, and
694 // state is untouched, or was less than cur_offset, in which case the
695 // labels_at_tail list was conceptually empty, and is now actually
696 // empty.
697 }
698
699 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
700 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
701 let mut iters = 0;
702 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
703 label = self.label_aliases[label.0 as usize];
704 // To protect against an infinite loop (despite our assurances to
705 // ourselves that the invariants make this impossible), assert out
706 // after 1M iterations. The number of basic blocks is limited
707 // in most contexts anyway so this should be impossible to hit with
708 // a legitimate input.
709 iters += 1;
710 assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
711 }
712 self.label_offsets[label.0 as usize]
713
714 // Post-invariant: no mutations.
715 }
716
717 /// Emit a reference to the given label with the given reference type (i.e.,
718 /// branch-instruction format) at the current offset. This is like a
719 /// relocation, but handled internally.
720 ///
721 /// This can be called before the branch is actually emitted; fixups will
722 /// not happen until an island is emitted or the buffer is finished.
723 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
724 trace!(
725 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
726 offset, label, kind
727 );
728
729 // Add the fixup, and update the worst-case island size based on a
730 // veneer for this label use.
731 let fixup = MachLabelFixup {
732 label,
733 offset,
734 kind,
735 };
736 self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
737 self.pending_fixup_records.push(fixup);
738
739 // Post-invariant: no mutations to branches/labels data structures.
740 }
741
742 /// Inform the buffer of an unconditional branch at the given offset,
743 /// targeting the given label. May be used to optimize branches.
744 /// The last added label-use must correspond to this branch.
745 /// This must be called when the current offset is equal to `start`; i.e.,
746 /// before actually emitting the branch. This implies that for a branch that
747 /// uses a label and is eligible for optimizations by the MachBuffer, the
748 /// proper sequence is:
749 ///
750 /// - Call `use_label_at_offset()` to emit the fixup record.
751 /// - Call `add_uncond_branch()` to make note of the branch.
752 /// - Emit the bytes for the branch's machine code.
753 ///
754 /// Additional requirement: no labels may be bound between `start` and `end`
755 /// (exclusive on both ends).
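    ///
    /// A sketch of that sequence for a hypothetical 4-byte unconditional
    /// branch (the encoding and label-use kind are placeholders):
    ///
    /// ```ignore
    /// let start = buffer.cur_offset();
    /// buffer.use_label_at_offset(start, target, I::LabelUse::Branch26);
    /// buffer.add_uncond_branch(start, start + 4, target);
    /// buffer.put4(0x0000_0000); // placeholder encoding of the branch
    /// ```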
756 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
757 debug_assert!(
758 !self.open_patchable,
759 "Branch instruction inserted within a patchable region"
760 );
761 assert!(self.cur_offset() == start);
762 debug_assert!(end > start);
763 assert!(!self.pending_fixup_records.is_empty());
764 let fixup = self.pending_fixup_records.len() - 1;
765 self.lazily_clear_labels_at_tail();
766 self.latest_branches.push(MachBranch {
767 start,
768 end,
769 target,
770 fixup,
771 inverted: None,
772 labels_at_this_branch: self.labels_at_tail.clone(),
773 });
774
775 // Post-invariant: we asserted branch start is current tail; the list of
776 // labels at branch is cloned from list of labels at current tail.
777 }
778
779 /// Inform the buffer of a conditional branch at the given offset,
780 /// targeting the given label. May be used to optimize branches.
781 /// The last added label-use must correspond to this branch.
782 ///
783 /// Additional requirement: no labels may be bound between `start` and `end`
784 /// (exclusive on both ends).
785 pub fn add_cond_branch(
786 &mut self,
787 start: CodeOffset,
788 end: CodeOffset,
789 target: MachLabel,
790 inverted: &[u8],
791 ) {
792 debug_assert!(
793 !self.open_patchable,
794 "Branch instruction inserted within a patchable region"
795 );
796 assert!(self.cur_offset() == start);
797 debug_assert!(end > start);
798 assert!(!self.pending_fixup_records.is_empty());
799 debug_assert!(
800 inverted.len() == (end - start) as usize,
801 "branch length = {}, but inverted length = {}",
802 end - start,
803 inverted.len()
804 );
805 let fixup = self.pending_fixup_records.len() - 1;
806 let inverted = Some(SmallVec::from(inverted));
807 self.lazily_clear_labels_at_tail();
808 self.latest_branches.push(MachBranch {
809 start,
810 end,
811 target,
812 fixup,
813 inverted,
814 labels_at_this_branch: self.labels_at_tail.clone(),
815 });
816
817 // Post-invariant: we asserted branch start is current tail; labels at
818 // branch list is cloned from list of labels at current tail.
819 }
820
821 fn truncate_last_branch(&mut self) {
822 debug_assert!(
823 !self.open_patchable,
824 "Branch instruction truncated within a patchable region"
825 );
826
827 self.lazily_clear_labels_at_tail();
828 // Invariants hold at this point.
829
830 let b = self.latest_branches.pop().unwrap();
831 assert!(b.end == self.cur_offset());
832
833 // State:
834 // [PRE CODE]
835 // Offset b.start, b.labels_at_this_branch:
836 // [BRANCH CODE]
837 // cur_off, self.labels_at_tail -->
838 // (end of buffer)
839 self.data.truncate(b.start as usize);
840 self.pending_fixup_records.truncate(b.fixup);
841 while let Some(last_srcloc) = self.srclocs.last_mut() {
842 if last_srcloc.end <= b.start {
843 break;
844 }
845 if last_srcloc.start < b.start {
846 last_srcloc.end = b.start;
847 break;
848 }
849 self.srclocs.pop();
850 }
851 // State:
852 // [PRE CODE]
853 // cur_off, Offset b.start, b.labels_at_this_branch:
854 // (end of buffer)
855 //
856 // self.labels_at_tail --> (past end of buffer)
857 let cur_off = self.cur_offset();
858 self.labels_at_tail_off = cur_off;
859 // State:
860 // [PRE CODE]
861 // cur_off, Offset b.start, b.labels_at_this_branch,
862 // self.labels_at_tail:
863 // (end of buffer)
864 //
865 // resolve_label_offset(l) for l in labels_at_tail:
866 // (past end of buffer)
867
868 trace!(
869 "truncate_last_branch: truncated {:?}; off now {}",
870 b, cur_off
871 );
872
873 // Fix up resolved label offsets for labels at tail.
874 for &l in &self.labels_at_tail {
875 self.label_offsets[l.0 as usize] = cur_off;
876 }
877 // Old labels_at_this_branch are now at cur_off.
878 self.labels_at_tail.extend(b.labels_at_this_branch);
879
880 // Post-invariant: this operation is defined to truncate the buffer,
881 // which moves cur_off backward, and to move labels at the end of the
882 // buffer back to the start-of-branch offset.
883 //
884 // latest_branches satisfies all invariants:
885 // - it has no branches past the end of the buffer (branches are in
886 // order, we removed the last one, and we truncated the buffer to just
887 // before the start of that branch)
888 // - no labels were moved to lower offsets than the (new) cur_off, so
889 // the labels_at_this_branch list for any other branch need not change.
890 //
891 // labels_at_tail satisfies all invariants:
892 // - all labels that were at the tail after the truncated branch are
893 // moved backward to just before the branch, which becomes the new tail;
894 // thus every element in the list should remain (ensured by `.extend()`
895 // above).
896 // - all labels that refer to the new tail, which is the start-offset of
897 // the truncated branch, must be present. The `labels_at_this_branch`
898 // list in the truncated branch's record is a complete and precise list
899 // of exactly these labels; we append these to labels_at_tail.
900 // - labels_at_tail_off is at cur_off after truncation occurs, so the
901 // list is valid (not to be lazily cleared).
902 //
903 // The stated operation was performed:
904 // - For each label at the end of the buffer prior to this method, it
905 // now resolves to the new (truncated) end of the buffer: it must have
906 // been in `labels_at_tail` (this list is precise and complete, and
907 // the tail was at the end of the truncated branch on entry), and we
908 // iterate over this list and set `label_offsets` to the new tail.
909 // None of these labels could have been an alias (by invariant), so
910 // `label_offsets` is authoritative for each.
911 // - No other labels will be past the end of the buffer, because of the
912 // requirement that no labels be bound to the middle of branch ranges
913 // (see comments to `add_{cond,uncond}_branch()`).
914 // - The buffer is truncated to just before the last branch, and the
915 // fixup record referring to that last branch is removed.
916 }
917
918 /// Performs various optimizations on branches pointing at the current label.
919 pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
920 if ctrl_plane.get_decision() {
921 return;
922 }
923
924 self.lazily_clear_labels_at_tail();
925 // Invariants valid at this point.
926
927 trace!(
928 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
929 self.latest_branches, self.labels_at_tail, self.pending_fixup_records
930 );
931
932 // We continue to munch on branches at the tail of the buffer until no
933 // more rules apply. Note that the loop only continues if a branch is
934 // actually truncated (or if labels are redirected away from a branch),
935 // so this always makes progress.
936 while let Some(b) = self.latest_branches.last() {
937 let cur_off = self.cur_offset();
938 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
939 // If there has been any code emission since the end of the last branch or
940 // label definition, then there's nothing we can edit (because we
941 // don't move code once placed, only back up and overwrite), so
942 // clear the records and finish.
943 if b.end < cur_off {
944 break;
945 }
946
947 // If the "labels at this branch" list on this branch is
948 // longer than a threshold, don't do any simplification,
949 // and let the branch remain to separate those labels from
950 // the current tail. This avoids quadratic behavior (see
951 // #3468): otherwise, if a long string of "goto next;
952 // next:" patterns are emitted, all of the labels will
953 // coalesce into a long list of aliases for the current
954 // buffer tail. We must track all aliases of the current
955 // tail for correctness, but we are also allowed to skip
956 // optimization (removal) of any branch, so we take the
957 // escape hatch here and let it stand. In effect this
958 // "spreads" the many thousands of labels in the
959 // pathological case among an actual (harmless but
960 // suboptimal) instruction once per N labels.
961 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
962 break;
963 }
964
965 // Invariant: we are looking at a branch that ends at the tail of
966 // the buffer.
967
968 // For any branch, conditional or unconditional:
969 // - If the target is a label at the current offset, then remove
970 // the conditional branch, and reset all labels that targeted
971 // the current offset (end of branch) to the truncated
972 // end-of-code.
973 //
974 // Preserves execution semantics: a branch to its own fallthrough
975 // address is equivalent to a no-op; in both cases, nextPC is the
976 // fallthrough.
977 if self.resolve_label_offset(b.target) == cur_off {
978 trace!("branch with target == cur off; truncating");
979 self.truncate_last_branch();
980 continue;
981 }
982
983 // If latest is an unconditional branch:
984 //
985 // - If the branch's target is not its own start address, then for
986 // each label at the start of branch, make the label an alias of the
987 // branch target, and remove the label from the "labels at this
988 // branch" list.
989 //
990 // - Preserves execution semantics: an unconditional branch's
991 // only effect is to set PC to a new PC; this change simply
992 // collapses one step in the step-semantics.
993 //
994 // - Post-invariant: the labels that were bound to the start of
995 // this branch become aliases, so they must not be present in any
996 // labels-at-this-branch list or the labels-at-tail list. The
997 // labels are removed from the latest-branch record's
998 // labels-at-this-branch list, and are never placed in the
999 // labels-at-tail list. Furthermore, it is correct that they are
1000 // not in either list, because they are now aliases, and labels
1001 // that are aliases remain aliases forever.
1002 //
1003 // - If there is a prior unconditional branch that ends just before
1004 // this one begins, and this branch has no labels bound to its
1005 // start, then we can truncate this branch, because it is entirely
1006 // unreachable (we have redirected all labels that make it
1007 // reachable otherwise). Do so and continue around the loop.
1008 //
1009 // - Preserves execution semantics: the branch is unreachable,
1010 // because execution can only flow into an instruction from the
1011 // prior instruction's fallthrough or from a branch bound to that
1012 // instruction's start offset. Unconditional branches have no
1013 // fallthrough, so if the prior instruction is an unconditional
1014 // branch, no fallthrough entry can happen. The
1015 // labels-at-this-branch list is complete (by invariant), so if it
1016 // is empty, then the instruction is entirely unreachable. Thus,
1017 // it can be removed.
1018 //
1019 // - Post-invariant: ensured by truncate_last_branch().
1020 //
1021 // - If there is a prior conditional branch whose target label
1022 // resolves to the current offset (branches around the
1023 // unconditional branch), then remove the unconditional branch,
1024 // and make the target of the unconditional the target of the
1025 // conditional instead.
1026 //
1027 // - Preserves execution semantics: previously we had:
1028 //
1029 // L1:
1030 // cond_br L2
1031 // br L3
1032 // L2:
1033 // (end of buffer)
1034 //
1035 // by removing the last branch, we have:
1036 //
1037 // L1:
1038 // cond_br L2
1039 // L2:
1040 // (end of buffer)
1041 //
1042 // we then fix up the records for the conditional branch to
1043 // have:
1044 //
1045 // L1:
1046 // cond_br.inverted L3
1047 // L2:
1048 //
1049 // In the original code, control flow reaches L2 when the
1050 // conditional branch's predicate is true, and L3 otherwise. In
1051 // the optimized code, the same is true.
1052 //
1053 // - Post-invariant: all edits to latest_branches and
1054 // labels_at_tail are performed by `truncate_last_branch()`,
1055 // which maintains the invariants at each step.
1056
1057 if b.is_uncond() {
1058 // Set any label equal to current branch's start as an alias of
1059 // the branch's target, if the target is not the branch itself
1060 // (i.e., an infinite loop).
1061 //
1062 // We cannot perform this aliasing if the target of this branch
1063 // ultimately aliases back here; if so, we need to keep this
1064 // branch, so break out of this loop entirely (and clear the
1065 // latest-branches list below).
1066 //
1067 // Note that this check is what prevents cycles from forming in
1068 // `self.label_aliases`. To see why, consider an arbitrary start
1069 // state:
1070 //
1071 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1072 // Ln, which is not aliased.
1073 //
1074 // We would create a cycle if we assigned label_aliases[Ln]
1075 // = L1. Note that the below assignment is the only write
1076 // to label_aliases.
1077 //
1078 // By our other invariants, we have that Ln (`l` below)
1079 // resolves to the offset `b.start`, because it is in the
1080 // set `b.labels_at_this_branch`.
1081 //
1082 // If L1 were already aliased, through some arbitrarily deep
1083 // chain, to Ln, then it must also resolve to this offset
1084 // `b.start`.
1085 //
1086 // By checking the resolution of `L1` against this offset,
1087 // and aborting this branch-simplification if they are
1088 // equal, we prevent the below assignment from ever creating
1089 // a cycle.
1090 if self.resolve_label_offset(b.target) != b.start {
1091 let redirected = b.labels_at_this_branch.len();
1092 for &l in &b.labels_at_this_branch {
1093 trace!(
1094 " -> label at start of branch {:?} redirected to target {:?}",
1095 l, b.target
1096 );
1097 self.label_aliases[l.0 as usize] = b.target;
1098 // NOTE: we continue to ensure the invariant that labels
1099 // pointing to tail of buffer are in `labels_at_tail`
1100 // because we already ensured above that the last branch
1101 // cannot have a target of `cur_off`; so we never have
1102 // to put the label into `labels_at_tail` when moving it
1103 // here.
1104 }
1105 // Maintain invariant: all branches have been redirected
1106 // and are no longer pointing at the start of this branch.
1107 let mut_b = self.latest_branches.last_mut().unwrap();
1108 mut_b.labels_at_this_branch.clear();
1109
1110 if redirected > 0 {
1111 trace!(" -> after label redirects, restarting loop");
1112 continue;
1113 }
1114 } else {
1115 break;
1116 }
1117
1118 let b = self.latest_branches.last().unwrap();
1119
1120 // Examine any immediately preceding branch.
1121 if self.latest_branches.len() > 1 {
1122 let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1123 trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1124 // This uncond is immediately after another uncond; we
1125 // should have already redirected labels to this uncond away
1126 // (but check to be sure); so we can truncate this uncond.
1127 if prev_b.is_uncond()
1128 && prev_b.end == b.start
1129 && b.labels_at_this_branch.is_empty()
1130 {
1131 trace!(" -> uncond follows another uncond; truncating");
1132 self.truncate_last_branch();
1133 continue;
1134 }
1135
1136 // This uncond is immediately after a conditional, and the
1137 // conditional's target is the end of this uncond, and we've
1138 // already redirected labels to this uncond away; so we can
1139 // truncate this uncond, flip the sense of the conditional, and
1140 // set the conditional's target (in `latest_branches` and in
1141 // `fixup_records`) to the uncond's target.
1142 if prev_b.is_cond()
1143 && prev_b.end == b.start
1144 && self.resolve_label_offset(prev_b.target) == cur_off
1145 {
1146 trace!(
1147 " -> uncond follows a conditional, and conditional's target resolves to current offset"
1148 );
1149 // Save the target of the uncond (this becomes the
1150 // target of the cond), and truncate the uncond.
1151 let target = b.target;
1152 let data = prev_b.inverted.clone().unwrap();
1153 self.truncate_last_branch();
1154
1155 // Mutate the code and cond branch.
1156 let off_before_edit = self.cur_offset();
1157 let prev_b = self.latest_branches.last_mut().unwrap();
1158 let not_inverted = SmallVec::from(
1159 &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1160 );
1161
1162 // Low-level edit: replaces bytes of branch with
1163 // inverted form. cur_off remains the same afterward, so
1164 // we do not need to modify label data structures.
1165 self.data.truncate(prev_b.start as usize);
1166 self.data.extend_from_slice(&data[..]);
1167
1168 // Save the original code as the inversion of the
1169 // inverted branch, in case we later edit this branch
1170 // again.
1171 prev_b.inverted = Some(not_inverted);
1172 self.pending_fixup_records[prev_b.fixup].label = target;
1173 trace!(" -> reassigning target of condbr to {:?}", target);
1174 prev_b.target = target;
1175 debug_assert_eq!(off_before_edit, self.cur_offset());
1176 continue;
1177 }
1178 }
1179 }
1180
1181 // If we couldn't do anything with the last branch, then break.
1182 break;
1183 }
1184
1185 self.purge_latest_branches();
1186
1187 trace!(
1188 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1189 self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1190 );
1191 }
1192
1193 fn purge_latest_branches(&mut self) {
1194 // All of our branch simplification rules work only if a branch ends at
1195 // the tail of the buffer, with no following code; and branches are in
1196 // order in latest_branches; so if the last entry ends prior to
1197 // cur_offset, then clear all entries.
1198 let cur_off = self.cur_offset();
1199 if let Some(l) = self.latest_branches.last() {
1200 if l.end < cur_off {
1201 trace!("purge_latest_branches: removing branch {:?}", l);
1202 self.latest_branches.clear();
1203 }
1204 }
1205
1206 // Post-invariant: no invariant requires any branch to appear in
1207 // `latest_branches`; it is always optional. The list-clear above thus
1208 // preserves all semantics.
1209 }
1210
1211 /// Emit a trap at some point in the future with the specified code and
1212 /// stack map.
1213 ///
1214 /// This function returns a [`MachLabel`] which will be the future address
1215 /// of the trap. Jumps should refer to this label, likely by using the
1216 /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1217 /// patched in once the address of the trap is known.
1218 ///
1219 /// This will batch all traps into the end of the function.
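    ///
    /// A minimal usage sketch; `code` is some `TrapCode` and the label-use
    /// kind is a placeholder for the backend's branch form:
    ///
    /// ```ignore
    /// let trap_label = buffer.defer_trap(code);
    /// // Emit a conditional branch to the future trap site.
    /// buffer.use_label_at_offset(buffer.cur_offset(), trap_label, I::LabelUse::Branch19);
    /// // ... emit the branch bytes ...
    /// ```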
1220 pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1221 let label = self.get_label();
1222 self.pending_traps.push(MachLabelTrap {
1223 label,
1224 code,
1225 loc: self.cur_srcloc.map(|(_start, loc)| loc),
1226 });
1227 label
1228 }
1229
1230 /// Is an island needed within the next N bytes?
1231 pub fn island_needed(&self, distance: CodeOffset) -> bool {
1232 let deadline = match self.fixup_records.peek() {
1233 Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1234 None => self.pending_fixup_deadline,
1235 };
1236 deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
1237 }
1238
1239 /// Returns the maximal offset that islands can reach if `distance` more
1240 /// bytes are appended.
1241 ///
1242 /// This is used to determine if veneers need insertions since jumps that
1243 /// can't reach past this point must get a veneer of some form.
1244 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1245 // Assume that all fixups will require veneers and that the veneers are
1246 // the worst-case size for each platform. This is an over-generalization
1247 // to avoid iterating over the `fixup_records` list or maintaining
1248 // information about it as we go along.
1249 let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
1250 as u32)
1251 * (I::LabelUse::worst_case_veneer_size())
1252 + self.pending_constants_size
1253 + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1254 self.cur_offset()
1255 .saturating_add(distance)
1256 .saturating_add(island_worst_case_size)
1257 }
1258
1259 /// Emit all pending constants and required pending veneers.
1260 ///
1261 /// Should only be called if `island_needed()` returns true, i.e., if we
1262 /// actually reach a deadline. It's not necessarily a problem to do so
1263 /// otherwise but it may result in unnecessary work during emission.
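    ///
    /// A sketch of how an emitter typically drives this, where
    /// `upcoming_size` is the size of the code it is about to emit:
    ///
    /// ```ignore
    /// if buffer.island_needed(upcoming_size) {
    ///     // Ensure the island is not executed: either we are between
    ///     // blocks, or the emitter branches around it.
    ///     buffer.emit_island(upcoming_size, ctrl_plane);
    /// }
    /// ```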
1264 pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1265 self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1266 }
1267
1268 /// Same as `emit_island`, but an internal API with a `force_veneers`
1269 /// argument to force all veneers to always get emitted for debugging.
1270 fn emit_island_maybe_forced(
1271 &mut self,
1272 force_veneers: ForceVeneers,
1273 distance: CodeOffset,
1274 ctrl_plane: &mut ControlPlane,
1275 ) {
1276 // We're going to purge fixups, so no latest-branch editing can happen
1277 // anymore.
1278 self.latest_branches.clear();
1279
1280 // End the current location tracking since anything emitted during this
1281 // function shouldn't be attributed to whatever the current source
1282 // location is.
1283 //
1284 // Note that the current source location, if it's set right now, will be
1285 // restored at the end of this island emission.
1286 let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1287 if cur_loc.is_some() {
1288 self.end_srcloc();
1289 }
1290
1291 let forced_threshold = self.worst_case_end_of_island(distance);
1292
1293 // First flush out all traps/constants so we have more labels in case
1294 // fixups are applied against these labels.
1295 //
1296 // Note that traps are placed first since this typically happens at the
1297 // end of the function and for disassemblers we try to keep all the code
1298 // contiguously together.
1299 for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1300 // If this trap has source information associated with it then
1301 // emit this information for the trap instruction going out now too.
1302 if let Some(loc) = loc {
1303 self.start_srcloc(loc);
1304 }
1305 self.align_to(I::LabelUse::ALIGN);
1306 self.bind_label(label, ctrl_plane);
1307 self.add_trap(code);
1308 self.put_data(I::TRAP_OPCODE);
1309 if loc.is_some() {
1310 self.end_srcloc();
1311 }
1312 }
1313
1314 for constant in mem::take(&mut self.pending_constants) {
1315 let MachBufferConstant { align, size, .. } = self.constants[constant];
1316 let label = self.constants[constant].upcoming_label.take().unwrap();
1317 self.align_to(align);
1318 self.bind_label(label, ctrl_plane);
1319 self.used_constants.push((constant, self.cur_offset()));
1320 self.get_appended_space(size);
1321 }
1322
1323 // Either handle all pending fixups because they're ready or move them
1324 // onto the `BinaryHeap` tracking all pending fixups if they aren't
1325 // ready.
1326 assert!(self.latest_branches.is_empty());
1327 for fixup in mem::take(&mut self.pending_fixup_records) {
1328 if self.should_apply_fixup(&fixup, forced_threshold) {
1329 self.handle_fixup(fixup, force_veneers, forced_threshold);
1330 } else {
1331 self.fixup_records.push(fixup);
1332 }
1333 }
1334 self.pending_fixup_deadline = u32::MAX;
1335 while let Some(fixup) = self.fixup_records.peek() {
1336 trace!("emit_island: fixup {:?}", fixup);
1337
            // If this fixup shouldn't be applied, that means its label isn't
            // defined yet and there will still be room to apply a veneer, if
            // necessary, in a future island. Because `fixup_records` is
            // sorted by deadline, the loop can exit at that point.
1343 if !self.should_apply_fixup(fixup, forced_threshold) {
1344 break;
1345 }
1346
1347 let fixup = self.fixup_records.pop().unwrap();
1348 self.handle_fixup(fixup, force_veneers, forced_threshold);
1349 }
1350
1351 if let Some(loc) = cur_loc {
1352 self.start_srcloc(loc);
1353 }
1354 }
1355
1356 fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1357 let label_offset = self.resolve_label_offset(fixup.label);
1358 label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1359 }
1360
1361 fn handle_fixup(
1362 &mut self,
1363 fixup: MachLabelFixup<I>,
1364 force_veneers: ForceVeneers,
1365 forced_threshold: CodeOffset,
1366 ) {
1367 let MachLabelFixup {
1368 label,
1369 offset,
1370 kind,
1371 } = fixup;
1372 let start = offset as usize;
1373 let end = (offset + kind.patch_size()) as usize;
1374 let label_offset = self.resolve_label_offset(label);
1375
1376 if label_offset != UNKNOWN_LABEL_OFFSET {
1377 // If the offset of the label for this fixup is known then
1378 // we're going to do something here-and-now. We're either going
1379 // to patch the original offset because it's an in-bounds jump,
1380 // or we're going to generate a veneer, patch the fixup to jump
1381 // to the veneer, and then keep going.
1382 //
1383 // If the label comes after the original fixup, then we should
1384 // be guaranteed that the jump is in-bounds. Otherwise there's
1385 // a bug somewhere because this method wasn't called soon
1386 // enough. All forward-jumps are tracked and should get veneers
1387 // before their deadline comes and they're unable to jump
1388 // further.
1389 //
            // Otherwise, if the label is before the fixup, then that's a
            // backwards jump. If it's beyond the maximum negative range, we
            // emit a veneer that we can jump forward to, which can in turn
            // jump backwards to the label.
1394 let veneer_required = if label_offset >= offset {
1395 assert!((label_offset - offset) <= kind.max_pos_range());
1396 false
1397 } else {
1398 (offset - label_offset) > kind.max_neg_range()
1399 };
1400 trace!(
1401 " -> label_offset = {}, known, required = {} (pos {} neg {})",
1402 label_offset,
1403 veneer_required,
1404 kind.max_pos_range(),
1405 kind.max_neg_range()
1406 );
1407
1408 if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1409 self.emit_veneer(label, offset, kind);
1410 } else {
1411 let slice = &mut self.data[start..end];
1412 trace!(
1413 "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
1414 );
1415 kind.patch(slice, offset, label_offset);
1416 }
1417 } else {
            // If the offset of this label is not known at this time, a veneer
            // is required: after this island the label can no longer be
            // within range of the original fixup.
1421 assert!(forced_threshold - offset > kind.max_pos_range());
1422 self.emit_veneer(label, offset, kind);
1423 }
1424 }
1425
    /// Emits a "veneer" so that the fixup of `kind` at `offset` can reach `label`.
1427 ///
1428 /// This will generate extra machine code, using `kind`, to get a
1429 /// larger-jump-kind than `kind` allows. The code at `offset` is then
1430 /// patched to jump to our new code, and then the new code is enqueued for
1431 /// a fixup to get processed at some later time.
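    ///
    /// Schematically (an illustrative sketch, not to scale):
    ///
    /// ```text
    /// offset:        short-range branch --patched to--> veneer_offset
    /// veneer_offset: longer-range branch --new fixup--> label
    /// ```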
1432 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1433 // If this `kind` doesn't support a veneer then that's a bug in the
1434 // backend because we need to implement support for such a veneer.
1435 assert!(
1436 kind.supports_veneer(),
1437 "jump beyond the range of {kind:?} but a veneer isn't supported",
1438 );
1439
1440 // Allocate space for a veneer in the island.
1441 self.align_to(I::LabelUse::ALIGN);
1442 let veneer_offset = self.cur_offset();
1443 trace!("making a veneer at {}", veneer_offset);
1444 let start = offset as usize;
1445 let end = (offset + kind.patch_size()) as usize;
1446 let slice = &mut self.data[start..end];
1447 // Patch the original label use to refer to the veneer.
1448 trace!(
1449 "patching original at offset {} to veneer offset {}",
1450 offset, veneer_offset
1451 );
1452 kind.patch(slice, offset, veneer_offset);
1453 // Generate the veneer.
1454 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1455 let (veneer_fixup_off, veneer_label_use) =
1456 kind.generate_veneer(veneer_slice, veneer_offset);
1457 trace!(
1458 "generated veneer; fixup offset {}, label_use {:?}",
1459 veneer_fixup_off, veneer_label_use
1460 );
        // Register a new use of `label` with our new veneer fixup and offset.
        // This will recalculate deadlines accordingly and enqueue this fixup
        // to get processed at some later time.
1465 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1466 }
1467
1468 fn finish_emission_maybe_forcing_veneers(
1469 &mut self,
1470 force_veneers: ForceVeneers,
1471 ctrl_plane: &mut ControlPlane,
1472 ) {
1473 while !self.pending_constants.is_empty()
1474 || !self.pending_traps.is_empty()
1475 || !self.fixup_records.is_empty()
1476 || !self.pending_fixup_records.is_empty()
1477 {
1478 // `emit_island()` will emit any pending veneers and constants, and
1479 // as a side-effect, will also take care of any fixups with resolved
1480 // labels eagerly.
1481 self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1482 }
1483
1484 // Ensure that all labels have been fixed up after the last island is emitted. This is a
1485 // full (release-mode) assert because an unresolved label means the emitted code is
1486 // incorrect.
1487 assert!(self.fixup_records.is_empty());
1488 assert!(self.pending_fixup_records.is_empty());
1489 }
1490
1491 /// Finish any deferred emissions and/or fixups.
1492 pub fn finish(
1493 mut self,
1494 constants: &VCodeConstants,
1495 ctrl_plane: &mut ControlPlane,
1496 ) -> MachBufferFinalized<Stencil> {
1497 let _tt = timing::vcode_emit_finish();
1498
1499 self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1500
1501 let alignment = self.finish_constants(constants);
1502
1503 // Resolve all labels to their offsets.
1504 let finalized_relocs = self
1505 .relocs
1506 .iter()
1507 .map(|reloc| FinalizedMachReloc {
1508 offset: reloc.offset,
1509 kind: reloc.kind,
1510 addend: reloc.addend,
1511 target: match &reloc.target {
1512 RelocTarget::ExternalName(name) => {
1513 FinalizedRelocTarget::ExternalName(name.clone())
1514 }
1515 RelocTarget::Label(label) => {
1516 FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1517 }
1518 },
1519 })
1520 .collect();
1521
1522 let finalized_exception_handlers = self
1523 .exception_handlers
1524 .iter()
1525 .map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))
1526 .collect();
1527
1528 let mut srclocs = self.srclocs;
1529 srclocs.sort_by_key(|entry| entry.start);
1530
1531 MachBufferFinalized {
1532 data: self.data,
1533 relocs: finalized_relocs,
1534 traps: self.traps,
1535 call_sites: self.call_sites,
1536 exception_handlers: finalized_exception_handlers,
1537 srclocs,
1538 user_stack_maps: self.user_stack_maps,
1539 unwind_info: self.unwind_info,
1540 alignment,
1541 }
1542 }
1543
1544 /// Add an external relocation at the given offset.
1545 pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1546 &mut self,
1547 offset: CodeOffset,
1548 kind: Reloc,
1549 target: &T,
1550 addend: Addend,
1551 ) {
1552 let target: RelocTarget = target.clone().into();
        // FIXME(#3277): This should use `I::LabelUse::from_reloc` to
        // optionally generate a label-use to track whether an island may be
        // needed to escape this function and actually reach the external
        // name. This is most likely to come up on AArch64, where calls
        // between functions use a 26-bit signed offset scaled by the 4-byte
        // instruction size, giving a range of +/- 128MB. This means that if a
        // function is 256MB in size and there's a call in the middle, it's
        // impossible to reach the actual target. Also, while it's technically
        // possible to jump to the start of a function and then jump further,
        // island insertion below always inserts islands after previously
        // appended code, so for Cranelift's own implementation this is also a
        // problem for 128MB functions on AArch64 which start with a call
        // instruction: those won't be able to escape.
        //
        // Ideally what needs to happen here is that a `LabelUse` is
        // transparently generated (or call-sites of this function are audited
        // to generate a `LabelUse` instead) and tracked internally. The
        // actual relocation would then change over time if and when a veneer
        // is inserted, at which point the relocation here would be patched by
        // this `MachBuffer` to jump to the veneer. The problem, though, is
        // that all this still needs to end up, in the case of a singular
        // function, generating a final relocation pointing either to this
        // particular relocation or to the veneer inserted. Additionally,
        // `MachBuffer` needs the concept of a label which will never be
        // resolved, so that `emit_island` doesn't trip over never actually
        // knowing what some labels are. Currently the loop in
        // `finish_emission_maybe_forcing_veneers` would otherwise loop
        // infinitely.
        //
        // For now this means that, because relocs aren't tracked at all,
        // AArch64 functions have a rough size limit of 128MB. That's
        // somewhat reasonable for now, and the failure mode is a panic in
        // `MachBuffer` when a relocation can't otherwise be resolved later,
        // so it shouldn't result in any memory unsafety or anything like
        // that.
1586 self.relocs.push(MachReloc {
1587 offset,
1588 kind,
1589 target,
1590 addend,
1591 });
1592 }
1593
1594 /// Add an external relocation at the current offset.
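    ///
    /// An illustrative sketch (the external-name reference and addend are
    /// placeholders):
    ///
    /// ```ignore
    /// // Record an absolute 8-byte relocation against an external symbol at
    /// // the current emission offset.
    /// buf.add_reloc(
    ///     Reloc::Abs8,
    ///     &ExternalName::User(UserExternalNameRef::new(0)),
    ///     /* addend = */ 0,
    /// );
    /// ```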
1595 pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1596 &mut self,
1597 kind: Reloc,
1598 target: &T,
1599 addend: Addend,
1600 ) {
1601 self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1602 }
1603
1604 /// Add a trap record at the current offset.
1605 pub fn add_trap(&mut self, code: TrapCode) {
1606 self.traps.push(MachTrap {
1607 offset: self.data.len() as CodeOffset,
1608 code,
1609 });
1610 }
1611
1612 /// Add a call-site record at the current offset.
1613 pub fn add_call_site(&mut self) {
1614 self.add_try_call_site(core::iter::empty());
1615 }
1616
1617 /// Add a call-site record at the current offset with exception
1618 /// handlers.
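    ///
    /// An illustrative sketch (the tag value and handler labels are
    /// placeholders):
    ///
    /// ```ignore
    /// // A call site with one tagged handler and a catch-all default.
    /// buf.add_try_call_site(
    ///     [
    ///         MachExceptionHandler::Tag(ExceptionTag::new(42), handler_label),
    ///         MachExceptionHandler::Default(fallback_label),
    ///     ]
    ///     .into_iter(),
    /// );
    /// ```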
1619 pub fn add_try_call_site(
1620 &mut self,
1621 exception_handlers: impl Iterator<Item = MachExceptionHandler>,
1622 ) {
1623 let start = u32::try_from(self.exception_handlers.len()).unwrap();
1624 self.exception_handlers.extend(exception_handlers);
1625 let end = u32::try_from(self.exception_handlers.len()).unwrap();
1626 let exception_handler_range = start..end;
1627
1628 self.call_sites.push(MachCallSite {
1629 ret_addr: self.data.len() as CodeOffset,
1630 exception_handler_range,
1631 });
1632 }
1633
1634 /// Add an unwind record at the current offset.
1635 pub fn add_unwind(&mut self, unwind: UnwindInst) {
1636 self.unwind_info.push((self.cur_offset(), unwind));
1637 }
1638
1639 /// Set the `SourceLoc` for code from this offset until the offset at the
1640 /// next call to `end_srcloc()`.
1641 /// Returns the current [CodeOffset] and [RelSourceLoc].
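    ///
    /// A minimal sketch of the intended bracketing (illustrative only; `buf`,
    /// `loc`, and the emitted bytes are placeholders):
    ///
    /// ```ignore
    /// buf.start_srcloc(loc);
    /// buf.put_data(&machine_code_for_this_source_location);
    /// buf.end_srcloc();
    /// ```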
1642 pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1643 let cur = (self.cur_offset(), loc);
1644 self.cur_srcloc = Some(cur);
1645 cur
1646 }
1647
1648 /// Mark the end of the `SourceLoc` segment started at the last
1649 /// `start_srcloc()` call.
1650 pub fn end_srcloc(&mut self) {
1651 let (start, loc) = self
1652 .cur_srcloc
1653 .take()
1654 .expect("end_srcloc() called without start_srcloc()");
1655 let end = self.cur_offset();
1656 // Skip zero-length extends.
1657 debug_assert!(end >= start);
1658 if end > start {
1659 self.srclocs.push(MachSrcLoc { start, end, loc });
1660 }
1661 }
1662
1663 /// Push a user stack map onto this buffer.
1664 ///
1665 /// The stack map is associated with the given `return_addr` code
1666 /// offset. This must be the PC for the instruction just *after* this stack
1667 /// map's associated instruction. For example in the sequence `call $foo;
1668 /// add r8, rax`, the `return_addr` must be the offset of the start of the
1669 /// `add` instruction.
1670 ///
1671 /// Stack maps must be pushed in sorted `return_addr` order.
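    ///
    /// A sketch of typical usage right after emitting a call (illustrative
    /// only; `buf`, `emit_state`, and `stack_map` are placeholders):
    ///
    /// ```ignore
    /// // `cur_offset()` now points just past the call, i.e. at its return
    /// // address, which is what the stack map must be keyed by.
    /// let return_addr = buf.cur_offset();
    /// buf.push_user_stack_map(&emit_state, return_addr, stack_map);
    /// ```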
1672 pub fn push_user_stack_map(
1673 &mut self,
1674 emit_state: &I::State,
1675 return_addr: CodeOffset,
1676 mut stack_map: ir::UserStackMap,
1677 ) {
1678 let span = emit_state.frame_layout().active_size();
1679 trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1680
1681 debug_assert!(
1682 self.user_stack_maps
1683 .last()
1684 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1685 "pushed stack maps out of order: {} is not less than {}",
1686 self.user_stack_maps.last().unwrap().0,
1687 return_addr,
1688 );
1689
1690 stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1691 self.user_stack_maps.push((return_addr, span, stack_map));
1692 }
1693
1694 /// Increase the alignment of the buffer to the given alignment if bigger
1695 /// than the current alignment.
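    ///
    /// For example (illustrative), requesting at least 16-byte alignment:
    ///
    /// ```ignore
    /// buf.set_log2_min_function_alignment(4); // 1 << 4 = 16 bytes
    /// ```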
1696 pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1697 self.min_alignment = self.min_alignment.max(
1698 1u32.checked_shl(u32::from(align_to))
1699 .expect("log2_min_function_alignment too large"),
1700 );
1701 }
1702}
1703
1704impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1705 fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1706 for b in iter {
1707 self.put1(b);
1708 }
1709 }
1710}
1711
1712impl<T: CompilePhase> MachBufferFinalized<T> {
1713 /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1714 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1715 &self.srclocs[..]
1716 }
1717
1718 /// Get the total required size for the code.
1719 pub fn total_size(&self) -> CodeOffset {
1720 self.data.len() as CodeOffset
1721 }
1722
1723 /// Return the code in this mach buffer as a hex string for testing purposes.
1724 pub fn stringify_code_bytes(&self) -> String {
1725 // This is pretty lame, but whatever ..
1726 use std::fmt::Write;
1727 let mut s = String::with_capacity(self.data.len() * 2);
1728 for b in &self.data {
1729 write!(&mut s, "{b:02X}").unwrap();
1730 }
1731 s
1732 }
1733
1734 /// Get the code bytes.
1735 pub fn data(&self) -> &[u8] {
1736 // N.B.: we emit every section into the .text section as far as
1737 // the `CodeSink` is concerned; we do not bother to segregate
1738 // the contents into the actual program text, the jumptable and the
1739 // rodata (constant pool). This allows us to generate code assuming
1740 // that these will not be relocated relative to each other, and avoids
1741 // having to designate each section as belonging in one of the three
1742 // fixed categories defined by `CodeSink`. If this becomes a problem
1743 // later (e.g. because of memory permissions or similar), we can
1744 // add this designation and segregate the output; take care, however,
1745 // to add the appropriate relocations in this case.
1746
1747 &self.data[..]
1748 }
1749
1750 /// Get the list of external relocations for this code.
1751 pub fn relocs(&self) -> &[FinalizedMachReloc] {
1752 &self.relocs[..]
1753 }
1754
1755 /// Get the list of trap records for this code.
1756 pub fn traps(&self) -> &[MachTrap] {
1757 &self.traps[..]
1758 }
1759
1760 /// Get the user stack map metadata for this code.
1761 pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1762 &self.user_stack_maps
1763 }
1764
    /// Take this buffer's user stack map metadata.
1766 pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1767 mem::take(&mut self.user_stack_maps)
1768 }
1769
1770 /// Get the list of call sites for this code, along with
1771 /// associated exception handlers.
1772 ///
1773 /// Each item yielded by the returned iterator is a struct with:
1774 ///
1775 /// - The call site metadata record, with a `ret_addr` field
1776 /// directly accessible and denoting the offset of the return
1777 /// address into this buffer's code.
1778 /// - The slice of pairs of exception tags and code offsets
1779 /// denoting exception-handler entry points associated with this
1780 /// call site.
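    ///
    /// An illustrative sketch of walking the records (`buf` here is a
    /// `MachBufferFinalized`):
    ///
    /// ```ignore
    /// for site in buf.call_sites() {
    ///     println!("return address at offset {:#x}", site.ret_addr);
    ///     for handler in site.exception_handlers {
    ///         println!("  handler: {handler:?}");
    ///     }
    /// }
    /// ```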
1781 pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1782 self.call_sites.iter().map(|call_site| {
1783 let handler_range = call_site.exception_handler_range.clone();
1784 let handler_range = usize::try_from(handler_range.start).unwrap()
1785 ..usize::try_from(handler_range.end).unwrap();
1786 FinalizedMachCallSite {
1787 ret_addr: call_site.ret_addr,
1788 exception_handlers: &self.exception_handlers[handler_range],
1789 }
1790 })
1791 }
1792}
1793
1794/// An item in the exception-handler list for a callsite, with label
1795/// references. Items are interpreted in left-to-right order and the
1796/// first match wins.
1797#[derive(Clone, Copy, Debug, PartialEq, Eq)]
1798pub enum MachExceptionHandler {
1799 /// A specific tag (in the current dynamic context) should be
1800 /// handled by the code at the given offset.
1801 Tag(ExceptionTag, MachLabel),
1802 /// All exceptions should be handled by the code at the given
1803 /// offset.
1804 Default(MachLabel),
1805 /// The dynamic context for interpreting tags is updated to the
1806 /// value stored in the given machine location (in this frame's
1807 /// context).
1808 Context(ExceptionContextLoc),
1809}
1810
1811impl MachExceptionHandler {
1812 fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {
1813 match self {
1814 Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),
1815 Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),
1816 Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),
1817 }
1818 }
1819}
1820
1821/// An item in the exception-handler list for a callsite, with final
1822/// (lowered) code offsets. Items are interpreted in left-to-right
1823/// order and the first match wins.
1824#[derive(Clone, Copy, Debug, PartialEq, Eq)]
1825#[cfg_attr(
1826 feature = "enable-serde",
1827 derive(serde_derive::Serialize, serde_derive::Deserialize)
1828)]
1829pub enum FinalizedMachExceptionHandler {
1830 /// A specific tag (in the current dynamic context) should be
1831 /// handled by the code at the given offset.
1832 Tag(ExceptionTag, CodeOffset),
1833 /// All exceptions should be handled by the code at the given
1834 /// offset.
1835 Default(CodeOffset),
1836 /// The dynamic context for interpreting tags is updated to the
1837 /// value stored in the given machine location (in this frame's
1838 /// context).
1839 Context(ExceptionContextLoc),
1840}
1841
1842/// A location for a dynamic exception context value.
1843#[derive(Clone, Copy, Debug, PartialEq, Eq)]
1844#[cfg_attr(
1845 feature = "enable-serde",
1846 derive(serde_derive::Serialize, serde_derive::Deserialize)
1847)]
1848pub enum ExceptionContextLoc {
1849 /// An offset from SP at the callsite.
1850 SPOffset(u32),
1851 /// A GPR at the callsite. The physical register number for the
1852 /// GPR register file on the target architecture is used.
1853 GPR(u8),
1854}
1855
1856/// Metadata about a constant.
1857struct MachBufferConstant {
1858 /// A label which has not yet been bound which can be used for this
1859 /// constant.
1860 ///
1861 /// This is lazily created when a label is requested for a constant and is
1862 /// cleared when a constant is emitted.
1863 upcoming_label: Option<MachLabel>,
1864 /// Required alignment.
1865 align: CodeOffset,
1866 /// The byte size of this constant.
1867 size: usize,
1868}
1869
1870/// A trap that is deferred to the next time an island is emitted for either
1871/// traps, constants, or fixups.
1872struct MachLabelTrap {
1873 /// This label will refer to the trap's offset.
1874 label: MachLabel,
1875 /// The code associated with this trap.
1876 code: TrapCode,
1877 /// An optional source location to assign for this trap.
1878 loc: Option<RelSourceLoc>,
1879}
1880
1881/// A fixup to perform on the buffer once code is emitted. Fixups always refer
1882/// to labels and patch the code based on label offsets. Hence, they are like
1883/// relocations, but internal to one buffer.
1884#[derive(Debug)]
1885struct MachLabelFixup<I: VCodeInst> {
1886 /// The label whose offset controls this fixup.
1887 label: MachLabel,
1888 /// The offset to fix up / patch to refer to this label.
1889 offset: CodeOffset,
1890 /// The kind of fixup. This is architecture-specific; each architecture may have,
1891 /// e.g., several types of branch instructions, each with differently-sized
1892 /// offset fields and different places within the instruction to place the
1893 /// bits.
1894 kind: I::LabelUse,
1895}
1896
1897impl<I: VCodeInst> MachLabelFixup<I> {
1898 fn deadline(&self) -> CodeOffset {
1899 self.offset.saturating_add(self.kind.max_pos_range())
1900 }
1901}
1902
1903impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
1904 fn eq(&self, other: &Self) -> bool {
1905 self.deadline() == other.deadline()
1906 }
1907}
1908
1909impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
1910
1911impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
1912 fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
1913 Some(self.cmp(other))
1914 }
1915}
1916
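// Note: the comparison below is deliberately reversed (`other` compared
// against `self`) so that the max-heap `BinaryHeap` holding `fixup_records`
// pops the fixup with the *smallest* deadline first, i.e. it behaves as a
// min-heap keyed by deadline.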
1917impl<I: VCodeInst> Ord for MachLabelFixup<I> {
1918 fn cmp(&self, other: &Self) -> Ordering {
1919 other.deadline().cmp(&self.deadline())
1920 }
1921}
1922
1923/// A relocation resulting from a compilation.
1924#[derive(Clone, Debug, PartialEq)]
1925#[cfg_attr(
1926 feature = "enable-serde",
1927 derive(serde_derive::Serialize, serde_derive::Deserialize)
1928)]
1929pub struct MachRelocBase<T> {
1930 /// The offset at which the relocation applies, *relative to the
1931 /// containing section*.
1932 pub offset: CodeOffset,
1933 /// The kind of relocation.
1934 pub kind: Reloc,
1935 /// The external symbol / name to which this relocation refers.
1936 pub target: T,
1937 /// The addend to add to the symbol value.
1938 pub addend: i64,
1939}
1940
1941type MachReloc = MachRelocBase<RelocTarget>;
1942
1943/// A relocation resulting from a compilation.
1944pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
1945
/// A relocation target.
1947#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1948pub enum RelocTarget {
1949 /// Points to an [ExternalName] outside the current function.
1950 ExternalName(ExternalName),
1951 /// Points to a [MachLabel] inside this function.
1952 /// This is different from [MachLabelFixup] in that both the relocation and the
1953 /// label will be emitted and are only resolved at link time.
1954 ///
1955 /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
1956 Label(MachLabel),
1957}
1958
1959impl From<ExternalName> for RelocTarget {
1960 fn from(name: ExternalName) -> Self {
1961 Self::ExternalName(name)
1962 }
1963}
1964
1965impl From<MachLabel> for RelocTarget {
1966 fn from(label: MachLabel) -> Self {
1967 Self::Label(label)
1968 }
1969}
1970
/// A relocation target.
1972#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1973#[cfg_attr(
1974 feature = "enable-serde",
1975 derive(serde_derive::Serialize, serde_derive::Deserialize)
1976)]
1977pub enum FinalizedRelocTarget {
1978 /// Points to an [ExternalName] outside the current function.
1979 ExternalName(ExternalName),
1980 /// Points to a [CodeOffset] from the start of the current function.
1981 Func(CodeOffset),
1982}
1983
1984impl FinalizedRelocTarget {
1985 /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
1986 /// output.
1987 pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
1988 match self {
1989 FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
1990 FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
1991 }
1992 }
1993}
1994
1995/// A trap record resulting from a compilation.
1996#[derive(Clone, Debug, PartialEq)]
1997#[cfg_attr(
1998 feature = "enable-serde",
1999 derive(serde_derive::Serialize, serde_derive::Deserialize)
2000)]
2001pub struct MachTrap {
2002 /// The offset at which the trap instruction occurs, *relative to the
2003 /// containing section*.
2004 pub offset: CodeOffset,
2005 /// The trap code.
2006 pub code: TrapCode,
2007}
2008
2009/// A call site record resulting from a compilation.
2010#[derive(Clone, Debug, PartialEq)]
2011#[cfg_attr(
2012 feature = "enable-serde",
2013 derive(serde_derive::Serialize, serde_derive::Deserialize)
2014)]
2015pub struct MachCallSite {
2016 /// The offset of the call's return address, *relative to the
2017 /// start of the buffer*.
2018 pub ret_addr: CodeOffset,
2019
2020 /// Range in `exception_handlers` corresponding to the exception
2021 /// handlers for this callsite.
2022 exception_handler_range: Range<u32>,
2023}
2024
2025/// A call site record resulting from a compilation.
2026#[derive(Clone, Debug, PartialEq)]
2027pub struct FinalizedMachCallSite<'a> {
2028 /// The offset of the call's return address, *relative to the
2029 /// start of the buffer*.
2030 pub ret_addr: CodeOffset,
2031
2032 /// Exception handlers at this callsite, with target offsets
2033 /// *relative to the start of the buffer*.
2034 pub exception_handlers: &'a [FinalizedMachExceptionHandler],
2035}
2036
2037/// A source-location mapping resulting from a compilation.
2038#[derive(PartialEq, Debug, Clone)]
2039#[cfg_attr(
2040 feature = "enable-serde",
2041 derive(serde_derive::Serialize, serde_derive::Deserialize)
2042)]
2043pub struct MachSrcLoc<T: CompilePhase> {
2044 /// The start of the region of code corresponding to a source location.
2045 /// This is relative to the start of the function, not to the start of the
2046 /// section.
2047 pub start: CodeOffset,
2048 /// The end of the region of code corresponding to a source location.
2049 /// This is relative to the start of the function, not to the start of the
2050 /// section.
2051 pub end: CodeOffset,
2052 /// The source location.
2053 pub loc: T::SourceLocType,
2054}
2055
2056impl MachSrcLoc<Stencil> {
2057 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
2058 MachSrcLoc {
2059 start: self.start,
2060 end: self.end,
2061 loc: self.loc.expand(base_srcloc),
2062 }
2063 }
2064}
2065
2066/// Record of branch instruction in the buffer, to facilitate editing.
2067#[derive(Clone, Debug)]
2068struct MachBranch {
2069 start: CodeOffset,
2070 end: CodeOffset,
2071 target: MachLabel,
2072 fixup: usize,
2073 inverted: Option<SmallVec<[u8; 8]>>,
2074 /// All labels pointing to the start of this branch. For correctness, this
2075 /// *must* be complete (i.e., must contain all labels whose resolved offsets
2076 /// are at the start of this branch): we rely on being able to redirect all
2077 /// labels that could jump to this branch before removing it, if it is
2078 /// otherwise unreachable.
2079 labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2080}
2081
2082impl MachBranch {
2083 fn is_cond(&self) -> bool {
2084 self.inverted.is_some()
2085 }
2086 fn is_uncond(&self) -> bool {
2087 self.inverted.is_none()
2088 }
2089}
2090
2091/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2092///
2093/// Note that `MachBuffer` was primarily written for intra-function references
2094/// of jumps between basic blocks, but it's also quite usable for entire text
2095/// sections and resolving references between functions themselves. This
2096/// builder interprets "blocks" as labeled functions for the purposes of
2097/// resolving labels internally in the buffer.
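///
/// An illustrative sketch of stitching two functions into one text section
/// (`Inst`, `func0_bytes`, `func1_bytes`, and `ctrl_plane` are placeholders):
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<Inst>::new(2);
/// let off0 = builder.append(true, &func0_bytes, 16, ctrl_plane);
/// let off1 = builder.append(true, &func1_bytes, 16, ctrl_plane);
/// // A cross-function relocation can be redirected to a label by index:
/// // builder.resolve_reloc(offset, reloc, addend, /* target func index */ 1);
/// let text = builder.finish(ctrl_plane);
/// ```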
2098pub struct MachTextSectionBuilder<I: VCodeInst> {
2099 buf: MachBuffer<I>,
2100 next_func: usize,
2101 force_veneers: ForceVeneers,
2102}
2103
2104impl<I: VCodeInst> MachTextSectionBuilder<I> {
2105 /// Creates a new text section builder which will have `num_funcs` functions
2106 /// pushed into it.
2107 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2108 let mut buf = MachBuffer::new();
2109 buf.reserve_labels_for_blocks(num_funcs);
2110 MachTextSectionBuilder {
2111 buf,
2112 next_func: 0,
2113 force_veneers: ForceVeneers::No,
2114 }
2115 }
2116}
2117
2118impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2119 fn append(
2120 &mut self,
2121 labeled: bool,
2122 func: &[u8],
2123 align: u32,
2124 ctrl_plane: &mut ControlPlane,
2125 ) -> u64 {
2126 // Conditionally emit an island if it's necessary to resolve jumps
2127 // between functions which are too far away.
2128 let size = func.len() as u32;
2129 if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2130 self.buf
2131 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2132 }
2133
2134 self.buf.align_to(align);
2135 let pos = self.buf.cur_offset();
2136 if labeled {
2137 self.buf.bind_label(
2138 MachLabel::from_block(BlockIndex::new(self.next_func)),
2139 ctrl_plane,
2140 );
2141 self.next_func += 1;
2142 }
2143 self.buf.put_data(func);
2144 u64::from(pos)
2145 }
2146
2147 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2148 crate::trace!(
2149 "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2150 );
2151 let label = MachLabel::from_block(BlockIndex::new(target));
2152 let offset = u32::try_from(offset).unwrap();
2153 match I::LabelUse::from_reloc(reloc, addend) {
2154 Some(label_use) => {
2155 self.buf.use_label_at_offset(offset, label, label_use);
2156 true
2157 }
2158 None => false,
2159 }
2160 }
2161
2162 fn force_veneers(&mut self) {
2163 self.force_veneers = ForceVeneers::Yes;
2164 }
2165
2166 fn write(&mut self, offset: u64, data: &[u8]) {
2167 self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2168 }
2169
2170 fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2171 // Double-check all functions were pushed.
2172 assert_eq!(self.next_func, self.buf.label_offsets.len());
2173
2174 // Finish up any veneers, if necessary.
2175 self.buf
2176 .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2177
2178 // We don't need the data any more, so return it to the caller.
2179 mem::take(&mut self.buf.data).into_vec()
2180 }
2181}
2182
2183// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2184#[cfg(all(test, feature = "arm64"))]
2185mod test {
2186 use cranelift_entity::EntityRef as _;
2187
2188 use super::*;
2189 use crate::ir::UserExternalNameRef;
2190 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2191 use crate::isa::aarch64::inst::{OperandSize, xreg};
2192 use crate::machinst::{MachInstEmit, MachInstEmitState};
2193 use crate::settings;
2194
2195 fn label(n: u32) -> MachLabel {
2196 MachLabel::from_block(BlockIndex::new(n as usize))
2197 }
2198 fn target(n: u32) -> BranchTarget {
2199 BranchTarget::Label(label(n))
2200 }
2201
2202 #[test]
2203 fn test_elide_jump_to_next() {
2204 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2205 let mut buf = MachBuffer::new();
2206 let mut state = <Inst as MachInstEmit>::State::default();
2207 let constants = Default::default();
2208
2209 buf.reserve_labels_for_blocks(2);
2210 buf.bind_label(label(0), state.ctrl_plane_mut());
2211 let inst = Inst::Jump { dest: target(1) };
2212 inst.emit(&mut buf, &info, &mut state);
2213 buf.bind_label(label(1), state.ctrl_plane_mut());
2214 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2215 assert_eq!(0, buf.total_size());
2216 }
2217
2218 #[test]
2219 fn test_elide_trivial_jump_blocks() {
2220 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2221 let mut buf = MachBuffer::new();
2222 let mut state = <Inst as MachInstEmit>::State::default();
2223 let constants = Default::default();
2224
2225 buf.reserve_labels_for_blocks(4);
2226
2227 buf.bind_label(label(0), state.ctrl_plane_mut());
2228 let inst = Inst::CondBr {
2229 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2230 taken: target(1),
2231 not_taken: target(2),
2232 };
2233 inst.emit(&mut buf, &info, &mut state);
2234
2235 buf.bind_label(label(1), state.ctrl_plane_mut());
2236 let inst = Inst::Jump { dest: target(3) };
2237 inst.emit(&mut buf, &info, &mut state);
2238
2239 buf.bind_label(label(2), state.ctrl_plane_mut());
2240 let inst = Inst::Jump { dest: target(3) };
2241 inst.emit(&mut buf, &info, &mut state);
2242
2243 buf.bind_label(label(3), state.ctrl_plane_mut());
2244
2245 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2246 assert_eq!(0, buf.total_size());
2247 }
2248
2249 #[test]
2250 fn test_flip_cond() {
2251 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2252 let mut buf = MachBuffer::new();
2253 let mut state = <Inst as MachInstEmit>::State::default();
2254 let constants = Default::default();
2255
2256 buf.reserve_labels_for_blocks(4);
2257
2258 buf.bind_label(label(0), state.ctrl_plane_mut());
2259 let inst = Inst::CondBr {
2260 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2261 taken: target(1),
2262 not_taken: target(2),
2263 };
2264 inst.emit(&mut buf, &info, &mut state);
2265
2266 buf.bind_label(label(1), state.ctrl_plane_mut());
2267 let inst = Inst::Nop4;
2268 inst.emit(&mut buf, &info, &mut state);
2269
2270 buf.bind_label(label(2), state.ctrl_plane_mut());
2271 let inst = Inst::Udf {
2272 trap_code: TrapCode::STACK_OVERFLOW,
2273 };
2274 inst.emit(&mut buf, &info, &mut state);
2275
2276 buf.bind_label(label(3), state.ctrl_plane_mut());
2277
2278 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2279
2280 let mut buf2 = MachBuffer::new();
2281 let mut state = Default::default();
2282 let inst = Inst::TrapIf {
2283 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2284 trap_code: TrapCode::STACK_OVERFLOW,
2285 };
2286 inst.emit(&mut buf2, &info, &mut state);
2287 let inst = Inst::Nop4;
2288 inst.emit(&mut buf2, &info, &mut state);
2289
2290 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2291
2292 assert_eq!(buf.data, buf2.data);
2293 }
2294
2295 #[test]
2296 fn test_island() {
2297 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2298 let mut buf = MachBuffer::new();
2299 let mut state = <Inst as MachInstEmit>::State::default();
2300 let constants = Default::default();
2301
2302 buf.reserve_labels_for_blocks(4);
2303
2304 buf.bind_label(label(0), state.ctrl_plane_mut());
2305 let inst = Inst::CondBr {
2306 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2307 taken: target(2),
2308 not_taken: target(3),
2309 };
2310 inst.emit(&mut buf, &info, &mut state);
2311
2312 buf.bind_label(label(1), state.ctrl_plane_mut());
2313 while buf.cur_offset() < 2000000 {
2314 if buf.island_needed(0) {
2315 buf.emit_island(0, state.ctrl_plane_mut());
2316 }
2317 let inst = Inst::Nop4;
2318 inst.emit(&mut buf, &info, &mut state);
2319 }
2320
2321 buf.bind_label(label(2), state.ctrl_plane_mut());
2322 let inst = Inst::Nop4;
2323 inst.emit(&mut buf, &info, &mut state);
2324
2325 buf.bind_label(label(3), state.ctrl_plane_mut());
2326 let inst = Inst::Nop4;
2327 inst.emit(&mut buf, &info, &mut state);
2328
2329 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2330
2331 assert_eq!(2000000 + 8, buf.total_size());
2332
2333 let mut buf2 = MachBuffer::new();
2334 let mut state = Default::default();
2335 let inst = Inst::CondBr {
2336 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2337
            // This conditionally taken branch has a 19-bit constant, shifted
            // to the left by two, giving us a 21-bit range in total. Half of
            // this range is positive, so we should end up roughly 1 << 20
            // bytes away from our jump target.
            //
            // There are two pending fixups by the time we reach this point,
            // one for this 19-bit jump and one for the unconditional 26-bit
            // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
            // veneer is 20 bytes large, and we pessimistically assume we'll
            // need two veneers. Currently each veneer is pessimistically
            // assumed to be the maximal size, which means we need 40 bytes of
            // extra space, so the actual island should come 40 bytes before
            // the deadline.
2351 taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2352
2353 // This branch is in-range so no veneers should be needed, it should
2354 // go directly to the target.
2355 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2356 };
2357 inst.emit(&mut buf2, &info, &mut state);
2358
2359 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2360
2361 assert_eq!(&buf.data[0..8], &buf2.data[..]);
2362 }
2363
2364 #[test]
2365 fn test_island_backward() {
2366 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2367 let mut buf = MachBuffer::new();
2368 let mut state = <Inst as MachInstEmit>::State::default();
2369 let constants = Default::default();
2370
2371 buf.reserve_labels_for_blocks(4);
2372
2373 buf.bind_label(label(0), state.ctrl_plane_mut());
2374 let inst = Inst::Nop4;
2375 inst.emit(&mut buf, &info, &mut state);
2376
2377 buf.bind_label(label(1), state.ctrl_plane_mut());
2378 let inst = Inst::Nop4;
2379 inst.emit(&mut buf, &info, &mut state);
2380
2381 buf.bind_label(label(2), state.ctrl_plane_mut());
2382 while buf.cur_offset() < 2000000 {
2383 let inst = Inst::Nop4;
2384 inst.emit(&mut buf, &info, &mut state);
2385 }
2386
2387 buf.bind_label(label(3), state.ctrl_plane_mut());
2388 let inst = Inst::CondBr {
2389 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2390 taken: target(0),
2391 not_taken: target(1),
2392 };
2393 inst.emit(&mut buf, &info, &mut state);
2394
2395 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2396
2397 assert_eq!(2000000 + 12, buf.total_size());
2398
2399 let mut buf2 = MachBuffer::new();
2400 let mut state = Default::default();
2401 let inst = Inst::CondBr {
2402 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2403 taken: BranchTarget::ResolvedOffset(8),
2404 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2405 };
2406 inst.emit(&mut buf2, &info, &mut state);
2407 let inst = Inst::Jump {
2408 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2409 };
2410 inst.emit(&mut buf2, &info, &mut state);
2411
2412 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2413
2414 assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2415 }
2416
2417 #[test]
2418 fn test_multiple_redirect() {
2419 // label0:
2420 // cbz x0, label1
2421 // b label2
2422 // label1:
2423 // b label3
2424 // label2:
2425 // nop
2426 // nop
2427 // b label0
2428 // label3:
2429 // b label4
2430 // label4:
2431 // b label5
2432 // label5:
2433 // b label7
2434 // label6:
2435 // nop
2436 // label7:
2437 // ret
2438 //
2439 // -- should become:
2440 //
2441 // label0:
2442 // cbz x0, label7
2443 // label2:
2444 // nop
2445 // nop
2446 // b label0
2447 // label6:
2448 // nop
2449 // label7:
2450 // ret
2451
2452 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2453 let mut buf = MachBuffer::new();
2454 let mut state = <Inst as MachInstEmit>::State::default();
2455 let constants = Default::default();
2456
2457 buf.reserve_labels_for_blocks(8);
2458
2459 buf.bind_label(label(0), state.ctrl_plane_mut());
2460 let inst = Inst::CondBr {
2461 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2462 taken: target(1),
2463 not_taken: target(2),
2464 };
2465 inst.emit(&mut buf, &info, &mut state);
2466
2467 buf.bind_label(label(1), state.ctrl_plane_mut());
2468 let inst = Inst::Jump { dest: target(3) };
2469 inst.emit(&mut buf, &info, &mut state);
2470
2471 buf.bind_label(label(2), state.ctrl_plane_mut());
2472 let inst = Inst::Nop4;
2473 inst.emit(&mut buf, &info, &mut state);
2474 inst.emit(&mut buf, &info, &mut state);
2475 let inst = Inst::Jump { dest: target(0) };
2476 inst.emit(&mut buf, &info, &mut state);
2477
2478 buf.bind_label(label(3), state.ctrl_plane_mut());
2479 let inst = Inst::Jump { dest: target(4) };
2480 inst.emit(&mut buf, &info, &mut state);
2481
2482 buf.bind_label(label(4), state.ctrl_plane_mut());
2483 let inst = Inst::Jump { dest: target(5) };
2484 inst.emit(&mut buf, &info, &mut state);
2485
2486 buf.bind_label(label(5), state.ctrl_plane_mut());
2487 let inst = Inst::Jump { dest: target(7) };
2488 inst.emit(&mut buf, &info, &mut state);
2489
2490 buf.bind_label(label(6), state.ctrl_plane_mut());
2491 let inst = Inst::Nop4;
2492 inst.emit(&mut buf, &info, &mut state);
2493
2494 buf.bind_label(label(7), state.ctrl_plane_mut());
2495 let inst = Inst::Ret {};
2496 inst.emit(&mut buf, &info, &mut state);
2497
2498 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2499
2500 let golden_data = vec![
2501 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2502 0x1f, 0x20, 0x03, 0xd5, // nop
2503 0x1f, 0x20, 0x03, 0xd5, // nop
2504 0xfd, 0xff, 0xff, 0x17, // b 0
2505 0x1f, 0x20, 0x03, 0xd5, // nop
2506 0xc0, 0x03, 0x5f, 0xd6, // ret
2507 ];
2508
2509 assert_eq!(&golden_data[..], &buf.data[..]);
2510 }
2511
2512 #[test]
2513 fn test_handle_branch_cycle() {
2514 // label0:
2515 // b label1
2516 // label1:
2517 // b label2
2518 // label2:
2519 // b label3
2520 // label3:
2521 // b label4
2522 // label4:
2523 // b label1 // note: not label0 (to make it interesting).
2524 //
2525 // -- should become:
2526 //
2527 // label0, label1, ..., label4:
2528 // b label0
2529 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2530 let mut buf = MachBuffer::new();
2531 let mut state = <Inst as MachInstEmit>::State::default();
2532 let constants = Default::default();
2533
2534 buf.reserve_labels_for_blocks(5);
2535
2536 buf.bind_label(label(0), state.ctrl_plane_mut());
2537 let inst = Inst::Jump { dest: target(1) };
2538 inst.emit(&mut buf, &info, &mut state);
2539
2540 buf.bind_label(label(1), state.ctrl_plane_mut());
2541 let inst = Inst::Jump { dest: target(2) };
2542 inst.emit(&mut buf, &info, &mut state);
2543
2544 buf.bind_label(label(2), state.ctrl_plane_mut());
2545 let inst = Inst::Jump { dest: target(3) };
2546 inst.emit(&mut buf, &info, &mut state);
2547
2548 buf.bind_label(label(3), state.ctrl_plane_mut());
2549 let inst = Inst::Jump { dest: target(4) };
2550 inst.emit(&mut buf, &info, &mut state);
2551
2552 buf.bind_label(label(4), state.ctrl_plane_mut());
2553 let inst = Inst::Jump { dest: target(1) };
2554 inst.emit(&mut buf, &info, &mut state);
2555
2556 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2557
2558 let golden_data = vec![
2559 0x00, 0x00, 0x00, 0x14, // b 0
2560 ];
2561
2562 assert_eq!(&golden_data[..], &buf.data[..]);
2563 }
2564
2565 #[test]
2566 fn metadata_records() {
2567 let mut buf = MachBuffer::<Inst>::new();
2568 let ctrl_plane = &mut Default::default();
2569 let constants = Default::default();
2570
2571 buf.reserve_labels_for_blocks(3);
2572
2573 buf.bind_label(label(0), ctrl_plane);
2574 buf.put1(1);
2575 buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
2576 buf.put1(2);
2577 buf.add_trap(TrapCode::INTEGER_OVERFLOW);
2578 buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
2579 buf.add_try_call_site(
2580 [
2581 MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),
2582 MachExceptionHandler::Default(label(1)),
2583 ]
2584 .into_iter(),
2585 );
2586 buf.add_reloc(
2587 Reloc::Abs4,
2588 &ExternalName::User(UserExternalNameRef::new(0)),
2589 0,
2590 );
2591 buf.put1(3);
2592 buf.add_reloc(
2593 Reloc::Abs8,
2594 &ExternalName::User(UserExternalNameRef::new(1)),
2595 1,
2596 );
2597 buf.put1(4);
2598 buf.bind_label(label(1), ctrl_plane);
2599 buf.put1(0xff);
2600 buf.bind_label(label(2), ctrl_plane);
2601 buf.put1(0xff);
2602
2603 let buf = buf.finish(&constants, ctrl_plane);
2604
2605 assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
2606 assert_eq!(
2607 buf.traps()
2608 .iter()
2609 .map(|trap| (trap.offset, trap.code))
2610 .collect::<Vec<_>>(),
2611 vec![
2612 (1, TrapCode::HEAP_OUT_OF_BOUNDS),
2613 (2, TrapCode::INTEGER_OVERFLOW),
2614 (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
2615 ]
2616 );
2617 let call_sites: Vec<_> = buf.call_sites().collect();
2618 assert_eq!(call_sites[0].ret_addr, 2);
2619 assert_eq!(
2620 call_sites[0].exception_handlers,
2621 &[
2622 FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),
2623 FinalizedMachExceptionHandler::Default(4)
2624 ],
2625 );
2626 assert_eq!(
2627 buf.relocs()
2628 .iter()
2629 .map(|reloc| (reloc.offset, reloc.kind))
2630 .collect::<Vec<_>>(),
2631 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2632 );
2633 }
2634}