cranelift_codegen/machinst/buffer.rs
1//! In-memory representation of compiled machine code, with labels and fixups to
2//! refer to those labels. Handles constant-pool island insertion and also
3//! veneer insertion for out-of-range jumps.
4//!
5//! This code exists to solve three problems:
6//!
7//! - Branch targets for forward branches are not known until later, when we
8//! emit code in a single pass through the instruction structs.
9//!
10//! - On many architectures, address references or offsets have limited range.
11//! For example, on AArch64, conditional branches can only target code +/- 1MB
12//! from the branch itself.
13//!
14//! - The lowering of control flow from the CFG-with-edges produced by
15//! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16//! edge blocks when the register allocator does not need to insert any
17//! spills/reloads/moves in edge blocks, results in many suboptimal branch
18//! patterns. The lowering also pays no attention to block order, and so
19//! two-target conditional forms (cond-br followed by uncond-br) can often by
20//! avoided because one of the targets is the fallthrough. There are several
21//! cases here where we can simplify to use fewer branches.
22//!
23//! This "buffer" implements a single-pass code emission strategy (with a later
24//! "fixup" pass, but only through recorded fixups, not all instructions). The
25//! basic idea is:
26//!
27//! - Emit branches as they are, including two-target (cond/uncond) compound
28//! forms, but with zero offsets and optimistically assuming the target will be
29//! in range. Record the "fixup" for later. Targets are denoted instead by
30//! symbolic "labels" that are then bound to certain offsets in the buffer as
31//! we emit code. (Nominally, there is a label at the start of every basic
32//! block.)
33//!
34//! - As we do this, track the offset in the buffer at which the first label
35//! reference "goes out of range". We call this the "deadline". If we reach the
36//! deadline and we still have not bound the label to which an unresolved branch
37//! refers, we have a problem!
38//!
39//! - To solve this problem, we emit "islands" full of "veneers". An island is
40//! simply a chunk of code inserted in the middle of the code actually produced
41//! by the emitter (e.g., vcode iterating over instruction structs). The emitter
42//! has some awareness of this: it either asks for an island between blocks, so
43//! it is not accidentally executed, or else it emits a branch around the island
44//! when all other options fail (see `Inst::EmitIsland` meta-instruction).
45//!
46//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47//! that implements a longer-range reference to a label. The idea is that, for
48//! example, a branch with a limited range can branch to a "veneer" instead,
49//! which is simply a branch in a form that can use a longer-range reference. On
50//! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51//! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52//! conditional branch's label reference can be fixed up with a "veneer" to
53//! achieve a longer range.
54//!
55//! - To implement all of this, we require the backend to provide a `LabelUse`
56//! type that implements a trait. This is nominally an enum that records one of
57//! several kinds of references to an offset in code -- basically, a relocation
58//! type -- and will usually correspond to different instruction formats. The
59//! `LabelUse` implementation specifies the maximum range, how to patch in the
60//! actual label location when known, and how to generate a veneer to extend the
61//! range.
62//!
63//! That satisfies label references, but we still may have suboptimal branch
64//! patterns. To clean up the branches, we do a simple "peephole"-style
65//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66//! informs the buffer of branches in the code and, in the case of conditionals,
67//! the code that would have been emitted to invert this branch's condition. We
68//! track the "latest branches": these are branches that are contiguous up to
69//! the current offset. (If any code is emitted after a branch, that branch or
70//! run of contiguous branches is no longer "latest".) The latest branches are
71//! those that we can edit by simply truncating the buffer and doing something
72//! else instead.
73//!
74//! To optimize branches, we implement several simple rules, and try to apply
75//! them to the "latest branches" when possible:
76//!
77//! - A branch with a label target, when that label is bound to the ending
78//! offset of the branch (the fallthrough location), can be removed altogether,
79//! because the branch would have no effect).
80//!
81//! - An unconditional branch that starts at a label location, and branches to
82//! another label, results in a "label alias": all references to the label bound
83//! *to* this branch instruction are instead resolved to the *target* of the
84//! branch instruction. This effectively removes empty blocks that just
85//! unconditionally branch to the next block. We call this "branch threading".
86//!
87//! - A conditional followed by an unconditional, when the conditional branches
88//! to the unconditional's fallthrough, results in (i) the truncation of the
89//! unconditional, (ii) the inversion of the condition's condition, and (iii)
90//! replacement of the conditional's target (using the original target of the
91//! unconditional). This is a fancy way of saying "we can flip a two-target
92//! conditional branch's taken/not-taken targets if it works better with our
93//! fallthrough". To make this work, the emitter actually gives the buffer
94//! *both* forms of every conditional branch: the true form is emitted into the
95//! buffer, and the "inverted" machine-code bytes are provided as part of the
96//! branch-fixup metadata.
97//!
98//! - An unconditional B preceded by another unconditional P, when B's label(s) have
99//! been redirected to target(B), can be removed entirely. This is an extension
100//! of the branch-threading optimization, and is valid because if we know there
101//! will be no fallthrough into this branch instruction (the prior instruction
102//! is an unconditional jump), and if we know we have successfully redirected
103//! all labels, then this branch instruction is unreachable. Note that this
104//! works because the redirection happens before the label is ever resolved
105//! (fixups happen at island emission time, at which point latest-branches are
106//! cleared, or at the end of emission), so we are sure to catch and redirect
107//! all possible paths to this instruction.
108//!
109//! # Branch-optimization Correctness
110//!
111//! The branch-optimization mechanism depends on a few data structures with
112//! invariants, which are always held outside the scope of top-level public
113//! methods:
114//!
115//! - The latest-branches list. Each entry describes a span of the buffer
116//! (start/end offsets), the label target, the corresponding fixup-list entry
117//! index, and the bytes (must be the same length) for the inverted form, if
118//! conditional. The list of labels that are bound to the start-offset of this
119//! branch is *complete* (if any label has a resolved offset equal to `start`
120//! and is not an alias, it must appear in this list) and *precise* (no label
121//! in this list can be bound to another offset). No label in this list should
122//! be an alias. No two branch ranges can overlap, and branches are in
123//! ascending-offset order.
124//!
125//! - The labels-at-tail list. This contains all MachLabels that have been bound
126//! to (whose resolved offsets are equal to) the tail offset of the buffer.
127//! No label in this list should be an alias.
128//!
129//! - The label_offsets array, containing the bound offset of a label or
130//! UNKNOWN. No label can be bound at an offset greater than the current
131//! buffer tail.
132//!
133//! - The label_aliases array, containing another label to which a label is
134//! bound or UNKNOWN. A label's resolved offset is the resolved offset
135//! of the label it is aliased to, if this is set.
136//!
137//! We argue below, at each method, how the invariants in these data structures
138//! are maintained (grep for "Post-invariant").
139//!
140//! Given these invariants, we argue why each optimization preserves execution
141//! semantics below (grep for "Preserves execution semantics").
142//!
143//! # Avoiding Quadratic Behavior
144//!
145//! There are two cases where we've had to take some care to avoid
146//! quadratic worst-case behavior:
147//!
148//! - The "labels at this branch" list can grow unboundedly if the
149//! code generator binds many labels at one location. If the count
150//! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
151//! simply abort an optimization early in a way that is always correct
152//! but is conservative.
153//!
154//! - The fixup list can interact with island emission to create
155//! "quadratic island behavior". In a little more detail, one can hit
156//! this behavior by having some pending fixups (forward label
157//! references) with long-range label-use kinds, and some others
158//! with shorter-range references that nonetheless still are pending
159//! long enough to trigger island generation. In such a case, we
160//! process the fixup list, generate veneers to extend some forward
161//! references' ranges, but leave the other (longer-range) ones
162//! alone. The way this was implemented put them back on a list and
163//! resulted in quadratic behavior.
164//!
165//! To avoid this fixups are split into two lists: one "pending" list and one
166//! final list. The pending list is kept around for handling fixups related to
167//! branches so it can be edited/truncated. When an island is reached, which
168//! starts processing fixups, all pending fixups are flushed into the final
169//! list. The final list is a `BinaryHeap` which enables fixup processing to
170//! only process those which are required during island emission, deferring
171//! all longer-range fixups to later.
172
173use crate::binemit::{Addend, CodeOffset, Reloc};
174use crate::ir::function::FunctionParameters;
175use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
176use crate::isa::unwind::UnwindInst;
177use crate::machinst::{
178 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
179};
180use crate::trace;
181use crate::{MachInstEmitState, ir};
182use crate::{VCodeConstantData, timing};
183use core::ops::Range;
184use cranelift_control::ControlPlane;
185use cranelift_entity::packed_option::PackedOption;
186use cranelift_entity::{PrimaryMap, entity_impl};
187use smallvec::SmallVec;
188use std::cmp::Ordering;
189use std::collections::BinaryHeap;
190use std::mem;
191use std::string::String;
192use std::vec::Vec;
193
194#[cfg(feature = "enable-serde")]
195use serde::{Deserialize, Serialize};
196
197#[cfg(feature = "enable-serde")]
198pub trait CompilePhase {
199 type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
200 type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
201}
202
203#[cfg(not(feature = "enable-serde"))]
204pub trait CompilePhase {
205 type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
206 type SourceLocType: core::fmt::Debug + PartialEq + Clone;
207}
208
209/// Status of a compiled artifact that needs patching before being used.
210#[derive(Clone, Debug, PartialEq)]
211#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
212pub struct Stencil;
213
214/// Status of a compiled artifact ready to use.
215#[derive(Clone, Debug, PartialEq)]
216pub struct Final;
217
218impl CompilePhase for Stencil {
219 type MachSrcLocType = MachSrcLoc<Stencil>;
220 type SourceLocType = RelSourceLoc;
221}
222
223impl CompilePhase for Final {
224 type MachSrcLocType = MachSrcLoc<Final>;
225 type SourceLocType = SourceLoc;
226}
227
228#[derive(Clone, Copy, Debug, PartialEq, Eq)]
229enum ForceVeneers {
230 Yes,
231 No,
232}
233
234/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
235/// in bulk.
236///
237/// This struct uses `SmallVec`s to support small-ish function bodies without
238/// any heap allocation. As such, it will be several kilobytes large. This is
239/// likely fine as long as it is stack-allocated for function emission then
240/// thrown away; but beware if many buffer objects are retained persistently.
241pub struct MachBuffer<I: VCodeInst> {
242 /// The buffer contents, as raw bytes.
243 data: SmallVec<[u8; 1024]>,
244 /// The required alignment of this buffer.
245 min_alignment: u32,
246 /// Any relocations referring to this code. Note that only *external*
247 /// relocations are tracked here; references to labels within the buffer are
248 /// resolved before emission.
249 relocs: SmallVec<[MachReloc; 16]>,
250 /// Any trap records referring to this code.
251 traps: SmallVec<[MachTrap; 16]>,
252 /// Any call site records referring to this code.
253 call_sites: SmallVec<[MachCallSite; 16]>,
254 /// Any exception-handler records referred to at call sites.
255 exception_handlers: SmallVec<[(PackedOption<ir::ExceptionTag>, MachLabel); 16]>,
256 /// Any source location mappings referring to this code.
257 srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
258 /// Any user stack maps for this code.
259 ///
260 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
261 /// by code offset, and each stack map covers `span` bytes on the stack.
262 user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
263 /// Any unwind info at a given location.
264 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
265 /// The current source location in progress (after `start_srcloc()` and
266 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
267 cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
268 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
269 label_offsets: SmallVec<[CodeOffset; 16]>,
270 /// Label aliases: when one label points to an unconditional jump, and that
271 /// jump points to another label, we can redirect references to the first
272 /// label immediately to the second.
273 ///
274 /// Invariant: we don't have label-alias cycles. We ensure this by,
275 /// before setting label A to alias label B, resolving B's alias
276 /// target (iteratively until a non-aliased label); if B is already
277 /// aliased to A, then we cannot alias A back to B.
278 label_aliases: SmallVec<[MachLabel; 16]>,
279 /// Constants that must be emitted at some point.
280 pending_constants: SmallVec<[VCodeConstant; 16]>,
281 /// Byte size of all constants in `pending_constants`.
282 pending_constants_size: CodeOffset,
283 /// Traps that must be emitted at some point.
284 pending_traps: SmallVec<[MachLabelTrap; 16]>,
285 /// Fixups that haven't yet been flushed into `fixup_records` below and may
286 /// be related to branches that are chomped. These all get added to
287 /// `fixup_records` during island emission.
288 pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
289 /// The nearest upcoming deadline for entries in `pending_fixup_records`.
290 pending_fixup_deadline: CodeOffset,
291 /// Fixups that must be performed after all code is emitted.
292 fixup_records: BinaryHeap<MachLabelFixup<I>>,
293 /// Latest branches, to facilitate in-place editing for better fallthrough
294 /// behavior and empty-block removal.
295 latest_branches: SmallVec<[MachBranch; 4]>,
296 /// All labels at the current offset (emission tail). This is lazily
297 /// cleared: it is actually accurate as long as the current offset is
298 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
299 /// be considered as empty.
300 ///
301 /// For correctness, this *must* be complete (i.e., the vector must contain
302 /// all labels whose offsets are resolved to the current tail), because we
303 /// rely on it to update labels when we truncate branches.
304 labels_at_tail: SmallVec<[MachLabel; 4]>,
305 /// The last offset at which `labels_at_tail` is valid. It is conceptually
306 /// always describing the tail of the buffer, but we do not clear
307 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
308 /// when the offset has grown past this (`labels_at_tail_off`) point.
309 /// Always <= `cur_offset()`.
310 labels_at_tail_off: CodeOffset,
311 /// Metadata about all constants that this function has access to.
312 ///
313 /// This records the size/alignment of all constants (not the actual data)
314 /// along with the last available label generated for the constant. This map
315 /// is consulted when constants are referred to and the label assigned to a
316 /// constant may change over time as well.
317 constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
318 /// All recorded usages of constants as pairs of the constant and where the
319 /// constant needs to be placed within `self.data`. Note that the same
320 /// constant may appear in this array multiple times if it was emitted
321 /// multiple times.
322 used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
323 /// Indicates when a patchable region is currently open, to guard that it's
324 /// not possible to nest patchable regions.
325 open_patchable: bool,
326}
327
328impl MachBufferFinalized<Stencil> {
329 /// Get a finalized machine buffer by applying the function's base source location.
330 pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
331 MachBufferFinalized {
332 data: self.data,
333 relocs: self.relocs,
334 traps: self.traps,
335 call_sites: self.call_sites,
336 exception_handlers: self.exception_handlers,
337 srclocs: self
338 .srclocs
339 .into_iter()
340 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
341 .collect(),
342 user_stack_maps: self.user_stack_maps,
343 unwind_info: self.unwind_info,
344 alignment: self.alignment,
345 }
346 }
347}
348
349/// A `MachBuffer` once emission is completed: holds generated code and records,
350/// without fixups. This allows the type to be independent of the backend.
351#[derive(PartialEq, Debug, Clone)]
352#[cfg_attr(
353 feature = "enable-serde",
354 derive(serde_derive::Serialize, serde_derive::Deserialize)
355)]
356pub struct MachBufferFinalized<T: CompilePhase> {
357 /// The buffer contents, as raw bytes.
358 pub(crate) data: SmallVec<[u8; 1024]>,
359 /// Any relocations referring to this code. Note that only *external*
360 /// relocations are tracked here; references to labels within the buffer are
361 /// resolved before emission.
362 pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
363 /// Any trap records referring to this code.
364 pub(crate) traps: SmallVec<[MachTrap; 16]>,
365 /// Any call site records referring to this code.
366 pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
367 /// Any exception-handler records referred to at call sites.
368 pub(crate) exception_handlers: SmallVec<[(PackedOption<ir::ExceptionTag>, CodeOffset); 16]>,
369 /// Any source location mappings referring to this code.
370 pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
371 /// Any user stack maps for this code.
372 ///
373 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
374 /// by code offset, and each stack map covers `span` bytes on the stack.
375 pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
376 /// Any unwind info at a given location.
377 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
378 /// The required alignment of this buffer.
379 pub alignment: u32,
380}
381
382const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
383const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
384
385/// Threshold on max length of `labels_at_this_branch` list to avoid
386/// unbounded quadratic behavior (see comment below at use-site).
387const LABEL_LIST_THRESHOLD: usize = 100;
388
389/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
390/// the point at which it is used by emitted code; the buffer records "fixups"
391/// for references to the label, and will come back and patch the code
392/// appropriately when the label's location is eventually known.
393#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
394pub struct MachLabel(u32);
395entity_impl!(MachLabel);
396
397impl MachLabel {
398 /// Get a label for a block. (The first N MachLabels are always reserved for
399 /// the N blocks in the vcode.)
400 pub fn from_block(bindex: BlockIndex) -> MachLabel {
401 MachLabel(bindex.index() as u32)
402 }
403
404 /// Creates a string representing this label, for convenience.
405 pub fn to_string(&self) -> String {
406 format!("label{}", self.0)
407 }
408}
409
410impl Default for MachLabel {
411 fn default() -> Self {
412 UNKNOWN_LABEL
413 }
414}
415
416/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
417/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
418/// the [`OpenPatchRegion`] token in the process.
419pub struct OpenPatchRegion(usize);
420
421/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
422/// of where you might want to use this is for patching instructions that mention constants that
423/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
424/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
425/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
426/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
427/// bytes, and the constants uses can be updated directly.
428pub struct PatchRegion {
429 range: Range<usize>,
430}
431
432impl PatchRegion {
433 /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
434 pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
435 &mut buffer.data[self.range]
436 }
437}
438
439impl<I: VCodeInst> MachBuffer<I> {
440 /// Create a new section, known to start at `start_offset` and with a size limited to
441 /// `length_limit`.
442 pub fn new() -> MachBuffer<I> {
443 MachBuffer {
444 data: SmallVec::new(),
445 min_alignment: I::function_alignment().minimum,
446 relocs: SmallVec::new(),
447 traps: SmallVec::new(),
448 call_sites: SmallVec::new(),
449 exception_handlers: SmallVec::new(),
450 srclocs: SmallVec::new(),
451 user_stack_maps: SmallVec::new(),
452 unwind_info: SmallVec::new(),
453 cur_srcloc: None,
454 label_offsets: SmallVec::new(),
455 label_aliases: SmallVec::new(),
456 pending_constants: SmallVec::new(),
457 pending_constants_size: 0,
458 pending_traps: SmallVec::new(),
459 pending_fixup_records: SmallVec::new(),
460 pending_fixup_deadline: u32::MAX,
461 fixup_records: Default::default(),
462 latest_branches: SmallVec::new(),
463 labels_at_tail: SmallVec::new(),
464 labels_at_tail_off: 0,
465 constants: Default::default(),
466 used_constants: Default::default(),
467 open_patchable: false,
468 }
469 }
470
471 /// Current offset from start of buffer.
472 pub fn cur_offset(&self) -> CodeOffset {
473 self.data.len() as CodeOffset
474 }
475
476 /// Add a byte.
477 pub fn put1(&mut self, value: u8) {
478 self.data.push(value);
479
480 // Post-invariant: conceptual-labels_at_tail contains a complete and
481 // precise list of labels bound at `cur_offset()`. We have advanced
482 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
483 // before, it is not anymore (and it cannot become equal, because
484 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
485 // conceptually empty (even though it is only lazily cleared). No labels
486 // can be bound at this new offset (by invariant on `label_offsets`).
487 // Hence the invariant holds.
488 }
489
490 /// Add 2 bytes.
491 pub fn put2(&mut self, value: u16) {
492 let bytes = value.to_le_bytes();
493 self.data.extend_from_slice(&bytes[..]);
494
495 // Post-invariant: as for `put1()`.
496 }
497
498 /// Add 4 bytes.
499 pub fn put4(&mut self, value: u32) {
500 let bytes = value.to_le_bytes();
501 self.data.extend_from_slice(&bytes[..]);
502
503 // Post-invariant: as for `put1()`.
504 }
505
506 /// Add 8 bytes.
507 pub fn put8(&mut self, value: u64) {
508 let bytes = value.to_le_bytes();
509 self.data.extend_from_slice(&bytes[..]);
510
511 // Post-invariant: as for `put1()`.
512 }
513
514 /// Add a slice of bytes.
515 pub fn put_data(&mut self, data: &[u8]) {
516 self.data.extend_from_slice(data);
517
518 // Post-invariant: as for `put1()`.
519 }
520
521 /// Reserve appended space and return a mutable slice referring to it.
522 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
523 let off = self.data.len();
524 let new_len = self.data.len() + len;
525 self.data.resize(new_len, 0);
526 &mut self.data[off..]
527
528 // Post-invariant: as for `put1()`.
529 }
530
531 /// Align up to the given alignment.
532 pub fn align_to(&mut self, align_to: CodeOffset) {
533 trace!("MachBuffer: align to {}", align_to);
534 assert!(
535 align_to.is_power_of_two(),
536 "{align_to} is not a power of two"
537 );
538 while self.cur_offset() & (align_to - 1) != 0 {
539 self.put1(0);
540 }
541
542 // Post-invariant: as for `put1()`.
543 }
544
545 /// Begin a region of patchable code. There is one requirement for the
546 /// code that is emitted: It must not introduce any instructions that
547 /// could be chomped (branches are an example of this). In other words,
548 /// you must not call [`MachBuffer::add_cond_branch`] or
549 /// [`MachBuffer::add_uncond_branch`] between calls to this method and
550 /// [`MachBuffer::end_patchable`].
551 pub fn start_patchable(&mut self) -> OpenPatchRegion {
552 assert!(!self.open_patchable, "Patchable regions may not be nested");
553 self.open_patchable = true;
554 OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
555 }
556
557 /// End a region of patchable code, yielding a [`PatchRegion`] value that
558 /// can be consumed later to produce a one-off mutable slice to the
559 /// associated region of the data buffer.
560 pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
561 // No need to assert the state of `open_patchable` here, as we take
562 // ownership of the only `OpenPatchable` value.
563 self.open_patchable = false;
564 let end = usize::try_from(self.cur_offset()).unwrap();
565 PatchRegion { range: open.0..end }
566 }
567
568 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
569 /// offset yet.
570 pub fn get_label(&mut self) -> MachLabel {
571 let l = self.label_offsets.len() as u32;
572 self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
573 self.label_aliases.push(UNKNOWN_LABEL);
574 trace!("MachBuffer: new label -> {:?}", MachLabel(l));
575 MachLabel(l)
576
577 // Post-invariant: the only mutation is to add a new label; it has no
578 // bound offset yet, so it trivially satisfies all invariants.
579 }
580
581 /// Reserve the first N MachLabels for blocks.
582 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
583 trace!("MachBuffer: first {} labels are for blocks", blocks);
584 debug_assert!(self.label_offsets.is_empty());
585 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
586 self.label_aliases.resize(blocks, UNKNOWN_LABEL);
587
588 // Post-invariant: as for `get_label()`.
589 }
590
591 /// Registers metadata in this `MachBuffer` about the `constants` provided.
592 ///
593 /// This will record the size/alignment of all constants which will prepare
594 /// them for emission later on.
595 pub fn register_constants(&mut self, constants: &VCodeConstants) {
596 for (c, val) in constants.iter() {
597 self.register_constant(&c, val);
598 }
599 }
600
601 /// Similar to [`MachBuffer::register_constants`] but registers a
602 /// single constant metadata. This function is useful in
603 /// situations where not all constants are known at the time of
604 /// emission.
605 pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
606 let c2 = self.constants.push(MachBufferConstant {
607 upcoming_label: None,
608 align: data.alignment(),
609 size: data.as_slice().len(),
610 });
611 assert_eq!(*constant, c2);
612 }
613
614 /// Completes constant emission by iterating over `self.used_constants` and
615 /// filling in the "holes" with the constant values provided by `constants`.
616 ///
617 /// Returns the alignment required for this entire buffer. Alignment starts
618 /// at the ISA's minimum function alignment and can be increased due to
619 /// constant requirements.
620 fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
621 let mut alignment = self.min_alignment;
622 for (constant, offset) in mem::take(&mut self.used_constants) {
623 let constant = constants.get(constant);
624 let data = constant.as_slice();
625 self.data[offset as usize..][..data.len()].copy_from_slice(data);
626 alignment = constant.alignment().max(alignment);
627 }
628 alignment
629 }
630
631 /// Returns a label that can be used to refer to the `constant` provided.
632 ///
633 /// This will automatically defer a new constant to be emitted for
634 /// `constant` if it has not been previously emitted. Note that this
635 /// function may return a different label for the same constant at
636 /// different points in time. The label is valid to use only from the
637 /// current location; the MachBuffer takes care to emit the same constant
638 /// multiple times if needed so the constant is always in range.
639 pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
640 let MachBufferConstant {
641 align,
642 size,
643 upcoming_label,
644 } = self.constants[constant];
645 if let Some(label) = upcoming_label {
646 return label;
647 }
648
649 let label = self.get_label();
650 trace!(
651 "defer constant: eventually emit {size} bytes aligned \
652 to {align} at label {label:?}",
653 );
654 self.pending_constants.push(constant);
655 self.pending_constants_size += size as u32;
656 self.constants[constant].upcoming_label = Some(label);
657 label
658 }
659
660 /// Bind a label to the current offset. A label can only be bound once.
661 pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
662 trace!(
663 "MachBuffer: bind label {:?} at offset {}",
664 label,
665 self.cur_offset()
666 );
667 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
668 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
669 let offset = self.cur_offset();
670 self.label_offsets[label.0 as usize] = offset;
671 self.lazily_clear_labels_at_tail();
672 self.labels_at_tail.push(label);
673
674 // Invariants hold: bound offset of label is <= cur_offset (in fact it
675 // is equal). If the `labels_at_tail` list was complete and precise
676 // before, it is still, because we have bound this label to the current
677 // offset and added it to the list (which contains all labels at the
678 // current offset).
679
680 self.optimize_branches(ctrl_plane);
681
682 // Post-invariant: by `optimize_branches()` (see argument there).
683 }
684
685 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
686 /// offset that it applies to.
687 fn lazily_clear_labels_at_tail(&mut self) {
688 let offset = self.cur_offset();
689 if offset > self.labels_at_tail_off {
690 self.labels_at_tail_off = offset;
691 self.labels_at_tail.clear();
692 }
693
694 // Post-invariant: either labels_at_tail_off was at cur_offset, and
695 // state is untouched, or was less than cur_offset, in which case the
696 // labels_at_tail list was conceptually empty, and is now actually
697 // empty.
698 }
699
700 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
701 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
702 let mut iters = 0;
703 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
704 label = self.label_aliases[label.0 as usize];
705 // To protect against an infinite loop (despite our assurances to
706 // ourselves that the invariants make this impossible), assert out
707 // after 1M iterations. The number of basic blocks is limited
708 // in most contexts anyway so this should be impossible to hit with
709 // a legitimate input.
710 iters += 1;
711 assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
712 }
713 self.label_offsets[label.0 as usize]
714
715 // Post-invariant: no mutations.
716 }
717
718 /// Emit a reference to the given label with the given reference type (i.e.,
719 /// branch-instruction format) at the current offset. This is like a
720 /// relocation, but handled internally.
721 ///
722 /// This can be called before the branch is actually emitted; fixups will
723 /// not happen until an island is emitted or the buffer is finished.
724 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
725 trace!(
726 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
727 offset, label, kind
728 );
729
730 // Add the fixup, and update the worst-case island size based on a
731 // veneer for this label use.
732 let fixup = MachLabelFixup {
733 label,
734 offset,
735 kind,
736 };
737 self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
738 self.pending_fixup_records.push(fixup);
739
740 // Post-invariant: no mutations to branches/labels data structures.
741 }
742
743 /// Inform the buffer of an unconditional branch at the given offset,
744 /// targeting the given label. May be used to optimize branches.
745 /// The last added label-use must correspond to this branch.
746 /// This must be called when the current offset is equal to `start`; i.e.,
747 /// before actually emitting the branch. This implies that for a branch that
748 /// uses a label and is eligible for optimizations by the MachBuffer, the
749 /// proper sequence is:
750 ///
751 /// - Call `use_label_at_offset()` to emit the fixup record.
752 /// - Call `add_uncond_branch()` to make note of the branch.
753 /// - Emit the bytes for the branch's machine code.
754 ///
755 /// Additional requirement: no labels may be bound between `start` and `end`
756 /// (exclusive on both ends).
757 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
758 debug_assert!(
759 !self.open_patchable,
760 "Branch instruction inserted within a patchable region"
761 );
762 assert!(self.cur_offset() == start);
763 debug_assert!(end > start);
764 assert!(!self.pending_fixup_records.is_empty());
765 let fixup = self.pending_fixup_records.len() - 1;
766 self.lazily_clear_labels_at_tail();
767 self.latest_branches.push(MachBranch {
768 start,
769 end,
770 target,
771 fixup,
772 inverted: None,
773 labels_at_this_branch: self.labels_at_tail.clone(),
774 });
775
776 // Post-invariant: we asserted branch start is current tail; the list of
777 // labels at branch is cloned from list of labels at current tail.
778 }
779
780 /// Inform the buffer of a conditional branch at the given offset,
781 /// targeting the given label. May be used to optimize branches.
782 /// The last added label-use must correspond to this branch.
783 ///
784 /// Additional requirement: no labels may be bound between `start` and `end`
785 /// (exclusive on both ends).
786 pub fn add_cond_branch(
787 &mut self,
788 start: CodeOffset,
789 end: CodeOffset,
790 target: MachLabel,
791 inverted: &[u8],
792 ) {
793 debug_assert!(
794 !self.open_patchable,
795 "Branch instruction inserted within a patchable region"
796 );
797 assert!(self.cur_offset() == start);
798 debug_assert!(end > start);
799 assert!(!self.pending_fixup_records.is_empty());
800 debug_assert!(
801 inverted.len() == (end - start) as usize,
802 "branch length = {}, but inverted length = {}",
803 end - start,
804 inverted.len()
805 );
806 let fixup = self.pending_fixup_records.len() - 1;
807 let inverted = Some(SmallVec::from(inverted));
808 self.lazily_clear_labels_at_tail();
809 self.latest_branches.push(MachBranch {
810 start,
811 end,
812 target,
813 fixup,
814 inverted,
815 labels_at_this_branch: self.labels_at_tail.clone(),
816 });
817
818 // Post-invariant: we asserted branch start is current tail; labels at
819 // branch list is cloned from list of labels at current tail.
820 }
821
822 fn truncate_last_branch(&mut self) {
823 debug_assert!(
824 !self.open_patchable,
825 "Branch instruction truncated within a patchable region"
826 );
827
828 self.lazily_clear_labels_at_tail();
829 // Invariants hold at this point.
830
831 let b = self.latest_branches.pop().unwrap();
832 assert!(b.end == self.cur_offset());
833
834 // State:
835 // [PRE CODE]
836 // Offset b.start, b.labels_at_this_branch:
837 // [BRANCH CODE]
838 // cur_off, self.labels_at_tail -->
839 // (end of buffer)
840 self.data.truncate(b.start as usize);
841 self.pending_fixup_records.truncate(b.fixup);
842 while let Some(last_srcloc) = self.srclocs.last_mut() {
843 if last_srcloc.end <= b.start {
844 break;
845 }
846 if last_srcloc.start < b.start {
847 last_srcloc.end = b.start;
848 break;
849 }
850 self.srclocs.pop();
851 }
852 // State:
853 // [PRE CODE]
854 // cur_off, Offset b.start, b.labels_at_this_branch:
855 // (end of buffer)
856 //
857 // self.labels_at_tail --> (past end of buffer)
858 let cur_off = self.cur_offset();
859 self.labels_at_tail_off = cur_off;
860 // State:
861 // [PRE CODE]
862 // cur_off, Offset b.start, b.labels_at_this_branch,
863 // self.labels_at_tail:
864 // (end of buffer)
865 //
866 // resolve_label_offset(l) for l in labels_at_tail:
867 // (past end of buffer)
868
869 trace!(
870 "truncate_last_branch: truncated {:?}; off now {}",
871 b, cur_off
872 );
873
874 // Fix up resolved label offsets for labels at tail.
875 for &l in &self.labels_at_tail {
876 self.label_offsets[l.0 as usize] = cur_off;
877 }
878 // Old labels_at_this_branch are now at cur_off.
879 self.labels_at_tail.extend(b.labels_at_this_branch);
880
881 // Post-invariant: this operation is defined to truncate the buffer,
882 // which moves cur_off backward, and to move labels at the end of the
883 // buffer back to the start-of-branch offset.
884 //
885 // latest_branches satisfies all invariants:
886 // - it has no branches past the end of the buffer (branches are in
887 // order, we removed the last one, and we truncated the buffer to just
888 // before the start of that branch)
889 // - no labels were moved to lower offsets than the (new) cur_off, so
890 // the labels_at_this_branch list for any other branch need not change.
891 //
892 // labels_at_tail satisfies all invariants:
893 // - all labels that were at the tail after the truncated branch are
894 // moved backward to just before the branch, which becomes the new tail;
895 // thus every element in the list should remain (ensured by `.extend()`
896 // above).
897 // - all labels that refer to the new tail, which is the start-offset of
898 // the truncated branch, must be present. The `labels_at_this_branch`
899 // list in the truncated branch's record is a complete and precise list
900 // of exactly these labels; we append these to labels_at_tail.
901 // - labels_at_tail_off is at cur_off after truncation occurs, so the
902 // list is valid (not to be lazily cleared).
903 //
904 // The stated operation was performed:
905 // - For each label at the end of the buffer prior to this method, it
906 // now resolves to the new (truncated) end of the buffer: it must have
907 // been in `labels_at_tail` (this list is precise and complete, and
908 // the tail was at the end of the truncated branch on entry), and we
909 // iterate over this list and set `label_offsets` to the new tail.
910 // None of these labels could have been an alias (by invariant), so
911 // `label_offsets` is authoritative for each.
912 // - No other labels will be past the end of the buffer, because of the
913 // requirement that no labels be bound to the middle of branch ranges
914 // (see comments to `add_{cond,uncond}_branch()`).
915 // - The buffer is truncated to just before the last branch, and the
916 // fixup record referring to that last branch is removed.
917 }
918
919 /// Performs various optimizations on branches pointing at the current label.
920 pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
921 if ctrl_plane.get_decision() {
922 return;
923 }
924
925 self.lazily_clear_labels_at_tail();
926 // Invariants valid at this point.
927
928 trace!(
929 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
930 self.latest_branches, self.labels_at_tail, self.pending_fixup_records
931 );
932
933 // We continue to munch on branches at the tail of the buffer until no
934 // more rules apply. Note that the loop only continues if a branch is
935 // actually truncated (or if labels are redirected away from a branch),
936 // so this always makes progress.
937 while let Some(b) = self.latest_branches.last() {
938 let cur_off = self.cur_offset();
939 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
940 // If there has been any code emission since the end of the last branch or
941 // label definition, then there's nothing we can edit (because we
942 // don't move code once placed, only back up and overwrite), so
943 // clear the records and finish.
944 if b.end < cur_off {
945 break;
946 }
947
948 // If the "labels at this branch" list on this branch is
949 // longer than a threshold, don't do any simplification,
950 // and let the branch remain to separate those labels from
951 // the current tail. This avoids quadratic behavior (see
952 // #3468): otherwise, if a long string of "goto next;
953 // next:" patterns are emitted, all of the labels will
954 // coalesce into a long list of aliases for the current
955 // buffer tail. We must track all aliases of the current
956 // tail for correctness, but we are also allowed to skip
957 // optimization (removal) of any branch, so we take the
958 // escape hatch here and let it stand. In effect this
959 // "spreads" the many thousands of labels in the
960 // pathological case among an actual (harmless but
961 // suboptimal) instruction once per N labels.
962 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
963 break;
964 }
965
966 // Invariant: we are looking at a branch that ends at the tail of
967 // the buffer.
968
969 // For any branch, conditional or unconditional:
970 // - If the target is a label at the current offset, then remove
971 // the conditional branch, and reset all labels that targeted
972 // the current offset (end of branch) to the truncated
973 // end-of-code.
974 //
975 // Preserves execution semantics: a branch to its own fallthrough
976 // address is equivalent to a no-op; in both cases, nextPC is the
977 // fallthrough.
978 if self.resolve_label_offset(b.target) == cur_off {
979 trace!("branch with target == cur off; truncating");
980 self.truncate_last_branch();
981 continue;
982 }
983
984 // If latest is an unconditional branch:
985 //
986 // - If the branch's target is not its own start address, then for
987 // each label at the start of branch, make the label an alias of the
988 // branch target, and remove the label from the "labels at this
989 // branch" list.
990 //
991 // - Preserves execution semantics: an unconditional branch's
992 // only effect is to set PC to a new PC; this change simply
993 // collapses one step in the step-semantics.
994 //
995 // - Post-invariant: the labels that were bound to the start of
996 // this branch become aliases, so they must not be present in any
997 // labels-at-this-branch list or the labels-at-tail list. The
998 // labels are removed form the latest-branch record's
999 // labels-at-this-branch list, and are never placed in the
1000 // labels-at-tail list. Furthermore, it is correct that they are
1001 // not in either list, because they are now aliases, and labels
1002 // that are aliases remain aliases forever.
1003 //
1004 // - If there is a prior unconditional branch that ends just before
1005 // this one begins, and this branch has no labels bound to its
1006 // start, then we can truncate this branch, because it is entirely
1007 // unreachable (we have redirected all labels that make it
1008 // reachable otherwise). Do so and continue around the loop.
1009 //
1010 // - Preserves execution semantics: the branch is unreachable,
1011 // because execution can only flow into an instruction from the
1012 // prior instruction's fallthrough or from a branch bound to that
1013 // instruction's start offset. Unconditional branches have no
1014 // fallthrough, so if the prior instruction is an unconditional
1015 // branch, no fallthrough entry can happen. The
1016 // labels-at-this-branch list is complete (by invariant), so if it
1017 // is empty, then the instruction is entirely unreachable. Thus,
1018 // it can be removed.
1019 //
1020 // - Post-invariant: ensured by truncate_last_branch().
1021 //
1022 // - If there is a prior conditional branch whose target label
1023 // resolves to the current offset (branches around the
1024 // unconditional branch), then remove the unconditional branch,
1025 // and make the target of the unconditional the target of the
1026 // conditional instead.
1027 //
1028 // - Preserves execution semantics: previously we had:
1029 //
1030 // L1:
1031 // cond_br L2
1032 // br L3
1033 // L2:
1034 // (end of buffer)
1035 //
1036 // by removing the last branch, we have:
1037 //
1038 // L1:
1039 // cond_br L2
1040 // L2:
1041 // (end of buffer)
1042 //
1043 // we then fix up the records for the conditional branch to
1044 // have:
1045 //
1046 // L1:
1047 // cond_br.inverted L3
1048 // L2:
1049 //
1050 // In the original code, control flow reaches L2 when the
1051 // conditional branch's predicate is true, and L3 otherwise. In
1052 // the optimized code, the same is true.
1053 //
1054 // - Post-invariant: all edits to latest_branches and
1055 // labels_at_tail are performed by `truncate_last_branch()`,
1056 // which maintains the invariants at each step.
1057
1058 if b.is_uncond() {
1059 // Set any label equal to current branch's start as an alias of
1060 // the branch's target, if the target is not the branch itself
1061 // (i.e., an infinite loop).
1062 //
1063 // We cannot perform this aliasing if the target of this branch
1064 // ultimately aliases back here; if so, we need to keep this
1065 // branch, so break out of this loop entirely (and clear the
1066 // latest-branches list below).
1067 //
1068 // Note that this check is what prevents cycles from forming in
1069 // `self.label_aliases`. To see why, consider an arbitrary start
1070 // state:
1071 //
1072 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1073 // Ln, which is not aliased.
1074 //
1075 // We would create a cycle if we assigned label_aliases[Ln]
1076 // = L1. Note that the below assignment is the only write
1077 // to label_aliases.
1078 //
1079 // By our other invariants, we have that Ln (`l` below)
1080 // resolves to the offset `b.start`, because it is in the
1081 // set `b.labels_at_this_branch`.
1082 //
1083 // If L1 were already aliased, through some arbitrarily deep
1084 // chain, to Ln, then it must also resolve to this offset
1085 // `b.start`.
1086 //
1087 // By checking the resolution of `L1` against this offset,
1088 // and aborting this branch-simplification if they are
1089 // equal, we prevent the below assignment from ever creating
1090 // a cycle.
1091 if self.resolve_label_offset(b.target) != b.start {
1092 let redirected = b.labels_at_this_branch.len();
1093 for &l in &b.labels_at_this_branch {
1094 trace!(
1095 " -> label at start of branch {:?} redirected to target {:?}",
1096 l, b.target
1097 );
1098 self.label_aliases[l.0 as usize] = b.target;
1099 // NOTE: we continue to ensure the invariant that labels
1100 // pointing to tail of buffer are in `labels_at_tail`
1101 // because we already ensured above that the last branch
1102 // cannot have a target of `cur_off`; so we never have
1103 // to put the label into `labels_at_tail` when moving it
1104 // here.
1105 }
1106 // Maintain invariant: all branches have been redirected
1107 // and are no longer pointing at the start of this branch.
1108 let mut_b = self.latest_branches.last_mut().unwrap();
1109 mut_b.labels_at_this_branch.clear();
1110
1111 if redirected > 0 {
1112 trace!(" -> after label redirects, restarting loop");
1113 continue;
1114 }
1115 } else {
1116 break;
1117 }
1118
1119 let b = self.latest_branches.last().unwrap();
1120
1121 // Examine any immediately preceding branch.
1122 if self.latest_branches.len() > 1 {
1123 let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1124 trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1125 // This uncond is immediately after another uncond; we
1126 // should have already redirected labels to this uncond away
1127 // (but check to be sure); so we can truncate this uncond.
1128 if prev_b.is_uncond()
1129 && prev_b.end == b.start
1130 && b.labels_at_this_branch.is_empty()
1131 {
1132 trace!(" -> uncond follows another uncond; truncating");
1133 self.truncate_last_branch();
1134 continue;
1135 }
1136
1137 // This uncond is immediately after a conditional, and the
1138 // conditional's target is the end of this uncond, and we've
1139 // already redirected labels to this uncond away; so we can
1140 // truncate this uncond, flip the sense of the conditional, and
1141 // set the conditional's target (in `latest_branches` and in
1142 // `fixup_records`) to the uncond's target.
1143 if prev_b.is_cond()
1144 && prev_b.end == b.start
1145 && self.resolve_label_offset(prev_b.target) == cur_off
1146 {
1147 trace!(
1148 " -> uncond follows a conditional, and conditional's target resolves to current offset"
1149 );
1150 // Save the target of the uncond (this becomes the
1151 // target of the cond), and truncate the uncond.
1152 let target = b.target;
1153 let data = prev_b.inverted.clone().unwrap();
1154 self.truncate_last_branch();
1155
1156 // Mutate the code and cond branch.
1157 let off_before_edit = self.cur_offset();
1158 let prev_b = self.latest_branches.last_mut().unwrap();
1159 let not_inverted = SmallVec::from(
1160 &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1161 );
1162
1163 // Low-level edit: replaces bytes of branch with
1164 // inverted form. cur_off remains the same afterward, so
1165 // we do not need to modify label data structures.
1166 self.data.truncate(prev_b.start as usize);
1167 self.data.extend_from_slice(&data[..]);
1168
1169 // Save the original code as the inversion of the
1170 // inverted branch, in case we later edit this branch
1171 // again.
1172 prev_b.inverted = Some(not_inverted);
1173 self.pending_fixup_records[prev_b.fixup].label = target;
1174 trace!(" -> reassigning target of condbr to {:?}", target);
1175 prev_b.target = target;
1176 debug_assert_eq!(off_before_edit, self.cur_offset());
1177 continue;
1178 }
1179 }
1180 }
1181
1182 // If we couldn't do anything with the last branch, then break.
1183 break;
1184 }
1185
1186 self.purge_latest_branches();
1187
1188 trace!(
1189 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1190 self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1191 );
1192 }
1193
1194 fn purge_latest_branches(&mut self) {
1195 // All of our branch simplification rules work only if a branch ends at
1196 // the tail of the buffer, with no following code; and branches are in
1197 // order in latest_branches; so if the last entry ends prior to
1198 // cur_offset, then clear all entries.
1199 let cur_off = self.cur_offset();
1200 if let Some(l) = self.latest_branches.last() {
1201 if l.end < cur_off {
1202 trace!("purge_latest_branches: removing branch {:?}", l);
1203 self.latest_branches.clear();
1204 }
1205 }
1206
1207 // Post-invariant: no invariant requires any branch to appear in
1208 // `latest_branches`; it is always optional. The list-clear above thus
1209 // preserves all semantics.
1210 }
1211
1212 /// Emit a trap at some point in the future with the specified code and
1213 /// stack map.
1214 ///
1215 /// This function returns a [`MachLabel`] which will be the future address
1216 /// of the trap. Jumps should refer to this label, likely by using the
1217 /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1218 /// patched in once the address of the trap is known.
1219 ///
1220 /// This will batch all traps into the end of the function.
1221 pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1222 let label = self.get_label();
1223 self.pending_traps.push(MachLabelTrap {
1224 label,
1225 code,
1226 loc: self.cur_srcloc.map(|(_start, loc)| loc),
1227 });
1228 label
1229 }
1230
1231 /// Is an island needed within the next N bytes?
1232 pub fn island_needed(&self, distance: CodeOffset) -> bool {
1233 let deadline = match self.fixup_records.peek() {
1234 Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1235 None => self.pending_fixup_deadline,
1236 };
1237 deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
1238 }
1239
1240 /// Returns the maximal offset that islands can reach if `distance` more
1241 /// bytes are appended.
1242 ///
1243 /// This is used to determine if veneers need insertions since jumps that
1244 /// can't reach past this point must get a veneer of some form.
1245 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1246 // Assume that all fixups will require veneers and that the veneers are
1247 // the worst-case size for each platform. This is an over-generalization
1248 // to avoid iterating over the `fixup_records` list or maintaining
1249 // information about it as we go along.
1250 let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
1251 as u32)
1252 * (I::LabelUse::worst_case_veneer_size())
1253 + self.pending_constants_size
1254 + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1255 self.cur_offset()
1256 .saturating_add(distance)
1257 .saturating_add(island_worst_case_size)
1258 }
1259
1260 /// Emit all pending constants and required pending veneers.
1261 ///
1262 /// Should only be called if `island_needed()` returns true, i.e., if we
1263 /// actually reach a deadline. It's not necessarily a problem to do so
1264 /// otherwise but it may result in unnecessary work during emission.
1265 pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1266 self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1267 }
1268
1269 /// Same as `emit_island`, but an internal API with a `force_veneers`
1270 /// argument to force all veneers to always get emitted for debugging.
1271 fn emit_island_maybe_forced(
1272 &mut self,
1273 force_veneers: ForceVeneers,
1274 distance: CodeOffset,
1275 ctrl_plane: &mut ControlPlane,
1276 ) {
1277 // We're going to purge fixups, so no latest-branch editing can happen
1278 // anymore.
1279 self.latest_branches.clear();
1280
1281 // End the current location tracking since anything emitted during this
1282 // function shouldn't be attributed to whatever the current source
1283 // location is.
1284 //
1285 // Note that the current source location, if it's set right now, will be
1286 // restored at the end of this island emission.
1287 let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1288 if cur_loc.is_some() {
1289 self.end_srcloc();
1290 }
1291
1292 let forced_threshold = self.worst_case_end_of_island(distance);
1293
1294 // First flush out all traps/constants so we have more labels in case
1295 // fixups are applied against these labels.
1296 //
1297 // Note that traps are placed first since this typically happens at the
1298 // end of the function and for disassemblers we try to keep all the code
1299 // contiguously together.
1300 for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1301 // If this trap has source information associated with it then
1302 // emit this information for the trap instruction going out now too.
1303 if let Some(loc) = loc {
1304 self.start_srcloc(loc);
1305 }
1306 self.align_to(I::LabelUse::ALIGN);
1307 self.bind_label(label, ctrl_plane);
1308 self.add_trap(code);
1309 self.put_data(I::TRAP_OPCODE);
1310 if loc.is_some() {
1311 self.end_srcloc();
1312 }
1313 }
1314
1315 for constant in mem::take(&mut self.pending_constants) {
1316 let MachBufferConstant { align, size, .. } = self.constants[constant];
1317 let label = self.constants[constant].upcoming_label.take().unwrap();
1318 self.align_to(align);
1319 self.bind_label(label, ctrl_plane);
1320 self.used_constants.push((constant, self.cur_offset()));
1321 self.get_appended_space(size);
1322 }
1323
1324 // Either handle all pending fixups because they're ready or move them
1325 // onto the `BinaryHeap` tracking all pending fixups if they aren't
1326 // ready.
1327 assert!(self.latest_branches.is_empty());
1328 for fixup in mem::take(&mut self.pending_fixup_records) {
1329 if self.should_apply_fixup(&fixup, forced_threshold) {
1330 self.handle_fixup(fixup, force_veneers, forced_threshold);
1331 } else {
1332 self.fixup_records.push(fixup);
1333 }
1334 }
1335 self.pending_fixup_deadline = u32::MAX;
1336 while let Some(fixup) = self.fixup_records.peek() {
1337 trace!("emit_island: fixup {:?}", fixup);
1338
1339 // If this fixup shouldn't be applied, that means its label isn't
1340 // defined yet and there'll be remaining space to apply a veneer if
1341 // necessary in the future after this island. In that situation,
1342 // because `fixup_records` is sorted by deadline, this loop can
1343 // exit early.
1344 if !self.should_apply_fixup(fixup, forced_threshold) {
1345 break;
1346 }
1347
1348 let fixup = self.fixup_records.pop().unwrap();
1349 self.handle_fixup(fixup, force_veneers, forced_threshold);
1350 }
1351
1352 if let Some(loc) = cur_loc {
1353 self.start_srcloc(loc);
1354 }
1355 }
1356
1357 fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
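// Apply the fixup now if its label is already bound (it can be patched or
// veneered immediately), or if its deadline falls before the worst-case end
// of this island, meaning this is the last safe chance to emit a veneer for
// it.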
1358 let label_offset = self.resolve_label_offset(fixup.label);
1359 label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1360 }
1361
1362 fn handle_fixup(
1363 &mut self,
1364 fixup: MachLabelFixup<I>,
1365 force_veneers: ForceVeneers,
1366 forced_threshold: CodeOffset,
1367 ) {
1368 let MachLabelFixup {
1369 label,
1370 offset,
1371 kind,
1372 } = fixup;
1373 let start = offset as usize;
1374 let end = (offset + kind.patch_size()) as usize;
1375 let label_offset = self.resolve_label_offset(label);
1376
1377 if label_offset != UNKNOWN_LABEL_OFFSET {
1378 // If the offset of the label for this fixup is known then
1379 // we're going to do something here-and-now. We're either going
1380 // to patch the original offset because it's an in-bounds jump,
1381 // or we're going to generate a veneer, patch the fixup to jump
1382 // to the veneer, and then keep going.
1383 //
1384 // If the label comes after the original fixup, then we should
1385 // be guaranteed that the jump is in-bounds. Otherwise there's
1386 // a bug somewhere because this method wasn't called soon
1387 // enough. All forward-jumps are tracked and should get veneers
1388 // before their deadline comes and they're unable to jump
1389 // further.
1390 //
1391 // Otherwise if the label is before the fixup, then that's a
1392 // backwards jump. If it's past the maximum negative range
1393 // then we'll emit a veneer to jump forward to, which can
1394 // then jump backwards.
1395 let veneer_required = if label_offset >= offset {
1396 assert!((label_offset - offset) <= kind.max_pos_range());
1397 false
1398 } else {
1399 (offset - label_offset) > kind.max_neg_range()
1400 };
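// Illustrative numbers only: if `kind.max_neg_range()` were 1 MiB, a use at
// offset 0x30_0000 whose label was bound at 0x10_0000 is 2 MiB backwards and
// would therefore require a veneer.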
1401 trace!(
1402 " -> label_offset = {}, known, required = {} (pos {} neg {})",
1403 label_offset,
1404 veneer_required,
1405 kind.max_pos_range(),
1406 kind.max_neg_range()
1407 );
1408
1409 if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1410 self.emit_veneer(label, offset, kind);
1411 } else {
1412 let slice = &mut self.data[start..end];
1413 trace!(
1414 "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
1415 );
1416 kind.patch(slice, offset, label_offset);
1417 }
1418 } else {
1419 // If the offset of this label is not known at this time then
1420 // that means that a veneer is required because, after this
1421 // island, the label can no longer be assumed to be in range of the original use.
1422 assert!(forced_threshold - offset > kind.max_pos_range());
1423 self.emit_veneer(label, offset, kind);
1424 }
1425 }
1426
1427 /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1428 ///
1429 /// This will generate extra machine code, using `kind`, to get a
1430 /// longer-range jump than `kind` itself allows. The code at `offset` is then
1431 /// patched to jump to our new code, and then the new code is enqueued for
1432 /// a fixup to get processed at some later time.
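///
/// Schematic layout after this call (a sketch only, not exact encodings):
///
/// ```text
/// offset:        original use, now patched to point at `veneer_offset`
/// ...
/// veneer_offset: veneer emitted in the island, carrying a longer-range use
///                of `label` that is registered as a new fixup
/// ```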
1433 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1434 // If this `kind` doesn't support a veneer then that's a bug in the
1435 // backend because we need to implement support for such a veneer.
1436 assert!(
1437 kind.supports_veneer(),
1438 "jump beyond the range of {kind:?} but a veneer isn't supported",
1439 );
1440
1441 // Allocate space for a veneer in the island.
1442 self.align_to(I::LabelUse::ALIGN);
1443 let veneer_offset = self.cur_offset();
1444 trace!("making a veneer at {}", veneer_offset);
1445 let start = offset as usize;
1446 let end = (offset + kind.patch_size()) as usize;
1447 let slice = &mut self.data[start..end];
1448 // Patch the original label use to refer to the veneer.
1449 trace!(
1450 "patching original at offset {} to veneer offset {}",
1451 offset, veneer_offset
1452 );
1453 kind.patch(slice, offset, veneer_offset);
1454 // Generate the veneer.
1455 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1456 let (veneer_fixup_off, veneer_label_use) =
1457 kind.generate_veneer(veneer_slice, veneer_offset);
1458 trace!(
1459 "generated veneer; fixup offset {}, label_use {:?}",
1460 veneer_fixup_off, veneer_label_use
1461 );
1462 // Register a new use of `label` with our new veneer fixup and
1463 // offset. This'll recalculate deadlines accordingly and
1464 // enqueue this fixup to get processed at some later
1465 // time.
1466 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1467 }
1468
1469 fn finish_emission_maybe_forcing_veneers(
1470 &mut self,
1471 force_veneers: ForceVeneers,
1472 ctrl_plane: &mut ControlPlane,
1473 ) {
1474 while !self.pending_constants.is_empty()
1475 || !self.pending_traps.is_empty()
1476 || !self.fixup_records.is_empty()
1477 || !self.pending_fixup_records.is_empty()
1478 {
1479 // `emit_island()` will emit any pending veneers and constants, and
1480 // as a side-effect, will also take care of any fixups with resolved
1481 // labels eagerly.
1482 self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1483 }
1484
1485 // Ensure that all labels have been fixed up after the last island is emitted. This is a
1486 // full (release-mode) assert because an unresolved label means the emitted code is
1487 // incorrect.
1488 assert!(self.fixup_records.is_empty());
1489 assert!(self.pending_fixup_records.is_empty());
1490 }
1491
1492 /// Finish any deferred emissions and/or fixups.
1493 pub fn finish(
1494 mut self,
1495 constants: &VCodeConstants,
1496 ctrl_plane: &mut ControlPlane,
1497 ) -> MachBufferFinalized<Stencil> {
1498 let _tt = timing::vcode_emit_finish();
1499
1500 self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1501
1502 let alignment = self.finish_constants(constants);
1503
1504 // Resolve all labels to their offsets.
1505 let finalized_relocs = self
1506 .relocs
1507 .iter()
1508 .map(|reloc| FinalizedMachReloc {
1509 offset: reloc.offset,
1510 kind: reloc.kind,
1511 addend: reloc.addend,
1512 target: match &reloc.target {
1513 RelocTarget::ExternalName(name) => {
1514 FinalizedRelocTarget::ExternalName(name.clone())
1515 }
1516 RelocTarget::Label(label) => {
1517 FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1518 }
1519 },
1520 })
1521 .collect();
1522
1523 let finalized_exception_handlers = self
1524 .exception_handlers
1525 .iter()
1526 .map(|(tag, label)| (*tag, self.resolve_label_offset(*label)))
1527 .collect();
1528
1529 let mut srclocs = self.srclocs;
1530 srclocs.sort_by_key(|entry| entry.start);
1531
1532 MachBufferFinalized {
1533 data: self.data,
1534 relocs: finalized_relocs,
1535 traps: self.traps,
1536 call_sites: self.call_sites,
1537 exception_handlers: finalized_exception_handlers,
1538 srclocs,
1539 user_stack_maps: self.user_stack_maps,
1540 unwind_info: self.unwind_info,
1541 alignment,
1542 }
1543 }
1544
1545 /// Add an external relocation at the given offset.
1546 pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1547 &mut self,
1548 offset: CodeOffset,
1549 kind: Reloc,
1550 target: &T,
1551 addend: Addend,
1552 ) {
1553 let target: RelocTarget = target.clone().into();
1554 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1555 // generate a label-use statement to track whether an island is possibly
1556 // needed to escape this function to actually get to the external name.
1557 // This is most likely to come up on AArch64 where calls between
1558 // functions use a 26-bit signed offset which gives +/- 64MB. This means
1559 // that if a function is 128MB in size and there's a call in the middle
1560 // it's impossible to reach the actual target. Also, while it's
1561 // technically possible to jump to the start of a function and then jump
1562 // further, island insertion below always inserts islands after
1563 // previously appended code so for Cranelift's own implementation this
1564 // is also a problem for 64MB functions on AArch64 which start with a
1565 // call instruction; those won't be able to escape.
1566 //
1567 // Ideally what needs to happen here is that a `LabelUse` is
1568 // transparently generated (or call-sites of this function are audited
1569 // to generate a `LabelUse` instead) and tracked internally. The actual
1570 // relocation would then change over time if and when a veneer is
1571 // inserted, where the relocation here would be patched by this
1572 // `MachBuffer` to jump to the veneer. The problem, though, is that all
1573 // this still needs to end up, in the case of a singular function,
1574 // generating a final relocation pointing either to this particular
1575 // relocation or to the veneer inserted. Additionally
1576 // `MachBuffer` needs the concept of a label which will never be
1577 // resolved, so `emit_island` doesn't trip over not actually ever
1578 // knowing what some labels are. Currently the loop in
1579 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1580 // loop.
1581 //
1582 // For now this means that, because relocs aren't tracked at all,
1583 // AArch64 functions have a rough size limit of 64MB. For now that's
1584 // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1585 // when a relocation can't otherwise be resolved later, so it shouldn't
1586 // actually result in any memory unsafety or anything like that.
1587 self.relocs.push(MachReloc {
1588 offset,
1589 kind,
1590 target,
1591 addend,
1592 });
1593 }
1594
1595 /// Add an external relocation at the current offset.
1596 pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1597 &mut self,
1598 kind: Reloc,
1599 target: &T,
1600 addend: Addend,
1601 ) {
1602 self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1603 }
1604
1605 /// Add a trap record at the current offset.
1606 pub fn add_trap(&mut self, code: TrapCode) {
1607 self.traps.push(MachTrap {
1608 offset: self.data.len() as CodeOffset,
1609 code,
1610 });
1611 }
1612
1613 /// Add a call-site record at the current offset, optionally with exception handlers.
1614 pub fn add_call_site(
1615 &mut self,
1616 exception_handlers: &[(PackedOption<ExceptionTag>, MachLabel)],
1617 ) {
1618 let start = u32::try_from(self.exception_handlers.len()).unwrap();
1619 self.exception_handlers
1620 .extend(exception_handlers.into_iter().copied());
1621 let end = u32::try_from(self.exception_handlers.len()).unwrap();
1622 self.call_sites.push(MachCallSite {
1623 ret_addr: self.data.len() as CodeOffset,
1624 exception_handler_range: start..end,
1625 });
1626 }
1627
1628 /// Add an unwind record at the current offset.
1629 pub fn add_unwind(&mut self, unwind: UnwindInst) {
1630 self.unwind_info.push((self.cur_offset(), unwind));
1631 }
1632
1633 /// Set the `SourceLoc` for code from this offset until the offset at the
1634 /// next call to `end_srcloc()`.
1635 /// Returns the current [CodeOffset] and [RelSourceLoc].
1636 pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1637 let cur = (self.cur_offset(), loc);
1638 self.cur_srcloc = Some(cur);
1639 cur
1640 }
1641
1642 /// Mark the end of the `SourceLoc` segment started at the last
1643 /// `start_srcloc()` call.
1644 pub fn end_srcloc(&mut self) {
1645 let (start, loc) = self
1646 .cur_srcloc
1647 .take()
1648 .expect("end_srcloc() called without start_srcloc()");
1649 let end = self.cur_offset();
1650 // Skip zero-length extends.
1651 debug_assert!(end >= start);
1652 if end > start {
1653 self.srclocs.push(MachSrcLoc { start, end, loc });
1654 }
1655 }
1656
1657 /// Push a user stack map onto this buffer.
1658 ///
1659 /// The stack map is associated with the given `return_addr` code
1660 /// offset. This must be the PC for the instruction just *after* this stack
1661 /// map's associated instruction. For example in the sequence `call $foo;
1662 /// add r8, rax`, the `return_addr` must be the offset of the start of the
1663 /// `add` instruction.
1664 ///
1665 /// Stack maps must be pushed in sorted `return_addr` order.
1666 pub fn push_user_stack_map(
1667 &mut self,
1668 emit_state: &I::State,
1669 return_addr: CodeOffset,
1670 mut stack_map: ir::UserStackMap,
1671 ) {
1672 let span = emit_state.frame_layout().active_size();
1673 trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1674
1675 debug_assert!(
1676 self.user_stack_maps
1677 .last()
1678 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1679 "pushed stack maps out of order: {} is not less than {}",
1680 self.user_stack_maps.last().unwrap().0,
1681 return_addr,
1682 );
1683
1684 stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1685 self.user_stack_maps.push((return_addr, span, stack_map));
1686 }
1687
1688 /// Increase the minimum alignment of the buffer to `2^align_to` bytes, if that
1689 /// is bigger than the current alignment.
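///
/// For example, `set_log2_min_function_alignment(4)` requests at least
/// 16-byte alignment.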
1690 pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1691 self.min_alignment = self.min_alignment.max(
1692 1u32.checked_shl(u32::from(align_to))
1693 .expect("log2_min_function_alignment too large"),
1694 );
1695 }
1696}
1697
1698impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1699 fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1700 for b in iter {
1701 self.put1(b);
1702 }
1703 }
1704}
1705
1706impl<T: CompilePhase> MachBufferFinalized<T> {
1707 /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1708 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1709 &self.srclocs[..]
1710 }
1711
1712 /// Get the total required size for the code.
1713 pub fn total_size(&self) -> CodeOffset {
1714 self.data.len() as CodeOffset
1715 }
1716
1717 /// Return the code in this mach buffer as a hex string for testing purposes.
1718 pub fn stringify_code_bytes(&self) -> String {
1719 // This is pretty lame, but whatever ..
1720 use std::fmt::Write;
1721 let mut s = String::with_capacity(self.data.len() * 2);
1722 for b in &self.data {
1723 write!(&mut s, "{b:02X}").unwrap();
1724 }
1725 s
1726 }
1727
1728 /// Get the code bytes.
1729 pub fn data(&self) -> &[u8] {
1730 // N.B.: we emit every section into the .text section as far as
1731 // the `CodeSink` is concerned; we do not bother to segregate
1732 // the contents into the actual program text, the jumptable and the
1733 // rodata (constant pool). This allows us to generate code assuming
1734 // that these will not be relocated relative to each other, and avoids
1735 // having to designate each section as belonging in one of the three
1736 // fixed categories defined by `CodeSink`. If this becomes a problem
1737 // later (e.g. because of memory permissions or similar), we can
1738 // add this designation and segregate the output; take care, however,
1739 // to add the appropriate relocations in this case.
1740
1741 &self.data[..]
1742 }
1743
1744 /// Get the list of external relocations for this code.
1745 pub fn relocs(&self) -> &[FinalizedMachReloc] {
1746 &self.relocs[..]
1747 }
1748
1749 /// Get the list of trap records for this code.
1750 pub fn traps(&self) -> &[MachTrap] {
1751 &self.traps[..]
1752 }
1753
1754 /// Get the user stack map metadata for this code.
1755 pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1756 &self.user_stack_maps
1757 }
1758
1759 /// Take this buffer's user stack map metadata.
1760 pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1761 mem::take(&mut self.user_stack_maps)
1762 }
1763
1764 /// Get the list of call sites for this code, along with
1765 /// associated exception handlers.
1766 ///
1767 /// Each item yielded by the returned iterator is a struct with:
1768 ///
1769 /// - The call site metadata record, with a `ret_addr` field
1770 /// directly accessible and denoting the offset of the return
1771 /// address into this buffer's code.
1772 /// - The slice of pairs of exception tags and code offsets
1773 /// denoting exception-handler entry points associated with this
1774 /// call site.
1775 pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1776 self.call_sites.iter().map(|call_site| {
1777 let range = call_site.exception_handler_range.clone();
1778 let range = usize::try_from(range.start).unwrap()..usize::try_from(range.end).unwrap();
1779 FinalizedMachCallSite {
1780 ret_addr: call_site.ret_addr,
1781 exception_handlers: &self.exception_handlers[range],
1782 }
1783 })
1784 }
1785}
1786
1787/// Metadata about a constant.
1788struct MachBufferConstant {
1789 /// A label which has not yet been bound which can be used for this
1790 /// constant.
1791 ///
1792 /// This is lazily created when a label is requested for a constant and is
1793 /// cleared when a constant is emitted.
1794 upcoming_label: Option<MachLabel>,
1795 /// Required alignment.
1796 align: CodeOffset,
1797 /// The byte size of this constant.
1798 size: usize,
1799}
1800
1801/// A trap that is deferred to the next time an island is emitted for either
1802/// traps, constants, or fixups.
1803struct MachLabelTrap {
1804 /// This label will refer to the trap's offset.
1805 label: MachLabel,
1806 /// The code associated with this trap.
1807 code: TrapCode,
1808 /// An optional source location to assign for this trap.
1809 loc: Option<RelSourceLoc>,
1810}
1811
1812/// A fixup to perform on the buffer once code is emitted. Fixups always refer
1813/// to labels and patch the code based on label offsets. Hence, they are like
1814/// relocations, but internal to one buffer.
1815#[derive(Debug)]
1816struct MachLabelFixup<I: VCodeInst> {
1817 /// The label whose offset controls this fixup.
1818 label: MachLabel,
1819 /// The offset to fix up / patch to refer to this label.
1820 offset: CodeOffset,
1821 /// The kind of fixup. This is architecture-specific; each architecture may have,
1822 /// e.g., several types of branch instructions, each with differently-sized
1823 /// offset fields and different places within the instruction to place the
1824 /// bits.
1825 kind: I::LabelUse,
1826}
1827
1828impl<I: VCodeInst> MachLabelFixup<I> {
1829 fn deadline(&self) -> CodeOffset {
1830 self.offset.saturating_add(self.kind.max_pos_range())
1831 }
1832}
1833
1834impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
1835 fn eq(&self, other: &Self) -> bool {
1836 self.deadline() == other.deadline()
1837 }
1838}
1839
1840impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
1841
1842impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
1843 fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
1844 Some(self.cmp(other))
1845 }
1846}
1847
1848impl<I: VCodeInst> Ord for MachLabelFixup<I> {
1849 fn cmp(&self, other: &Self) -> Ordering {
1850 other.deadline().cmp(&self.deadline())
1851 }
1852}
1853
1854/// A relocation resulting from a compilation.
1855#[derive(Clone, Debug, PartialEq)]
1856#[cfg_attr(
1857 feature = "enable-serde",
1858 derive(serde_derive::Serialize, serde_derive::Deserialize)
1859)]
1860pub struct MachRelocBase<T> {
1861 /// The offset at which the relocation applies, *relative to the
1862 /// containing section*.
1863 pub offset: CodeOffset,
1864 /// The kind of relocation.
1865 pub kind: Reloc,
1866 /// The external symbol / name to which this relocation refers.
1867 pub target: T,
1868 /// The addend to add to the symbol value.
1869 pub addend: i64,
1870}
1871
1872type MachReloc = MachRelocBase<RelocTarget>;
1873
1874/// A relocation resulting from a compilation.
1875pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
1876
1877/// A relocation target.
1878#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1879pub enum RelocTarget {
1880 /// Points to an [ExternalName] outside the current function.
1881 ExternalName(ExternalName),
1882 /// Points to a [MachLabel] inside this function.
1883 /// This is different from [MachLabelFixup] in that both the relocation and the
1884 /// label will be emitted and are only resolved at link time.
1885 ///
1886 /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
1887 Label(MachLabel),
1888}
1889
1890impl From<ExternalName> for RelocTarget {
1891 fn from(name: ExternalName) -> Self {
1892 Self::ExternalName(name)
1893 }
1894}
1895
1896impl From<MachLabel> for RelocTarget {
1897 fn from(label: MachLabel) -> Self {
1898 Self::Label(label)
1899 }
1900}
1901
1902/// A relocation target.
1903#[derive(Debug, Clone, PartialEq, Eq, Hash)]
1904#[cfg_attr(
1905 feature = "enable-serde",
1906 derive(serde_derive::Serialize, serde_derive::Deserialize)
1907)]
1908pub enum FinalizedRelocTarget {
1909 /// Points to an [ExternalName] outside the current function.
1910 ExternalName(ExternalName),
1911 /// Points to a [CodeOffset] from the start of the current function.
1912 Func(CodeOffset),
1913}
1914
1915impl FinalizedRelocTarget {
1916 /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
1917 /// output.
1918 pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
1919 match self {
1920 FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
1921 FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
1922 }
1923 }
1924}
1925
1926/// A trap record resulting from a compilation.
1927#[derive(Clone, Debug, PartialEq)]
1928#[cfg_attr(
1929 feature = "enable-serde",
1930 derive(serde_derive::Serialize, serde_derive::Deserialize)
1931)]
1932pub struct MachTrap {
1933 /// The offset at which the trap instruction occurs, *relative to the
1934 /// containing section*.
1935 pub offset: CodeOffset,
1936 /// The trap code.
1937 pub code: TrapCode,
1938}
1939
1940/// A call site record resulting from a compilation.
1941#[derive(Clone, Debug, PartialEq)]
1942#[cfg_attr(
1943 feature = "enable-serde",
1944 derive(serde_derive::Serialize, serde_derive::Deserialize)
1945)]
1946pub struct MachCallSite {
1947 /// The offset of the call's return address, *relative to the
1948 /// start of the buffer*.
1949 pub ret_addr: CodeOffset,
1950
1951 /// Range in `exception_handlers` corresponding to the exception
1952 /// handlers for this callsite.
1953 exception_handler_range: Range<u32>,
1954}
1955
1956/// A call site record resulting from a compilation.
1957#[derive(Clone, Debug, PartialEq)]
1958pub struct FinalizedMachCallSite<'a> {
1959 /// The offset of the call's return address, *relative to the
1960 /// start of the buffer*.
1961 pub ret_addr: CodeOffset,
1962
1963 /// Exception handlers at this callsite, with target offsets
1964 /// *relative to the start of the buffer*.
1965 pub exception_handlers: &'a [(PackedOption<ir::ExceptionTag>, CodeOffset)],
1966}
1967
1968/// A source-location mapping resulting from a compilation.
1969#[derive(PartialEq, Debug, Clone)]
1970#[cfg_attr(
1971 feature = "enable-serde",
1972 derive(serde_derive::Serialize, serde_derive::Deserialize)
1973)]
1974pub struct MachSrcLoc<T: CompilePhase> {
1975 /// The start of the region of code corresponding to a source location.
1976 /// This is relative to the start of the function, not to the start of the
1977 /// section.
1978 pub start: CodeOffset,
1979 /// The end of the region of code corresponding to a source location.
1980 /// This is relative to the start of the function, not to the start of the
1981 /// section.
1982 pub end: CodeOffset,
1983 /// The source location.
1984 pub loc: T::SourceLocType,
1985}
1986
1987impl MachSrcLoc<Stencil> {
1988 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
1989 MachSrcLoc {
1990 start: self.start,
1991 end: self.end,
1992 loc: self.loc.expand(base_srcloc),
1993 }
1994 }
1995}
1996
1997/// Record of branch instruction in the buffer, to facilitate editing.
1998#[derive(Clone, Debug)]
1999struct MachBranch {
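/// Offset in the buffer at which this branch's encoding starts.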
2000 start: CodeOffset,
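/// Offset just past the end of this branch's encoding.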
2001 end: CodeOffset,
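/// The label this branch targets.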
2002 target: MachLabel,
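/// Index of this branch's label-use in `pending_fixup_records`.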
2003 fixup: usize,
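/// For a conditional branch, the machine-code bytes of the same branch with
/// its condition inverted; `None` for an unconditional branch.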
2004 inverted: Option<SmallVec<[u8; 8]>>,
2005 /// All labels pointing to the start of this branch. For correctness, this
2006 /// *must* be complete (i.e., must contain all labels whose resolved offsets
2007 /// are at the start of this branch): we rely on being able to redirect all
2008 /// labels that could jump to this branch before removing it, if it is
2009 /// otherwise unreachable.
2010 labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2011}
2012
2013impl MachBranch {
2014 fn is_cond(&self) -> bool {
2015 self.inverted.is_some()
2016 }
2017 fn is_uncond(&self) -> bool {
2018 self.inverted.is_none()
2019 }
2020}
2021
2022/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2023///
2024/// Note that `MachBuffer` was primarily written for intra-function references
2025/// of jumps between basic blocks, but it's also quite usable for entire text
2026/// sections and resolving references between functions themselves. This
2027/// builder interprets "blocks" as labeled functions for the purposes of
2028/// resolving labels internally in the buffer.
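///
/// A minimal usage sketch (identifiers such as `func_a`, `reloc_offset`,
/// `reloc_kind`, `addend`, and `ctrl_plane` are hypothetical; real callers
/// live in the object-emission layer):
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<Inst>::new(2);
/// let off_a = builder.append(true, &func_a, 16, ctrl_plane);
/// let off_b = builder.append(true, &func_b, 16, ctrl_plane);
/// // Resolve a relocation in the already-appended text against function 1.
/// builder.resolve_reloc(reloc_offset, reloc_kind, addend, 1);
/// let text: Vec<u8> = builder.finish(ctrl_plane);
/// ```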
2029pub struct MachTextSectionBuilder<I: VCodeInst> {
2030 buf: MachBuffer<I>,
2031 next_func: usize,
2032 force_veneers: ForceVeneers,
2033}
2034
2035impl<I: VCodeInst> MachTextSectionBuilder<I> {
2036 /// Creates a new text section builder which will have `num_funcs` functions
2037 /// pushed into it.
2038 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2039 let mut buf = MachBuffer::new();
2040 buf.reserve_labels_for_blocks(num_funcs);
2041 MachTextSectionBuilder {
2042 buf,
2043 next_func: 0,
2044 force_veneers: ForceVeneers::No,
2045 }
2046 }
2047}
2048
2049impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2050 fn append(
2051 &mut self,
2052 labeled: bool,
2053 func: &[u8],
2054 align: u32,
2055 ctrl_plane: &mut ControlPlane,
2056 ) -> u64 {
2057 // Conditionally emit an island if it's necessary to resolve jumps
2058 // between functions which are too far away.
2059 let size = func.len() as u32;
2060 if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2061 self.buf
2062 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2063 }
2064
2065 self.buf.align_to(align);
2066 let pos = self.buf.cur_offset();
2067 if labeled {
2068 self.buf.bind_label(
2069 MachLabel::from_block(BlockIndex::new(self.next_func)),
2070 ctrl_plane,
2071 );
2072 self.next_func += 1;
2073 }
2074 self.buf.put_data(func);
2075 u64::from(pos)
2076 }
2077
2078 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2079 crate::trace!(
2080 "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2081 );
2082 let label = MachLabel::from_block(BlockIndex::new(target));
2083 let offset = u32::try_from(offset).unwrap();
2084 match I::LabelUse::from_reloc(reloc, addend) {
2085 Some(label_use) => {
2086 self.buf.use_label_at_offset(offset, label, label_use);
2087 true
2088 }
2089 None => false,
2090 }
2091 }
2092
2093 fn force_veneers(&mut self) {
2094 self.force_veneers = ForceVeneers::Yes;
2095 }
2096
2097 fn write(&mut self, offset: u64, data: &[u8]) {
2098 self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2099 }
2100
2101 fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2102 // Double-check all functions were pushed.
2103 assert_eq!(self.next_func, self.buf.label_offsets.len());
2104
2105 // Finish up any veneers, if necessary.
2106 self.buf
2107 .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2108
2109 // We don't need the data any more, so return it to the caller.
2110 mem::take(&mut self.buf.data).into_vec()
2111 }
2112}
2113
2114// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2115#[cfg(all(test, feature = "arm64"))]
2116mod test {
2117 use cranelift_entity::EntityRef as _;
2118
2119 use super::*;
2120 use crate::ir::UserExternalNameRef;
2121 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2122 use crate::isa::aarch64::inst::{OperandSize, xreg};
2123 use crate::machinst::{MachInstEmit, MachInstEmitState};
2124 use crate::settings;
2125
2126 fn label(n: u32) -> MachLabel {
2127 MachLabel::from_block(BlockIndex::new(n as usize))
2128 }
2129 fn target(n: u32) -> BranchTarget {
2130 BranchTarget::Label(label(n))
2131 }
2132
2133 #[test]
2134 fn test_elide_jump_to_next() {
2135 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2136 let mut buf = MachBuffer::new();
2137 let mut state = <Inst as MachInstEmit>::State::default();
2138 let constants = Default::default();
2139
2140 buf.reserve_labels_for_blocks(2);
2141 buf.bind_label(label(0), state.ctrl_plane_mut());
2142 let inst = Inst::Jump { dest: target(1) };
2143 inst.emit(&mut buf, &info, &mut state);
2144 buf.bind_label(label(1), state.ctrl_plane_mut());
2145 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2146 assert_eq!(0, buf.total_size());
2147 }
2148
2149 #[test]
2150 fn test_elide_trivial_jump_blocks() {
2151 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2152 let mut buf = MachBuffer::new();
2153 let mut state = <Inst as MachInstEmit>::State::default();
2154 let constants = Default::default();
2155
2156 buf.reserve_labels_for_blocks(4);
2157
2158 buf.bind_label(label(0), state.ctrl_plane_mut());
2159 let inst = Inst::CondBr {
2160 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2161 taken: target(1),
2162 not_taken: target(2),
2163 };
2164 inst.emit(&mut buf, &info, &mut state);
2165
2166 buf.bind_label(label(1), state.ctrl_plane_mut());
2167 let inst = Inst::Jump { dest: target(3) };
2168 inst.emit(&mut buf, &info, &mut state);
2169
2170 buf.bind_label(label(2), state.ctrl_plane_mut());
2171 let inst = Inst::Jump { dest: target(3) };
2172 inst.emit(&mut buf, &info, &mut state);
2173
2174 buf.bind_label(label(3), state.ctrl_plane_mut());
2175
2176 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2177 assert_eq!(0, buf.total_size());
2178 }
2179
2180 #[test]
2181 fn test_flip_cond() {
2182 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2183 let mut buf = MachBuffer::new();
2184 let mut state = <Inst as MachInstEmit>::State::default();
2185 let constants = Default::default();
2186
2187 buf.reserve_labels_for_blocks(4);
2188
2189 buf.bind_label(label(0), state.ctrl_plane_mut());
2190 let inst = Inst::CondBr {
2191 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2192 taken: target(1),
2193 not_taken: target(2),
2194 };
2195 inst.emit(&mut buf, &info, &mut state);
2196
2197 buf.bind_label(label(1), state.ctrl_plane_mut());
2198 let inst = Inst::Nop4;
2199 inst.emit(&mut buf, &info, &mut state);
2200
2201 buf.bind_label(label(2), state.ctrl_plane_mut());
2202 let inst = Inst::Udf {
2203 trap_code: TrapCode::STACK_OVERFLOW,
2204 };
2205 inst.emit(&mut buf, &info, &mut state);
2206
2207 buf.bind_label(label(3), state.ctrl_plane_mut());
2208
2209 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2210
2211 let mut buf2 = MachBuffer::new();
2212 let mut state = Default::default();
2213 let inst = Inst::TrapIf {
2214 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2215 trap_code: TrapCode::STACK_OVERFLOW,
2216 };
2217 inst.emit(&mut buf2, &info, &mut state);
2218 let inst = Inst::Nop4;
2219 inst.emit(&mut buf2, &info, &mut state);
2220
2221 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2222
2223 assert_eq!(buf.data, buf2.data);
2224 }
2225
2226 #[test]
2227 fn test_island() {
2228 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2229 let mut buf = MachBuffer::new();
2230 let mut state = <Inst as MachInstEmit>::State::default();
2231 let constants = Default::default();
2232
2233 buf.reserve_labels_for_blocks(4);
2234
2235 buf.bind_label(label(0), state.ctrl_plane_mut());
2236 let inst = Inst::CondBr {
2237 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2238 taken: target(2),
2239 not_taken: target(3),
2240 };
2241 inst.emit(&mut buf, &info, &mut state);
2242
2243 buf.bind_label(label(1), state.ctrl_plane_mut());
2244 while buf.cur_offset() < 2000000 {
2245 if buf.island_needed(0) {
2246 buf.emit_island(0, state.ctrl_plane_mut());
2247 }
2248 let inst = Inst::Nop4;
2249 inst.emit(&mut buf, &info, &mut state);
2250 }
2251
2252 buf.bind_label(label(2), state.ctrl_plane_mut());
2253 let inst = Inst::Nop4;
2254 inst.emit(&mut buf, &info, &mut state);
2255
2256 buf.bind_label(label(3), state.ctrl_plane_mut());
2257 let inst = Inst::Nop4;
2258 inst.emit(&mut buf, &info, &mut state);
2259
2260 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2261
2262 assert_eq!(2000000 + 8, buf.total_size());
2263
2264 let mut buf2 = MachBuffer::new();
2265 let mut state = Default::default();
2266 let inst = Inst::CondBr {
2267 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2268
2269 // This conditionally taken branch has a 19-bit constant, shifted
2270 // to the left by two, giving us a 21-bit range in total. Half of
2271 // this range is positive, so we should be around 1 << 20 bytes
2272 // away for our jump target.
2273 //
2274 // There are two pending fixups by the time we reach this point,
2275 // one for this 19-bit jump and one for the unconditional 26-bit
2276 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2277 // veneer is 20 bytes large, which means that we pessimistically
2278 // assume we'll need two veneers. Currently each veneer is
2279 // pessimistically assumed to be the maximal size, which means we
2280 // need 40 bytes of extra space, meaning that the actual island
2281 // should come 40 bytes before the deadline.
2282 taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2283
2284 // This branch is in-range so no veneers should be needed, it should
2285 // go directly to the target.
2286 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2287 };
2288 inst.emit(&mut buf2, &info, &mut state);
2289
2290 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2291
2292 assert_eq!(&buf.data[0..8], &buf2.data[..]);
2293 }
2294
2295 #[test]
2296 fn test_island_backward() {
2297 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2298 let mut buf = MachBuffer::new();
2299 let mut state = <Inst as MachInstEmit>::State::default();
2300 let constants = Default::default();
2301
2302 buf.reserve_labels_for_blocks(4);
2303
2304 buf.bind_label(label(0), state.ctrl_plane_mut());
2305 let inst = Inst::Nop4;
2306 inst.emit(&mut buf, &info, &mut state);
2307
2308 buf.bind_label(label(1), state.ctrl_plane_mut());
2309 let inst = Inst::Nop4;
2310 inst.emit(&mut buf, &info, &mut state);
2311
2312 buf.bind_label(label(2), state.ctrl_plane_mut());
2313 while buf.cur_offset() < 2000000 {
2314 let inst = Inst::Nop4;
2315 inst.emit(&mut buf, &info, &mut state);
2316 }
2317
2318 buf.bind_label(label(3), state.ctrl_plane_mut());
2319 let inst = Inst::CondBr {
2320 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2321 taken: target(0),
2322 not_taken: target(1),
2323 };
2324 inst.emit(&mut buf, &info, &mut state);
2325
2326 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2327
2328 assert_eq!(2000000 + 12, buf.total_size());
2329
2330 let mut buf2 = MachBuffer::new();
2331 let mut state = Default::default();
2332 let inst = Inst::CondBr {
2333 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2334 taken: BranchTarget::ResolvedOffset(8),
2335 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2336 };
2337 inst.emit(&mut buf2, &info, &mut state);
2338 let inst = Inst::Jump {
2339 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2340 };
2341 inst.emit(&mut buf2, &info, &mut state);
2342
2343 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2344
2345 assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2346 }
2347
2348 #[test]
2349 fn test_multiple_redirect() {
2350 // label0:
2351 // cbz x0, label1
2352 // b label2
2353 // label1:
2354 // b label3
2355 // label2:
2356 // nop
2357 // nop
2358 // b label0
2359 // label3:
2360 // b label4
2361 // label4:
2362 // b label5
2363 // label5:
2364 // b label7
2365 // label6:
2366 // nop
2367 // label7:
2368 // ret
2369 //
2370 // -- should become:
2371 //
2372 // label0:
2373 // cbz x0, label7
2374 // label2:
2375 // nop
2376 // nop
2377 // b label0
2378 // label6:
2379 // nop
2380 // label7:
2381 // ret
2382
2383 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2384 let mut buf = MachBuffer::new();
2385 let mut state = <Inst as MachInstEmit>::State::default();
2386 let constants = Default::default();
2387
2388 buf.reserve_labels_for_blocks(8);
2389
2390 buf.bind_label(label(0), state.ctrl_plane_mut());
2391 let inst = Inst::CondBr {
2392 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2393 taken: target(1),
2394 not_taken: target(2),
2395 };
2396 inst.emit(&mut buf, &info, &mut state);
2397
2398 buf.bind_label(label(1), state.ctrl_plane_mut());
2399 let inst = Inst::Jump { dest: target(3) };
2400 inst.emit(&mut buf, &info, &mut state);
2401
2402 buf.bind_label(label(2), state.ctrl_plane_mut());
2403 let inst = Inst::Nop4;
2404 inst.emit(&mut buf, &info, &mut state);
2405 inst.emit(&mut buf, &info, &mut state);
2406 let inst = Inst::Jump { dest: target(0) };
2407 inst.emit(&mut buf, &info, &mut state);
2408
2409 buf.bind_label(label(3), state.ctrl_plane_mut());
2410 let inst = Inst::Jump { dest: target(4) };
2411 inst.emit(&mut buf, &info, &mut state);
2412
2413 buf.bind_label(label(4), state.ctrl_plane_mut());
2414 let inst = Inst::Jump { dest: target(5) };
2415 inst.emit(&mut buf, &info, &mut state);
2416
2417 buf.bind_label(label(5), state.ctrl_plane_mut());
2418 let inst = Inst::Jump { dest: target(7) };
2419 inst.emit(&mut buf, &info, &mut state);
2420
2421 buf.bind_label(label(6), state.ctrl_plane_mut());
2422 let inst = Inst::Nop4;
2423 inst.emit(&mut buf, &info, &mut state);
2424
2425 buf.bind_label(label(7), state.ctrl_plane_mut());
2426 let inst = Inst::Ret {};
2427 inst.emit(&mut buf, &info, &mut state);
2428
2429 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2430
2431 let golden_data = vec![
2432 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2433 0x1f, 0x20, 0x03, 0xd5, // nop
2434 0x1f, 0x20, 0x03, 0xd5, // nop
2435 0xfd, 0xff, 0xff, 0x17, // b 0
2436 0x1f, 0x20, 0x03, 0xd5, // nop
2437 0xc0, 0x03, 0x5f, 0xd6, // ret
2438 ];
2439
2440 assert_eq!(&golden_data[..], &buf.data[..]);
2441 }
2442
2443 #[test]
2444 fn test_handle_branch_cycle() {
2445 // label0:
2446 // b label1
2447 // label1:
2448 // b label2
2449 // label2:
2450 // b label3
2451 // label3:
2452 // b label4
2453 // label4:
2454 // b label1 // note: not label0 (to make it interesting).
2455 //
2456 // -- should become:
2457 //
2458 // label0, label1, ..., label4:
2459 // b label0
2460 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2461 let mut buf = MachBuffer::new();
2462 let mut state = <Inst as MachInstEmit>::State::default();
2463 let constants = Default::default();
2464
2465 buf.reserve_labels_for_blocks(5);
2466
2467 buf.bind_label(label(0), state.ctrl_plane_mut());
2468 let inst = Inst::Jump { dest: target(1) };
2469 inst.emit(&mut buf, &info, &mut state);
2470
2471 buf.bind_label(label(1), state.ctrl_plane_mut());
2472 let inst = Inst::Jump { dest: target(2) };
2473 inst.emit(&mut buf, &info, &mut state);
2474
2475 buf.bind_label(label(2), state.ctrl_plane_mut());
2476 let inst = Inst::Jump { dest: target(3) };
2477 inst.emit(&mut buf, &info, &mut state);
2478
2479 buf.bind_label(label(3), state.ctrl_plane_mut());
2480 let inst = Inst::Jump { dest: target(4) };
2481 inst.emit(&mut buf, &info, &mut state);
2482
2483 buf.bind_label(label(4), state.ctrl_plane_mut());
2484 let inst = Inst::Jump { dest: target(1) };
2485 inst.emit(&mut buf, &info, &mut state);
2486
2487 let buf = buf.finish(&constants, state.ctrl_plane_mut());
2488
2489 let golden_data = vec![
2490 0x00, 0x00, 0x00, 0x14, // b 0
2491 ];
2492
2493 assert_eq!(&golden_data[..], &buf.data[..]);
2494 }
2495
2496 #[test]
2497 fn metadata_records() {
2498 let mut buf = MachBuffer::<Inst>::new();
2499 let ctrl_plane = &mut Default::default();
2500 let constants = Default::default();
2501
2502 buf.reserve_labels_for_blocks(3);
2503
2504 buf.bind_label(label(0), ctrl_plane);
2505 buf.put1(1);
2506 buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
2507 buf.put1(2);
2508 buf.add_trap(TrapCode::INTEGER_OVERFLOW);
2509 buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
2510 buf.add_call_site(&[
2511 (None.into(), label(1)),
2512 (Some(ExceptionTag::new(42)).into(), label(2)),
2513 ]);
2514 buf.add_reloc(
2515 Reloc::Abs4,
2516 &ExternalName::User(UserExternalNameRef::new(0)),
2517 0,
2518 );
2519 buf.put1(3);
2520 buf.add_reloc(
2521 Reloc::Abs8,
2522 &ExternalName::User(UserExternalNameRef::new(1)),
2523 1,
2524 );
2525 buf.put1(4);
2526 buf.bind_label(label(1), ctrl_plane);
2527 buf.put1(0xff);
2528 buf.bind_label(label(2), ctrl_plane);
2529 buf.put1(0xff);
2530
2531 let buf = buf.finish(&constants, ctrl_plane);
2532
2533 assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
2534 assert_eq!(
2535 buf.traps()
2536 .iter()
2537 .map(|trap| (trap.offset, trap.code))
2538 .collect::<Vec<_>>(),
2539 vec![
2540 (1, TrapCode::HEAP_OUT_OF_BOUNDS),
2541 (2, TrapCode::INTEGER_OVERFLOW),
2542 (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
2543 ]
2544 );
2545 let call_sites: Vec<_> = buf.call_sites().collect();
2546 assert_eq!(call_sites[0].ret_addr, 2);
2547 assert_eq!(
2548 call_sites[0].exception_handlers,
2549 &[(None.into(), 4), (Some(ExceptionTag::new(42)).into(), 5)]
2550 );
2551 assert_eq!(
2552 buf.relocs()
2553 .iter()
2554 .map(|reloc| (reloc.offset, reloc.kind))
2555 .collect::<Vec<_>>(),
2556 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2557 );
2558 }
2559}