cranelift_codegen/machinst/abi.rs

//! Implementation of a vanilla ABI, shared between several machines. The
//! implementation here assumes that arguments will be passed in registers
//! first, then additional args on the stack; that the stack grows downward,
//! contains a standard frame (return address and frame pointer), and the
//! compiler is otherwise free to allocate space below that with its choice of
//! layout; and that the machine has some notion of caller- and callee-save
//! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
//! mold and thus both of these backends use this shared implementation.
//!
//! See the documentation in specific machine backends for the "instantiation"
//! of this generic ABI, i.e., which registers are caller/callee-save, arguments
//! and return values, and any other special requirements.
//!
//! For now the implementation here assumes a 64-bit machine, but we intend to
//! make this 32/64-bit-generic shortly.
//!
//! # Vanilla ABI
//!
//! First, arguments and return values are passed in registers up to a certain
//! fixed count, after which they overflow onto the stack. Multiple return
//! values either fit in registers, or are returned in a separate return-value
//! area on the stack, given by a hidden extra parameter.
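//!
//! As an illustrative sketch (the register assignments shown are the x86-64
//! SystemV integer-argument ones; each backend's `abi.rs` is authoritative),
//! a call with eight `i64` arguments might be laid out as:
//!
//! ```plain
//! fn f(a, b, c, d, e, f, g, h: i64)
//!   a..f -> rdi, rsi, rdx, rcx, r8, r9   (register args)
//!   g, h -> [SP+0], [SP+8]               (stack args, in the caller's
//!                                          outgoing-argument area)
//! ```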
//!
//! Note that the exact stack layout is up to us. We settled on the
//! below design based on several requirements. In particular, we need
//! to be able to generate instructions (or instruction sequences) to
//! access arguments, stack slots, and spill slots before we know how
//! many spill slots or clobber-saves there will be, because of our
//! pass structure. We also prefer positive offsets to negative
//! offsets because of an asymmetry in some machines' addressing modes
//! (e.g., on AArch64, positive offsets have a larger possible range
//! without a long-form sequence to synthesize an arbitrary
//! offset). We also need clobber-save registers to be "near" the
//! frame pointer: Windows unwind information requires it to be within
//! 240 bytes of RBP. Finally, it is not allowed to access memory
//! below the current SP value.
//!
//! We assume that a prologue first pushes the frame pointer (and
//! return address above that, if the machine does not do that in
//! hardware). We set FP to point to this two-word frame record. We
//! store all other frame slots below this two-word frame record, as
//! well as enough space for arguments to the largest possible
//! function call. The stack pointer then remains at this position
//! for the duration of the function, allowing us to address all
//! frame storage at positive offsets from SP.
//!
//! Note that if we ever support dynamic stack-space allocation (for
//! `alloca`), we will need a way to reference spill slots and stack
//! slots relative to a dynamic SP, because we will no longer be able
//! to know a static offset from SP to the slots at any particular
//! program point. Probably the best solution at that point will be to
//! revert to using the frame pointer as the reference for all slots,
//! to allow generating spill/reload and stackslot accesses before we
//! know how large the clobber-saves will be.
//!
//! # Stack Layout
//!
//! The stack looks like:
//!
//! ```plain
//!   (high address)
//!                              |          ...              |
//!                              | caller frames             |
//!                              |          ...              |
//!                              +===========================+
//!                              |          ...              |
//!                              | stack args                |
//! Canonical Frame Address -->  | (accessed via FP)         |
//!                              +---------------------------+
//! SP at function entry ----->  | return address            |
//!                              +---------------------------+
//! FP after prologue -------->  | FP (pushed by prologue)   |
//!                              +---------------------------+           -----
//!                              |          ...              |             |
//!                              | clobbered callee-saves    |             |
//! unwind-frame base -------->  | (pushed by prologue)      |             |
//!                              +---------------------------+   -----     |
//!                              |          ...              |     |       |
//!                              | spill slots               |     |       |
//!                              | (accessed via SP)         |   fixed   active
//!                              |          ...              |   frame    size
//!                              | stack slots               |  storage    |
//!                              | (accessed via SP)         |    size     |
//!                              | (alloc'd by prologue)     |     |       |
//!                              +---------------------------+   -----     |
//!                              | [alignment as needed]     |             |
//!                              |          ...              |             |
//!                              | args for largest call     |             |
//! SP ----------------------->  | (alloc'd by prologue)     |             |
//!                              +===========================+           -----
//!
//!   (low address)
//! ```
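//!
//! As a worked example with hypothetical sizes: if a function reserves 32
//! bytes for outgoing call arguments, 64 bytes of fixed frame storage
//! (stack slots plus spill slots), and 16 bytes of clobber-saves, then FP
//! sits at SP + 32 + 64 + 16 = SP + 112, and the first stack slot lives at
//! SP + 32 (see `FrameLayout::sp_to_fp` and
//! `FrameLayout::sp_to_sized_stack_slots` below).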
//!
//! # Multi-value Returns
//!
//! We support multi-value returns by using multiple return-value
//! registers. In some cases this is an extension of the base system
//! ABI. See each platform's `abi.rs` implementation for details.

use crate::CodegenError;
use crate::entity::SecondaryMap;
use crate::ir::{ArgumentExtension, ArgumentPurpose, ExceptionTag, Signature};
use crate::ir::{StackSlotKey, types::*};
use crate::isa::TargetIsa;
use crate::settings::ProbestackStrategy;
use crate::{ir, isa};
use crate::{machinst::*, trace};
use alloc::boxed::Box;
use regalloc2::{MachineEnv, PReg, PRegSet};
use rustc_hash::FxHashMap;
use smallvec::smallvec;
use std::collections::HashMap;
use std::marker::PhantomData;

/// A small vector of instructions (with some reasonable size); appropriate for
/// a small fixed sequence implementing one operation.
pub type SmallInstVec<I> = SmallVec<[I; 4]>;

/// A type used by backends to track argument-binding info in the "args"
/// pseudoinst. The pseudoinst holds a vec of `ArgPair` structs.
#[derive(Clone, Debug)]
pub struct ArgPair {
    /// The vreg that is defined by this args pseudoinst.
    pub vreg: Writable<Reg>,
    /// The preg that the arg arrives in; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A type used by backends to track return register binding info in the "ret"
/// pseudoinst. The pseudoinst holds a vec of `RetPair` structs.
#[derive(Clone, Debug)]
pub struct RetPair {
    /// The vreg that is returned by this pseudoinst.
    pub vreg: Reg,
    /// The preg that the value is returned through; this constrains the
    /// vreg's placement at the pseudoinst.
    pub preg: Reg,
}

/// A location for (part of) an argument or return value. These "storage slots"
/// are specified for each register-sized part of an argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ABIArgSlot {
    /// In a real register.
    Reg {
        /// Register that holds this arg.
        reg: RealReg,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
    /// Arguments only: on stack, at given offset from SP at entry.
    Stack {
        /// Offset of this arg relative to the base of stack args.
        offset: i64,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
}

impl ABIArgSlot {
    /// The type of the value that will be stored in this slot.
    pub fn get_type(&self) -> ir::Type {
        match self {
            ABIArgSlot::Reg { ty, .. } => *ty,
            ABIArgSlot::Stack { ty, .. } => *ty,
        }
    }
}

/// A vector of `ABIArgSlot`s. Inline capacity for one element because basically
/// 100% of values use one slot. Only `i128`s need multiple slots, and they are
/// super rare (and never happen with Wasm).
pub type ABIArgSlotVec = SmallVec<[ABIArgSlot; 1]>;

/// An ABIArg is composed of one or more parts. This allows for a CLIF-level
/// Value to be passed with its parts in more than one location at the ABI
/// level. For example, a 128-bit integer may be passed in two 64-bit registers,
/// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
/// number of "parts" should correspond to the number of registers used to store
/// this type according to the machine backend.
///
/// As an invariant, the `purpose` for every part must match. As a further
/// invariant, a `StructArg` part cannot appear with any other part.
#[derive(Clone, Debug)]
pub enum ABIArg {
    /// Storage slots (registers or stack locations) for each part of the
    /// argument value. The number of slots must equal the number of register
    /// parts used to store a value of this type.
    Slots {
        /// Slots, one per register part.
        slots: ABIArgSlotVec,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Structure argument. We reserve stack space for it, but the CLIF-level
    /// semantics are a little weird: the value passed to the call instruction,
    /// and received in the corresponding block param, is a *pointer*. On the
    /// caller side, we memcpy the data from the passed-in pointer to the stack
    /// area; on the callee side, we compute a pointer to this stack area and
    /// provide that as the argument's value.
    StructArg {
        /// Offset of this arg relative to base of stack args.
        offset: i64,
        /// Size of this arg on the stack.
        size: u64,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Implicit argument. Similar to a StructArg, except that we have the
    /// target type, not a pointer type, at the CLIF-level. This argument is
    /// still being passed via reference implicitly.
    ImplicitPtrArg {
        /// Register or stack slot holding a pointer to the buffer.
        pointer: ABIArgSlot,
        /// Offset of the argument buffer.
        offset: i64,
        /// Type of the implicit argument.
        ty: Type,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
}

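// As an illustrative sketch (the register names here are hypothetical), a
// 128-bit integer argument passed in two 64-bit registers would be
// represented as:
//
//     ABIArg::Slots {
//         slots: smallvec![
//             ABIArgSlot::Reg { reg: r0, ty: I64, extension: ArgumentExtension::None },
//             ABIArgSlot::Reg { reg: r1, ty: I64, extension: ArgumentExtension::None },
//         ],
//         purpose: ArgumentPurpose::Normal,
//     }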
impl ABIArg {
    /// Create an ABIArg from one register.
    pub fn reg(
        reg: RealReg,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Reg { reg, ty, extension }],
            purpose,
        }
    }

    /// Create an ABIArg from one stack slot.
    pub fn stack(
        offset: i64,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Stack {
                offset,
                ty,
                extension,
            }],
            purpose,
        }
    }
}

/// Are we computing information about arguments or return values? Much of the
/// handling is factored out into common routines; this enum allows us to
/// distinguish which case we're handling.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgsOrRets {
    /// Arguments.
    Args,
    /// Return values.
    Rets,
}

/// Abstract location for a machine-specific ABI impl to translate into the
/// appropriate addressing mode.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StackAMode {
    /// Offset into the current frame's argument area.
    IncomingArg(i64, u32),
    /// Offset within the stack slots in the current frame.
    Slot(i64),
    /// Offset into the callee frame's argument area.
    OutgoingArg(i64),
}

impl StackAMode {
    fn offset_by(&self, offset: u32) -> Self {
        match self {
            StackAMode::IncomingArg(off, size) => {
                StackAMode::IncomingArg(off.checked_add(i64::from(offset)).unwrap(), *size)
            }
            StackAMode::Slot(off) => StackAMode::Slot(off.checked_add(i64::from(offset)).unwrap()),
            StackAMode::OutgoingArg(off) => {
                StackAMode::OutgoingArg(off.checked_add(i64::from(offset)).unwrap())
            }
        }
    }
}
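
// For example, `StackAMode::Slot(16).offset_by(8)` yields `StackAMode::Slot(24)`,
// and `StackAMode::IncomingArg(0, 32).offset_by(8)` yields `IncomingArg(8, 32)`:
// the offset is bumped, while the recorded argument-area size is unchanged.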

/// Trait implemented by machine-specific backend to represent ISA flags.
pub trait IsaFlags: Clone {
    /// Get a flag indicating whether forward-edge CFI is enabled.
    fn is_forward_edge_cfi_enabled(&self) -> bool {
        false
    }
}

/// Used as an out-parameter to accumulate a sequence of `ABIArg`s in
/// `ABIMachineSpec::compute_arg_locs`. Wraps the shared allocation for all
/// `ABIArg`s in `SigSet` and exposes just the args for the current
/// `compute_arg_locs` call.
pub struct ArgsAccumulator<'a> {
    sig_set_abi_args: &'a mut Vec<ABIArg>,
    start: usize,
    non_formal_flag: bool,
}

impl<'a> ArgsAccumulator<'a> {
    fn new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self {
        let start = sig_set_abi_args.len();
        ArgsAccumulator {
            sig_set_abi_args,
            start,
            non_formal_flag: false,
        }
    }

    /// Push an `ABIArg` for a formal (CLIF-level) parameter or return value.
    #[inline]
    pub fn push(&mut self, arg: ABIArg) {
        debug_assert!(!self.non_formal_flag);
        self.sig_set_abi_args.push(arg)
    }

    /// Push an `ABIArg` for a synthetic, non-formal argument (such as a
    /// return-area pointer). After this, no more formal args may be pushed.
    #[inline]
    pub fn push_non_formal(&mut self, arg: ABIArg) {
        self.non_formal_flag = true;
        self.sig_set_abi_args.push(arg)
    }

    /// Get the args pushed so far by this accumulator.
    #[inline]
    pub fn args(&self) -> &[ABIArg] {
        &self.sig_set_abi_args[self.start..]
    }

    /// Get the args pushed so far by this accumulator, mutably.
    #[inline]
    pub fn args_mut(&mut self) -> &mut [ABIArg] {
        &mut self.sig_set_abi_args[self.start..]
    }
}
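
// A typical `compute_arg_locs` implementation (a sketch of the intended
// usage, not any specific backend) pushes one `ABIArg` per CLIF parameter
// via `push`, then appends any trailing synthetic argument (such as the
// return-area pointer) via `push_non_formal`, which ensures that no formal
// argument can be pushed after it.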

/// Trait implemented by machine-specific backend to provide information about
/// register assignments and to allow generating the specific instructions for
/// stack loads/saves, prologues/epilogues, etc.
pub trait ABIMachineSpec {
    /// The instruction type.
    type I: VCodeInst;

    /// The ISA flags type.
    type F: IsaFlags;

    /// This is the limit for the size of argument and return-value areas on the
    /// stack. We place a reasonable limit here to avoid integer overflow issues
    /// with 32-bit arithmetic.
    const STACK_ARG_RET_SIZE_LIMIT: u32;

    /// Returns the number of bits in a word, i.e., 32 or 64 for a 32- or
    /// 64-bit architecture.
    fn word_bits() -> u32;

    /// Returns the number of bytes in a word.
    fn word_bytes() -> u32 {
        Self::word_bits() / 8
    }

    /// Returns the word-size integer type.
    fn word_type() -> Type {
        match Self::word_bits() {
            32 => I32,
            64 => I64,
            _ => unreachable!(),
        }
    }

    /// Returns the word register class.
    fn word_reg_class() -> RegClass {
        RegClass::Int
    }

    /// Returns the required stack alignment in bytes.
    fn stack_align(call_conv: isa::CallConv) -> u32;

    /// Process a list of parameters or return values and allocate them to registers
    /// and stack slots.
    ///
    /// The argument locations should be pushed onto the given `ArgsAccumulator`
    /// in order. Any extra arguments added (such as return area pointers)
    /// should come at the end of the list so that the first N lowered
    /// parameters align with the N clif parameters.
    ///
    /// Returns the stack space used (rounded up as alignment requires), and,
    /// if `add_ret_area_ptr` was passed, the index of the extra synthetic arg
    /// that was added.
    fn compute_arg_locs(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        params: &[ir::AbiParam],
        args_or_rets: ArgsOrRets,
        add_ret_area_ptr: bool,
        args: ArgsAccumulator,
    ) -> CodegenResult<(u32, Option<usize>)>;

    /// Generate a load from the stack.
    fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Generate a store to the stack.
    fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate a move.
    fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate an integer-extend operation.
    fn gen_extend(
        to_reg: Writable<Reg>,
        from_reg: Reg,
        is_signed: bool,
        from_bits: u8,
        to_bits: u8,
    ) -> Self::I;

    /// Generate an "args" pseudo-instruction to capture input args in
    /// registers.
    fn gen_args(args: Vec<ArgPair>) -> Self::I;

    /// Generate a "rets" pseudo-instruction that moves vregs to return
    /// registers.
    fn gen_rets(rets: Vec<RetPair>) -> Self::I;

    /// Generate an add-with-immediate. Note that even if this uses a scratch
    /// register, it must satisfy two requirements:
    ///
    /// - The add-imm sequence must only clobber caller-save registers that are
    ///   not used for arguments, because it will be placed in the prologue
    ///   before the clobbered callee-save registers are saved.
    ///
    /// - The add-imm sequence must work correctly when `from_reg` and/or
    ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
    fn gen_add_imm(
        call_conv: isa::CallConv,
        into_reg: Writable<Reg>,
        from_reg: Reg,
        imm: u32,
    ) -> SmallInstVec<Self::I>;

    /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
    /// the stack pointer is less than the given limit register (assuming the
    /// stack grows downward).
    fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;

    /// Generate an instruction to compute an address of a stack slot (FP- or
    /// SP-based offset).
    fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I;

    /// Get a fixed register to use to compute a stack limit. This is needed for
    /// certain sequences generated after the register allocator has already
    /// run. This must satisfy two requirements:
    ///
    /// - It must be a caller-save register that is not used for arguments,
    ///   because it will be clobbered in the prologue before the clobbered
    ///   callee-save registers are saved.
    ///
    /// - It must be safe to pass as an argument and/or destination to
    ///   `gen_add_imm()`. This is relevant when an addition with a large
    ///   immediate needs its own temporary; it cannot use the same fixed
    ///   temporary as this one.
    fn get_stacklimit_reg(call_conv: isa::CallConv) -> Reg;

    /// Generate a load from the given [base+offset] address.
    fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;

    /// Generate a store to the given [base+offset] address.
    fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;

    /// Adjust the stack pointer up or down.
    fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;

    /// Compute a FrameLayout structure containing a sorted list of all clobbered
    /// registers that are callee-saved according to the ABI, as well as the sizes
    /// of all parts of the stack frame.  The result is used to emit the prologue
    /// and epilogue routines.
    fn compute_frame_layout(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        sig: &Signature,
        regs: &[Writable<RealReg>],
        function_calls: FunctionCalls,
        incoming_args_size: u32,
        tail_args_size: u32,
        stackslots_size: u32,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> FrameLayout;

    /// Generate the usual frame-setup sequence for this architecture: e.g.,
    /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
    /// AArch64.
    fn gen_prologue_frame_setup(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate the usual frame-restore sequence for this architecture.
    fn gen_epilogue_frame_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a return instruction.
    fn gen_return(
        call_conv: isa::CallConv,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a probestack call.
    fn gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32);

    /// Generate an inline stack probe.
    fn gen_inline_probestack(
        insts: &mut SmallInstVec<Self::I>,
        call_conv: isa::CallConv,
        frame_size: u32,
        guard_size: u32,
    );

    /// Generate a clobber-save sequence. The implementation here should return
    /// a sequence of instructions that "push" or otherwise save to the stack all
    /// registers written/modified by the function body that are callee-saved.
    /// The sequence of instructions should adjust the stack pointer downward,
    /// and should align as necessary according to ABI requirements.
    fn gen_clobber_save(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a clobber-restore sequence. This sequence should perform the
    /// opposite of the clobber-save sequence generated above, assuming that SP
    /// going into the sequence is at the same point that it was left when the
    /// clobber-save sequence finished.
    fn gen_clobber_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a memcpy invocation. Used to set up struct args. Takes `src`
    /// and `dst` as read-only inputs, along with a callback for allocating
    /// temporary registers.
    fn gen_memcpy<F: FnMut(Type) -> Writable<Reg>>(
        call_conv: isa::CallConv,
        dst: Reg,
        src: Reg,
        size: usize,
        alloc_tmp: F,
    ) -> SmallVec<[Self::I; 8]>;

    /// Get the number of spillslots required for the given register-class.
    fn get_number_of_spillslots_for_value(
        rc: RegClass,
        target_vector_bytes: u32,
        isa_flags: &Self::F,
    ) -> u32;

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    fn get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv;

    /// Get all caller-save registers, that is, registers that we expect
    /// not to be saved across a call to a callee with the given ABI.
    fn get_regs_clobbered_by_call(
        call_conv_of_callee: isa::CallConv,
        is_exception: bool,
    ) -> PRegSet;

    /// Get the needed extension mode, given the mode attached to the argument
    /// in the signature and the calling convention. The input (the attribute in
    /// the signature) specifies what extension type should be done *if* the ABI
    /// requires extension to the full register; this method's return value
    /// indicates whether the extension actually *will* be done.
    fn get_ext_mode(
        call_conv: isa::CallConv,
        specified: ir::ArgumentExtension,
    ) -> ir::ArgumentExtension;

    /// Get a temporary register that is available to use after a call
    /// completes and that does not interfere with register-carried
    /// return values. This is used to move stack-carried return
    /// values directly into spillslots if needed.
    fn retval_temp_reg(call_conv_of_callee: isa::CallConv) -> Writable<Reg>;

    /// Get the exception payload registers, if any, for a calling
    /// convention.
    ///
    /// Note that the argument here is the calling convention of the *callee*.
    /// This might differ from the caller but the exceptional payloads that are
    /// available are defined by the callee, not the caller.
    fn exception_payload_regs(callee_conv: isa::CallConv) -> &'static [Reg] {
        let _ = callee_conv;
        &[]
    }
}

/// Out-of-line data for calls, to keep the size of `Inst` down.
#[derive(Clone, Debug)]
pub struct CallInfo<T> {
    /// Receiver of this call.
    pub dest: T,
    /// Register uses of this call.
    pub uses: CallArgList,
    /// Register defs of this call.
    pub defs: CallRetList,
    /// Registers clobbered by this call, as per its calling convention.
    pub clobbers: PRegSet,
    /// The calling convention of the callee.
    pub callee_conv: isa::CallConv,
    /// The calling convention of the caller.
    pub caller_conv: isa::CallConv,
    /// The number of bytes that the callee will pop from the stack for the
    /// caller, if any. (Used for popping stack arguments with the `tail`
    /// calling convention.)
    pub callee_pop_size: u32,
    /// Information for a try-call, if this is one. We combine
    /// handling of calls and try-calls as much as possible to share
    /// argument/return logic; they mostly differ in the metadata that
    /// they emit, which this information feeds into.
    pub try_call_info: Option<TryCallInfo>,
    /// Whether this call is patchable.
    pub patchable: bool,
}

/// Out-of-line information present on `try_call` instructions only:
/// information that is used to generate exception-handling tables and
/// link up to destination blocks properly.
#[derive(Clone, Debug)]
pub struct TryCallInfo {
    /// The target to jump to on a normal return.
    pub continuation: MachLabel,
    /// Exception tags to catch and corresponding destination labels.
    pub exception_handlers: Box<[TryCallHandler]>,
}

/// Information about an individual handler at a try-call site.
#[derive(Clone, Debug)]
pub enum TryCallHandler {
    /// If the tag matches (given the current context), recover at the
    /// label.
    Tag(ExceptionTag, MachLabel),
    /// Recover at the label unconditionally.
    Default(MachLabel),
    /// Set the dynamic context for interpreting tags at this point in
    /// the handler list.
    Context(Reg),
}

impl<T> CallInfo<T> {
    /// Creates an empty set of info with no clobbers/uses/etc with the
    /// specified ABI.
    pub fn empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T> {
        CallInfo {
            dest,
            uses: smallvec![],
            defs: smallvec![],
            clobbers: PRegSet::empty(),
            caller_conv: call_conv,
            callee_conv: call_conv,
            callee_pop_size: 0,
            try_call_info: None,
            patchable: false,
        }
    }
}

/// The id of an ABI signature within the `SigSet`.
#[derive(Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Sig(u32);
cranelift_entity::entity_impl!(Sig);

impl Sig {
    fn prev(self) -> Option<Sig> {
        self.0.checked_sub(1).map(Sig)
    }
}

/// ABI information shared between body (callee) and caller.
#[derive(Clone, Debug)]
pub struct SigData {
    /// Currently both return values and arguments are stored contiguously,
    /// in that order, in a single shared vector, `SigSet::abi_args`.
    ///
    /// ```plain
    ///                  +----------------------------------------------+
    ///                  | return values                                |
    ///                  | ...                                          |
    ///   rets_end   --> +----------------------------------------------+
    ///                  | arguments                                    |
    ///                  | ...                                          |
    ///   args_end   --> +----------------------------------------------+
    ///
    /// ```
    ///
    /// Note we only store two offsets as rets_end == args_start, and rets_start == prev.args_end.
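    ///
    /// For example (hypothetical indices): if a first signature interns two
    /// rets and three args, its entries occupy `abi_args[0..2]` (rets) and
    /// `abi_args[2..5]` (args), so it stores `rets_end = 2` and
    /// `args_end = 5`. A second signature's rets then start at index 5, the
    /// previous signature's `args_end`, as computed in `SigSet::rets`.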
    ///
    /// Argument location ending offset (regs or stack slots). Stack offsets are relative to
    /// SP on entry to function.
    ///
    /// This is an index into `SigSet::abi_args`.
    args_end: u32,

    /// Return-value location ending offset. Stack offsets are relative to the return-area
    /// pointer.
    ///
    /// This is an index into `SigSet::abi_args`.
    rets_end: u32,

    /// Space on stack used to store arguments. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_arg_space: u32,

    /// Space on stack used to store return values. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_ret_space: u32,

    /// Index in `args` of the stack-return-value-area argument.
    stack_ret_arg: Option<u16>,

    /// Calling convention used.
    call_conv: isa::CallConv,
}

impl SigData {
    /// Get total stack space required for arguments.
    pub fn sized_stack_arg_space(&self) -> u32 {
        self.sized_stack_arg_space
    }

    /// Get total stack space required for return values.
    pub fn sized_stack_ret_space(&self) -> u32 {
        self.sized_stack_ret_space
    }

    /// Get calling convention used.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// The index of the stack-return-value-area argument, if any.
    pub fn stack_ret_arg(&self) -> Option<u16> {
        self.stack_ret_arg
    }
}

/// A (mostly) deduplicated set of ABI signatures.
///
/// We say "mostly" because we do not dedupe between signatures interned via
/// `ir::SigRef` (direct and indirect calls; the vast majority of signatures in
/// this set) vs via `ir::Signature` (the callee itself and libcalls). Doing
/// this final bit of deduplication would require filling out the
/// `ir_signature_to_abi_sig`, which is a bunch of allocations (not just the
/// hash map itself but params and returns vecs in each signature) that we want
/// to avoid.
///
/// In general, prefer using the `ir::SigRef`-taking methods to the
/// `ir::Signature`-taking methods when you can get away with it, as they don't
/// require cloning non-copy types that will trigger heap allocations.
///
/// This type can be indexed by `Sig` to access its associated `SigData`.
pub struct SigSet {
    /// Interned `ir::Signature`s that we already have an ABI signature for.
    ir_signature_to_abi_sig: FxHashMap<ir::Signature, Sig>,

    /// Interned `ir::SigRef`s that we already have an ABI signature for.
    ir_sig_ref_to_abi_sig: SecondaryMap<ir::SigRef, Option<Sig>>,

    /// A single, shared allocation for all `ABIArg`s used by all
    /// `SigData`s. Each `SigData` references its args/rets via indices into
    /// this allocation.
    abi_args: Vec<ABIArg>,

    /// The actual ABI signatures, keyed by `Sig`.
    sigs: PrimaryMap<Sig, SigData>,
}

impl SigSet {
    /// Construct a new `SigSet`, interning all of the signatures used by the
    /// given function.
    pub fn new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self>
    where
        M: ABIMachineSpec,
    {
        // Rough preallocation estimate: assume an average of six combined
        // args and rets per signature.
        let arg_estimate = func.dfg.signatures.len() * 6;

        let mut sigs = SigSet {
            ir_signature_to_abi_sig: FxHashMap::default(),
            ir_sig_ref_to_abi_sig: SecondaryMap::with_capacity(func.dfg.signatures.len()),
            abi_args: Vec::with_capacity(arg_estimate),
            sigs: PrimaryMap::with_capacity(1 + func.dfg.signatures.len()),
        };

        sigs.make_abi_sig_from_ir_signature::<M>(func.signature.clone(), flags)?;
        for sig_ref in func.dfg.signatures.keys() {
            sigs.make_abi_sig_from_ir_sig_ref::<M>(sig_ref, &func.dfg, flags)?;
        }

        Ok(sigs)
    }

    /// Have we already interned an ABI signature for the given `ir::Signature`?
    pub fn have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool {
        self.ir_signature_to_abi_sig.contains_key(signature)
    }

    /// Construct and intern an ABI signature for the given `ir::Signature`.
    pub fn make_abi_sig_from_ir_signature<M>(
        &mut self,
        signature: ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        // Because the `HashMap` entry API requires taking ownership of the
        // lookup key -- and we want to avoid unnecessary clones of
        // `ir::Signature`s, even at the cost of duplicate lookups -- we can't
        // have a single, get-or-create-style method for interning
        // `ir::Signature`s into ABI signatures. So at least (debug) assert that
        // we aren't creating duplicate ABI signatures for the same
        // `ir::Signature`.
        debug_assert!(!self.have_abi_sig_for_signature(&signature));

        let sig_data = self.from_func_sig::<M>(&signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_signature_to_abi_sig.insert(signature, sig);
        Ok(sig)
    }

    fn make_abi_sig_from_ir_sig_ref<M>(
        &mut self,
        sig_ref: ir::SigRef,
        dfg: &ir::DataFlowGraph,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        if let Some(sig) = self.ir_sig_ref_to_abi_sig[sig_ref] {
            return Ok(sig);
        }
        let signature = &dfg.signatures[sig_ref];
        let sig_data = self.from_func_sig::<M>(signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_sig_ref_to_abi_sig[sig_ref] = Some(sig);
        Ok(sig)
    }

    /// Get the already-interned ABI signature id for the given `ir::SigRef`.
    pub fn abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig {
        self.ir_sig_ref_to_abi_sig[sig_ref]
            .expect("must call `make_abi_sig_from_ir_sig_ref` before `get_abi_sig_for_sig_ref`")
    }

    /// Get the already-interned ABI signature id for the given `ir::Signature`.
    pub fn abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig {
        self.ir_signature_to_abi_sig
            .get(signature)
            .copied()
            .expect("must call `make_abi_sig_from_ir_signature` before `get_abi_sig_for_signature`")
    }

    /// Compute the ABI signature data for the given `ir::Signature`, interning
    /// its arg/ret locations into this `SigSet`'s shared `abi_args` vector.
    pub fn from_func_sig<M: ABIMachineSpec>(
        &mut self,
        sig: &ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<SigData> {
        // Keep in sync with ensure_struct_return_ptr_is_returned
        if sig.uses_special_return(ArgumentPurpose::StructReturn) {
            panic!("Explicit StructReturn return value not allowed: {sig:?}")
        }
        let tmp;
        let returns = if let Some(struct_ret_index) =
            sig.special_param_index(ArgumentPurpose::StructReturn)
        {
            if !sig.returns.is_empty() {
                panic!("No return values are allowed when using StructReturn: {sig:?}");
            }
            tmp = [sig.params[struct_ret_index]];
            &tmp
        } else {
            sig.returns.as_slice()
        };

        // Compute args and retvals from signature. Handle retvals first,
        // because we may need to add a return-area arg to the args.

        // NOTE: We rely on the order of the args (rets -> args) inserted to compute the offsets in
        // `SigSet::args()` and `SigSet::rets()`. Therefore, we cannot change the two
        // compute_arg_locs order.
        let (sized_stack_ret_space, _) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &returns,
            ArgsOrRets::Rets,
            /* extra ret-area ptr = */ false,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        if !flags.enable_multi_ret_implicit_sret() {
            assert_eq!(sized_stack_ret_space, 0);
        }
        let rets_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the return size to something reasonable.
        if sized_stack_ret_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        let need_stack_return_area = sized_stack_ret_space > 0;
        if need_stack_return_area {
            assert!(!sig.uses_special_param(ir::ArgumentPurpose::StructReturn));
        }

        let (sized_stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.params,
            ArgsOrRets::Args,
            need_stack_return_area,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        let args_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the arg size to something reasonable.
        if sized_stack_arg_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        trace!(
            "ABISig: sig {:?} => args end = {} rets end = {}
             arg stack = {} ret stack = {} stack_ret_arg = {:?}",
            sig,
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
        );

        let stack_ret_arg = stack_ret_arg.map(|s| u16::try_from(s).unwrap());
        Ok(SigData {
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
            call_conv: sig.call_conv,
        })
    }

    /// Get this signature's ABI arguments.
    pub fn args(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` on how we store the offsets.
        let start = usize::try_from(sig_data.rets_end).unwrap();
        let end = usize::try_from(sig_data.args_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass the implicit pointer
    /// to the return-value area on the stack, if required.
    pub fn get_ret_arg(&self, sig: Sig) -> Option<ABIArg> {
        let sig_data = &self.sigs[sig];
        if let Some(i) = sig_data.stack_ret_arg {
            Some(self.args(sig)[usize::from(i)].clone())
        } else {
            None
        }
    }

    /// Get information specifying how to pass one argument.
    pub fn get_arg(&self, sig: Sig, idx: usize) -> ABIArg {
        self.args(sig)[idx].clone()
    }

    /// Get this signature's ABI returns.
    pub fn rets(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` on how we store the offsets.
        let start = usize::try_from(sig.prev().map_or(0, |prev| self.sigs[prev].args_end)).unwrap();
        let end = usize::try_from(sig_data.rets_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass one return value.
    pub fn get_ret(&self, sig: Sig, idx: usize) -> ABIArg {
        self.rets(sig)[idx].clone()
    }

    /// Get the number of arguments expected.
    pub fn num_args(&self, sig: Sig) -> usize {
        let len = self.args(sig).len();
        if self.sigs[sig].stack_ret_arg.is_some() {
            len - 1
        } else {
            len
        }
    }

    /// Get the number of return values expected.
    pub fn num_rets(&self, sig: Sig) -> usize {
        self.rets(sig).len()
    }
}

// NB: we do _not_ implement `IndexMut` because these signatures are
// deduplicated and shared!
impl std::ops::Index<Sig> for SigSet {
    type Output = SigData;

    fn index(&self, sig: Sig) -> &Self::Output {
        &self.sigs[sig]
    }
}

/// Structure describing the layout of a function's stack frame.
#[derive(Clone, Debug, Default)]
pub struct FrameLayout {
    /// Word size in bytes, so this struct can be
    /// monomorphic/independent of `ABIMachineSpec`.
    pub word_bytes: u32,

    /// N.B. The areas whose sizes are given in this structure fully
    /// cover the current function's stack frame, from high to low
    /// stack addresses in the sequence below.  Each size contains
    /// any alignment padding that may be required by the ABI.

    /// Size of incoming arguments on the stack.  This is not technically
    /// part of this function's frame, but code in the function will still
    /// need to access it.  Depending on the ABI, we may need to set up a
    /// frame pointer to do so; we also may need to pop this area from the
    /// stack upon return.
    pub incoming_args_size: u32,

    /// The size of the incoming argument area, taking into account any
    /// potential increase in size required for tail calls present in the
    /// function. In the case that no tail calls are present, this value
    /// will be the same as [`Self::incoming_args_size`].
    pub tail_args_size: u32,

    /// Size of the "setup area", typically holding the return address
    /// and/or the saved frame pointer.  This may be written either during
    /// the call itself (e.g. a pushed return address) or by code emitted
    /// from gen_prologue_frame_setup.  In any case, after that code has
    /// completed execution, the stack pointer is expected to point to the
    /// bottom of this area.  The same holds at the start of code emitted
    /// by gen_epilogue_frame_restore.
    pub setup_area_size: u32,

    /// Size of the area used to save callee-saved clobbered registers.
    /// This area is accessed by code emitted from gen_clobber_save and
    /// gen_clobber_restore.
    pub clobber_size: u32,

    /// Storage allocated for the fixed part of the stack frame.
    /// This contains stack slots and spill slots.
    pub fixed_frame_storage_size: u32,

    /// The size of all stackslots.
    pub stackslots_size: u32,

    /// Stack size to be reserved for outgoing arguments, if used by
    /// the current ABI, or 0 otherwise.  After gen_clobber_save and
    /// before gen_clobber_restore, the stack pointer points to the
    /// bottom of this area.
    pub outgoing_args_size: u32,

    /// Sorted list of callee-saved registers that are clobbered
    /// according to the ABI.  These registers will be saved and
    /// restored by gen_clobber_save and gen_clobber_restore.
    pub clobbered_callee_saves: Vec<Writable<RealReg>>,

    /// The function's call pattern classification.
    pub function_calls: FunctionCalls,
}

impl FrameLayout {
    /// Split the clobbered callee-save registers into integer-class and
    /// float-class groups.
    ///
    /// This method does not currently support vector-class callee-save
    /// registers because no current backend has them.
    pub fn clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>]) {
        let (ints, floats) = self.clobbered_callee_saves.split_at(
            self.clobbered_callee_saves
                .partition_point(|r| r.to_reg().class() == RegClass::Int),
        );
        debug_assert!(floats.iter().all(|r| r.to_reg().class() == RegClass::Float));
        (ints, floats)
    }
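
    // For example, if `clobbered_callee_saves` is `[int1, int2, float1]`
    // (all integer-class registers first, as the sorted-list invariant
    // requires), the split above returns `([int1, int2], [float1])`.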

    /// The distance from SP up to FP while the frame is active (not during
    /// prologue setup or epilogue tear down).
    pub fn active_size(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }

    /// Get the offset from the SP to the sized stack slots area.
    pub fn sp_to_sized_stack_slots(&self) -> u32 {
        self.outgoing_args_size
    }

    /// Get the offset of a spill slot from SP.
    pub fn spillslot_offset(&self, spillslot: SpillSlot) -> i64 {
        // Offset from beginning of spillslot area.
        let islot = spillslot.index() as i64;
        let spill_off = islot * self.word_bytes as i64;
        let sp_off = self.stackslots_size as i64 + spill_off;

        sp_off
    }

    /// Get the offset from SP up to FP.
    pub fn sp_to_fp(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }
}

/// ABI object for a function body.
pub struct Callee<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// Signature: arg and retval regs.
    sig: Sig,
    /// Defined dynamic types.
    dynamic_type_sizes: HashMap<Type, u32>,
    /// Offsets to each dynamic stackslot.
    dynamic_stackslots: PrimaryMap<DynamicStackSlot, u32>,
    /// Offsets to each sized stackslot.
    sized_stackslots: PrimaryMap<StackSlot, u32>,
    /// Descriptors for sized stackslots.
    sized_stackslot_keys: SecondaryMap<StackSlot, Option<StackSlotKey>>,
    /// Total stack size of all stackslots.
    stackslots_size: u32,
    /// Stack size to be reserved for outgoing arguments.
    outgoing_args_size: u32,
    /// Initially the number of bytes originating in the caller's frame where stack arguments will
    /// live. After lowering this number may be larger than the size expected by the function being
    /// compiled, as tail calls potentially require more space for stack arguments.
    tail_args_size: u32,
    /// Register-argument defs, to be provided to the `args`
    /// pseudo-inst, and pregs to constrain them to.
    reg_args: Vec<ArgPair>,
    /// Finalized frame layout for this function.
    frame_layout: Option<FrameLayout>,
    /// The register holding the return-area pointer, if needed.
    ret_area_ptr: Option<Reg>,
    /// Calling convention this function expects.
    call_conv: isa::CallConv,
    /// The settings controlling this function's compilation.
    flags: settings::Flags,
    /// The ISA-specific flag values controlling this function's compilation.
    isa_flags: M::F,
    /// If this function has a stack limit specified, then `Reg` is where the
    /// stack limit will be located after the instructions specified have been
    /// executed.
    ///
    /// Note that this is intended for insertion into the prologue, if
    /// present. Also note that because the instructions here execute in the
    /// prologue this happens after legalization/register allocation/etc so we
    /// need to be extremely careful with each instruction. The instructions are
    /// manually register-allocated and carefully only use caller-saved
    /// registers and keep nothing live after this sequence of instructions.
    stack_limit: Option<(Reg, SmallInstVec<M::I>)>,

    _mach: PhantomData<M>,
}

fn get_special_purpose_param_register(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    purpose: ir::ArgumentPurpose,
) -> Option<Reg> {
    let idx = f.signature.special_param_index(purpose)?;
    match &sigs.args(sig)[idx] {
        &ABIArg::Slots { ref slots, .. } => match &slots[0] {
            &ABIArgSlot::Reg { reg, .. } => Some(reg.into()),
            _ => None,
        },
        _ => None,
    }
}

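/// Round `val` up to the next multiple of `mask + 1`, where `mask` must be
/// one less than a power of two, returning `None` on overflow. For example,
/// `checked_round_up(13, 7)` rounds 13 up to the next multiple of 8,
/// yielding `Some(16)`.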
fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
    Some(val.checked_add(mask)? & !mask)
}

impl<M: ABIMachineSpec> Callee<M> {
    /// Create a new body ABI instance.
    pub fn new(
        f: &ir::Function,
        isa: &dyn TargetIsa,
        isa_flags: &M::F,
        sigs: &SigSet,
    ) -> CodegenResult<Self> {
        trace!("ABI: func signature {:?}", f.signature);

        let flags = isa.flags().clone();
        let sig = sigs.abi_sig_for_signature(&f.signature);

        let call_conv = f.signature.call_conv;
        // Only these calling conventions are supported.
        debug_assert!(
            call_conv == isa::CallConv::SystemV
                || call_conv == isa::CallConv::Tail
                || call_conv == isa::CallConv::Fast
                || call_conv == isa::CallConv::WindowsFastcall
                || call_conv == isa::CallConv::AppleAarch64
                || call_conv == isa::CallConv::Winch
                || call_conv == isa::CallConv::PreserveAll,
            "Unsupported calling convention: {call_conv:?}"
        );

        // Compute sized stackslot locations and total stackslot size.
        let mut end_offset: u32 = 0;
        let mut sized_stackslots = PrimaryMap::new();
        let mut sized_stackslot_keys = SecondaryMap::new();

        for (stackslot, data) in f.sized_stack_slots.iter() {
            // We start our computation possibly unaligned where the previous
            // stackslot left off.
            let unaligned_start_offset = end_offset;

            // The start of the stackslot must be aligned.
            //
            // We always at least machine-word-align slots, but also
            // satisfy the user's requested alignment.
            debug_assert!(data.align_shift < 32);
            let align = std::cmp::max(M::word_bytes(), 1u32 << data.align_shift);
            let mask = align - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            // The end offset is the start offset increased by the size.
            end_offset = start_offset
                .checked_add(data.size)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            debug_assert_eq!(stackslot.as_u32() as usize, sized_stackslots.len());
            sized_stackslots.push(start_offset);
            sized_stackslot_keys[stackslot] = data.key;
        }
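
        // Worked example (hypothetical slot sizes, 8-byte words): a 12-byte
        // slot at offset 0 ends at offset 12, so a following word-aligned
        // slot starts at the rounded-up offset 16.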

        // Compute dynamic stackslot locations and total stackslot size.
        let mut dynamic_stackslots = PrimaryMap::new();
        for (stackslot, data) in f.dynamic_stack_slots.iter() {
            debug_assert_eq!(stackslot.as_u32() as usize, dynamic_stackslots.len());

            // This computation is similar to the stackslots above
            let unaligned_start_offset = end_offset;

            let mask = M::word_bytes() - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            let ty = f.get_concrete_dynamic_ty(data.dyn_ty).ok_or_else(|| {
                CodegenError::Unsupported(format!("invalid dynamic vector type: {}", data.dyn_ty))
            })?;

            end_offset = start_offset
                .checked_add(isa.dynamic_vector_bytes(ty))
                .ok_or(CodegenError::ImplLimitExceeded)?;

            dynamic_stackslots.push(start_offset);
        }

        // The size of the stackslots needs to be word aligned
        let stackslots_size = checked_round_up(end_offset, M::word_bytes() - 1)
            .ok_or(CodegenError::ImplLimitExceeded)?;

        let mut dynamic_type_sizes = HashMap::with_capacity(f.dfg.dynamic_types.len());
        for (dyn_ty, _data) in f.dfg.dynamic_types.iter() {
            let ty = f
                .get_concrete_dynamic_ty(dyn_ty)
                .unwrap_or_else(|| panic!("invalid dynamic vector type: {dyn_ty}"));
            let size = isa.dynamic_vector_bytes(ty);
            dynamic_type_sizes.insert(ty, size);
        }

        // Figure out what instructions, if any, will be needed to check the
        // stack limit. This can either be specified as a special-purpose
        // argument or as a global value which often calculates the stack limit
        // from the arguments.
        let stack_limit = f
            .stack_limit
            .map(|gv| gen_stack_limit::<M>(f, sigs, sig, gv));

        let tail_args_size = sigs[sig].sized_stack_arg_space;

        Ok(Self {
            ir_sig: ensure_struct_return_ptr_is_returned(&f.signature),
            sig,
            dynamic_stackslots,
            dynamic_type_sizes,
            sized_stackslots,
            sized_stackslot_keys,
            stackslots_size,
            outgoing_args_size: 0,
            tail_args_size,
            reg_args: vec![],
            frame_layout: None,
            ret_area_ptr: None,
            call_conv,
            flags,
            isa_flags: isa_flags.clone(),
            stack_limit,
            _mach: PhantomData,
        })
    }
1328
1329    /// Inserts instructions necessary for checking the stack limit into the
1330    /// prologue.
1331    ///
1332    /// This function will generate instructions necessary for perform a stack
1333    /// check at the header of a function. The stack check is intended to trap
1334    /// if the stack pointer goes below a particular threshold, preventing stack
1335    /// overflow in wasm or other code. The `stack_limit` argument here is the
1336    /// register which holds the threshold below which we're supposed to trap.
1337    /// This function is known to allocate `stack_size` bytes and we'll push
1338    /// instructions onto `insts`.
1339    ///
1340    /// Note that the instructions generated here are special because this is
1341    /// happening so late in the pipeline (e.g. after register allocation). This
1342    /// means that we need to do manual register allocation here and also be
1343    /// careful to not clobber any callee-saved or argument registers. For now
1344    /// this routine makes do with the `spilltmp_reg` as one temporary
1345    /// register and a second, caller-saved register `tmp2`. This should be
1346    /// fine for us since no spills should happen in this sequence of
1347    /// instructions, so our registers won't get accidentally clobbered.
1348    ///
1349    /// No values can be live across the prologue, but that's ok here
1350    /// because we just need to perform the stack check before progressing
1351    /// with the rest of the function.
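    ///
    /// As a rough, machine-independent sketch, the sequence emitted for a
    /// nonzero `stack_size` is:
    ///
    /// ```plain
    ///   ;; only when stack_size >= 32k, guarding the add below from overflow:
    ///   trap_if SP < stack_limit
    ///   scratch := stack_limit + stack_size
    ///   trap_if SP < scratch
    /// ```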
1352    fn insert_stack_check(
1353        &self,
1354        stack_limit: Reg,
1355        stack_size: u32,
1356        insts: &mut SmallInstVec<M::I>,
1357    ) {
1358        // With no explicit stack allocated we can just emit the simple check of
1359        // the stack registers against the stack limit register, and trap if
1360        // it's out of bounds.
1361        if stack_size == 0 {
1362            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1363            return;
1364        }
1365
1366        // Note that the 32k stack size here is pretty special. See the
1367        // documentation in x86/abi.rs for why this is here. The general idea is
1368        // that we're protecting against overflow in the addition that happens
1369        // below.
1370        if stack_size >= 32 * 1024 {
1371            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1372        }
1373
1374        // Add the `stack_size` to `stack_limit`, placing the result in
1375        // `scratch`.
1376        //
1377        // Note though that `stack_limit`'s register may be the same as
1378        // `scratch`. If our stack size doesn't fit into an immediate this
1379        // means we need a second scratch register for loading the stack size
1380        // into a register.
1381        let scratch = Writable::from_reg(M::get_stacklimit_reg(self.call_conv));
1382        insts.extend(M::gen_add_imm(
1383            self.call_conv,
1384            scratch,
1385            stack_limit,
1386            stack_size,
1387        ));
1388        insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
1389    }
1390}
1391
1392/// Generates the instructions necessary for the `gv` to be materialized into a
1393/// register.
1394///
1395/// This function will return a register that will contain the result of
1396/// evaluating `gv`. It will also return any instructions necessary to calculate
1397/// the value of the register.
1398///
1399/// Note that global values are typically lowered to instructions via the
1400/// standard legalization pass. Unfortunately though prologue generation happens
1401/// so late in the pipeline that we can't use these legalization passes to
1402/// generate the instructions for `gv`. As a result we duplicate some lowering
1403/// of `gv` here and support only some global values. This is similar to what
1404/// the x86 backend does for now, and hopefully this can be somewhat cleaned up
1405/// in the future too!
1406///
1407/// Also note that this function will make use of `writable_spilltmp_reg()` as a
1408/// temporary register to store values in if necessary. Currently, after we
1409/// write to this register, no spills can occur before its use, because
1410/// this code doesn't participate in register allocation anyway!
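///
/// For example (an illustrative sketch, not the only supported shape): a
/// stack limit given as `gv0 = vmctx; gv1 = load.i64 notrap aligned gv0+8`
/// lowers here to a single load of the word at offset 8 from the vmcontext
/// register into a temporary register.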
1411fn gen_stack_limit<M: ABIMachineSpec>(
1412    f: &ir::Function,
1413    sigs: &SigSet,
1414    sig: Sig,
1415    gv: ir::GlobalValue,
1416) -> (Reg, SmallInstVec<M::I>) {
1417    let mut insts = smallvec![];
1418    let reg = generate_gv::<M>(f, sigs, sig, gv, &mut insts);
1419    (reg, insts)
1420}
1421
1422fn generate_gv<M: ABIMachineSpec>(
1423    f: &ir::Function,
1424    sigs: &SigSet,
1425    sig: Sig,
1426    gv: ir::GlobalValue,
1427    insts: &mut SmallInstVec<M::I>,
1428) -> Reg {
1429    match f.global_values[gv] {
1430        // Return the direct register the vmcontext is in
1431        ir::GlobalValueData::VMContext => {
1432            get_special_purpose_param_register(f, sigs, sig, ir::ArgumentPurpose::VMContext)
1433                .expect("no vmcontext parameter found")
1434        }
1435        // Load our base value into a register, then load from that register
1436        // in to a temporary register.
1437        ir::GlobalValueData::Load {
1438            base,
1439            offset,
1440            global_type: _,
1441            flags: _,
1442        } => {
1443            let base = generate_gv::<M>(f, sigs, sig, base, insts);
1444            let into_reg = Writable::from_reg(M::get_stacklimit_reg(f.stencil.signature.call_conv));
1445            insts.push(M::gen_load_base_offset(
1446                into_reg,
1447                base,
1448                offset.into(),
1449                M::word_type(),
1450            ));
1451            into_reg.to_reg()
1452        }
1453        ref other => panic!("global value for stack limit not supported: {other}"),
1454    }
1455}
1456
1457/// Returns true if the signature needs to be legalized.
1458fn missing_struct_return(sig: &ir::Signature) -> bool {
1459    sig.uses_special_param(ArgumentPurpose::StructReturn)
1460        && !sig.uses_special_return(ArgumentPurpose::StructReturn)
1461}
1462
1463fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
1464    // Keep in sync with Callee::new
1465    let mut sig = sig.clone();
1466    if sig.uses_special_return(ArgumentPurpose::StructReturn) {
1467        panic!("Explicit StructReturn return value not allowed: {sig:?}")
1468    }
1469    if let Some(struct_ret_index) = sig.special_param_index(ArgumentPurpose::StructReturn) {
1470        if !sig.returns.is_empty() {
1471            panic!("No return values are allowed when using StructReturn: {sig:?}");
1472        }
1473        sig.returns.insert(0, sig.params[struct_ret_index]);
1474    }
1475    sig
1476}
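
// For example (illustrative): legalizing a signature like
//   `(sret i64, i32) -> ()`
// yields
//   `(sret i64, i32) -> (sret i64)`
// so the struct-return pointer is also returned, as native ABIs generally
// require.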
1477
1478/// ### Pre-Regalloc Functions
1479///
1480/// These methods of `Callee` may only be called before regalloc.
1481impl<M: ABIMachineSpec> Callee<M> {
1482    /// Access the (possibly legalized) signature.
1483    pub fn signature(&self) -> &ir::Signature {
1484        debug_assert!(
1485            !missing_struct_return(&self.ir_sig),
1486            "`Callee::ir_sig` is always legalized"
1487        );
1488        &self.ir_sig
1489    }
1490
1491    /// Initialize. This is called after the Callee is constructed because it
1492    /// may allocate a temp vreg, which can only be allocated once the lowering
1493    /// context exists.
1494    pub fn init_retval_area(
1495        &mut self,
1496        sigs: &SigSet,
1497        vregs: &mut VRegAllocator<M::I>,
1498    ) -> CodegenResult<()> {
1499        if sigs[self.sig].stack_ret_arg.is_some() {
1500            let ret_area_ptr = vregs.alloc(M::word_type())?;
1501            self.ret_area_ptr = Some(ret_area_ptr.only_reg().unwrap());
1502        }
1503        Ok(())
1504    }
1505
1506    /// Get the return area pointer register, if any.
1507    pub fn ret_area_ptr(&self) -> Option<Reg> {
1508        self.ret_area_ptr
1509    }
1510
1511    /// Accumulate outgoing arguments.
1512    ///
1513    /// This ensures that at least `size` bytes are allocated in the prologue to
1514    /// be available for use in function calls to hold arguments and/or return
1515    /// values. If this function is called multiple times, the maximum of all
1516    /// `size` values will be available.
1517    pub fn accumulate_outgoing_args_size(&mut self, size: u32) {
1518        if size > self.outgoing_args_size {
1519            self.outgoing_args_size = size;
1520        }
1521    }
1522
1523    /// Accumulate the incoming argument area size requirements for a tail call,
1524    /// as it could be larger than the incoming arguments of the function
1525    /// currently being compiled.
1526    pub fn accumulate_tail_args_size(&mut self, size: u32) {
1527        if size > self.tail_args_size {
1528            self.tail_args_size = size;
1529        }
1530    }
1531
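    /// Returns whether forward-edge control-flow integrity is enabled for
    /// this function, per the target's ISA flags.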
1532    pub fn is_forward_edge_cfi_enabled(&self) -> bool {
1533        self.isa_flags.is_forward_edge_cfi_enabled()
1534    }
1535
1536    /// Get the calling convention implemented by this ABI object.
1537    pub fn call_conv(&self) -> isa::CallConv {
1538        self.call_conv
1539    }
1540
1541    /// Get the ABI-dependent MachineEnv for managing register allocation.
1542    pub fn machine_env(&self) -> &MachineEnv {
1543        M::get_machine_env(&self.flags, self.call_conv)
1544    }
1545
1546    /// The offsets of all sized stack slots (not spill slots) for debuginfo purposes.
1547    pub fn sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
1548        &self.sized_stackslots
1549    }
1550
1551    /// The offsets of all dynamic stack slots (not spill slots) for debuginfo purposes.
1552    pub fn dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32> {
1553        &self.dynamic_stackslots
1554    }
1555
1556    /// Generate an instruction which copies an argument to a destination
1557    /// register.
1558    pub fn gen_copy_arg_to_regs(
1559        &mut self,
1560        sigs: &SigSet,
1561        idx: usize,
1562        into_regs: ValueRegs<Writable<Reg>>,
1563        vregs: &mut VRegAllocator<M::I>,
1564    ) -> SmallInstVec<M::I> {
1565        let mut insts = smallvec![];
1566        let mut copy_arg_slot_to_reg = |slot: &ABIArgSlot, into_reg: &Writable<Reg>| {
1567            match slot {
1568                &ABIArgSlot::Reg { reg, .. } => {
1569                    // Add a preg -> def pair to the eventual `args`
1570                    // instruction.  Extension mode doesn't matter
1571                    // (we're copying out, not in; we ignore high bits
1572                    // by convention).
1573                    let arg = ArgPair {
1574                        vreg: *into_reg,
1575                        preg: reg.into(),
1576                    };
1577                    self.reg_args.push(arg);
1578                }
1579                &ABIArgSlot::Stack {
1580                    offset,
1581                    ty,
1582                    extension,
1583                    ..
1584                } => {
1585                    // However, we have to respect the extension mode for stack
1586                    // slots, or else we grab the wrong bytes on big-endian.
1587                    let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1588                    let ty =
1589                        if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
1590                            M::word_type()
1591                        } else {
1592                            ty
1593                        };
1594                    insts.push(M::gen_load_stack(
1595                        StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1596                        *into_reg,
1597                        ty,
1598                    ));
1599                }
1600            }
1601        };
1602
1603        match &sigs.args(self.sig)[idx] {
1604            &ABIArg::Slots { ref slots, .. } => {
1605                assert_eq!(into_regs.len(), slots.len());
1606                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
1607                    copy_arg_slot_to_reg(&slot, &into_reg);
1608                }
1609            }
1610            &ABIArg::StructArg { offset, .. } => {
1611                let into_reg = into_regs.only_reg().unwrap();
1612                // Buffer address is implicitly defined by the ABI.
1613                insts.push(M::gen_get_stack_addr(
1614                    StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1615                    into_reg,
1616                ));
1617            }
1618            &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
1619                let into_reg = into_regs.only_reg().unwrap();
1620                // We need to dereference the pointer.
1621                let base = match &pointer {
1622                    &ABIArgSlot::Reg { reg, ty, .. } => {
1623                        let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
1624                        self.reg_args.push(ArgPair {
1625                            vreg: Writable::from_reg(tmp),
1626                            preg: reg.into(),
1627                        });
1628                        tmp
1629                    }
1630                    &ABIArgSlot::Stack { offset, ty, .. } => {
1631                        let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
1632                            .only_reg()
1633                            .unwrap();
1634                        insts.push(M::gen_load_stack(
1635                            StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1636                            addr_reg,
1637                            ty,
1638                        ));
1639                        addr_reg.to_reg()
1640                    }
1641                };
1642                insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
1643            }
1644        }
1645        insts
1646    }
1647
1648    /// Generate an instruction which copies a source register to a return value slot.
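    ///
    /// A sketch of the extension behavior (illustrative): returning an `i8`
    /// with `Sext` on a 64-bit machine first sign-extends the value into a
    /// fresh vreg; that extended, word-sized value is what gets constrained
    /// to the return register (or stored to the return area) below.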
1649    pub fn gen_copy_regs_to_retval(
1650        &self,
1651        sigs: &SigSet,
1652        idx: usize,
1653        from_regs: ValueRegs<Reg>,
1654        vregs: &mut VRegAllocator<M::I>,
1655    ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>) {
1656        let mut reg_pairs = smallvec![];
1657        let mut ret = smallvec![];
1658        let word_bits = M::word_bits() as u8;
1659        match &sigs.rets(self.sig)[idx] {
1660            &ABIArg::Slots { ref slots, .. } => {
1661                assert_eq!(from_regs.len(), slots.len());
1662                for (slot, &from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1663                    match slot {
1664                        &ABIArgSlot::Reg {
1665                            reg, ty, extension, ..
1666                        } => {
1667                            let from_bits = ty_bits(ty) as u8;
1668                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1669                            let vreg = match (ext, from_bits) {
1670                                (ir::ArgumentExtension::Uext, n)
1671                                | (ir::ArgumentExtension::Sext, n)
1672                                    if n < word_bits =>
1673                                {
1674                                    let signed = ext == ir::ArgumentExtension::Sext;
1675                                    let dst =
1676                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
1677                                            .only_reg()
1678                                            .unwrap();
1679                                    ret.push(M::gen_extend(
1680                                        dst, from_reg, signed, from_bits,
1681                                        /* to_bits = */ word_bits,
1682                                    ));
1683                                    dst.to_reg()
1684                                }
1685                                _ => {
1686                                    // No move needed, regalloc2 will emit it using the constraint
1687                                    // added by the RetPair.
1688                                    from_reg
1689                                }
1690                            };
1691                            reg_pairs.push(RetPair {
1692                                vreg,
1693                                preg: Reg::from(reg),
1694                            });
1695                        }
1696                        &ABIArgSlot::Stack {
1697                            offset,
1698                            ty,
1699                            extension,
1700                            ..
1701                        } => {
1702                            let mut ty = ty;
1703                            let from_bits = ty_bits(ty) as u8;
1704                            // A machine ABI implementation should ensure that stack frames
1705                            // have "reasonable" size. All current ABIs for machinst
1706                            // backends (aarch64 and x64) enforce a 128MB limit.
1707                            let off = i32::try_from(offset).expect(
1708                                "Argument stack offset greater than 2GB; should hit impl limit first",
1709                            );
1710                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1711                            // Trash the from_reg; it should be its last use.
1712                            match (ext, from_bits) {
1713                                (ir::ArgumentExtension::Uext, n)
1714                                | (ir::ArgumentExtension::Sext, n)
1715                                    if n < word_bits =>
1716                                {
1717                                    assert_eq!(M::word_reg_class(), from_reg.class());
1718                                    let signed = ext == ir::ArgumentExtension::Sext;
1719                                    let dst =
1720                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
1721                                            .only_reg()
1722                                            .unwrap();
1723                                    ret.push(M::gen_extend(
1724                                        dst, from_reg, signed, from_bits,
1725                                        /* to_bits = */ word_bits,
1726                                    ));
1727                                    // Store the extended version.
1728                                    ty = M::word_type();
1729                                }
1730                                _ => {}
1731                            };
1732                            ret.push(M::gen_store_base_offset(
1733                                self.ret_area_ptr.unwrap(),
1734                                off,
1735                                from_reg,
1736                                ty,
1737                            ));
1738                        }
1739                    }
1740                }
1741            }
1742            ABIArg::StructArg { .. } => {
1743                panic!("StructArg in return position is unsupported");
1744            }
1745            ABIArg::ImplicitPtrArg { .. } => {
1746                panic!("ImplicitPtrArg in return position is unsupported");
1747            }
1748        }
1749        (reg_pairs, ret)
1750    }
1751
1752    /// Generate any setup instruction needed to save values to the
1753    /// return-value area. This is usually used when there are multiple return
1754    /// values or an otherwise large return value that must be passed on the
1755    /// stack; typically the ABI specifies an extra hidden argument that is a
1756    /// pointer to that memory.
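    ///
    /// For example (a hedged sketch): a function returning four `i64` values
    /// on a machine with two return registers receives a hidden pointer
    /// argument; this setup copies that pointer into `ret_area_ptr` so that
    /// `gen_copy_regs_to_retval` has a base register for its stores.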
1757    pub fn gen_retval_area_setup(
1758        &mut self,
1759        sigs: &SigSet,
1760        vregs: &mut VRegAllocator<M::I>,
1761    ) -> Option<M::I> {
1762        if let Some(i) = sigs[self.sig].stack_ret_arg {
1763            let ret_area_ptr = Writable::from_reg(self.ret_area_ptr.unwrap());
1764            let insts =
1765                self.gen_copy_arg_to_regs(sigs, i.into(), ValueRegs::one(ret_area_ptr), vregs);
1766            insts.into_iter().next().map(|inst| {
1767                trace!(
1768                    "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
1769                    inst,
1770                    ret_area_ptr.to_reg()
1771                );
1772                inst
1773            })
1774        } else {
1775            trace!("gen_retval_area_setup: not needed");
1776            None
1777        }
1778    }
1779
1780    /// Generate a return instruction.
1781    pub fn gen_rets(&self, rets: Vec<RetPair>) -> M::I {
1782        M::gen_rets(rets)
1783    }
1784
1785    /// Set up argument values `args` for a call with signature `sig`.
1786    /// This will return a series of instructions to be emitted to set
1787    /// up all arguments, as well as a `CallArgList` list representing
1788    /// the arguments passed in registers.  The latter need to be added
1789    /// as constraints to the actual call instruction.
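    ///
    /// A sketch of the order of operations below:
    ///
    /// ```plain
    ///   1. copy `StructArg` buffers into their stack slots (the memcpy may
    ///      clobber argument registers, so this must happen first);
    ///   2. extend and place all other arguments in registers/stack slots;
    ///   3. materialize the hidden return-area pointer, if the callee has one.
    /// ```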
1790    pub fn gen_call_args(
1791        &self,
1792        sigs: &SigSet,
1793        sig: Sig,
1794        args: &[ValueRegs<Reg>],
1795        is_tail_call: bool,
1796        flags: &settings::Flags,
1797        vregs: &mut VRegAllocator<M::I>,
1798    ) -> (CallArgList, SmallInstVec<M::I>) {
1799        let mut uses: CallArgList = smallvec![];
1800        let mut insts = smallvec![];
1801
1802        assert_eq!(args.len(), sigs.num_args(sig));
1803
1804        let call_conv = sigs[sig].call_conv;
1805        let stack_arg_space = sigs[sig].sized_stack_arg_space;
1806        let stack_arg = |offset| {
1807            if is_tail_call {
1808                StackAMode::IncomingArg(offset, stack_arg_space)
1809            } else {
1810                StackAMode::OutgoingArg(offset)
1811            }
1812        };
1813
1814        let word_ty = M::word_type();
1815        let word_rc = M::word_reg_class();
1816        let word_bits = M::word_bits() as usize;
1817
1818        if is_tail_call {
1819            debug_assert_eq!(
1820                self.call_conv,
1821                isa::CallConv::Tail,
1822                "Can only do `return_call`s from within a `tail` calling convention function"
1823            );
1824        }
1825
1826        // Helper to process a single argument slot (register or stack slot).
1827        // This will either add the register to the `uses` list or write the
1828        // value to the stack slot in the outgoing argument area (or for tail
1829        // calls, the incoming argument area).
1830        let mut process_arg_slot = |insts: &mut SmallInstVec<M::I>, slot, vreg, ty| {
1831            match &slot {
1832                &ABIArgSlot::Reg { reg, .. } => {
1833                    uses.push(CallArgPair {
1834                        vreg,
1835                        preg: reg.into(),
1836                    });
1837                }
1838                &ABIArgSlot::Stack { offset, .. } => {
1839                    insts.push(M::gen_store_stack(stack_arg(offset), vreg, ty));
1840                }
1841            };
1842        };
1843
1844        // First pass: Handle `StructArg` arguments.  These need to be copied
1845        // into their associated stack buffers.  This should happen before any
1846        // of the other arguments are processed, as the `memcpy` call might
1847        // clobber registers used by other arguments.
1848        for (idx, from_regs) in args.iter().enumerate() {
1849            match &sigs.args(sig)[idx] {
1850                &ABIArg::Slots { .. } | &ABIArg::ImplicitPtrArg { .. } => {}
1851                &ABIArg::StructArg { offset, size, .. } => {
1852                    let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1853                    insts.push(M::gen_get_stack_addr(
1854                        stack_arg(offset),
1855                        Writable::from_reg(tmp),
1856                    ));
1857                    insts.extend(M::gen_memcpy(
1858                        isa::CallConv::for_libcall(flags, call_conv),
1859                        tmp,
1860                        from_regs.only_reg().unwrap(),
1861                        size as usize,
1862                        |ty| {
1863                            Writable::from_reg(
1864                                vregs.alloc_with_deferred_error(ty).only_reg().unwrap(),
1865                            )
1866                        },
1867                    ));
1868                }
1869            }
1870        }
1871
1872        // Second pass: Handle everything except `StructArg` arguments.
1873        for (idx, from_regs) in args.iter().enumerate() {
1874            match sigs.args(sig)[idx] {
1875                ABIArg::Slots { ref slots, .. } => {
1876                    assert_eq!(from_regs.len(), slots.len());
1877                    for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1878                        // Load argument slot value from `from_reg`, and perform any zero-
1879                        // or sign-extension that is required by the ABI.
1880                        let (ty, extension) = match *slot {
1881                            ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
1882                            ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
1883                        };
1884                        let ext = M::get_ext_mode(call_conv, extension);
1885                        let (vreg, ty) = if ext != ir::ArgumentExtension::None
1886                            && ty_bits(ty) < word_bits
1887                        {
1888                            assert_eq!(word_rc, from_reg.class());
1889                            let signed = match ext {
1890                                ir::ArgumentExtension::Uext => false,
1891                                ir::ArgumentExtension::Sext => true,
1892                                _ => unreachable!(),
1893                            };
1894                            let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1895                            insts.push(M::gen_extend(
1896                                Writable::from_reg(tmp),
1897                                *from_reg,
1898                                signed,
1899                                ty_bits(ty) as u8,
1900                                word_bits as u8,
1901                            ));
1902                            (tmp, word_ty)
1903                        } else {
1904                            (*from_reg, ty)
1905                        };
1906                        process_arg_slot(&mut insts, *slot, vreg, ty);
1907                    }
1908                }
1909                ABIArg::ImplicitPtrArg {
1910                    offset,
1911                    pointer,
1912                    ty,
1913                    ..
1914                } => {
1915                    let vreg = from_regs.only_reg().unwrap();
1916                    let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1917                    insts.push(M::gen_get_stack_addr(
1918                        stack_arg(offset),
1919                        Writable::from_reg(tmp),
1920                    ));
1921                    insts.push(M::gen_store_base_offset(tmp, 0, vreg, ty));
1922                    process_arg_slot(&mut insts, pointer, tmp, word_ty);
1923                }
1924                ABIArg::StructArg { .. } => {}
1925            }
1926        }
1927
1928        // Finally, set the stack-return pointer to the return argument area.
1929        // For tail calls, this means forwarding the incoming stack-return pointer.
1930        if let Some(ret_arg) = sigs.get_ret_arg(sig) {
1931            let ret_area = if is_tail_call {
1932                self.ret_area_ptr.expect(
1933                    "if the tail callee has a return pointer, then the tail caller must as well",
1934                )
1935            } else {
1936                let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1937                let amode = StackAMode::OutgoingArg(stack_arg_space.into());
1938                insts.push(M::gen_get_stack_addr(amode, Writable::from_reg(tmp)));
1939                tmp
1940            };
1941            match ret_arg {
1942                // The return pointer must occupy a single slot.
1943                ABIArg::Slots { slots, .. } => {
1944                    assert_eq!(slots.len(), 1);
1945                    process_arg_slot(&mut insts, slots[0], ret_area, word_ty);
1946                }
1947                _ => unreachable!(),
1948            }
1949        }
1950
1951        (uses, insts)
1952    }
1953
1954    /// Set up return values `outputs` for a call with signature `sig`.
1955    /// This does not emit (or return) any instructions, but returns a
1956    /// `CallRetList` representing the return value constraints.  This
1957    /// needs to be added to the actual call instruction.
1958    ///
1959    /// If `try_call_payloads` is present, it is expected to hold
1960    /// exception payload registers for try_call instructions.  These
1961    /// will be added as needed to the `CallRetList` as well.
1962    pub fn gen_call_rets(
1963        &self,
1964        sigs: &SigSet,
1965        sig: Sig,
1966        outputs: &[ValueRegs<Reg>],
1967        try_call_payloads: Option<&[Writable<Reg>]>,
1968        vregs: &mut VRegAllocator<M::I>,
1969    ) -> CallRetList {
1970        let callee_conv = sigs[sig].call_conv;
1971        let stack_arg_space = sigs[sig].sized_stack_arg_space;
1972
1973        let word_ty = M::word_type();
1974        let word_bits = M::word_bits() as usize;
1975
1976        let mut defs: CallRetList = smallvec![];
1977        let mut outputs = outputs.into_iter();
1978        let num_rets = sigs.num_rets(sig);
1979        for idx in 0..num_rets {
1980            let ret = sigs.rets(sig)[idx].clone();
1981            match ret {
1982                ABIArg::Slots {
1983                    ref slots, purpose, ..
1984                } => {
1985                    // We do not use the returned copy of the return buffer pointer,
1986                    // so skip any StructReturn returns that may be present.
1987                    if purpose == ArgumentPurpose::StructReturn {
1988                        continue;
1989                    }
1990                    let retval_regs = outputs.next().unwrap();
1991                    assert_eq!(retval_regs.len(), slots.len());
1992                    for (slot, retval_reg) in slots.iter().zip(retval_regs.regs().iter()) {
1993                        // We do not perform any extension because we're copying out, not in,
1994                        // and we ignore high bits in our own registers by convention.  However,
1995                        // we still need to use the proper extended type to access stack slots
1996                        // (this is critical on big-endian systems).
1997                        let (ty, extension) = match *slot {
1998                            ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
1999                            ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
2000                        };
2001                        let ext = M::get_ext_mode(callee_conv, extension);
2002                        let ty = if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
2003                            word_ty
2004                        } else {
2005                            ty
2006                        };
2007
2008                        match slot {
2009                            &ABIArgSlot::Reg { reg, .. } => {
2010                                defs.push(CallRetPair {
2011                                    vreg: Writable::from_reg(*retval_reg),
2012                                    location: RetLocation::Reg(reg.into(), ty),
2013                                });
2014                            }
2015                            &ABIArgSlot::Stack { offset, .. } => {
2016                                let amode =
2017                                    StackAMode::OutgoingArg(offset + i64::from(stack_arg_space));
2018                                defs.push(CallRetPair {
2019                                    vreg: Writable::from_reg(*retval_reg),
2020                                    location: RetLocation::Stack(amode, ty),
2021                                });
2022                            }
2023                        }
2024                    }
2025                }
2026                ABIArg::StructArg { .. } => {
2027                    panic!("StructArg not supported in return position");
2028                }
2029                ABIArg::ImplicitPtrArg { .. } => {
2030                    panic!("ImplicitPtrArg not supported in return position");
2031                }
2032            }
2033        }
2034        assert!(outputs.next().is_none());
2035
2036        if let Some(try_call_payloads) = try_call_payloads {
2037            // Let `M` say where the payload values are going to end up and then
2038            // double-check it's the same size as the calling convention's
2039            // reported number of exception types.
2040            let pregs = M::exception_payload_regs(callee_conv);
2041            assert_eq!(
2042                callee_conv.exception_payload_types(M::word_type()).len(),
2043                pregs.len()
2044            );
2045
2046            // We need to update `defs` to contain the exception
2047            // payload regs as well. We have two sources of info that
2048            // we join:
2049            //
2050            // - The machine-specific ABI implementation `M`, which
2051            //   tells us the particular registers that payload values
2052            //   must be in
2053            // - The passed-in lowering context, which gives us the
2054            //   vregs we must define.
2055            //
2056            // Note that payload values may need to end up in the same
2057            // physical registers as ordinary return values; this is
2058            // not a conflict, because we either get one or the
2059            // other. For regalloc's purposes, we define both starting
2060            // here at the callsite, but we can share one def in the
2061            // `defs` list and alias one vreg to another. Thus we
2062            // handle the two cases below for each payload register:
2063            // overlaps a return value (and we alias to it) or not
2064            // (and we add a def).
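            //
            // For example (illustrative): if payload register 0 is also the
            // first return register and `defs` already contains a def for
            // it, we alias the payload vreg to that existing def's vreg
            // rather than adding a second def of the same preg.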
2065            for (i, &preg) in pregs.iter().enumerate() {
2066                let vreg = try_call_payloads[i];
2067                if let Some(existing) = defs.iter().find(|def| match def.location {
2068                    RetLocation::Reg(r, _) => r == preg,
2069                    _ => false,
2070                }) {
2071                    vregs.set_vreg_alias(vreg.to_reg(), existing.vreg.to_reg());
2072                } else {
2073                    defs.push(CallRetPair {
2074                        vreg,
2075                        location: RetLocation::Reg(preg, word_ty),
2076                    });
2077                }
2078            }
2079        }
2080
2081        defs
2082    }
2083
2084    /// Populate a `CallInfo` for a call with signature `sig`.
2085    ///
2086    /// `dest` is the target-specific call destination value
2087    /// `uses` is the `CallArgList` describing argument constraints
2088    /// `defs` is the `CallRetList` describing return constraints
2089    /// `try_call_info` describes exception targets for try_call instructions
2090    /// `patchable` describes whether this callsite should emit metadata
2091    /// for patching to enable/disable it.
2092    ///
2093    /// The clobber list is computed here from the above data.
2094    pub fn gen_call_info<T>(
2095        &self,
2096        sigs: &SigSet,
2097        sig: Sig,
2098        dest: T,
2099        uses: CallArgList,
2100        defs: CallRetList,
2101        try_call_info: Option<TryCallInfo>,
2102        patchable: bool,
2103    ) -> CallInfo<T> {
2104        let caller_conv = self.call_conv;
2105        let callee_conv = sigs[sig].call_conv;
2106        let stack_arg_space = sigs[sig].sized_stack_arg_space;
2107
2108        let clobbers = {
2109            // Get clobbers: all caller-saves. These may include return value
2110            // regs, which we will remove from the clobber set below.
2111            let mut clobbers =
2112                <M>::get_regs_clobbered_by_call(callee_conv, try_call_info.is_some());
2113
2114            // Remove retval regs from clobbers.
2115            for def in &defs {
2116                if let RetLocation::Reg(preg, _) = def.location {
2117                    clobbers.remove(PReg::from(preg.to_real_reg().unwrap()));
2118                }
2119            }
2120
2121            clobbers
2122        };
2123
2124        // Any adjustment to SP to account for required outgoing arguments/stack return values must
2125        // be done inside of the call pseudo-op, to ensure that SP is always in a consistent
2126        // state for all other instructions. For example, if a tail-call abi function is called
2127        // here, the reclamation of the outgoing argument area must be done inside of the call
2128        // pseudo-op's emission to ensure that SP is consistent at all other points in the lowered
2129        // function. (Except the prologue and epilogue, but those are fairly special parts of the
2130        // function that establish the SP invariants that are relied on elsewhere and are generated
2131        // after the register allocator has run and thus cannot have register allocator-inserted
2132        // references to SP offsets.)
2133
2134        let callee_pop_size = if callee_conv == isa::CallConv::Tail {
2135            // The tail calling convention has callees pop stack arguments.
2136            stack_arg_space
2137        } else {
2138            0
2139        };
2140
2141        CallInfo {
2142            dest,
2143            uses,
2144            defs,
2145            clobbers,
2146            callee_conv,
2147            caller_conv,
2148            callee_pop_size,
2149            try_call_info,
2150            patchable,
2151        }
2152    }
2153
2154    /// Get the raw offset of a sized stackslot in the slot region.
2155    pub fn sized_stackslot_offset(&self, slot: StackSlot) -> u32 {
2156        self.sized_stackslots[slot]
2157    }
2158
2159    /// Produce an instruction that computes a sized stackslot address.
2160    pub fn sized_stackslot_addr(
2161        &self,
2162        slot: StackSlot,
2163        offset: u32,
2164        into_reg: Writable<Reg>,
2165    ) -> M::I {
2166        // Offset from beginning of stackslot area.
2167        let stack_off = self.sized_stackslots[slot] as i64;
2168        let sp_off: i64 = stack_off + (offset as i64);
2169        M::gen_get_stack_addr(StackAMode::Slot(sp_off), into_reg)
2170    }
2171
2172    /// Produce an instruction that computes a dynamic stackslot address.
2173    pub fn dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I {
2174        let stack_off = self.dynamic_stackslots[slot] as i64;
2175        M::gen_get_stack_addr(StackAMode::Slot(stack_off), into_reg)
2176    }
2177
2178    /// Get an `args` pseudo-inst, if any, that should appear at the
2179    /// very top of the function body prior to regalloc.
2180    pub fn take_args(&mut self) -> Option<M::I> {
2181        if !self.reg_args.is_empty() {
2182            // Very first instruction is an `args` pseudo-inst that
2183            // establishes live-ranges for in-register arguments and
2184            // constrains them at the start of the function to the
2185            // locations defined by the ABI.
2186            Some(M::gen_args(std::mem::take(&mut self.reg_args)))
2187        } else {
2188            None
2189        }
2190    }
2191}
2192
2193/// ### Post-Regalloc Functions
2194///
2195/// These methods of `Callee` may only be called after
2196/// regalloc.
2197impl<M: ABIMachineSpec> Callee<M> {
2198    /// Compute the final frame layout, post-regalloc.
2199    ///
2200    /// This must be called before gen_prologue or gen_epilogue.
2201    pub fn compute_frame_layout(
2202        &mut self,
2203        sigs: &SigSet,
2204        spillslots: usize,
2205        clobbered: Vec<Writable<RealReg>>,
2206        function_calls: FunctionCalls,
2207    ) {
2208        let bytes = M::word_bytes();
2209        let total_stacksize = self.stackslots_size + bytes * spillslots as u32;
2210        let mask = M::stack_align(self.call_conv) - 1;
2211        let total_stacksize = (total_stacksize + mask) & !mask; // Round up to the ABI's stack alignment.
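        // For example (illustrative): 40 bytes of stackslots plus two 8-byte
        // spill slots gives 56 bytes; with 16-byte alignment (mask == 15),
        // (56 + 15) & !15 == 64.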
2212        self.frame_layout = Some(M::compute_frame_layout(
2213            self.call_conv,
2214            &self.flags,
2215            self.signature(),
2216            &clobbered,
2217            function_calls,
2218            self.stack_args_size(sigs),
2219            self.tail_args_size,
2220            self.stackslots_size,
2221            total_stacksize,
2222            self.outgoing_args_size,
2223        ));
2224    }
2225
2226    /// Generate a prologue, post-regalloc.
2227    ///
2228    /// This should include any stack frame or other setup necessary to use the
2229    /// other methods (`load_arg`, `store_retval`, and spillslot accesses.)
2230    pub fn gen_prologue(&self) -> SmallInstVec<M::I> {
2231        let frame_layout = self.frame_layout();
2232        let mut insts = smallvec![];
2233
2234        // Set up frame.
2235        insts.extend(M::gen_prologue_frame_setup(
2236            self.call_conv,
2237            &self.flags,
2238            &self.isa_flags,
2239            &frame_layout,
2240        ));
2241
2242        // The stack limit check needs to cover all the stack adjustments we
2243        // might make, up to the next stack limit check in any function we
2244        // call. Since this happens after frame setup, the current function's
2245        // setup area needs to be accounted for in the caller's stack limit
2246        // check, but we need to account for any setup area that our callees
2247        // might need. Note that s390x may also use the outgoing args area for
2248        // backtrace support even in leaf functions, so that should be accounted
2249        // for unconditionally.
2250        let total_stacksize = (frame_layout.tail_args_size - frame_layout.incoming_args_size)
2251            + frame_layout.clobber_size
2252            + frame_layout.fixed_frame_storage_size
2253            + frame_layout.outgoing_args_size
2254            + if frame_layout.function_calls == FunctionCalls::None {
2255                0
2256            } else {
2257                frame_layout.setup_area_size
2258            };
2259
2260        // Functions that use no stack and make no calls don't need a stack
2261        // check even if one is specified; otherwise always insert the check.
2262        if total_stacksize > 0 || frame_layout.function_calls != FunctionCalls::None {
2263            if let Some((reg, stack_limit_load)) = &self.stack_limit {
2264                insts.extend(stack_limit_load.clone());
2265                self.insert_stack_check(*reg, total_stacksize, &mut insts);
2266            }
2267
2268            if self.flags.enable_probestack() {
2269                let guard_size = 1 << self.flags.probestack_size_log2();
2270                match self.flags.probestack_strategy() {
2271                    ProbestackStrategy::Inline => M::gen_inline_probestack(
2272                        &mut insts,
2273                        self.call_conv,
2274                        total_stacksize,
2275                        guard_size,
2276                    ),
2277                    ProbestackStrategy::Outline => {
2278                        if total_stacksize >= guard_size {
2279                            M::gen_probestack(&mut insts, total_stacksize);
2280                        }
2281                    }
2282                }
2283            }
2284        }
2285
2286        // Save clobbered registers.
2287        insts.extend(M::gen_clobber_save(
2288            self.call_conv,
2289            &self.flags,
2290            &frame_layout,
2291        ));
2292
2293        insts
2294    }
2295
2296    /// Generate an epilogue, post-regalloc.
2297    ///
2298    /// Note that this must generate the actual return instruction (rather than
2299    /// emitting this in the lowering logic), because the epilogue code comes
2300    /// before the return and the two are likely closely related.
2301    pub fn gen_epilogue(&self) -> SmallInstVec<M::I> {
2302        let frame_layout = self.frame_layout();
2303        let mut insts = smallvec![];
2304
2305        // Restore clobbered registers.
2306        insts.extend(M::gen_clobber_restore(
2307            self.call_conv,
2308            &self.flags,
2309            &frame_layout,
2310        ));
2311
2312        // Tear down frame.
2313        insts.extend(M::gen_epilogue_frame_restore(
2314            self.call_conv,
2315            &self.flags,
2316            &self.isa_flags,
2317            &frame_layout,
2318        ));
2319
2320        // And return.
2321        insts.extend(M::gen_return(
2322            self.call_conv,
2323            &self.isa_flags,
2324            &frame_layout,
2325        ));
2326
2327        trace!("Epilogue: {:?}", insts);
2328        insts
2329    }
2330
2331    /// Return a reference to the computed frame layout information. This
2332    /// function will panic if it's called before [`Self::compute_frame_layout`].
2333    pub fn frame_layout(&self) -> &FrameLayout {
2334        self.frame_layout
2335            .as_ref()
2336            .expect("frame layout not computed before prologue generation")
2337    }
2338
2339    /// Returns the offset from SP to FP for the given function, after
2340    /// the prologue has set up the frame. This comprises the spill
2341    /// slots and stack-storage slots as well as storage for clobbered
2342    /// callee-save registers and outgoing arguments at callsites
2343    /// (space for which is reserved during frame setup).
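    ///
    /// For example (illustrative): with 32 bytes of clobber-saves, 64 bytes
    /// of fixed frame storage, and 16 bytes of outgoing-args space, FP sits
    /// at SP + 112.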
2344    pub fn sp_to_fp_offset(&self) -> u32 {
2345        let frame_layout = self.frame_layout();
2346        frame_layout.clobber_size
2347            + frame_layout.fixed_frame_storage_size
2348            + frame_layout.outgoing_args_size
2349    }
2350
2351    /// Returns offset from the slot base in the current frame to the caller's SP.
2352    pub fn slot_base_to_caller_sp_offset(&self) -> u32 {
2353        // Note: this looks very similar to `frame_size()` above, but
2354        // it differs in both endpoints: it measures from the bottom
2355        // of stackslots, excluding outgoing args; and it includes the
2356        // setup area (FP/LR) size and any extra tail-args space.
2357        let frame_layout = self.frame_layout();
2358        frame_layout.clobber_size
2359            + frame_layout.fixed_frame_storage_size
2360            + frame_layout.setup_area_size
2361            + (frame_layout.tail_args_size - frame_layout.incoming_args_size)
2362    }
2363
2364    /// Returns the size of arguments expected on the stack.
2365    pub fn stack_args_size(&self, sigs: &SigSet) -> u32 {
2366        sigs[self.sig].sized_stack_arg_space
2367    }
2368
2369    /// Get the spill-slot size.
2370    pub fn get_spillslot_size(&self, rc: RegClass) -> u32 {
2371        let max = if self.dynamic_type_sizes.is_empty() {
2372            16
2373        } else {
2374            *self
2375                .dynamic_type_sizes
2376                .iter()
2377                .max_by(|x, y| x.1.cmp(&y.1))
2378                .map(|(_k, v)| v)
2379                .unwrap()
2380        };
2381        M::get_number_of_spillslots_for_value(rc, max, &self.isa_flags)
2382    }
2383
2384    /// Get the spill slot offset relative to the fixed allocation area start.
2385    pub fn get_spillslot_offset(&self, slot: SpillSlot) -> i64 {
2386        self.frame_layout().spillslot_offset(slot)
2387    }
2388
2389    /// Generate a spill.
2390    pub fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I {
2391        let ty = M::I::canonical_type_for_rc(from_reg.class());
2392        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
2393
2394        let sp_off = self.get_spillslot_offset(to_slot);
2395        trace!("gen_spill: {from_reg:?} into slot {to_slot:?} at offset {sp_off}");
2396
2397        let from = StackAMode::Slot(sp_off);
2398        <M>::gen_store_stack(from, Reg::from(from_reg), ty)
2399    }
2400
2401    /// Generate a reload (fill).
2402    pub fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I {
2403        let ty = M::I::canonical_type_for_rc(to_reg.to_reg().class());
2404        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
2405
2406        let sp_off = self.get_spillslot_offset(from_slot);
2407        trace!("gen_reload: {to_reg:?} from slot {from_slot:?} at offset {sp_off}");
2408
2409        let from = StackAMode::Slot(sp_off);
2410        <M>::gen_load_stack(from, to_reg.map(Reg::from), ty)
2411    }
2412
2413    /// Provide metadata to be emitted alongside machine code.
2414    ///
2415    /// This metadata describes the frame layout sufficiently to find
2416    /// stack slots, so that runtimes and unwinders can observe state
2417    /// set up by compiled code in stackslots allocated for that
2418    /// purpose.
2419    pub fn frame_slot_metadata(&self) -> MachBufferFrameLayout {
2420        let frame_to_fp_offset = self.sp_to_fp_offset();
2421        let mut stackslots = SecondaryMap::with_capacity(self.sized_stackslots.len());
2422        let storage_area_base = self.frame_layout().outgoing_args_size;
2423        for (slot, storage_area_offset) in &self.sized_stackslots {
2424            stackslots[slot] = MachBufferStackSlot {
2425                offset: storage_area_base.checked_add(*storage_area_offset).unwrap(),
2426                key: self.sized_stackslot_keys[slot],
2427            };
2428        }
2429        MachBufferFrameLayout {
2430            frame_to_fp_offset,
2431            stackslots,
2432        }
2433    }
2434}
2435
2436/// An input argument to a call instruction: the vreg that is used,
2437/// and the preg it is constrained to (per the ABI).
2438#[derive(Clone, Debug)]
2439pub struct CallArgPair {
2440    /// The virtual register to use for the argument.
2441    pub vreg: Reg,
2442    /// The real register into which the arg goes.
2443    pub preg: Reg,
2444}
2445
2446/// An output return value from a call instruction: the vreg that is
2447/// defined, and the preg or stack location it is constrained to (per
2448/// the ABI).
2449#[derive(Clone, Debug)]
2450pub struct CallRetPair {
2451    /// The virtual register to define from this return value.
2452    pub vreg: Writable<Reg>,
2453    /// The real register from which the return value is read.
2454    pub location: RetLocation,
2455}
2456
2457/// A location to load a return-value from after a call completes.
2458#[derive(Clone, Debug, PartialEq, Eq)]
2459pub enum RetLocation {
2460    /// A physical register.
2461    Reg(Reg, Type),
2462    /// A stack location, identified by a `StackAMode`.
2463    Stack(StackAMode, Type),
2464}
2465
2466pub type CallArgList = SmallVec<[CallArgPair; 8]>;
2467pub type CallRetList = SmallVec<[CallRetPair; 8]>;
2468
2469impl<T> CallInfo<T> {
2470    /// Emit loads for any stack-carried return values using the call
2471    /// info and allocations.
2472    pub fn emit_retval_loads<
2473        M: ABIMachineSpec,
2474        EmitFn: FnMut(M::I),
2475        IslandFn: Fn(u32) -> Option<M::I>,
2476    >(
2477        &self,
2478        stackslots_size: u32,
2479        mut emit: EmitFn,
2480        emit_island: IslandFn,
2481    ) {
2482        // Count stack-ret locations and emit an island to account for
2483        // this space usage.
2484        let mut space_needed = 0;
2485        for CallRetPair { location, .. } in &self.defs {
2486            if let RetLocation::Stack(..) = location {
2487                // Assume up to ten instructions, semi-arbitrarily:
2488                // load from stack, store to spillslot, codegen of
2489                // large offsets on RISC ISAs.
2490                space_needed += 10 * M::I::worst_case_size();
2491            }
2492        }
2493        if space_needed > 0 {
2494            if let Some(island_inst) = emit_island(space_needed) {
2495                emit(island_inst);
2496            }
2497        }
2498
2499        let temp = M::retval_temp_reg(self.callee_conv);
2500        // The temporary must be noted as clobbered unless there are
2501        // no returns (in which case it isn't needed). The latter can
2502        // only be known statically when the ABI doesn't allow any
2503        // returns at all (e.g., the patchable-call ABI).
2504        debug_assert!(
2505            self.defs.is_empty()
2506                || M::get_regs_clobbered_by_call(self.callee_conv, self.try_call_info.is_some())
2507                    .contains(PReg::from(temp.to_reg().to_real_reg().unwrap()))
2508        );
2509
2510        for CallRetPair { vreg, location } in &self.defs {
2511            match location {
2512                RetLocation::Reg(preg, ..) => {
2513                    // The temporary must not also be an actual return
2514                    // value register.
2515                    debug_assert!(*preg != temp.to_reg());
2516                }
2517                RetLocation::Stack(amode, ty) => {
2518                    if let Some(spillslot) = vreg.to_reg().to_spillslot() {
2519                        // `temp` is an integer register of machine word
2520                        // width, but `ty` may be floating-point/vector,
2521                        // which (i) may not be loadable directly into an
2522                        // int reg, and (ii) may be wider than a machine
2523                        // word. For simplicity, and because there are not
2524                        // always easy choices for volatile float/vec regs
2525                        // (see e.g. x86-64, where fastcall clobbers only
2526                        // xmm0-xmm5, but tail uses xmm0-xmm7 for
2527                        // returns), we use the integer temp register in
2528                        // steps.
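                        // E.g. (illustrative): an `I8X16` return (16 bytes)
                        // on a 64-bit machine moves through `temp` in
                        // `parts == 2` word-sized chunks.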
2529                        let parts = ty.bytes().div_ceil(M::word_bytes());
2530                        let one_part_load_ty =
2531                            Type::int_with_byte_size(M::word_bytes().min(ty.bytes()) as u16)
2532                                .unwrap();
2533                        for part in 0..parts {
2534                            emit(M::gen_load_stack(
2535                                amode.offset_by(part * M::word_bytes()),
2536                                temp,
2537                                one_part_load_ty,
2538                            ));
2539                            emit(M::gen_store_stack(
2540                                StackAMode::Slot(
2541                                    i64::from(stackslots_size)
2542                                        + i64::from(M::word_bytes())
2543                                            * ((spillslot.index() as i64) + (part as i64)),
2544                                ),
2545                                temp.to_reg(),
2546                                M::word_type(),
2547                            ));
2548                        }
2549                    } else {
2550                        assert_ne!(*vreg, temp);
2551                        emit(M::gen_load_stack(*amode, *vreg, *ty));
2552                    }
2553                }
2554            }
2555        }
2556    }
2557}
2558
2559impl TryCallInfo {
2560    pub(crate) fn exception_handlers(
2561        &self,
2562        layout: &FrameLayout,
2563    ) -> impl Iterator<Item = MachExceptionHandler> {
2564        self.exception_handlers.iter().map(|handler| match handler {
2565            TryCallHandler::Tag(tag, label) => MachExceptionHandler::Tag(*tag, *label),
2566            TryCallHandler::Default(label) => MachExceptionHandler::Default(*label),
2567            TryCallHandler::Context(reg) => {
2568                let loc = if let Some(spillslot) = reg.to_spillslot() {
2569                    // The spillslot offset is relative to the "fixed
2570                    // storage area", which comes after outgoing args.
2571                    let offset = layout.spillslot_offset(spillslot) + i64::from(layout.outgoing_args_size);
2572                    ExceptionContextLoc::SPOffset(u32::try_from(offset).expect("SP offset cannot be negative or larger than 4GiB"))
2573                } else if let Some(realreg) = reg.to_real_reg() {
2574                    ExceptionContextLoc::GPR(realreg.hw_enc())
2575                } else {
2576                    panic!("Virtual register present in try-call handler clause after register allocation");
2577                };
2578                MachExceptionHandler::Context(loc)
2579            }
2580        })
2581    }
2582
2583    pub(crate) fn pretty_print_dests(&self) -> String {
2584        self.exception_handlers
2585            .iter()
2586            .map(|handler| match handler {
2587                TryCallHandler::Tag(tag, label) => format!("{tag:?}: {label:?}"),
2588                TryCallHandler::Default(label) => format!("default: {label:?}"),
2589                TryCallHandler::Context(loc) => format!("context {loc:?}"),
2590            })
2591            .collect::<Vec<_>>()
2592            .join(", ")
2593    }
2594
2595    pub(crate) fn collect_operands(&mut self, collector: &mut impl OperandVisitor) {
2596        for handler in &mut self.exception_handlers {
2597            match handler {
2598                TryCallHandler::Context(ctx) => {
2599                    collector.any_late_use(ctx);
2600                }
2601                TryCallHandler::Tag(_, _) | TryCallHandler::Default(_) => {}
2602            }
2603        }
2604    }
2605}
2606
2607#[cfg(test)]
2608mod tests {
2609    use super::SigData;
2610
2611    #[test]
2612    fn sig_data_size() {
2613        // The size of `SigData` is performance sensitive, so make sure
2614        // we don't regress it unintentionally.
2615        assert_eq!(std::mem::size_of::<SigData>(), 24);
2616    }
2617}