cranelift_codegen/machinst/abi.rs

//! Implementation of a vanilla ABI, shared between several machines. The
//! implementation here assumes that arguments will be passed in registers
//! first, then additional args on the stack; that the stack grows downward,
//! contains a standard frame (return address and frame pointer), and the
//! compiler is otherwise free to allocate space below that with its choice of
//! layout; and that the machine has some notion of caller- and callee-save
//! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
//! mold and thus both of these backends use this shared implementation.
//!
//! See the documentation in specific machine backends for the "instantiation"
//! of this generic ABI, i.e., which registers are caller/callee-save, arguments
//! and return values, and any other special requirements.
//!
//! For now the implementation here assumes a 64-bit machine, but we intend to
//! make this 32/64-bit-generic shortly.
//!
//! # Vanilla ABI
//!
//! First, arguments and return values are passed in registers up to a certain
//! fixed count, after which they overflow onto the stack. Multiple return
//! values either fit in registers, or are returned in a separate return-value
//! area on the stack, given by a hidden extra parameter.
//!
//! Note that the exact stack layout is up to us. We settled on the
//! below design based on several requirements. In particular, we need
//! to be able to generate instructions (or instruction sequences) to
//! access arguments, stack slots, and spill slots before we know how
//! many spill slots or clobber-saves there will be, because of our
//! pass structure. We also prefer positive offsets to negative
//! offsets because of an asymmetry in some machines' addressing modes
//! (e.g., on AArch64, positive offsets have a larger possible range
//! without a long-form sequence to synthesize an arbitrary
//! offset). We also need clobber-save registers to be "near" the
//! frame pointer: Windows unwind information requires it to be within
//! 240 bytes of RBP. Finally, it is not allowed to access memory
//! below the current SP value.
//!
//! We assume that a prologue first pushes the frame pointer (and
//! return address above that, if the machine does not do that in
//! hardware). We set FP to point to this two-word frame record. We
//! store all other frame slots below this two-word frame record, as
//! well as enough space for arguments to the largest possible
//! function call. The stack pointer then remains at this position
//! for the duration of the function, allowing us to address all
//! frame storage at positive offsets from SP.
//!
//! Note that if we ever support dynamic stack-space allocation (for
//! `alloca`), we will need a way to reference spill slots and stack
//! slots relative to a dynamic SP, because we will no longer be able
//! to know a static offset from SP to the slots at any particular
//! program point. Probably the best solution at that point will be to
//! revert to using the frame pointer as the reference for all slots,
//! to allow generating spill/reload and stackslot accesses before we
//! know how large the clobber-saves will be.
//!
//! # Stack Layout
//!
//! The stack looks like:
//!
//! ```plain
//!   (high address)
//!                              |          ...              |
//!                              | caller frames             |
//!                              |          ...              |
//!                              +===========================+
//!                              |          ...              |
//!                              | stack args                |
//! Canonical Frame Address -->  | (accessed via FP)         |
//!                              +---------------------------+
//! SP at function entry ----->  | return address            |
//!                              +---------------------------+
//! FP after prologue -------->  | FP (pushed by prologue)   |
//!                              +---------------------------+           -----
//!                              |          ...              |             |
//!                              | clobbered callee-saves    |             |
//! unwind-frame base -------->  | (pushed by prologue)      |             |
//!                              +---------------------------+   -----     |
//!                              |          ...              |     |       |
//!                              | spill slots               |     |       |
//!                              | (accessed via SP)         |   fixed   active
//!                              |          ...              |   frame    size
//!                              | stack slots               |  storage    |
//!                              | (accessed via SP)         |    size     |
//!                              | (alloc'd by prologue)     |     |       |
//!                              +---------------------------+   -----     |
//!                              | [alignment as needed]     |             |
//!                              |          ...              |             |
//!                              | args for largest call     |             |
//! SP ----------------------->  | (alloc'd by prologue)     |             |
//!                              +===========================+           -----
//!
//!   (low address)
//! ```
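//!
//! For example (schematic; exact offsets are backend-specific, using the
//! `FrameLayout` field names defined below), a sized stack slot at offset
//! `off` within the fixed-frame storage area is addressed relative to SP
//! while the body executes:
//!
//! ```plain
//! address = SP + outgoing_args_size + off
//! ```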
//!
//! # Multi-value Returns
//!
//! We support multi-value returns by using multiple return-value
//! registers. In some cases this is an extension of the base system
//! ABI. See each platform's `abi.rs` implementation for details.

use crate::entity::SecondaryMap;
use crate::ir::types::*;
use crate::ir::{ArgumentExtension, ArgumentPurpose, ExceptionTable, ExceptionTag, Signature};
use crate::isa::TargetIsa;
use crate::settings::ProbestackStrategy;
use crate::CodegenError;
use crate::{ir, isa};
use crate::{machinst::*, trace};
use alloc::boxed::Box;
use cranelift_entity::packed_option::PackedOption;
use regalloc2::{MachineEnv, PReg, PRegSet};
use rustc_hash::FxHashMap;
use smallvec::smallvec;
use std::collections::HashMap;
use std::marker::PhantomData;
use std::mem;

/// A small vector of instructions (with some reasonable size); appropriate for
/// a small fixed sequence implementing one operation.
pub type SmallInstVec<I> = SmallVec<[I; 4]>;

/// A type used by backends to track argument-binding info in the "args"
/// pseudoinst. The pseudoinst holds a vec of `ArgPair` structs.
#[derive(Clone, Debug)]
pub struct ArgPair {
    /// The vreg that is defined by this args pseudoinst.
    pub vreg: Writable<Reg>,
    /// The preg that the arg arrives in; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A type used by backends to track return register binding info in the "ret"
/// pseudoinst. The pseudoinst holds a vec of `RetPair` structs.
#[derive(Clone, Debug)]
pub struct RetPair {
    /// The vreg that is returned by this pseudoinst.
    pub vreg: Reg,
    /// The preg that the value is returned through; this constrains the
    /// vreg's placement at the pseudoinst.
    pub preg: Reg,
}

/// A location for (part of) an argument or return value. These "storage slots"
/// are specified for each register-sized part of an argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ABIArgSlot {
    /// In a real register.
    Reg {
        /// Register that holds this arg.
        reg: RealReg,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
    /// Arguments only: on stack, at given offset from SP at entry.
    Stack {
        /// Offset of this arg relative to the base of stack args.
        offset: i64,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
}

impl ABIArgSlot {
    /// The type of the value that will be stored in this slot.
    pub fn get_type(&self) -> ir::Type {
        match self {
            ABIArgSlot::Reg { ty, .. } => *ty,
            ABIArgSlot::Stack { ty, .. } => *ty,
        }
    }
}

/// A vector of `ABIArgSlot`s. Inline capacity for one element because basically
/// 100% of values use one slot. Only `i128`s need multiple slots, and they are
/// super rare (and never happen with Wasm).
pub type ABIArgSlotVec = SmallVec<[ABIArgSlot; 1]>;

/// An ABIArg is composed of one or more parts. This allows for a CLIF-level
/// Value to be passed with its parts in more than one location at the ABI
/// level. For example, a 128-bit integer may be passed in two 64-bit registers,
/// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
/// number of "parts" should correspond to the number of registers used to store
/// this type according to the machine backend.
///
/// As an invariant, the `purpose` for every part must match. As a further
/// invariant, a `StructArg` part cannot appear with any other part.
#[derive(Clone, Debug)]
pub enum ABIArg {
    /// Storage slots (registers or stack locations) for each part of the
    /// argument value. The number of slots must equal the number of register
    /// parts used to store a value of this type.
    Slots {
        /// Slots, one per register part.
        slots: ABIArgSlotVec,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Structure argument. We reserve stack space for it, but the CLIF-level
    /// semantics are a little weird: the value passed to the call instruction,
    /// and received in the corresponding block param, is a *pointer*. On the
    /// caller side, we memcpy the data from the passed-in pointer to the stack
    /// area; on the callee side, we compute a pointer to this stack area and
    /// provide that as the argument's value.
    StructArg {
        /// Offset of this arg relative to base of stack args.
        offset: i64,
        /// Size of this arg on the stack.
        size: u64,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Implicit argument. Similar to a StructArg, except that we have the
    /// target type, not a pointer type, at the CLIF level. This argument is
    /// still passed by reference implicitly.
    ImplicitPtrArg {
        /// Register or stack slot holding a pointer to the buffer.
        pointer: ABIArgSlot,
        /// Offset of the argument buffer.
        offset: i64,
        /// Type of the implicit argument.
        ty: Type,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
}

impl ABIArg {
    /// Create an ABIArg from one register.
    pub fn reg(
        reg: RealReg,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Reg { reg, ty, extension }],
            purpose,
        }
    }

    /// Create an ABIArg from one stack slot.
    pub fn stack(
        offset: i64,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Stack {
                offset,
                ty,
                extension,
            }],
            purpose,
        }
    }
}

/// Are we computing information about arguments or return values? Much of the
/// handling is factored out into common routines; this enum allows us to
/// distinguish which case we're handling.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgsOrRets {
    /// Arguments.
    Args,
    /// Return values.
    Rets,
}

/// Abstract location for a machine-specific ABI impl to translate into the
/// appropriate addressing mode.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StackAMode {
    /// Offset into the current frame's argument area.
    IncomingArg(i64, u32),
    /// Offset within the stack slots in the current frame.
    Slot(i64),
    /// Offset into the callee frame's argument area.
    OutgoingArg(i64),
}

impl StackAMode {
    fn offset_by(&self, offset: u32) -> Self {
        match self {
            StackAMode::IncomingArg(off, size) => {
                StackAMode::IncomingArg(off.checked_add(i64::from(offset)).unwrap(), *size)
            }
            StackAMode::Slot(off) => StackAMode::Slot(off.checked_add(i64::from(offset)).unwrap()),
            StackAMode::OutgoingArg(off) => {
                StackAMode::OutgoingArg(off.checked_add(i64::from(offset)).unwrap())
            }
        }
    }
}
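
// Illustrative sketch (not part of the ABI code itself): `offset_by` shifts
// the abstract location by a byte amount without changing its kind, e.g.:
//
//     assert_eq!(StackAMode::Slot(8).offset_by(8), StackAMode::Slot(16));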

/// Trait implemented by machine-specific backend to represent ISA flags.
pub trait IsaFlags: Clone {
    /// Get a flag indicating whether forward-edge CFI is enabled.
    fn is_forward_edge_cfi_enabled(&self) -> bool {
        false
    }
}

/// Used as an out-parameter to accumulate a sequence of `ABIArg`s in
/// `ABIMachineSpec::compute_arg_locs`. Wraps the shared allocation for all
/// `ABIArg`s in `SigSet` and exposes just the args for the current
/// `compute_arg_locs` call.
pub struct ArgsAccumulator<'a> {
    sig_set_abi_args: &'a mut Vec<ABIArg>,
    start: usize,
    non_formal_flag: bool,
}

impl<'a> ArgsAccumulator<'a> {
    fn new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self {
        let start = sig_set_abi_args.len();
        ArgsAccumulator {
            sig_set_abi_args,
            start,
            non_formal_flag: false,
        }
    }

    /// Push a formal argument. Must not be called after `push_non_formal`.
    #[inline]
    pub fn push(&mut self, arg: ABIArg) {
        debug_assert!(!self.non_formal_flag);
        self.sig_set_abi_args.push(arg)
    }

    /// Push a non-formal (synthetic) argument, such as a return-area pointer.
    #[inline]
    pub fn push_non_formal(&mut self, arg: ABIArg) {
        self.non_formal_flag = true;
        self.sig_set_abi_args.push(arg)
    }

    /// The args pushed so far for the current `compute_arg_locs` call.
    #[inline]
    pub fn args(&self) -> &[ABIArg] {
        &self.sig_set_abi_args[self.start..]
    }

    /// Mutable view of the args pushed so far.
    #[inline]
    pub fn args_mut(&mut self) -> &mut [ABIArg] {
        &mut self.sig_set_abi_args[self.start..]
    }
}
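
// Sketch of intended use inside a backend's `compute_arg_locs` (hypothetical
// `r0`/`r1`/`ext`/`purpose` values): formal args are pushed in order, then
// any synthetic trailing arg such as a return-area pointer:
//
//     acc.push(ABIArg::reg(r0, ir::types::I64, ext, purpose));
//     acc.push_non_formal(ABIArg::reg(r1, ir::types::I64, ext, purpose));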

/// Trait implemented by machine-specific backend to provide information about
/// register assignments and to allow generating the specific instructions for
/// stack loads/saves, prologues/epilogues, etc.
pub trait ABIMachineSpec {
    /// The instruction type.
    type I: VCodeInst;

    /// The ISA flags type.
    type F: IsaFlags;

    /// This is the limit for the size of argument and return-value areas on the
    /// stack. We place a reasonable limit here to avoid integer overflow issues
    /// with 32-bit arithmetic.
    const STACK_ARG_RET_SIZE_LIMIT: u32;

    /// Returns the number of bits in a word: 32 or 64 for a 32- or 64-bit
    /// architecture, respectively.
    fn word_bits() -> u32;

    /// Returns the number of bytes in a word.
    fn word_bytes() -> u32 {
        Self::word_bits() / 8
    }

    /// Returns the word-size integer type.
    fn word_type() -> Type {
        match Self::word_bits() {
            32 => I32,
            64 => I64,
            _ => unreachable!(),
        }
    }

    /// Returns the word register class.
    fn word_reg_class() -> RegClass {
        RegClass::Int
    }

    /// Returns the required stack alignment in bytes.
    fn stack_align(call_conv: isa::CallConv) -> u32;

    /// Process a list of parameters or return values and allocate them to registers
    /// and stack slots.
    ///
    /// The argument locations should be pushed onto the given `ArgsAccumulator`
    /// in order. Any extra arguments added (such as return area pointers)
    /// should come at the end of the list so that the first N lowered
    /// parameters align with the N clif parameters.
    ///
    /// Returns the stack space used (rounded up as alignment requires) and,
    /// if `add_ret_area_ptr` was passed, the index of the extra synthetic arg
    /// that was added.
    fn compute_arg_locs(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        params: &[ir::AbiParam],
        args_or_rets: ArgsOrRets,
        add_ret_area_ptr: bool,
        args: ArgsAccumulator,
    ) -> CodegenResult<(u32, Option<usize>)>;

    /// Generate a load from the stack.
    fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Generate a store to the stack.
    fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate a move.
    fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate an integer-extend operation.
    fn gen_extend(
        to_reg: Writable<Reg>,
        from_reg: Reg,
        is_signed: bool,
        from_bits: u8,
        to_bits: u8,
    ) -> Self::I;

    /// Generate an "args" pseudo-instruction to capture input args in
    /// registers.
    fn gen_args(args: Vec<ArgPair>) -> Self::I;

    /// Generate a "rets" pseudo-instruction that moves vregs to return
    /// registers.
    fn gen_rets(rets: Vec<RetPair>) -> Self::I;

    /// Generate an add-with-immediate. Note that even if this uses a scratch
    /// register, it must satisfy two requirements:
    ///
    /// - The add-imm sequence must only clobber caller-save registers that are
    ///   not used for arguments, because it will be placed in the prologue
    ///   before the clobbered callee-save registers are saved.
    ///
    /// - The add-imm sequence must work correctly when `from_reg` and/or
    ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
    fn gen_add_imm(
        call_conv: isa::CallConv,
        into_reg: Writable<Reg>,
        from_reg: Reg,
        imm: u32,
    ) -> SmallInstVec<Self::I>;

    /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
    /// the stack pointer is less than the given limit register (assuming the
    /// stack grows downward).
    fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;

    /// Generate an instruction to compute an address of a stack slot (FP- or
    /// SP-based offset).
    fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I;

    /// Get a fixed register to use to compute a stack limit. This is needed for
    /// certain sequences generated after the register allocator has already
    /// run. This must satisfy two requirements:
    ///
    /// - It must be a caller-save register that is not used for arguments,
    ///   because it will be clobbered in the prologue before the clobbered
    ///   callee-save registers are saved.
    ///
    /// - It must be safe to pass as an argument and/or destination to
    ///   `gen_add_imm()`. This is relevant when an addition with a large
    ///   immediate needs its own temporary; it cannot use the same fixed
    ///   temporary as this one.
    fn get_stacklimit_reg(call_conv: isa::CallConv) -> Reg;

    /// Generate a load from the given [base+offset] address.
    fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;

    /// Generate a store to the given [base+offset] address.
    fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;

    /// Adjust the stack pointer up or down.
    fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;

    /// Compute a FrameLayout structure containing a sorted list of all clobbered
    /// registers that are callee-saved according to the ABI, as well as the sizes
    /// of all parts of the stack frame.  The result is used to emit the prologue
    /// and epilogue routines.
    fn compute_frame_layout(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        sig: &Signature,
        regs: &[Writable<RealReg>],
        is_leaf: bool,
        incoming_args_size: u32,
        tail_args_size: u32,
        stackslots_size: u32,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> FrameLayout;

    /// Generate the usual frame-setup sequence for this architecture: e.g.,
    /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
    /// AArch64.
    fn gen_prologue_frame_setup(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate the usual frame-restore sequence for this architecture.
    fn gen_epilogue_frame_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a return instruction.
    fn gen_return(
        call_conv: isa::CallConv,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a probestack call.
    fn gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32);

    /// Generate an inline stack probe.
    fn gen_inline_probestack(
        insts: &mut SmallInstVec<Self::I>,
        call_conv: isa::CallConv,
        frame_size: u32,
        guard_size: u32,
    );

    /// Generate a clobber-save sequence. The implementation here should return
    /// a sequence of instructions that "push" or otherwise save to the stack all
    /// registers written/modified by the function body that are callee-saved.
    /// The sequence of instructions should adjust the stack pointer downward,
    /// and should align as necessary according to ABI requirements.
    fn gen_clobber_save(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a clobber-restore sequence. This sequence should perform the
    /// opposite of the clobber-save sequence generated above, assuming that SP
    /// going into the sequence is at the same point that it was left when the
    /// clobber-save sequence finished.
    fn gen_clobber_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a call instruction/sequence. This method is provided one
    /// temporary register to use to synthesize the called address, if needed.
    fn gen_call(dest: &CallDest, tmp: Writable<Reg>, info: CallInfo<()>) -> SmallVec<[Self::I; 2]>;

    /// Generate a memcpy invocation. Used to set up struct
    /// args. Takes `src`, `dst` as read-only inputs and passes a temporary
    /// allocator.
    fn gen_memcpy<F: FnMut(Type) -> Writable<Reg>>(
        call_conv: isa::CallConv,
        dst: Reg,
        src: Reg,
        size: usize,
        alloc_tmp: F,
    ) -> SmallVec<[Self::I; 8]>;

    /// Get the number of spillslots required for the given register-class.
    fn get_number_of_spillslots_for_value(
        rc: RegClass,
        target_vector_bytes: u32,
        isa_flags: &Self::F,
    ) -> u32;

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    fn get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv;

    /// Get all caller-save registers, that is, registers that we expect
    /// not to be saved across a call to a callee with the given ABI.
    fn get_regs_clobbered_by_call(
        call_conv_of_callee: isa::CallConv,
        is_exception: bool,
    ) -> PRegSet;

    /// Get the needed extension mode, given the mode attached to the argument
    /// in the signature and the calling convention. The input (the attribute in
    /// the signature) specifies what extension type should be done *if* the ABI
    /// requires extension to the full register; this method's return value
    /// indicates whether the extension actually *will* be done.
    fn get_ext_mode(
        call_conv: isa::CallConv,
        specified: ir::ArgumentExtension,
    ) -> ir::ArgumentExtension;

    /// Get a temporary register that is available to use after a call
    /// completes and that does not interfere with register-carried
    /// return values. This is used to move stack-carried return
    /// values directly into spillslots if needed.
    fn retval_temp_reg(call_conv_of_callee: isa::CallConv) -> Writable<Reg>;

    /// Get the exception payload registers, if any, for a calling
    /// convention.
    fn exception_payload_regs(_call_conv: isa::CallConv) -> &'static [Reg] {
        &[]
    }
}
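
// A backend instantiates `ABIMachineSpec` roughly as follows (hypothetical
// `MyBackend` types; the real instantiations live in each ISA's `abi.rs`):
//
//     impl ABIMachineSpec for MyBackend {
//         type I = MyInst;
//         type F = MyIsaFlags;
//         const STACK_ARG_RET_SIZE_LIMIT: u32 = 128 * 1024 * 1024;
//         fn word_bits() -> u32 { 64 }
//         // ... remaining methods elided ...
//     }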

/// Out-of-line data for calls, to keep the size of `Inst` down.
#[derive(Clone, Debug)]
pub struct CallInfo<T> {
    /// Receiver of this call.
    pub dest: T,
    /// Register uses of this call.
    pub uses: CallArgList,
    /// Register defs of this call.
    pub defs: CallRetList,
    /// Registers clobbered by this call, as per its calling convention.
    pub clobbers: PRegSet,
    /// The calling convention of the callee.
    pub callee_conv: isa::CallConv,
    /// The calling convention of the caller.
    pub caller_conv: isa::CallConv,
    /// The number of bytes that the callee will pop from the stack for the
    /// caller, if any. (Used for popping stack arguments with the `tail`
    /// calling convention.)
    pub callee_pop_size: u32,
    /// Information for a try-call, if this is one. We combine
    /// handling of calls and try-calls as much as possible to share
    /// argument/return logic; they mostly differ in the metadata that
    /// they emit, which this information feeds into.
    pub try_call_info: Option<TryCallInfo>,
}

/// Out-of-line information present on `try_call` instructions only:
/// information that is used to generate exception-handling tables and
/// link up to destination blocks properly.
#[derive(Clone, Debug)]
pub struct TryCallInfo {
    /// The target to jump to on a normal return.
    pub continuation: MachLabel,
    /// Exception tags to catch and corresponding destination labels.
    pub exception_dests: Box<[(PackedOption<ExceptionTag>, MachLabel)]>,
}

impl<T> CallInfo<T> {
    /// Creates an empty set of info with no clobbers/uses/etc with the
    /// specified ABI.
    pub fn empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T> {
        CallInfo {
            dest,
            uses: smallvec![],
            defs: smallvec![],
            clobbers: PRegSet::empty(),
            caller_conv: call_conv,
            callee_conv: call_conv,
            callee_pop_size: 0,
            try_call_info: None,
        }
    }

    /// Change the `T` payload on this info to `U`.
    pub fn map<U>(self, f: impl FnOnce(T) -> U) -> CallInfo<U> {
        CallInfo {
            dest: f(self.dest),
            uses: self.uses,
            defs: self.defs,
            clobbers: self.clobbers,
            caller_conv: self.caller_conv,
            callee_conv: self.callee_conv,
            callee_pop_size: self.callee_pop_size,
            try_call_info: self.try_call_info,
        }
    }
}
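
// For example (hypothetical `resolve_to_reg` helper): `map` rewrites the
// destination payload while preserving all other call metadata:
//
//     let resolved: CallInfo<Reg> = info.map(|name| resolve_to_reg(name));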

/// The id of an ABI signature within the `SigSet`.
#[derive(Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Sig(u32);
cranelift_entity::entity_impl!(Sig);

impl Sig {
    fn prev(self) -> Option<Sig> {
        self.0.checked_sub(1).map(Sig)
    }
}

/// ABI information shared between body (callee) and caller.
#[derive(Clone, Debug)]
pub struct SigData {
    /// Both return values and arguments are stored in a contiguous region of
    /// the vector `SigSet::abi_args`:
    ///
    /// ```plain
    ///                  +----------------------------------------------+
    ///                  | return values                                |
    ///                  | ...                                          |
    ///   rets_end   --> +----------------------------------------------+
    ///                  | arguments                                    |
    ///                  | ...                                          |
    ///   args_end   --> +----------------------------------------------+
    ///
    /// ```
    ///
    /// Note that we only store the two end offsets, since
    /// rets_end == args_start and rets_start == prev.args_end.
    ///
    /// Argument location ending offset (regs or stack slots). Stack offsets are relative to
    /// SP on entry to function.
    ///
    /// This is an index into `SigSet::abi_args`.
    args_end: u32,

    /// Return-value location ending offset. Stack offsets are relative to the return-area
    /// pointer.
    ///
    /// This is an index into `SigSet::abi_args`.
    rets_end: u32,

    /// Space on stack used to store arguments. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_arg_space: u32,

    /// Space on stack used to store return values. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_ret_space: u32,

    /// Index in `args` of the stack-return-value-area argument.
    stack_ret_arg: Option<u16>,

    /// Calling convention used.
    call_conv: isa::CallConv,
}

impl SigData {
    /// Get total stack space required for arguments.
    pub fn sized_stack_arg_space(&self) -> i64 {
        self.sized_stack_arg_space.into()
    }

    /// Get total stack space required for return values.
    pub fn sized_stack_ret_space(&self) -> i64 {
        self.sized_stack_ret_space.into()
    }

    /// Get calling convention used.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// The index of the stack-return-value-area argument, if any.
    pub fn stack_ret_arg(&self) -> Option<u16> {
        self.stack_ret_arg
    }
}

/// A (mostly) deduplicated set of ABI signatures.
///
/// We say "mostly" because we do not dedupe between signatures interned via
/// `ir::SigRef` (direct and indirect calls; the vast majority of signatures in
/// this set) vs via `ir::Signature` (the callee itself and libcalls). Doing
/// this final bit of deduplication would require filling out the
/// `ir_signature_to_abi_sig`, which is a bunch of allocations (not just the
/// hash map itself but params and returns vecs in each signature) that we want
/// to avoid.
///
/// In general, prefer using the `ir::SigRef`-taking methods to the
/// `ir::Signature`-taking methods when you can get away with it, as they don't
/// require cloning non-copy types that will trigger heap allocations.
///
/// This type can be indexed by `Sig` to access its associated `SigData`.
pub struct SigSet {
    /// Interned `ir::Signature`s that we already have an ABI signature for.
    ir_signature_to_abi_sig: FxHashMap<ir::Signature, Sig>,

    /// Interned `ir::SigRef`s that we already have an ABI signature for.
    ir_sig_ref_to_abi_sig: SecondaryMap<ir::SigRef, Option<Sig>>,

    /// A single, shared allocation for all `ABIArg`s used by all
    /// `SigData`s. Each `SigData` references its args/rets via indices into
    /// this allocation.
    abi_args: Vec<ABIArg>,

    /// The actual ABI signatures, keyed by `Sig`.
    sigs: PrimaryMap<Sig, SigData>,
}

impl SigSet {
    /// Construct a new `SigSet`, interning all of the signatures used by the
    /// given function.
    pub fn new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self>
    where
        M: ABIMachineSpec,
    {
        let arg_estimate = func.dfg.signatures.len() * 6;

        let mut sigs = SigSet {
            ir_signature_to_abi_sig: FxHashMap::default(),
            ir_sig_ref_to_abi_sig: SecondaryMap::with_capacity(func.dfg.signatures.len()),
            abi_args: Vec::with_capacity(arg_estimate),
            sigs: PrimaryMap::with_capacity(1 + func.dfg.signatures.len()),
        };

        sigs.make_abi_sig_from_ir_signature::<M>(func.signature.clone(), flags)?;
        for sig_ref in func.dfg.signatures.keys() {
            sigs.make_abi_sig_from_ir_sig_ref::<M>(sig_ref, &func.dfg, flags)?;
        }

        Ok(sigs)
    }

    /// Have we already interned an ABI signature for the given `ir::Signature`?
    pub fn have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool {
        self.ir_signature_to_abi_sig.contains_key(signature)
    }

    /// Construct and intern an ABI signature for the given `ir::Signature`.
    pub fn make_abi_sig_from_ir_signature<M>(
        &mut self,
        signature: ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        // Because the `HashMap` entry API requires taking ownership of the
        // lookup key -- and we want to avoid unnecessary clones of
        // `ir::Signature`s, even at the cost of duplicate lookups -- we can't
        // have a single, get-or-create-style method for interning
        // `ir::Signature`s into ABI signatures. So at least (debug) assert that
        // we aren't creating duplicate ABI signatures for the same
        // `ir::Signature`.
        debug_assert!(!self.have_abi_sig_for_signature(&signature));

        let sig_data = self.from_func_sig::<M>(&signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_signature_to_abi_sig.insert(signature, sig);
        Ok(sig)
    }

    fn make_abi_sig_from_ir_sig_ref<M>(
        &mut self,
        sig_ref: ir::SigRef,
        dfg: &ir::DataFlowGraph,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        if let Some(sig) = self.ir_sig_ref_to_abi_sig[sig_ref] {
            return Ok(sig);
        }
        let signature = &dfg.signatures[sig_ref];
        let sig_data = self.from_func_sig::<M>(signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_sig_ref_to_abi_sig[sig_ref] = Some(sig);
        Ok(sig)
    }

    /// Get the already-interned ABI signature id for the given `ir::SigRef`.
    pub fn abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig {
        self.ir_sig_ref_to_abi_sig[sig_ref]
            .expect("must call `make_abi_sig_from_ir_sig_ref` before `get_abi_sig_for_sig_ref`")
    }

    /// Get the already-interned ABI signature id for the given `ir::Signature`.
    pub fn abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig {
        self.ir_signature_to_abi_sig
            .get(signature)
            .copied()
            .expect("must call `make_abi_sig_from_ir_signature` before `get_abi_sig_for_signature`")
    }

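    /// Compute a `SigData` for the given `ir::Signature`, pushing the
    /// locations for its return values and then its arguments into the
    /// shared `abi_args` allocation.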
    pub fn from_func_sig<M: ABIMachineSpec>(
        &mut self,
        sig: &ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<SigData> {
        // Keep in sync with ensure_struct_return_ptr_is_returned
        if sig.uses_special_return(ArgumentPurpose::StructReturn) {
            panic!("Explicit StructReturn return value not allowed: {sig:?}")
        }
        let tmp;
        let returns = if let Some(struct_ret_index) =
            sig.special_param_index(ArgumentPurpose::StructReturn)
        {
            if !sig.returns.is_empty() {
                panic!("No return values are allowed when using StructReturn: {sig:?}");
            }
            tmp = [sig.params[struct_ret_index]];
            &tmp
        } else {
            sig.returns.as_slice()
        };

        // Compute args and retvals from signature. Handle retvals first,
        // because we may need to add a return-area arg to the args.

        // NOTE: We rely on the order of insertion (rets, then args) to compute
        // the offsets in `SigSet::args()` and `SigSet::rets()`. Therefore, the
        // order of these two `compute_arg_locs` calls must not change.
        let (sized_stack_ret_space, _) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &returns,
            ArgsOrRets::Rets,
            /* extra ret-area ptr = */ false,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        if !flags.enable_multi_ret_implicit_sret() {
            assert_eq!(sized_stack_ret_space, 0);
        }
        let rets_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the return size to something reasonable.
        if sized_stack_ret_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        let need_stack_return_area = sized_stack_ret_space > 0;
        if need_stack_return_area {
            assert!(!sig.uses_special_param(ir::ArgumentPurpose::StructReturn));
        }

        let (sized_stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.params,
            ArgsOrRets::Args,
            need_stack_return_area,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        let args_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the arg size to something reasonable.
        if sized_stack_arg_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        trace!(
            "ABISig: sig {:?} => args end = {} rets end = {}
             arg stack = {} ret stack = {} stack_ret_arg = {:?}",
            sig,
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
        );

        let stack_ret_arg = stack_ret_arg.map(|s| u16::try_from(s).unwrap());
        Ok(SigData {
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
            call_conv: sig.call_conv,
        })
    }

    /// Get this signature's ABI arguments.
    pub fn args(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` for how we store
        // the offsets.
        let start = usize::try_from(sig_data.rets_end).unwrap();
        let end = usize::try_from(sig_data.args_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass the implicit pointer
    /// to the return-value area on the stack, if required.
    pub fn get_ret_arg(&self, sig: Sig) -> Option<ABIArg> {
        let sig_data = &self.sigs[sig];
        if let Some(i) = sig_data.stack_ret_arg {
            Some(self.args(sig)[usize::from(i)].clone())
        } else {
            None
        }
    }

    /// Get information specifying how to pass one argument.
    pub fn get_arg(&self, sig: Sig, idx: usize) -> ABIArg {
        self.args(sig)[idx].clone()
    }

    /// Get this signature's ABI returns.
    pub fn rets(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` for how we store
        // the offsets.
        let start = usize::try_from(sig.prev().map_or(0, |prev| self.sigs[prev].args_end)).unwrap();
        let end = usize::try_from(sig_data.rets_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass one return value.
    pub fn get_ret(&self, sig: Sig, idx: usize) -> ABIArg {
        self.rets(sig)[idx].clone()
    }

    /// Get the number of arguments expected.
    pub fn num_args(&self, sig: Sig) -> usize {
        let len = self.args(sig).len();
        if self.sigs[sig].stack_ret_arg.is_some() {
            len - 1
        } else {
            len
        }
    }

    /// Get the number of return values expected.
    pub fn num_rets(&self, sig: Sig) -> usize {
        self.rets(sig).len()
    }
}
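
// Worked example of the shared-allocation indexing: if the previous
// signature's `args_end` is 4 and this signature has `rets_end == 6` and
// `args_end == 9`, then `rets(sig)` is `abi_args[4..6]` and `args(sig)` is
// `abi_args[6..9]`.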

// NB: we do _not_ implement `IndexMut` because these signatures are
// deduplicated and shared!
impl std::ops::Index<Sig> for SigSet {
    type Output = SigData;

    fn index(&self, sig: Sig) -> &Self::Output {
        &self.sigs[sig]
    }
}

/// Structure describing the layout of a function's stack frame.
#[derive(Clone, Debug, Default)]
pub struct FrameLayout {
    /// N.B. The areas whose sizes are given in this structure fully
    /// cover the current function's stack frame, from high to low
    /// stack addresses in the sequence below.  Each size contains
    /// any alignment padding that may be required by the ABI.

    /// Size of incoming arguments on the stack.  This is not technically
    /// part of this function's frame, but code in the function will still
    /// need to access it.  Depending on the ABI, we may need to set up a
    /// frame pointer to do so; we also may need to pop this area from the
    /// stack upon return.
    pub incoming_args_size: u32,

    /// The size of the incoming argument area, taking into account any
    /// potential increase in size required for tail calls present in the
    /// function. In the case that no tail calls are present, this value
    /// will be the same as [`Self::incoming_args_size`].
    pub tail_args_size: u32,

    /// Size of the "setup area", typically holding the return address
    /// and/or the saved frame pointer.  This may be written either during
    /// the call itself (e.g. a pushed return address) or by code emitted
    /// from gen_prologue_frame_setup.  In any case, after that code has
    /// completed execution, the stack pointer is expected to point to the
    /// bottom of this area.  The same holds at the start of code emitted
    /// by gen_epilogue_frame_restore.
    pub setup_area_size: u32,

    /// Size of the area used to save callee-saved clobbered registers.
    /// This area is accessed by code emitted from gen_clobber_save and
    /// gen_clobber_restore.
    pub clobber_size: u32,

    /// Storage allocated for the fixed part of the stack frame.
    /// This contains stack slots and spill slots.
    pub fixed_frame_storage_size: u32,

    /// The size of all stackslots.
    pub stackslots_size: u32,

    /// Stack size to be reserved for outgoing arguments, if used by
    /// the current ABI, or 0 otherwise.  After gen_clobber_save and
    /// before gen_clobber_restore, the stack pointer points to the
    /// bottom of this area.
    pub outgoing_args_size: u32,

    /// Sorted list of callee-saved registers that are clobbered
    /// according to the ABI.  These registers will be saved and
    /// restored by gen_clobber_save and gen_clobber_restore.
    pub clobbered_callee_saves: Vec<Writable<RealReg>>,
}

impl FrameLayout {
    /// Split the clobbered callee-save registers into integer-class and
    /// float-class groups.
    ///
    /// This method does not currently support vector-class callee-save
    /// registers because no current backend has them.
    pub fn clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>]) {
        let (ints, floats) = self.clobbered_callee_saves.split_at(
            self.clobbered_callee_saves
                .partition_point(|r| r.to_reg().class() == RegClass::Int),
        );
        debug_assert!(floats.iter().all(|r| r.to_reg().class() == RegClass::Float));
        (ints, floats)
    }

    /// The distance from FP to SP while the frame is active (i.e., not during
    /// prologue setup or epilogue teardown).
    pub fn active_size(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }

    /// Get the offset from SP to the sized stack slots area.
    pub fn sp_to_sized_stack_slots(&self) -> u32 {
        self.outgoing_args_size
    }
}
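
// For example: with `outgoing_args_size == 32`, `fixed_frame_storage_size ==
// 64`, and `clobber_size == 16`, `active_size()` is 112 bytes (the full
// FP-to-SP distance while the body runs), and the sized stack slots begin at
// SP + 32.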

/// ABI object for a function body.
pub struct Callee<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// Signature: arg and retval regs.
    sig: Sig,
    /// Defined dynamic types.
    dynamic_type_sizes: HashMap<Type, u32>,
    /// Offsets to each dynamic stackslot.
    dynamic_stackslots: PrimaryMap<DynamicStackSlot, u32>,
    /// Offsets to each sized stackslot.
    sized_stackslots: PrimaryMap<StackSlot, u32>,
    /// Total stack size of all stackslots.
    stackslots_size: u32,
    /// Stack size to be reserved for outgoing arguments.
    outgoing_args_size: u32,
    /// Initially the number of bytes originating in the caller's frame where stack arguments will
    /// live. After lowering this number may be larger than the size expected by the function being
    /// compiled, as tail calls potentially require more space for stack arguments.
    tail_args_size: u32,
    /// Register-argument defs, to be provided to the `args`
    /// pseudo-inst, and pregs to constrain them to.
    reg_args: Vec<ArgPair>,
    /// Finalized frame layout for this function.
    frame_layout: Option<FrameLayout>,
    /// The register holding the return-area pointer, if needed.
    ret_area_ptr: Option<Reg>,
    /// Calling convention this function expects.
    call_conv: isa::CallConv,
    /// The settings controlling this function's compilation.
    flags: settings::Flags,
    /// The ISA-specific flag values controlling this function's compilation.
    isa_flags: M::F,
    /// Whether or not this function is a "leaf", meaning it calls no other
    /// functions.
    is_leaf: bool,
    /// If this function has a stack limit specified, then `Reg` is where the
    /// stack limit will be located after the instructions specified have been
    /// executed.
    ///
    /// Note that this is intended for insertion into the prologue, if
    /// present. Also note that because the instructions here execute in the
    /// prologue this happens after legalization/register allocation/etc so we
    /// need to be extremely careful with each instruction. The instructions are
    /// manually register-allocated and carefully only use caller-saved
    /// registers and keep nothing live after this sequence of instructions.
    stack_limit: Option<(Reg, SmallInstVec<M::I>)>,

    _mach: PhantomData<M>,
}

fn get_special_purpose_param_register(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    purpose: ir::ArgumentPurpose,
) -> Option<Reg> {
    let idx = f.signature.special_param_index(purpose)?;
    match &sigs.args(sig)[idx] {
        &ABIArg::Slots { ref slots, .. } => match &slots[0] {
            &ABIArgSlot::Reg { reg, .. } => Some(reg.into()),
            _ => None,
        },
        _ => None,
    }
}

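/// Round `val` up to the nearest multiple of `mask + 1`, where `mask` must be
/// one less than a power of two; returns `None` on overflow. For example,
/// `checked_round_up(13, 7) == Some(16)`.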
fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
    Some(val.checked_add(mask)? & !mask)
}

impl<M: ABIMachineSpec> Callee<M> {
    /// Create a new body ABI instance.
    pub fn new(
        f: &ir::Function,
        isa: &dyn TargetIsa,
        isa_flags: &M::F,
        sigs: &SigSet,
    ) -> CodegenResult<Self> {
        trace!("ABI: func signature {:?}", f.signature);

        let flags = isa.flags().clone();
        let sig = sigs.abi_sig_for_signature(&f.signature);

        let call_conv = f.signature.call_conv;
        // Only these calling conventions are supported.
        debug_assert!(
            call_conv == isa::CallConv::SystemV
                || call_conv == isa::CallConv::Tail
                || call_conv == isa::CallConv::Fast
                || call_conv == isa::CallConv::Cold
                || call_conv == isa::CallConv::WindowsFastcall
                || call_conv == isa::CallConv::AppleAarch64
                || call_conv == isa::CallConv::Winch,
            "Unsupported calling convention: {call_conv:?}"
        );

        // Compute sized stackslot locations and total stackslot size.
        let mut end_offset: u32 = 0;
        let mut sized_stackslots = PrimaryMap::new();

        for (stackslot, data) in f.sized_stack_slots.iter() {
            // We start our computation possibly unaligned where the previous
            // stackslot left off.
            let unaligned_start_offset = end_offset;

            // The start of the stackslot must be aligned.
            //
            // We always at least machine-word-align slots, but also
            // satisfy the user's requested alignment.
            debug_assert!(data.align_shift < 32);
            let align = std::cmp::max(M::word_bytes(), 1u32 << data.align_shift);
            let mask = align - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            // The end offset is the start offset increased by the size.
            end_offset = start_offset
                .checked_add(data.size)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            debug_assert_eq!(stackslot.as_u32() as usize, sized_stackslots.len());
            sized_stackslots.push(start_offset);
        }

        // Compute dynamic stackslot locations and total stackslot size.
        let mut dynamic_stackslots = PrimaryMap::new();
        for (stackslot, data) in f.dynamic_stack_slots.iter() {
            debug_assert_eq!(stackslot.as_u32() as usize, dynamic_stackslots.len());

            // This computation is similar to the sized stackslots above.
            let unaligned_start_offset = end_offset;

            let mask = M::word_bytes() - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            let ty = f.get_concrete_dynamic_ty(data.dyn_ty).ok_or_else(|| {
                CodegenError::Unsupported(format!("invalid dynamic vector type: {}", data.dyn_ty))
            })?;

            end_offset = start_offset
                .checked_add(isa.dynamic_vector_bytes(ty))
                .ok_or(CodegenError::ImplLimitExceeded)?;

            dynamic_stackslots.push(start_offset);
        }

        // The total size of the stackslots needs to be word-aligned.
        let stackslots_size = checked_round_up(end_offset, M::word_bytes() - 1)
            .ok_or(CodegenError::ImplLimitExceeded)?;

        let mut dynamic_type_sizes = HashMap::with_capacity(f.dfg.dynamic_types.len());
        for (dyn_ty, _data) in f.dfg.dynamic_types.iter() {
            let ty = f
                .get_concrete_dynamic_ty(dyn_ty)
                .unwrap_or_else(|| panic!("invalid dynamic vector type: {dyn_ty}"));
            let size = isa.dynamic_vector_bytes(ty);
            dynamic_type_sizes.insert(ty, size);
        }

        // Figure out what instructions, if any, will be needed to check the
        // stack limit. This can either be specified as a special-purpose
        // argument or as a global value which often calculates the stack limit
        // from the arguments.
        let stack_limit = f
            .stack_limit
            .map(|gv| gen_stack_limit::<M>(f, sigs, sig, gv));

        let tail_args_size = sigs[sig].sized_stack_arg_space;

        Ok(Self {
            ir_sig: ensure_struct_return_ptr_is_returned(&f.signature),
            sig,
            dynamic_stackslots,
            dynamic_type_sizes,
            sized_stackslots,
            stackslots_size,
            outgoing_args_size: 0,
            tail_args_size,
            reg_args: vec![],
            frame_layout: None,
            ret_area_ptr: None,
            call_conv,
            flags,
            isa_flags: isa_flags.clone(),
            is_leaf: f.is_leaf(),
            stack_limit,
            _mach: PhantomData,
        })
    }
1304
1305    /// Inserts instructions necessary for checking the stack limit into the
1306    /// prologue.
1307    ///
1308    /// This function will generate instructions necessary for perform a stack
1309    /// check at the header of a function. The stack check is intended to trap
1310    /// if the stack pointer goes below a particular threshold, preventing stack
1311    /// overflow in wasm or other code. The `stack_limit` argument here is the
1312    /// register which holds the threshold below which we're supposed to trap.
1313    /// This function is known to allocate `stack_size` bytes and we'll push
1314    /// instructions onto `insts`.
1315    ///
1316    /// Note that the instructions generated here are special because this is
1317    /// happening so late in the pipeline (e.g. after register allocation). This
1318    /// means that we need to do manual register allocation here and also be
1319    /// careful to not clobber any callee-saved or argument registers. For now
1320    /// this routine makes do with the `spilltmp_reg` as one temporary
1321    /// register, and a second register of `tmp2` which is caller-saved. This
1322    /// should be fine for us since no spills should happen in this sequence of
1323    /// instructions, so our register won't get accidentally clobbered.
1324    ///
1325    /// No values can be live after the prologue, but in this case that's ok
1326    /// because we just need to perform a stack check before progressing with
1327    /// the rest of the function.
1328    fn insert_stack_check(
1329        &self,
1330        stack_limit: Reg,
1331        stack_size: u32,
1332        insts: &mut SmallInstVec<M::I>,
1333    ) {
        // With no explicit stack allocated we can just emit the simple check
        // of the stack pointer against the stack limit register, and trap if
        // it's out of bounds.
1337        if stack_size == 0 {
1338            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1339            return;
1340        }
1341
1342        // Note that the 32k stack size here is pretty special. See the
1343        // documentation in x86/abi.rs for why this is here. The general idea is
1344        // that we're protecting against overflow in the addition that happens
1345        // below.
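        // (Illustratively: if `stack_limit` happened to sit near the top of
        // the address space, adding a large `stack_size` to it could wrap
        // around; trapping against `stack_limit` first means we never rely on
        // a wrapped sum.)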
1346        if stack_size >= 32 * 1024 {
1347            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1348        }
1349
1350        // Add the `stack_size` to `stack_limit`, placing the result in
1351        // `scratch`.
1352        //
1353        // Note though that `stack_limit`'s register may be the same as
1354        // `scratch`. If our stack size doesn't fit into an immediate this
1355        // means we need a second scratch register for loading the stack size
1356        // into a register.
1357        let scratch = Writable::from_reg(M::get_stacklimit_reg(self.call_conv));
1358        insts.extend(M::gen_add_imm(self.call_conv, scratch, stack_limit, stack_size).into_iter());
1359        insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
1360    }
1361}
1362
1363/// Generates the instructions necessary for the `gv` to be materialized into a
1364/// register.
1365///
1366/// This function will return a register that will contain the result of
1367/// evaluating `gv`. It will also return any instructions necessary to calculate
1368/// the value of the register.
1369///
1370/// Note that global values are typically lowered to instructions via the
1371/// standard legalization pass. Unfortunately though prologue generation happens
1372/// so late in the pipeline that we can't use these legalization passes to
1373/// generate the instructions for `gv`. As a result we duplicate some lowering
1374/// of `gv` here and support only some global values. This is similar to what
1375/// the x86 backend does for now, and hopefully this can be somewhat cleaned up
1376/// in the future too!
1377///
/// Also note that this function will make use of `writable_spilltmp_reg()` as
/// a temporary register to store values in if necessary. Currently, once we
/// write to this register there are guaranteed to be no spills between the
/// write and its use, because we're not participating in register allocation
/// anyway!
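///
/// Only the global-value shapes below are handled; anything else panics (a
/// sketch of the supported variants, not an exhaustive reference):
///
/// ```ignore
/// GlobalValueData::VMContext                 // the vmctx special parameter
/// GlobalValueData::Load { base, offset, .. } // a load from another supported gv
/// ```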
1382fn gen_stack_limit<M: ABIMachineSpec>(
1383    f: &ir::Function,
1384    sigs: &SigSet,
1385    sig: Sig,
1386    gv: ir::GlobalValue,
1387) -> (Reg, SmallInstVec<M::I>) {
1388    let mut insts = smallvec![];
1389    let reg = generate_gv::<M>(f, sigs, sig, gv, &mut insts);
    (reg, insts)
1391}
1392
1393fn generate_gv<M: ABIMachineSpec>(
1394    f: &ir::Function,
1395    sigs: &SigSet,
1396    sig: Sig,
1397    gv: ir::GlobalValue,
1398    insts: &mut SmallInstVec<M::I>,
1399) -> Reg {
1400    match f.global_values[gv] {
1401        // Return the direct register the vmcontext is in
1402        ir::GlobalValueData::VMContext => {
1403            get_special_purpose_param_register(f, sigs, sig, ir::ArgumentPurpose::VMContext)
1404                .expect("no vmcontext parameter found")
1405        }
        // Load our base value into a register, then load from that register
        // into a temporary register.
1408        ir::GlobalValueData::Load {
1409            base,
1410            offset,
1411            global_type: _,
1412            flags: _,
1413        } => {
1414            let base = generate_gv::<M>(f, sigs, sig, base, insts);
1415            let into_reg = Writable::from_reg(M::get_stacklimit_reg(f.stencil.signature.call_conv));
1416            insts.push(M::gen_load_base_offset(
1417                into_reg,
1418                base,
1419                offset.into(),
1420                M::word_type(),
1421            ));
            into_reg.to_reg()
1423        }
1424        ref other => panic!("global value for stack limit not supported: {other}"),
1425    }
1426}
1427
1428/// Returns true if the signature needs to be legalized.
1429fn missing_struct_return(sig: &ir::Signature) -> bool {
1430    sig.uses_special_param(ArgumentPurpose::StructReturn)
1431        && !sig.uses_special_return(ArgumentPurpose::StructReturn)
1432}
1433
1434fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
1435    // Keep in sync with Callee::new
1436    let mut sig = sig.clone();
1437    if sig.uses_special_return(ArgumentPurpose::StructReturn) {
1438        panic!("Explicit StructReturn return value not allowed: {sig:?}")
1439    }
1440    if let Some(struct_ret_index) = sig.special_param_index(ArgumentPurpose::StructReturn) {
1441        if !sig.returns.is_empty() {
1442            panic!("No return values are allowed when using StructReturn: {sig:?}");
1443        }
1444        sig.returns.insert(0, sig.params[struct_ret_index]);
1445    }
1446    sig
1447}
1448
1449/// ### Pre-Regalloc Functions
1450///
1451/// These methods of `Callee` may only be called before regalloc.
1452impl<M: ABIMachineSpec> Callee<M> {
1453    /// Access the (possibly legalized) signature.
1454    pub fn signature(&self) -> &ir::Signature {
1455        debug_assert!(
1456            !missing_struct_return(&self.ir_sig),
1457            "`Callee::ir_sig` is always legalized"
1458        );
1459        &self.ir_sig
1460    }
1461
1462    /// Initialize. This is called after the Callee is constructed because it
1463    /// may allocate a temp vreg, which can only be allocated once the lowering
1464    /// context exists.
1465    pub fn init_retval_area(
1466        &mut self,
1467        sigs: &SigSet,
1468        vregs: &mut VRegAllocator<M::I>,
1469    ) -> CodegenResult<()> {
1470        if sigs[self.sig].stack_ret_arg.is_some() {
1471            let ret_area_ptr = vregs.alloc(M::word_type())?;
1472            self.ret_area_ptr = Some(ret_area_ptr.only_reg().unwrap());
1473        }
1474        Ok(())
1475    }
1476
1477    /// Get the return area pointer register, if any.
1478    pub fn ret_area_ptr(&self) -> Option<Reg> {
1479        self.ret_area_ptr
1480    }
1481
1482    /// Accumulate outgoing arguments.
1483    ///
1484    /// This ensures that at least `size` bytes are allocated in the prologue to
1485    /// be available for use in function calls to hold arguments and/or return
1486    /// values. If this function is called multiple times, the maximum of all
1487    /// `size` values will be available.
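    ///
    /// For example (illustrative): two calls needing 16 and then 8 bytes of
    /// outgoing argument space result in a single 16-byte area being reserved
    /// in the frame, not 24 bytes.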
1488    pub fn accumulate_outgoing_args_size(&mut self, size: u32) {
1489        if size > self.outgoing_args_size {
1490            self.outgoing_args_size = size;
1491        }
1492    }
1493
    /// Accumulate the incoming argument area size requirements for a tail
    /// call, as it could be larger than the incoming argument area of the
    /// function currently being compiled.
1497    pub fn accumulate_tail_args_size(&mut self, size: u32) {
1498        if size > self.tail_args_size {
1499            self.tail_args_size = size;
1500        }
1501    }
1502
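    /// Whether forward-edge control-flow integrity (e.g. BTI on AArch64) is
    /// enabled, according to the ISA flags.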
1503    pub fn is_forward_edge_cfi_enabled(&self) -> bool {
1504        self.isa_flags.is_forward_edge_cfi_enabled()
1505    }
1506
1507    /// Get the calling convention implemented by this ABI object.
1508    pub fn call_conv(&self, sigs: &SigSet) -> isa::CallConv {
1509        sigs[self.sig].call_conv
1510    }
1511
1512    /// Get the ABI-dependent MachineEnv for managing register allocation.
1513    pub fn machine_env(&self, sigs: &SigSet) -> &MachineEnv {
1514        M::get_machine_env(&self.flags, self.call_conv(sigs))
1515    }
1516
1517    /// The offsets of all sized stack slots (not spill slots) for debuginfo purposes.
1518    pub fn sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
1519        &self.sized_stackslots
1520    }
1521
1522    /// The offsets of all dynamic stack slots (not spill slots) for debuginfo purposes.
1523    pub fn dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32> {
1524        &self.dynamic_stackslots
1525    }
1526
1527    /// Generate an instruction which copies an argument to a destination
1528    /// register.
1529    pub fn gen_copy_arg_to_regs(
1530        &mut self,
1531        sigs: &SigSet,
1532        idx: usize,
1533        into_regs: ValueRegs<Writable<Reg>>,
1534        vregs: &mut VRegAllocator<M::I>,
1535    ) -> SmallInstVec<M::I> {
1536        let mut insts = smallvec![];
1537        let mut copy_arg_slot_to_reg = |slot: &ABIArgSlot, into_reg: &Writable<Reg>| {
1538            match slot {
1539                &ABIArgSlot::Reg { reg, .. } => {
1540                    // Add a preg -> def pair to the eventual `args`
1541                    // instruction.  Extension mode doesn't matter
1542                    // (we're copying out, not in; we ignore high bits
1543                    // by convention).
1544                    let arg = ArgPair {
1545                        vreg: *into_reg,
1546                        preg: reg.into(),
1547                    };
1548                    self.reg_args.push(arg);
1549                }
1550                &ABIArgSlot::Stack {
1551                    offset,
1552                    ty,
1553                    extension,
1554                    ..
1555                } => {
1556                    // However, we have to respect the extension mode for stack
1557                    // slots, or else we grab the wrong bytes on big-endian.
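                    // (E.g., on a big-endian target a small integer widened
                    // into a word-sized slot lives in the slot's
                    // high-addressed bytes; loading the full word, rather
                    // than just the narrow type, is what retrieves it
                    // correctly.)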
1558                    let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1559                    let ty =
1560                        if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
1561                            M::word_type()
1562                        } else {
1563                            ty
1564                        };
1565                    insts.push(M::gen_load_stack(
1566                        StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1567                        *into_reg,
1568                        ty,
1569                    ));
1570                }
1571            }
1572        };
1573
1574        match &sigs.args(self.sig)[idx] {
1575            &ABIArg::Slots { ref slots, .. } => {
1576                assert_eq!(into_regs.len(), slots.len());
1577                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
1578                    copy_arg_slot_to_reg(&slot, &into_reg);
1579                }
1580            }
1581            &ABIArg::StructArg { offset, .. } => {
1582                let into_reg = into_regs.only_reg().unwrap();
1583                // Buffer address is implicitly defined by the ABI.
1584                insts.push(M::gen_get_stack_addr(
1585                    StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1586                    into_reg,
1587                ));
1588            }
1589            &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
1590                let into_reg = into_regs.only_reg().unwrap();
1591                // We need to dereference the pointer.
1592                let base = match &pointer {
1593                    &ABIArgSlot::Reg { reg, ty, .. } => {
1594                        let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
1595                        self.reg_args.push(ArgPair {
1596                            vreg: Writable::from_reg(tmp),
1597                            preg: reg.into(),
1598                        });
1599                        tmp
1600                    }
1601                    &ABIArgSlot::Stack { offset, ty, .. } => {
1602                        let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
1603                            .only_reg()
1604                            .unwrap();
1605                        insts.push(M::gen_load_stack(
1606                            StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1607                            addr_reg,
1608                            ty,
1609                        ));
1610                        addr_reg.to_reg()
1611                    }
1612                };
1613                insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
1614            }
1615        }
1616        insts
1617    }
1618
    /// Generate the instructions which copy a source value's registers into a
    /// return-value slot.
1620    pub fn gen_copy_regs_to_retval(
1621        &self,
1622        sigs: &SigSet,
1623        idx: usize,
1624        from_regs: ValueRegs<Reg>,
1625        vregs: &mut VRegAllocator<M::I>,
1626    ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>) {
1627        let mut reg_pairs = smallvec![];
1628        let mut ret = smallvec![];
1629        let word_bits = M::word_bits() as u8;
1630        match &sigs.rets(self.sig)[idx] {
1631            &ABIArg::Slots { ref slots, .. } => {
1632                assert_eq!(from_regs.len(), slots.len());
1633                for (slot, &from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1634                    match slot {
1635                        &ABIArgSlot::Reg {
1636                            reg, ty, extension, ..
1637                        } => {
1638                            let from_bits = ty_bits(ty) as u8;
1639                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1640                            let vreg = match (ext, from_bits) {
1641                                (ir::ArgumentExtension::Uext, n)
1642                                | (ir::ArgumentExtension::Sext, n)
1643                                    if n < word_bits =>
1644                                {
1645                                    let signed = ext == ir::ArgumentExtension::Sext;
1646                                    let dst =
1647                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
1648                                            .only_reg()
1649                                            .unwrap();
1650                                    ret.push(M::gen_extend(
1651                                        dst, from_reg, signed, from_bits,
1652                                        /* to_bits = */ word_bits,
1653                                    ));
1654                                    dst.to_reg()
1655                                }
1656                                _ => {
1657                                    // No move needed, regalloc2 will emit it using the constraint
1658                                    // added by the RetPair.
1659                                    from_reg
1660                                }
1661                            };
1662                            reg_pairs.push(RetPair {
1663                                vreg,
1664                                preg: Reg::from(reg),
1665                            });
1666                        }
1667                        &ABIArgSlot::Stack {
1668                            offset,
1669                            ty,
1670                            extension,
1671                            ..
1672                        } => {
1673                            let mut ty = ty;
1674                            let from_bits = ty_bits(ty) as u8;
1675                            // A machine ABI implementation should ensure that stack frames
1676                            // have "reasonable" size. All current ABIs for machinst
1677                            // backends (aarch64 and x64) enforce a 128MB limit.
                            let off = i32::try_from(offset).expect(
                                "Argument stack offset greater than 2GB; should hit impl limit first",
                            );
1681                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1682                            // Trash the from_reg; it should be its last use.
1683                            match (ext, from_bits) {
1684                                (ir::ArgumentExtension::Uext, n)
1685                                | (ir::ArgumentExtension::Sext, n)
1686                                    if n < word_bits =>
1687                                {
1688                                    assert_eq!(M::word_reg_class(), from_reg.class());
1689                                    let signed = ext == ir::ArgumentExtension::Sext;
1690                                    let dst =
1691                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
1692                                            .only_reg()
1693                                            .unwrap();
1694                                    ret.push(M::gen_extend(
1695                                        dst, from_reg, signed, from_bits,
1696                                        /* to_bits = */ word_bits,
1697                                    ));
1698                                    // Store the extended version.
1699                                    ty = M::word_type();
1700                                }
1701                                _ => {}
1702                            };
1703                            ret.push(M::gen_store_base_offset(
1704                                self.ret_area_ptr.unwrap(),
1705                                off,
1706                                from_reg,
1707                                ty,
1708                            ));
1709                        }
1710                    }
1711                }
1712            }
1713            ABIArg::StructArg { .. } => {
1714                panic!("StructArg in return position is unsupported");
1715            }
1716            ABIArg::ImplicitPtrArg { .. } => {
1717                panic!("ImplicitPtrArg in return position is unsupported");
1718            }
1719        }
1720        (reg_pairs, ret)
1721    }
1722
    /// Generate any setup instruction needed to save values to the
    /// return-value area. This is usually used when there are multiple return
    /// values, or an otherwise large return value, that must be passed on the
    /// stack; typically the ABI specifies an extra hidden argument that is a
    /// pointer to that memory.
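    ///
    /// For instance (illustrative): a signature with more return values than
    /// the ABI has return registers gets a hidden return-area pointer
    /// argument; the instruction generated here copies that pointer into
    /// `ret_area_ptr` for the stores emitted by `gen_copy_regs_to_retval`.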
1728    pub fn gen_retval_area_setup(
1729        &mut self,
1730        sigs: &SigSet,
1731        vregs: &mut VRegAllocator<M::I>,
1732    ) -> Option<M::I> {
1733        if let Some(i) = sigs[self.sig].stack_ret_arg {
1734            let ret_area_ptr = Writable::from_reg(self.ret_area_ptr.unwrap());
1735            let insts =
1736                self.gen_copy_arg_to_regs(sigs, i.into(), ValueRegs::one(ret_area_ptr), vregs);
1737            insts.into_iter().next().map(|inst| {
1738                trace!(
1739                    "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
1740                    inst,
1741                    ret_area_ptr.to_reg()
1742                );
1743                inst
1744            })
1745        } else {
1746            trace!("gen_retval_area_setup: not needed");
1747            None
1748        }
1749    }
1750
1751    /// Generate a return instruction.
1752    pub fn gen_rets(&self, rets: Vec<RetPair>) -> M::I {
1753        M::gen_rets(rets)
1754    }
1755
1756    /// Produce an instruction that computes a sized stackslot address.
1757    pub fn sized_stackslot_addr(
1758        &self,
1759        slot: StackSlot,
1760        offset: u32,
1761        into_reg: Writable<Reg>,
1762    ) -> M::I {
1763        // Offset from beginning of stackslot area.
1764        let stack_off = self.sized_stackslots[slot] as i64;
1765        let sp_off: i64 = stack_off + (offset as i64);
1766        M::gen_get_stack_addr(StackAMode::Slot(sp_off), into_reg)
1767    }
1768
1769    /// Produce an instruction that computes a dynamic stackslot address.
1770    pub fn dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I {
1771        let stack_off = self.dynamic_stackslots[slot] as i64;
1772        M::gen_get_stack_addr(StackAMode::Slot(stack_off), into_reg)
1773    }
1774
1775    /// Get an `args` pseudo-inst, if any, that should appear at the
1776    /// very top of the function body prior to regalloc.
1777    pub fn take_args(&mut self) -> Option<M::I> {
        if !self.reg_args.is_empty() {
1779            // Very first instruction is an `args` pseudo-inst that
1780            // establishes live-ranges for in-register arguments and
1781            // constrains them at the start of the function to the
1782            // locations defined by the ABI.
1783            Some(M::gen_args(std::mem::take(&mut self.reg_args)))
1784        } else {
1785            None
1786        }
1787    }
1788}
1789
1790/// ### Post-Regalloc Functions
1791///
/// These methods of `Callee` may only be called after regalloc.
1794impl<M: ABIMachineSpec> Callee<M> {
1795    /// Compute the final frame layout, post-regalloc.
1796    ///
    /// This must be called before `gen_prologue` or `gen_epilogue`.
1798    pub fn compute_frame_layout(
1799        &mut self,
1800        sigs: &SigSet,
1801        spillslots: usize,
1802        clobbered: Vec<Writable<RealReg>>,
1803    ) {
1804        let bytes = M::word_bytes();
1805        let total_stacksize = self.stackslots_size + bytes * spillslots as u32;
        let mask = M::stack_align(self.call_conv) - 1;
        let total_stacksize = (total_stacksize + mask) & !mask; // Align to the ABI stack alignment.
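        // For example (illustrative): with a 16-byte stack alignment, a raw
        // size of 40 bytes rounds up to (40 + 15) & !15 = 48 bytes.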
1808        self.frame_layout = Some(M::compute_frame_layout(
1809            self.call_conv,
1810            &self.flags,
1811            self.signature(),
1812            &clobbered,
1813            self.is_leaf,
1814            self.stack_args_size(sigs),
1815            self.tail_args_size,
1816            self.stackslots_size,
1817            total_stacksize,
1818            self.outgoing_args_size,
1819        ));
1820    }
1821
1822    /// Generate a prologue, post-regalloc.
1823    ///
    /// This should include any stack frame or other setup necessary to use
    /// the other methods (`load_arg`, `store_retval`, and spillslot accesses).
1826    pub fn gen_prologue(&self) -> SmallInstVec<M::I> {
1827        let frame_layout = self.frame_layout();
1828        let mut insts = smallvec![];
1829
1830        // Set up frame.
1831        insts.extend(M::gen_prologue_frame_setup(
1832            self.call_conv,
1833            &self.flags,
1834            &self.isa_flags,
1835            &frame_layout,
1836        ));
1837
1838        // The stack limit check needs to cover all the stack adjustments we
1839        // might make, up to the next stack limit check in any function we
1840        // call. Since this happens after frame setup, the current function's
1841        // setup area needs to be accounted for in the caller's stack limit
1842        // check, but we need to account for any setup area that our callees
1843        // might need. Note that s390x may also use the outgoing args area for
1844        // backtrace support even in leaf functions, so that should be accounted
1845        // for unconditionally.
1846        let total_stacksize = (frame_layout.tail_args_size - frame_layout.incoming_args_size)
1847            + frame_layout.clobber_size
1848            + frame_layout.fixed_frame_storage_size
1849            + frame_layout.outgoing_args_size
1850            + if self.is_leaf {
1851                0
1852            } else {
1853                frame_layout.setup_area_size
1854            };
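        // (Illustrative: a non-leaf function with no extra tail-call argument
        // space, 32 bytes of clobber saves, 64 bytes of fixed frame storage,
        // 16 bytes of outgoing args, and a 16-byte setup area checks against
        // 0 + 32 + 64 + 16 + 16 = 128 bytes.)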
1855
        // Leaf functions with zero stack usage don't need a stack check even
        // if one is specified; otherwise, insert the check whenever a stack
        // limit was provided.
1858        if total_stacksize > 0 || !self.is_leaf {
1859            if let Some((reg, stack_limit_load)) = &self.stack_limit {
1860                insts.extend(stack_limit_load.clone());
1861                self.insert_stack_check(*reg, total_stacksize, &mut insts);
1862            }
1863
1864            if self.flags.enable_probestack() {
1865                let guard_size = 1 << self.flags.probestack_size_log2();
1866                match self.flags.probestack_strategy() {
1867                    ProbestackStrategy::Inline => M::gen_inline_probestack(
1868                        &mut insts,
1869                        self.call_conv,
1870                        total_stacksize,
1871                        guard_size,
1872                    ),
1873                    ProbestackStrategy::Outline => {
1874                        if total_stacksize >= guard_size {
1875                            M::gen_probestack(&mut insts, total_stacksize);
1876                        }
1877                    }
1878                }
1879            }
1880        }
1881
1882        // Save clobbered registers.
1883        insts.extend(M::gen_clobber_save(
1884            self.call_conv,
1885            &self.flags,
1886            &frame_layout,
1887        ));
1888
1889        insts
1890    }
1891
1892    /// Generate an epilogue, post-regalloc.
1893    ///
1894    /// Note that this must generate the actual return instruction (rather than
1895    /// emitting this in the lowering logic), because the epilogue code comes
1896    /// before the return and the two are likely closely related.
1897    pub fn gen_epilogue(&self) -> SmallInstVec<M::I> {
1898        let frame_layout = self.frame_layout();
1899        let mut insts = smallvec![];
1900
1901        // Restore clobbered registers.
1902        insts.extend(M::gen_clobber_restore(
1903            self.call_conv,
1904            &self.flags,
1905            &frame_layout,
1906        ));
1907
1908        // Tear down frame.
1909        insts.extend(M::gen_epilogue_frame_restore(
1910            self.call_conv,
1911            &self.flags,
1912            &self.isa_flags,
1913            &frame_layout,
1914        ));
1915
1916        // And return.
1917        insts.extend(M::gen_return(
1918            self.call_conv,
1919            &self.isa_flags,
1920            &frame_layout,
1921        ));
1922
1923        trace!("Epilogue: {:?}", insts);
1924        insts
1925    }
1926
1927    /// Return a reference to the computed frame layout information. This
1928    /// function will panic if it's called before [`Self::compute_frame_layout`].
1929    pub fn frame_layout(&self) -> &FrameLayout {
1930        self.frame_layout
1931            .as_ref()
1932            .expect("frame layout not computed before prologue generation")
1933    }
1934
    /// Returns the full frame size for the given function, after prologue
    /// emission has run. This comprises the spill slots and stack-storage
    /// slots as well as storage for clobbered callee-save registers, but
    /// not arguments pushed at callsites within this function, or other
    /// ephemeral pushes.
1940    pub fn frame_size(&self) -> u32 {
1941        let frame_layout = self.frame_layout();
1942        frame_layout.clobber_size + frame_layout.fixed_frame_storage_size
1943    }
1944
1945    /// Returns offset from the slot base in the current frame to the caller's SP.
1946    pub fn slot_base_to_caller_sp_offset(&self) -> u32 {
1947        let frame_layout = self.frame_layout();
1948        frame_layout.clobber_size
1949            + frame_layout.fixed_frame_storage_size
1950            + frame_layout.setup_area_size
1951    }
1952
1953    /// Returns the size of arguments expected on the stack.
1954    pub fn stack_args_size(&self, sigs: &SigSet) -> u32 {
1955        sigs[self.sig].sized_stack_arg_space
1956    }
1957
1958    /// Get the spill-slot size.
1959    pub fn get_spillslot_size(&self, rc: RegClass) -> u32 {
        let max = if self.dynamic_type_sizes.is_empty() {
            16
        } else {
            *self.dynamic_type_sizes.values().max().unwrap()
        };
1970        M::get_number_of_spillslots_for_value(rc, max, &self.isa_flags)
1971    }
1972
1973    /// Get the spill slot offset relative to the fixed allocation area start.
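    ///
    /// For example (illustrative, 64-bit target): with 32 bytes of sized
    /// stack slots, spill slot 3 lives at offset 32 + 3 * 8 = 56 from the
    /// slot-area base.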
1974    pub fn get_spillslot_offset(&self, slot: SpillSlot) -> i64 {
1975        // Offset from beginning of spillslot area.
1976        let islot = slot.index() as i64;
1977        let spill_off = islot * M::word_bytes() as i64;
        self.stackslots_size as i64 + spill_off
1981    }
1982
1983    /// Generate a spill.
1984    pub fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I {
1985        let ty = M::I::canonical_type_for_rc(from_reg.class());
1986        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
1987
1988        let sp_off = self.get_spillslot_offset(to_slot);
1989        trace!("gen_spill: {from_reg:?} into slot {to_slot:?} at offset {sp_off}");
1990
        let to = StackAMode::Slot(sp_off);
        <M>::gen_store_stack(to, Reg::from(from_reg), ty)
1993    }
1994
1995    /// Generate a reload (fill).
1996    pub fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I {
1997        let ty = M::I::canonical_type_for_rc(to_reg.to_reg().class());
1998        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
1999
2000        let sp_off = self.get_spillslot_offset(from_slot);
2001        trace!("gen_reload: {to_reg:?} from slot {from_slot:?} at offset {sp_off}");
2002
2003        let from = StackAMode::Slot(sp_off);
2004        <M>::gen_load_stack(from, to_reg.map(Reg::from), ty)
2005    }
2006}
2007
2008/// An input argument to a call instruction: the vreg that is used,
2009/// and the preg it is constrained to (per the ABI).
2010#[derive(Clone, Debug)]
2011pub struct CallArgPair {
2012    /// The virtual register to use for the argument.
2013    pub vreg: Reg,
2014    /// The real register into which the arg goes.
2015    pub preg: Reg,
2016}
2017
2018/// An output return value from a call instruction: the vreg that is
2019/// defined, and the preg or stack location it is constrained to (per
2020/// the ABI).
2021#[derive(Clone, Debug)]
2022pub struct CallRetPair {
2023    /// The virtual register to define from this return value.
2024    pub vreg: Writable<Reg>,
2025    /// The real register from which the return value is read.
2026    pub location: RetLocation,
2027}
2028
2029/// A location to load a return-value from after a call completes.
2030#[derive(Clone, Debug, PartialEq, Eq)]
2031pub enum RetLocation {
2032    /// A physical register.
2033    Reg(Reg, Type),
2034    /// A stack location, identified by a `StackAMode`.
2035    Stack(StackAMode, Type),
2036}
2037
2038pub type CallArgList = SmallVec<[CallArgPair; 8]>;
2039pub type CallRetList = SmallVec<[CallRetPair; 8]>;
2040
2041pub enum IsTailCall {
2042    Yes,
2043    No,
2044}
2045
2046/// ABI object for a callsite.
2047pub struct CallSite<M: ABIMachineSpec> {
2048    /// The called function's signature.
2049    sig: Sig,
2050    /// All register uses for the callsite, i.e., function args, with
2051    /// VReg and the physical register it is constrained to.
2052    uses: CallArgList,
2053    /// All defs for the callsite, i.e., return values.
2054    defs: CallRetList,
2055    /// Call destination.
2056    dest: CallDest,
2057    is_tail_call: IsTailCall,
2058    /// Caller's calling convention.
2059    caller_conv: isa::CallConv,
2060    /// The settings controlling this compilation.
2061    flags: settings::Flags,
2062
2063    _mach: PhantomData<M>,
2064}
2065
2066/// Destination for a call.
2067#[derive(Debug, Clone)]
2068pub enum CallDest {
2069    /// Call to an ExtName (named function symbol).
2070    ExtName(ir::ExternalName, RelocDistance),
2071    /// Indirect call to a function pointer in a register.
2072    Reg(Reg),
2073}
2074
2075impl<M: ABIMachineSpec> CallSite<M> {
2076    /// Create a callsite ABI object for a call directly to the specified function.
2077    pub fn from_func(
2078        sigs: &SigSet,
2079        sig_ref: ir::SigRef,
2080        extname: &ir::ExternalName,
2081        is_tail_call: IsTailCall,
2082        dist: RelocDistance,
2083        caller_conv: isa::CallConv,
2084        flags: settings::Flags,
2085    ) -> CallSite<M> {
2086        let sig = sigs.abi_sig_for_sig_ref(sig_ref);
2087        CallSite {
2088            sig,
2089            uses: smallvec![],
2090            defs: smallvec![],
2091            dest: CallDest::ExtName(extname.clone(), dist),
2092            is_tail_call,
2093            caller_conv,
2094            flags,
2095            _mach: PhantomData,
2096        }
2097    }
2098
2099    /// Create a callsite ABI object for a call directly to the specified
2100    /// libcall.
2101    pub fn from_libcall(
2102        sigs: &SigSet,
2103        sig: &ir::Signature,
2104        extname: &ir::ExternalName,
2105        dist: RelocDistance,
2106        caller_conv: isa::CallConv,
2107        flags: settings::Flags,
2108    ) -> CallSite<M> {
2109        let sig = sigs.abi_sig_for_signature(sig);
2110        CallSite {
2111            sig,
2112            uses: smallvec![],
2113            defs: smallvec![],
2114            dest: CallDest::ExtName(extname.clone(), dist),
2115            is_tail_call: IsTailCall::No,
2116            caller_conv,
2117            flags,
2118            _mach: PhantomData,
2119        }
2120    }
2121
2122    /// Create a callsite ABI object for a call to a function pointer with the
2123    /// given signature.
2124    pub fn from_ptr(
2125        sigs: &SigSet,
2126        sig_ref: ir::SigRef,
2127        ptr: Reg,
2128        is_tail_call: IsTailCall,
2129        caller_conv: isa::CallConv,
2130        flags: settings::Flags,
2131    ) -> CallSite<M> {
2132        let sig = sigs.abi_sig_for_sig_ref(sig_ref);
2133        CallSite {
2134            sig,
2135            uses: smallvec![],
2136            defs: smallvec![],
2137            dest: CallDest::Reg(ptr),
2138            is_tail_call,
2139            caller_conv,
2140            flags,
2141            _mach: PhantomData,
2142        }
2143    }
2144
2145    pub(crate) fn dest(&self) -> &CallDest {
2146        &self.dest
2147    }
2148
2149    pub(crate) fn take_uses(self) -> CallArgList {
2150        self.uses
2151    }
2152
2153    pub(crate) fn sig<'a>(&self, sigs: &'a SigSet) -> &'a SigData {
2154        &sigs[self.sig]
2155    }
2156
2157    pub(crate) fn is_tail_call(&self) -> bool {
2158        matches!(self.is_tail_call, IsTailCall::Yes)
2159    }
2160}
2161
2162impl<M: ABIMachineSpec> CallSite<M> {
2163    /// Get the number of arguments expected.
2164    pub fn num_args(&self, sigs: &SigSet) -> usize {
2165        sigs.num_args(self.sig)
2166    }
2167
2168    /// Get the number of return values expected.
2169    pub fn num_rets(&self, sigs: &SigSet) -> usize {
2170        sigs.num_rets(self.sig)
2171    }
2172
2173    /// Emit a copy of a large argument into its associated stack buffer, if
2174    /// any.  We must be careful to perform all these copies (as necessary)
2175    /// before setting up the argument registers, since we may have to invoke
2176    /// memcpy(), which could clobber any registers already set up.  The
    /// back-end should call this routine for all arguments before calling
    /// `gen_arg` for any of them.
2179    pub fn emit_copy_regs_to_buffer(
2180        &self,
2181        ctx: &mut Lower<M::I>,
2182        idx: usize,
2183        from_regs: ValueRegs<Reg>,
2184    ) {
2185        match &ctx.sigs().args(self.sig)[idx] {
2186            &ABIArg::Slots { .. } | &ABIArg::ImplicitPtrArg { .. } => {}
2187            &ABIArg::StructArg { offset, size, .. } => {
2188                let src_ptr = from_regs.only_reg().unwrap();
2189                let dst_ptr = ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2190                ctx.emit(M::gen_get_stack_addr(
2191                    StackAMode::OutgoingArg(offset),
2192                    dst_ptr,
2193                ));
2194                // Emit a memcpy from `src_ptr` to `dst_ptr` of `size` bytes.
2195                // N.B.: because we process StructArg params *first*, this is
2196                // safe w.r.t. clobbers: we have not yet filled in any other
2197                // arg regs.
2198                let memcpy_call_conv =
2199                    isa::CallConv::for_libcall(&self.flags, ctx.sigs()[self.sig].call_conv);
2200                for insn in M::gen_memcpy(
2201                    memcpy_call_conv,
2202                    dst_ptr.to_reg(),
2203                    src_ptr,
2204                    size as usize,
2205                    |ty| ctx.alloc_tmp(ty).only_reg().unwrap(),
2206                )
2207                .into_iter()
2208                {
2209                    ctx.emit(insn);
2210                }
2211            }
2212        }
2213    }
2214
    /// Add a constraint for an argument value from a source register.
    /// For large arguments with an associated stack buffer, this may
    /// load the address of the buffer into the argument register, if
    /// required by the ABI.
2219    pub fn gen_arg(&mut self, ctx: &mut Lower<M::I>, idx: usize, from_regs: ValueRegs<Reg>) {
2220        let stack_arg_space = ctx.sigs()[self.sig].sized_stack_arg_space;
2221        let stack_arg = if self.is_tail_call() {
2222            StackAMode::IncomingArg
2223        } else {
2224            |offset, _| StackAMode::OutgoingArg(offset)
2225        };
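        // For tail calls, stack arguments overwrite our own incoming argument
        // area, since the callee reuses our frame; for ordinary calls they go
        // into the outgoing-argument area at the bottom of our frame.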
2226        let word_rc = M::word_reg_class();
2227        let word_bits = M::word_bits() as usize;
2228
2229        match ctx.sigs().args(self.sig)[idx].clone() {
2230            ABIArg::Slots { ref slots, .. } => {
2231                assert_eq!(from_regs.len(), slots.len());
2232                for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
2233                    match slot {
2234                        &ABIArgSlot::Reg {
2235                            reg, ty, extension, ..
2236                        } => {
2237                            let ext = M::get_ext_mode(ctx.sigs()[self.sig].call_conv, extension);
2238                            let vreg =
2239                                if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
2240                                    assert_eq!(word_rc, reg.class());
2241                                    let signed = match ext {
2242                                        ir::ArgumentExtension::Uext => false,
2243                                        ir::ArgumentExtension::Sext => true,
2244                                        _ => unreachable!(),
2245                                    };
2246                                    let extend_result =
2247                                        ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2248                                    ctx.emit(M::gen_extend(
2249                                        extend_result,
2250                                        *from_reg,
2251                                        signed,
2252                                        ty_bits(ty) as u8,
2253                                        word_bits as u8,
2254                                    ));
2255                                    extend_result.to_reg()
2256                                } else {
2257                                    *from_reg
2258                                };
2259
2260                            let preg = reg.into();
2261                            self.uses.push(CallArgPair { vreg, preg });
2262                        }
2263                        &ABIArgSlot::Stack {
2264                            offset,
2265                            ty,
2266                            extension,
2267                            ..
2268                        } => {
2269                            let ext = M::get_ext_mode(ctx.sigs()[self.sig].call_conv, extension);
2270                            let (data, ty) =
2271                                if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
2272                                    assert_eq!(word_rc, from_reg.class());
2273                                    let signed = match ext {
2274                                        ir::ArgumentExtension::Uext => false,
2275                                        ir::ArgumentExtension::Sext => true,
2276                                        _ => unreachable!(),
2277                                    };
2278                                    let extend_result =
2279                                        ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2280                                    ctx.emit(M::gen_extend(
2281                                        extend_result,
2282                                        *from_reg,
2283                                        signed,
2284                                        ty_bits(ty) as u8,
2285                                        word_bits as u8,
2286                                    ));
2287                                    // Store the extended version.
2288                                    (extend_result.to_reg(), M::word_type())
2289                                } else {
2290                                    (*from_reg, ty)
2291                                };
2292                            ctx.emit(M::gen_store_stack(
2293                                stack_arg(offset, stack_arg_space),
2294                                data,
2295                                ty,
2296                            ));
2297                        }
2298                    }
2299                }
2300            }
2301            ABIArg::StructArg { .. } => {
2302                // Only supported via ISLE.
2303            }
2304            ABIArg::ImplicitPtrArg {
2305                offset,
2306                pointer,
2307                ty,
2308                purpose: _,
2309            } => {
2310                assert_eq!(from_regs.len(), 1);
2311                let vreg = from_regs.regs()[0];
2312                let amode = StackAMode::OutgoingArg(offset);
2313                let tmp = ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
2314                ctx.emit(M::gen_get_stack_addr(amode, tmp));
2315                let tmp = tmp.to_reg();
2316                ctx.emit(M::gen_store_base_offset(tmp, 0, vreg, ty));
2317                match pointer {
2318                    ABIArgSlot::Reg { reg, .. } => self.uses.push(CallArgPair {
2319                        vreg: tmp,
2320                        preg: reg.into(),
2321                    }),
2322                    ABIArgSlot::Stack { offset, .. } => ctx.emit(M::gen_store_stack(
2323                        stack_arg(offset, stack_arg_space),
2324                        tmp,
2325                        M::word_type(),
2326                    )),
2327                }
2328            }
2329        }
2330    }
2331
2332    /// Call `gen_arg` for each non-hidden argument and emit all instructions
2333    /// generated.
2334    pub fn emit_args(&mut self, ctx: &mut Lower<M::I>, (inputs, off): isle::ValueSlice) {
2335        let num_args = self.num_args(ctx.sigs());
2336        assert_eq!(inputs.len(&ctx.dfg().value_lists) - off, num_args);
2337
2338        let mut arg_value_regs: SmallVec<[_; 16]> = smallvec![];
2339        for i in 0..num_args {
2340            let input = inputs.get(off + i, &ctx.dfg().value_lists).unwrap();
2341            arg_value_regs.push(ctx.put_value_in_regs(input));
2342        }
2343        for (i, arg_regs) in arg_value_regs.iter().enumerate() {
2344            self.emit_copy_regs_to_buffer(ctx, i, *arg_regs);
2345        }
2346        for (i, value_regs) in arg_value_regs.iter().enumerate() {
2347            self.gen_arg(ctx, i, *value_regs);
2348        }
2349    }
2350
2351    /// Emit the code to forward a stack-return pointer argument through a tail
2352    /// call.
2353    pub fn emit_stack_ret_arg_for_tail_call(&mut self, ctx: &mut Lower<M::I>) {
2354        if let Some(i) = ctx.sigs()[self.sig].stack_ret_arg() {
2355            let ret_area_ptr = ctx.abi().ret_area_ptr.expect(
2356                "if the tail callee has a return pointer, then the tail caller \
2357                 must as well",
2358            );
2359            self.gen_arg(ctx, i.into(), ValueRegs::one(ret_area_ptr));
2360        }
2361    }
2362
2363    /// Define a return value after the call returns.
2364    pub fn gen_retval(&mut self, ctx: &mut Lower<M::I>, idx: usize) -> ValueRegs<Reg> {
2365        let mut into_regs: SmallVec<[Reg; 2]> = smallvec![];
2366        let ret = ctx.sigs().rets(self.sig)[idx].clone();
2367        match ret {
2368            ABIArg::Slots { ref slots, .. } => {
2369                for slot in slots {
2370                    match slot {
2371                        // Extension mode doesn't matter because we're copying out, not in,
2372                        // and we ignore high bits in our own registers by convention.
2373                        &ABIArgSlot::Reg { reg, ty, .. } => {
2374                            let into_reg = ctx.alloc_tmp(ty).only_reg().unwrap();
2375                            self.defs.push(CallRetPair {
2376                                vreg: into_reg,
2377                                location: RetLocation::Reg(reg.into(), ty),
2378                            });
2379                            into_regs.push(into_reg.to_reg());
2380                        }
2381                        &ABIArgSlot::Stack { offset, ty, .. } => {
2382                            let into_reg = ctx.alloc_tmp(ty).only_reg().unwrap();
2383                            let sig_data = &ctx.sigs()[self.sig];
2384                            // The outgoing argument area must always be restored after a call,
2385                            // ensuring that the return values will be in a consistent place after
2386                            // any call.
2387                            let ret_area_base = sig_data.sized_stack_arg_space();
2388                            let amode = StackAMode::OutgoingArg(offset + ret_area_base);
2389                            self.defs.push(CallRetPair {
2390                                vreg: into_reg,
2391                                location: RetLocation::Stack(amode, ty),
2392                            });
2393                            into_regs.push(into_reg.to_reg());
2394                        }
2395                    }
2396                }
2397            }
2398            ABIArg::StructArg { .. } => {
2399                panic!("StructArg not supported in return position");
2400            }
2401            ABIArg::ImplicitPtrArg { .. } => {
2402                panic!("ImplicitPtrArg not supported in return position");
2403            }
2404        }
2405
        match *into_regs {
            [a] => ValueRegs::one(a),
            [a, b] => ValueRegs::two(a, b),
            _ => panic!("Expected to see one or two slots only from {ret:?}"),
        }
2412    }
2413
2414    /// Emit the call itself.
2415    ///
2416    /// The returned instruction should have proper use- and def-sets according
2417    /// to the argument registers, return-value registers, and clobbered
2418    /// registers for this function signature in this ABI.
2419    ///
2420    /// (Arg registers are uses, and retval registers are defs. Clobbered
2421    /// registers are also logically defs, but should never be read; their
2422    /// values are "defined" (to the regalloc) but "undefined" in every other
2423    /// sense.)
2424    ///
2425    /// This function should only be called once, as it is allowed to re-use
2426    /// parts of the `CallSite` object in emitting instructions.
2427    pub fn emit_call(
2428        &mut self,
2429        ctx: &mut Lower<M::I>,
2430        try_call_info: Option<(ExceptionTable, &[MachLabel])>,
2431    ) {
2432        let word_type = M::word_type();
2433        if let Some(i) = ctx.sigs()[self.sig].stack_ret_arg {
2434            let rd = ctx.alloc_tmp(word_type).only_reg().unwrap();
2435            let ret_area_base = ctx.sigs()[self.sig].sized_stack_arg_space();
2436            ctx.emit(M::gen_get_stack_addr(
2437                StackAMode::OutgoingArg(ret_area_base),
2438                rd,
2439            ));
2440            self.gen_arg(ctx, i.into(), ValueRegs::one(rd.to_reg()));
2441        }
2442
2443        let uses = mem::take(&mut self.uses);
2444        let mut defs = mem::take(&mut self.defs);
2445
2446        let sig = &ctx.sigs()[self.sig];
2447        let callee_pop_size = if sig.call_conv() == isa::CallConv::Tail {
2448            // The tail calling convention has callees pop stack arguments.
2449            sig.sized_stack_arg_space
2450        } else {
2451            0
2452        };
2453
2454        let call_conv = sig.call_conv;
2455        let ret_space = sig.sized_stack_ret_space;
2456        let arg_space = sig.sized_stack_arg_space;
2457
2458        ctx.abi_mut()
2459            .accumulate_outgoing_args_size(ret_space + arg_space);
2460
2461        let tmp = ctx.alloc_tmp(word_type).only_reg().unwrap();
2462
2463        let try_call_info = try_call_info.map(|(et, labels)| {
2464            let exception_dests = ctx.dfg().exception_tables[et]
2465                .catches()
2466                .map(|(tag, _)| tag.into())
2467                .zip(labels.iter().cloned())
2468                .collect::<Vec<_>>()
2469                .into_boxed_slice();
2470
2471            // We need to update `defs` to contain the exception
2472            // payload regs as well. We have two sources of info that
2473            // we join:
2474            //
2475            // - The machine-specific ABI implementation `M`, which
2476            //   tells us the particular registers that payload values
2477            //   must be in
2478            // - The passed-in lowering context, which gives us the
2479            //   vregs we must define.
2480            //
2481            // Note that payload values may need to end up in the same
2482            // physical registers as ordinary return values; this is
2483            // not a conflict, because we either get one or the
2484            // other. For regalloc's purposes, we define both starting
2485            // here at the callsite, but we can share one def in the
2486            // `defs` list and alias one vreg to another. Thus we
2487            // handle the two cases below for each payload register:
2488            // overlaps a return value (and we alias to it) or not
2489            // (and we add a def).
2490            let pregs = M::exception_payload_regs(call_conv);
2491            for (i, &preg) in pregs.iter().enumerate() {
2492                let vreg = ctx.try_call_exception_defs(ctx.cur_inst())[i];
2493                if let Some(existing) = defs.iter().find(|def| match def.location {
2494                    RetLocation::Reg(r, _) => r == preg,
2495                    _ => false,
2496                }) {
2497                    ctx.vregs_mut()
2498                        .set_vreg_alias(vreg.to_reg(), existing.vreg.to_reg());
2499                } else {
2500                    defs.push(CallRetPair {
2501                        vreg,
2502                        location: RetLocation::Reg(preg, M::word_type()),
2503                    });
2504                }
2505            }
2506
2507            TryCallInfo {
2508                continuation: *labels.last().unwrap(),
2509                exception_dests,
2510            }
2511        });
2512
2513        let clobbers = {
2514            // Get clobbers: all caller-saves. These may include return value
2515            // regs, which we will remove from the clobber set below.
2516            let mut clobbers = <M>::get_regs_clobbered_by_call(
2517                ctx.sigs()[self.sig].call_conv,
2518                try_call_info.is_some(),
2519            );
2520
2521            // Remove retval regs from clobbers.
2522            for def in &defs {
2523                if let RetLocation::Reg(preg, _) = def.location {
2524                    clobbers.remove(PReg::from(preg.to_real_reg().unwrap()));
2525                }
2526            }
2527
2528            clobbers
2529        };
2530
2531        // Any adjustment to SP to account for required outgoing arguments/stack return values must
2532        // be done inside of the call pseudo-op, to ensure that SP is always in a consistent
2533        // state for all other instructions. For example, if a tail-call abi function is called
2534        // here, the reclamation of the outgoing argument area must be done inside of the call
2535        // pseudo-op's emission to ensure that SP is consistent at all other points in the lowered
2536        // function. (Except the prologue and epilogue, but those are fairly special parts of the
2537        // function that establish the SP invariants that are relied on elsewhere and are generated
2538        // after the register allocator has run and thus cannot have register allocator-inserted
2539        // references to SP offsets.)
2540        for inst in M::gen_call(
2541            &self.dest,
2542            tmp,
2543            CallInfo {
2544                dest: (),
2545                uses,
2546                defs,
2547                clobbers,
2548                callee_conv: call_conv,
2549                caller_conv: self.caller_conv,
2550                callee_pop_size,
2551                try_call_info,
2552            },
2553        )
2554        .into_iter()
2555        {
2556            ctx.emit(inst);
2557        }
2558    }
2559}
2560
2561impl<T> CallInfo<T> {
2562    /// Emit loads for any stack-carried return values using the call
2563    /// info and allocations.
2564    pub fn emit_retval_loads<
2565        M: ABIMachineSpec,
2566        EmitFn: FnMut(M::I),
2567        IslandFn: Fn(u32) -> Option<M::I>,
2568    >(
2569        &self,
2570        stackslots_size: u32,
2571        mut emit: EmitFn,
2572        emit_island: IslandFn,
2573    ) {
2574        // Count stack-ret locations and emit an island to account for
2575        // this space usage.
2576        let mut space_needed = 0;
2577        for CallRetPair { location, .. } in &self.defs {
2578            if let RetLocation::Stack(..) = location {
2579                // Assume up to ten instructions, semi-arbitrarily:
2580                // load from stack, store to spillslot, codegen of
2581                // large offsets on RISC ISAs.
2582                space_needed += 10 * M::I::worst_case_size();
2583            }
2584        }
2585        if space_needed > 0 {
2586            if let Some(island_inst) = emit_island(space_needed) {
2587                emit(island_inst);
2588            }
2589        }
2590
2591        let temp = M::retval_temp_reg(self.callee_conv);
2592        // The temporary must be noted as clobbered.
2593        debug_assert!(M::get_regs_clobbered_by_call(
2594            self.callee_conv,
2595            self.try_call_info.is_some()
2596        )
2597        .contains(PReg::from(temp.to_reg().to_real_reg().unwrap())));
2598
2599        for CallRetPair { vreg, location } in &self.defs {
2600            match location {
2601                RetLocation::Reg(preg, ..) => {
2602                    // The temporary must not also be an actual return
2603                    // value register.
2604                    debug_assert!(*preg != temp.to_reg());
2605                }
2606                RetLocation::Stack(amode, ty) => {
2607                    if let Some(spillslot) = vreg.to_reg().to_spillslot() {
2608                        // `temp` is an integer register of machine word
2609                        // width, but `ty` may be floating-point/vector,
2610                        // which (i) may not be loadable directly into an
2611                        // int reg, and (ii) may be wider than a machine
2612                        // word. For simplicity, and because there are not
2613                        // always easy choices for volatile float/vec regs
2614                        // (see e.g. x86-64, where fastcall clobbers only
2615                        // xmm0-xmm5, but tail uses xmm0-xmm7 for
2616                        // returns), we use the integer temp register in
2617                        // steps.
2618                        let parts = (ty.bytes() + M::word_bytes() - 1) / M::word_bytes();
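                        // (E.g., a 16-byte vector return value on a target
                        // with 8-byte words is copied in two word-sized
                        // steps.)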
2619                        for part in 0..parts {
2620                            emit(M::gen_load_stack(
2621                                amode.offset_by(part * M::word_bytes()),
2622                                temp,
2623                                M::word_type(),
2624                            ));
2625                            emit(M::gen_store_stack(
2626                                StackAMode::Slot(
2627                                    i64::from(stackslots_size)
2628                                        + i64::from(M::word_bytes())
2629                                            * ((spillslot.index() as i64) + (part as i64)),
2630                                ),
2631                                temp.to_reg(),
2632                                M::word_type(),
2633                            ));
2634                        }
2635                    } else {
2636                        assert_ne!(*vreg, temp);
2637                        emit(M::gen_load_stack(*amode, *vreg, *ty));
2638                    }
2639                }
2640            }
2641        }
2642    }
2643}
2644
2645#[cfg(test)]
2646mod tests {
2647    use super::SigData;
2648
2649    #[test]
2650    fn sig_data_size() {
2651        // The size of `SigData` is performance sensitive, so make sure
2652        // we don't regress it unintentionally.
2653        assert_eq!(std::mem::size_of::<SigData>(), 24);
2654    }
2655}