Calling Conventions

Basics
A calling convention is a low-level standard scheme for how subroutines receive parameters from their caller and how they return a result.
It specifies the following:
- the order in which scalar parameters are allocated;
- how parameters are passed (pushed on the stack, placed in registers, or a mix of both);
- which registers the callee must preserve for the caller;
- and how the task of preparing the stack for, and restoring after, a function call is divided between the caller and the callee.
Calling conventions should not be confused with Application Binary Interfaces (ABIs): while they are related, an ABI is a broader concept than a calling convention and encompasses all binary-level interfaces that ensure compatibility between independently compiled binaries (e.g., shared libraries and executables). An ABI specifies calling conventions, executable formats, how user-space and kernel-space programs interact, and so on.
The stack (or call stack) is a contiguous area of memory. It is used to store local variables and pass additional arguments to subroutines when there is not enough space in the argument registers.
Intel x86 Calling Conventions
cdecl
The subroutine arguments are passed on the stack. Integer values and memory addresses are returned in the EAX register. In the context of the C programming language, function arguments are pushed on the stack in reverse (right-to-left) order. The caller cleans the stack after the function call returns.
cdecl is used by many compilers for the x86 32-bit Intel architecture, and there are some variations of it.
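To make the push order and the caller's cleanup concrete, here is a minimal sketch in C that models a 32-bit stack as a small array. It is a toy model, not real machine state: the names toy_stack, cdecl_push_call, cdecl_arg, toy_ret and cdecl_caller_cleanup are made up for this illustration, and every argument is assumed to occupy one 4-byte slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_STACK_SLOTS = 16 };

/* Toy model of a downward-growing 32-bit stack (hypothetical, for illustration). */
typedef struct {
    uint32_t slots[TOY_STACK_SLOTS]; /* slots[sp] is the current top of the stack */
    size_t sp;                       /* TOY_STACK_SLOTS means the stack is empty */
} toy_stack;

static void toy_stack_init(toy_stack *s) { s->sp = TOY_STACK_SLOTS; }

static void toy_push(toy_stack *s, uint32_t value) {
    assert(s->sp > 0); /* the toy stack has a fixed capacity */
    s->slots[--s->sp] = value;
}

/* Caller side: push the arguments right to left, then the return address
   (which the CALL instruction would push on real hardware). */
static void cdecl_push_call(toy_stack *s, const uint32_t *args, size_t nargs,
                            uint32_t return_address) {
    for (size_t i = nargs; i > 0; --i)
        toy_push(s, args[i - 1]);
    toy_push(s, return_address);
}

/* Callee side: argument i (0-based) sits at [ESP + 4 + 4*i] on entry,
   which is slot sp + 1 + i in this model. */
static uint32_t cdecl_arg(const toy_stack *s, size_t i) {
    assert(s->sp + 1 + i < (size_t)TOY_STACK_SLOTS);
    return s->slots[s->sp + 1 + i];
}

/* RET pops only the return address... */
static void toy_ret(toy_stack *s) { s->sp += 1; }

/* ...so the caller removes the arguments afterwards (ADD ESP, 4*nargs). */
static void cdecl_caller_cleanup(toy_stack *s, size_t nargs) { s->sp += nargs; }
```

Because the arguments are pushed right to left, the first argument always ends up directly above the return address, whatever the argument count. That is what lets variadic functions such as printf work with cdecl.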
stdcall
In the pascal calling convention (historically implemented by Turbo Pascal compilers), the parameters are pushed on the stack in left-to-right order (which is the opposite of cdecl), and the callee is responsible for balancing the stack before returning.
stdcall is a variation of the pascal calling convention in which the callee is responsible for cleaning up the stack, but the parameters are pushed on the stack in right-to-left order, as for cdecl. Return values are stored in the EAX register.
stdcall is the standard calling convention for the Microsoft Win32 API.
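The three stack-based conventions seen so far differ in only two properties: the push order and who cleans the stack. The sketch below writes them down as a small C table. The type and function names are invented for this example, and each argument is assumed to take one 4-byte stack slot.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PUSH_RIGHT_TO_LEFT, PUSH_LEFT_TO_RIGHT } push_order;
typedef enum { CALLER_CLEANS, CALLEE_CLEANS } stack_cleanup;

/* Hypothetical description of a stack-based x86 convention. */
typedef struct {
    const char *name;
    push_order order;
    stack_cleanup cleanup;
} convention;

static const convention CONVENTIONS[] = {
    { "cdecl",   PUSH_RIGHT_TO_LEFT, CALLER_CLEANS },
    { "pascal",  PUSH_LEFT_TO_RIGHT, CALLEE_CLEANS },
    { "stdcall", PUSH_RIGHT_TO_LEFT, CALLEE_CLEANS },
};

/* Bytes the callee removes when it returns (the operand of RET n). */
static unsigned callee_pop_bytes(const convention *c, unsigned nargs) {
    assert(c != NULL);
    return c->cleanup == CALLEE_CLEANS ? 4u * nargs : 0u;
}

/* Bytes the caller removes after the call (ADD ESP, n). */
static unsigned caller_pop_bytes(const convention *c, unsigned nargs) {
    assert(c != NULL);
    return c->cleanup == CALLER_CLEANS ? 4u * nargs : 0u;
}
```

For a three-argument function, a stdcall callee returns with RET 12 while a cdecl caller follows the call with ADD ESP, 12. The callee can only do the cleanup when it knows the argument count at compile time.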
fastcall and thiscall
fastcall passes the first two arguments that fit in a 32-bit register (scanning the argument list left to right) in ECX and EDX. The remaining arguments are pushed onto the stack from right to left.
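As a rough illustration of that rule, the following C sketch decides where each argument of a fastcall function travels. It is a simplification: arguments are described only by their size in bytes, anything of 4 bytes or fewer is assumed to fit in a register, and the names fastcall_loc and fastcall_assign are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LOC_ECX, LOC_EDX, LOC_STACK } fastcall_loc;

/* Walk the argument list left to right. The first two arguments that fit
   in a 32-bit register take ECX and EDX; everything else goes on the stack.
   sizes[i] is the size of argument i in bytes (a simplifying assumption). */
static void fastcall_assign(const size_t *sizes, size_t nargs, fastcall_loc *out) {
    assert(nargs == 0 || (sizes != NULL && out != NULL));
    int registers_used = 0;
    for (size_t i = 0; i < nargs; ++i) {
        if (registers_used < 2 && sizes[i] <= 4) {
            out[i] = (registers_used == 0) ? LOC_ECX : LOC_EDX;
            ++registers_used;
        } else {
            out[i] = LOC_STACK;
        }
    }
}
```

For a signature with sizes 4, 8, 4, 4, the 8-byte argument is skipped: the first argument goes to ECX, the third to EDX, and the second and fourth go on the stack.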
thiscall is used for calling C++ non-static member functions. There are different variants that depend on the compiler:
- For the GCC C++ compiler, thiscall is almost identical to cdecl: the caller cleans the stack, and the parameters are passed in right-to-left order. A difference is the addition of the this pointer, which is pushed onto the stack last, as if it were the first parameter in the function prototype.
- With the Microsoft Visual C++ compiler, the this pointer is passed in ECX and it is the callee that cleans the stack, mirroring the stdcall convention used in C for this compiler and in Windows API functions.
x86-64 on Microsoft Platforms
This calling convention uses registers RCX, RDX, R8, R9 for the first four integer or pointer arguments (in that order), and XMM0, XMM1, XMM2, XMM3 (which are SIMD registers) for floating-point arguments. Additional arguments are pushed onto the stack (right to left).
Integer return values of 64 bits or less are returned in RAX. Floating-point return values are returned in XMM0. Parameters shorter than 64 bits are not zero-extended (the high bits are not zeroed).
It is the caller's responsibility to allocate 32 bytes of shadow space on the stack right before calling the function (regardless of the actual number of parameters used), and to pop the stack after the call. The shadow space is used to spill RCX, RDX, R8, and R9, but must be made available to all functions, even those with fewer than four parameters.
For example, a function taking 5 integer arguments receives the first four in registers, and the fifth is placed on the stack just above the shadow space. So when the called function is entered, the stack is composed of (in ascending address order) the return address, followed by the shadow space (32 bytes), followed by the fifth parameter.
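The layout described above can be turned into simple offset arithmetic. The sketch below computes offsets from RSP at the moment the callee is entered, assuming 8-byte slots for every argument. The function names are mine, not part of any Microsoft header.

```c
#include <assert.h>

enum {
    WIN64_SLOT_BYTES = 8,        /* every argument slot is 8 bytes wide */
    WIN64_RETURN_ADDR_BYTES = 8, /* pushed by CALL, found at [RSP] on entry */
    WIN64_SHADOW_BYTES = 32      /* four home slots for RCX, RDX, R8, R9 */
};

/* Offset from RSP of the shadow-space slot for register argument n (1..4). */
static unsigned win64_home_offset(unsigned n) {
    assert(n >= 1 && n <= 4);
    return WIN64_RETURN_ADDR_BYTES + WIN64_SLOT_BYTES * (n - 1);
}

/* Offset from RSP of stack-passed argument n (n >= 5). */
static unsigned win64_stack_arg_offset(unsigned n) {
    assert(n >= 5);
    return WIN64_RETURN_ADDR_BYTES + WIN64_SHADOW_BYTES + WIN64_SLOT_BYTES * (n - 5);
}
```

With these definitions, win64_stack_arg_offset(5) evaluates to 40, so the fifth argument is found at [RSP + 40]. Both formulas reduce to 8 * n: thanks to the shadow space, every argument has a slot at a uniform position, whether it arrived in a register or on the stack.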
System V AMD64 ABI
The System V AMD64 ABI defines a calling convention that is followed on Solaris, Linux, FreeBSD, macOS, and other UNIX-like or POSIX-compliant operating systems.
The first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9, while XMM0 to XMM7 are used for floating-point arguments. Additional arguments are passed on the stack. Integer return values are stored in RAX, and floating-point return values in XMM0.
A shadow space is not provided; on function entry, the return address is adjacent to the seventh integer argument on the stack.
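Here is a small C sketch of how arguments are distributed under these rules. It is deliberately simplified: each argument is either an integer/pointer or a floating-point scalar, and structures, long double and variadic calls are ignored. The names arg_class and sysv_assign are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ARG_INTEGER, ARG_FLOAT } arg_class;

static const char *const SYSV_INT_REGS[6] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };
static const char *const SYSV_SSE_REGS[8] = {
    "XMM0", "XMM1", "XMM2", "XMM3", "XMM4", "XMM5", "XMM6", "XMM7"
};
static const char *const SYSV_ON_STACK = "stack";

/* Assign each argument (left to right) to the next free register of its
   class, falling back to the stack once that class is exhausted. */
static void sysv_assign(const arg_class *args, size_t nargs, const char **out) {
    assert(nargs == 0 || (args != NULL && out != NULL));
    size_t next_int = 0, next_sse = 0;
    for (size_t i = 0; i < nargs; ++i) {
        if (args[i] == ARG_INTEGER && next_int < 6)
            out[i] = SYSV_INT_REGS[next_int++];
        else if (args[i] == ARG_FLOAT && next_sse < 8)
            out[i] = SYSV_SSE_REGS[next_sse++];
        else
            out[i] = SYSV_ON_STACK;
    }
}
```

The integer and floating-point registers are counted independently here, so a floating-point argument in second position takes XMM0 and does not use up RSI.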
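A red zone is a fixed-size area just beyond the top of the stack (that is, at addresses below the stack pointer) that signal and interrupt handlers are guaranteed not to overwrite. A function may use the red zone for storing local variables without the extra overhead of modifying the stack pointer; this is mostly useful for leaf functions, since any call the function makes will overwrite the area. The System V AMD64 ABI mandates a 128-byte red zone that begins directly below the address RSP points to. On function entry RSP points to the return address, so the red zone lies on the opposite side of the return address from the stack-passed arguments and does not include them.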
ARM Procedure Call Standards
32-bit ARM
The first four registers (R0 through R3) are used to pass argument values to a subroutine and to return a result value from a function. The registers R4-R8, R10 and R11 are used to store the values of a routine’s local variables. A subroutine must preserve the contents of the registers R4-R8, R10, R11 and SP, which means that if it modifies these registers to use them as temporary storage, it must first save them on the stack and then restore them before returning.
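One way to picture the preservation rule is as a bitmask computation of the kind a compiler performs when it builds a function prologue. In this C sketch, bit n of a mask stands for register Rn. The helper names are invented for the example, and SP is left out of the mask because it is preserved by keeping pushes and pops balanced rather than by saving it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n of a register mask stands for Rn (R0..R15). */
static uint32_t reg_bit(unsigned n) {
    assert(n < 16);
    return UINT32_C(1) << n;
}

/* Registers the callee must preserve according to the text:
   R4-R8, R10 and R11. R9 is left out, following the text. */
static uint32_t callee_saved_mask(void) {
    uint32_t mask = reg_bit(10) | reg_bit(11);
    for (unsigned n = 4; n <= 8; ++n)
        mask |= reg_bit(n);
    return mask;
}

/* Given the registers a routine modifies, return those it has to push in
   its prologue and pop before returning. */
static uint32_t regs_to_save(uint32_t modified) {
    return modified & callee_saved_mask();
}
```

A routine that modifies R0, R3, R4 and R11 only has to save R4 and R11; R0 and R3 are argument registers that the caller already expects to be overwritten.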
64-bit ARM
The first eight registers (X0 through X7) are used to pass argument values to a subroutine and to return a result value from a function. The caller-saved temporary registers (X9 through X15) can be modified by the called subroutine without saving and restoring them. The callee-saved registers X19-X29 can be modified by the called subroutine as long as they are saved first and restored before returning.
X8 is the indirect result register, used to pass the address of an indirect result. X16 and X17 are intra-procedure-call temporary registers that can be used, for example, by veneers: small pieces of code inserted by the linker to allow very far jumps.
X18 is the platform register, reserved for platform-specific uses. For example, it is used by the ShadowCallStack on Android.
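The register roles listed in this section fit in a single lookup function. The C sketch below mirrors the grouping used in the text; the enum and function names are hypothetical, and X30, which the text does not discuss, gets a catch-all value.

```c
#include <assert.h>

typedef enum {
    A64_ARGUMENT_OR_RESULT,   /* X0-X7 */
    A64_INDIRECT_RESULT,      /* X8 */
    A64_CALLER_SAVED_TEMP,    /* X9-X15 */
    A64_INTRA_PROCEDURE_CALL, /* X16, X17 */
    A64_PLATFORM,             /* X18 */
    A64_CALLEE_SAVED,         /* X19-X29, as grouped in the text */
    A64_NOT_COVERED           /* X30 is not discussed in the text */
} a64_role;

/* Map a general-purpose register number (0..30) to its role. */
static a64_role a64_register_role(unsigned x) {
    assert(x <= 30);
    if (x <= 7)  return A64_ARGUMENT_OR_RESULT;
    if (x == 8)  return A64_INDIRECT_RESULT;
    if (x <= 15) return A64_CALLER_SAVED_TEMP;
    if (x <= 17) return A64_INTRA_PROCEDURE_CALL;
    if (x == 18) return A64_PLATFORM;
    if (x <= 29) return A64_CALLEE_SAVED;
    return A64_NOT_COVERED;
}
```

A table like this is handy when reading disassembly: a function that writes to X19 without saving it first, for instance, is breaking the convention.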
In the next episode, I’ll cover the structure of executable files. Stay tuned!