Hello, World! in AArch64 Assembly
Introduction
Most assembly tutorials target x86_64 on Linux. If you have tried following one on an Apple Silicon Mac, as I did, you have likely run into subtle but frustrating differences that cause your program to crash or fail to link. This article walks through a minimal AArch64 hello world on macOS, explaining what each instruction does and how it differs from its Linux counterpart.
The Code
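.data
msg:
    .ascii "Hello, ARM!\n"      // raw bytes, no null terminator
len = . - msg                   // byte length, computed at assemble time
.text
.globl _main
.align 2                        // 2^2 = 4-byte alignment
_main:
    mov x0, #1                  // arg 0: file descriptor 1 (stdout)
    adrp x1, msg@PAGE           // x1 = page containing msg
    add x1, x1, msg@PAGEOFF     // x1 = address of msg
    mov x2, #len                // arg 2: number of bytes to write
    mov x16, #4                 // macOS syscall number 4 = write
    svc #0x80                   // trap into the kernel
    mov x0, #0                  // arg 0: exit code 0
    mov x16, #1                 // macOS syscall number 1 = exit
    svc #0x80                   // trap into the kernel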
The Data Section
.data
msg:
.ascii "Hello, ARM!\n"
len = . - msg
.data declares a section for initialised data - values that exist before the program runs. msg is a label marking the start of our string.
.ascii writes the raw bytes of the string without a null terminator (unlike .asciz which appends one - we don't need it here since we are passing the length explicitly).
len = . - msg is an assembler expression evaluated at assemble time. The . symbol refers to the current location counter - the address immediately after the string. Subtracting msg gives us the byte length of the string without needing to count it manually.
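To make the length calculation concrete, here is a small Python model of it. This is only an illustration and not part of the assembly: the helper names and the address 0x1000 are made up for the example. It shows the bytes that .ascii and .asciz would emit and how the subtraction . - msg yields the length.

```python
# Illustrative model only: these helpers are hypothetical, not assembler APIs.

def ascii_bytes(text: str) -> bytes:
    """Bytes emitted by .ascii: the characters only, no terminator."""
    return text.encode("ascii")


def asciz_bytes(text: str) -> bytes:
    """Bytes emitted by .asciz: the characters plus one trailing NUL byte."""
    return text.encode("ascii") + b"\x00"


def length_from_labels(label_address: int, location_counter: int) -> int:
    """The expression `. - msg`: current location minus the label's address."""
    return location_counter - label_address


msg_address = 0x1000                      # assumed address of the msg label
data = ascii_bytes("Hello, ARM!\n")       # what sits between msg and `.`
location_counter = msg_address + len(data)

len_value = length_from_labels(msg_address, location_counter)
print(len_value)                          # 12
print(len(asciz_bytes("Hello, ARM!\n")))  # 13, one extra byte for the NUL
```

Because the subtraction happens at assemble time, changing the string changes len automatically, with no run-time cost.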
The Text Section
.text
.globl _main
.align 2
.text declares the executable code section. .globl _main makes the _main symbol visible to the linker - this is your first macOS-specific detail.
macOS vs. Linux: On Linux you would write .globl _start and define a _start label. On macOS the program links against libSystem, and the linker expects an entry point named _main by default. Using _start on macOS will cause a linker error unless you override the entry point.
.align 2 aligns the next instruction to a 2^2 = 4-byte boundary. AArch64 instructions are always 4 bytes wide - the CPU will fault on a misaligned instruction fetch.
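The alignment rule is plain arithmetic, sketched below in Python. The function name and the sample addresses are made up for illustration. The operand of .align is treated as an exponent, as described above.

```python
# Illustrative model only: align_up is a hypothetical helper, not an assembler API.

def align_up(address: int, power: int) -> int:
    """Round address up to the next multiple of 2**power, as .align power does."""
    boundary = 1 << power                 # .align 2 -> 1 << 2 == 4 bytes
    return (address + boundary - 1) & ~(boundary - 1)


# Hypothetical location-counter values, chosen only for illustration.
for addr in (0x1000, 0x1001, 0x1003, 0x1004):
    print(hex(addr), "->", hex(align_up(addr, 2)))
```

An address that is already a multiple of 4 is left alone; anything else is padded up to the next multiple. If the exponent form feels ambiguous, the .p2align directive spells out the same power-of-two meaning.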
Setting Up the Write Syscall
mov x0, #1
adrp x1, msg@PAGE
add x1, x1, msg@PAGEOFF
mov x2, #len
mov x16, #4
svc #0x80
mov x0, #0
mov x16, #1
svc #0x80
This is where macOS and Linux diverge most significantly. We will go line by line through the above code.
mov x0, #1 Register x0 holds the first syscall argument, here the file descriptor; 1 is standard output. This is consistent across Linux and macOS. In fact, syscall arguments go in registers x0-x5 on both.
adrp x1, msg@PAGE This loads the page-aligned base address of msg, that is, the address of the 4 KB page that contains msg. The following add x1, x1, msg@PAGEOFF adds the offset of msg within that page, so x1 ends up holding the actual address of msg. Two instructions are needed because every AArch64 instruction is 32 bits wide, which leaves no room to encode a full 64-bit address in a single instruction. @PAGE and @PAGEOFF are not instructions but relocation specifiers understood by the Mach-O toolchain, so they are specific to Apple platforms. On Linux you would write adrp x1, msg followed by add x1, x1, :lo12:msg, where :lo12: selects the lowest 12 bits of msg's address. These are the bits missing from the adrp result. Fundamentally, both forms do the same thing; the object file formats differ, so the syntax differs, and each toolchain accepts only its own.
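The page-plus-offset split can be checked with a few lines of arithmetic. The Python sketch below is a model, not real toolchain code: the function names and both addresses are made up for illustration. It assumes the 4 KB (12-bit) granule that adrp works with.

```python
# Illustrative model only: names and addresses are hypothetical.

PAGE_MASK = 0xFFF                         # low 12 bits = offset within a 4 KB page


def page_of(address: int) -> int:
    """Address with the low 12 bits cleared (the @PAGE part)."""
    return address & ~PAGE_MASK


def page_offset(address: int) -> int:
    """Low 12 bits of the address (the @PAGEOFF or :lo12: part)."""
    return address & PAGE_MASK


def adrp(pc: int, target: int) -> int:
    """Model of adrp: the PC's page plus a signed page delta to the target's page."""
    page_delta = (page_of(target) - page_of(pc)) >> 12   # encoded in the instruction
    return page_of(pc) + (page_delta << 12)


def materialise(pc: int, target: int) -> int:
    """Model of the adrp + add pair that leaves the full address in x1."""
    x1 = adrp(pc, target)                 # adrp x1, msg@PAGE
    x1 = x1 + page_offset(target)         # add  x1, x1, msg@PAGEOFF
    return x1


pc = 0x100003F90                          # assumed address of the adrp instruction
msg = 0x100008010                         # assumed address of msg
print(hex(adrp(pc, msg)))                 # 0x100008000
print(hex(page_offset(msg)))              # 0x10
print(hex(materialise(pc, msg)))          # 0x100008010
```

Since adrp is PC-relative, the same pair of instructions keeps working wherever the program is loaded, which suits position-independent code.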
mov x2, #len This instruction tells the kernel how many bytes to write.
mov x16, #4 macOS takes the syscall number in register x16. This is one of the most significant differences, as Linux takes it in register x8. If the number is placed in the wrong register, the kernel dispatches on whatever value happens to be in the expected one, so the program may crash or the call may simply fail. Here we load 4, the write syscall on macOS; on Linux write is 64. Further down the code, 1 is moved into x16. 1 is the exit syscall number on macOS; on Linux it is 93. The numbers differ because macOS inherits its syscall numbering from BSD, whereas Linux defines its own.
svc #0x80 svc stands for supervisor call. When it executes, the CPU switches from user mode to the more privileged kernel mode. The kernel then reads the syscall number and arguments from the appropriate registers and carries out the call, in this case writing Hello, ARM!. On Linux you would write svc #0. Linux doesn't even read the immediate; the kernel dispatches on the number in register x8.
Lastly, the mov x0, #0, mov x16, #1, and svc #0x80 instructions make sure we exit cleanly. They move exit code 0 into x0, load syscall number 1 (exit) into the syscall number register, and trap into the kernel, which ends the process with exit code 0, signifying success.
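The register and number differences covered in this section fit in a small lookup table. The Python sketch below gathers the values quoted above; the table layout and the syscall_setup helper are made up for illustration and are not part of any toolchain.

```python
# Illustrative summary only: the structure and helper are hypothetical;
# the values are the ones quoted in the text above.

CONVENTIONS = {
    "macos": {"number_register": "x16", "trap": "svc #0x80", "write": 4, "exit": 1},
    "linux": {"number_register": "x8", "trap": "svc #0", "write": 64, "exit": 93},
}


def syscall_setup(os_name: str, call: str) -> list[str]:
    """Return the two instructions that select and trigger a syscall."""
    conv = CONVENTIONS[os_name]
    return [f"mov {conv['number_register']}, #{conv[call]}", conv["trap"]]


for os_name in ("macos", "linux"):
    for call in ("write", "exit"):
        print(os_name, call, syscall_setup(os_name, call))
```

The argument registers (x0-x5) are the same on both systems, so only these two instructions change per syscall when porting the example.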
Assembly and Linking
as -o hello.o hello.S
ld -o hello hello.o -lSystem -syslibroot $(xcrun -sdk macosx --show-sdk-path)
The macOS linker requires you to explicitly link libSystem (Darwin's counterpart to libc) and point it at the SDK sysroot. This is required even for programs that do not call any functions from the library. Apple does not support statically linked executables, so skipping this step can leave you with a binary that macOS will not run. On Linux, with the Linux variant of the source, it's slightly easier:
as -o hello.o hello.S
ld -o hello hello.o
That's it. You now have a working "Hello, World!" for AArch64 on macOS. Here is my repository with the (simple) source code.