
My notes on Assembly language

assembly cheatsheet

1. Global Declaration

global _start

This exports the _start symbol so the linker can see it. The linker uses _start as the program's entry point by default.

2. Text Section

section .text
_start:

This is the start of the program's text section, where the executable code is placed.

3. Write System Call (syscall)

mov rax, 0x01         ; system call for write; see https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md#x86_64-64_bit
mov rdi, 1             ; file handle 1 is stdout
mov rsi, loveletter    ; address of string to output
mov rdx, 300000        ; number of bytes to write
syscall                ; invoke operating system to do the write
  • mov rax, 0x01 sets up the system call number for write.
  • mov rdi, 1 specifies that we want to write to standard output (stdout).
  • mov rsi, loveletter sets the address of the string to be printed.
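  • mov rdx, 300000 sets the number of bytes to write. This should equal the length of the string: a smaller value truncates the output, and a larger one makes write read past the end of the string, so it prints whatever bytes follow or fails with EFAULT. 300000 is far too large here.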
  • syscall makes the system call to actually write the string.

4. Exit System Call (syscall)

mov rax, 0x3c        ; system call for exit
xor rdi, rdi          ; exit code 0
syscall               ; invoke operating system to exit
  • mov rax, 0x3c specifies the system call number for exit.
  • xor rdi, rdi sets the exit code to 0 (indicating successful termination).
  • syscall invokes the system call to exit the program.

5. Data Section

section .data
loveletter: db "bug loves programs as i love you", 100
  • This is the data section where the string loveletter is stored.
  • db "bug loves programs as i love you", 100 defines the 32 bytes of the string followed by one extra byte with the value 100 (ASCII 'd'). It does not pad the string to 100 characters, and it does not protect against an oversized write.

Key Adjustments

  • Reducing rdx (bytes to write): The number 300000 in the mov rdx, 300000 line is far larger than the string. Set rdx to the actual string length, which is 32 bytes here. A smaller value truncates the output; a larger one reads past the end of the string.

    Example:

    mov rdx, 32   ; Update to match actual string length
    
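  • The trailing 100 in the original db line adds a single byte with the value 100 (ASCII 'd'), not 100 bytes of padding. write takes an explicit length, so no terminator is needed for it. A trailing 0 is only useful if the string is later passed to C functions that expect a null-terminated string.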

    Example:

    loveletter: db "bug loves programs as i love you", 0  ; Null-terminated string
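For comparison, here is a rough C sketch of the same write, where the byte count is derived from the string instead of guessed. It only illustrates the length rule; the helper names loveletter_len and write_loveletter are made up for this sketch.

```c
#include <stddef.h>
#include <unistd.h>

/* Same text as the db line. */
static const char loveletter[] = "bug loves programs as i love you";

/* sizeof counts the trailing '\0' of the C string, so subtract 1. */
size_t loveletter_len(void) {
    return sizeof(loveletter) - 1;
}

/* write(fd, buf, count): the three arguments correspond to rdi, rsi, rdx. */
ssize_t write_loveletter(int fd) {
    return write(fd, loveletter, loveletter_len());
}
```

Calling write_loveletter(1) writes the text to stdout and returns the number of bytes written.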
    

Final Code (Refined)

global _start

section .text
_start:
    mov rax, 0x01         ; system call for write
    mov rdi, 1             ; file handle 1 is stdout
    mov rsi, loveletter    ; address of string to output
    mov rdx, 32            ; number of bytes to write (length of string)
    syscall                ; invoke operating system to do the write

    mov rax, 0x3c          ; system call for exit
    xor rdi, rdi           ; exit code 0
    syscall                ; invoke operating system to exit

section .data
loveletter: db "bug loves programs as i love you", 0  ; Null-terminated string

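To list the OS syscall numbers: cat /usr/include/asm/unistd_64.h (x86-64; unistd_32.h holds the 32-bit numbers used with int 0x80). The exact path can differ between distributions.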


Despite its name, mov copies data instead of moving it: the source keeps its value.

global _start

section .text
_start:
    ; Load value 42 into rax
    mov rax, 42          ; rax = 42

    ; Copy value from rax into rbx
    mov rbx, rax         ; rbx = rax (i.e., rbx = 42)

    ; Now rax still holds 42, and rbx also holds 42

    ; Exit the program (exit code 0)
    mov rax, 60          ; syscall number for exit
    xor rdi, rdi         ; exit code 0
    syscall              ; invoke syscall

You can inspect register values while debugging with gdb, and you can also change them:

(gdb) i r
rax            0x2a                42
(gdb) set $rax = 0x88
(gdb) i r
rax            0x88                136


The letters ‘b’, ‘w’, ‘l’ and ‘q’ specify byte, word, long and quadruple word operands. When there is no sizing suffix and no (suitable) register operands to deduce the size of memory operands, with a few exceptions and where long operand size is possible in the first place, operand size will default to long in 32- and 64-bit modes. Similarly it will default to short in 16-bit mode. Noteworthy exceptions are:

  • Instructions with an implicit on-stack operand as well as branches, which default to quad in 64-bit mode.
  • Sign- and zero-extending moves, which default to byte size source operands.
  • Floating point insns with integer operands, which default to short (for perhaps historical reasons).
  • CRC32 with a 64-bit destination, which defaults to a quad source operand.

Source: https://sourceware.org/binutils/docs/as/i386_002dMnemonics.html#Instruction-Naming


section .data
    number: dq 42   ; Define a 64-bit integer with value 42

section .text
    global _start

_start:
    ; Load the value at address 'number' into rax
    mov rax, [number]    ; rax = value at address of 'number' (42)

    ; Copy value from rax into rbx
    mov rbx, rax         ; rbx = rax (i.e., rbx = 42)

    ; Exit the program (exit code 0)
    mov rax, 60          ; syscall number for exit
    xor rdi, rdi         ; exit code 0
    syscall              ; invoke syscall

Key point:

  • number: dq 42: This ensures number is a 64-bit (quadword) integer.
  • mov rax, [number]: This dereferences the memory at the address number and loads the value 42 into rax.

Why mov rax, [number] Works:

  • number is the label (address) of the data.
  • mov rax, [number] tells the CPU to fetch the data at the memory location (number) rather than just copying the address.

By contrast, mov rax, number (without brackets):
  • This loads the address of number into rax, not the value 42.
  • When you copy this to rbx, rbx holds the address too, which is why the registers show a large number instead of 42.
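The same distinction exists in C as "the variable" versus "the address of the variable". The sketch below is only an analogy; load_value and load_address are names invented here.

```c
#include <stdint.h>

static uint64_t number = 42;   /* plays the role of: number: dq 42 */

/* Like mov rax, [number]: read the 8 bytes stored at the address. */
uint64_t load_value(void) {
    return number;
}

/* Like mov rax, number: the address itself, not the contents. */
uintptr_t load_address(void) {
    return (uintptr_t)&number;
}
```

Dereferencing the result of load_address gives back 42, which is the step the square brackets perform in NASM.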

Output Example in GDB:

(gdb) info registers
rax            0x2a   # 42 in hexadecimal
rbx            0x2a

Additional Notes:

  • If you expect to load a 32-bit value, use dd instead of dq:

    number: dd 42  ; 32-bit integer
    mov eax, [number]
    mov ebx, eax
    
  • For byte values, use:

    number: db 42  ; 8-bit integer
    mov al, [number]
    mov bl, al
    


 section .data
    constants dd 5, 8, 17, 44, 50, 52, 60, 65, 70, 77, 80  ; Define an array of integers

 section .text
    global _start
 _start:

    ; Load the address of 'constants' into EDI
    ; (this is 32-bit code: build with nasm -felf32 and link with ld -m elf_i386)
    mov edi, constants

    ; Access the value at index 2 (17) using indirect addressing
    mov ebx, [edi + 8]   ; EBX = constants[2] (8 bytes offset, since each int = 4 bytes)

    ; Modify the value at index 1 (8) by setting it to 25
    mov dword [edi + 4], 25   ; constants[1] = 25 (4 bytes offset for the 2nd element)

    ; Exit program (sys_exit)
    mov eax, 1     ; syscall number for exit
    xor ebx, ebx   ; exit code 0
    int 0x80       ; invoke syscall
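To make the offset arithmetic explicit, here is a small C sketch that reads and writes the array by byte offset, the way [edi + 8] and [edi + 4] do. The helper names are hypothetical, and memcpy is used so the access stays well-defined C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static int32_t constants[] = {5, 8, 17, 44, 50, 52, 60, 65, 70, 77, 80};

/* Read a 4-byte int located `offset` bytes from the start of the array. */
int32_t load_at_byte_offset(size_t offset) {
    int32_t value;
    memcpy(&value, (const unsigned char *)constants + offset, sizeof value);
    return value;
}

/* Write a 4-byte int `offset` bytes from the start of the array. */
void store_at_byte_offset(size_t offset, int32_t value) {
    memcpy((unsigned char *)constants + offset, &value, sizeof value);
}
```

Element i lives at byte offset i * 4, so offset 8 is index 2 and offset 4 is index 1.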

Sections of the Program:

1. .text Section:

section .text
    extern printf                ; Declare an external reference to printf
    global _start                 ; Define _start as the entry point of the program
  • extern printf: This declares printf as an external function. This tells the assembler that printf is defined elsewhere (in this case, in the C standard library), and we will link to it during the linking stage.
_start:
    and rsp, -16                 ; Make sure the stack is 16-byte aligned
  • and rsp, -16: The x86-64 System V ABI (Application Binary Interface) requires that the stack pointer (rsp) be 16-byte aligned at the point of a call. At process entry (_start) rsp is already 16-byte aligned and no return address has been pushed, so sub rsp, 8 here would actually misalign the stack. The sub rsp, 8 idiom belongs inside a function that was itself reached through call, where the pushed return address leaves rsp 8 bytes off. Masking with -16 rounds rsp down to a multiple of 16 and is safe in either case.
    mov rdi, fmt                 ; Load address of format string into rdi
    mov rsi, message             ; Load address of message string into rsi
    mov rax, 0                   ; al = 0: no vector (xmm) registers are used for arguments
  • mov rdi, fmt: The first argument to printf is the format string, which is passed in rdi according to the x86-64 calling convention. In this case, fmt is the format string ("%s\n").
  • mov rsi, message: The second argument to printf is the string we want to print. This string is stored in the message variable, and its address is moved into the rsi register, which holds the second argument in the calling convention.
  • mov rax, 0: printf is variadic, and for variadic calls the ABI uses al to pass an upper bound on the number of vector (xmm) registers that hold arguments. No floating-point arguments are passed here, so we set rax to 0. The integer argument registers are rdi, rsi, rdx, rcx, r8 and r9; rax is not one of them.
    call printf                  ; Call printf
  • call printf: This line actually calls the printf function, passing the arguments (format string and message) to it. printf will print the string "the world hates you" followed by a newline to the terminal.
  • No stack restore is needed after the call: the program exits through a system call right away and never returns from _start.
    mov rax, 60                  ; Set rax to 60 (exit syscall number)
    xor rdi, rdi                 ; Set rdi to 0 (exit status 0)
    syscall                      ; Invoke the exit syscall
  • mov rax, 60: The mov instruction sets the rax register to 60, which is the system call number for exit on Linux. The syscall instruction is used to invoke a system call.
  • xor rdi, rdi: This instruction sets rdi to 0, which is the exit status of the program. A value of 0 indicates successful execution (the program completed without errors).
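  • syscall: This triggers the exit system call, causing the program to exit and return control to the operating system. Because the raw system call bypasses libc's exit, stdio buffers are not flushed: the output appears when stdout is a terminal (line-buffered), but can be lost when it is redirected to a file or pipe.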

2. .data Section:

This section is where we define the data used by the program, such as the strings that printf will print.

section .data
    message:  db "the world hates you", 10, 0  ; The message to be printed, with a newline and null terminator
    fmt:      db "%s", 10, 0              ; The format string for printf, with a newline and null terminator
  • fmt: db "%s", 10, 0: This defines the format string fmt that printf expects. It tells printf to expect a string argument (%s), followed by a newline (10) and a null terminator (0).

    Stack Alignment:

    • The x86-64 ABI (Application Binary Interface) requires the stack to be 16-byte aligned at the point of function calls. This is why we adjust the stack pointer before calling printf.

    Linking:

    • The external reference to printf is resolved at the linking stage, which is why we need to link against the C standard library (-lc).
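    The alignment rule itself is just arithmetic on the address. As a rough sketch (helper names invented here), rounding down to a multiple of 16 is a single mask, which is what an instruction such as and rsp, -16 performs on the stack pointer:

```c
#include <stdint.h>

/* Clear the low 4 bits: the result is the nearest multiple of 16 at or below addr. */
uintptr_t align_down_16(uintptr_t addr) {
    return addr & ~(uintptr_t)15;
}

/* An address is 16-byte aligned when its low 4 bits are all zero. */
int is_aligned_16(uintptr_t addr) {
    return (addr & 15) == 0;
}
```

    An address ending in 0x8 is 8-byte aligned but not 16-byte aligned, which is the situation inside a function right after call has pushed the return address.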
  1. Assemble the Code:

    nasm -felf64 -g asm3.s
    
  2. Link the Object File:

    ld asm3.o -o asm3 -lc --dynamic-linker /lib64/ld-linux-x86-64.so.2
    
  3. Run the Program:

    ./asm3
    
the world hates you

Here is a basic program in assembly that calls some common C functions:

section .text
    extern printf, malloc, free, strcpy, strcmp
    global _start

_start:
    ; Bunny the rabbit is going to meet his crush!
    mov rdi, fmt_intro                     ; Print the intro message
    mov rsi, msg_intro
    ; rax must be 0 when no floating-point or vector (xmm) registers carry
    ; arguments: it tells printf that no floating-point arguments are present
    mov rax, 0
    call printf                            ;

    ; Bunny wants to impress his crush, but he needs the perfect carrot...
    mov rdi, 128                            ; Allocate 128 bytes for carrot-related memory
    call malloc                            ; Allocate memory for carrot names
    test rax, rax
    jz failed_to_find_carrot               ; If malloc fails, bunny becomes sad

    ; Memory is successfully allocated for the carrot names
    mov rbx, rax                           ; Store the allocated memory address in rbx

    ; The rabbit is searching for the perfect carrot: "love Carrot"
    mov rdi, rbx                           ; Prepare destination for carrot name in allocated memory
    mov rsi, msg_love_carrot
    call strcpy                            ; Copy "love Carrot" into the allocated memory

    ; Bunny is now on his way with the love Carrot!
    mov rdi, fmt_found_carrot              ; Format string for carrot found
    mov rsi, rbx                           ; Pointer to the carrot name in memory
    mov rax, 0                             ; No xmm registers used
    call printf                            ; Print "bunny found the perfect carrot: love Carrot"

    ; The rabbit arrives at his crush's house and presents the carrot.
    ; His crush compares the "love Carrot" with the "Basic Carrot"
    mov rdi, rbx                           ; First string: "love Carrot"
    mov rsi, msg_basic_carrot              ; Second string: "Basic Carrot"
    call strcmp                            ; (Is it worthy?)
    mov r12d, eax                          ; Save the int result in a callee-saved register: printf overwrites rax

    ; Based on comparison, bunny's fate is decided!
    mov rdi, fmt_comparison_result         ; Format string for comparison result
    mov esi, r12d                          ; Result of comparison (a 32-bit int)
    mov rax, 0
    call printf                            ; Print the comparison result

    ; If the carrot is worthy, bunny wins the heart of his crush!
    cmp r12d, 0                            ; If the saved result is 0, the carrots match perfectly
    je crush_says_yes                      ; Jump to successful outcome

    ; If the carrot is not worthy, the crush rejects the rabbit!
    mov rdi, fmt_fail                      ; Format string for failure
    mov rsi, msg_fail                      ; Failure message
    mov rax, 0
    call printf                            ; Print the rejection message
    jmp end_of_story                       ; Jump to end

crush_says_yes:
    ; The crush says YES! The rabbit wins her heart with the perfect carrot!
    mov rdi, fmt_success                   ; Format string for success
    mov rsi, msg_success                   ; Success message
    mov rax, 0
    call printf                            ; Print the success message

end_of_story:
    ; Free the memory used for the carrot name
    mov rdi, rbx                           ; Pointer to the carrot name memory
    call free                              ; Free the memory

    ; Exit the program with success
    mov rax, 60                             ; Exit syscall number
    xor rdi, rdi                            ; Exit status 0 (Everything went fine)
    syscall                                 ; Exit the program gracefully

failed_to_find_carrot:
    ; The rabbit failed to find the carrot, his quest ends in tragedy
    mov rdi, fmt_fail                      ; Format string for failure
    mov rsi, msg_fail                      ; Failure message
    mov rax, 0                             ; No xmm registers used
    call printf                            ; Print the failure message
    mov rax, 60                             ; Exit syscall number
    mov rdi, 1                              ; Exit status 1 (Failure)
    syscall                                 ; Exit the program

section .data
    ; Formats that are called with a second string argument need a %s to print it
    fmt_intro: db "bunny is going to meet his crush!", 10, "%s", 0
    msg_intro: db "He wants to impress her with the perfect carrot.", 10, 0
    fmt_found_carrot: db "bunny found the perfect carrot: %s", 10, 0
    fmt_comparison_result: db "Comparison result of the carrot: %d", 10, 0
    fmt_success: db "The crush says YES! The perfect carrot won her heart!", 10, "%s", 0
    fmt_fail: db "bunny's crush rejects bunny. The carrot was not worthy...", 10, "%s", 0
    msg_love_carrot: db "love Carrot", 0               ; The perfect carrot
    msg_basic_carrot: db "Basic Carrot", 0                 ; The rival carrot
    msg_success: db "bunny wins the heart of his crush!", 10, 0
    msg_fail: db "Unfortunately, the carrot wasn’t good enough...", 10, 0
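For reference, here is roughly what the core of that program looks like in C. This is a sketch, not a line-for-line translation; carrot_quest and its return codes are made up here. The point to notice is that the strcmp result is kept in a variable, because printf has its own return value.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 0 if the carrots match, 1 if they differ, 2 if malloc fails. */
int carrot_quest(void) {
    char *carrot = malloc(128);              /* same 128-byte allocation */
    if (carrot == NULL) {
        return 2;
    }

    strcpy(carrot, "love Carrot");
    printf("bunny found the perfect carrot: %s\n", carrot);

    /* Keep the comparison result; the printf below returns something else. */
    int result = strcmp(carrot, "Basic Carrot");
    printf("Comparison result of the carrot: %d\n", result);

    free(carrot);
    return result == 0 ? 0 : 1;
}
```

Since "love Carrot" and "Basic Carrot" differ in their first character, the comparison is nonzero and the function returns 1.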

It's better to keep up with the field, so we need to read about what's going on, for example:

https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html


inline asm notes

Always Use Volatile

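  • Volatile Keyword – Prevents the compiler from removing the assembly or treating it as a pure computation it can cache or hoist. It does not fully pin the statement in place relative to other code; add a "memory" clobber when ordering against memory accesses matters.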
  • Syntax:
    asm volatile ("...");
    
  • If no output constraints exist, the asm is implicitly volatile. Still, add volatile to make the intent clear.
  • Avoid __volatile__: it is an outdated spelling, and plain volatile is already a standard keyword.

Never Modify Input Constraints

  • Why? Modifying inputs can lead to unpredictable bugs.
  • Fix: Convert to read-write output with +.
  • Example:
    asm volatile ("..." : "+r"(x) : ...);
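As a concrete, hedged sketch of the read-write constraint: the function below (add_one is a name invented for this example) modifies its operand inside the asm, so the operand is declared with "+r" instead of being listed as an input. The x86 instruction is guarded by a preprocessor check, with a plain C fallback on other targets.

```c
/* Increment x in place; "+r" tells the compiler the asm both reads and writes x. */
int add_one(int x) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile("incl %0"   /* AT&T syntax: increment the 32-bit register */
                     : "+r"(x)   /* read-write operand */
                     :           /* no pure inputs */
                     : "cc");    /* incl changes the flags */
#else
    x += 1;                      /* fallback for non-x86 targets */
#endif
    return x;
}
```

Listing x as an input only and then incrementing it would let the compiler assume the register still holds the old value.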
    

Never Call Functions from Inline Assembly

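  • Why? There is no constraint that tells the compiler a call happens inside the asm, so it cannot account for clobbered caller-saved registers, the red zone, or stack alignment.
  • Allowed: System calls, and asm goto for jumps.
  • Prohibited: Calling functions directly.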

Avoid Absolute Labels

  • Issue: Named labels may conflict if functions are cloned/inlined.
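  • Fix: Use local labels, such as numeric labels (1: referenced as 1b or 1f) or the %= placeholder, which expands to a number unique to each asm instance.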
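Here is a minimal sketch of a numeric local label in GCC-style inline assembly. The function name count_up is hypothetical, and the x86 path is guarded so the code still builds elsewhere. Because 1: can be defined many times in one object file, the loop survives inlining or cloning of the function.

```c
/* Counts from 0 up to n using a loop written in inline assembly. */
unsigned count_up(unsigned n) {
    unsigned count = 0;
    if (n == 0) {
        return 0;                 /* the asm loop below assumes n >= 1 */
    }
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile(
        "1:\n\t"                  /* numeric local label */
        "addl $1, %0\n\t"         /* count += 1 */
        "subl $1, %1\n\t"         /* n -= 1, sets ZF when n reaches 0 */
        "jnz 1b"                  /* jump back to the nearest 1: above */
        : "+r"(count), "+r"(n)
        :
        : "cc");
#else
    while (n--) {                 /* fallback for non-x86 targets */
        count++;
    }
#endif
    return count;
}
```

A named label such as my_loop: in the same position would be emitted twice if the compiler duplicated the function body, and the assembler would reject the duplicate.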

Jumps in Assembly Language

Assembly does not have the things soydevs use all the time: if, for-each loops, while loops, or functions. Instead, control flow is managed with jumps (akin to goto). We just tell the instruction pointer (IP) where to go: hey CPU, execute that code, then go there and execute this.

Jump Basics:

  • Instruction Pointer (IP): Jump moves the IP to a specified location.
  • Operands:
    • Absolute Address – Direct memory address.
    • Relative Address – Offset from the current address.
    • Runtime Computed – Address calculated dynamically.

Labels:

  • Mark an instruction with a name followed by a colon (name:) so it can act as a jump destination.
  • Example:
    start:
    cmp rax, rcx
    jne start
    

Types of Jumps:

  1. Unconditional Jump (jmp)
    • Direct jump to the label.
    • Used for infinite loops or linking code sections.
    • Example:
      jmp start
      
  2. Conditional Jumps
    • Execute based on FLAGS register values.
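    • FLAGS are updated by comparison instructions (cmp, test) and by most arithmetic instructions.
    • Example:
      cmp rax, rcx    ; Compare rax and rcx
      jne start       ; Jump to start if not equal (loop is an instruction name in NASM, so avoid it as a label)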
      
    • Common Conditional Jumps:
      • je – Jump if equal.
      • jne – Jump if not equal.
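C's goto maps almost one-to-one onto these jumps, so a loop written with goto is a reasonable mental model of what the assembly does. The function below is just an illustration; sum_to is a made-up name.

```c
/* Sums 1 + 2 + ... + n using only labels, a conditional goto and an unconditional goto. */
unsigned sum_to(unsigned n) {
    unsigned i = 0;
    unsigned total = 0;
start:
    if (i == n) {
        goto done;        /* cmp i, n ; je done */
    }
    i += 1;
    total += i;
    goto start;           /* jmp start */
done:
    return total;
}
```

A while loop in C compiles to the same shape: a compare, a conditional jump out, and an unconditional jump back.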

Functions

  • Functions are implemented by jumping to a block of code and returning with ret.
  • Two problems come up: the function may overwrite registers the caller still needs, and it has to remember where to return to. Both are solved with the stack, which stores the information needed for function calls.

The Stack

  • Two Pointers:
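    • rbp (base pointer) – Base of the current stack frame.
    • rsp (stack pointer) – Top of the stack, i.e. the most recently pushed element (the stack grows downward, toward lower addresses).
  • Stack Operations:
    • push – Decrement rsp, then write the value at the new rsp.
    • pop – Read the value at rsp, then increment rsp.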
    • call – Push return address, jump to function.
    • ret – Pop return address, jump there.
  • Stack Frame – Memory between rbp and rsp for local variables.
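A toy model in C can make the "grows downward" part concrete. This is only a sketch with invented names (stack_mem, sp, stack_push, stack_pop); the index sp stands in for rsp and counts 8-byte slots instead of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

static uint64_t stack_mem[STACK_WORDS];
static size_t sp = STACK_WORDS;        /* empty stack: sp is one past the highest slot */

/* push: move sp down first, then store at the new top. */
void stack_push(uint64_t value) {
    assert(sp > 0);                    /* toy overflow check */
    sp -= 1;
    stack_mem[sp] = value;
}

/* pop: load from the top, then move sp back up. */
uint64_t stack_pop(void) {
    assert(sp < STACK_WORDS);          /* toy underflow check */
    uint64_t value = stack_mem[sp];
    sp += 1;
    return value;
}

/* Number of values currently on the stack. */
size_t stack_depth(void) {
    return STACK_WORDS - sp;
}
```

call behaves like a push of the return address followed by a jump, and ret like a pop into the instruction pointer.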

By default, the maximum stack size for a process on Linux is typically 8 MB (ulimit -s shows the current limit). To learn more about the stack, see https://unix.stackexchange.com/questions/145557/how-does-stack-allocation-work-in-linux


Calling Conventions

  • Argument Passing – In the System V AMD64 convention (Linux, macOS), the first 6 integer or pointer arguments go in rdi, rsi, rdx, rcx, r8, r9. Others use the stack.
  • Return Value – Stored in rax.

Example:

square:
    imul edi, edi    ; x = edi, x^2
    mov  eax, edi    ; return x^2
    ret

Inlining (Optimizing Functions)

  • Avoids the overhead of call/ret and stack operations by embedding the function's code directly in the caller.
  • Example (distance computes x^2 + y^2 with both square calls inlined):
distance:
    imul edi, edi
    imul esi, esi
    lea  eax, [rdi + rsi]
    ret
  • lea (load effective address) computes rdi + rsi and writes the sum to eax in one instruction, without a separate mov and add.
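The C source behind that assembly could plausibly look like the sketch below. With optimization enabled, a compiler will typically inline square into distance, though that is up to the compiler and not guaranteed.

```c
/* Small helper that is a good candidate for inlining. */
static inline int square(int x) {
    return x * x;
}

/* x arrives in edi and y in esi under the System V calling convention. */
int distance(int x, int y) {
    return square(x) + square(y);
}
```

For example, distance(3, 4) evaluates to 25.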

Tail Call Elimination

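  • Recursive Functions – Each call uses stack space for its return address and locals.
  • Tail Recursion – The recursive call is the last operation, so nothing from the current frame is needed afterwards. The compiler can replace the call with a jump, turning the recursion into a loop that uses no extra stack.
  • Example (a tail-recursive factorial compiled down to a loop; assumes edi >= 1 on entry):
factorial:
    mov  eax, 1
.loop:                   ; local label: a bare "loop" clashes with the loop instruction in NASM
    imul eax, edi
    sub  edi, 1
    jne  .loop
    ret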
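In C, the two forms look like this. The names factorial_acc and factorial_loop are invented for this sketch; the first is tail-recursive with an accumulator, the second is the loop an optimizing compiler may reduce it to.

```c
/* Tail-recursive: the recursive call is the very last thing that happens. */
unsigned factorial_acc(unsigned n, unsigned acc) {
    if (n == 0) {
        return acc;
    }
    return factorial_acc(n - 1, acc * n);   /* tail call */
}

/* The same computation with the call replaced by a jump back to the top. */
unsigned factorial_loop(unsigned n) {
    unsigned acc = 1;
    while (n != 0) {
        acc *= n;
        n -= 1;
    }
    return acc;
}
```

Calling factorial_acc(n, 1) and factorial_loop(n) gives the same result, but only the recursive form can consume stack space when the compiler does not perform the optimization.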