computing-systems-212 / Lab 1 (P1): ARM Disassembly / task1 / ANSWERS.txt
Q1. What is the address at which the program starts executing (i.e., the address which corresponds to _start)?

After launching gdb on task1 and entering `starti`, execution stops at the program's first instruction,
so the address corresponding to _start can be read off as 0x00000000004000c0.

The first instruction occurring here is `ldr	x0, [sp]`.
----------------------------------------------------------------------------------

Q2. Is the 6c6c6568 in the objdump output a real instruction? Where does it come from?

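No, 6c6c6568 is not an instruction the program ever executes; it is string data.
objdump decodes every 32-bit word it is asked to disassemble, and this word happens to match the encoding of `ldnp	d8, d25, [x11, #-320]`.
(The `ldnp` instruction loads a pair of registers, N referring to the non-temporal hint.)

The word comes from the first four bytes of the string "hello, ": the ASCII characters 'h' 'e' 'l' 'l' are 68 65 6c 6c.
The bytes are stored little-endian, so objdump prints them as the word 6c6c6568, while xxd shows them in file order as 6865 6c6c:
00000100: 6865 6c6c 6f2c 2000 210a 006e 6565 6420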
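
A quick way to check what those four bytes are is to unpack the word in little-endian order. The following is a minimal Python sketch (the variable names are illustrative, not part of the lab) that also decodes the whole xxd row shown above:

```python
import struct

word = 0x6C6C6568  # the 32-bit value objdump prints for this word

# Pack the word little-endian to recover the byte order used in the file.
raw = struct.pack("<I", word)
print(raw.hex())  # 68656c6c
print(raw)        # b'hell'

# The 16 bytes of the xxd row at offset 0x100, decoded as raw bytes.
xxd_row = bytes.fromhex("68656c6c6f2c2000210a006e65656420")
print(xxd_row)    # b'hello, \x00!\n\x00need '
```

The zero bytes in the row are the string terminators separating "hello, ", "!\n", and the start of the "need ..." message.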
----------------------------------------------------------------------------------

Q3. Identify all functions in task1. We will define a function as a target of a bl (or blr) instruction. For each function, identify the full range of addresses this function resides at (first address and last address), as well as what this function does. Hint: most of these were directly taken from lecture.

The first bl (branch with link) instruction that appears is:
40009c:	97fffff7 	bl	0x400078
The function it targets has its first address at 0x400078 and its last instruction at 0x40008c (occupying bytes through 0x40008f).
This function is a string-length routine: it counts the bytes of the string whose address is in x0 until it reaches the terminating zero byte, then returns that count.
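
The counting loop can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not a transcription of the instructions at 0x400078; `strlen_model`, `memory`, and `addr` are hypothetical names standing in for the routine, the process's memory, and the pointer in x0:

```python
def strlen_model(memory: bytes, addr: int) -> int:
    """Count bytes starting at addr until a zero byte is found."""
    length = 0
    while memory[addr + length] != 0:
        length += 1
    return length


# Sample memory taken from the xxd row in Q2: "hello, " NUL "!\n" NUL
memory = b"hello, \x00!\n\x00"
print(strlen_model(memory, 0))  # 7
print(strlen_model(memory, 8))  # 2
```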

The second function is called four times throughout task1, for example:
4000d0:	97fffff0 	bl	0x400090
This function's first address is 0x400090 and its last instruction is at 0x4000bc (occupying bytes through 0x4000bf).
This function allocates space on the stack, performs a write system call with arguments in x0, x1, and x2, then restores the original stack pointer.
The bl at 0x40009c falls inside this range, so this function itself calls the string-length function, presumably to obtain the byte count for the write.
This is the function that writes strings, including the user's command-line argument, to the terminal.
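
The branch targets printed by objdump can be cross-checked against the raw instruction words. A BL instruction holds a signed 26-bit word offset in its low bits, relative to the address of the BL itself. The helper below is a hypothetical sketch (`bl_target` is not part of any tool) that decodes the two words quoted above:

```python
def bl_target(pc: int, word: int) -> int:
    """Return the destination of an AArch64 BL instruction located at pc."""
    if (word >> 26) != 0b100101:
        raise ValueError("not a BL encoding")
    imm26 = word & 0x3FFFFFF
    if imm26 & (1 << 25):       # sign-extend the 26-bit field
        imm26 -= 1 << 26
    return pc + imm26 * 4       # offset is counted in 4-byte instructions


print(hex(bl_target(0x40009C, 0x97FFFFF7)))  # 0x400078
print(hex(bl_target(0x4000D0, 0x97FFFFF0)))  # 0x400090
```

Both offsets are negative, which matches the two functions sitting at lower addresses than their call sites.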

----------------------------------------------------------------------------------

Q4. task1 uses a single command-line argument. Where does it find, and how does it access, this argument? Which instructions are involved (address and instruction)?

Command-line arguments are passed to the program on the stack (argc and argv).
When _start begins executing, sp points at argc (the argument count), and the argv pointers follow it: argv[0] (the program name) at [sp, #8], argv[1] (the first real argument) at [sp, #16], and so on.
task1 first reads argc:
4000c0:	f94003e0 	ldr	x0, [sp]
This is the first instruction in task1, occurring at 0x4000c0; it loads argc into x0.

The argument string itself is reached through argv, an array of 8-byte pointers sitting directly above argc, so stepping 8 bytes at a time up the stack gives each successive argument pointer.
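
As an illustration, the initial stack can be modelled as a run of 8-byte little-endian values, assuming the standard Linux AArch64 entry layout (argc at [sp], then the argv pointers, then a null terminator). The pointer values and the `read_u64` helper below are made up for the sketch:

```python
import struct


def read_u64(stack: bytes, offset: int) -> int:
    """Model of `ldr xN, [sp, #offset]`: read one 8-byte little-endian value."""
    return struct.unpack_from("<Q", stack, offset)[0]


# Hypothetical pointer values; real addresses depend on the run.
ARGV0_PTR = 0x7FFFF000  # would point at "./task1"
ARGV1_PTR = 0x7FFFF008  # would point at "yay"

# Stack contents at _start for `./task1 yay`: argc, argv[0], argv[1], NULL
stack = struct.pack("<4Q", 2, ARGV0_PTR, ARGV1_PTR, 0)

argc = read_u64(stack, 0)    # what `ldr x0, [sp]` at 0x4000c0 loads
argv1 = read_u64(stack, 16)  # pointer to the first real argument
print(argc, hex(argv1))      # 2 0x7ffff008
```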

----------------------------------------------------------------------------------

Q5. Why does the argument-processing code compare something with 2? Where does this 2 come from?

When executing task1 with a single command-line argument, e.g. `danial27@castor:~$ ./task1 yay`,
argc is actually 2, because the program name "./task1" counts as argv[0] and "yay" is argv[1].
The comparison with 2 therefore checks that exactly one additional argument was supplied to the program.
The 2 itself is an immediate encoded in the cmp instruction; the value it is compared against is argc, loaded from [sp] at 0x4000c0.
4000c4:	f100081f 	cmp	x0, #0x2        // checks that one argument is provided alongside ./task1
4000c8:	540000a0 	b.eq	0x4000dc    // branches to the code at 0x4000dc that handles the argument when argc == 2

If no arguments or two or more arguments are provided to the program,
the program prints "need exactly one command-line argument",
since x0 (argc) would then be 1, or 3 or more.
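
Both the immediate 2 and the branch target can be recovered from the raw instruction words. The sketch below uses two hypothetical helpers that decode only the two encodings involved here (64-bit `cmp` with an immediate, which is an alias of SUBS with the zero register as destination, and `b.cond`):

```python
def cmp_immediate(word: int) -> tuple:
    """Decode a 64-bit CMP (immediate); returns (register number, immediate)."""
    if (word >> 23) != 0b111100010 or (word & 0x1F) != 31:
        raise ValueError("not a 64-bit CMP (immediate) encoding")
    imm = (word >> 10) & 0xFFF
    if (word >> 22) & 1:        # optional left shift by 12
        imm <<= 12
    return (word >> 5) & 0x1F, imm


def bcond_target(pc: int, word: int) -> tuple:
    """Decode B.cond at pc; returns (condition code, destination address)."""
    if (word >> 24) != 0b01010100 or (word >> 4) & 1:
        raise ValueError("not a B.cond encoding")
    imm19 = (word >> 5) & 0x7FFFF
    if imm19 & (1 << 18):       # sign-extend the 19-bit field
        imm19 -= 1 << 19
    return word & 0xF, pc + imm19 * 4


print(cmp_immediate(0xF100081F))           # (0, 2): compare x0 with 2
cond, target = bcond_target(0x4000C8, 0x540000A0)
print(cond, hex(target))                   # 0 0x4000dc (condition 0 is EQ)
```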

----------------------------------------------------------------------------------