ARM Assembly Programming
Computer Organization and Assembly Languages
Yung-Yu Chuang
with slides by Peng-Sheng Chen
GNU compiler and binutils
• HAM uses GNU compiler and binutils
– gcc: GNU C compiler
– as: GNU assembler
– ld: GNU linker
– gdb: GNU project debugger
– insight: a (Tcl/Tk) graphic interface to gdb
Pipeline
• COFF (Common Object File Format)
• ELF (Executable and Linkable Format)
• Segments in the object file
– Text: code
– Data: initialized global variables
– BSS: uninitialized global variables
C source (.c) -> gcc -> asm source (.s) -> as -> object file (.coff) -> ld -> executable (.elf) -> Simulator / Debugger
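As a rough illustration of how C objects map onto the segments listed above, here is a minimal sketch. It assumes a typical GNU toolchain with default section placement; the names `initialized_counter`, `uninitialized_buffer`, and `sum_buffer` are hypothetical.

```c
/* Initialized global: typically placed in the data segment. */
int initialized_counter = 42;

/* Uninitialized global: typically placed in BSS and zero-filled at startup. */
int uninitialized_buffer[16];

/* Function code: placed in the text segment. */
int sum_buffer(void)
{
    int total = initialized_counter;
    for (int i = 0; i < 16; i++)
        total += uninitialized_buffer[i];
    return total;
}
```

Running a tool such as `nm` or `objdump` on the resulting object file is one way to check where each symbol actually ends up.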
…
GAS program format
        .file "test.s"
        .text
        .global main
        .type main, %function
main:
        MOV R0, #100
        ADD R0, R0, R0
        SWI #11
        .end
GAS program format
        .file "test.s"
        .text
        .global main              @ export variable
        .type main, %function     @ set the type of a symbol to be either a function or an object
main:
        MOV R0, #100
        ADD R0, R0, R0
        SWI #11                   @ call interrupt to end the program
        .end                      @ signals the end of the program
ARM assembly program
label   operation  operand       comments
main:
        LDR     R1, value     @ load value
        STR     R1, result
        SWI     #11
value:  .word   0x0000C123
result: .word   0
Control structures
• A program implements algorithms to solve problems. Program decomposition and flow of control are important concepts for expressing algorithms.
• Flow of control:
– Sequence.
– Decision: if-then-else, switch
– Iteration: repeat-until, do-while, for
• Decomposition: split a problem into several smaller, manageable ones and solve them independently (subroutines/functions/procedures).
Decision
• If-then-else
• switch
If statements
if C then T else E

        C
        BNE else
        T
        B endif
else:
        E
endif:

// find maximum
if (R0>R1) then R2:=R0 else R2:=R1

        CMP R0, R1
        BLE else
        MOV R2, R0
        B endif
else:   MOV R2, R1
endif:
If statements
// find maximum
if (R0>R1) then R2:=R0 else R2:=R1

        CMP R0, R1
        BLE else
        MOV R2, R0
        B endif
else:   MOV R2, R1
endif:

Two other options:

        CMP R0, R1
        MOVGT R2, R0
        MOVLE R2, R1

        MOV R2, R0
        CMP R0, R1
        MOVLE R2, R1
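For comparison, the branching version and the "default then override" version of the maximum can be sketched in C as follows. The function names `max_branch` and `max_override` are hypothetical, and a compiler is free to generate different instructions from what the comments suggest.

```c
/* Branching form: mirrors CMP / BLE else / MOV R2, R0 / B endif / MOV R2, R1. */
int max_branch(int r0, int r1)
{
    int r2;
    if (r0 > r1)
        r2 = r0;
    else
        r2 = r1;
    return r2;
}

/* Default-then-override form: mirrors MOV R2, R0 / CMP R0, R1 / MOVLE R2, R1. */
int max_override(int r0, int r1)
{
    int r2 = r0;      /* assume R0 is the maximum */
    if (r0 <= r1)     /* LE condition */
        r2 = r1;      /* override when the assumption fails */
    return r2;
}
```

Both functions return the same value for every input; the second form avoids the two branches at the cost of one unconditional move.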
If statements
if (R1==1 || R1==5 || R1==12) R0=1;

        TEQ R1, #1            ...
        TEQNE R1, #5          ...
        TEQNE R1, #12         ...
        MOVEQ R0, #1          BNE fail
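The chain of conditional tests behaves like a short-circuit OR: once one comparison succeeds, the remaining TEQNE instructions are skipped and the Z flag stays set. A minimal C sketch of the same logic, with the hypothetical name `set_flag_if_member`:

```c
/* Returns the new value of R0: 1 if r1 is one of {1, 5, 12}, otherwise unchanged. */
int set_flag_if_member(int r1, int r0)
{
    /* Each || operand is evaluated only if the previous ones were false,
       which is the role of the NE condition on the later TEQ instructions. */
    if (r1 == 1 || r1 == 5 || r1 == 12)
        r0 = 1;   /* MOVEQ R0, #1 */
    return r0;
}
```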
If statements
if (R1==0) zero
else if (R1>0) plus
else if (R1<0) neg

        TEQ R1, #0
        BMI neg
        BEQ zero
        BPL plus
neg:    ...
        B exit
zero:   ...
        B exit
        ...
If statements
R0=abs(R0)
        TEQ R0, #0
        RSBMI R0, R0, #0
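The two instructions test the sign of R0 and, only when it is negative, replace it with 0 - R0 (a reverse subtract). A minimal C sketch of the same idea follows; the name `abs_like_rsbmi` is hypothetical, and the sketch assumes the input is not the most negative `int`, for which the negation overflows in C.

```c
/* Absolute value written the way the TEQ / RSBMI pair computes it. */
int abs_like_rsbmi(int r0)
{
    if (r0 < 0)        /* MI: negative flag set by TEQ R0, #0 */
        r0 = 0 - r0;   /* RSBMI R0, R0, #0 computes 0 - R0 */
    return r0;
}
```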
Multi-way branches
        CMP R0, #'0'
        BCC other       @ less than '0'
        CMP R0, #'9'
        BLS digit       @ between '0' and '9'
        CMP R0, #'A'
        BCC other
        CMP R0, #'Z'
        BLS letter      @ between 'A' and 'Z'
        CMP R0, #'a'
        BCC other
        CMP R0, #'z'
        BHI other       @ not between 'a' and 'z'
letter: ...
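The same ladder of range checks can be sketched in C. BCC, BLS, and BHI are unsigned conditions, so the sketch compares an unsigned value; the enum and the function name `classify` are hypothetical.

```c
enum char_class { CLASS_OTHER, CLASS_DIGIT, CLASS_LETTER };

/* Classifies a character code using the same sequence of comparisons. */
enum char_class classify(unsigned int c)
{
    if (c < '0')  return CLASS_OTHER;   /* BCC other: less than '0' */
    if (c <= '9') return CLASS_DIGIT;   /* BLS digit: between '0' and '9' */
    if (c < 'A')  return CLASS_OTHER;   /* BCC other */
    if (c <= 'Z') return CLASS_LETTER;  /* BLS letter: between 'A' and 'Z' */
    if (c < 'a')  return CLASS_OTHER;   /* BCC other */
    if (c > 'z')  return CLASS_OTHER;   /* BHI other: not between 'a' and 'z' */
    return CLASS_LETTER;                /* falls through to letter */
}
```

Because the ranges are tested in increasing order, each range needs only its upper bound once the lower bound has been ruled out.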
Switch statements
switch (exp) {
  case c1: S1; break;
  case c2: S2; break;
  ...
  case cN: SN; break;
  default: SD;
}

e=exp;
if (e==c1) {S1}
else
  if (e==c2) {S2}
  else
    ...
Switch statements
switch (R0) {
  case 0: S0; break;
  case 1: S1; break;
  case 2: S2; break;
  case 3: S3; break;
  default: err;
}

        CMP R0, #0
        BEQ S0
        CMP R0, #1
        BEQ S1
        CMP R0, #2
        BEQ S2
        CMP R0, #3
        BEQ S3
err:    ...
        B exit
S0:     ...
        B exit

The range is between 0 and N
Slow if N is large
Switch statements
        ADR R1, JMPTBL
        CMP R0, #3
        LDRLS PC, [R1, R0, LSL #2]
err:    ...
        B exit
S0:     ...

JMPTBL:
        .word S0
        .word S1
        .word S2
        .word S3

[Figure: R1 points to JMPTBL; R0 indexes the entries S0 to S3]
What if the range is between M and N?
For larger N and sparse values, we could use a hash function.
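A jump table can be approximated in C with an array of function pointers. The sketch below is an illustration only: the handler names and return values are hypothetical stand-ins for S0 to S3 and err. It also shows one common answer to the range question, namely subtracting M before indexing, so that a single unsigned comparison covers both bounds.

```c
typedef int (*handler_fn)(void);

/* Hypothetical handlers standing in for the code at S0..S3 and err. */
static int handle_s0(void)  { return 10; }
static int handle_s1(void)  { return 11; }
static int handle_s2(void)  { return 12; }
static int handle_s3(void)  { return 13; }
static int handle_err(void) { return -1; }

/* Plays the role of JMPTBL: one entry per case value. */
static const handler_fn jump_table[4] = {
    handle_s0, handle_s1, handle_s2, handle_s3
};

/* Range 0..3: one unsigned compare (CMP R0, #3 / LDRLS) guards the lookup. */
int dispatch(unsigned int r0)
{
    if (r0 <= 3u)
        return jump_table[r0]();
    return handle_err();
}

/* Range M..M+3: subtract M first; values below M wrap to large unsigned
   numbers and are rejected by the same comparison. */
int dispatch_from(int value, int m)
{
    unsigned int index = (unsigned int)value - (unsigned int)m;
    return dispatch(index);
}
```

The lookup cost is constant regardless of how many cases there are, in contrast to the compare-and-branch chain.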
Iteration
• repeat-until
• do-while
• for
repeat loops
do { S } while ( C )

loop:
        S
        C
        BEQ loop
endw:
while loops
while ( C ) { S }

Test at the top:

loop:
        C
        BNE endw
        S
        B loop
endw:

Test at the bottom:

        B test
loop:
        S
test:
        C
        BEQ loop
endw:
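The two layouts can be written out in C with explicit `goto`s to make the branch structure visible. This is a sketch for illustration; the function names are hypothetical, and the loop body simply sums 1 to n.

```c
/* Layout 1: test at the top, unconditional branch back at the bottom.
   Each iteration executes two branches. */
int sum_test_at_top(int n)
{
    int total = 0, i = 1;
loop:
    if (!(i <= n)) goto endw;   /* C, then branch out when it fails */
    total += i;                 /* S */
    i++;
    goto loop;                  /* B loop */
endw:
    return total;
}

/* Layout 2: jump to the test once, then one conditional branch per iteration. */
int sum_test_at_bottom(int n)
{
    int total = 0, i = 1;
    goto test;                  /* B test */
loop:
    total += i;                 /* S */
    i++;
test:
    if (i <= n) goto loop;      /* C, then branch back while it holds */
    return total;
}
```

Both functions return the same result, including for n = 0, where the body never runs.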
GCD
int gcd (int i, int j)
{
  while (i!=j)
  {
    if (i>j)
      i -= j;
    else
      j -= i;
  }
  return i;
}
GCD
loop:   CMP R1, R2
        SUBGT R1, R1, R2
        SUBLT R2, R2, R1
        BNE loop
for loops
for ( I ; C ; A ) { S }

        I
loop:
        C
        BNE endfor
        S
        A
        B loop
endfor:

for (i=0; i<10; i++) { a[i]:=0; }

        MOV R0, #0
        ADR R2, A
        MOV R1, #0
loop:   CMP R1, #10
        BGE endfor
        STR R0, [R2, R1, LSL #2]
        ADD R1, R1, #1
        B loop
endfor:
for loops
for (i=0; i<10; i++) { do something; }
Execute a loop a constant number of times.

Counting up:

        MOV R1, #0
loop:   CMP R1, #10
        BGE endfor
        @ do something
        ADD R1, R1, #1
        B loop
endfor:

Counting down:

        MOV R1, #10
loop:
        @ do something
        SUBS R1, R1, #1
        BNE loop
endfor:
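When the loop index itself is not needed, counting down lets the decrement set the flags, so no separate compare is required. The C sketch below contrasts the two shapes; the function names are hypothetical and `body_runs` stands in for "do something".

```c
/* Count-up form: compares against the bound on every iteration (CMP + BGE). */
int run_count_up(int times)
{
    int body_runs = 0;
    for (int i = 0; i < times; i++)
        body_runs++;
    return body_runs;
}

/* Count-down form: the decrement doubles as the test (SUBS + BNE).
   The body runs before the first test, so this sketch assumes times >= 1. */
int run_count_down(int times)
{
    int body_runs = 0;
    int i = times;
    do {
        body_runs++;
    } while (--i != 0);
    return body_runs;
}
```

For a constant count such as 10, both forms execute the body the same number of times.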
Procedures
• Arguments: expressions passed into a function
• Parameters: values received by the function
• Caller and callee

void func(int a, int b)     // callee; a and b are parameters
{
  ...
}
int main(void)              // caller
{
  func(100,200);            // 100 and 200 are arguments
  ...
}
Procedures
main:
        ...
        BL func
        ...
        .end

func:
        ...
        ...
        .end

• How to pass arguments? By registers? By stack? By memory? In what order?
Procedures
main:                   @ caller
        @ use R5
        BL func
        @ use R5
        ...
        .end

func:                   @ callee
        ...
        @ use R5
        ...
        .end

• How to pass arguments? By registers? By stack? By memory? In what order?
• Who should save R5? Caller? Callee?
Procedures (caller save)
main:                   @ caller
        @ use R5
        @ save R5
        BL func
        @ restore R5
        @ use R5
        .end

func:                   @ callee
        ...
        @ use R5
        .end

• How to pass arguments? By registers? By stack? By memory? In what order?
• Who should save R5? Caller? Callee?
Procedures (callee save)
main:                   @ caller
        @ use R5
        BL func
        @ use R5
        .end

func:                   @ callee
        @ save R5
        ...
        @ use R5
        @ restore R5
        .end

• How to pass arguments? By registers? By stack? By memory? In what order?
• Who should save R5? Caller? Callee?
Procedures
• How to pass arguments? By registers? By stack? By memory? In what order?
• Who should save R5? Caller? Callee?
• We need a protocol for these.
ARM Procedure Call Standard (APCS)
• ARM Ltd. defines a set of rules for procedure entry and exit so that
– Object codes generated by different compilers can be linked together
– Procedures can be called between high-level languages and assembly
• APCS defines
– Use of registers
– Use of stack
– Format of stack-based data structure
– Mechanism for argument passing
APCS register usage convention
Register  APCS name  APCS role
0         a1         Argument 1 / integer result / scratch register
1         a2         Argument 2 / scratch register
2         a3         Argument 3 / scratch register
3         a4         Argument 4 / scratch register
4         v1         Register variable 1
5         v2         Register variable 2
6         v3         Register variable 3
7         v4         Register variable 4
8         v5         Register variable 5
9         sb/v6      Static base / register variable 6
10        sl/v7      Stack limit / register variable 7
11        fp         Frame pointer
12        ip         Scratch reg. / new sb in inter-link-unit calls
13        sp         Lower end of current stack frame
14        lr         Link address / scratch register
15        pc         Program counter
APCS register usage convention
• Argument registers a1 to a4 (R0 to R3)
– Used to pass the first 4 parameters
– Caller-saved if necessary
• Register variables
– Must return unchanged
– Callee-saved
• Registers for special purposes
– Could be used as temporary variables if saved properly.
Argument passing
• The first four word arguments are passed through R0 to R3.
• Remaining parameters are pushed onto the stack in reverse order.
• Procedures with at most four parameters are more efficient, since all their arguments fit in registers.
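To make the rule concrete, consider a function with six word-sized parameters. The comments in the sketch below state where each argument would travel under the rule above; this is an assumption about a compiler that follows APCS without inlining the call, and the function name `sum6` is hypothetical.

```c
/* Under the rule above:
     a, b, c, d  would arrive in R0, R1, R2, R3
     e, f        would be pushed onto the stack by the caller,
                 in reverse order (f first, then e)
   The result would be returned in R0. */
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}
```

A call such as `sum6(1, 2, 3, 4, 5, 6)` therefore costs two extra memory writes in the caller and two extra reads in the callee compared with a four-parameter function.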
Return value
• One word value in R0
• A value of length 2~4 words (R0-R1, R0-R2, R0-R3)
Function entry/exit
• A simple leaf function with at most four parameters has the minimal overhead. 50% of calls are to leaf functions.
[Figure: call tree with main at the root and leaf functions at the bottom]

        BL leaf1
        ...
leaf1:  ...
        ...
        MOV PC, LR      @ return
Function entry/exit
• Save a minimal set of temporary variables
        BL leaf2
        ...
leaf2:  STMFD sp!, {regs, lr}   @ save
        ...
        LDMFD sp!, {regs, pc}   @ restore and return
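The difference between the two kinds of function can be seen at the C level. In the sketch below, the names are hypothetical and the comments describe what a compiler following the convention above would need to do; an optimizing compiler may inline the calls and remove the distinction.

```c
/* Leaf function: calls nothing, so the return address can stay in lr
   and no registers need to be saved on the stack. */
static int leaf_square(int x)
{
    return x * x;
}

/* Non-leaf function: each call overwrites lr, and values that must survive
   the calls (here b and the first square) have to be kept in saved registers
   or on the stack, which is what the STMFD / LDMFD pair provides. */
int sum_of_squares(int a, int b)
{
    int first = leaf_square(a);
    int second = leaf_square(b);
    return first + second;
}
```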
Standard ARM C program address space
From low addresses to high addresses:

application load address
        code
        static data             (code and static data form the application image)
top of application
        heap
top of heap
        ...
stack limit (sl)
stack pointer (sp)
        stack
top of memory
Accessing operands
• A procedure often accesses operands in the following ways
– An argument passed in a register: no further work
– An argument passed on the stack: use stack pointer (R13) relative addressing with an immediate offset known at compile time
– A constant: PC-relative addressing, offset known at compile time
– A local variable: allocate on the stack and access through stack pointer relative addressing
– A global variable: allocated in the static area and can be accessed by static base (R9) relative addressing
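The operand kinds listed above can all appear in one small C function. The comments in this sketch say how each operand would typically be reached under the convention described here; the names `global_bias` and `combine` are hypothetical, and an optimizing compiler may keep locals in registers instead of on the stack.

```c
/* Global variable: allocated in the static area. */
int global_bias = 7;

int combine(int a1, int a2, int a3, int a4, int a5)
{
    /* Local variable: allocated on the stack, reached through sp. */
    int local_scratch[2];

    /* a1..a4: arguments passed in registers, no further work. */
    local_scratch[0] = a1 + a2 + a3 + a4;

    /* a5: fifth argument, passed on the stack at a fixed offset from sp. */
    local_scratch[1] = a5;

    /* 0x12345678: a constant too large for an immediate, typically loaded
       from a literal pool with PC-relative addressing. */
    return local_scratch[0] + local_scratch[1] + 0x12345678 + global_bias;
}
```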
Procedure
main:
        MOV R0, #0
        ...
        BL func
        ...

[Stack diagram: addresses run from low to high; the stack occupies the high end]
Procedure
func:   STMFD SP!, {R4-R6, LR}
        SUB SP, SP, #0xC
        ...
        STR R0, [SP, #0]        @ v1=a1
        ...
        ADD SP, SP, #0xC
        LDMFD SP!, {R4-R6, PC}

[Stack diagram, from low to high addresses: v1, v2, v3, R4, R5, R6, LR]
Assignment #3 Box Filter
What is an image
• We can think of an image as a function, f: R^2 → R:
– f(r, c) gives the intensity at position (r, c)
– defined over a rectangle, with a finite range:
• f: [0,h-1]x[0,w-1] → [0,255]
[Figure: intensity f plotted over row r and column c]
A digital image
• The image can be represented as a matrix of integer values (r indexes rows, c indexes columns)

110 110 100 100 100 100 100 100 100 100
120 130 100 100 100 100 100 100 100 100
110 100 100 100 130 110 120 110 100 100
110 100 100 100 130 110 120 110 100 100
100 100 100 110  90 100  90 100 100 110
130 100 100 130 100  90 130 110 120 100
100 100 100 120 100 130 110 120 110 100
100 100 100  90 110  80 120 100 100 100
100 100 100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100 100 100
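One way to store such a matrix in a program is a flat, row-major array of bytes, with f(r, c) implemented as an index computation. The sketch below assumes that layout; the type `image_t` and the function `image_at` are hypothetical names, not part of the assignment.

```c
#include <stddef.h>
#include <stdint.h>

/* An h-by-w grayscale image stored row by row, one byte per pixel (0..255). */
typedef struct {
    int h;                  /* number of rows */
    int w;                  /* number of columns */
    const uint8_t *pixels;  /* h * w intensities in row-major order */
} image_t;

/* f(r, c): intensity at row r, column c.
   Assumes 0 <= r < h and 0 <= c < w; no bounds checking is done. */
int image_at(const image_t *img, int r, int c)
{
    return img->pixels[(size_t)r * (size_t)img->w + (size_t)c];
}
```

With this layout, moving one column to the right advances the address by one byte, and moving one row down advances it by w bytes.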