• 沒有找到結果。

Software hierarchy

N/A
N/A
Protected

Academic year: 2022

Share "Software hierarchy"

Copied!
31
0
0

加載中.... (立即查看全文)

全文

(1)

www.nand2tetris.org

Building a Modern Computer From First Principles

Assembler

(2)

Where we are at:

Assembler Chapter 6

H.L. Language

&

Operating Sys.

abstract interface

Compiler

Chapters 10 - 11

VM Translator

Chapters 7 - 8

Computer Architecture

Chapters 4 - 5

Gate Logic

Chapters 1 - 3 Electrical

Engineering

Physics Virtual

Machine

abstract interface

Software hierarchy

Assembly Language

abstract interface

Hardware hierarchy

Machine Language

abstract interface

Hardware Platform

abstract interface

Chips &

Logic Gates

abstract interface

Human Thought

Abstract design

Chapters 9, 12

(3)

Why care about assemblers?

 Assemblers employ nifty programming tricks

 Assemblers are the first rung up the software hierarchy ladder

 An assembler is a translator of a simple language

 Writing an assembler = low-impact practice for writing compilers.

cross-platform

compiling

(4)

0000000000010000 1110111111001000 0000000000010001 1110101010001000 0000000000010000 1111110000010000 0000000000000000 1111010011010000 0000000000010010 1110001100000001 0000000000010000 1111110000010000 0000000000010001

...

Target code (a text file of binary Hack code)

assemble

Assembly example

       // Computes 1+...+RAM[0]

// And stored the sum in RAM[1]

@i    

M=1    // i = 1      

@sum  

M=0    // sum = 0 (LOOP)

@i     // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE  D;JGT

...    // Etc.

Source code (example)

The program translation challenge

 Extract the program’s semantics from the source program, using the syntax rules of the source language

 Re-express the program’s semantics in the target language, using the syntax rules of the target language

Assembler = simple translator

 Translates each assembly command into one or more binary machine instructions

 Handles symbols (e.g. i , sum , LOOP , …).

execute

(5)

Revisiting Hack low-level programming: an example

        // Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@i

M=1   // i = 1      

@sum  

M=0   // sum = 0 (LOOP)

@i // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE  D;JGT

@i // sum += i D=M

@sum M=D+M

@i // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Assembly program

(

sum.asm

)

CPU emulator screen shot after running this program

The CPU emulator allows loading and executing symbolic Hack code. It resolves all the symbolic

program generated

output user supplied

input

(6)

The assembler’s view of an assembly program

        // Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@i

M=1   // i = 1      

@sum  

M=0   // sum = 0 (LOOP)

@i // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE  D;JGT

@i // sum += i D=M

@sum M=D+M

@i // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being

one of the following:

(7)

The assembler’s view of an assembly program

// Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@i

M=1   // i = 1 

@sum  

M=0   // sum = 0 (LOOP)

@i // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE  D;JGT

@i // sum += i D=M

@sum M=D+M

@i // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being one of the following:

White space

Empty lines/indentation

Line comments

In-line comments

(8)

The assembler’s view of an assembly program

        // Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@i

M=1   // i = 1      

@sum  

M=0   // sum = 0 (LOOP)

@i // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE  D;JGT

@i // sum += i D=M

@sum M=D+M

@i // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being one of the following:

White space

Empty lines/indentation

Line comments

In-line comments

Instructions

A-instruction

C-instruction

(9)

The assembler’s view of an assembly program

        // Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@i

M=1   // i = 1      

@sum

M=0   // sum = 0 (LOOP)

@i // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE D;JGT

@i // sum += i D=M

@sum M=D+M

@i // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being one of the following:

White space

Empty lines/indentation

Line comments

In-line comments

Instructions

A-instruction

C-instruction

Symbols

references

Label declaration (XXX)

(10)

The assembler’s view of an assembly program

        // Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@16

M=1   // i = 1      

@17

M=0   // sum = 0

@16 // if i>RAM[0] goto WRITE D=M

@0 D=D‐M

@18 D;JGT

@16 // sum += i D=M

@17 M=D+M

@16 // i++

M=M+1 

@4 // goto LOOP 0;JMP

@17 D=M

@1

M=D  // RAM[1] = the sum

@22 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being one of the following:

White space

Empty lines/indentation

Line comments

In-line comments

Instructions

A-instruction

C-instruction

Symbols

references

Label declaration (XXX) Assume that there is

no symbol for now!

(11)

White space

        // Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@16    

M=1   // i = 1      

@17  

M=0   // sum = 0

@16    // if i>RAM[0] goto WRITE D=M

@0 D=D‐M

@18  D;JGT

@16    // sum += i D=M

@17 M=D+M

@16    // i++

M=M+1 

@4 // goto LOOP 0;JMP

@17 D=M

@1

M=D  // RAM[1] = the sum

@22 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being one of the following:

White space

Empty lines/indentation

Line comments

In-line comments

Instructions

A-instruction

C-instruction

Symbols

references

Label declaration (XXX)

(12)

White space → ignore/remove them

   

@16    

M=1       

@17   M=0  

@16    D=M

@0 D=D‐M

@18  D;JGT

@16    D=M

@17 M=D+M

@16    M=M+1 

@4 0;JMP

@17 D=M

@1 M=D 

@22 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being one of the following:

White space

Empty lines/indentation

Line comments

In-line comments

Instructions

A-instruction

C-instruction

Symbols

references

Label declaration (XXX)

(13)

Instructions → binary encoding

   

@16    

M=1       

@17   M=0  

@16    D=M

@0 D=D‐M

@18  D;JGT

@16    D=M

@17 M=D+M

@16    M=M+1 

@4 0;JMP

@17 D=M

@1 M=D 

@22 0;JMP

Assembly program

Assembly program =

a stream of text lines, each being one of the following:

White space

Empty lines/indentation

Line comments

In-line comments

Instructions

A-instruction

C-instruction

Symbols

references

Label declaration (XXX)

(14)

Translating / assembling A-instructions

value (v = 0 or 1)

0 v v v v v v v v v v v v v v v

Binary:

@ value // Where value is either a non-negative decimal number // or a symbol referring to such number.

Symbolic:

Translation to binary:

If value is a non-negative decimal number, simple, e.g. @16

If value is a symbol, later.

(15)

Translating / assembling C-instructions

jump dest

comp

1 1 1

a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3

dest=comp;jump // Either the dest or jump fields may be empty.

// If dest is empty, the "=" is ommitted;

// If jump is empty, the ";" is omitted.

Symbolic:

Binary:

(16)

Translating / assembling C-instructions

jump dest

comp

1 1 1

a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3

dest=comp;jump // Either the dest or jump fields may be empty.

// If dest is empty, the "=" is ommitted;

// If jump is empty, the ";" is omitted.

Symbolic:

Binary:

Example: MD=D+1

(17)

Translating / assembling C-instructions

jump dest

comp

1 1 1

a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3

dest=comp;jump // Either the dest or jump fields may be empty.

// If dest is empty, the "=" is ommitted;

// If jump is empty, the ";" is omitted.

Symbolic:

Binary:

Example: D; JGT

(18)

The overall assembly logic

For each (real) command

Parse the command,

i.e. break it into its underlying fields

A-instruction: replace the symbolic reference (if any) with the

corresponding memory address, which is a number

(how to do it, later)

C-instruction: for each field in the instruction, generate the

corresponding binary code

Assemble the translated binary codes into a complete 16-bit machine

instruction

Write the 16-bit instruction to the output file.

        // Computes 1+...+RAM[0]

// And stores the sum in RAM[1].

@i    

M=1   // i = 1      

@sum  

M=0   // sum = 0 (LOOP)

@i    // if i>RAM[0] goto WRITE D=M

@0 D=D‐M

@WRITE  D;JGT

@i    // sum += i D=M

@sum M=D+M

@i    // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Assembly program

(19)

Assembly programs typically have many symbols:

 Labels that mark destinations of goto commands

 Labels that mark special memory locations

 Variables

These symbols fall into two categories:

 User–defined symbols (created by programmers)

 Variables

 Labels (forward reference could be a problem)

 Pre-defined symbols (used by the Hack platform).

Handling symbols (aka symbol resolution)

@R0 D=M

@END D;JLE 

@counter M=D

@SCREEN D=A

@x M=D (LOOP)

@x A=M M=‐1

@x D=M

@32 D=D+A

@x M=D

@counter MD=M‐1

@LOOP D;JGT (END)

@END 0;JMP

Typical symbolic Hack assembly code:

(20)

Label symbols: Used to label destinations of goto commands. Declared by the pseudo-command (XXX). This directive defines the symbol XXX to refer to the instruction memory location holding the next command in the program. (the assembler needs to maintain instrCtr)

Variable symbols: Any user-defined symbol xxx appearing in an assembly program that is not defined elsewhere using the ( xxx) directive is treated as a variable, and is automatically

assigned a unique RAM address, starting at RAM address 16 (the assembler needs to maintain

nextAddr)

(why start at 16? Later.)

By convention, Hack programmers use lower-case and upper-case to represent variable and label names, respectively

Handling symbols: user-defined symbols

@R0 D=M

@END D;JLE 

@counter M=D

@SCREEN D=A

@x M=D (LOOP)

@x A=M M=‐1

@x D=M

@32 D=D+A

@x M=D

@counter MD=M‐1

@LOOP D;JGT (END)

@END 0;JMP

Typical symbolic Hack assembly code:

(21)

Virtual registers:

The symbols R0,…, R15 are automatically

predefined to refer to RAM addresses 0,…,15 I/O pointers: The symbols SCREEN and KBD are

automatically predefined to refer to RAM addresses 16384 and 24576, respectively (base addresses of the screen and keyboard memory maps)

VM control pointers: the symbols SP, LCL, ARG, THIS, and THAT (that don’t appear in the code example on the right) are automatically

predefined to refer to RAM addresses 0 to 4, respectively

(The VM control pointers, which overlap R0,…, R4 will come to play in the virtual machine

implementation, covered in the next lecture)

@R0 D=M

@END D;JLE 

@counter M=D

@SCREEN D=A

@x M=D (LOOP)

@x A=M M=‐1

@x D=M

@32 D=D+A

@x M=D

@counter MD=M‐1

@LOOP D;JGT (END)

@END 0;JMP

Typical symbolic Hack assembly code:

Handling symbols: pre-defined symbols

(22)

        // Computes 1+...+RAM[0]

// And stored the sum in RAM[1]

@i    

M=1   // i = 1      

@sum  

M=0   // sum = 0 (LOOP)

@i    // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE  D;JGT

@i    // sum += i D=M

@sum M=D+M

@i    // i++

M=M+1  @LOOP // goto LOOP 0;JMP (WRITE) @sum D=M @R1 M=D  // RAM[1] = the sum (END) @END 0;JMP

Source code (example) This symbol table is generated by the assembler, and used to translate the symbolic code into binary code. Handling symbols: symbol table

R0       0

R1       1

R2       2

...      ... R15       15

SCREEN         16384

KBD      24576

SP       0

LCL      1

ARG      2

THIS       3

THAT       4

LOOP       4

WRITE       18

END       22

i 16 sum       17

Symbol table

(23)

R0       0

R1       1

R2       2

... R15       15

SCREEN         16384

KBD      24576

SP       0

LCL      1

ARG      2

THIS       3

THAT       4

Symbol table

        // Computes 1+...+RAM[0] // And stored the sum in RAM[1] @i     M=1   // i = 1       @sum   M=0   // sum = 0 (LOOP) @i    // if i>RAM[0] goto WRITE D=M @R0 D=D‐M @WRITE  D;JGT @i    // sum += i D=M @sum M=D+M @i    // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Source code (example)

Initialization: create an empty symbol table and populate it with all the pre-defined symbols

Handling symbols: constructing the symbol table

(24)

R0       0

R1       1

R2       2

... R15       15

SCREEN         16384

KBD      24576

SP       0

LCL      1

ARG      2

THIS       3

THAT       4

LOOP       4

WRITE       18

END       22

Symbol table

        // Computes 1+...+RAM[0] // And stored the sum in RAM[1] @i     M=1   // i = 1       @sum   M=0   // sum = 0 (LOOP) @i    // if i>RAM[0] goto WRITE D=M @R0 D=D‐M @WRITE  D;JGT @i    // sum += i D=M @sum M=D+M @i    // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Source code (example)

Initialization: create an empty symbol table and populate it with all the pre-defined symbols First pass: go through the entire source code, and add all the user-defined label symbols to the symbol table (without generating any code)

Handling symbols: constructing the symbol table

(25)

R0       0

R1       1

R2       2

... R15       15

SCREEN         16384

KBD      24576

SP       0

LCL      1

ARG      2

THIS       3

THAT       4

LOOP       4

WRITE       18

END       22

i 16 sum       17

Symbol table

        // Computes 1+...+RAM[0] // And stored the sum in RAM[1] @i     M=1   // i = 1       @sum   M=0   // sum = 0 (LOOP) @i    // if i>RAM[0] goto WRITE D=M @R0 D=D‐M @WRITE  D;JGT @i    // sum += i D=M @sum M=D+M @i    // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Source code (example)

Initialization: create an empty symbol table and populate it with all the pre-defined symbols First pass: go through the entire source code, and add all the user-defined label symbols to the symbol table (without generating any code) Second pass: go again through the source code, and use the symbol table to translate all the commands. In the process, handle all the user-

Handling symbols: constructing the symbol table

(26)

R0       0

R1       1

R2       2

... R15       15

SCREEN         16384

KBD      24576

SP       0

LCL      1

ARG      2

THIS       3

THAT       4

Symbol table

        // Computes 1+...+RAM[0] // And stored the sum in RAM[1] @i     M=1   // i = 1       @sum   M=0   // sum = 0 (LOOP) @i    // if i>RAM[0] goto WRITE D=M @R0 D=D‐M @WRITE  D;JGT @i    // sum += i D=M @sum M=D+M @i    // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Source code (example)

Handling symbols: constructing the symbol table (one-pass solution?)

(27)

The assembly process (detailed)

 Initialization: create the symbol table and initialize it with the pre- defined symbols

 First pass: march through the source code without generating any code.

For each label declaration (LABEL) that appears in the source code,

add the pair <LABEL , n > to the symbol table

(28)

The assembly process (detailed)

 Second pass: march again through the source, and process each line:

 If the line is a C-instruction, simple

 If the line is @xxx where xxx is a number, simple

 If the line is @xxx and xxx is a symbol, look it up in the symbol table and proceed as follows:

 If the symbol is found, replace it with its numeric value and complete the command’s translation

 If the symbol is not found, then it must represent a new variable:

add the pair <xxx , n > to the symbol table, where n is the next available RAM address, and complete the command’s

translation.

 (Platform design decision: the allocated RAM addresses are

running, starting at address 16).

(29)

Note that comment lines and pseudo-commands (label declarations) generate no code.

0000000000010000 1110111111001000 0000000000010001 1110101010001000 0000000000010000 1111110000010000 0000000000000000 1111010011010000 0000000000010010 1110001100000001 0000000000010000 1111110000010000 0000000000010001 1111000010001000 0000000000010000 1111110111001000 0000000000000100 1110101010000111 0000000000010001 1111110000010000 0000000000000001 1110001100001000 0000000000010110 1110101010000111

Target code

assemble

The result ...

        // Computes 1+...+RAM[0]

// And stored the sum in RAM[1]

@i    

M=1   // i = 1      

@sum  

M=0   // sum = 0 (LOOP)

@i    // if i>RAM[0] goto WRITE D=M

@R0 D=D‐M

@WRITE  D;JGT

@i    // sum += i D=M

@sum M=D+M

@i    // i++

M=M+1 

@LOOP // goto LOOP 0;JMP

(WRITE)

@sum D=M

@R1

M=D  // RAM[1] = the sum (END)

@END 0;JMP

Source code (example)

(30)

Proposed assembler implementation

An assembler program can be written in any high-level language. (and could be run in the other platforms, cross-platform compiling)

The book proposes a language-independent design, as follows.

Software modules:

Parser: Unpacks each command into its underlying fields

Code: Translates each field into its corresponding binary value, and assembles the resulting values

SymbolTable: Manages the symbol table

Main: Initializes I/O files and drives the show.

(31)

Perspective

 Simple machine language, simple assembler

 Most assemblers are not stand-alone, but rather encapsulated in a translator of a higher order

 C programmers that understand the code generated by a C compiler can improve their code considerably

 C programming (e.g. for real-time systems) may involve re-writing critical segments in assembly, for optimization

 Writing an assembler is an excellent practice for writing more

challenging translators, e.g. a VM Translator and a compiler, as we

will do in the next lectures.

參考文獻

相關文件

 Sender Report: used by active session participants to relay transmission and reception statistics.  Receiver Report: used to send reception statistics from those

• The burst profile to use for any uplink transmission is defined by the Uplink Interval Usage Code (UIUC).. – Each UIUC is mapped to a burst profile in the

In this section we define a general model that will encompass both register and variable automata and study its query evaluation problem over graphs. The model is essentially a

This document uses the terminology defined in the UPnP Architecture document, such as: action, SST variable, and action parameter. This sub-section defines the following

If x or F is a vector, then the condition number is defined in a similar way using norms and it measures the maximum relative change, which is attained for some, but not all

To be an effective practitioner, a defined body of formal knowledge and skills is the necessary, but not sufficient, condition to meet workplace requirements. The important

From these characterizations, we particularly obtain that a continuously differentiable function defined in an open interval is SOC-monotone (SOC-convex) of order n ≥ 3 if and only

Courtesy: Ned Wright’s Cosmology Page Burles, Nolette &amp; Turner, 1999?. Total Mass Density