
ICache

From the document ARM922T (Rev 0), pages 86-91

Caches, Write Buffer, and Physical Address TAG (PA TAG) RAM

4.2 ICache

The ARM922T includes an 8KB ICache. The ICache has 256 lines of 32 bytes (8 words), arranged as a 64-way set-associative cache and uses MVAs, translated by CP15 register 13 (see Address translation on page 3-6), from the ARM9TDMI core.

The ICache implements allocate-on-read-miss. Random or round-robin replacement can be selected under software control using the RR bit (CP15 register 1, bit 14).

Random replacement is selected at reset.

Instructions can also be locked in the ICache so that they cannot be overwritten by a linefill. This operates with a granularity of 1/64th of the cache, which is 32 words (128 bytes).
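The arithmetic behind these figures can be checked with a short C sketch. It only restates the numbers given above; the identifier names and the helper `icache_lock_units` are illustrative and not part of any ARM-supplied header, and the helper assumes the code to be locked starts on a 128-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Figures taken from the text; the identifier names are illustrative. */
enum {
    ICACHE_BYTES      = 8 * 1024,                         /* 8KB ICache             */
    ICACHE_LINE_BYTES = 32,                               /* 8 words per line       */
    ICACHE_WAYS       = 64,                               /* 64-way set-associative */
    ICACHE_LINES      = ICACHE_BYTES / ICACHE_LINE_BYTES, /* 256 lines              */
    ICACHE_SEGMENTS   = ICACHE_LINES / ICACHE_WAYS,       /* 4 sets (segments)      */
    ICACHE_LOCK_BYTES = ICACHE_BYTES / 64                 /* 128 bytes = 32 words   */
};

/* Number of 1/64th lockdown units needed to hold code_bytes of code.
   Assumption: the code starts on a 128-byte boundary. */
static uint32_t icache_lock_units(uint32_t code_bytes)
{
    assert(code_bytes <= ICACHE_BYTES);
    return (code_bytes + ICACHE_LOCK_BYTES - 1) / ICACHE_LOCK_BYTES;
}
```

For instance, 256 bytes of aligned code occupy two lockdown units, while 257 bytes need three.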

All instruction accesses are subject to MMU permission and translation checks.

Instruction fetches that are aborted by the MMU do not cause linefills or instruction fetches to appear on the AMBA ASB interface.

Note

For clarity, the I bit (bit 12 in CP15 register 1) is called the Icr bit throughout the following text. The C bit from the MMU translation table descriptor corresponding to the address being accessed is called the Ctt bit.

4.2.1 ICache organization

The ICache is organized as four segments, each containing 64 lines, and each line containing eight words. The position of the line within the segment is a number from 0 to 63. This is called the index. A line in the cache can be uniquely identified by its segment and index. The index is independent of the MVA. The segment is selected by bits [6:5] of the MVA.

Bits [4:2] of the MVA specify the word within a cache line that is accessed. For halfword operations, bit [1] of the MVA specifies the halfword that is accessed within the word. For byte operations, bits [1:0] specify the byte within the word that is accessed.

Bits [31:7] of the MVA of each cache line are called the TAG. The MVA TAG is stored in the cache, along with the eight words of data, when the line is loaded by a linefill.

Cache lookups compare bits [31:7] of the MVA of the access with the stored TAG to determine whether the access is a hit or miss. The cache is therefore said to be virtually addressed. The logical model of the 8KB ICache is shown in Figure 4-1 on page 4-5.
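The field split described in this section can be sketched in C as follows. The type and function names are hypothetical; only the bit positions come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a Modified Virtual Address as seen by the ICache.
   Names are illustrative; bit ranges follow the text. */
typedef struct {
    uint32_t tag;      /* MVA[31:7], compared against the stored TAG */
    uint32_t segment;  /* MVA[6:5], selects one of the four segments */
    uint32_t word;     /* MVA[4:2], word within the eight-word line  */
    uint32_t byte;     /* MVA[1:0], byte within the word             */
} icache_addr_t;

static icache_addr_t icache_decode_mva(uint32_t mva)
{
    icache_addr_t a;
    a.tag     = mva >> 7;
    a.segment = (mva >> 5) & 0x3u;
    a.word    = (mva >> 2) & 0x7u;
    a.byte    = mva & 0x3u;
    return a;
}
```

Note that no field of the MVA produces the index. Within the selected segment, the line is found by comparing the TAG against all 64 entries, which is why the index is independent of the address.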


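Figure 4-1 Addressing the 8KB ICache

[Figure: the Modified Virtual Address is divided into TAG (bits [31:7]), Seg (bits [6:5]), Word (bits [4:2]), and Byte (bits [1:0]). The Seg field selects one of four segments (SEG 0 to SEG 3). Each segment is a 2KB RAM of 64 lines x 8 words (cache line/index 0 to 63), with a CAM holding the TAG of each line. A decoder uses the Word field to select one of words W0 to W7, and the selected word is returned on RDATA[31:0].]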


4.2.2 Enabling and disabling the ICache

On reset, the ICache entries are all invalidated and the ICache is disabled.

You can enable the ICache by writing 1 to the Icr bit, and disable it by writing 0 to the Icr bit.

When the ICache is disabled, the cache contents are ignored and all instruction fetches appear on the AMBA ASB interface as separate nonsequential accesses. The ICache is usually used with the MMU enabled. In this case the Ctt in the relevant MMU translation table descriptor indicates whether an area of memory is cachable.

If the cache is disabled after having been enabled, all cache contents are ignored. All instruction fetches appear on the AMBA ASB interface as separate nonsequential accesses and the cache is not updated. If the cache is subsequently re-enabled its contents are unchanged. If the contents are no longer coherent with main memory, you must invalidate the ICache before you re-enable it (see Register 7, cache operations register on page 2-17).

If the cache is enabled with the MMU disabled, all instruction fetches are treated as cachable. No protection checks are made, and the physical address is flat-mapped to the modified virtual address.

You can enable the MMU and ICache simultaneously by writing a 1 to the M bit, and a 1 to the Icr bit in CP15 register 1, with a single MCR instruction.
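A minimal C sketch of the register value involved is shown below. It only computes the new contents of CP15 register 1 on a copy of the value; on the target, the value would be read with an MRC and written back with a single MCR to CP15 register 1. The macro and function names are hypothetical. The Icr position (bit 12) is from this section, while the M bit position (bit 0) comes from the ARM control register layout rather than from this section.

```c
#include <assert.h>
#include <stdint.h>

/* CP15 register 1 bits. Names are illustrative. */
#define CP15_CR1_M  (1u << 0)   /* MMU enable (position not stated in this section) */
#define CP15_CR1_I  (1u << 12)  /* Icr: ICache enable */

/* Value to write back so that the MMU and ICache are enabled together. */
static uint32_t cr1_enable_mmu_and_icache(uint32_t cr1)
{
    return cr1 | CP15_CR1_M | CP15_CR1_I;
}

/* Value to write back to disable the ICache, leaving other bits unchanged. */
static uint32_t cr1_disable_icache(uint32_t cr1)
{
    return cr1 & ~CP15_CR1_I;
}
```

Working on the full register value with read-modify-write keeps the other control bits intact, which matters because the same register also holds the RR bit and other enables.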

Note

ARM922T implements a nonsequential access on the AMBA ASB interface as an A-TRAN cycle followed by an S-TRAN cycle. It does not produce N-TRAN cycles. A linefill appears as an A-TRAN cycle followed by an S-TRAN cycle.

4.2.3 ICache operation

If the ICache is disabled, each instruction fetch results in a separate nonsequential memory access on the AMBA ASB interface, giving very low bus and memory performance. Therefore, you must enable the ICache as soon as possible after reset.

If the ICache is enabled, an ICache lookup is performed for each instruction fetch regardless of the setting of the Ctt bit in the relevant MMU translation table descriptor:

• If the required instruction is found in the cache, the lookup is called a cache hit.

If the instruction fetch is a cache hit and Ctt=1, indicating a cachable region of memory, then the instruction is returned from the cache to the ARM9TDMI CPU core.


• If the required instruction is not found in the cache, the lookup is called a cache miss.

If it is a cache miss and Ctt=1, then an eight-word linefill is performed, possibly replacing another entry. The entry to be replaced, called the victim, is chosen from the entries that are not locked, using either a random or round-robin replacement policy.

If Ctt=0, indicating a noncachable region of memory, then a single nonsequential memory access appears on the AMBA ASB interface.

Note

If Ctt=0, indicating a noncachable region of memory, then the cache lookup results in a cache miss. The only way that it can result in a cache hit is if software has changed the value of the Ctt bit in the MMU translation table descriptor without invalidating the cache contents. This is a programming error. The behavior in this case is architecturally unpredictable and varies between implementations.
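The cases above can be summarized as a small decision function. This is a behavioral sketch only, with hypothetical names; it folds in the rule from Enabling and disabling the ICache that all fetches are treated as cachable when the MMU is disabled, and it ignores fetches aborted by the MMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible results of an instruction fetch. Names are illustrative. */
typedef enum {
    FETCH_FROM_CACHE,     /* hit with Ctt=1: instruction returned from the cache     */
    FETCH_LINEFILL,       /* miss with Ctt=1: eight-word linefill                    */
    FETCH_SINGLE_NONSEQ,  /* cache disabled, or miss with Ctt=0: single bus access   */
    FETCH_UNPREDICTABLE   /* hit with Ctt=0: programming error, behavior not defined */
} icache_fetch_t;

static icache_fetch_t icache_fetch_outcome(bool icr, bool mmu, bool ctt, bool hit)
{
    /* With the MMU disabled, every fetch is treated as cachable. */
    bool cachable = mmu ? ctt : true;

    if (!icr) {
        return FETCH_SINGLE_NONSEQ;   /* cache contents are ignored */
    }
    if (hit) {
        return cachable ? FETCH_FROM_CACHE : FETCH_UNPREDICTABLE;
    }
    return cachable ? FETCH_LINEFILL : FETCH_SINGLE_NONSEQ;
}
```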

4.2.4 ICache replacement algorithm

The ICache and DCache replacement algorithm is selected by the RR bit in the CP15 control register (CP15 register 1, bit 14). Random replacement is selected at reset.

Setting the RR bit to 1 selects round-robin replacement. Round-robin replacement means that entries are replaced sequentially in each cache segment.
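As a rough illustration, the following C sketch sets the RR bit in a copy of the control register value and models sequential victim stepping within one segment. The names are hypothetical. The `base` parameter stands for the lowest entry that may be replaced (0 when nothing is locked), and the wrap from entry 63 back to `base` is an assumption made for this sketch rather than something this section states.

```c
#include <assert.h>
#include <stdint.h>

#define CP15_CR1_RR (1u << 14)  /* RR bit: 0 = random, 1 = round-robin */

/* Control register value with round-robin replacement selected. */
static uint32_t cr1_select_round_robin(uint32_t cr1)
{
    return cr1 | CP15_CR1_RR;
}

/* Next victim entry within a segment under round-robin replacement.
   Assumption: after entry 63 the pointer wraps to 'base'. */
static uint32_t rr_next_victim(uint32_t victim, uint32_t base)
{
    assert(base <= 63u && victim >= base && victim <= 63u);
    return (victim == 63u) ? base : victim + 1u;
}
```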

4.2.5 ICache lockdown

You can lock instructions into the ICache, causing the ICache to guarantee a hit, and provide optimum and predictable execution time. If you enable the ICache, an ICache lookup is performed for each instruction fetch. If the ICache misses and Ctt=1, then an eight-word linefill is performed. The entry to be replaced is selected by the victim pointer. You can lock instructions into the ICache by controlling the victim pointer, and forcing prefetches to the ICache.

You lock instructions in the ICache by first ensuring the code to be locked is not already in the cache. You can ensure this by invalidating either the whole ICache or specific lines:

MCR p15, 0, Rd, c7, c5, 0 ; Invalidate ICache

MCR p15, 0, Rd, c7, c5, 1 ; Invalidate ICache line using MVA

You can then use a short software routine to load the instructions into the ICache. The software routine must either be noncachable, or already in the ICache but not in an ICache line about to be overwritten. You must enable the MMU to ensure that any TLB misses that occur while loading the instructions cause a page table walk. The software routine operates by writing to CP15 register 9 to force the victim pointer to a specific ICache line and by using the prefetch ICache line operation to force the ICache to


When all the instructions have been loaded, they are then locked by writing to CP15 register 9 to set the victim pointer base to be one higher than the last entry written. All further linefills now occur in the range victim base to 63. An example ICache lockdown routine is shown in Example 4-1. The example assumes that the number of cache lines to be loaded is not known. The address does not have to be cache line or word-aligned but this is recommended to ensure future compatibility.

Note

The Prefetch ICache Line operation uses MVA format, because address aliasing is not performed on the address in Rd. It is advisable for the associated TLB entry to be locked into the TLB to avoid page table walks during execution of the locked code.

Example 4-1 ICache lockdown routine

        ADRL  r0,start_address      ; address pointer
        ADRL  r1,end_address
        MOV   r2,#lockdown_base<<26 ; victim pointer
        MCR   p15,0,r2,c9,c0,1      ; write ICache victim and lockdown base
loop    MCR   p15,0,r0,c7,c13,1     ; Prefetch ICache line
        ADD   r0,r0,#32             ; increment address pointer to next ICache line

        ;; do we need to increment the victim pointer?
        ;; test for segment 0, and if so, increment the victim pointer
        ;; and write the ICache victim and lockdown base.
        AND   r3,r0,#0x60           ; extract the segment bits from the address
        CMP   r3,#0x0               ; test for segment 0
        ADDEQ r2,r2,#0x1<<26        ; if segment 0, increment victim pointer
        MCREQ p15,0,r2,c9,c0,1      ; and write ICache victim and lockdown base

        ;; have we linefilled enough code?
        ;; test for the address pointer being less than or equal to the
        ;; end_address and if so, loop and perform another linefill
        CMP   r0,r1                 ; test for less than or equal to end_address
        BLE   loop                  ; if so, loop

        ;; have we exited with r3 pointing to segment 0?
        ;; if so, the ICache victim and lockdown base has already been set to one
        ;; higher than the last entry written.
        ;; if not, increment the victim pointer and write the ICache victim and
        ;; lockdown base.
        CMP   r3,#0x0               ; test for segments 1 to 3
        ADDNE r2,r2,#0x1<<26        ; if address is segment 1 to 3, increment victim pointer
        MCRNE p15,0,r2,c9,c0,1      ; and write ICache victim and lockdown base
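To see how many lines Example 4-1 loads and where the lockdown base ends up, the control flow of the routine can be modeled on a host in C. This is a sketch under stated assumptions, not a replacement for the assembly: the names are hypothetical, the prefetch and register writes are only counted, the index is taken to occupy bits [31:26] of the value written to CP15 register 9 (as implied by `lockdown_base<<26`), and the address comparison is unsigned, whereas `BLE` in the example is a signed comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Value written to CP15 register 9 for a given victim/base index.
   Assumption: index in bits [31:26], as implied by "lockdown_base<<26". */
static uint32_t icache_lockdown_value(uint32_t index)
{
    assert(index < 64u);
    return index << 26;
}

typedef struct {
    uint32_t lines_loaded;  /* number of Prefetch ICache line operations */
    uint32_t final_base;    /* victim and lockdown base after the routine */
} lockdown_result_t;

/* Host-side model of the control flow of Example 4-1. */
static lockdown_result_t icache_lockdown_model(uint32_t start, uint32_t end,
                                               uint32_t base)
{
    lockdown_result_t r = { 0u, base };
    uint32_t addr = start;
    uint32_t seg;

    assert(base < 64u);
    do {
        r.lines_loaded++;        /* stands in for the prefetch at addr     */
        addr += 32u;             /* next ICache line                       */
        seg = addr & 0x60u;      /* segment bits [6:5] of the next address */
        if (seg == 0u) {
            r.final_base++;      /* all four segments filled at this index */
        }
    } while (addr <= end);       /* unsigned compare in this model         */

    if (seg != 0u) {
        r.final_base++;          /* protect a partially filled index       */
    }
    return r;
}
```

With a start address of 0x8000 and an end address of 0x80FF, the model loads eight lines and leaves the base at 2, so entries 0 and 1 of every segment are locked. Moving the end address to 0x8100 loads a ninth line and raises the base to 3, which illustrates why a whole index is consumed even when only one segment of it is used.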
