SCD7000A Rev B
cache with a three-cycle miss penalty from the primaries.
Each primary cache has a 64-bit read path, a 128-bit write
path, and both caches can be accessed simultaneously. The
primary caches provide the integer and floating-point units
with an aggregate bandwidth of 3.6 GB per second at an
internal clock frequency of 225 MHz. During an instruction
or data primary cache refill, the secondary cache can
provide a 64-bit datum every cycle following the initial
three-cycle latency, for a peak bandwidth of 1.8 GB per
second at the same 225 MHz clock.
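The peak figures in this paragraph are bytes moved per cycle multiplied by the clock frequency. The following is a minimal sketch of that arithmetic, assuming 1 GB = 10^9 bytes and that the secondary cache refill path runs at the internal clock; the function and variable names are illustrative and do not come from the datasheet.

```python
def peak_bandwidth_gb_s(bits_per_cycle: int, clock_mhz: float) -> float:
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes) for a path that
    moves bits_per_cycle bits on every clock."""
    bytes_per_cycle = bits_per_cycle / 8
    return bytes_per_cycle * clock_mhz * 1e6 / 1e9


CLOCK_MHZ = 225  # internal clock frequency quoted in the text

# Two primary caches, each with a 64-bit read path, accessed simultaneously.
primary_aggregate = 2 * peak_bandwidth_gb_s(64, CLOCK_MHZ)

# Secondary cache: one 64-bit datum per cycle once the initial latency has
# elapsed (assumption: the refill path is clocked at CLOCK_MHZ).
secondary_refill = peak_bandwidth_gb_s(64, CLOCK_MHZ)

print(f"primary aggregate: {primary_aggregate:.1f} GB/s")
print(f"secondary refill:  {secondary_refill:.1f} GB/s")
```

The three-cycle latency is not part of the peak figure; it only delays the first datum of a refill.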
Instruction Cache
The ACT 7000ASC has an integrated 16KB, four-way
set associative instruction cache. Even though
instruction address translation is done in parallel with the
cache access, the combination of four-way set associativity
and 16KB size results in a cache which is effectively
physically indexed and physically tagged: each way is 4KB,
so the index bits fall within the untranslated offset of even
the smallest (4KB) page. Since the effective physical index
eliminates the potential for virtual aliases in the cache, it is
possible that some operating system code can be simplified
as compared with the RM5200 Family, R5000, and R4000
class processors.
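The alias argument can be checked with a few lines of arithmetic. The sketch below assumes a 4 KB minimum page size and uses the 32-byte line size quoted for this cache; the page table contents and the helper names (`set_index`, `translate`) are hypothetical and serve only to illustrate the point.

```python
CACHE_BYTES = 16 * 1024
WAYS = 4
LINE_BYTES = 32
PAGE_BYTES = 4 * 1024  # assumption: smallest page size is 4 KB

WAY_BYTES = CACHE_BYTES // WAYS   # bytes covered by one way
SETS = WAY_BYTES // LINE_BYTES    # number of sets selected by the index


def set_index(addr: int) -> int:
    """Set selected by an address (virtual or physical)."""
    return (addr // LINE_BYTES) % SETS


def translate(vaddr: int, page_table: dict) -> int:
    """Toy translation: virtual page number -> physical frame number."""
    vpn, offset = divmod(vaddr, PAGE_BYTES)
    return page_table[vpn] * PAGE_BYTES + offset


# Two hypothetical virtual pages that alias the same physical frame.
page_table = {0x10: 0x7A, 0x2B: 0x7A}
va1 = 0x10 * PAGE_BYTES + 0x6C4
va2 = 0x2B * PAGE_BYTES + 0x6C4

# One way is no larger than a page, so the index uses only page-offset bits
# and is identical before and after translation.
print(WAY_BYTES <= PAGE_BYTES)
print(set_index(va1), set_index(va2), set_index(translate(va1, page_table)))
```

Because both virtual addresses select the same set as the physical address, a single physical line cannot end up cached in two different sets.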
The data array portion of the instruction cache is 64 bits
wide and protected by word parity while the tag array holds
a 24-bit physical address, 14 housekeeping bits, a valid bit,
and a single bit of parity protection.
By accessing 64 bits per cycle, the instruction cache is
able to supply two instructions per cycle to the superscalar
dispatch unit. For signal processing, graphics, and other
numerical code sequences where a floating-point load or
store and a floating-point computation instruction are being
issued together in a loop, the entire bandwidth available
from the instruction cache will be consumed by instruction
issue. For typical integer code mixes, where instruction
dependencies and other resource constraints restrict the
achievable parallelism, the extra instruction cache
bandwidth is used to fetch both the taken and non-taken
branch paths to minimize the overall penalty for branches.
A 32-byte (eight-instruction) line size is used to maximize
the communication efficiency between the instruction
cache and the secondary cache or memory system.
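As a rough illustration of the line-size arithmetic, the sketch below combines the 32-byte line with the 64-bit-per-cycle refill path and three-cycle initial latency described earlier for the secondary cache. The total cycle count is an assumption (latency plus one cycle per 64-bit transfer, with no other overhead) and is not a figure taken from the datasheet.

```python
LINE_BYTES = 32
INSTRUCTION_BYTES = 4        # MIPS instructions are 32 bits wide
REFILL_PATH_BYTES = 8        # one 64-bit datum per cycle from the secondary
INITIAL_LATENCY_CYCLES = 3   # initial secondary cache latency from the text

instructions_per_line = LINE_BYTES // INSTRUCTION_BYTES
transfers_per_line = LINE_BYTES // REFILL_PATH_BYTES


def refill_cycles(latency: int = INITIAL_LATENCY_CYCLES) -> int:
    """Assumed refill time: initial latency plus one cycle per transfer."""
    return latency + transfers_per_line


print(instructions_per_line, transfers_per_line, refill_cycles())
```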
The ACT 7000ASC is the first MIPS RISC
microprocessor to support cache locking on a per-line basis.
The contents of each line of the cache can be locked by
setting a bit in the tag. Locking the line prevents its
contents from being overwritten by a subsequent cache
miss. Refill will occur only into unlocked cache lines. This
mechanism allows the programmer to lock critical code
into the cache, thereby guaranteeing deterministic behavior
for the locked code sequence.
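The effect of the lock bit on refill can be modeled in software. The following is a toy model of one four-way set, not a description of the hardware: the round-robin victim choice and the behavior when every way is locked (the refill is simply dropped) are assumptions, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WAYS = 4


@dataclass
class Line:
    tag: Optional[int] = None  # None means the line is invalid
    locked: bool = False       # models the lock bit held in the tag


@dataclass
class CacheSet:
    lines: List[Line] = field(
        default_factory=lambda: [Line() for _ in range(WAYS)])
    next_victim: int = 0       # hypothetical round-robin pointer

    def lookup(self, tag: int) -> Optional[int]:
        """Return the way holding tag, or None on a miss."""
        for way, line in enumerate(self.lines):
            if line.tag == tag:
                return way
        return None

    def refill(self, tag: int) -> Optional[int]:
        """Place tag in an unlocked way; None if all ways are locked."""
        for step in range(WAYS):
            way = (self.next_victim + step) % WAYS
            if not self.lines[way].locked:
                self.lines[way].tag = tag
                self.next_victim = (way + 1) % WAYS
                return way
        return None

    def lock(self, tag: int) -> bool:
        """Set the lock bit on the line holding tag, if it is resident."""
        way = self.lookup(tag)
        if way is None:
            return False
        self.lines[way].locked = True
        return True


demo = CacheSet()
demo.refill(0xA)            # critical code line is brought in
demo.lock(0xA)              # and locked
for other_tag in range(1, 9):
    demo.refill(other_tag)  # later misses only replace unlocked ways
print(demo.lookup(0xA) is not None)
```

In this model the locked line survives any number of later refills, which is the property that makes the timing of the locked code sequence predictable.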
Data Cache
The ACT 7000ASC has an integrated 16KB, four-way
set associative data cache. Even though data address
translation is done in parallel with the cache access, the
combination of four-way set associativity and 16KB size
results in a cache which is physically indexed and
physically tagged. Since the effective physical index
eliminates the potential for virtual aliases in the cache, it is
possible that some operating system code can be simplified
compared to the RM5200 Family, R5000, and R4000 class
processors.
The data cache is non-blocking; that is, a miss in the data
cache will not necessarily stall the processor pipeline. As
long as no instruction is encountered which is dependent on
the data reference which caused the miss, the pipeline will
continue to advance. Once there are two cache misses
outstanding, the processor will stall if it encounters
another load or store instruction.
A 32-byte (eight-word) line size is used to maximize the
communication efficiency between the data cache and the
secondary cache or memory system. The data array portion of
the data cache is 64 bits wide and protected by byte parity,
while the tag array holds a 24-bit physical address, 3
housekeeping bits, a two-bit cache state field, and two bits
of parity protection.
The normal write policy is write-back, which means that a
store to a cache line does not immediately cause memory to
be updated. This increases system performance by reducing
bus traffic and eliminating the bottleneck of waiting for
each store operation to finish before issuing a subsequent
memory operation. Software can, however, select
write-through on a per-page basis when appropriate, such
as for frame buffers.
Cache protocols supported for the data cache are:
1. Uncached. Reads to addresses in a memory area
identified as uncached will not access the cache.
Writes to such addresses will be written directly to
main memory without updating the cache.
2. Write-back. Loads and instruction fetches will first
search the cache, reading the next memory hierarchy
level only if the desired data is not cache resident. On
data store operations, the cache is first searched to
determine if the target address is cache resident. If it
is resident, the cache contents will be updated, and
the cache line marked for later write-back. If the
cache lookup misses, the target line is first brought
into the cache and then the write is performed as
above.
3. Write-through with write allocate. Loads and
instruction fetches will first search the cache, reading
from memory only if the desired data is not cache
resident; write-through data is never cached in the
secondary cache. On data store operations, the cache
is first searched to determine if the target address is
cache resident. If it is resident, the primary cache
contents will be updated and main memory will also
be written, leaving the write-back bit of the cache line
unchanged; no writes will occur into the secondary cache.
If the cache lookup misses, the target line is first
brought into the cache and then the write is
performed as above.
4. Write-through without write allocate. Loads and
instruction fetches will first search the cache, reading
from memory only if the desired data is not cache
resident; write-through data is never cached in the
secondary. On data store operations, the cache is first
searched to determine if the target address is cache
resident. If it is resident, the cache contents will be
updated and main memory will also be written