RM7965A-900UI
900 MHz 64-bit Microprocessor
Data Sheet

Released
Issue No. 2: March 2010

Proprietary and Confidential to PMC-Sierra, Inc., and for its customers' internal use.
Document No.: PMC-2100294, Issue 2

Legal Information

Copyright

Copyright (c) 2010 PMC-Sierra, Inc. All rights reserved.

The information in this document is proprietary and confidential to PMC-Sierra, Inc., and for its customers' internal use. In any event, no part of this document may be reproduced or redistributed in any form without the express written consent of PMC-Sierra, Inc.

PMC-2100294 (R2)

Disclaimer

None of the information contained in this document constitutes an express or implied warranty by PMC-Sierra, Inc. as to the sufficiency, fitness or suitability for a particular purpose of any such information or the fitness, or suitability for a particular purpose, merchantability, performance, compatibility with other parts or systems, of any of the products of PMC-Sierra, Inc., or any portion thereof, referred to in this document. PMC-Sierra, Inc. expressly disclaims all representations and warranties of any kind regarding the contents or use of the information, including, but not limited to, express and implied warranties of accuracy, completeness, merchantability, fitness for a particular use, or non-infringement.

In no event will PMC-Sierra, Inc. be liable for any direct, indirect, special, incidental or consequential damages, including, but not limited to, lost profits, lost business or lost data resulting from any use of or reliance upon the information, whether or not PMC-Sierra, Inc. has been advised of the possibility of such damage.
Trademarks

For a complete list of PMC-Sierra's trademarks, see our web site at http://www.pmcsierra.com/legal/. Other product and company names mentioned herein may be the trademarks of their respective owners.

Patents

The technology discussed is protected by one or more of the following patent grants.

U.S. Patent Numbers 5,953,748; 5,606,683; 5,760,620; 6,703,950.

Contacting PMC-Sierra

PMC-Sierra
8555 Baxter Place
Burnaby, BC
Canada V5A 4V7

Tel: +1 (604) 415-6000
Fax: +1 (604) 415-6200

Document Information: document@pmc-sierra.com
Corporate Information: info@pmc-sierra.com
Technical Support: apps@pmc-sierra.com
Web Site: http://www.pmc-sierra.com

Revision History

Issue No.   Date            Details of Change
1           February 2010   Data sheet created.
2           March 2010      Modified Table 28 (Power) and Notes. Change VccInt and VccP to 1.32 V for > 835 MHz operation.

Table of Contents

Legal Information
    Copyright
    Disclaimer
    Trademarks
    Patents
Revision History
List of Figures
List of Tables
1 Definitions
2 Introduction
    2.1 Features
3 Block Diagram
4 E9000 CPU Core
    4.1 CPU Registers
    4.2 Superscalar Dispatch
    4.3 Seven-stage Pipeline
        4.3.1 RM7000 Pipeline Stages
        4.3.2 E9000 Pipeline Stages
    4.4 Delay slots
        4.4.1 Branch Delay
        4.4.2 Load Delay
    4.5 Branch Prediction
    4.6 Integer Unit
        4.6.1 Register File
    4.7 Integer ALU
    4.8 Integer Multiply/Divide
    4.9 Floating-Point Coprocessor
    4.10 Floating-Point Unit
    4.11 Floating-Point General Register File
    4.12 System Control Coprocessor (CP0)
    4.13 System Control Coprocessor Registers
    4.14 Memory Management Unit (MMU)
    4.15 Virtual to Physical Address Mapping
    4.16 Joint TLB
    4.17 Instruction TLB
    4.18 Data TLB
    4.19 Interrupt Handling
    4.20 Standby Mode
    4.21 JTAG Interface
    4.22 Reset Sequence
5 Cache Architecture
    5.1 Instruction Cache
    5.2 Data Cache
    5.3 Secondary Cache
        5.3.1 Secondary Caching Protocols
        5.3.2 Fast Packet Cache Mode
    5.4 Cache Modes
    5.5 Cache Attributes
    5.6 Cache Locking
    5.7 Primary Write Buffer
    5.8 Data Prefetch
    5.9 Memory Latencies
6 System Interface
    6.1 System Address/Data Bus
    6.2 System Command Bus
    6.3 Handshake Signals
    6.4 System Interface Operation
    6.5 Write Modes
7 Integrated Debug
    7.1 EJTAG Debugging
    7.2 Trace Buffer
    7.3 Test/Breakpoint Registers
    7.4 Performance Counters
8 Boot-Mode Settings
9 RM7000 and RM7965A Differences
10 Pin Descriptions
11 Absolute Maximum Ratings
12 DC Electrical Characteristics
13 Power
    13.1 Normal Operating Conditions
    13.2 Power Requirements
    13.3 Typical Power Consumption
14 AC Electrical Characteristics
    14.1 Capacitive Load Deration
    14.2 Clock Parameters
    14.3 System Interface Parameters
    14.4 Boot-Time Interface Parameters
15 Timing Diagrams
    15.1 Clock Timing
    15.2 System Interface Timing
16 Thermal Information
17 Packaging and Pinout Information
    17.1 256-pin CSBGA Package Diagram
    17.2 256-pin CSBGA Alphanumerical Pinout
    17.3 256-pin CSBGA Alphabetical Pinout
18 Ordering Information
List of Figures

Figure 1 Block Diagram
Figure 2 General Purpose Registers
Figure 3 Instruction Issue Paradigm
Figure 4 Pipeline Execution Diagram
Figure 5 CP0 Registers
Figure 6 Fast Packet Cache Mode
Figure 7 Typical Embedded System Block Diagram with 64-bit SysAD Bus
Figure 8 Processor Block Read
Figure 9 Processor Block Write
Figure 10 Multiple Outstanding Reads
Figure 11 Clock Timing
Figure 12 Input Timing
Figure 13 Output Timing

List of Tables

Table 1 Acronyms and Abbreviations
Table 2 Instruction Issue Rules
Table 3 Dual Issue Instruction Classes
Table 4 Integer ALU Operations
Table 5 Integer Multiply/Divide Operations
Table 6 Floating Point Latencies and Repeat Rates
Table 7 Kernel Mode Virtual Addressing (32-bit)
Table 8 Cause Register
Table 9 Interrupt Control Register
Table 10 IPLLO Register
Table 11 IPLHI Register
Table 12 Interrupt Vector Spacing
Table 13 E9000 Cache Operating Modes
Table 14 RM7965A Cache Attributes
Table 15 Cache Locking Control
Table 16 On-Chip Memory Latencies
Table 17 Watch Registers
Table 18 Performance Counter Control
Table 19 System Interface
Table 20 Clock/Control Interface
Table 21 Power Supply
Table 22 Interrupt Interface
Table 23 JTAG Interface
Table 24 Initialization Interface
Table 25 (VccIO = 3.15 V - 3.45 V)
Table 26 (VccIO = 2.3 V - 2.7 V)
Table 27 (VccIO = 1.4 V - 1.6 V) HSTL
Table 28 Normal Operating Voltages for 0.13 µm CMOS
Table 29 VccINT Power Requirements
Table 30 Conditions for Power Requirements
Table 31 Device Compact Model2
Table 32 Heat Sink Requirements

1 Definitions

Table 1 defines the abbreviations used in this data sheet.

Table 1 Acronyms and Abbreviations

Acronym or Abbreviation   Description
CPU                       Central Processing Unit
CPLD                      Complex Programmable Logic Device
DDR                       Double Data Rate
DMA                       Direct Memory Access
ECC                       Error Correction Code
EJTAG                     Enhanced Joint Test Action Group
FCRAM                     Fast Cycle RAM
FPGA                      Field-programmable Gate Array
I/O                       Input/Output
LVTTL                     Low-voltage Transistor-Transistor Logic
MIPS                      Millions of Instructions Per Second
                          Microprocessor without Interlocked Pipeline Stages
MMU                       Memory Management Unit
MOESI                     5-State Algorithm for Cache Coherency: Modified/Owned (Modified-Shared)/Exclusive/Shared/Invalid
NMI                       Non-maskable Interrupt
PAL                       Programmable Array Logic
PLL                       Phase Lock Loop
RAM                       Random Access Memory
ROM                       Read-only Memory
SDRAM                     Synchronous Dynamic RAM
SMP                       Symmetric Multi-processing
SSTL                      Stub Series Terminated Logic
SysAD                     Multiplexed Address/Data System Bus
TAP                       Test Access Port
TLB                       Translation Lookaside Buffer
2 Introduction

The RM7965A is a high-performance 64-bit microprocessor with features including a seven-stage dual-issue pipeline, tightly coupled L1 and L2 caches, and sophisticated branch prediction for maintaining pipeline efficiency.

A 200 MHz 64-bit multiplexed system address and data bus (SysAD) enables a high-bandwidth I/O interface to a variety of system controllers providing connectivity to a wide range of networking peripherals. The RM7965A also contains a vectored and prioritized interrupt controller for versatile interrupt configurations.

On-chip EJTAG debug modules ease the debugging of hardware and software by allowing both single-step and state examination. The inclusion of a pipeline-rate branch instruction trace buffer facilitates debugging under operating conditions.

2.1 Features

• CPU core with MIPS64-compatible Instruction Set Architecture that features:
    o 900 MHz operation.
    o Dual-issue superscalar 7-stage pipeline.
    o 16-KB, 4-way set associative L1 Instruction cache.
    o 16-KB, 4-way set associative L1 Data cache.
    o 256-KB, 4-way set associative L2 cache with industry-best 5-cycle access latency.
    o Error Checking and Correcting (ECC) on L2 cache.
    o Fast Packet Cache to assist processing of packet data.
    o 8K-entry branch prediction table.
    o Fully associative 64-entry TLB with dual pages.
    o High performance Floating Point unit (IEEE 754).
    o Fixed-point DSP instructions such as Multiply/Add, Multiply/Subtract, and 3 Operand Multiply.
• High-performance system interface:
    o Multiple outstanding reads with out-of-order return.
    o 1600 MB/s peak throughput.
    o 200 MHz maximum frequency using HSTL signaling on the SysAD bus.
    o Multiplexed address/data bus (SysAD) supports 1.5 V, 2.5 V, and 3.3 V I/O logic.
    o Processor clock multipliers 2, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 10, 11, 12, 13, 14, 15, 16, 17.
• Integrated on-chip EJTAG controller.
• 64-entry dynamic Trace Buffer for use in real-time trace and debug.
• Two 32-bit virtually addressed Watch registers.
• Integrated performance counters:
    o Contains 2 independent 32-bit counters.
    o Counts over 30 processor events including mispredicted branches.
    o Enables full characterization and analysis of application software.
• 256-pin CSBGA package (27x27 mm)

3 Block Diagram

Figure 1 Block Diagram

[Figure 1 block labels: E9000 Core; On-Chip Debug; Branch Trace Buffer; 64-bit Integer Unit (Dual-Issue Superscalar); Integer Multiplier; 64-bit Floating Point Unit (Double/Single IEEE-754); 8K Entry Branch History Table; Instruction Dispatch; Instruction Cache (16 KB, 4-way, Line Lockable); Data Cache (16 KB, 4-way, Line Lockable); Memory Manager (64-Entry, Dual Page); System Control; Secondary Cache (256 KB, 4-way, Line Lockable); Interface Unit; SysAD System Interface*; Interrupt Interface; EJTAG/JTAG Controller; Cache Test Mode; PLL & Clock]

4 E9000 CPU Core

The RM7965A product consists of the E9000 core plus system interface logic.
The E9000 is compatible with the MIPS64 instruction set architecture (ISA), which is a superset of the MIPS IV ISA and is fully backwards compatible with the RM7000 CPU core utilized in all RM70xx products. Also included in the E9000 core is a high performance, IEEE 754 compliant floating-point unit.

The E9000 core includes a dual-integer superscalar processor with a two-level cache hierarchy, an MMU, and a sophisticated branch predictor. Support is provided for two outstanding reads with out-of-order return. The interrupt controller works in conjunction with the system interrupt controller to provide a robust interrupt architecture.

The E9000 core also contains an integrated EJTAG debug module and an integrated Test Access Port (TAP) controller, both of which allow easy debug from the JTAG interface. A 64-entry pipeline-rate trace buffer is included for real-time program flow analysis.

4.1 CPU Registers

The E9000 contains 32 general purpose registers (GPR), two special purpose registers for integer multiplication and division, and a program counter; there are no condition code bits. Figure 2 shows these processor registers. The E9000 also includes two sets of CP0 registers. The CP0 register sets contain both 32-bit and 64-bit registers. Only 29 of the 32 registers specified in CP0 Set 0 are implemented, and only 5 of the 32 registers in CP0 Set 1 are implemented.

Figure 2 General Purpose Registers

[Figure 2 labels: General Purpose Registers r0, r1, r2, ..., r29, r30, r31 (bits 63 to 0); Multiply/Divide Registers HI and LO (bits 63 to 0); Program Counter PC (bits 63 to 0)]

4.2 Superscalar Dispatch

The E9000 incorporates a superscalar dispatch unit that allows it to issue up to two instructions per cycle.
For purposes of instruction issue, the E9000 defines four classes of instructions: integer, load/store, branches, and floating-point. There are two logical pipelines: the function, or F, pipeline and the memory, or M, pipeline. Note that the M pipe can execute integer as well as memory type instructions.

Table 2 Instruction Issue Rules

F Pipe (one of)                                          M Pipe (one of)
Integer ALU, branch, floating-point, integer mul, div    Integer ALU, load/store

Figure 3 is a simplification of the execution unit, and illustrates the basics of the instruction issue mechanism.

Figure 3 Instruction Issue Paradigm

[Figure 3 labels: Instruction Cache; Dispatch Unit; F Pipe IBus; M Pipe IBus; FP F Pipe; FP M Pipe; Integer F Pipe; Integer M Pipe]

The figure illustrates that one F pipe instruction and one M pipe instruction can be issued concurrently, but that two M pipe instructions or two F pipe instructions cannot be issued in the same cycle. Table 3 specifies more completely the instructions within each class.

Table 3 Dual Issue Instruction Classes

Class             Instructions
Integer ALU       add, sub, or, xor, shift, etc.
Load/Store        lw, sw, ld, sd, ldc1, sdc1, mov, movc, fmov, etc.
Floating-Point    fadd, fsub, fmult, fmadd, fdiv, fcmp, fsqrt, etc.
Branch            beq, bne, bCzT, bCzF, j, etc.
Integer Mul/Div   mult, multu, mad, madu, mul, dmult, dmultu, div, divd, ddiv, ddivd

4.3 Seven-stage Pipeline

The E9000 pipeline has 7 stages, compared with 5 stages in the RM7000 pipeline. Increasing the pipeline to 7 stages and including branch prediction allows the frequency to be increased beyond 800 MHz while maintaining high pipeline efficiency. Figure 3 illustrates the 7-stage pipeline in comparison to the 5-stage pipeline of the RM7000.
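As an aside on the issue rules of Table 2 (Section 4.2), the pairing constraint can be sketched in a few lines of Python. This is only an illustrative model, not PMC-Sierra code: the class-name strings and the `can_dual_issue` helper are hypothetical, and the sketch checks pipe availability only. It ignores data dependencies and any other hazards that the real dispatch unit must also resolve.

```python
# Hypothetical model of the Table 2 issue rules; all names are illustrative.
# Instruction classes that each logical pipe accepts, taken from Table 2.
F_PIPE = {"integer_alu", "branch", "floating_point", "integer_muldiv"}
M_PIPE = {"integer_alu", "load_store"}


def can_dual_issue(first: str, second: str) -> bool:
    """Return True if the two instruction classes can occupy the F pipe and
    the M pipe in the same cycle, in either assignment.

    ASSUMPTION: only pipe availability is modeled; operand dependencies
    between the two instructions are not checked.
    """
    return (first in F_PIPE and second in M_PIPE) or (
        first in M_PIPE and second in F_PIPE
    )


if __name__ == "__main__":
    print(can_dual_issue("integer_alu", "integer_alu"))    # True
    print(can_dual_issue("floating_point", "load_store"))  # True
    print(can_dual_issue("load_store", "load_store"))      # False
    print(can_dual_issue("branch", "integer_muldiv"))      # False
```

Two integer ALU instructions pair because each pipe contains an integer ALU, whereas two loads, or a branch together with a multiply, compete for the same pipe.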
ed ne Figure 3 Pipeline Comparison A S R D W D M W Be A nk at es E9000 Pipeline R ta C h I ge ri of IH I on W RM7000 Pipeline RM7000 Pipeline Stages Ve 4.3.1 I: Instruction Fetch from instruction cache R: Register File Access D: Data Fetch from data cache W: Write Back to register file lle d] tro by The RM7000 pipeline stages are summarized as follows: Do wn lo ad ed [c on A: Instruction Execution Proprietary and Confidential to PMC-Sierra, Inc., and for its customers' internal use. Document No.: PMC-2100294, Issue 2 16 RM7965A-900UI 900 MHz 64-bit Microprocessor Data Sheet E9000 Pipeline Stages AM 4.3.2 :3 4: 04 In contrast to the RM7000 pipeline, the E9000 pipeline has two additional stages to allow an extra clock cycle of for both the instruction and the data pipeline regimes. The E9000 pipeline stages can be summarized as follows: I: C: Instruction Cache Access R: Register File Access, Instruction Decode A: Instruction Execution, Data Address Calculation D: Data Cache Access M: Data Bus, Data Alignment W: Write Back to register file 02 ay ,1 2J an ua ry ,2 01 1 Instruction Addressing ne sd The pipeline execution diagram for the E9000 is shown below: W ed Figure 4 Pipeline Execution Diagram on M-Pipe IH S Simple Integer Unit with L/S Unit R ge C ta I Be R M W Integer MAC Unit MAC1 MAC2 MAC3 W F-Pipe Simple Integer Unit A D M Floating-point MAC Unit F1 F2 F3 F4 F5 Floating-point Div/Sqrt Unit (Iterative) Do wn lo ad ed [c on tro lle d] by Ve nk at es h C ri Fetch and Dispatch (2 instructions per cycle) I D of A Proprietary and Confidential to PMC-Sierra, Inc., and for its customers' internal use. Document No.: PMC-2100294, Issue 2 17 RM7965A-900UI 900 MHz 64-bit Microprocessor Data Sheet Delay slots AM 4.4 Branch Delay 1 4.4.1 02 :3 4: 04 The intrinsic branch and load delays are each increased by 1 in the E9000 due to the increase in pipeline length. 

The branch delay slot increases from one to two, but with branch prediction, which simulation shows to be accurate about 95% of the time, the effective branch delay stays at about one. The second, or additional, branch delay slot is hidden from the code and is taken as a one-cycle stall when the branch prediction misses. When the branch prediction hits, this second slot is taken by the first instruction of the branch target code.

4.4.2 Load Delay

In the E9000, the load delay slot is increased from one to two. Compilers optimized for the E9000 are able to fill the extra delay slot with non-data dependent instructions. Even code that has not been recompiled, however, will perform nearly optimally on the E9000 core.

4.5 Branch Prediction

The E9000 has an 8K-entry branch prediction table that uses a correlative branch prediction algorithm, which increases the accuracy of prediction to greater than 95%. The correlative algorithm hashes the lower address bits with bits of dynamic prediction from all branches to derive the index for the branch entry. Using this approach, a given branch instruction can have a predictor for its "inner" loop and a separate predictor for its "outer" loop.

4.6 Integer Unit

The E9000 implements the MIPS64 Instruction Set Architecture including five implementation specific instructions not found in the baseline MIPS IV ISA, but which are useful for embedded applications. These instructions are integer multiply-add (MAD), multiply-add unsigned (MADU), multiply-subtract (MSUB), multiply-subtract unsigned (MSUBU), and three-operand integer multiply (MUL).

Another instruction new to the E9000 is the Superscalar No-Operation (SSNOP) instruction. This instruction issues a NOP instruction to each integer unit.

The E9000 integer unit includes 32 general-purpose 64-bit registers, the HI/LO result registers for two-operand integer multiply/divide operations, and the program counter (PC). There are two separate execution units: one that can execute function (F) pipe instructions and one that can execute memory (M) pipe instructions. Refer to Table 2 for the instruction issue rules.

Note that integer multiply/divide instructions, as well as their corresponding MFHI and MFLO instructions, can only be executed in the F pipe execution unit. Within each execution unit, the operational characteristics are the same as on previous MIPS designs with single cycle ALU operations (add, sub, logical, shift), one cycle load delay, and an autonomous multiply/divide unit.

4.6.1 Register File

The E9000 has 32 general-purpose registers with register location 0 (r0) hard wired to a zero value. These registers are used for scalar integer operations and address calculation. In order to service the two integer execution units, the register file has four read ports and two write ports and is fully bypassed both within and between the two execution units to minimize operation latency in the pipeline.

4.7 Integer ALU

The E9000 has two complete integer ALUs each consisting of an integer adder/subtractor, a logic unit, and a shifter. Table 4 shows the functions performed by the ALUs for each execution unit. Each of these units is optimized to perform all operations in a single processor cycle.

Table 4 Integer ALU Operations

Unit      F Pipe                            M Pipe
Adder     add, sub                          add, sub, data address add
Logic     logic, moves, zero shifts (nop)   logic, moves, zero shifts (nop)
Shifter   non-zero shift                    non-zero shift, store align

4.8 Integer Multiply/Divide

The E9000 has a single dedicated integer multiply/divide unit optimized for high-speed multiply and multiply-accumulate operations. The multiply/divide unit resides in the F pipe execution unit. Table 5 shows the performance of the multiply/divide unit on each operation.

Table 5 Integer Multiply/Divide Operations

Opcode                       Operand Size   Latency   Repeat Rate   Stall Cycles
MULT/U, MAD/U, MSUB, MSUBU   16 bit         4         3             0
                             32 bit         5         4             0
MUL                          16 bit         4         3             2
                             32 bit         5         4             3
DMULT, DMULTU                any            9         8             0
DIV, DIVD                    any            36        36            0
DDIV, DDIVU                  any            68        68            0

The baseline MIPS IV ISA specifies that the results of a multiply or divide operation be placed in the Hi and Lo registers. These values can then be transferred to the general-purpose register file using the Move-from-Hi and Move-from-Lo (MFHI/MFLO) instructions.

In addition to the baseline MIPS IV integer multiply instructions, the E9000 also implements the 3-operand multiply instruction, MUL. This instruction specifies that the multiply result go directly to the integer register file rather than the Lo register. The portion of the multiply that would have normally gone into the Hi register is discarded. For applications where it is known that the upper half of the multiply result is not required, using the MUL instruction eliminates the necessity of executing an explicit MFLO instruction.
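
The difference between the two-operand and three-operand multiplies can be illustrated with a small behavioral model. This is a sketch in Python for illustration only, not E9000 code: the names `IntRegs`, `mult`, `mflo`, and `mul` are hypothetical, 32-bit signed operands are assumed, and the model leaves Hi/Lo untouched on `mul`, which the text above does not specify.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


class IntRegs:
    """Hypothetical model of the Hi/Lo pair and a few general registers."""

    def __init__(self):
        self.hi = 0
        self.lo = 0
        self.gpr = {}

    def mult(self, rs, rt):
        # Two-operand multiply: the 64-bit product is split across Hi and Lo.
        product = to_signed32(self.gpr[rs]) * to_signed32(self.gpr[rt])
        self.lo = product & MASK32
        self.hi = (product >> 32) & MASK32

    def mflo(self, rd):
        # Explicit move from Lo into the general-purpose register file.
        self.gpr[rd] = self.lo

    def mul(self, rd, rs, rt):
        # Three-operand multiply: the low half goes straight to rd and the
        # upper half is discarded. Leaving Hi/Lo unchanged is an assumption.
        product = to_signed32(self.gpr[rs]) * to_signed32(self.gpr[rt])
        self.gpr[rd] = product & MASK32


regs = IntRegs()
regs.gpr.update({"t0": 70000, "t1": 3})
regs.mult("t0", "t1")
regs.mflo("t2")             # two instructions to get the low half
regs.mul("t3", "t0", "t1")  # one instruction, same low half
print(regs.gpr["t2"], regs.gpr["t3"])
```

In this model both sequences leave the same low-half result in a general register; the MUL form simply needs one instruction fewer.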

The multiply-add instructions, MAD and MADU, multiply two operands and add the resulting product to the current contents of the Hi and Lo registers. The multiply-accumulate operation is the core primitive of almost all digital signal processing algorithms. Therefore, using the E9000 eliminates the need for a separate DSP in many embedded applications.

The multiply-sub instructions, MSUB and MSUBU, multiply two operands and subtract the resulting product from the current contents of the Hi and Lo registers. The multiply-subtract operation is a core primitive of digital signal processing algorithms.

4.9 Floating-Point Coprocessor

The E9000 incorporates a high-performance fully pipelined floating-point coprocessor that includes a floating-point register file and autonomous execution units for multiply/add/convert and divide/square root. The floating-point coprocessor is a tightly coupled execution unit, decoding and executing instructions in parallel with, and in the case of floating-point loads and stores, in cooperation with the M pipe of the integer unit. The superscalar capabilities of the E9000 allow floating-point computation instructions to issue concurrently with integer instructions.

4.10 Floating-Point Unit

The E9000 floating-point execution unit supports single and double precision arithmetic, as specified in the IEEE Standard 754. The execution unit is broken into a separate divide/square root unit and a pipelined multiply/add unit. Overlap of divide/square root and multiply/add is supported.

The E9000 maintains fully precise floating-point exceptions while allowing both overlapped and pipelined operations. Precise exceptions are extremely important in object-oriented programming environments and highly desirable for debugging in any environment.

Floating-point operations include:

• add
• subtract
• multiply
• divide
• square root
• reciprocal
• reciprocal square root
• conditional moves
• conversion between fixed-point and floating-point format
• conversion between floating-point formats
• floating-point compare

Table 6 gives the latencies of the floating-point instructions in internal processor cycles.

Table 6 Floating Point Latencies and Repeat Rates

Operation      Latency (single/double)   Repeat Rate (single/double)
fadd           4                         1
fsub           4                         1
fmult          4/5                       1/2
fmadd          4/5                       1/2
fmsub          4/5                       1/2
fdiv           21/36                     19/34
fsqrt          21/36                     19/34
frecip         21/36                     19/34
frsqrt         38/68                     36/66
fcvt.s.d       4                         1
fcvt.s.w       6                         3
fcvt.s.l       6                         3
fcvt.d.s       4                         1
fcvt.d.w       4                         1
fcvt.d.l       4                         1
fcvt.w.s       4                         1
fcvt.w.d       4                         1
fcvt.l.s       4                         1
fcvt.l.d       4                         1
fcmp           1                         1
fmov, fmovc    1                         1
fabs, fneg     1                         1

4.11 Floating-Point General Register File

The floating-point general register file (FGR) is made up of thirty-two 64-bit registers. With the floating-point load and store double instructions, LDC1 and SDC1, the floating-point unit can take advantage of the 64-bit wide data cache and issue a floating-point coprocessor load or store doubleword instruction in every cycle.

The floating-point control register file contains two registers; one for determining configuration and revision information for the coprocessor, and one for control and status information. These registers are primarily used for diagnostic software, exception handling, state saving and restoring, and control of rounding modes.

To support superscalar operations, the FGR has four read ports and two write ports and is fully bypassed to minimize operation latency in the pipeline. Three of the read ports and one write port are used to support the combined multiply-add instruction while the fourth read and second write port allows for concurrent floating-point load or store and conditional move operations.

4.12 System Control Coprocessor (CP0)

The system control coprocessor (CP0) is responsible for the virtual memory sub-system, the exception control system, and the diagnostics capability of the processor.

For memory management support, the E9000 CP0 is logically identical to the CPU cores used in the RM5200 Family and the RM7000 Family. For interrupt exceptions and diagnostics, the E9000 is a superset of the RM5200 Family and the RM7000 Family, implementing additional features described in the following sections on Interrupts, Test/Breakpoint registers, and Performance Counters.

The memory management unit controls the virtual memory system page mapping. It consists of an instruction address translation buffer (ITLB), a data address translation buffer (DTLB), a Joint TLB (JTLB), and coprocessor registers used by the virtual memory mapping sub-system.

4.13 System Control Coprocessor Registers

The E9000 incorporates all CP0 registers internally. These registers provide the path through which the virtual memory system's page mapping is examined and modified, exceptions are handled, and operating modes are controlled (kernel vs. user mode, interrupts enabled or disabled, cache features). In addition, the E9000 includes registers to implement a real-time cycle counting facility, to aid in cache and system diagnostics, and to assist in data error detection.

To support the non-blocking caches and enhanced interrupt handling capabilities of the E9000, both the data and control register spaces of CP0 are supported. In the data register space, which is accessed using the MFC0 and MTC0 instructions, the E9000 supports the same registers as found in previous RM7000 processors plus three new registers to support EJTAG Debugging. The three new registers are called: EJTAG Debug, EJTAG DEPC, and EJTAG DESave.

In the control space, the E9000 supports three new registers to support the 64-entry branch Trace Buffer: Trace Buffer Control and Status (TB CSR), Trace Buffer Out (TB Out), and Trace Buffer Index (TB IDX). See Section 7.1

Figure 5 shows the CP0 registers.

Figure 5 CP0 Registers
[Figure: CP0 register map. The figure marks which registers are used for memory management and which for exception processing, and shows the TLB (entries 0 to 47/63) with the entries protected from TLBWR. Register names and register numbers as labeled in the figure:]

Index 0, Random 1, EntryLo0 2, EntryLo1 3, Context 4, PageMask 5, Wired 6, Info 7, BadVAddr 8, Count 9, EntryHi 10, Compare 11, Status 12, Cause 13, EPC 14, PRId 15, Config 16, LLAddr 17, Watch1 18, IPLLO 18, Watch2 19, IPLHI 19, XContext 20, IntControl 20, Watch Mask 21, Perf Ctr Cntrl 22, TB CSR 22, EJTAG Debug 23, TB Out 23, EJTAG DEPC 24, TB IDX 24, Perf Counter 25, ECC 26, DErrAddr0 26, CacheErr 27, DErrAddr1 27, TagLo 28, TagHi 29, ErrorEPC 30, EJTAG DESave 31

4.14 Memory Management Unit (MMU)

The E9000 has an MMU with a 64 entry TLB, with each entry having dual pages for a total of 128 pages. The page size is programmable to be 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 16 MB, 64 MB, or 256 MB. Pages can be programmed to be write-protected.
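
As a quick check of these figures, the amount of memory the TLB can map at one time is the number of entries, times two pages per entry, times the page size. The following Python sketch works that out for each listed page size. It assumes every entry uses the same page size, and `tlb_reach_bytes` is a hypothetical helper name, not part of any PMC-Sierra software.

```python
KB = 1024
MB = 1024 * KB

# Page sizes as listed in the text above.
PAGE_SIZES = [4 * KB, 16 * KB, 64 * KB, 256 * KB,
              1 * MB, 16 * MB, 64 * MB, 256 * MB]


def tlb_reach_bytes(entries, page_size, pages_per_entry=2):
    """Bytes mapped when every entry uses the same page size (an assumption)."""
    return entries * pages_per_entry * page_size


for size in PAGE_SIZES:
    reach = tlb_reach_bytes(64, size)
    print(f"{size // KB:>7} KB pages -> {reach // KB} KB mapped")
```

With 4 KB pages, for instance, the 128 pages cover 512 KB, which is why larger page sizes matter for workloads with large working sets.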
The TLB can operate statically or in a demand-paged environment, with TLB misses generating exceptions to load the appropriate page. The TLB replacement algorithm is random, and there is a TLB fence that can be used to lock a subset of the TLB entries, and allow the remainder to be dynamically refilled. The MMU architecture on the E9000 supports both 32 and 64-bit virtual addressing.

4.15 Virtual to Physical Address Mapping

The E9000 provides three modes of virtual addressing:

• user mode
• kernel mode
• supervisor mode

These modes allow system software to provide a secure environment for user processes. Bits in the CP0 Status register determine which virtual addressing mode is used. In user mode, the E9000 provides a single, uniform virtual address space of 256 GB (2 GB in 32-bit mode).

When operating in the kernel mode, four distinct virtual address spaces, totaling 1024 GB (4 GB in 32-bit mode), are simultaneously available and are differentiated by the high-order bits of the virtual address.

The E9000 core also supports a supervisor mode in which the virtual address space is 256.5 GB (2.5 GB in 32-bit mode), divided into three regions based on the high-order bits of the virtual address. Table 7 shows the address space layout for 32-bit operations.

Table 7 Kernel Mode Virtual Addressing (32-bit)

Address Range               Region                                          Mapping, Size
0xE0000000 to 0xFFFFFFFF    Kernel virtual address space (kseg3)            Mapped, 0.5 GB
0xC0000000 to 0xDFFFFFFF    Supervisor virtual address space (ksseg)        Mapped, 0.5 GB
0xA0000000 to 0xBFFFFFFF    Uncached kernel physical address space (kseg1)  Unmapped, 0.5 GB
0x80000000 to 0x9FFFFFFF    Cached kernel physical address space (kseg0)    Unmapped, 0.5 GB
0x00000000 to 0x7FFFFFFF    User virtual address space (kuseg)              Mapped, 2.0 GB

When the E9000 is configured for 64-bit addressing, the virtual address space layout is an upward compatible extension of the 32-bit virtual address space layout.

4.16 Joint TLB ...

For fast virtual-to-physical address translation, the E9000 uses a large, fully associative TLB that maps virtual pages to their corresponding physical addresses. As indicated by its name, the JTLB is used for both instruction and data translations. The JTLB is organized as pairs of even/odd entries, and maps a virtual address and address space identifier (ASID) into the large, 64 GB physical address space. By default, the JTLB is configured as 48 pairs of even/odd entries. The optional 64-even/odd-entry configuration is set at boot time.

Two mechanisms are provided to assist in controlling the amount of mapped space and the replacement characteristics of various memory regions. First, the page size can be configured, on a per-entry basis, to use page sizes in the range of 4 KB to 16 MB (in 4x multiples). The CP0 PageMask register is loaded with the desired page size of a mapping, and that size is stored into the TLB, along with the virtual address, when a new entry is written.
Thus, operating systems can create special purpose maps; for example, an entire frame buffer can be memory mapped using only one TLB entry.

The second mechanism controls the replacement algorithm when a TLB miss occurs. The E9000 provides a random replacement algorithm to select a TLB entry to be written with a new mapping. The core also provides a mechanism whereby a system specific number of mappings can be locked into the TLB, thereby avoiding random replacement. This mechanism uses the CP0 Wired register and allows the operating system to guarantee that certain pages are always mapped for performance reasons and to avoid a deadlock condition. It also facilitates the design of real-time systems by allowing deterministic access to critical software.

The JTLB also contains information that controls the cache coherency protocol for each page. Specifically, each page has attribute bits to determine whether the coherency algorithm is:

• uncached
• write-back
• write-through with write-allocate
• write-through without write-allocate
• write-back with secondary and tertiary bypass

Note that both of the write-through protocols bypass both the secondary and the tertiary caches since neither of these caches support writes of less than a complete cache line.

These protocols are used for both code and data in the E9000, with data using write-back or write-through depending on the application. The write-through modes support the same efficient frame buffer handling as the RM7000 and RM5200 Families.

4.17 Instruction TLB

The E9000 uses a 4-entry instruction TLB (ITLB).
The ITLB offers the following advantages:

• Minimizes contention for the JTLB
• Eliminates the critical path of translating through a large associative array
• Allows instruction address and data address translations to occur in parallel
• Saves power

Each ITLB entry maps a 4 KB page. The ITLB improves performance by allowing instruction address translation to occur in parallel with data address translation. When a miss occurs on an instruction address translation by the ITLB, the least-recently used ITLB entry is filled from the JTLB. The operation of the ITLB is completely transparent to the user.

4.18 Data TLB ...

The E9000 uses a 4-entry data TLB (DTLB) for the same reasons cited above for the ITLB. Each DTLB entry maps a 4 KB page. The DTLB improves performance by allowing data address translation to occur in parallel with instruction address translation. When a miss occurs on a data address translation, the DTLB is filled from the JTLB. The DTLB refill is pseudo-LRU; the least recently used entry of the least recently used pair of entries is filled. The operation of the DTLB is completely transparent to the user.

4.19 Interrupt Handling

In order to provide better real time interrupt handling, the RM7965A provides 10 external hardware interrupts, each of which can be separately prioritized and separately vectored.

The performance counter is also a hardware interrupt source using INT13. Historically in the MIPS architecture, interrupt 7 (INT7) was used as the Timer Interrupt. The RM7965A provides a separate interrupt, INT12, for this purpose, thereby releasing INT7 for use as a pure external interrupt.

All interrupts (INT[13:0]), the Performance Counter, and the Timer, have corresponding interrupt mask bits, IM[13:0], and interrupt pending bits, IP[13:0], in the Status, Interrupt Control, and Cause registers.
The bit assignments for the Cause and Interrupt Control registers are shown in Table 8 and Table 9, respectively. (Note the Status register has not changed from the RM5200 product family and is not shown.)

Table 8 Cause Register

Bits    31   30   29:28   27   26   25   24   23:8       7   6:2   1:0
Field   BD   0    CE      0    W2   W1   IV   IP[15:0]   0   EXC   0

Table 9 Interrupt Control Register

Bits    31:16   15:8       7    6:5   4:0
Field   0       IM[15:8]   TE   0     Spacing

The IV bit in the Cause register is the global enable bit for the enhanced interrupt features. If this bit is clear then interrupt operation is compatible with RM5200 and RM7000 products.

In the Interrupt Control register, the interrupt vector spacing is controlled by the Spacing field as described below. The Interrupt Mask field (IM[13:8]) contains the interrupt mask for interrupts 8 through 13. IM[15:14] are reserved for future use.

The Timer Enable (TE) bit is used to gate the Timer Interrupt to the Cause Register. If TE is set to 0, the Timer Interrupt is not gated to IP12. If TE is set to 1, the Timer Interrupt is gated to IP12.

The setting for Mode Bit 11 is used to determine if the Timer Interrupt replaces the external interrupt (INT5*) as an input to IP7 in the Cause Register. If Mode Bit 11 is set to 0, the Timer Interrupt is gated to IP7. If Mode Bit 11 is set to 1, the external INT5* is gated to IP7.

In order to utilize both the external Interrupt (INT5*) and the internal Timer Interrupt, Mode Bit 11 must be set to 1, and TE must be set to 1. In this case, the Timer Interrupt will utilize IP12, and INT5* will utilize IP7. Please also reference the logic diagram for interrupt signals in the RM7965A User Manual.
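
The four combinations of TE and Mode Bit 11 described above can be tabulated with a short Python sketch. It is only a reading of the prose, not a substitute for the logic diagram in the User Manual. The function name `route_interrupts` is hypothetical, and representing "not gated" as `None` is an assumption made for illustration.

```python
def route_interrupts(te, mode_bit_11):
    """Return which source drives IP7 and IP12 for the given settings.

    te:          Timer Enable bit of the Interrupt Control register (0 or 1)
    mode_bit_11: boot-time Mode Bit 11 (0 or 1)
    """
    # Mode Bit 11 selects between the Timer Interrupt and INT5* for IP7.
    ip7 = "INT5*" if mode_bit_11 else "Timer"
    # TE gates the Timer Interrupt onto IP12; None means "not gated".
    ip12 = "Timer" if te else None
    return {"IP7": ip7, "IP12": ip12}


for te in (0, 1):
    for mode in (0, 1):
        print(f"TE={te} ModeBit11={mode} -> {route_interrupts(te, mode)}")
```

Only the setting TE=1 with Mode Bit 11=1 yields both sources at once, with INT5* on IP7 and the Timer Interrupt on IP12, which matches the paragraph above.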

The Interrupt Control register uses IM13 to enable the Performance Counter interrupt and to enable the Trace Buffer interrupt.

Priority of the interrupts is set via two new coprocessor 0 registers called Interrupt Priority Level Lo (IPLLO) and Interrupt Priority Level Hi (IPLHI).

In the IPLLO and IPLHI registers, each interrupt is represented by a four-bit field, thereby allowing each interrupt to be programmed with a priority level from 0 to 15 inclusive. The priorities can be set in any manner, including having all the priorities set exactly the same. Priority 0 is the highest level and priority 15 the lowest. The format of the priority level registers is shown in Table 10 and Table 11. The priority level registers are located in the coprocessor 0 control register space.

Table 10 IPLLO Register

Bits    31:28   27:24   23:20   19:16   15:12   11:8   7:4    3:0
Field   IPL7    IPL6    IPL5    IPL4    IPL3    IPL2   IPL1   IPL0

Table 11 IPLHI Register

Bits    31:28   27:24   23:20   19:16   15:12   11:8    7:4    3:0
Field   0       0       IPL13   IPL12   IPL11   IPL10   IPL9   IPL8

In addition to programmable priority levels, the RM7965A also permits the spacing between interrupt vectors to be programmed. For example, the minimum spacing between two adjacent vectors is 0x20 while the maximum is 0x200. This programmability allows the user to either set up the vectors as jumps to the actual interrupt service routines or, if interrupt latency is not paramount, to include the entire interrupt service routine at one vector. Table 12 illustrates the complete set of vector spacing selections along with the coding as required in the Interrupt Control register bits [4:0], ICR.
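
The field layout of the priority level registers in Table 10 and Table 11 can be exercised with a small packing routine. The Python sketch below only computes the two 32-bit register values; writing them to the coprocessor 0 control space is outside its scope. The function name `pack_ipl` is hypothetical.

```python
def pack_ipl(priorities):
    """Pack priority levels for interrupts 0..13 into (IPLLO, IPLHI) values.

    priorities[n] is the 4-bit level for interrupt n; 0 is the highest
    priority and 15 the lowest.
    """
    if len(priorities) != 14:
        raise ValueError("expected one priority per interrupt 0..13")
    ipllo = 0
    iplhi = 0
    for irq, level in enumerate(priorities):
        if not 0 <= level <= 15:
            raise ValueError(f"priority for interrupt {irq} out of range")
        if irq < 8:
            # IPL0 occupies bits 3:0 of IPLLO, IPL7 occupies bits 31:28.
            ipllo |= level << (4 * irq)
        else:
            # IPL8 occupies bits 3:0 of IPLHI, IPL13 occupies bits 23:20.
            iplhi |= level << (4 * (irq - 8))
    return ipllo, iplhi


lo, hi = pack_ipl(list(range(14)))
print(f"IPLLO=0x{lo:08X} IPLHI=0x{hi:08X}")
```

Bits 31:24 of IPLHI are never set by this routine, consistent with the zero fields shown in Table 11.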

In general, the active interrupt priority, combined with the spacing setting, generates a vector offset, which is then added to the interrupt base address of 0x200 to generate the interrupt exception offset. This offset is then added to the exception base to produce the final interrupt vector address.

Table 12 Interrupt Vector Spacing

ICR[4:0]   Spacing
0x0        0x000
0x1        0x020
0x2        0x040
0x4        0x080
0x8        0x100
0x10       0x200
others     reserved

4.20 Standby Mode

The RM7965A provides a means to reduce the amount of power consumed by the internal core when the CPU is not performing any useful operations. This state is known as Standby Mode.

Executing the WAIT instruction enables interrupts and causes the processor to enter Standby Mode. If the SysAD bus is currently idle when the WAIT instruction completes the W pipe stage, the internal processor clock stops, thereby freezing the pipeline. The phase-locked loop (PLL), the internal timer/counter, and the "wake up" input pins (INT[9:0]*, NMI*, ExtReq*, Reset*, and ColdReset*) continue to operate in their normal fashion.

If the SysAD bus is not idle when the WAIT instruction completes the W pipe stage, then the WAIT is treated as a NOP. Once the processor is in Standby, any interrupt, including the internally generated Timer Interrupt, causes the processor to exit Standby and resume operation where it left off. The WAIT instruction is typically inserted in the idle loop of the operating system or real time executive.

4.21 JTAG Interface

The RM7965A interface supports JTAG boundary scan in conformance with IEEE 1149.1. The JTAG interface is useful for checking the integrity of the processor's pin connections.

4.22 Reset Sequence

The RM7965A uses the same reset interface that is used on the RM7000 and RM5200 product families. This single reset interface is used to reset the entire system.

Both power on reset and cold reset completely initialize the RM7965A. The configuration mode bit stream is read into the device to configure the E9000 core and the external bus interface (see Section 8). The configuration stream is read in when the VccOK input signal has been asserted and ColdReset* and Reset* remain asserted. During this time, the PLL is achieving lock.

5 Cache Architecture

The E9000 cache architecture is similar to that of the RM7000. Each core contains 16 KB of instruction cache, 16 KB of data cache, and 256 KB of unified secondary cache. The instruction cache, data cache and secondary cache are all four-way set associative. Cache locking is supported for all of the caches, and the caches can be locked with line granularity. This is very useful for keeping frequently called routines in the cache, along with frequently accessed data structures such as look-up tables for routing and other data communications applications. The E9000 data cache is non-blocking, and the pipeline will not stall until a third cache-miss or a data dependency is encountered.

Each primary cache has a 64-bit read path and a 128-bit write path. Both caches can be accessed simultaneously. The primary caches provide the integer and floating-point units with an aggregate bandwidth of 14.4 GB/s at an internal clock frequency exceeding 800 MHz. During an instruction or data primary cache refill, the secondary cache can provide a 64-bit datum every cycle following an initial five-cycle latency, for a peak bandwidth of 7.2 GB/s.

5.1 Instruction Cache

The integrated 16 KB, four-way set associative instruction cache in the E9000 is virtually indexed and physically tagged. The effective physical index eliminates the potential for virtual aliases in the cache.

The data array portion of the instruction cache is 64 bits wide and protected by word parity while the tag array holds a 24-bit physical address, 14 control bits, a valid bit, and a single parity bit.

By accessing 64 bits per cycle, the instruction cache is able to supply two instructions per cycle to the superscalar dispatch unit. For signal processing, graphics, and other numerical code sequences where a floating-point load or store and a floating-point computation instruction are being issued together in a loop, the entire bandwidth available from the instruction cache is consumed by instruction issue. For typical integer code mixes, where instruction dependencies and other resource constraints restrict the level of parallelism that can be achieved, the extra instruction cache bandwidth is used to fetch both the taken and non-taken branch paths to minimize the overall penalty for branches.

A 32-byte (8 instruction) line size is used to maximize the communication efficiency between the instruction cache and the secondary cache, tertiary cache, or memory system.

The E9000 supports cache locking on a per line basis. The contents of each line of the cache can be locked by setting a bit in the Tag RAM. Locking the line prevents its contents from being overwritten by a subsequent cache miss. Refills occur only into unlocked cache lines. This mechanism allows the programmer to lock critical code into the cache, thereby guaranteeing deterministic behavior for the locked code sequence.
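
The cache figures quoted in this section can be cross-checked with a little arithmetic, sketched below in Python. Two points are assumptions rather than statements from this data sheet. First, the 900 MHz clock is taken from the part name. Second, the "effective physical index" remark is read in the usual way: the index and line-offset bits all lie inside the 4 KB minimum page offset, so virtual and physical index bits are identical. The helper names are hypothetical.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Derive per-way size, set count, and address bit widths."""
    way_bytes = size_bytes // ways
    sets = way_bytes // line_bytes
    return {
        "way_bytes": way_bytes,
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2 for powers of two
        "index_bits": sets.bit_length() - 1,
    }


def peak_bandwidth(bus_bits, clock_hz):
    """Bytes per second when one bus-wide datum moves every cycle."""
    return (bus_bits // 8) * clock_hz


geom = cache_geometry(16 * 1024, 4, 32)   # primary cache: 16 KB, 4-way, 32 B lines
page_offset_bits = (4 * 1024).bit_length() - 1  # 4 KB minimum page -> 12 bits

# Index + offset bits fit inside the page offset, so no virtual aliases arise.
alias_free = geom["offset_bits"] + geom["index_bits"] <= page_offset_bits

one_path = peak_bandwidth(64, 900_000_000)  # a single 64-bit path
print(geom, alias_free, one_path, 2 * one_path)
```

Under these assumptions a single 64-bit path gives 7.2 GB/s and the two primary caches together give 14.4 GB/s, in line with the figures quoted above.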

5.2 Data Cache

The E9000 has an integrated 16 KB, four-way set associative data cache that is virtually indexed and physically tagged. Line size is 32-bytes (8 words). The effective physical index eliminates the potential for virtual aliases in the cache.

The data cache is non-blocking; that is, a miss in the data cache does not necessarily stall the processor pipeline. As long as no instruction is encountered that is dependent on the data reference that caused the miss, the pipeline continues to advance. Once there are two cache misses outstanding, the processor stalls if it encounters another load or store instruction.

The data array portion of the data cache is 64 bits wide and protected by byte parity while the tag array holds a 24-bit physical address, 3 control bits, a 2-bit cache state field, and 2 parity bits.

The most commonly used write policy is write-back, which means that a store to a cache line does not immediately cause memory to be updated. This increases system performance by reducing bus traffic and eliminating the bottleneck of waiting for each store operation to finish before issuing a subsequent memory operation. Software can, however, select write-through on a per-page basis when appropriate, such as for frame buffers. Cache protocols supported for the data cache are as follows:

1. Uncached

Reads to addresses in a memory area identified as uncached do not access the cache. Writes to such addresses are written directly to main memory without updating the cache.

2. Write-back

Loads and instruction fetches first search the cache, reading the next memory hierarchy level only if the desired data is not cache resident.
On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the cache contents are updated and the cache line is marked for later write-back. If the cache lookup misses, the target line is first brought into the cache, after which the write is performed as above.

3. Write-through with write allocate

Loads and instruction fetches first search the cache, reading from memory only if the desired data is not cache resident; write-through data is never cached in the secondary or tertiary caches. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the primary cache contents are updated and main memory is written, leaving the write-back bit of the cache line unchanged; no writes occur to the secondary or tertiary caches. If the cache lookup misses, the target line is first brought into the cache, after which the write is performed as above.

4. Write-through without write allocate

Loads and instruction fetches first search the cache, reading from memory only if the desired data is not cache resident; write-through data is never cached in the secondary or tertiary caches. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the cache contents are updated and main memory is written, leaving the write-back bit of the cache line unchanged; no writes occur to the secondary or tertiary caches. If the cache lookup misses, only main memory is written.

5. Fast Packet Cache™ (Write-back with secondary and tertiary bypass)

Loads and instruction fetches first search the primary cache, reading from memory only if the desired data is not resident; the secondary and tertiary caches are not searched. On data store operations, the primary cache is first searched to determine if the target address is resident. If it is resident, the cache contents are updated, and the cache line marked for later write-back. If the cache lookup misses, the target line is first brought into the cache, after which the write is performed as above.

Associated with the data cache is the store queue. When the E9000 executes a store instruction, this multi-entry queue is written with the store data while the tag comparison is performed. If the tag matches, then the data is written into the data cache in the next cycle that the data cache is not accessed (the next non-load cycle). The store queue allows the E9000 to execute a store every processor cycle and to perform back-to-back stores without penalty. In the event of a store immediately followed by a load to the same address, a combined merge and cache write occurs such that no penalty is incurred.

5.3 Secondary Cache

The E9000 has an integrated 256 KB, four-way set associative, and block write-back secondary cache. The secondary cache has a 32-byte line size, a 64-bit bus width to match the system interface and primary cache bus widths, and is protected with the same Error Checking and Correcting (ECC) mechanism used in the R4000 processor. The secondary cache tag array holds a 20-bit physical address, two control bits, a 3-bit cache state field, and two parity bits.

By integrating a secondary cache, the E9000 is able to decrease the latency of a primary cache miss without significantly increasing the number of pins and the amount of power required by the processor.
From a technology point of view, integrating a secondary cache leverages CMOS technology by using silicon to build the structures that are most amenable to it: very dense, low-power memory arrays rather than large, power-hungry I/O buffers.

Further benefits of an integrated secondary cache are flexibility in cache organization and management policies that are not practical with an external cache. Two previously mentioned examples are the 4-way associativity and the write-back cache protocol. A third management policy for which integration affords flexibility is cache hierarchy management. With multiple levels of cache, it is necessary to specify a policy for dealing with cases where two cache lines at level n of the hierarchy could share an entry in level n+1 of the hierarchy.

The E9000 allows entries to be stored in the primary caches that do not necessarily have a corresponding entry in the secondary; that is, the E9000 does not force the primaries to be a subset of the secondary. For example, if primary cache line A is being filled and a cache line already exists in the secondary for primary cache line B at the location where primary A's line would reside, then that secondary entry is replaced by an entry corresponding to primary cache line A, and no action occurs in the primary for cache line B. This operation creates the aforementioned scenario where a primary cache line that initially had a corresponding secondary entry no longer has one. Such a primary line is called an orphan. In general, cache lines at level n+1 of the hierarchy are called parents of level n's children.
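The orphan scenario can be illustrated with a toy model. This is only a sketch under stated assumptions: it models a single set index shared by the primary instruction cache, the primary data cache, and the secondary cache, uses four ways and cyclic replacement in each, and fills the secondary on every primary fill. The class and function names are hypothetical and do not correspond to any hardware structure.

```python
class CacheSet:
    """One set of a set-associative cache with cyclic (round-robin) replacement."""

    def __init__(self, ways=4):
        self.ways = [None] * ways
        self.next_victim = 0

    def fill(self, line):
        """Install `line`; return the displaced line, or None if nothing was displaced."""
        if line in self.ways:
            return None
        victim = self.ways[self.next_victim]
        self.ways[self.next_victim] = line
        self.next_victim = (self.next_victim + 1) % len(self.ways)
        return victim


def simulate_orphan():
    """Fill one secondary set from both primaries; return primary lines with no parent."""
    l1_data, l1_instr, l2 = CacheSet(), CacheSet(), CacheSet()

    def primary_fill(primary, line):
        primary.fill(line)
        # Assumption: the displaced secondary line is simply dropped and the
        # primaries are not touched (no forced inclusion).
        l2.fill(line)

    for line in ["A", "B", "C", "D"]:  # four data lines that share one secondary set
        primary_fill(l1_data, line)
    primary_fill(l1_instr, "E")        # a fifth line in the same set displaces A

    resident = [x for x in l1_data.ways + l1_instr.ways if x is not None]
    return [x for x in resident if x not in l2.ways]


print(simulate_orphan())  # ['A']
```

Line A remains in the primary data cache after its parent has been replaced in the secondary, which is the orphan case described above.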
Another E9000 cache management optimization occurs for the case of a secondary cache line replacement where the secondary line is dirty and has a corresponding dirty line in the primary. In this case, since it is permissible to leave the dirty line in the primary, it is not necessary to write the secondary line back to main memory. Taking this scenario one step further, a final optimization occurs when the aforementioned dirty primary line is replaced by another line and must be written back. In this case, it is written directly to memory, bypassing the secondary cache.

5.3.1 Secondary Caching Protocols

Unlike the primary data cache, the secondary cache supports only block write-back. As noted earlier, cache lines managed with either of the write-through protocols are not placed in the secondary cache. A new caching attribute, write-back with secondary and tertiary bypass, allows the secondary and tertiary caches to be bypassed entirely. When this attribute is selected, the secondary and tertiary caches are not filled on load misses and are not written on dirty writebacks from the primary cache.

5.3.2 Fast Packet Cache Mode

It is possible to bypass the secondary cache using the Fast Packet Cache feature. Fast Packet Cache can be activated on a per-page basis, and allows all cache accesses and all writebacks to use only the primary data cache. This is useful for manipulating transient packet data and headers without evicting other, less transient data from the L2 cache.

Figure 6 illustrates the two-level cache hierarchy and shows the tight coupling of the primary and secondary caches. The primary cache accesses occur at the core frequency.

If there is a primary miss that hits in secondary, then a 5-cycle miss penalty occurs.
This latency is best in class for a processor in this performance range, and helps optimize the E9000 core for the highest possible performance.

Figure 6 Fast Packet Cache Mode
[Block diagram: CPU; Primary Cache (L1), Instr 16 KB and Data 16 KB, 1-1-1-1 (core clocks); Secondary Cache (L2), 256 KB, 4-way associative, 5-1-1-1 (core clocks); Fast Packet Cache (Bypass Mode)]

5.4 Cache Modes

Table 13 summarizes the E9000 cache operating modes. The coherency attributes referred to in Table 13 are written into the TLB entry to program the coherency attribute for that page.

Table 13 E9000 Cache Operating Modes

| Cache Coherency Attribute | Read miss to MM | Store Hit | Store Miss | L2 |
|---|---|---|---|---|
| 000: Write-through No Allocate | Fill L1 | Store L1 and MM | Store to MM | Receives L1 displacements |
| 001: Write-through with Allocate | Fill L1 | Store L1 and MM | Store to L1 and MM; Fill L1 | Receives L1 displacements |
| 010: Uncached blocking; Uncached. Reads stall pipeline. Strong ordering enforced. Loads and stores complete in program order | - | - | - | - |
| 011: Writeback | Fill L1 and L2 | Store L1 | Store Miss L1, L2: Read MM-> L1, L2, Store L1. Store Miss, Hit L2: Read L2->L1, Store L1 | Receives L1 displacements |
| 100: Reserved | - | - | - | - |
| 101: Reserved | - | - | - | - |
| 110: Uncached NonBlocking; Uncached. Reads do not stall pipeline unless a data dependency exists. Strong ordering not enforced, therefore loads can be completed out of program order | - | - | - | - |
| 111: Bypass (Fast Packet Cache); Bypass L2 | Fill L1 | Store L1 | Fill L1, Store L1 | Bypassed |

5.5 Cache Attributes

The RM7965A cache attributes for the instruction, data, and internal secondary caches are summarized in Table 14.

Table 14 RM7965A Cache Attributes

| Attribute | Primary Instruction | Primary Data | On-chip Secondary |
|---|---|---|---|
| Size | 16 KB | 16 KB | 256 KB |
| Associativity | 4-way | 4-way | 4-way |
| Replacement Algorithm | cyclic | cyclic | cyclic |
| Line size | 32 byte | 32 byte | 32 byte |
| Index | vAddr11..0 | vAddr11..0 | pAddr15..0 |
| Tag | pAddr35..12 | pAddr35..12 | pAddr35..16 |
| Write policy | N/A | write-back, write-through | block write-back, bypass |
| Read policy | N/A | non-blocking (2 outstanding) | non-blocking (data only, 2 outstanding) |
| Read order | critical word first | critical word first | critical word first |
| Write order | N/A | sequential | sequential |
| Miss restart following | complete line | first double (if waiting for data) | N/A |
| Protection | per word parity | per byte parity | 8-bit ECC per DW |

5.6 Cache Locking

The E9000 core in the RM7965A product allows critical code or data fragments to be locked into the primary and secondary caches. The user has complete control over the locking function. For instruction and data fragments in the primary caches, locking is accomplished by setting either or both of the cache lock enable bits and specifying the set in the CP0 ECC register, then executing either a load instruction for data or a Fill_I cache operation for instructions.

Only cache lines within sets A and B of each cache can be locked. Locking within the secondary works identically to the primaries using a separate secondary lock enable bit and the same set selection field.
As with the primaries, only sets A and B can be locked. Table 15 summarizes the cache locking capabilities.

Table 15 Cache Locking Control

| Cache | Lock Enable | Set Select | Activate |
|---|---|---|---|
| Primary I | ECC[27] | ECC[28]=0: A; ECC[28]=1: B | Fill_I |
| Primary D | ECC[26] | ECC[28]=0: A; ECC[28]=1: B | Load/Store |
| Secondary | ECC[25] | ECC[28]=0: A; ECC[28]=1: B | Fill_I or Load/Store |

5.7 Primary Write Buffer

Writes to secondary cache or external memory, whether cache miss write-backs or stores to uncached or write-through addresses, use the integrated primary write buffer. The write buffer holds up to four 64-bit address and data pairs. The entire buffer is used for a data cache writeback and allows the processor to proceed in parallel with memory update. For uncached and write-through stores, the write buffer significantly increases performance by decoupling the SysAD bus transfers from the instruction execution stream.

5.8 Data Prefetch

The E9000 supports the MIPS IV integer data prefetch (PREF) and floating-point data prefetch (PREFX) instructions. These instructions are used by the compiler or by an assembly language programmer when it is known or suspected that an upcoming data reference is going to miss in the cache. By appropriately placing a prefetch instruction, the memory latency can be hidden under the execution of other instructions. In cases where the execution of a prefetch instruction would cause a memory management or address error exception, the prefetch is treated as a NOP.

The "Hint" field of the data prefetch instruction is used to specify the action taken by the instruction. The instruction can operate normally (that is, fetching data as if for a load operation) or it can allocate and fill a cache line with zeroes on a primary data cache miss.
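Returning to the cache locking controls of Section 5.6, the bit positions in Table 15 can be combined as in the following sketch. It only computes the lock-related bits of the CP0 ECC register value; the function name and the string keys are hypothetical, and a real driver would presumably read the register, merge these bits, and write it back before issuing the load or Fill_I operation that activates the lock.

```python
# Bit positions are taken from Table 15; every name below is illustrative only.
LOCK_ENABLE_BIT = {"primary_i": 27, "primary_d": 26, "secondary": 25}
SET_SELECT_BIT = 28  # 0 selects set A, 1 selects set B


def ecc_lock_bits(caches, lock_set):
    """Return the ECC register bits that enable locking for `caches` in `lock_set`."""
    if lock_set not in ("A", "B"):
        raise ValueError("only sets A and B can be locked")
    value = 0
    for cache in caches:
        value |= 1 << LOCK_ENABLE_BIT[cache]
    if lock_set == "B":
        value |= 1 << SET_SELECT_BIT
    return value


# Lock data lines into set A of the primary data cache.
print(hex(ecc_lock_bits(["primary_d"], "A")))               # 0x4000000
# Lock into set B of both the primary instruction and secondary caches.
print(hex(ecc_lock_bits(["primary_i", "secondary"], "B")))  # 0x1a000000
```

Because the set select field is shared, one register value selects the same set (A or B) for every cache whose lock enable bit is set.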
5.9 Memory Latencies

Table 16 is a compilation of latencies for the different types of on-chip memory accesses for the E9000. Local cache accesses to the L1 occur at the CPU core frequency, and local L1 misses access L2 with a 5-cycle miss penalty.

Table 16 On-Chip Memory Latencies

| Type of Burst Memory Access | Number of Processor Clocks per Double Word |
|---|---|
| Local L1 Hit | 1-1-1-1 |
| Local L2 Hit | 5-1-1-1 |

6 System Interface

The RM7965A provides a high-performance system interface consisting of a multiplexed address/data bus (SysAD), a parity check bus (SysADC), and a system command bus (SysCmd). SysAD is 64 bits wide, SysADC is 8 bits wide, and SysCmd is 9 bits wide.

Figure 7 shows a typical embedded system using the RM7965A. The diagram shows a system with a bank of DRAMs, and an external agent or ASIC which provides DRAM control and I/O functionality.

Figure 7 Typical Embedded System Block Diagram with 64-bit SysAD Bus
[Block diagram. Labels: RM7965A; External Agent; DRAM; Flash/Boot ROM; PCI Bus; SysAD + SysADC (72); SysCmd + Control (25, typ); Address; Control]

There are many companion chips or system controllers that interface to the SysAD bus and provide connectivity to a variety of interfaces including PCI, PCI-X, Fast Ethernet, Gigabit Ethernet, and T1/T3. They typically include a boot bus that connects to flash or ROM memory, which can be used to boot any RM7965A CPU across the SysAD bus.
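Section 6 describes SysADC only as an 8-bit parity check bus accompanying the 64-bit SysAD. The sketch below shows how such check bits could be formed. It rests on three assumptions that this section does not state: one parity bit per byte lane, even parity by default, and check bit i covering SysAD bits 8i+7..8i. The function name is hypothetical.

```python
def sysadc_check_bits(data, even=True):
    """Return 8 check bits, one per byte lane of a 64-bit SysAD value.

    Assumed mapping: check bit i covers data bits 8*i+7 .. 8*i.
    With even=True, each byte plus its check bit holds an even number of ones.
    """
    if not 0 <= data < (1 << 64):
        raise ValueError("data must fit in 64 bits")
    check = 0
    for lane in range(8):
        byte = (data >> (8 * lane)) & 0xFF
        bit = bin(byte).count("1") & 1  # 1 when the byte has an odd number of ones
        if not even:
            bit ^= 1
        check |= bit << lane
    return check


print(hex(sysadc_check_bits(0x0300000000000001)))  # 0x1
```

In the usage line, byte lane 0 holds a single one and needs a check bit of 1, while byte lane 7 holds two ones and needs 0.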
6.1 System Address/Data Bus

The RM7965A features an enhanced version of the multiplexed address/data bus (SysAD) first introduced with the RM70xxC products. The function of the SysAD bus is to transfer addresses and data between the CPU and the rest of the system. The enhanced version can run at up to 200 MHz, providing up to 12.8 Gbit/s (1.6 GB/s) of bandwidth. It supports legacy designs with a seamless upgrade path for all RM70xx and RM52xx processors, and maintains compatibility with all existing and future companion chips that use SysAD functionality.

The 64-bit SysAD bus on the RM7965A supports 36 bits of physical addressing and 64 bits of data, and is accompanied by an 8-bit parity check bus (SysADC[7:0]) and a 9-bit command bus (SysCmd[8:0]). In addition, there are ten handshake signals and ten interrupt inputs. The bus can run at up to 133 MHz in standard LVTTL mode, or up to 200 MHz in the enhanced HSTL mode. The SysAD bus runs at the same frequency as the RM7965A master clock. The SysAD interface also supports up to two outstanding reads, and read data can be returned out of order.

The SysAD bus is also configurable to allow easy interfacing to memory and I/O systems of varying frequencies. The data rate and the bus frequency at which the RM7965A transmits data to the system interface are programmable at boot time via mode control bits. Additionally, the rate at which the processor receives data is fully controlled by the external device. Therefore, either a low-cost interface requiring no read or write buffering, or a faster, high-performance interface can be designed to communicate with the RM7965A processor.
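The peak bandwidth quoted in this section follows directly from the bus width and clock rate, and a programmed data rate pattern scales it by the fraction of data cycles. The sketch below is a rough upper-bound calculation only: it ignores address cycles and handshake overhead, treats a pattern string as D (data) and x (idle) cycles, and uses a hypothetical function name.

```python
def sysad_bandwidth(bus_mhz, pattern="DD", width_bits=64):
    """Return (Gbit/s, GB/s) for a SysAD clock in MHz and a data rate pattern.

    `pattern` uses D for a data cycle and x for an idle cycle, for example
    "DDxx"; letter case is ignored. Address and handshake cycles are not modeled.
    """
    cycles = pattern.lower()
    if not cycles or set(cycles) - {"d", "x"}:
        raise ValueError("pattern must consist of D and x only")
    data_fraction = cycles.count("d") / len(cycles)
    bits_per_second = bus_mhz * 1_000_000 * width_bits * data_fraction
    return bits_per_second / 1e9, bits_per_second / 8 / 1e9


print(sysad_bandwidth(200))          # approximately (12.8, 1.6)
print(sysad_bandwidth(133))          # approximately (8.512, 1.064), LVTTL maximum
print(sysad_bandwidth(200, "DDxx"))  # approximately (6.4, 0.8)
```

The first call reproduces the 12.8 Gbit/s (1.6 GB/s) figure for a 200 MHz bus that transfers a doubleword on every clock.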
6.2 System Command Bus

All RM7965A processors feature a 9-bit System Command bus, SysCmd[8:0]. The command bus indicates whether the SysAD bus carries address or data information on a per-clock basis. If the SysAD bus carries an address, the SysCmd bus indicates the transaction type (for example, a read or write). If the SysAD bus carries data, then the SysCmd bus contains information about the data (for example, this is the last data word transmitted, or the data contains an error). The SysCmd bus is bidirectional to support both processor requests and external requests to the RM7965A. Processor requests are initiated by the RM7965A and responded to by an external device. External requests are issued by an external agent and require the RM7965A to respond.

The RM7965A supports 1- to 8-byte transfers as well as 32-byte block transfers on the SysAD bus. In the case of a sub-doubleword transfer, the 3 low-order address bits give the byte address of the transfer, and the SysCmd bus indicates the number of bytes being transferred.

6.3 Handshake Signals

There are 10 handshake signals on the system interface of the RM7965A. Two of these, RdRdy* and WrRdy*, are common to all RM7965A CPUs. They are driven by an external agent to indicate to the RM7965A whether the agent can accept a new read or write transaction. The RM7965A samples these signals before deasserting the address on read and write requests.

ExtRqst* and Release* are also common to all RM7965A CPUs. They are used to transfer control of the SysAD and SysCmd buses from the processor to an external agent. When an external agent requires control of the bus, it asserts ExtRqst*.
The RM7965A responds by asserting Release* to release the system interface to slave state.

PRqst* and PAck* are supported by the RM7965A. These signals are used to transfer control of the SysAD and SysCmd buses from the external agent to the processor. These two pins have been added to the system interface to support multiple outstanding reads and facilitate non-blocking cache operations. When the processor needs to reacquire control of the interface, it asserts PRqst*. The external agent responds by asserting PAck* to return control of the interface to the processor.

RspSwap* is used by the external agent to indicate to the processor when it is returning multiple data requests out of order. For example, when there are two outstanding reads, the external agent asserts RspSwap* when it is going to return the data for the second read before it returns the data for the first read.

RdType is a pin on the interface that indicates whether a read is an instruction read or a data read. When asserted, the reference is an instruction read. When deasserted, it is a data read. RdType is only valid during valid address cycles.

ValidOut* and ValidIn* are used by the RM7965A and its external agents to indicate that there is a valid command and data on the SysAD and SysCmd buses. The RM7965A asserts ValidOut* when it is driving these buses with a valid command and data, and the external agent drives ValidIn* when it has control of the system interface and is driving a valid command and data.

6.4 System Interface Operation

To support non-blocking caches and data prefetch instructions, the RM7965A allows two outstanding reads.
An external agent may respond to read requests in whatever order it chooses by using the response order indicator pin RspSwap*. No more than two read requests are outstanding to the external agent. Support for multiple outstanding reads can be enabled or disabled via a boot-time mode bit. Refer to Section 8 for a complete list of mode bits.

The RM7965A can issue read and write requests to an external agent, while an external agent can issue null and read responses to the RM7965A.

For processor reads, the RM7965A asserts ValidOut* and simultaneously drives the address and read command on the SysAD and SysCmd buses. If the system interface has RdRdy* asserted, then the processor tristates its drivers and signals the release of the system interface to slave state by asserting Release*. The external agent can then begin sending data to the RM7965A.

Figure 8 shows a processor block read request and the external agent read response for a system.

Figure 8 Processor Block Read
[Timing diagram. Signals: SysClock; SysAD (Addr, Data0, Data1, Data2, Data3); SysCmd (Read, NData, NData, NData, NEOD); ValidOut*; ValidIn*; RdRdy*; WrRdy*; Release*]

In Figure 8 the read latency is 4 cycles (ValidOut* to ValidIn*), and the response data pattern is DDxxDD. Figure 9 shows a processor block write where the processor was programmed with write-back data rate boot code 2, or DDxxDDxx.

Figure 9 Processor Block Write
[Timing diagram. Signals: SysClock; SysAD (Addr, Data0, Data1, Data2, Data3); SysCmd (Write, NData, NData, NData, NEOD); ValidOut*; ValidIn*; RdRdy*; WrRdy*; Release*]

Figure 10 shows a typical RM7965A sequence resulting in two outstanding reads, as explained in the following sequence:

1. The processor issues a read.
2. The external agent takes control of the bus in preparation for returning data to the processor.
3. The processor encounters another internal cache miss and therefore asserts PRqst* in order to regain control of the bus.
4. The external agent pulses PAck*, returning control of the bus to the processor.
5. The processor issues a read for the second miss.
6. The RspSwap* pin is asserted to denote the out-of-order response.
7. The external agent retakes control of the bus and begins returning data (out of order) for the second miss to the processor.

Not shown in the figure is the completion of the data transfer for the second miss, or any of the data transfer for the first miss.

Figure 10 Multiple Outstanding Reads
[Timing diagram with numbered callouts 1 to 8. Signals: SysClock; SysAD (Addr1, Addr2, Data0, Data1); SysCmd (Read1, Read2, NData, NData); RspSwap*; ValidOut*; ValidIn*; Release*; PRqst*; PAck*; TcMatch. The bus master alternates between Processor and System.]

6.5 Write Modes

The RM7965A implements two write modes: Pipelined Writes and Write Reissue. Pipelined write mode eliminates the two wait states that otherwise separate consecutive writes by allowing the processor to drive a new write address onto the bus immediately after the previous data cycle. This allows for higher SysAD bus utilization. At high frequencies, however, the processor may drive a subsequent write onto the bus before the external agent deasserts WrRdy* to indicate that it cannot accept another write cycle. This can cause the cycle to be missed.
Write reissue mode is an enhancement to pipelined write mode and allows the processor to reissue missed write cycles. If WrRdy* is deasserted during the issue phase of a write operation, the cycle is aborted by the processor and reissued at a later time.

In write reissue mode, a rate of one write every two bus cycles can be achieved. Pipelined writes have the same two-bus-cycle write repeat rate, but can issue one additional write following the deassertion of WrRdy*.

7 Integrated Debug

The E9000 extends the debugging features found on the RM7000 and adds EJTAG Debugging and a 64-entry Branch Instruction Trace Buffer.

7.1 EJTAG Debugging

The EJTAG 2.5 standard is implemented to allow access to the processor subsystem through the EJTAG port. This allows an emulator to be plugged into the EJTAG port to single-step, modify memory and registers, and provide hardware breakpoints. EJTAG mode on the RM7965A is selected by using the JTAGSEL pin. When JTAGSEL is set to "1", JTAG is selected. When JTAGSEL is set to "0", EJTAG is selected.

A new exception vector at 0xBFC0_0240 is allocated for EJTAG Debugging. In addition, a Debug Register Section at 0xff20_0000 and a Debug Memory Section at 0xff30_0000 to 0xff3f_ffff are implemented.

Two new instructions have been added to support on-chip debugging. A Software Debug BreakPoint (SDBBP) allows breakpoints to be taken by the code. Once in the debug exception handler, the Debug Return (DERET) instruction is used to exit the debug exception handler.

Three new CP0 registers have been added in the CP0 system address space to support EJTAG functionality. The EJTAG_Debug register at CP0_23 serves as the control and status register. The EJTAG_DEPC register at CP0_24 serves the same purpose for the debug exception as the EPC register does for general exceptions. The EJTAG_DESave register at CP0_31 is used as a general purpose "save area" for EJTAG debug support. See Figure 5.

7.2 Trace Buffer

A trace buffer is implemented on the processor core to allow tracing of instruction flow. The trace buffer is 64 entries deep and captures branch addresses and branch target addresses so that the precise flow of instruction execution can be reconstructed. Because only branch and branch target addresses are stored, the reconstructed instruction stream can be many times longer than the trace buffer itself. The trace buffer can trigger an interrupt when it is 1/4, 1/2, 3/4, or completely full. If no interrupt is set, the buffer wraps around. The trace buffer shares the IP13 interrupt with the Performance Counters.

To support the Trace Buffer, three new CP0 registers are implemented in the CP0 control address space. The Trace Buffer Control and Status (TB CSR) register is at CP0_22 and performs the function its name suggests. The Trace Buffer Index (TB IDX) register is at CP0_24 and is the address into the trace buffer. The Trace Buffer Out (TB Out) register is at CP0_23 and contains data from the read at the index given in the TB IDX register.

7.3 Test/Breakpoint Registers

To facilitate hardware and software debugging, the RM7965A incorporates a pair of Test/Breakpoint, or Watch, registers, called Watch1 and Watch2. Each Watch register can be separately enabled to watch for a load address, a store address, or an instruction address. All address comparisons are done on virtual addresses.
An associated register, WatchMask, allows either or both of the Watch registers to compare against an address range rather than a specific address. The range granularity is limited to a power of two.

When enabled, a match of either Watch register results in an exception. If the Watch is enabled for a load or store address, then the exception is the Watch exception as defined for the R4000 by Cause exception code 23. If the Watch is enabled for instruction addresses, then an Instruction Watch exception is taken and the Cause exception code is 16. The Watch register that caused the exception is indicated by Cause bits 25:24. Table 17 summarizes a Watch operation.

If the DBEN bit is set, an address comparison will cause a Debug Exception, which vectors to 0xbfc00240.

Table 17 Watch Registers

| Register | 63:59 | 58 | 57 | 56 | 55 | 54 | 53:40 | 39:2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|---|---|
| Watch 1 | Caddr[63:59] | Store | Load | Inst | DBEN | DBOut | Rsvd | Caddr[39:2] | 0 | 0 |
| Watch 2 | Caddr[63:59] | Store | Load | Inst | DBEN | DBOut | Rsvd | Caddr[39:2] | 0 | 0 |
| Watch Mask | Mask[63:59] | | | | | | | Mask[39:2] | Mask Watch2 | Mask Watch 1 |

Note:
1. Reserved

The W1 and W2 bits of the Cause register indicate which Watch register caused a particular Watch exception.

7.4 Performance Counters

The RM7965A provides two CP0 performance-counter registers, PerfCount and PerfControl. The PerfCount register is a 64-bit register divided into two independent 32-bit counters, PerfCounter0 and PerfCounter1. The counters can be written by software to initialize event monitoring, and they generate a performance-counter interrupt when the most significant bit of either counter (bit 63 for the upper counter and bit 31 for the lower counter) is set.
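Because the interrupt is raised when the most significant bit of a 32-bit counter becomes set, software that wants an interrupt after a given number of events can preload the counter accordingly. The following sketch shows that arithmetic only; the function name is hypothetical, and it assumes the counter increments by one per counted event.

```python
COUNTER_BITS = 32
MSB = 1 << (COUNTER_BITS - 1)  # bit 31 of a 32-bit counter


def perf_counter_preload(events_until_interrupt):
    """Return the preload value that sets the counter MSB after the given number of events."""
    if not 1 <= events_until_interrupt <= MSB:
        raise ValueError("event count must be between 1 and 2**31")
    return MSB - events_until_interrupt


# Request an interrupt after one million counted events.
preload = perf_counter_preload(1_000_000)
print(hex(preload))                      # 0x7ff0bdc0
assert (preload + 1_000_000) & MSB != 0  # MSB is set once the events have occurred
```

A watchdog built this way would rewrite the preload value periodically so that the most significant bit is never reached during normal operation.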
The PerfControl register is a 32-bit register containing two 5-bit fields used to select the event type counted by each counter, as well as a handful of bits which control the overall counting function. Note that only one event type can be counted per counter at a time, and that counting can occur for user code, kernel code, or both. The event types and control bits are listed in Table 18.

Table 18 Performance Counter Control

| PerfControl Field | Description |
|---|---|
| 4:0 | Event Type |
| | 00: Clock cycles |
| | 01: Total instructions issued (Integer and Floating Point) |
| | 02: Floating-point instructions issued (any COP1 or COP3) |
| | 03: Integer instructions issued (no COP1 or COP3) |
| | 04: Load instructions issued |
| | 05: Store instructions issued |
| | 06: Dual issued instruction pairs |
| | 07: Branch mispredictions |
| | 08: External Cache Misses |
| | 09: Stall cycles |
| | 0A: Secondary cache misses |
| | 0B: Instruction cache misses |
| | 0C: Data cache misses |
| | 0D: Data TLB misses |
| | 0E: Instruction TLB misses |
| | 0F: Joint TLB instruction misses |
| | 10: Joint TLB data misses |
| | 11: Branches taken |
| | 12: Branches issued |
| | 13: Secondary cache writebacks |
| | 14: Data cache writebacks |
| | 15: Data cache miss stall cycles (a stall occurs when the data cache is processing two misses and a third miss occurs) |
| | 16: Cache misses (all caches) |
| | 17: FP possible exception cycles |
| | 18: Slip cycles due to multiplier busy |
| | 19: Coprocessor 0 slip cycles |
| | 1A: Slip cycles due to pending non-blocking loads |
| | 1B: Stall cycles due to full Write buffer |
| | 1C: Stall cycles due to Cache instruction |
| | 1D: Unused |
| | 1E: Stall cycles due to pending non-blocking loads - stall start of exception |
| 7:5 | Reserved (must be zero) |
| 8 | Count in Kernel Mode. 0: Disable; 1: Enable |
| 9 | Count in User Mode. 0: Disable; 1: Enable |
| 10 | Count Enable. 0: Disable; 1: Enable |
| 31:11 | Reserved (must be zero) |

The performance counter interrupt occurs only when interrupts are enabled in the Status register (IE=1) and Interrupt Mask bit 13 (IM13) of the coprocessor 0 interrupt control register is set. The performance counter shares this interrupt with the 64-entry branch Trace Buffer.

Since a performance counter can be set up to count clock cycles, it can be used as either a second timer or a watchdog interrupt. A watchdog interrupt can be used as an aid in debugging system or software "hangs." Typically the software is set up to periodically update the count so that no interrupt occurs. When a hang occurs, the interrupt ultimately triggers, thereby breaking free from the hang-up.

8 Boot-Mode Settings

The RM7965A operating modes are initialized at power-up by the boot-time mode control interface. The serial boot-time mode control interface operates at a very low frequency (SysClock divided by 256), allowing the initialization information to be kept in a low-cost EPROM or system interface ASIC.

The boot-time serial mode stream is defined below. Bit 0 is presented to the processor as the first bit in the stream following VccOK being asserted. Bit 255 is the last bit transferred. An automated mode bit generation tool (a program that runs on a PC) is available on the PMC-Sierra website.

| Name | Size | Field | Description |
|---|---|---|---|
| Reserved | 1 | 0 | Must be set to 0. |
preBigEndian     1    1        Places the processor in big endian mode.
    0: Little Endian (Little)
    1: Big Endian (Big)
SI32Wide         1    2        Sets the SysAD interface width to 32 bits.
    0: 64-bit SysAD (SADSz64)
    1: 32-bit SysAD (SADSz32)
SADRdOverlap     1    3        Enables overlapping reads on the SysAD interface.
    0: Overlap disabled (OvlpDisabled)
    1: Overlap enabled (OvlpEnabled)
SADWrProt[1:0]   2    5:4      SysAD interface write protocol.
    00: R4000 compatible (R4000)
    01: Reserved
    10: Pipelined writes (Pipelined)
    11: Write re-issue (ReIssue)
SADDatRate[3:0]  4    9:6      SysAD interface write transmit data rate.
    0000: Dd
    0001: Ddx
    0010: Ddxx
    0011: Dxdx
    0100: Ddxxx
    0101: Ddxxxx
    0110: Dxxdxx
    0111: Ddxxxxxx
    1000: Dxxxdxxx
    1001-1111: Reserved
ECacheEn         1    10       Enables the external cache.
    0: ECache Disabled
    1: ECache Enabled
ECBurstMd        1    11       Sets ECache protocol for burst mode RAMs.
    0: Dual Cycle Deselect (DCD)
    1: Single Cycle Deselect (SCD)
Reserved         1    12       Must be set to 0.
Reserved         1    13       Must be set to 0.
DrvStren[1:0]    2    15:14    Sets the drive strength of the pad output drivers.
    00: Drive at 67%
    01: Drive at 50%
    10: Drive at 100%
    11: Drive at 83%
SyncSysAD[4:0]   5    20:16    Sample and drive sync generation for the SysAD interface.
    SyncSysAD[4] = 0, reserved.
SyncHalfSysAD    1    21       Sample and drive sync generation is half integers for the SysAD interface.

    SyncHalfSysAD  SyncSysAD[4:0]  Ratio
    0              00000           2:1
    0              00001           3:1
    0              00010           4:1
    0              00011           5:1
    0              00100           6:1
    0              00101           7:1
    0              00110           8:1
    0              00111           9:1
    0              01000           10:1
    0              01001           11:1
    0              01010           12:1
    0              01011           13:1
    0              01100           14:1
    0              01101           15:1
    0              01110           16:1
    0              01111           17:1
    1              00000           Rsvd
    1              00001           Rsvd
    1              00010           Rsvd
    1              00011           Rsvd
    1              00100           Rsvd
    1              00101           3.5:1
    1              00110           Rsvd
    1              00111           4.5:1
    1              01000           Rsvd
    1              01001           5.5:1
    1              01010           Rsvd
    1              01011           6.5:1
    1              01100           Rsvd
    1              01101           7.5:1
    1              01110           Rsvd
    1              01111           8.5:1

BIUPbRsvd[3:0]   4    25:22    Reserved, must be set to 0.
TimIntDis        1    26       Disables the timer interrupt to interrupt bit 5.
    0: Timer enabled (TimerEnabled)
    1: Timer disabled (TimerDisabled)
Timer1X          1    27       Sets counter/timer to run at 1X processor clock frequency.
    0: Normal frequency (TimerNormal)
    1: 1X frequency (Timer1X)
SysConfig[1:0]   2    29:28    System configuration mode bits, to Config register.
    Value software visible in Config[21:20].
OCacheEn         1    30       Enables the core on-chip secondary caches.
    0: OCache Disabled (OCacheDisabled)
    1: OCache Enabled (OCacheEnabled)
OTClrEn          1    31       Enables the Ocache tag clear machine on cold reset.
    When OTClrEn = 1, the following will be cleared: L2 Tag, L1 DTag, L1 Itag, L1 Dcache, L1 Icache and BranchPredict RAM.
    0: OTag clear machine disabled (OTClrDisabled)
    1: OTag clear machine enabled (OTClrEnabled)
ParChkDis        1    32       Disables all parity checking processor-wide.
    0: Par check enabled (ParChkEnabled)
    1: Par check disabled (ParChkDisabled)
TLB64Ent         1    33       Enables a larger JTLB size on the core.
    0: 48 entry JTLB (TLB48Entry)
    1: 64 entry JTLB (TLB64Entry)
HitShrFtch       1    34       Reserved, must be set to 0.
MIPS64Compat     1    35       MIPS 64 compatibility mode. Reorganizes CP0 to be MIPS 64 compatible.
    0: MIPS IV compatibility (PMCCompat)
    1: MIPS 64 compatibility (MIPS64Compat)
PowerSave        1    36       Reserved, must be set to 0.
CorePbRsvd[3:0]  4    40:37    Reserved, must be set to 0.
CkPdAlgn[1:0]    2    42:41    Adjusts the MasterClock pad delay matching network.
    Reduces the swing on the internal MasterClock equivalent signal fed back to PLLs for matching the external MasterClock swing. These mode bits are for HSTL only. They should be left at 00 in the LVTTL mode.
    00: No swing control - Full swing (0-1.2V) - default setting for cy2210 clock driver (common mode voltage - 0.6V)
    01: MasterClock (internal f/b signal) swing matched for external MasterClock swing of < 0.5V
    10: External MasterClock swing is 0.5-0.4V
    11: External MasterClock swing < 0.4V
    Default setting 00.
PLLDis           1    43       Enables or disables the PLL.
    0: Enabled (PLLEnabled)
    1: Disabled (PLLDisabled)
DivMa2Core       1    44       MasterClock divide by two for PLL.
    0: Divide by one (DivBy1)
    1: Divide by two (DivBy2)
MulFundCore[4:0] 5    49:45    Fundamental clock multiplier for PLL.
    0: Multiply by 2 (MultiplyBy2)
    1: Multiply by 3 (MultiplyBy3)
    2: Multiply by 4 (MultiplyBy4)
    3: Multiply by 5 (MultiplyBy5)
    4: Multiply by 6 (MultiplyBy6)
    5: Multiply by 7 (MultiplyBy7)
    6: Multiply by 8 (MultiplyBy8)
    7: Multiply by 9 (MultiplyBy9)
    8: Multiply by 10 (MultiplyBy10)
    9: Multiply by 11 (MultiplyBy11)
    a: Multiply by 12 (MultiplyBy12)
    b: Multiply by 13 (MultiplyBy13)
    c: Multiply by 14 (MultiplyBy14)
    d: Multiply by 15 (MultiplyBy15)
    e: Multiply by 16 (MultiplyBy16)
    f: Multiply by 17 (MultiplyBy17)
    10-1e: Reserved (MultiplyBy18 through MultiplyBy32, one step per code)
    1f: Reserved
DivXCore[4:0]    5    54:50    Processor core logic clock divisor from processor core fundamental clock.
    0-1e: Reserved (DivideBy2 through DivideBy32, one step per code)
    1f: Divide by 1 (DivideBy1)
ClockPbRsvd[3:0] 4    58:55    Reserved, must be set to 0.
MBRsvd[2:0]      3    61:59    Reserved, must be set to 0.
HSTLCntl[1:0]    2    63:62    HSTL output delay control.
    Must be set to 01 in HSTL mode.
    Must be set to 00 in LVTTL mode.
MBRsvd[45:3]     43   106:64   Reserved, must be set to 0.
HSTLCntl[3:2]    2    108:107  HSTL output delay control.
    Must be set to 11 in HSTL mode.
    Must be set to 00 in LVTTL mode.
MBRsvd[192:46]   147  255:109  Reserved, must be set to 0.
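The bit positions in the boot-mode table can be captured in a small lookup table. The following is a minimal Python sketch of packing field values into the 256-bit serial stream; it is not the PMC-Sierra mode bit generation tool. The names MODE_FIELDS and build_mode_stream are hypothetical, and the sketch assumes that bit 0 of a multi-bit field occupies the lowest stream position listed for that field (for example, DrvStren[0] at bit 14). The settings in the example are arbitrary illustrative values, not a recommended configuration.

```python
# Field layout copied from the boot-mode table: name -> (lowest stream bit, width).
# Reserved fields are omitted; they are left at 0 as the table requires.
MODE_FIELDS = {
    "preBigEndian": (1, 1),
    "SI32Wide": (2, 1),
    "SADRdOverlap": (3, 1),
    "SADWrProt": (4, 2),
    "SADDatRate": (6, 4),
    "ECacheEn": (10, 1),
    "ECBurstMd": (11, 1),
    "DrvStren": (14, 2),
    "SyncSysAD": (16, 5),
    "SyncHalfSysAD": (21, 1),
    "TimIntDis": (26, 1),
    "Timer1X": (27, 1),
    "SysConfig": (28, 2),
    "OCacheEn": (30, 1),
    "OTClrEn": (31, 1),
    "ParChkDis": (32, 1),
    "TLB64Ent": (33, 1),
    "MIPS64Compat": (35, 1),
    "CkPdAlgn": (41, 2),
    "PLLDis": (43, 1),
    "DivMa2Core": (44, 1),
    "MulFundCore": (45, 5),
    "DivXCore": (50, 5),
    "HSTLCntl_1_0": (62, 2),    # HSTLCntl[1:0]
    "HSTLCntl_3_2": (107, 2),   # HSTLCntl[3:2]
}

STREAM_BITS = 256


def build_mode_stream(settings):
    """Return the 256 mode bits as a list; index 0 is the first bit shifted in."""
    bits = [0] * STREAM_BITS
    for name, value in settings.items():
        lsb, width = MODE_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        for i in range(width):
            # Assumption: field bit i lands at stream position lsb + i.
            bits[lsb + i] = (value >> i) & 1
    return bits


example = build_mode_stream({
    "preBigEndian": 1,     # big endian
    "DrvStren": 0b10,      # drive at 100%
    "MulFundCore": 0x7,    # multiply by 9
    "DivXCore": 0x1F,      # divide by 1
})
```

Keeping the layout in one dictionary means a change to a field position is made in a single place, and the range check catches values that would spill into a neighbouring field.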
9 RM7000 and RM7965A Differences

Feature                               RM7000                                RM7965A
Number of CPU Cores                   1                                     1
Pipeline Stages                       5                                     7
Load Delay, Branch Delay              1                                     2
Branch Prediction                     No                                    8K BHT/Core
Hardware Cache Coherency Support      No                                    No
Secondary Cache Protection            Parity                                Error Checking and Correcting (ECC)
Page Size                             4 KB - 16 MB                          4 KB - 256 MB
Number of ASID Bits                   8                                     12
New Instructions                      --                                    MSUB, MSUBU, SSNOP, SDBBP, DERET
Integer Multiplier                    Iterative                             Pipelined
Integrated Buses                      SysAD                                 SysAD
SysAD Bus Width                       64-bit (RM7000x, RM7065x) or          64-bit
                                      32-bit (RM7035C)
SysAD Maximum Bus Frequency           125 MHz (RM70xxA);                    133 MHz (LVTTL) or 200 MHz (HSTL)
                                      133 MHz (LVTTL) or 200 MHz (HSTL)
                                      for all RM70xxC CPUs
L3 Cache Interface                    Yes (RM7000x only)                    No
L3 Page Invalidate Cache Op           Stores TagLo register                 Stores constant zero
On-Chip Debugging                     No                                    Yes
EJTAG Emulator Support                No                                    Yes
Integrated Instruction Trace Buffer   No                                    Yes
Watch Register Addressing             Physical                              Virtual
Number of Performance Counters        1                                     2

The following lists the significant additions to the RM7965A product:

- The SysAD bus supports both 133 MHz LVTTL and 200 MHz HSTL SysAD bus frequencies.
- Integrated debug support includes EJTAG TAP support and on-chip trace buffers. The added debug mechanism includes two new instructions: Software Debug Break Point (SDBBP) and Debug Exception Return (DERET). The added debug mechanism has its own exception vector located at 0xBFC00480 and its own 2 MB memory space at 0xFF200000.
- New instructions: Multiply-Subtract, both signed and unsigned (MSUB/MSUBU), and superscalar NOP (SSNOP), which issues a NOP to each pipeline.
- Branch prediction that provides the CPU core with up to 8K entries of branch history.
- Virtual Watch register addressing. This is a change from the physical Watch register addressing on the RM7000. The two Watch registers and the Watch Mask have been enlarged to reflect this change.
- 2 performance counters, so 2 simultaneous events can now be counted. In addition, branch mispredictions have been added as a performance counter event, and the multiplication stalls event has been removed.
- Increased page size range. The page size on the RM7000 can range from 4 KB to 16 MB, and on the RM7965A the range is from 4 KB to 256 MB.
- The ASID has been extended from 8 bits to 12 bits.
- Load delay and branch delay increase from 1 to 2.

10 Pin Descriptions

The following is a list of control, data, clock, tertiary cache, interrupt, and miscellaneous pins of the RM7965A.

Table 19 System Interface

Pin Name        Type           Description
ExtRqst*        Input          External request
    Signals that the external agent is submitting an external request.
Release*        Output         Release interface
    Signals that the processor is releasing the system interface to slave state.
RdRdy*          Input          Read Ready
    Signals that an external agent can now accept a processor read.
WrRdy*          Input          Write Ready
    Signals that an external agent can now accept a processor write request.
ValidIn*        Input          Valid Input
    Signals that an external agent is now driving a valid address or data on the SysAD bus and a valid command or data identifier on the SysCmd bus.
ValidOut*       Output         Valid output
    Signals that the processor is now driving a valid address or data on the SysAD bus and a valid command or data identifier on the SysCmd bus.
PRqst*          Output         Processor Request
    When asserted, this signal requests that control of the system interface be returned to the processor.
PAck*           Input          Processor Acknowledge
    When asserted, in response to PRqst*, this signal indicates to the processor that it has been granted control of the system interface.
RspSwap*        Input          Response Swap
    RspSwap* is used by the external agent to signal the processor when it is about to return a memory reference out of order; i.e., of two outstanding memory references, the data for the second reference is being returned ahead of the data for the first reference. In order that the processor will have time to switch the address to the tertiary cache, this signal must be asserted a minimum of two cycles prior to the data itself being presented. Note that this signal works as a toggle; i.e., for each cycle that it is held asserted the order of return is reversed. By default, anytime the processor issues a second read it is assumed that the reads will be returned in order; i.e., no action is required if the reads are indeed returned in order.
RdType          Output         Read Type
    During the address cycle of a read request, RdType indicates whether the read request is an instruction read or a data read.
SysAD[63:0]     Input/Output   System address/data bus
    A 64-bit address and data bus for communication between the processor and an external agent.
SysADC[7:0]     Input/Output   System address/data check bus
    An 8-bit bus containing parity check bits for the SysAD bus during data cycles.
SysCmd[8:0]     Input/Output   System command/data identifier bus
    A 9-bit bus for command and data identifier transmission between the processor and an external agent.
SysCmdP         Input/Output   System Command/Data Identifier Bus Parity
    For the RM7965A, unused on input and zero on output.

Table 20 Clock/Control Interface

Pin Name        Type           Description
SysClock        Input          System clock
    Master clock input used as the system interface reference clock. All output timings are relative to this input clock. Pipeline operation frequency is derived by multiplying this clock up by the factor selected during boot initialization.
SysClock*       Input          System clock
    Differential clock input used only in HSTL I/O mode. Set SysClock* to VccIO or Do Not Connect for non-HSTL operation.

Table 21 Power Supply

Pin Name        Type           Description
VccInt          Input          Power supply for core.
VccIO           Input          Power supply for I/O.
VccP            Input          Vcc for PLL
    Quiet VccInt for the internal phase locked loop. Must be connected to VccInt through a filter circuit.
VccJ            Input          Power supply used for JTAG.
Vref_In         Input          Reference voltage for HSTL I/O. Do Not Connect for non-HSTL.
Vss             Input          Ground Return.
VssP            Input          Vss for PLL
    Quiet Vss for the internal phase locked loop. Must be connected to Vss through a filter circuit.

Table 22 Interrupt Interface

Pin Name        Type           Description
INT[9:0]*       Input          Interrupt
    Ten general processor interrupts, bit-wise ORed with bits 9:0 of the interrupt register.
NMI*            Input          Non-maskable interrupt
    Non-maskable interrupt, ORed with bit 15 of the interrupt register.
Table 23 JTAG Interface

Pin Name        Type           Description
JTDI/DBDI       Input          JTAG/EJTAG data in
    JTAG/EJTAG serial data in.
JTCK/DBCK       Input          JTAG/EJTAG clock input
    JTAG/EJTAG serial clock input.
JTDO/DBDO       Output         JTAG/EJTAG data out
    JTAG/EJTAG serial data out.
JTMS/DBMS       Input          JTAG/EJTAG command
    JTAG/EJTAG command signal, signals that the incoming serial data is command data.
JTRST*/DBRST*   Input          JTAG/EJTAG reset.
JTAGSEL         Input          JTAG/EJTAG select
    Selects JTAG when JTAGSEL=1; selects EJTAG when JTAGSEL=0.

Notes:
1. The JTRST* input was added to the RM70xxC and RM7965A CPUs to directly control the reset to the JTAG state machine. JTAG boundary scan test equipment must be able to drive JTRST* high to allow JTAG boundary scan operation.
2. The JTRST* input must be connected to GND (Vss) through a 220 Ω to 1 kΩ pull-down resistor to force the JTAG state machine into the reset state to allow normal operation (JTAG boundary scan mode disabled).
3. The JTAG interface electrical characteristics are dependent on the VccJ level chosen (2.5 V or 3.3 V).

Table 24 Initialization Interface

Pin Name        Type           Description
BigEndian       Input          Big Endian / Little Endian Control
    Allows the system to change the processor addressing mode without rewriting the mode ROM.
VccOK           Input          Vcc is OK
    When asserted, this signal indicates to the RM7965A that the VccInt power supply has been above the recommended value for more than 100 milliseconds and will remain stable. The assertion of VccOK initiates the reading of the boot-time mode control serial stream.
ColdReset*      Input          Cold Reset
    This signal must be asserted for a power on reset or a cold reset. ColdReset must be de-asserted synchronously with SysClock.
Reset*          Input          Reset
    This signal must be asserted for any reset sequence. It may be asserted synchronously or asynchronously for a cold reset, or synchronously to initiate a warm reset. Reset must be de-asserted synchronously with SysClock.
ModeClock       Output         Boot Mode Clock (see Note 1)
    Serial boot-mode data clock output at the system clock frequency divided by two hundred and fifty six.
ModeIn          Input          Boot Mode Data In
    Serial boot-mode data input.
HSTL_Sel*       Input          HSTL/LVTTL Control
    Asserting this signal low places the system I/O pins in HSTL mode. Pulling this signal high or allowing it to float places all system I/O pins in LVTTL mode.

Note
1. In HSTL mode, maximum voltage level of the ModeClock is determined by VccJ.

11 Absolute Maximum Ratings

Symbol   Rating                                  Limits          Unit
VTERM    Terminal Voltage with respect to Vss    -0.5 to +3.9    V
TCASE    Operating Temperature, Commercial       0 to +85        °C
         Operating Temperature, Industrial       -40 to +85      °C
TSTG     Storage Temperature                     -55 to +125     °C
IIN      DC Input Current                        20              mA
IOUT     DC Output Current                       20              mA

Notes:
1. Stresses greater than those listed under ABSOLUTE MAXIMUM RATINGS may cause permanent damage to the device. This is a stress rating only and functional operation of the device at these or any other conditions above those indicated in the operational sections of this specification is not implied. Exposure to absolute maximum rating conditions for extended periods may affect reliability.
2. VIN minimum = -2.0 V for pulse width less than 15 ns. VIN should not exceed 3.9 V.
3. When VIN < 0V or VIN > VccIO
4. Not more than one output should be shorted at a time. Duration of the short should not exceed 30 seconds.

12 DC Electrical Characteristics

Table 25 (VccIO = 3.15 V - 3.45 V)

Parameter   Minimum          Maximum          Conditions
VOL                          0.2 V            |IOUT| = 100 µA
VOH         VccIO - 0.2 V                     |IOUT| = 100 µA
VOL                          0.4 V            |IOUT| = 2 mA
VOH         2.4 V                             |IOUT| = 2 mA
VIL         -0.3 V           0.8 V
VIH         2.0 V            VccIO + 0.3 V
IIN         5 µA (VIN = 0); 5 µA (VIN = VccIO)

Table 26 (VccIO = 2.3 V - 2.7 V)

Parameter   Minimum          Maximum          Conditions
VOL                          0.2 V            |IOUT| = 100 µA
VOH         2.1 V                             |IOUT| = 100 µA
VOL                          0.4 V            |IOUT| = 1 mA
VOH         2.0 V                             |IOUT| = 1 mA
VOL                          0.7 V            |IOUT| = 2 mA
VOH         1.7 V                             |IOUT| = 2 mA
VIL         -0.3 V           0.7 V
VIH         1.7 V            VccIO + 0.3 V
IIN         5 µA (VIN = 0); 5 µA (VIN = VccIO)

Note for Table 25 and Table 26:
1. For VccIO levels in Table 25 and Table 26, set HSTL_Sel* to VccIO or Do Not Connect.

Table 27 (VccIO = 1.4 V - 1.6 V) HSTL

Parameter   Minimum          Maximum          Conditions
VOL         Vss              0.4 V            |IOUT| = 16 mA
VOH         VccIO - 0.4 V    VccIO
VIL         -0.3 V           Vref - 0.2 V
VIH         Vref + 0.2 V     VccIO + 0.3 V
VREF        0.6 V            0.9 V
VIN_CLK     -0.3 V           VccIO + 0.3 V
VDIF_CLK    0.1 V            VccIO + 0.6 V
VCM_CLK     0.6 V            0.9 V

Note
1. Set HSTL_Sel* to Vss for HSTL operation.
13 Power

13.1 Normal Operating Conditions

Table 28 Normal Operating Voltages for 0.13 µm CMOS

Grade       Commercial (part labeled as -900)
CPU Speed   900 MHz
Case Temp   0°C to +85°C
Vss         0 V
VccInt      1.32 V ± 50 mV, or [1.30 V ± 50 mV if operated at 835 MHz or less]
VccIO       3.3 V ± 150 mV or 2.5 V ± 200 mV or 1.5 V ± 100 mV
VccP        1.32 V ± 50 mV, or [1.30 V ± 50 mV if operated at 835 MHz or less]
VccJ        3.3 V ± 150 mV or 2.5 V ± 200 mV

Notes:
1. VccIO should not exceed VccInt by greater than 2.5 V during the power-up sequence.
2. Applying a logic high state to any I/O pin before VccInt becomes stable is not recommended.
3. For normal operation (non-boundary-scan), JTRST* must be pulled down to Vss (0 V) to avoid entering JTAG test mode.
4. VccP must be connected to VccInt through a passive filter circuit. See RM79xx User Manual for recommended circuit.
5. Power supply, D.C. characteristics, and A.C. timing are characterized across these operating ranges, unless otherwise stated.
6. The VccInt and VccP voltages can be reduced (by 20 mV) to 1.30 V ± 50 mV if the RM7965A is operated at 835 MHz or less.

13.2 Power Requirements

Table 29 VccINT Power Requirements

Parameter            Conditions                 Typ     Thermal   Max     Units
Icc Max              900 MHz (Tcase = 85°C)     2.6     --        3.65    A
Icc Wait             900 MHz (Tcase = 50°C)     1.0     --        1.46    A
Total Power (Max)    900 MHz (Tcase = 85°C)     3.5     3.7       --      W
Total Power (Wait)   900 MHz (Tcase = 50°C)     1.37    --        --      W

Notes:
1. Outputs loaded with 30 pF (if not otherwise specified), and a normal amount of traffic or signal activity.
2. Power values are calculated using the formula:
       Power = Σi (VDD × IDD)
   where i denotes all the various power supplies on the device, VDD is the voltage for supply i in accordance with the condition, and IDD is the current for supply i.
3. I/O supply power is application-dependent, but typically <20% of VccInt. During WAIT mode, I/O power supply should draw negligible current unless resistively loaded.

Table 30 Conditions for Power Requirements

                                 Voltage                 Process
Typical                          Nominal Vdd             Nominal
Power For Thermal Calculations   Maximum Operating Vdd   Nominal +2 sigmas of process variation*
Maximum Current                  Maximum Vdd             Nominal +6 sigmas of process variation

Note
* The power number for nominal process +2 sigma of process variation is recommended for thermal calculations as it will be the highest power dissipation of almost all parts in almost all applications. The current number for nominal +6 sigma of process variation is recommended for power supply design and is a true worst case.

13.3 Typical Power Consumption

Power consumption in an end application depends on many application-specific factors, such as the characteristics of the code being executed, operating temperature of the CPU, and loads being driven. The power consumption in an actual application can be substantially lower than the maximum guaranteed specification shown.
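The power formula in the notes to Table 29 is a sum of voltage times current over every supply rail. The following minimal Python sketch shows that arithmetic; the function name total_power and the dictionary layout are illustrative assumptions, and the I/O rail current in the example is a made-up placeholder, not a figure from this data sheet.

```python
def total_power(supplies):
    """Sum V x I over every supply rail: volts and amps in, watts out."""
    return sum(volts * amps for volts, amps in supplies.values())


rails = {
    "VccInt": (1.32, 2.6),   # nominal core voltage and the 2.6 A typical Icc Max figure
    "VccIO": (3.3, 0.1),     # hypothetical I/O current, chosen only for illustration
}
watts = total_power(rails)
```

For power supply design the notes recommend the maximum-current figures rather than typical values, so the same function would be called with those numbers instead.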
Document No.: PMC-2100294, Issue 2 63 RM7965A-900UI 900 MHz 64-bit Microprocessor Data Sheet AM AC Electrical Characteristics 4: 04 14 02 CLD Max Units Mode 2 ns/25pF 1 Load Derate Min 01 Symbol HSTL Test Conditions 2J Bus Speed LVTTL ,1 Symbol an ua 14.2 Clock Parameters Parameter LVTTL ry ,2 Parameter :3 14.1 Capacitive Load Deration ay Min Units HSTL Max Min Max tSCHigh Transition 2ns 3 ns SysClock Low tSCLow Transition 2ns 3 ns ne sd SysClock High ed SysClock Frequency tSCP Clock Jitter for SysClock tJitterIn SysClock Rise Time tSCRise SysClock Fall Time tSCFall ModeClock Period tModeCKP JTAG Clock Period tJTAGCKP 33.3 133 33.3 200 MHz 7.5 30 5 30 ns 150 150 ps 2 1.3 ns 2 1.3 ns 256 256 tSCP of IH S on W SysClock Period 4 tSCP Be ta ge ri 4 h 14.3 System Interface Parameters lle tro on [c ed ad Ve by Test Conditions I/O Type LVTTL (VccIO=3.3V): mode[15:14]=10 (fastest) Units LVTTL I/O HSTL I/O Min Max Min Max 0.75 4.5 0.75 2.5 ns 0.75 5.5 0.75 2.75 ns 5,6,7 HSTL (VccIO=1.5V): mode[108:107:62:15:14]= 11110 (fastest) 5,6 LVTTL (VccIO=3.3V): mode[15:14]=01 (slowest) 5,6,7 HSTL (VccIO=1.5V): mode[108:107:62:15:14]= 11101 (slowest) 5,6,7 Do wn lo Symbol tDO d] Data Output2,3,7 nk at es Parameter1 Proprietary and Confidential to PMC-Sierra, Inc., and for its customers' internal use. Document No.: PMC-2100294, Issue 2 64 RM7965A-900UI 900 MHz 64-bit Microprocessor Data Sheet Data Setup tDS 4 Data Hold 6 tDH Units HSTL I/O Min Min Max trise = see above table 2.5 1.15 tfall = see above table 1.0 0.75 04 LVTTL I/O AM I/O Type Max 4: 4 Test Conditions ns :3 Symbol ns 02 Parameter1 01 1 Notes: In LVTTL mode, timings are measured from 0.425 x VccIO of clock to 0.425 x VccIO of signal for 3.3V I/O, and from 0.48 x VccIO of clock to 0.48 x VccIO of signal for 2.5V I/O. In HSTL mode, timings are measured from the crossing point of SysClock and SysClock* to 0.75V of the crossing point of the signal. 2. Capacitive load for all LVTTL maximum output timings is 50 pF. 
   Minimum output timings are for capacitive load of 20 pF.
3. Capacitive load for all HSTL minimum and maximum output timings is 20 pF.
4. Data Output timing applies to all signal pins whether tristate I/O or output only.
5. Setup and Hold parameters apply to all signal pins whether tristate I/O or input only.
6. Only mode[108:107:62:15:14]=11110 is tested in HSTL Class I mode during production test.
7. Data shown is for 3.3 V I/O. For 2.5 V I/O derate tDO Max by 0.5 ns, and tDO Min by 0.25 ns.

14.4 Boot-Time Interface Parameters

Parameter         Symbol   Min   Max   Units
Mode Data Setup   tDS      4           SysClock cycles
Mode Data Hold    tDH      0           SysClock cycles

15 Timing Diagrams

15.1 Clock Timing

Figure 11 Clock Timing
[Figure: SysClock waveform showing tHigh, tLow, tRise, tFall and tJitterIn]

15.2 System Interface Timing

(SysAD, SysCmd, ValidIn*, ValidOut*, etc.)

Figure 12 Input Timing
[Figure: SysClock and Data waveforms showing tDS and tDH]

Figure 13 Output Timing
[Figure: SysClock and Data waveforms showing tDO min and tDO max]

16 Thermal Information

This product is designed to operate over a wide temperature range when used with a heat sink and is suited for commercial applications such as central office equipment.

Maximum long-term operating junction temperature (TJ) to ensure adequate long-term life.
1 0C ,2 01 Minimum ambient temperature (TA) ua ry Table 31 Device Compact Model2 6.52 JA (C/W) (without heat sink) 13.23 2J Junction-to-Board Thermal Resistance, JB ,1 0.27 ay Junction-to-Case Thermal Resistance, JC an 900 MHz ne ed The sum of SA + CS must be less than or equal to: [(105 - TA) / PD ] - JC ] C/W W SA+CS sd Table 32 Heat Sink Requirements 4 88C where: IH S on TA is the ambient temperature at the heat sink location PD is the operating power dissipated in the package5 SA Heat Sink CS Case JC Device Compact Model Junction JB Board ge ri of SA and CS are required for long-term operation Ambient ta Notes: The minimum ambient temperature requirement for Central Office Equipment approximates the minimum ambient temperature requirement for Commercial Equipment. 2. Short-term is used as defined in Telcordia Technologies Generic Requirements GR-63-Core; for more information about the GR-63-CORE standard, see Telcordia Technologies. Network EquipmentBuilding System (NEBS) Requirements: Physical Protection: Telcordia Technologies Generic Requirements GR-63-CORE. Issue 1. October 1995. 3. JC, the junction-to-case thermal resistance, is a measured nominal value plus two sigma. JB, the junction-to-board thermal resistance, is obtained by simulating conditions described in JEDEC Standard JESD 51-8; for more information about the JESD51-8 standard, see Electronic Industries Alliance 1999. Integrated Circuit Thermal Test Method Environmental Conditions -Junction-to-Board: JESD51-8. October 1999. 4. SA is the thermal resistance of the heat sink to ambient. CS is the thermal resistance of the heat sink attached material. The maximum SA required for the airspeed at the location of the device in the system with all components in place. [c on tro lle d] by Ve nk at es h Be 1. Power depends upon the operating mode. To obtain power information, refer to the column under thermal in Table 29. Do wn lo ad ed 5. 
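The heat sink requirement in Table 32 can be evaluated directly once the ambient temperature and power dissipation are known. The following is a minimal Python sketch of that inequality, [(105 - TA) / PD] - JC; the function name and argument names are illustrative assumptions, the 105 default is simply the constant printed in the formula, and the example inputs are placeholders that should be replaced with the values that apply to the actual system.

```python
def max_heatsink_resistance(ta_c, pd_w, theta_jc, tj_limit_c=105.0):
    """Upper bound in C/W for SA + CS (heat sink plus attach material).

    ta_c       ambient temperature at the heat sink location, in C
    pd_w       operating power dissipated in the package, in W
    theta_jc   junction-to-case thermal resistance, in C/W
    tj_limit_c junction temperature constant used by the formula (105 as printed)
    """
    if pd_w <= 0:
        raise ValueError("power dissipation must be positive")
    return (tj_limit_c - ta_c) / pd_w - theta_jc


# Placeholder inputs: 50 C ambient, 3.7 W (the Thermal column of Table 29),
# and an assumed junction-to-case resistance of 0.27 C/W.
budget = max_heatsink_resistance(ta_c=50.0, pd_w=3.7, theta_jc=0.27)
```

A heat sink and attach material are adequate under this formula when their combined thermal resistance at the system airspeed does not exceed the returned budget.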
17 Packaging and Pinout Information

17.1 256-pin CSBGA Package Diagram

[Figure: 256-pin CSBGA package outline, showing top view (A1 ball corner I.D. indicator, encapsulation edge), bottom view (A1 ball corner), side view, section A-A and detail A (seating plane)]

NOTES:
1) ALL DIMENSIONS IN MILLIMETERS
2) DIMENSION aaa DENOTES PACKAGE PROFILE
3) DIMENSION bbb DENOTES PARALLELISM
4) DIMENSION ddd DENOTES COPLANARITY
5) DIAMETER OF SOLDER MASK OPENING IS 0.58 MM (SMD)
6) PACKAGE COMPLIANT TO JEDEC REGISTERED OUTLINE MO-192 VARIATION BAL-2 WITH EXCEPTION OF PROFILE TOLERANCE, COPLANARITY, AND MAXIMUM OVERALL THICKNESS

Dim.    Min.    Nom.     Max.
A       1.47    1.62     1.77
A1      0.55    0.65     0.75
A2      0.92    0.97     1.02
D       -       27.00    -       BSC
D1      -       24.13    -       BSC
E       -       27.00    -       BSC
E1      -       24.13    -       BSC
M, N    -       20x20    -
e       -       1.27     -       BSC
b       -       0.75     -
aaa     -       -        0.10
bbb     -       -        0.25
ddd     -       -        0.15
eee     -       -        0.30
fff     -       -        0.15

PACKAGE TYPE: 256 THERMALLY ENHANCED BALL GRID ARRAY - CSBGA+
BODY SIZE: 27 x 27 x 1.62mm
17.2 256-pin CSBGA Alphanumerical Pinout

Pin  Function        Pin  Function        Pin  Function        Pin  Function
A1   VccIO           A2   Vss             A3   Vss             A4   Do Not Connect
A5   SysAD35         A6   Vss             A7   SysAD33         A8   SysAD32
A9   Vss             A10  SysADC1         A11  HSTL_Sel*       A12  Vss
A13  SysADC2         A14  SysAD62         A15  Vss             A16  SysAD60
A17  Do Not Connect  A18  Vss             A19  Vss             A20  VccIO
B1   Vss             B2   VccIO           B3   Vss             B4   Vss
B5   Do Not Connect  B6   SysAD3          B7   SysAD2          B8   SysAD1
B9   SysADC5         B10  SysADC0         B11  SysADC3         B12  SysADC6
B13  VREF_In         B14  SysAD30         B15  SysAD29         B16  Do Not Connect
B17  Vss             B18  Vss             B19  VccIO           B20  Vss
C1   Vss             C2   Vss             C3   VccIO           C4   Do Not Connect
C5   Do Not Connect  C6   Do Not Connect  C7   SysAD34         C8   VccInt
C9   SysAD0          C10  SysADC4         C11  SysADC7         C12  VccInt
C13  SysAD31         C14  SysAD61         C15  VccInt          C16  Do Not Connect
C17  Do Not Connect  C18  VccIO           C19  Vss             C20  Vss
D1   Do Not Connect  D2   Vss             D3   Do Not Connect  D4   VccIO
D5   VccIO           D6   Do Not Connect  D7   VccInt          D8   VccInt
D9   VccIO           D10  VccInt          D11  VccInt          D12  VccIO
D13  SysAD63         D14  VccInt          D15  SysAD28         D16  VccIO
D17  VccIO           D18  Do Not Connect  D19  Vss             D20  Do Not Connect
E1   SysAD5          E2   Do Not Connect  E3   VccInt          E4   VccIO
E17  VccIO           E18  Do Not Connect  E19  Do Not Connect  E20  SysAD59
F1   Vss             F2   SysAD36         F3   SysAD4          F4   VccInt
F17  VccInt          F18  SysAD27         F19  SysAD58         F20  Vss
G1   SysAD38         G2   SysAD6          G3   SysAD37         G4   VccInt
G17  VccInt          G18  SysAD26         G19  SysAD57         G20  SysAD25
H1   SysAD7          H2   SysAD39         H3   SysAD40         H4   SysAD8
H17  SysAD24         H18  SysAD56         H19  SysAD55         H20  SysAD23
J1   Vss             J2   SysAD9          J3   VccInt          J4   VccIO
J17  VccIO           J18  SysAD54         J19  SysAD22         J20  Vss
K1   SysAD41         K2   SysAD10         K3   SysAD42         K4   SysAD11
K17  SysAD53         K18  SysAD21         K19  SysAD52         K20  SysAD20
L1   SysAD43         L2   SysAD44         L3   SysAD12         L4   VccInt
L17  VccInt          L18  SysAD51         L19  SysAD19         L20  SysAD50
M1   Vss             M2   SysAD13         M3   SysAD45         M4   VccIO
M17  VccIO           M18  SysAD18         M19  SysAD49         M20  Vss
N1   SysAD14         N2   SysAD46         N3   VccInt          N4   SysAD47
N17  VccInt          N18  SysAD48         N19  SysAD16         N20  SysAD17
P1   SysAD15         P2   RspSwap*        P3   PAck*           P4   VccInt
P17  ColdReset*      P18  VccOK           P19  BigEndian       P20  Reset*
R1   Vss             R2   Do Not Connect  R3   JTDI            R4   JTCK
R17  VccInt          R18  ExtRqst*        R19  NMI*            R20  Vss
T1   PRqst*          T2   JTDO            T3   VccIO           T4   JTRST*
T17  VccIO           T18  VccInt          T19  INT9*           T20  INT8*
U1   ModeClock       U2   Vss             U3   JTMS            U4   VccIO
U5   JTAGSEL         U6   ValidIn*        U7   VssP            U8   VccInt
U9   VccIO           U10  VccInt          U11  VccInt          U12  VccIO
U13  SysCmd7         U14  VccInt          U15  INT3*           U16  VccIO
U17  VccIO           U18  INT6*           U19  Vss             U20  INT7*
V1   Vss             V2   Vss             V3   VccIO           V4   RDType
V5   RdRdy*          V6   VccP            V7   SysClock*       V8   VccInt
V9   Do Not Connect  V10  VREF_In         V11  VccInt          V12  SysCmd3
V13  SysCmd6         V14  VccInt          V15  INT2*           V16  INT5*
V17  INT4*           V18  VccIO           V19  Vss             V20  Vss
W1   Vss             W2   VccIO           W3   Vss             W4   Vss
W5   WrRdy*          W6   Release*        W7   SysClock        W8   VccInt
W9   Do Not Connect  W10  Do Not Connect  W11  SysCmd1         W12  SysCmd2
W13  SysCmd5         W14  SysCmdP         W15  VccInt          W16  INT1*
W17  Vss             W18  Vss             W19  VccIO           W20  Vss
Y1   VccIO           Y2   Vss             Y3   Vss             Y4   ModeIn
Y5   ValidOut*       Y6   Vss             Y7   VccP            Y8   Do Not Connect
Y9   Vss             Y10  Do Not Connect  Y11  SysCmd0         Y12  Vss
Y13  SysCmd4         Y14  SysCmd8         Y15  Vss             Y16  VccJ
Y17  INT0*           Y18  Vss             Y19  Vss             Y20  VccIO
17.3 256-pin CSBGA Alphabetical Pinout

Pin  Function        Pin  Function        Pin  Function        Pin  Function
P19  BigEndian       P17  ColdReset*      A4   Do Not Connect  A17  Do Not Connect
B5   Do Not Connect  B16  Do Not Connect  C4   Do Not Connect  C5   Do Not Connect
C6   Do Not Connect  C16  Do Not Connect  C17  Do Not Connect  D1   Do Not Connect
D3   Do Not Connect  D6   Do Not Connect  D18  Do Not Connect  D20  Do Not Connect
E2   Do Not Connect  E18  Do Not Connect  E19  Do Not Connect  R2   Do Not Connect
V9   Do Not Connect  W9   Do Not Connect  W10  Do Not Connect  Y8   Do Not Connect
Y10  Do Not Connect  R18  ExtRqst*        A11  HSTL_Sel*       Y17  INT0*
W16  INT1*           V15  INT2*           U15  INT3*           V17  INT4*
V16  INT5*           U18  INT6*           U20  INT7*           T20  INT8*
T19  INT9*           U5   JTAGSEL         R4   JTCK            R3   JTDI
T2   JTDO            U3   JTMS            T4   JTRST*          U1   ModeClock
Y4   ModeIn          R19  NMI*            P3   PAck*           T1   PRqst*
V5   RdRdy*          V4   RDType          W6   Release*        P20  Reset*
P2   RspSwap*        C9   SysAD0          B8   SysAD1          B7   SysAD2
B6   SysAD3          F3   SysAD4          E1   SysAD5          G2   SysAD6
H1   SysAD7          H4   SysAD8          J2   SysAD9          K2   SysAD10
K4   SysAD11         L3   SysAD12         M2   SysAD13         N1   SysAD14
P1   SysAD15         N19  SysAD16         N20  SysAD17         M18  SysAD18
L19  SysAD19         K20  SysAD20         K18  SysAD21         J19  SysAD22
H20  SysAD23         H17  SysAD24         G20  SysAD25         G18  SysAD26
F18  SysAD27         D15  SysAD28         B15  SysAD29         B14  SysAD30
C13  SysAD31         A8   SysAD32         A7   SysAD33         C7   SysAD34
A5   SysAD35         F2   SysAD36         G3   SysAD37         G1   SysAD38
H2   SysAD39         H3   SysAD40         K1   SysAD41         K3   SysAD42
L1   SysAD43         L2   SysAD44         M3   SysAD45         N2   SysAD46
N4   SysAD47         N18  SysAD48         M19  SysAD49         L20  SysAD50
L18  SysAD51         K19  SysAD52         K17  SysAD53         J18  SysAD54
H19  SysAD55         H18  SysAD56         G19  SysAD57         F19  SysAD58
E20  SysAD59         A16  SysAD60         C14  SysAD61         A14  SysAD62
D13  SysAD63         B10  SysADC0         A10  SysADC1         A13  SysADC2
B11  SysADC3         C10  SysADC4         B9   SysADC5         B12  SysADC6
C11  SysADC7         W7   SysClock        V7   SysClock*       Y11  SysCmd0
W11  SysCmd1         W12  SysCmd2         V12  SysCmd3         Y13  SysCmd4
W13  SysCmd5         V13  SysCmd6         U13  SysCmd7         Y14  SysCmd8
W14  SysCmdP         U6   ValidIn*        Y5   ValidOut*       C8   VccInt
C12  VccInt          C15  VccInt          D7   VccInt          D8   VccInt
D10  VccInt          D11  VccInt          D14  VccInt          E3   VccInt
F4   VccInt          F17  VccInt          G4   VccInt          G17  VccInt
J3   VccInt          L4   VccInt          L17  VccInt          N3   VccInt
N17  VccInt          P4   VccInt          R17  VccInt          T18  VccInt
U8   VccInt          U10  VccInt          U11  VccInt          U14  VccInt
V8   VccInt          V11  VccInt          V14  VccInt          W8   VccInt
W15  VccInt          A1   VccIO           A20  VccIO           B2   VccIO
B19  VccIO           C3   VccIO           C18  VccIO           D4   VccIO
D5   VccIO           D9   VccIO           D12  VccIO           D16  VccIO
D17  VccIO           E4   VccIO           E17  VccIO           J4   VccIO
J17  VccIO           M4   VccIO           M17  VccIO           T3   VccIO
T17  VccIO           U4   VccIO           U9   VccIO           U12  VccIO
U16  VccIO           U17  VccIO           V3   VccIO           V18  VccIO
W2   VccIO           W19  VccIO           Y1   VccIO           Y20  VccIO
Y16  VccJ            P18  VccOK           V6   VccP            Y7   VccP
B13  VREF_In         V10  VREF_In         A2   Vss             A3   Vss
A6   Vss             A9   Vss             A12  Vss             A15  Vss
A18  Vss             A19  Vss             B1   Vss             B3   Vss
B4   Vss             B17  Vss             B18  Vss             B20  Vss
C1   Vss             C2   Vss             C19  Vss             C20  Vss
D2   Vss             D19  Vss             F1   Vss             F20  Vss
J1   Vss             J20  Vss             M1   Vss             M20  Vss
R1   Vss             R20  Vss             U2   Vss             U19  Vss
V1   Vss             V2   Vss             V19  Vss             V20  Vss
W1   Vss             W3   Vss             W4   Vss             W17  Vss
W18  Vss             W20  Vss             Y2   Vss             Y3   Vss
Y6   Vss             Y9   Vss             Y12  Vss             Y15  Vss
Y18  Vss             Y19  Vss             U7   VssP            W5   WrRdy*

18 Ordering Information

Valid Combinations
RM7965A-900UI (leaded)
End of Document