PCI64 Spartan-II Interface V 3.0
January 31, 2000
Data Sheet

Xilinx Inc.
2100 Logic Drive
San Jose, CA 95124
Phone: +1 408-559-7778
Fax: +1 408-377-3259
E-mail:
  Techsupport: support.xilinx.com
  Feedback: logicore@xilinx.com
URL: http://www.xilinx.com

Introduction

With the Xilinx LogiCORE PCI64 Spartan-II interface, a designer can build a customized, 64-bit, 0-33 MHz, fully PCI-compliant system with the highest possible sustained performance (264 Mbytes/sec) and up to 135,000 system gates in Spartan-II FPGAs.

Features

* Fully V2.2 PCI compliant 64-bit, 0-33 MHz PCI Initiator/Target Interface
* Zero wait-state burst operation
* Hot Swap CompactPCI friendly
* Programmable single-chip solution with customizable back-end functionality
* Pre-defined implementation for predictable timing in Xilinx Spartan-II FPGAs
* Incorporates Xilinx Smart-IP Technology
* Universal PCI support in Spartan-II
* 3.3 V and 5 V operation at 0-33 MHz
* Master automatically handles 64-bit or 32-bit PCI transactions without knowledge of target bus width
* Fully verified design tested with Xilinx testbench and hardware
* Configurable on-chip dual-port FIFOs can be added for maximum burst speed
* Supported Initiator functions
  - Memory Read, Memory Write, Memory Read Multiple (MRM), Memory Read Line (MRL) commands
  - I/O Read, I/O Write commands
  - Configuration Read, Configuration Write commands
  - Bus Parking
  - Special Cycles, Interrupt Acknowledge
  - Basic Host Bridging

LogiCORE™ Facts

Core Specifics
  Device Family:         Spartan-II
  System Clock fmax:     0-33 MHz
  Device Features Used:  Bi-directional data buses, SelectIO, Block SelectRAM+™ (optional user FIFO)

Supported Devices(1) / Percent Resources Used
  Device (64-bit)   I/O    Slice flip-flops   4-input LUTs
  2S100FG456-6      46%    13%                24%
  2S150FG456-6      35%    9%                 17%

Provided with Core
  Documentation:        PCI Design Guide, PCI Implementation Guide, PCI Data Book
  Design File Formats:  Verilog/VHDL Simulation Model, Verilog/VHDL Instantiation Code, NGO Netlist
  Constraints Files:    User Constraint File (UCF), Guide files
  Verification Tools:   Verilog/VHDL Testbench
  Reference designs & application notes:
    Example designs:    PING Reference Design, Synthesizable Asynchronous PCI FIFO

Design Tool Requirements
  Xilinx Core Tools:    2.1i SP5
  Tested Entry/Verification Tools(2):
    For Core Instantiation: Synopsys FPGA Express, Synopsys FPGA Compiler, Synplicity Synplify, Exemplar Leonardo
    For Core Verification:  Cadence Verilog XL, MTI ModelSim PE/Plus, Aldec Active-HDL

Xilinx provides technical support for this LogiCORE™ product when used as described in the User's Guide and in the Application Notes. Xilinx cannot guarantee timing, functionality, or support of the product if it is implemented in devices not listed above, if it is customized beyond what is referenced in the product documentation, or if any changes are made in sections of the design marked as "DO NOT MODIFY".

1. Re-targeting the PCI core to an unlisted device or package will void the guarantee of timing. See "Smart-IP Technology - guaranteed timing" on page 3 for details.
2. Use -6 for 0-66 MHz operation and -5 for 0-33 MHz operation.
3. See Xilinx website for update on tested design tools.

Features (cont.)
* Supported Target functions (PCI Master and Slave)
  - Type 0 Configuration Space Header
  - Up to 3 Base Address Registers (memory or I/O with adjustable block size from 16 Bytes to 2 GBytes, medium decode speed)
  - Parity Generation (PAR), Parity Error Detection (PERR# and SERR#)
  - Memory Read, Memory Write, Memory Read Multiple (MRM), Memory Read Line (MRL), Memory Write Invalidate (MWI) commands
  - I/O Read, I/O Write commands
  - Configuration Read, Configuration Write commands
  - 64-bit and 32-bit data transfers, burst transfers with linear address ordering
  - Target Abort, Target Retry, Target Disconnect
  - Full Command/Status Registers
* Available for configuration and download on the web
  - Web-based configuration tool
  - Generation of proven design files
  - Instant access to new releases

Applications

* Embedded applications within telecommunication, networking, and industrial systems
* PCI add-in boards such as graphic cards, video adapters, LAN adapters and data acquisition boards
* Hot Swap CompactPCI boards
* Other applications that need PCI

General Description

The LogiCORE™ PCI64 Interfaces are pre-implemented and fully tested modules for the Xilinx Spartan-II FPGAs. The pinout for the device and the relative placement of the internal Configurable Logic Blocks (CLBs) are pre-defined. Critical paths are controlled by TimeSpecs and guide files to ensure predictable timing. This significantly reduces the engineering time required to implement the PCI portion of your design. Resources can instead be focused on the unique back-end logic in the FPGA and on the system-level design. As a result, LogiCORE™ PCI products can minimize your product development time.

Xilinx Spartan-II FPGAs enable designs of fully PCI-compliant systems. The devices meet all required electrical and timing parameters including AC output drive characteristics, input capacitance specifications (10 pF), 7 ns setup and 0 ns hold to system clock, and 6 ns system clock to output.
These devices meet all specifications for PCI 3.3 V and 5 V. The PCI Compliance Checklist has detailed information about electrical compliance.

Other features that enable efficient implementation of a complete PCI system in Spartan-II include:

* Block SelectRAM+™ memory: blocks of on-chip ultra-fast RAM with synchronous write and dual-port RAM capabilities. Used in PCI Interfaces to implement FIFOs
* Select-RAM™ memory: on-chip ultra-fast RAM with synchronous write option and dual-port RAM option. Used in PCI Interfaces to implement FIFOs
* Individual output enable for each I/O
* Internal 3-state bus capability
* 8 global low-skew clock or signal distribution networks
* IEEE 1149.1-compatible boundary scan logic support
* Designed for CompactPCI Hot Swap support

The Master and Slave Interface module is carefully optimized for best possible performance and utilization in the Spartan-II FPGA architecture.

Smart-IP Technology: Guaranteed Timing

Drawing on the architectural advantages of Xilinx FPGAs, new Xilinx Smart-IP technology ensures highest performance, predictability, repeatability, and flexibility in PCI designs. The Smart-IP technology is incorporated in every LogiCORE PCI core.

Xilinx Smart-IP technology leverages the Xilinx architectural advantages, such as look-up tables (LUTs), distributed RAM, and segmented routing, as well as floor planning information, such as logic mapping and relative location constraints. This technology provides the best physical layout, predictability, and performance. Additionally, these predetermined features allow for significantly reduced compile times over competing architectures.

PCI cores made with Smart-IP technology are unique in maintaining their performance and predictability regardless of the device size.

To guarantee the critical setup, hold, and minimum and maximum clock-to-out timing, the PCI core is delivered with Smart-IP constraint files that are unique for a device and package combination.
These constraint files guide the implementation tools so that the critical paths are always within the PCI specification. Retargeting the PCI core to an unsupported device will void the guarantee of timing. Contact one of the Xilinx XPERTs partners for support of unlisted devices and packages. See the XPERTs section in chapter 7 of the Xilinx PCI Data Book for contact information.

Universal PCI Support

Since Spartan-II FPGAs are capable of operating in either 3.3 V or 5 V PCI environments, the designer can easily build universal PCI cards. This requires loading one of the bitstreams at power up. Refer to the PCI Implementation Guide and the application note Building a Universal PCI Card using Xilinx FPGAs.

Functional Description

The LogiCORE PCI64 Master and Slave Interface is partitioned into five major blocks and a user application as shown in Figure 1. Each block is described below.

PCI Configuration Space

This block provides the first 64 Bytes of Type 0, version 2.1 Configuration Space Header (CSH) (see Table 1) to support software-driven "Plug-and-Play" initialization and configuration. This includes information for Command, Status, and three Base Address Registers (BARs). These BARs illustrate how to implement memory- or I/O-mapped address spaces.

Each BAR sets the base address for the interface and allows the system software to determine the addressable range required by the interface. Every BAR designated as a memory space can be made to represent a 32-bit or a 64-bit space.

Using a combination of Configurable Logic Block (CLB) flip-flops for the read/write registers and CLB look-up tables for the read-only registers results in optimized logic mapping and placement.

The capability for extending configuration space has been built into the backend interface. This capability, including the ability to implement a CapPtr in configuration space, allows the user to implement functions, such as Advanced Configuration and Power Interface (ACPI), in the backend design.

Table 1: PCI Configuration Space Header

  Address  | 31                        16 | 15                          0
  00h      | Device ID                    | Vendor ID
  04h      | Status                       | Command
  08h      | Class Code                                  | Rev ID
  0Ch      | BIST         | Header Type   | Latency Timer | Cache Line Size
  10h      | Base Address Register 0 (BAR0)
  14h      | Base Address Register 1 (BAR1)
  18h      | Base Address Register 2 (BAR2)
  1Ch      | Base Address Register 3 (BAR3)
  20h      | Base Address Register 4 (BAR4)
  24h      | Base Address Register 5 (BAR5)
  28h      | Cardbus CIS Pointer
  2Ch      | Subsystem ID                 | Subsystem Vendor ID
  30h      | Expansion ROM Base Address
  34h      | Reserved                                    | CapPtr
  38h      | Reserved
  3Ch      | Max_Lat      | Min_Gnt       | Interrupt Pin | Interrupt Line
  40h-FFh  | Reserved

Note: Italicized address areas are not implemented in LogiCORE PCI32 Spartan-II Interface default configuration. These locations return zero during configuration read accesses.

PCI I/O Interface Block

The I/O interface block handles the physical connection to the PCI bus including all signaling, input and output synchronization, output three-state controls, and all request-grant handshaking for bus mastering.

Parity Generator/Checker

This block generates/checks even parity across the AD bus, the CBE lines, PAR and the PAR64 signal. It also reports data parity errors via PERR# and address parity errors via SERR#.

Target State Machine

This block controls the PCI interface for Target functions. The states implemented are a subset of the equations defined in "Appendix B" of the PCI Local Bus Specification. The controller is a high-performance state machine using one-hot (state-per-bit) encoding for maximum performance.
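As a minimal sketch of what state-per-bit encoding means, the Python model below gives each state its own bit, so every next-state bit is a small OR of (current-state bit AND condition) terms. The state names, the inputs `frame`, `hit`, and `last`, and the transitions are hypothetical placeholders chosen for illustration; they are not the Appendix B equations or the core's actual logic.

```python
# Hypothetical four-state target controller, one bit per state.
# State names and transitions are illustrative only, not the core's logic.
STATES = ("IDLE", "DECODE", "DATA", "TURNAROUND")
ONE_HOT = {name: 1 << bit for bit, name in enumerate(STATES)}


def is_one_hot(value):
    """True when exactly one bit of value is set."""
    return value != 0 and value & (value - 1) == 0


def next_state(state, frame, hit, last):
    """Compute the next one-hot state.

    frame, hit and last are abstract active-high conditions (assumptions):
    a transaction has started, the address matched, the final data phase ended.
    """
    idle = bool(state & ONE_HOT["IDLE"])
    decode = bool(state & ONE_HOT["DECODE"])
    data = bool(state & ONE_HOT["DATA"])
    turnaround = bool(state & ONE_HOT["TURNAROUND"])

    # Each next-state bit depends on only a few current-state bits and inputs.
    next_bits = {
        "IDLE": (idle and not frame) or (decode and not hit) or turnaround,
        "DECODE": idle and frame,
        "DATA": (decode and hit) or (data and not last),
        "TURNAROUND": data and last,
    }
    return sum(ONE_HOT[name] for name, active in next_bits.items() if active)


def run(inputs, state=ONE_HOT["IDLE"]):
    """Step through (frame, hit, last) tuples and return the visited state names."""
    names = {value: name for name, value in ONE_HOT.items()}
    visited = [names[state]]
    for frame, hit, last in inputs:
        state = next_state(state, frame, hit, last)
        visited.append(names[state])
    return visited
```

For example, `run([(True, False, False), (False, True, False), (False, False, False), (False, False, True), (False, False, False)])` visits IDLE, DECODE, DATA, DATA, TURNAROUND, and IDLE in this toy model. Because no next-state term needs to decode a binary state vector, each equation stays shallow, which is the property the data sheet attributes to this encoding style.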
State-per-bit encoding of the next-state logic functions facilitates a high-performance implementation in the Xilinx FPGA architecture.

Initiator State Machine

This block controls the PCI interface for Initiator functions. The states implemented are a subset of the equations defined in "Appendix B" of the PCI Local Bus Specification. The Initiator Control Logic also uses state-per-bit encoding for maximum performance.

User Application with Optional Burst FIFOs

The LogiCORE PCI64 Interface provides a simple, general-purpose interface with a 64-bit data path and latched address for de-multiplexing the PCI address/data bus. This user interface allows the rest of the device to be used in a wide range of 32-bit and 64-bit applications.

Typically, the user application contains burst FIFOs to increase PCI system performance. An on-chip read/write FIFO, built from the on-chip synchronous dual-port RAM (Block SelectRAM+™) available in Spartan-II FPGAs, supports data transfers in excess of 66 MHz.

Several synthesizable, re-usable bridge designs including commonly used backend functions, such as doorbells and mailboxes, are provided with the core.

Interface Configuration

The LogiCORE PCI64 Interface can easily be configured to fit unique system requirements by using the Xilinx web-based PCI configuration tool or by changing the Verilog or VHDL configuration file. The following customization options are supported by the LogiCORE product and described in the product documentation.

* Initiator or target functionality
* Base Address Register configuration (1-3 Registers, size and mode)
* Configuration Space Header ROM
* Initiator and target state machine (e.g., termination conditions, transaction types and request/transaction arbitration)
* Burst functionality
* User Application including FIFO (back-end design)

Supported PCI Commands

Table 2 illustrates the PCI bus commands supported by the LogiCORE™ PCI64 Interface. The PCI Compliance Checklist has more details on supported and unsupported commands.

Table 2: PCI Bus Commands

  CBE[3:0]  Command                   PCI Master  PCI Slave
  0000      Interrupt Acknowledge     Yes         Yes
  0001      Special Cycle             Yes         Ignore
  0010      I/O Read                  Yes         Yes
  0011      I/O Write                 Yes         Yes
  0100      Reserved                  Ignore      Ignore
  0101      Reserved                  Ignore      Ignore
  0110      Memory Read               Yes         Yes
  0111      Memory Write              Yes         Yes
  1000      Reserved                  Ignore      Ignore
  1001      Reserved                  Ignore      Ignore
  1010      Configuration Read        Yes         Yes
  1011      Configuration Write       Yes         Yes
  1100      Memory Read Multiple      Yes         Yes
  1101      Dual Address Cycle        No(1)       Ignore
  1110      Memory Read Line          Yes         Yes
  1111      Memory Write Invalidate   No(1)       Yes

Note:
1. The Initiator can present these commands; however, they either require additional user-application logic to support them or are not thoroughly tested.

Burst Transfer

The PCI bus derives its performance from its ability to support burst transfers. The performance of any PCI application depends largely on the size of the burst transfer. A FIFO to support PCI burst transfer can efficiently be implemented using the Spartan-II on-chip RAM features, both Distributed and Block SelectRAM+™. Each Spartan-II CLB supports four 16x1 RAM blocks. This corresponds to 64 bits of single-ported RAM or 32 bits of dual-ported RAM, with simultaneous read/write capability.

Bandwidth

The Xilinx LogiCORE PCI64 Interface supports fully compliant zero wait-state burst operations for both sourcing and receiving data. This Interface supports a sustained bandwidth of up to 264 MBytes/sec. The design can be configured to take advantage of the ability of the LogiCORE PCI64 Interface to do very long bursts. Since the FIFO is not of fixed size, bursts can go on for as long as the chipset arbiter will allow. Furthermore, since the FIFOs and DMA are decoupled from the proven core, a designer can modify these functions without affecting the critical PCI timing.

The flexible Xilinx backend, combined with support for many different PCI features, gives users a solution that lends itself to being used in many high-performance applications. Xilinx is able to support different depths of FIFOs as well as dual-port FIFOs, synchronous or asynchronous FIFOs, and multiple FIFOs. The user is not locked into one DMA engine; hence, a DMA that fits a specific application can be designed.

The theoretical maximum bandwidth of a 64-bit, 33 MHz PCI bus is 264 MBytes/sec. Attaining this maximum bandwidth will depend on several factors, including the PCI design used, the PCI chipset, the processor's ability to keep up with your data stream, the maximum capability of your PCI design, and other traffic on the PCI bus. Older chipsets and processors will tend to allow less bandwidth than newer ones.

No additional wait-states are inserted in response to a wait-state from another agent on the bus. Either IRDY# or TRDY# is kept asserted until the current data phase ends, as required by the V2.2 PCI Specification.

See Table 3 for PCI bus transfer rates for various operations.

Table 3: LogiCORE PCI64 Transfer Rates

  Zero Wait-State Mode
  Operation                         Transfer Rate
  Initiator Write (PCI LogiCORE)    3-1-1-1
  Initiator Read (PCI LogiCORE)     4-1-1-1
  Target Write (PCI LogiCORE)       5-1-1-1
  Target Read (PCI LogiCORE)        6-1-1-1

Note: Initiator Read and Target Write operations have effectively the same bandwidth for burst transfer.

Timing Specification

The Spartan-II FPGA devices, together with the LogiCORE PCI64 product, enable the design of fully compliant PCI systems. The maximum speed at which your back-end is capable of running can be affected by the size of the design as well as by the loading of the hot signals coming directly from the PCI bus. Table 4 shows the key timing parameters for the LogiCORE PCI64 Interface that must be met for full PCI compliance.
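The 264 MBytes/sec peak figure and the burst patterns in Table 3 can be related with a short calculation. The sketch below assumes that a pattern such as 3-1-1-1 means the first data phase completes after three clocks and each later data phase takes one clock. It also assumes an exact 33 MHz clock with 1 MByte = 10^6 bytes, and it ignores arbitration, idle cycles, and wait states from other agents. The function names are illustrative and are not part of the core.

```python
BUS_WIDTH_BYTES = 8        # 64-bit data path
CLOCK_HZ = 33_000_000      # 33 MHz PCI clock (assumed exact for illustration)

# First-data-phase clock counts taken from Table 3.
INITIAL_CLOCKS = {
    "Initiator Write": 3,
    "Initiator Read": 4,
    "Target Write": 5,
    "Target Read": 6,
}


def peak_mbytes_per_sec(width_bytes=BUS_WIDTH_BYTES, clock_hz=CLOCK_HZ):
    """Peak rate with one data phase on every clock."""
    return width_bytes * clock_hz / 1_000_000


def burst_mbytes_per_sec(initial_clocks, data_phases,
                         width_bytes=BUS_WIDTH_BYTES, clock_hz=CLOCK_HZ):
    """Effective rate of one burst: the first phase costs initial_clocks,
    every following phase costs one clock."""
    if data_phases < 1:
        raise ValueError("a burst needs at least one data phase")
    total_clocks = initial_clocks + (data_phases - 1)
    return data_phases * width_bytes * clock_hz / total_clocks / 1_000_000


if __name__ == "__main__":
    print(f"peak: {peak_mbytes_per_sec():.1f} MBytes/sec")
    for operation, clocks in INITIAL_CLOCKS.items():
        for phases in (4, 64, 1024):
            rate = burst_mbytes_per_sec(clocks, phases)
            print(f"{operation:16s} {phases:5d} phases: {rate:6.1f} MBytes/sec")
```

Under these assumptions a four-phase Initiator Write burst works out to 176 MBytes/sec and a 64-phase burst to 256 MBytes/sec, which illustrates why long bursts are needed to approach the peak rate.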
Verification Methods

Xilinx has developed a system-level testbench that allows simulation of an open PCI environment in which a LogiCORE-PCI-based design may be tested by itself or with other simulatable PCI agents. Included in these agents are a behavioral host and target, and several plug-in modules, including a PCI signal recorder and a PCI protocol monitor. Using these tools, PCI developers can write microcode-style test scripts to verify different bus-operation scenarios, including those in the PCI Compliance Checklist.

The Xilinx PCI testbench is a powerful verification tool that is also used as the basis for verification of the PCI LogiCORE. The PCI LogiCORE is also tested in hardware for electrical, functional, and timing compliance.

Table 4: 33 MHz Timing Parameters

  Parameter                               Ref.
  CLK Cycle Time                          Tcyc
  CLK High Time                           Thigh
  CLK Low Time                            Tlow
  CLK to Bus Signals Valid(3)             TICKOF
  CLK to REQ# and GNT# Valid(3)           TICKOF
  Tri-state to Active                     Ton
  CLK to Tri-state                        Toff
  Bus Signal Setup to CLK (IOB)           TPSD
  Bus Signal Setup to CLK (CLB)
  GNT# Setup to CLK                       TPSD
  GNT# Setup to CLK (CLB)                 TPSD
  Input Hold Time After CLK (IOB)         TPHD
  Input Hold Time After CLK (CLB)
  RST# to Tri-state                       Trst-off

  PCI Spec.                     Min 15 6 6 2 Max 30 2 6 6 2 2 6 2 14 3 141 3 5 51 5 5 5 5 0 0 0 0 40 402
  LogiCORE PCI32 Spartan-II-5   Min Max 151 30 6 6 2 6

Notes:
1. Controlled by TIMESPECS, included in product.

Figure 1: LogiCORE™ PCI Interface Block Diagram (blocks shown: PCI Bus, LogiCORE PCI 64/32 Spartan-II Interface, FIFO Controller, PCI Asynchronous Write FIFO Reference Design, PCI Asynchronous Read FIFO Reference Design, Power Management Module Reference Design, Custom DMA Module)

Ping64 Reference Design

The Xilinx PING64 Application example, delivered in Verilog and VHDL, has been developed to provide an easy-to-understand example which demonstrates many of the principles and techniques required to successfully use a LogiCORE PCI64 Interface in a System-on-a-Chip solution. The PING64 design is also used as a test vehicle for PCI core verification.

Asynchronous PCI FIFO Design Note

A reference design for a first-in first-out (FIFO) memory queue with independent read and write clocks and back-up capability is available for use with the LogiCORE PCI64 Interface. It is delivered in Verilog and VHDL as a drop-in module for the Spartan-II FPGAs.

This design supports data widths of 32, 36, 64, or 72 bits with a memory depth of 63 locations implemented in SelectRAM+. Figure 1 presents a block diagram of the design. The pci_async_fifo features fully synchronous and independent clock domains for the read and write ports, with back-up synchronous to the read clock. The design also supports full and empty status, along with almost-full and almost-empty flags. Invalid read or write requests are rejected without affecting the FIFO state. The pci-async-fifo data sheet lists the features and specifics of the PCI FIFO design.

Device Utilization

The Target/Initiator options require a variable amount of CLB resources for the PCI64 Interface. Utilization of the device can vary slightly, depending on the configuration choices made by the designer. Factors that can affect the size of the core are:

* Number of Base Address Registers Used: Turning off any unused BARs will save resources.
* Size of the BARs: Setting a BAR to a smaller size requires more flip-flops, because a smaller address space requires more flip-flops to decode.
* Latency timer: Disabling the latency timer will save resources. It must be enabled for bursting.

Recommended Design Experience

The LogiCORE PCI64 Interface is pre-implemented, which allows engineering effort to focus on the unique back-end functions of a PCI design. Regardless, PCI is a high-performance system that is challenging to implement in any technology, ASIC or FPGA. Therefore, previous experience with building high-performance, pipelined FPGA designs using Xilinx implementation software, TIMESPECs, and guide files is recommended. The challenge of implementing a complete PCI design including back-end functions varies depending on the configuration and the functionality of your application. Contact your local Xilinx representative for a closer review and estimation for your specific requirements.