PCI32 Spartan-II Interface V 3.0

January 31, 2000
Data Sheet

Xilinx Inc.
2100 Logic Drive
San Jose, CA 95124
Phone: +1 408-559-7778
Fax: +1 408-377-3259
E-mail:
  Techsupport: support.xilinx.com
  Feedback: logicore@xilinx.com
URL: http://www.xilinx.com

Introduction

With the Xilinx LogiCORE PCI32 Spartan-II interface, a designer can build a customized, 32-bit, 0-33 MHz, fully PCI-compliant system with the highest possible sustained performance (128 Mbytes/sec) and up to 135,000 system gates in the Spartan-II FPGAs.

Features

* Fully PCI 2.2-compliant 32-bit, 33 MHz PCI Initiator/Target Interface
* Zero wait-state burst operation
* Hot Swap CompactPCI friendly
* Programmable single-chip solution with customizable back-end functionality
* Pre-defined implementation for predictable timing in Xilinx Spartan-II FPGAs
* Incorporates Xilinx Smart-IP Technology
* Universal PCI support in Spartan-II
* 3.3 V and 5 V operation at 0-33 MHz
* Fully verified design tested with Xilinx testbench and hardware
* Configurable on-chip dual-port FIFOs can be added for maximum burst speed
* Supported Initiator functions
  - Memory Read, Memory Write, Memory Read Multiple (MRM), Memory Read Line (MRL) commands
  - I/O Read, I/O Write commands
  - Configuration Read, Configuration Write commands
  - Bus Parking
  - Special Cycles, Interrupt Acknowledge
  - Basic Host Bridging

LogiCORE™ Facts

Core Specifics
  Device Family:          Spartan-II
  System Clock fmax:      0-33 MHz
  Device Features Used:   SelectMAP Configuration (opt)
                          Multi-standard SelectIO
                          Block SelectRAM+™ (optional user FIFO)
                          Boundary scan

Supported Devices(1) / Percent Resources Used (32-bit)
  Device          I/O    Slice flip-flops    4-input LUTs
  2S30PQ208-5     37%    33%                 56%
  2S50PQ208-5     35%    19%                 31%
  2S100PQ208-5    35%    12%                 20%
  2S150PQ208-5    35%    8%                  14%

Provided with Core
  Documentation:          PCI Design Guide
                          PCI Implementation Guide
                          PCI Data Book
  Design File Formats:    Verilog/VHDL Simulation Model
                          Verilog/VHDL Instantiation Code
                          NGO Netlist
  Constraints Files:      User Constraint File (UCF)
  Verification Tools:     Verilog/VHDL Testbench
  Reference Designs & Application Notes:
                          Example designs:
                          PING Reference Design
                          Asynchronous PCI FIFO

Design Tool Requirements
  Xilinx Core Tools:      2.1i SP5
  Tested Entry/Verification Tools(2):
    For Core Instantiation: Synopsys FPGA Express, Synopsys FPGA Compiler, Synplicity Synplify, Exemplar Leonardo
    For Core Verification:  Cadence Verilog XL, MTI ModelSim PE/Plus, Aldec Active-HDL

Xilinx provides technical support for this LogiCORE™ product when used as described in the product documentation. Xilinx cannot guarantee timing, functionality, or support of the product if it is implemented in devices not listed above, if it is customized beyond what is referenced in the product documentation, or if any changes are made in sections of the design marked as "DO NOT MODIFY".

1. Re-targeting the PCI core to an unlisted device will void the timing guarantee. Refer to the "Smart-IP Technology: Guaranteed Timing" section for details.
2. See the Xilinx website for updates on tested design tools.

Features (cont.)
* Supported Target functions (PCI Master and Slave)
  - Type 0 Configuration Space Header
  - Up to 3 Base Address Registers (memory or I/O with adjustable block size from 16 Bytes to 2 GBytes, medium decode speed)
  - Parity Generation (PAR), Parity Error Detection (PERR# and SERR#)
  - Memory Read, Memory Write, Memory Read Multiple (MRM), Memory Read Line (MRL), Memory Write Invalidate (MWI) commands
  - I/O Read, I/O Write commands
  - Configuration Read, Configuration Write commands
  - Interrupt Acknowledge
  - 32-bit data transfers, burst transfers with linear address ordering
  - Target Abort, Target Retry, Target Disconnect
  - Full Command/Status Registers
* Available for configuration and download on the web
  - Web-based configuration tool
  - Generation of proven design files
  - Instant access to new releases

Applications

* Embedded applications within telecommunication, networking, and industrial systems
* PCI add-in boards such as graphic cards, video adapters, LAN adapters and data acquisition boards
* Hot Swap CompactPCI boards
* Other applications that need PCI

General Description

The LogiCORE™ PCI32 Interfaces are pre-implemented and fully tested modules for the Xilinx Spartan-II FPGAs. The pinout for the device and the relative placement of the internal Configurable Logic Blocks (CLBs) are pre-defined. Critical paths are controlled by TimeSpecs and guide files to ensure predictable timing. This significantly reduces the engineering time required to implement the PCI portion of your design. Resources can instead be focused on the unique back-end logic in the FPGA and on the system-level design. As a result, LogiCORE™ PCI products can minimize your product development time.

Xilinx Spartan-II FPGAs enable designs of fully PCI-compliant systems. The devices meet all required electrical and timing parameters, including AC output drive characteristics, input capacitance specifications (10 pF), 7 ns setup and 0 ns hold to system clock, and 11 ns system clock to output.
These devices meet all specifications for PCI 3.3 V and 5 V. The PCI Compliance Checklist has detailed information about electrical compliance.

Other features that enable efficient implementation of a complete PCI system in the Spartan-II include:

* Block SelectRAM+™ memory: blocks of on-chip ultra-fast RAM with synchronous write and dual-port RAM capabilities. Used in PCI Interfaces to implement FIFOs
* SelectRAM™ memory: on-chip ultra-fast RAM with synchronous write option and dual-port RAM option. Used in PCI Interfaces to implement FIFOs
* Individual output enable for each I/O
* Internal 3-state bus capability
* 4 global low-skew clock or signal distribution networks
* IEEE 1149.1-compatible boundary scan logic support
* Designed for CompactPCI Hot Swap support

The Master and Slave Interface module is carefully optimized for the best possible performance and utilization in the Spartan-II FPGA architecture.

Smart-IP Technology: Guaranteed Timing

Drawing on the architectural advantages of Xilinx FPGAs, new Xilinx Smart-IP technology is incorporated in every LogiCORE PCI core and ensures the highest performance, predictability, reproducibility, and flexibility in PCI designs. Xilinx Smart-IP technology leverages the Xilinx architectural advantages, such as look-up tables (LUTs), distributed RAM, and segmented routing, as well as floor planning information, such as logic mapping and relative location constraints. This technology provides the best physical layout, predictability, and performance. Additionally, these predetermined features allow for significantly reduced compile times over competing architectures.

PCI cores made with Smart-IP technology are unique in maintaining their performance and predictability regardless of the device size.

To guarantee the critical setup, hold, and minimum and maximum clock-to-out timing, the PCI core is delivered with Smart-IP constraint files that are unique for a device and package combination.
These constraint files guide the implementation tools so that the critical paths are always within the PCI specification. Retargeting the PCI core to an unsupported device will void the timing guarantee. Contact one of the Xilinx XPERTs partners for support of unlisted devices and packages. See the XPERTs section in chapter 7 of the Xilinx PCI Data Book for contact information.

Universal PCI Support

Since Spartan-II FPGAs are capable of operating in either 3.3 V or 5 V PCI environments, the designer can easily build universal PCI cards. This requires loading one of the bitstreams at power-up. Refer to the PCI Implementation Guide and the application note Building a Universal PCI Card using Xilinx FPGAs.

Functional Description

The LogiCORE PCI32 Master and Slave Interface is partitioned into five major blocks and a user application, as shown in Figure 1. Each block is described below.

PCI Configuration Space

This block provides the first 64 bytes of the Type 0, version 2.1 Configuration Space Header (CSH) (see Table 1) to support software-driven "Plug-and-Play" initialization and configuration. This includes information for Command, Status, and three Base Address Registers (BARs). These BARs illustrate how to implement memory- or I/O-mapped address spaces.

Table 1: PCI Configuration Space Header

  Offset     Fields, bit 31 (left) to bit 0 (right)
  00h        Device ID | Vendor ID
  04h        Status | Command
  08h        Class Code | Rev ID
  0Ch        BIST | Header Type | Latency Timer | Cache Line Size
  10h        Base Address Register 0 (BAR0)
  14h        Base Address Register 1 (BAR1)
  18h        Base Address Register 2 (BAR2)
  1Ch        Base Address Register 3 (BAR3)
  20h        Base Address Register 4 (BAR4)
  24h        Base Address Register 5 (BAR5)
  28h        Cardbus CIS Pointer
  2Ch        Subsystem ID | Subsystem Vendor ID
  30h        Expansion ROM Base Address
  34h        Reserved | CapPtr
  38h        Reserved
  3Ch        Max_Lat | Min_Gnt | Interrupt Pin | Interrupt Line
  40h-FFh    Reserved

Note: Italicized address areas are not implemented in the LogiCORE PCI32 Spartan-II Interface default configuration. These locations return zero during configuration read accesses.

Each BAR sets the base address for the interface and allows the system software to determine the addressable range required by the interface. Every BAR designated as a memory space can be made to represent a 32-bit space. Using a combination of Configurable Logic Block (CLB) flip-flops for the read/write registers and CLB look-up tables for the read-only registers results in optimized logic mapping and placement.

The LogiCORE PCI32 Interface can add extended configuration capabilities as defined in PCI Specification V2.2. This capability, including the ability to implement a CapPtr in configuration space, allows the user to implement extended functions, such as Power Management, Hot Swap CSR, and Message Based Interrupts, in the back-end design.

PCI I/O Interface Block

The I/O interface block handles the physical connection to the PCI bus, including all signaling, input and output synchronization, output three-state controls, and all request-grant handshaking for bus mastering.

Parity Generator/Checker

This block generates and checks even parity across the AD bus, the CBE lines, and the PAR signal. It also reports data parity errors via PERR# and address parity errors via SERR#.

Target State Machine

This block controls the PCI interface for Target functions. The states implemented are a subset of the equations defined in "Appendix B" of the PCI Local Bus Specification. The controller is a high-performance state machine using one-hot (state-per-bit) encoding for maximum performance. State-per-bit encoding of the next-state logic functions facilitates a high-performance implementation in the Xilinx FPGA architecture.

Initiator State Machine

This block controls the PCI interface for Initiator functions. The states implemented are a subset of the equations defined in "Appendix B" of the PCI Local Bus Specification. The Initiator Control Logic also uses state-per-bit encoding for maximum performance.

User Application with Optional Burst FIFOs

The LogiCORE PCI32 Interface provides a simple, general-purpose interface with a 32-bit data path and latched address for de-multiplexing the PCI address/data bus. This user interface allows the rest of the device to be used in a wide range of 32-bit applications.

Typically, the user application contains burst FIFOs to increase PCI system performance. An on-chip read/write FIFO, built from the on-chip synchronous dual-port RAM (Block SelectRAM+™) available in Spartan-II FPGAs, supports data transfers in excess of 66 MHz. Several synthesizable, re-usable bridge designs, including commonly used back-end functions such as doorbells and mailboxes, are provided with the core.

Interface Configuration

The LogiCORE PCI32 Interface can easily be configured to fit unique system requirements by using the Xilinx web-based PCI configuration tool or by changing the Verilog or VHDL configuration file. The following customization options, also described in the product documentation, are supported by this LogiCORE product:

* Initiator or target functionality
* Base Address Register configuration (1-3 Registers, size and mode)
* Configuration Space Header ROM
* Initiator and target state machine (e.g., termination conditions, transaction types and request/transaction arbitration)
* Burst functionality
* User Application including FIFO (back-end design)

Table 2 lists the PCI bus commands supported by the LogiCORE™ PCI32 Interface. The PCI Compliance Checklist has more details on supported and unsupported commands.
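The command encodings and support status that Table 2 lists can also be expressed as a small lookup, which may be convenient in a testbench script or a bus-trace decoder. The following Python sketch only transcribes Table 2; the names PCI_COMMANDS, decode_command, and slave_responds are illustrative and are not part of the LogiCORE deliverables.

```python
# C/BE[3:0] command encodings and support status, transcribed from Table 2.
# Tuple layout: (command name, PCI Master support, PCI Slave support).
# "No" entries for the master carry note 1 of Table 2.
PCI_COMMANDS = {
    0b0000: ("Interrupt Acknowledge", "Yes", "Yes"),
    0b0001: ("Special Cycle", "Yes", "Ignore"),
    0b0010: ("I/O Read", "Yes", "Yes"),
    0b0011: ("I/O Write", "Yes", "Yes"),
    0b0100: ("Reserved", "Ignore", "Ignore"),
    0b0101: ("Reserved", "Ignore", "Ignore"),
    0b0110: ("Memory Read", "Yes", "Yes"),
    0b0111: ("Memory Write", "Yes", "Yes"),
    0b1000: ("Reserved", "Ignore", "Ignore"),
    0b1001: ("Reserved", "Ignore", "Ignore"),
    0b1010: ("Configuration Read", "Yes", "Yes"),
    0b1011: ("Configuration Write", "Yes", "Yes"),
    0b1100: ("Memory Read Multiple", "Yes", "Yes"),
    0b1101: ("Dual Address Cycle", "No", "Ignore"),
    0b1110: ("Memory Read Line", "Yes", "Yes"),
    0b1111: ("Memory Write Invalidate", "No", "Yes"),
}


def decode_command(cbe: int) -> tuple:
    """Return (name, master support, slave support) for a 4-bit C/BE[3:0] value."""
    if not 0 <= cbe <= 0xF:
        raise ValueError("C/BE[3:0] must be a 4-bit value")
    return PCI_COMMANDS[cbe]


def slave_responds(cbe: int) -> bool:
    """True when Table 2 marks the command as supported by the PCI Slave."""
    return decode_command(cbe)[2] == "Yes"


if __name__ == "__main__":
    for code in sorted(PCI_COMMANDS):
        name, master, slave = PCI_COMMANDS[code]
        print(f"{code:04b}  {name:<24} master={master:<7} slave={slave}")
```

A lookup of this kind keeps the encoding in one place, so a script that checks simulated bus traffic does not need to repeat the 4-bit constants.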
Supported PCI Commands

Table 2: PCI Bus Commands

  CBE[3:0]   Command                    PCI Master   PCI Slave
  0000       Interrupt Acknowledge      Yes          Yes
  0001       Special Cycle              Yes          Ignore
  0010       I/O Read                   Yes          Yes
  0011       I/O Write                  Yes          Yes
  0100       Reserved                   Ignore       Ignore
  0101       Reserved                   Ignore       Ignore
  0110       Memory Read                Yes          Yes
  0111       Memory Write               Yes          Yes
  1000       Reserved                   Ignore       Ignore
  1001       Reserved                   Ignore       Ignore
  1010       Configuration Read         Yes          Yes
  1011       Configuration Write        Yes          Yes
  1100       Memory Read Multiple       Yes          Yes
  1101       Dual Address Cycle         No(1)        Ignore
  1110       Memory Read Line           Yes          Yes
  1111       Memory Write Invalidate    No(1)        Yes

Note:
1. The Initiator can present these commands; however, they either require additional user-application logic to support them or are not thoroughly tested.

Burst Transfer Bandwidth

The Xilinx LogiCORE PCI32 Interface supports fully compliant zero wait-state burst operations for both sourcing and receiving data. This Interface supports a sustained bandwidth of up to 128 MBytes/sec. The design can be configured to take advantage of the ability of the LogiCORE PCI32 Interface to do very long bursts. Since the FIFO is not of fixed size, bursts can go on for as long as the chipset arbiter will allow. Furthermore, since the FIFOs and DMA are decoupled from the proven core, a designer can modify these functions without affecting the critical PCI timing.

The flexible Xilinx back end, which supports numerous PCI features, gives users a solution used in many high-performance applications. Xilinx supports different depths of FIFOs and dual-port FIFOs, synchronous/asynchronous FIFOs, and multiple FIFOs. The user is not locked into one DMA engine; hence, an application-specific DMA engine can be designed.

The theoretical maximum bandwidth of a 32-bit, 33 MHz PCI bus is 128 MBytes/sec. Attaining this maximum bandwidth depends on several factors, including the PCI design used, the PCI chipset, the processor's ability to keep up with your data stream, the maximum capability of your PCI design, and other traffic on the PCI bus. Older chipsets and processors allow less bandwidth than newer ones.

No additional wait-states are inserted in response to a wait-state from another agent on the bus. Either IRDY or TRDY is kept asserted until the current data phase ends, as required by the V2.2 PCI Specification.

See Table 3 for PCI bus transfer rates for various operations.

The PCI bus derives its performance from its ability to support burst transfers. The performance of any PCI application depends largely on the size of the burst transfer. A FIFO to support PCI burst transfer can efficiently be implemented using the Spartan-II on-chip RAM features, both Distributed and Block SelectRAM+™.

Each Spartan-II CLB supports four 16x1 RAM blocks. This corresponds to 64 bits of single-ported RAM or 32 bits of dual-ported RAM, with simultaneous read/write capability.

Each Spartan-II device has two columns of Block RAM. The V300 device has 55,536 bits of Block SelectRAM+ that can be used to create deep, dual-ported FIFOs.

Table 3: LogiCORE PCI32 Transfer Rates

  Zero Wait-State Mode
  Operation                         Transfer Rate
  Initiator Write (PCI LogiCORE)    3-1-1-1
  Initiator Read (PCI LogiCORE)     4-1-1-1
  Target Write (PCI LogiCORE)       5-1-1-1
  Target Read (PCI LogiCORE)        6-1-1-1

Note: Initiator Read and Target Write operations have effectively the same bandwidth for burst transfer.

Timing Specification

The Spartan-II FPGA devices, together with the LogiCORE PCI32 product, enable the design of fully compliant PCI systems. The maximum speed at which your back-end is capable of running can be affected by the size of the design as well as by the loading of the hot signals coming directly from the PCI bus. Table 4 shows the key timing parameters for the LogiCORE PCI32 Interface that must be met for full PCI compliance.
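To see why the setup and clock-to-output figures are critical, it can help to add up the 33 MHz cycle budget. The Python sketch below is an illustration only. It takes the 7 ns setup and 11 ns clock-to-output values quoted in the General Description and a 30 ns clock period for 33 MHz operation. The 10 ns bus propagation and 2 ns clock skew allowances are assumptions drawn from the 33 MHz figures of the PCI Local Bus Specification, not from this data sheet, and the function names are hypothetical.

```python
# 33 MHz PCI cycle budget, all values in nanoseconds.
T_CYC = 30.0   # clock period at 33 MHz
T_VAL = 11.0   # clock to output, as quoted in the General Description
T_SU = 7.0     # input setup to clock, as quoted in the General Description
T_PROP = 10.0  # ASSUMPTION: bus propagation allowance (PCI Local Bus Specification)
T_SKEW = 2.0   # ASSUMPTION: clock skew allowance (PCI Local Bus Specification)


def timing_slack(t_cyc: float, t_val: float, t_prop: float,
                 t_su: float, t_skew: float) -> float:
    """Return the time left in one clock period after all budget terms are spent."""
    return t_cyc - (t_val + t_prop + t_su + t_skew)


def meets_budget(t_cyc: float, t_val: float, t_prop: float,
                 t_su: float, t_skew: float) -> bool:
    """True when the budget terms fit inside one clock period."""
    return timing_slack(t_cyc, t_val, t_prop, t_su, t_skew) >= 0.0


if __name__ == "__main__":
    slack = timing_slack(T_CYC, T_VAL, T_PROP, T_SU, T_SKEW)
    print(f"slack = {slack:.1f} ns")
    # A device that is 1 ns slow on clock-to-output no longer fits the budget.
    print(meets_budget(T_CYC, T_VAL + 1.0, T_PROP, T_SU, T_SKEW))
```

Under these assumptions the budget closes with no slack, which is why the core's constraint files pin down the setup and clock-to-output paths.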
Table 4: Timing Parameters (ns)

                                                  PCI Spec.       LogiCORE PCI32
                                                                  Spartan-II-5
  Parameter                          Ref.         Min    Max      Min      Max
  CLK Cycle Time                     Tcyc         15     30       15(1)    30
  CLK High Time                      Thigh        6      -        6        -
  CLK Low Time                       Tlow         6      -        6        -
  CLK to Bus Signals Valid(3)        TICKOF       2      6        2        6
  CLK to REQ# and GNT# Valid(3)      TICKOF       2      6        2        6
  Tri-state to Active                Ton          2      -        2        -
  CLK to Tri-state                   Toff         -      14       -        14(1)
  Bus Signal Setup to CLK (IOB)      TPSD         3      -        3        -
  Bus Signal Setup to CLK (CLB)                   5      -        5(1)     -
  GNT# Setup to CLK                  TPSD         5      -        5        -
  GNT# Setup to CLK (CLB)            TPSD         5      -        5        -
  Input Hold Time After CLK (IOB)    TPHD         0      -        0        -
  Input Hold Time After CLK (CLB)                 0      -        0        -
  RST# to Tri-state                  Trst-off     -      40       -        40(2)

Notes:
1. Controlled by TIMESPECS, included in product.

Verification Methods

Xilinx has developed a system-level testbench that allows simulation of an open PCI environment in which a LogiCORE PCI-based design may be tested by itself or with other simulatable PCI agents. Included in these agents are a behavioral host and target, and several plug-in modules, including a PCI signal recorder and a PCI protocol monitor. Using these tools, PCI developers can write microcode-style test scripts to verify different bus-operation scenarios, including those in the PCI Compliance Checklist.

The Xilinx PCI testbench is a powerful verification tool that is also used as the basis for PCI LogiCORE verification. The PCI LogiCORE is also tested in hardware for electrical, functional, and timing compliance.

Ping Reference Design

The Xilinx PING64 Application Example, delivered in Verilog and VHDL, has been developed to provide an easy-to-understand example that demonstrates many of the principles and techniques required to successfully use a LogiCORE PCI32 Interface in a System-on-a-Chip solution. The PING design is also used as a test vehicle for PCI core verification.

Asynchronous PCI FIFO Design Note

A first-in, first-out (FIFO) memory queue reference design with independent read and write clocks and back-up capability is available for use with the LogiCORE PCI32 Interface. It is delivered in Verilog and VHDL as a drop-in module for the Spartan-II FPGAs.

This design supports data widths of 32, 36, 64, or 72 bits with a memory depth of 63 locations implemented in SelectRAM+. Figure 1 presents a block diagram of the design. The pci_async_fifo features fully synchronous and independent clock domains for the read and write ports, and back-up synchronous to the read clock. The design also supports full and empty status, along with almost-full and almost-empty flags. Invalid read or write requests are rejected without affecting the FIFO state.

For detailed information on features and design specifics, refer to the Asynchronous PCI FIFO design data sheet.

Figure 1: LogiCORE PCI Interface Block Diagram (x9097). Blocks shown: PCI Bus, LogiCORE PCI 64/32 Spartan-II Interface, FIFO Controller, PCI Asynchronous Write FIFO Reference Design, PCI Asynchronous Read FIFO Reference Design, Custom DMA Module, Power Management Module Reference Design.

Device Utilization

The Target/Initiator options require a variable amount of CLB resources for the PCI32 Interface.

Utilization of the device can vary slightly, depending on the configuration choices made by the designer. Factors that can affect the size of the core are:

* Number of Base Address Registers used. Turning off any unused BARs will save resources.
* Size of the BARs. Setting a BAR to a smaller size requires more flip-flops, because a smaller address space requires more address bits to decode.
* Latency timer. Disabling the latency timer will save resources. It must be enabled for bursting.

Recommended Design Experience

The LogiCORE PCI32 Interface is pre-implemented, allowing engineering to focus on the unique back-end functions of a PCI design. Regardless, PCI is a high-performance system that is challenging to implement in any technology, ASIC or FPGA. Therefore, previous experience with building high-performance, pipelined FPGA designs using Xilinx implementation software, TIMESPECs, and guide files is recommended. The challenge of implementing a complete PCI design, including back-end functions, varies depending on the configuration and the functionality of your application. Contact your local Xilinx representative for a closer review and an estimate for your specific requirements.
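As a worked illustration of the Burst Transfer Bandwidth discussion, the Python sketch below estimates how burst length affects throughput for the transfer rates listed in Table 3 (LogiCORE PCI32 Transfer Rates). It assumes that a rate such as 3-1-1-1 means three clocks for the first data phase and one clock for each later data phase, and it ignores arbitration, idle cycles, and wait-states inserted by other agents. The 128 MBytes/sec peak is the figure quoted in this data sheet; the function and dictionary names are illustrative only.

```python
PEAK_MBYTES_PER_SEC = 128.0  # theoretical maximum quoted in this data sheet

# Clocks needed for the first data phase: the leading number of each
# "x-1-1-1" entry in Table 3.
FIRST_PHASE_CLOCKS = {
    "initiator_write": 3,
    "initiator_read": 4,
    "target_write": 5,
    "target_read": 6,
}


def burst_efficiency(first_phase_clocks: int, data_phases: int) -> float:
    """Fraction of clocks in one burst that actually transfer data."""
    if first_phase_clocks < 1 or data_phases < 1:
        raise ValueError("clock and data phase counts must be at least 1")
    # ASSUMPTION: every data phase after the first completes in one clock.
    total_clocks = first_phase_clocks + (data_phases - 1)
    return data_phases / total_clocks


def effective_bandwidth(operation: str, data_phases: int,
                        peak: float = PEAK_MBYTES_PER_SEC) -> float:
    """Estimated throughput in MBytes/sec for one burst of the given length."""
    return peak * burst_efficiency(FIRST_PHASE_CLOCKS[operation], data_phases)


if __name__ == "__main__":
    for operation in FIRST_PHASE_CLOCKS:
        for phases in (4, 16, 256):
            rate = effective_bandwidth(operation, phases)
            print(f"{operation:<16} {phases:>4} data phases: {rate:6.1f} MBytes/sec")
```

The estimate shows the trend described in the text: short bursts are dominated by the first-phase latency, while long bursts approach the theoretical maximum for every operation type.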