Synchronous SRAM Interface to Cypress CY7C1380C-167
Overview
This Synchronous SRAM (SSRAM) interface creates a 32-bit data width tri-state interface to a 2MBx36-bit Cypress CY7C1380C-167 device. Other devices in the same family with different address widths (1MB, 2MB, and 4MB devices) may also be used. The SSRAM interface is simply a collection of Avalon signals that map to the SSRAM device; it offers no controller operations, such as burst support.
This document describes general interface guidelines for the CY7C1380C-167 device as well as guidelines specific to the Nios Development Board, Cyclone II Edition.
System Requirements
Timing Information
The CY7C1380C is a pipelined SRAM device which delivers requested read-data after one clock cycle from when address and control are presented. Because of SSRAM Tco, FPGA Tsu, and read-to-write bus turnaround timing requirements, a minimum of two clock-cycles of read latency are used for Avalon reads. At high clock speeds, an additional cycle of read latency is used to allow for proper timing, up to the SSRAM’s maximum specified speed of 167MHz. All writes complete with zero latency, once pending reads are completed. This means that for continuous reads from SSRAM, data is available after the initial two (or three) cycle delay (plus latency due to the registered tri-state bridge). Thus, for optimal performance, reads and writes should be as continuous as possible, because writing to SSRAM breaks the read pipeline and causes a delay when reads are resumed.
To review, “read latency” in Avalon timing means additional clocks that must be added beyond one clock to perform a read operation, in a pipelined manner. For example, when an Avalon read is performed to an interface requiring one cycle of latency, Avalon expects the data to be ready before the second clock edge following initiation of the transfer. Similarly, with two cycles of latency, Avalon expects read data three cycles following initiation of the transfer. For additional details please refer to the Avalon Interface Specification.
For high-speed SSRAM operation, it is essential that the phase of the clock driving SSRAM have a precise relationship with the Avalon clock driving the SSRAM interface. It is recommended that this relationship be maintained using a PLL available in Altera FPGAs. Example designs in Nios II development kits use this PLL configuration. For additional details on PLL operation and what the guidelines below mean, please refer to the PLL chapter of the device handbook appropriate to the Altera FPGA family you are targeting.
General clocking guidelines based on the Nios Development Board, Cyclone II Edition (EP2C35F672-6 device):
| Clock Speed | SSRAM Read Latency | Desired Tco (Avalon to SSRAM clock) | Suggested PLL phase shift to SSRAM clock (“Normal” PLL mode) |
| Up to 130MHz | 2 cycles | -2.0ns | -4.8ns |
| Above 130MHz | 3 cycles | 1.5ns | -1.17ns |
The above PLL setting guidelines are based on worst-case timing analysis, using data from the CY7C1380C-167 datasheet and Quartus II Timing Analyzer report for the Altera EP2C35F672-6 device on the Nios Development Board, Cyclone II Edition. If you are using a different speed SSRAM device, different Altera device, or if your target-board differs, resulting in different propagation delays between the FPGA and SSRAM interface, it is recommended that a thorough timing analysis be performed to ensure proper operation at high clock speeds.
An explanation of the timing guidelines provided above can be found in the Timing Analysis Methodology section of this document. For operation at lower clock speeds, up to 100MHz, it is likely that no modification to the above settings will be required if targeting a custom board. However, performing a thorough timing analysis of any off-chip memory interface is always recommended.
In addition to the above PLL settings, it is recommended that you use the same PLL to generate your Avalon clock and external SSRAM clock. This will ensure the Tco relationship between the two clocks remains constant.
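The clocking guidelines above can be captured in a small lookup, sketched here in Python. The table values are copied from the guidelines for the Nios Development Board, Cyclone II Edition; the names GUIDELINES and suggest_settings are illustrative only and are not part of SOPC Builder or Quartus II. The values apply only to that board and device combination.

```python
# Guideline rows for the Nios Development Board, Cyclone II Edition:
# (max clock in MHz, read latency in cycles, desired Tco in ns, PLL phase shift in ns)
GUIDELINES = [
    (130.0, 2, -2.0, -4.8),
    (167.0, 3, 1.5, -1.17),  # 167MHz is the SSRAM's maximum specified speed
]

def suggest_settings(clock_mhz):
    """Return the guideline row that covers the requested clock speed."""
    if clock_mhz <= 0:
        raise ValueError("clock speed must be positive")
    for max_mhz, latency, tco_ns, pll_shift_ns in GUIDELINES:
        if clock_mhz <= max_mhz:
            return {
                "read_latency_cycles": latency,
                "desired_tco_ns": tco_ns,
                "pll_phase_shift_ns": pll_shift_ns,
            }
    raise ValueError("clock speed exceeds the 167MHz SSRAM limit")

settings_100 = suggest_settings(100.0)
settings_150 = suggest_settings(150.0)
print(settings_100)
print(settings_150)
```

For a different board, FPGA, or SSRAM speed grade, the rows would need to be replaced with values from your own timing analysis.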
Required I/O Assignments
In addition to the I/O from the SOPC Builder system module to the SSRAM device, several additional pins must be tied high or low in the FPGA design. For the Nios Development Board, Cyclone II Edition, the only additional I/O requiring assignments are:
The following additional I/O must be tied high-or-low for this SSRAM interface to operate properly. These pins are tied high or low via pull-up/pull-down resistors on the Nios Development Board, Cyclone II Edition:
Un-used SSRAM I/O
The CY7C1380 features a 36-bit data bus (4 parity bits). These parity bits are not used in this interface.
Timing Analysis Methodology
The clock speeds at which each latency setting can operate, and any PLL phase shift required, are calculated as follows:
The following analysis was performed to obtain the latency, clock speed, and PLL phase shift guidelines above. This analysis may serve as a starting-point if you wish to use this SSRAM interface at high-speeds with a different Altera device (Family, density, or speed-grade), different speed-grade SSRAM device, or have a board with very long (or short) trace lengths resulting in significantly different propagation delays.
For the following examples, timing data is based on the Altera EP2C35F672-6 device on the Nios Development Board, Cyclone II Edition, board layout (trace length) information for this development board, and published timing parameters in the Cypress CY7C1380C-167 device datasheet.
FPGA-to-SSRAM Setup Relationship
The FPGA-to-SSRAM setup relationship can be calculated using three components: Tco (clock to output) on FPGA I/O to SSRAM, Tpd (propagation delay) on the target board, and Tsu (setup time) for SSRAM I/O:
Note: Each of the figures above represents worst-case delays.
The sum of these is 5.3ns. This means that to satisfy the FPGA-to-SSRAM portion of the timing analysis, the SSRAM’s rising clock edge where a read or write is to be initiated must occur at least 5.3ns after the rising edge of the Avalon clock driving the SSRAM interface in the FPGA.
SSRAM-to-FPGA Setup Relationship
The SSRAM-to-FPGA setup relationship is calculated just as the FPGA-to-SSRAM relationship, but in the opposite direction:
The sum of these is 9.7ns. Therefore, the SSRAM’s rising clock edge must precede the Avalon clock driving the SSRAM interface in the FPGA by a minimum of 9.7ns. In addition, the FPGA has a negative Th (hold) time requirement of -5.1ns. This means that read data from the SSRAM must be present and valid at the FPGA I/O between 5.3ns and 5.1ns before the rising edge of the Avalon clock driving SSRAM; the SSRAM may hold the read data for a longer period of time if necessary.
Clock Adjustments – Zero Phase Shift
Using the above calculations, the maximum clock speed at two cycles of latency can be easily calculated if the SSRAM and Avalon clocks are exactly in phase: The minimum period is 9.7ns (higher of the two calculations above), and therefore the maximum clock speed is 1/period, or approximately 103MHz.
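The zero-phase-shift limit can be reproduced with a few lines of Python. This is a minimal sketch: the function and argument names are illustrative, and the two inputs are the worst-case sums quoted in this document (5.3ns and 9.7ns).

```python
def max_clock_mhz(fpga_to_ssram_ns, ssram_to_fpga_ns):
    """Highest clock speed (MHz) when the Avalon and SSRAM clocks are in phase."""
    # The clock period must cover the slower of the two setup relationships.
    min_period_ns = max(fpga_to_ssram_ns, ssram_to_fpga_ns)
    return 1000.0 / min_period_ns  # a period in ns converts to MHz as 1000 / period

zero_shift_mhz = max_clock_mhz(5.3, 9.7)
print(f"{zero_shift_mhz:.1f} MHz")  # prints "103.1 MHz"
```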
For low-speed operation, the above calculation means that maintaining the phase relationship of Avalon and SSRAM clocks is all that is necessary. However, by using the precise programmable clock phase-shift features of Altera PLLs, it is possible to increase performance substantially by adjusting the Tco relationship between Avalon and SSRAM clocks. Figure 1 illustrates the timing relationship when the SSRAM & Avalon clocks are in phase.
------------------------------------------------------------------------------
Figure 1: SSRAM read timing with two cycles of Avalon read latency, zero phase-shift between Avalon & SSRAM clocks:
(Not to scale)
Avalon & SSRAM clock (in phase):
   +---+   +---+   +---+   +---+
   |   |   |   |   |   |   |   |
---+   +---+   +---+   +---+   +---
   ^       ^               ^
   a       c               e
Read address & control from FPGA to SSRAM:
-----\ /-----\ /-----\ /-------------
      X  “A”  X  “B”  X
-----/ \-----/ \-----/ \-------------
       ^
       b
Read data from SSRAM to FPGA:
-----------------------\ /-----\ /----
                        X  “A”  X  “B”
-----------------------/ \-----/ \----
                         ^
                         d
Legend:
a: Avalon presents read to address “A” to SSRAM
b: Read request to address “A” presented to SSRAM I/O after FPGA Tco + board Tpd
c: SSRAM registers read request to address “A”
d: After one clock delay + Tco delay + board Tpd, SSRAM read data from address “A” presented to FPGA I/O
e: FPGA registers read data “A” from SSRAM
------------------------------------------------------------------------------
Clock Adjustments – Maximum Two-Cycle Latency Performance
By adjusting the clock phase to exploit the inequality between FPGA-to-SSRAM and SSRAM-to-FPGA timing, we can achieve higher speeds. This is done by “pulling” the SSRAM clock back with a negative phase-shift with respect to Avalon. With a shift of -2.0ns, the FPGA-to-SSRAM setup relationship is now increased from 5.3ns to 7.3ns, while the SSRAM-to-FPGA relationship is decreased from 9.7ns to 7.7ns (much closer to equality), thus allowing for a maximum clock speed of approximately 130MHz at two cycles of latency. Figure 2 illustrates this relationship.
------------------------------------------------------------------------------
Figure 2: SSRAM read timing with two cycles of Avalon read latency (-2.0ns phase-shift between Avalon & SSRAM clocks)
(Not to scale)
Avalon clock to SSRAM interface:
    +---+   +---+   +---+   +---+
    |   |   |   |   |   |   |   |
 ---+   +---+   +---+   +---+   +---
    ^                       ^
    b                       f
External SSRAM clock:
   +---+   +---+   +---+   +---+
   |   |   |   |   |   |   |   |
---+   +---+   +---+   +---+   +---
   ^       ^
   a       d
Read address & control from FPGA to SSRAM:
-----\ /-----\ /-----\ /-------------
      X  “A”  X  “B”  X
-----/ \-----/ \-----/ \-------------
       ^
       c
Read data from SSRAM to FPGA:
-----------------------\ /-----\ /----
                        X  “A”  X  “B”
-----------------------/ \-----/ \----
                         ^
                         e
Legend:
a: Rising edge of SSRAM clock, 2.0ns before Avalon clock
b: Rising edge of Avalon clock; Avalon presents read to address “A” to SSRAM
c: Read request to address “A” presented to SSRAM I/O after FPGA Tco + board Tpd
d: SSRAM registers read request to address “A”
e: After one clock + Tco delay, SSRAM read data from address “A” presented to FPGA I/O
f: FPGA registers read data “A” from SSRAM
------------------------------------------------------------------------------
To summarize, with two cycles of read latency, the clock phase shift is calculated by shifting the clock such that the FPGA-to-SSRAM relationship and the SSRAM-to-FPGA relationship are approximately equal; the corresponding PLL phase shift can then be used to reach the maximum speed allowed with two cycles of read latency.
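The balancing step can be sketched in Python as follows. The function names are illustrative, and the inputs are the 5.3ns and 9.7ns worst-case sums from this document. A negative shift means the SSRAM clock leads the Avalon clock. Note that the exact balance point for these two numbers works out to -2.2ns, while the guideline in this document uses -2.0ns; the sketch simply evaluates both.

```python
def min_period_ns(fpga_to_ssram_ns, ssram_to_fpga_ns, shift_ns):
    """Minimum clock period for a given SSRAM clock shift (negative = SSRAM clock earlier)."""
    # An earlier SSRAM clock leaves less time for the FPGA-to-SSRAM path...
    to_ssram = fpga_to_ssram_ns - shift_ns
    # ...and correspondingly more time for the SSRAM-to-FPGA path.
    to_fpga = ssram_to_fpga_ns + shift_ns
    return max(to_ssram, to_fpga)

def balanced_shift_ns(fpga_to_ssram_ns, ssram_to_fpga_ns):
    """Shift at which both setup relationships become equal."""
    return (fpga_to_ssram_ns - ssram_to_fpga_ns) / 2.0

period_guideline = min_period_ns(5.3, 9.7, -2.0)   # about 7.7ns
shift_balanced = balanced_shift_ns(5.3, 9.7)       # about -2.2ns
period_balanced = min_period_ns(5.3, 9.7, shift_balanced)

print(f"-2.0ns shift: {period_guideline:.1f}ns, {1000.0 / period_guideline:.1f}MHz")
print(f"balanced shift: {shift_balanced:.1f}ns, {period_balanced:.1f}ns period")
```

With zero shift the same function returns 9.7ns, which matches the in-phase case described earlier.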
Clock Adjustments – Three-Cycle Latency Operation
For operation above 130MHz, it is no longer possible to read from SSRAM with two cycles of read latency as depicted above; a timing violation will occur on either the FPGA-to-SSRAM relationship or SSRAM-to-FPGA relationship. Therefore, we can only increase clock speed by specifying that Avalon look for read data after an additional cycle of latency. To accomplish this, the clock phase shift must also be adjusted because the SSRAM will only hold read data until just after the 2nd clock edge following a read request (at that time, it will transition to read data for the next address in the read pipeline).
Calculating the phase shift for three-cycle read latency is also trickier and more error-prone than the calculation above, which relied only on documented worst-case delays. This is because we must now also pay attention to best-case delays. Figure 3 illustrates the relationship with three cycles of read latency.
------------------------------------------------------------------------------
Figure 3: SSRAM read timing with three cycles of Avalon read latency (+1.5ns phase-shift between Avalon & SSRAM clocks)
(Not to scale)
Avalon clock to SSRAM interface:
   +---+   +---+   +---+   +---+   +---
   |   |   |   |   |   |   |   |   |
---+   +---+   +---+   +---+   +---+
   ^                    ^   ^   ^  ^
   a                    f   g   h  i
External SSRAM clock:
    +---+   +---+   +---+   +---+
    |   |   |   |   |   |   |   |
 ---+   +---+   +---+   +---+   +---
    ^       ^               ^
    b       e               g
Read address & control from FPGA to SSRAM:
-----\ /-----\ /-----\ /-------------
      X  “A”  X  “B”  X
-----/ \-----/ \-----/ \-------------
    ^ ^ ^
    b c d
Read data from SSRAM to FPGA:
-----------------------\ /-----\ /----
                        X  “A”  X  “B”
-----------------------/ \-----/ \----
                        ^       ^
                        f       h
Legend:
a: Rising edge of Avalon clock; Avalon presents read to address “A” to SSRAM
b: Rising edge of SSRAM clock, 1.5ns after Avalon clock
c: Previous Avalon request to SSRAM asserted until after rising edge of SSRAM clock
d: Read request to address “A” presented to SSRAM I/O after FPGA Tco + board Tpd
e: SSRAM registers read request to address “A”
f: After one clock + Tco delay, SSRAM read data from address “A” presented to FPGA I/O
g: SSRAM clock edge after which read data will change
h: Read data from SSRAM no longer valid, Tdoh (min) from SSRAM after clock edge “g”
i: FPGA registers read data “A” from SSRAM
------------------------------------------------------------------------------
As Figure 3 implies, there are two key points that require precise attention in order to operate with three clock cycles of latency. These are in addition to the basic calculations that are described above in the zero-phase-shift description of SSRAM interface timing.
The first occurs in the relationship between “b” and “c” in Figure 3, where data from the FPGA to SSRAM is about to change following the rising edge of the Avalon clock. We must ensure that Th (hold) time to SSRAM (0.5ns) is respected. It is recommended that this relationship be verified with an oscilloscope set to observe control signals from Avalon to SSRAM in relation to the SSRAM clock, at the SSRAM device, and that this measurement be performed with nominal conditions in order to observe the best case Tco delay from FPGA to SSRAM. Therefore, the I/O used in the measurement should be selected from the Quartus II Timing Analyzer report with the least Tco delay amongst I/O to the SSRAM interface. In addition, this experiment should be performed observing rising and falling edges of signals from the FPGA to SSRAM. These measurements yielded the maximum phase shift – 1.5ns – recommended for three-cycle read-latency operation on the Nios Development Board, Cyclone II Edition.
The second occurs between “h” and “i” in Figure 3, where read data from SSRAM changes following the clock edge immediately after read data is valid. At first glance it may appear that SSRAM read data is switching before the Avalon clock edge, causing incorrect data to be read. However, recall that the FPGA I/Os have a negative minimum hold time requirement, as reported in the Quartus II Timing Analyzer report. Thus, data must only be valid until (in our working example) 5.1ns before the clock edge at “i”.
To verify the FPGA hold relationship, the SSRAM timing parameter “Tdoh” (data output hold after clock rise), the minimum board propagation delay, and PLL phase shift are used.
In the worst case, read data stops being valid at a time Tp (period) – Tdoh (min) – Tpd (min) – Tco (phase delay, Avalon to SSRAM) before clock edge “i”, or Tp – 1.5ns – 0.5ns – 1.5ns. Since read data must remain valid until 5.1ns before clock edge “i”, this time may not exceed 5.1ns, and Tp (max) may be solved for: Tp (max) – 1.5ns – 0.5ns – 1.5ns = 5.1ns; Tp (max) = 8.6ns. Thus, the minimum clock speed that can safely be used with three cycles of read latency in this scenario is approximately 117MHz.
This is an important figure to note – since we are increasing Avalon read latency interfacing to a device with fixed read latency, we must ensure that this minimum clock speed is respected. While operating with three cycles of read latency, timing margins actually improve with higher clock speeds, up to the maximum SSRAM device clock speed of 167MHz.
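The minimum-speed calculation for three-cycle operation can be sketched in Python. The function and argument names are illustrative; the inputs are the figures used in this example (5.1ns negative hold margin, 1.5ns Tdoh minimum, 0.5ns minimum board Tpd, and 1.5ns clock phase delay). The exact result is about 116.3MHz, which is consistent with the approximate 117MHz figure quoted above.

```python
def min_clock_mhz_three_cycle(hold_margin_ns, tdoh_min_ns, tpd_min_ns, shift_ns):
    """Lowest safe clock speed (MHz) for three cycles of read latency."""
    # Read data must stay valid until hold_margin_ns before the capture edge:
    #   Tp - Tdoh(min) - Tpd(min) - shift <= hold_margin
    # Solving for the largest allowed period Tp:
    max_period_ns = hold_margin_ns + tdoh_min_ns + tpd_min_ns + shift_ns
    return 1000.0 / max_period_ns

min_mhz = min_clock_mhz_three_cycle(5.1, 1.5, 0.5, 1.5)
print(f"{min_mhz:.1f} MHz")

# A smaller phase delay shortens the maximum period and raises the minimum speed.
min_mhz_zero_shift = min_clock_mhz_three_cycle(5.1, 1.5, 0.5, 0.0)
print(f"{min_mhz_zero_shift:.1f} MHz")
```

The second call shows the sensitivity to the phase shift: with these numbers, a 0ns shift would raise the minimum three-cycle clock speed to roughly 141MHz.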
When performing the above calculations & measurements, it is suggested that you start with zero phase-shift (Tco) between Avalon and SSRAM clocks, and then adjust clock phase slightly if necessary to correct a timing violation at the desired clock speed. Then, re-calculate the other timing parameters to ensure that they still meet timing requirements; this approach was used to verify the 1.5ns phase shift used in this example. Additionally, not all systems may meet timing requirements across the full range of possible clock frequencies. For example, if the maximum allowable Tco delay between Avalon & SSRAM clocks had been closer to 0ns, the minimum speed for three-cycle read-latency operation would have increased. Careful analysis of your system is recommended.