ASIC Design Guidelines
Introduction
The Atmel ASIC Design Guidelines constitute a general set of recommendations
intended for use by designers when preparing circuits for fabrication by Atmel. The
guidelines are independent of any particular CAD tool or silicon process. They are
applicable to Gate Arrays, Cell-Based ASICs (CBICs) and full-custom designs.
Although they do not give specific coding recommendations, they apply equally to
designs captured in Verilog or VHDL as to designs captured as schematics.
These guidelines do not cover general principles of ASIC design; rather they highlight
specific design practices which are regarded as unsafe, and which can lead to
devices which are difficult to test, and whose correct operation cannot be guaranteed
under all circumstances. For each unsafe, and therefore non-recommended design
practice, an alternative safe, and therefore recommended practice is proposed.
The current shift towards system level integration (SLI), incorporating
multiple complex functional blocks and a variety of memories on a single circuit, gives rise
to a new set of design requirements at integration level. These design guidelines do
not yet fully address these issues. The recommendations are principally aimed at the
design of the blocks and memory interfaces which are to be integrated into the
system-on-chip. However, the guidelines given here are fully consistent with the
requirements of system level integration. Respect for these guidelines will significantly ease
the integration effort, and ensure that the individual blocks are easily reusable in other
systems.
These design guidelines have been drawn up in the light of experience with large
numbers of ASIC designs over more than a decade.
The Atmel ASIC Design Guidelines have a particular significance during the signoff of
each design prior to submission for fabrication:
Atmel customers must sign off a design to confirm that it complies with all the recom-
mendations in the Atmel ASIC Design Guidelines. For each case of non-compliance,
the case must be discussed with the ASIC Support Center, and if necessary a formal
Authorization must be obtained.
Application Specific IC (ASIC) Application Note: Design Guidelines
Rev. 1205A–12/99
Synchronous Circuits
Experience has shown that the safest methodology for
time-domain control of an ASIC is synchronous design.
A synchronous circuit is one in which:
all data storage elements are clocked, and in normal
operation change state only in response to the clock
signal
the same active edge of a single clock signal is applied
at precisely the same point in time at every clocked
cell in the device.
Examples of circuit elements which contradict these principles
are given below, and methods of achieving synchronous
design are given in the sections which follow.
Non-recommended Circuits
Circuits which violate the principles of synchronous design
include the following elements:
Flip-flop driving clock input of another flip-flop
The clock input of the second flip-flop is skewed by the
clock-to-q delay of the first flip-flop, and is not activated on
every clock edge. See Figure 1.
Figure 1. Flip-flop driving clock input of another flip-flop
An example of a circuit containing this element is a ripple
counter.
Gated clock line
Gating in a clock line (Figure 2) causes clock skew and can
introduce spikes which trigger the flip-flop. This is particu-
larly the case when there is a multiplexer in the clock line.
Figure 2. Gated clock line
Double-edged clocking
The two flip-flops are clocked on opposite edges of the
clock signal (Figure 3). This makes synchronous resetting
and test methodologies such as scan-path insertion diffi-
cult, and causes difficulties in determining critical signal
paths.
Figure 3. Double-edged clocking
Flip-flop driving asynchronous reset of another flip-
flop.
In Figure 4, the second flip-flop can change state at a time
other than the active clock edge, violating the principle of
synchronous design. In addition, this circuit contains a
potential race condition between the clock and reset of the
second flip-flop.
Figure 4. Flip-flop driving asynchronous reset of another
flip-flop
An example of a circuit containing this element is an asyn-
chronously reset counter.
Recommended Circuits
Methods of achieving the requirements of synchronous
design, and avoiding the non-recommended situations
described above are dealt with in subsequent sections, as
follows:
Synchronous clocking by means of clock buffering: See
Clock Buffering on page 4.
Flip-flop driving clock signal of another flip-flop: See
Gated Clocks on page 10.
Gated clocks: See Gated Clocks on page 10.
Double-edged clocking: See Double-edged Clocking on
page 11.
System clock generation: See Clock Generation and
Overall Circuit Control on page 12.
Asynchronous resets: See Asynchronous Resets on
page 13.
Clock Buffering
To achieve the requirement of a simultaneous application
of a single clock signal at all storage elements in a design,
and avoid problems due to fanout, a clock buffering
scheme needs to be implemented consistently throughout
a circuit. This is often done automatically as part of place-
ment and routing; if not, the principles described in this sec-
tion should be followed.
Non-recommended Circuits
Circuits which violate the principles of consistent clock buff-
ering include the following elements:
Unequal depth of clock buffering
The depth of clock buffering differs between different clock
application points, causing clock skew. See Figure 5.
Figure 5. Unequal depth of clock buffering
Unbalanced fanout on clock buffers
As shown in Figure 6, the difference between the fanouts at
the two intermediate buffers gives rise to different load-
dependent delays, causing clock skew.
Excessive clock fanout
Excessive clock fanout leads to slow clock edges, which
can cause a number of problems, including an increased
risk of metastability in flip-flops which capture external
asynchronous signals.
Figure 6. Unbalanced fanout on clock buffers
Recommended Circuits
The recommended clock buffering scheme is balanced tree
buffering, which must satisfy the following conditions:
1. The same depth of buffering to all clocked cells. (A
suggestion is to use the naming convention: ck0 at the
application point, then ck1, ck2, ... ckn, and to join
equivalent levels up the circuit hierarchy. Note that, if
inverting buffers are used, n must be even to retain clock
polarity.) See Figure 7.
2. The same fanout on all buffers. This must be
checked after placement and routing, to ensure that
tracking capacitances do not unbalance the fanout.
3. Lightly loaded buffers to keep clock edges sharp
(at most 50% of the maximum relative fanout). An
alternative is to use a combination of geometric and
tree buffering, as illustrated in Figure 8.
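The first two conditions can be checked mechanically on a netlist. The following is a minimal Python sketch, not an Atmel tool: it assumes a hypothetical representation of the clock network as a dictionary mapping each buffer name to the list of nodes it drives, and it approximates fanout by a simple count of driven nodes (the guidelines refer to load-dependent relative fanout, which also includes track capacitance).

```python
def check_clock_tree(tree, root):
    """Walk the clock network from `root`.

    Returns (leaf_depths, fanouts): the set of buffer depths seen at the
    clocked cells, and the set of fanout counts seen on the buffers.
    `tree` maps a buffer name to the nodes it drives (hypothetical format);
    any node that is not a key of `tree` is treated as a clocked cell.
    """
    leaf_depths = set()
    fanouts = set()
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        children = tree.get(node, [])
        if not children:
            leaf_depths.add(depth)      # reached a clock application point
            continue
        fanouts.add(len(children))      # simplified fanout: number of loads
        for child in children:
            stack.append((child, depth + 1))
    return leaf_depths, fanouts


def is_balanced(tree, root):
    """True if every clocked cell has the same buffer depth and every
    buffer drives the same number of loads."""
    leaf_depths, fanouts = check_clock_tree(tree, root)
    return len(leaf_depths) == 1 and len(fanouts) <= 1


balanced = {
    "ck2": ["ck1_a", "ck1_b"],
    "ck1_a": ["ff0", "ff1"],
    "ck1_b": ["ff2", "ff3"],
}
unequal_depth = {
    "ck2": ["ck1_a", "ff4"],            # ff4 is clocked one level too early
    "ck1_a": ["ff0", "ff1"],
}
```

Here `is_balanced(balanced, "ck2")` is true, while `is_balanced(unequal_depth, "ck2")` is false because `ff4` sits at a different buffering depth from the other flip-flops.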
Balanced clock tree buffering
Figure 7. Balanced clock tree buffering
Combined geometric/tree buffering
By using an intermediate buffer of a suitable drive strength
at each clock fanout point, the relative fanout at each buffer
is reduced, and clock edges remain sharp.
Figure 8. Combined geometric/tree buffering
Clock Bar Cells
The use of clock bar cells for clock distribution from within a
standard cell area is recommended during placement and routing,
if they are available, as shown in Figure 9. A single clock bar
cell, positioned correctly in the centre of the standard cell
area, can provide a balanced clock net distribution. It runs a
vertical clock trunk through the middle of the cell area, allowing
clock net branches to feed cells on either side of the trunk. This
method reduces the risk of clock skew by halving the effective
clock path length along a row of cells, compared with a clock
supplied from one end of the cell row. It also guides the router
away from threading a long clock path through the standard cells,
and prevents clock net looping.
It is recommended to use only one clock bar cell per stan-
dard cell area (otherwise clock looping may occur). By
using clock bar cells, there will be a balanced clock net dis-
tribution within each standard cell area.
Balanced clock routing using clock bar cells
Figure 9. Balanced clock routing using clock bar cells
Clock Guidance
The use of clock guidance is recommended, if available,
before starting place and route. A central clock trunk should
be run between the standard cell areas, with branches feeding
off either side into the standard cell areas themselves,
as shown in Figure 10. A bad example of clock guidance is
given in Figure 11, highlighting the risk of clock skew.
Good example of clock guidance
Figure 10. Good clock guidance for routing
It is important to have an even number of rows (in the stan-
dard cell areas), because an odd number of rows can force
the place and route software to create loops on the clock
net.
Bad example of clock guidance
Figure 11. Bad clock guidance for routing
Clock Compilers
If clock compilers are available, they help to maintain a
balanced clock network, but should be used with care. The
clock compiler automatically adjusts the clock buffering to
make the equivalent delay for each cell area the same as
the longest delay. This means that additional buffer cells may
be added both outside and inside the standard cell areas,
and the cell areas themselves may be split.
Gated Clocks
A seemingly obvious way of controlling the operation of a
flip-flop is to gate the clock signal with a control signal, or to
multiplex two alternative clocks into its clock input. This
practice is dangerous on two counts:
A glitch on the gate output can cause a clock edge.
Gating in the clock line introduces clock skew.
Non-recommended Circuits
A particularly unsafe circuit element is shown in Figure
12.
Multiplexer on clock line
Figure 12. Multiplexer on clock line
Toggling the multiplexer control signal inevitably causes a
glitch on the ck input to the flip-flop, which may cause it to
capture invalid data.
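The effect can be illustrated with sampled waveforms. The following Python sketch is only an illustration under stated assumptions: the waveforms `cka`, `ckb` and `ctrl` are invented for the example, time is in arbitrary sample steps, and the multiplexer is treated as ideal (no propagation delay).

```python
def mux(cka, ckb, ctrl):
    """Ideal 2:1 multiplexer: selects cka when ctrl is 0, ckb when ctrl is 1."""
    return [b if c else a for a, b, c in zip(cka, ckb, ctrl)]


def pulse_widths(wave):
    """Return the width (in samples) of every high pulse in a sampled waveform."""
    widths, run = [], 0
    for level in wave:
        if level:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Two unrelated clocks (illustrative values only) and a control change at t = 3.
cka  = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
ckb  = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
ctrl = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]

ck_out = mux(cka, ckb, ctrl)
runt = min(pulse_widths(ck_out))   # shortest high pulse seen by the flip-flop
```

In this example the control signal changes while `cka` is high and `ckb` is low, so the multiplexer output carries a high pulse one sample wide, shorter than the high phase of either source clock. A pulse of this kind is what the text describes as a glitch on the ck input.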
Recommended Circuits
Two circuit elements which are recommended for use in
synchronous designs are illustrated here. They are the
enabled (E-type) flip-flop and the toggle (T-type) flip-flop.
They remove the need for gated clocks, or for using the
output from one flip-flop as the clock input of another.
Enabled (E-type) flip-flop
The enable signal (the multiplexer select line) controls the
input of data to the flip-flop. If enable is low, the existing
value of q is re-input at the next clock cycle. If enable is
high, a new data value is clocked in. See Figure 13.
Note: A version of the E-type flip-flop can be constructed with a
synchronous reset. A recommended way of constructing
an E-type flip-flop is using AOI logic. See Design for
Speed on page 31.
Figure 13. E-type flip-flop
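The behaviour of the E-type flip-flop can be summarised in a small behavioural model. This is a Python sketch for illustration only; the class and method names are assumptions, and each call to `clock()` stands for one active clock edge.

```python
class EFlipFlop:
    """Behavioural model of an enabled (E-type) flip-flop.

    The clock is never gated: the flip-flop is clocked on every edge, and a
    multiplexer in front of the d-input chooses between new data and the
    current value of q.
    """

    def __init__(self):
        self.q = 0

    def clock(self, d, enable):
        """Apply one active clock edge and return the new value of q."""
        self.q = d if enable else self.q   # enable low: q is re-input
        return self.q


ff = EFlipFlop()
held = ff.clock(d=1, enable=0)     # enable low: q keeps its reset value
loaded = ff.clock(d=1, enable=1)   # enable high: new data is clocked in
kept = ff.clock(d=0, enable=0)     # enable low again: q is retained
```

The point of the construction is that the control signal acts on the data path, so the clock line stays free of logic.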
Toggle (T-type) flip-flop
The toggle flip-flop is the basic element in synchronous
counters. The toggle signal (the multiplexer select line)
controls the state of the flip-flop. If toggle is low, the flip-flop
retains its existing value at the next clock edge; if toggle is
high, it takes the opposite value. See Figure 14.
Note: A version of the T-type flip-flop can be constructed with a
synchronous reset. A recommended way of constructing
a T-type flip-flop is using AOI logic. See Design for
Speed on page 31.
Figure 14. T-type flip-flop
Double-edged Clocking
In an attempt to increase data throughput rates, use is
sometimes made of both the rising and the falling clock
edge for clocked elements. This practice, however, violates
the principles of synchronous design given in Synchro-
nous Circuits on page 2, and causes a number of prob-
lems, in particular:
An asymmetrical clock duty cycle can cause setup and
hold violations.
It is difficult to determine critical signal paths.
Test methodologies such as scan-path insertion are
difficult, as they rely on all flip-flops being activated on
the same clock edge. If scan insertion is required in a
circuit with double-edged clocking, multiplexers must be
inserted in the clock lines to change to single-edged
clocking in test mode. See, however, the warning in
“Multiplexer on clock line” on page 10.
The recommended alternative is to use a single-edged
clocking scheme with a higher clock frequency.
A general principle of synchronous circuit design is that the
minimum time resolution available within the circuit is the
duration of one complete clock cycle.
Non-recommended Circuit
Pipelined logic with double-edged clocking
In a circuit as shown in Figure 15, an asymmetrical clock
duty cycle could cause setup and hold time violations, and
a scan-path cannot easily be threaded through the flip-
flops.
Figure 15. Pipelined logic with double-edged clocking
Recommended Circuit
Pipelined logic with single-edged clocking
The equivalent synchronous circuit (Figure 16) requires a
clock frequency of double the previous version.
It is also recommended that enabled logic is used where
required. See Gated Clocks on page 10.
Figure 16. Pipelined logic with single-edged clocking
Clock Generation and Overall Circuit Control
If clocks of different speeds are required by different
blocks, or the internal clock is required at a speed faster or
slower than the externally available clock, it is recom-
mended that a single clock generation block is constructed
at the top level of a circuit. This produces the internal
clocks required by all the functional blocks in the circuit.
See Figure 17.
Communication between the internal blocks is achieved by
the same principles as for asynchronous external inputs.
See Asynchronous Inputs on page 17.
Recommended Circuit
Figure 17. Clock generation module at circuit top level
Generating higher- or lower-speed internal
clocks
If the externally available clock signal is of a higher fre-
quency than that required for an internal clock, a synchro-
nous binary counter (made from T-type flip-flops) is
recommended to perform the required clock division.
Latching of data conditionally, or at a lower frequency than
this internal clock is achieved by the use of individual E-
type flip-flops for data storage.
Alternatively, a PLL can be used to produce a higher-speed
internal clock than the external reference clock.
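A synchronous binary counter built from T-type flip-flops can be modelled as follows. This Python sketch is illustrative: the function names are assumptions, and the use of the terminal count as an enable for E-type flip-flops is one possible way of latching data at a lower rate, not a circuit taken from these guidelines.

```python
def counter_step(bits, enable=1):
    """Apply one clock edge to a synchronous binary counter.

    `bits` holds the flip-flop outputs, least significant bit first. Every
    stage is clocked by the same edge; stage i toggles only when `enable`
    and all lower-order bits are 1, so no flip-flop output ever drives a
    clock input (unlike a ripple counter).
    """
    toggle = enable
    next_bits = []
    for q in bits:
        next_bits.append(q ^ toggle)   # T-type: invert q when toggle is high
        toggle = toggle & q            # carry into the next stage
    return next_bits


def terminal_count(bits):
    """High for one cycle in every 2**n: usable as an E-type enable."""
    return int(all(bits))


state = [0, 0, 0]
values = []
for _ in range(8):
    state = counter_step(state)
    values.append(sum(bit << i for i, bit in enumerate(state)))
```

All next-state values are computed from the current state before any flip-flop is updated, which is the software analogue of clocking every stage on the same edge.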
Asynchronous Resets
The general recommendations for dealing with resets
within an ASIC are as follows:
1. The circuit must be brought to a known state, both
within test and in operation, within a stated and
agreed number of clock cycles. The known state is
generally achieved by means of a reset mechanism.
2. If an asynchronous reset is required, use a single
global asynchronous reset driven by an external
input. A tree buffering scheme similar to that for
clock distribution may be required to ensure a sharp
edge on the reset signal. The benefit of a reset of
this nature is that it places the entire circuit in a
known state in response to a change on a single
input signal, with no clock cycles required for the
known state to propagate.
3. If a power-on reset (POR) pad is used, the circuit
must contain another global reset for test purposes.
4. If a local reset is required, use a synchronous reset.
Non-recommended Circuit
A local asynchronous reset such as on a counter causes a
change of state in a storage element which is not triggered
by the active clock edge, and therefore violates the princi-
ples of synchronous design given in Synchronous Circuits
on page 2.
Local asynchronous reset of a flip-flop
In Figure 18, the local asynchronous reset causes a
change of state on the second flip-flop which is not syn-
chronized with the active clock edge.
Figure 18. Flip-flop driving asynchronous reset of another flip-flop
Recommended Circuits
The circuits given below overcome the problems discussed
in the previous section.
A general recommendation is, if necessary, to organize
resets into a hierarchy, from global (which may be asyn-
chronous) to local (which must be synchronous).
Global asynchronous reset of all flip-flops
In Figure 19, a single external reset signal (rext) is con-
nected to all flip-flops. The buffering which may be required
is not shown.
Figure 19. Global asynchronous reset of all flip-flops
Local synchronous reset of a flip-flop
In Figure 20, the (active low) reset signal (r) is gated with
the d-input of the second flip-flop, making it synchronous.
The second flip-flop changes state only on an active clock
edge.
Figure 20. Flip-flop driving a synchronous reset of another flip-flop
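The gating of the reset with the d-input can be expressed as a short behavioural model. The Python sketch below is an illustration with assumed names; `r_n` stands for the active-low reset signal (r), and each call to `clock()` represents one active clock edge.

```python
class SyncResetFlipFlop:
    """D-type flip-flop with a local synchronous, active-low reset."""

    def __init__(self):
        self.q = 0

    def clock(self, d, r_n):
        """Apply one active clock edge.

        The reset is ANDed with the d-input, so it can only take effect on
        a clock edge: q never changes between calls to clock().
        """
        self.q = d & r_n
        return self.q


ff = SyncResetFlipFlop()
q_after_load = ff.clock(d=1, r_n=1)    # normal operation: d is captured
q_before_edge = ff.q                   # reset asserted now has no effect yet
q_after_reset = ff.clock(d=1, r_n=0)   # reset takes effect on the clock edge
```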
Shift Registers
Shift registers are particularly intolerant of clock skew. A
problem which occurs in their design is that long shift regis-
ters may require internal clock buffering. If not properly
designed, this buffering can cause clock skew within the
shift register, and interfacing problems between the shift
register and the rest of the circuit.
Non-recommended Circuits
Not recommended is a chain of clock buffers within a shift
register, in either the forward or the reverse direction.
These cases are illustrated below.
Shift register with forward chain of clock buffers
The problem with a forward chain of clock buffers (Figure
21) is that internal clock skew can cause data fallthrough
(where one stage of the shift register is skipped).
Figure 21. Shift register with forward chain of clock buffers.
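Data fallthrough can be reproduced with a simplified timing model. The Python sketch below rests on several assumptions that are not taken from these guidelines: each stage receives its clock one buffer delay (`buf_delay`) after the previous stage, every flip-flop has the same clock-to-q delay (`clk_to_q`), hold times are ignored, and times are in arbitrary units.

```python
def shift_once(state, din, buf_delay, clk_to_q):
    """Apply one clock edge to a shift register with a forward clock chain.

    Stage i is clocked at time i * buf_delay. The output of stage i - 1
    changes at (i - 1) * buf_delay + clk_to_q. If the clock of stage i
    arrives after that change (buf_delay > clk_to_q), stage i captures the
    NEW value of its neighbour: the data falls through.
    """
    new_state = []
    for i in range(len(state)):
        if i == 0:
            d = din
        elif buf_delay > clk_to_q:
            d = new_state[i - 1]   # neighbour has already updated
        else:
            d = state[i - 1]       # neighbour still shows its old value
        new_state.append(d)
    return new_state


no_skew = shift_once([0, 0, 0, 0], din=1, buf_delay=0.0, clk_to_q=1.0)
skewed = shift_once([0, 0, 0, 0], din=1, buf_delay=2.0, clk_to_q=1.0)
```

With no skew the new bit advances by one stage per clock edge. With a buffer delay larger than the clock-to-q delay, the same bit passes through every stage on a single edge.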
Shift register with reverse chain of clock buffers
As shown in Figure 22 below, the problem with a reverse
chain of clock buffers is the timing interface between the
first D-type and the input data received from the rest of the
circuit.
Figure 22. Shift register with reverse chain of clock buffers.
Recommended Circuits
There are two recommended ways of constructing the
clock buffering scheme within a shift register:
1. Use balanced clock tree buffering as in the rest of
the circuit. See Clock Buffering on page 4 and Fig-
ure 23 below. As an additional safety feature, buffer-
ing can be introduced in the data lines between
each flip-flop.
2. Use a FIFO.
Shift register with balanced clock tree buffering
As shown in Figure 23, the clock tree within the shift regis-
ter must be balanced (in terms of relative fanout) with the
same levels of clock tree in other parts of the circuit. Note
the naming convention for clock signals which facilitates
this.
Figure 23. Shift register with balanced tree of clock buffers.
Asynchronous Inputs
A problem arises at the interface between a synchronous
circuit and an external asynchronous input. At the flip-flop
which captures the asynchronous input, there is a probabil-
ity of metastability occurring. This section suggests some
circuits which capture an external asynchronous input with
a minimal risk of metastability.
Note: For large designs, inter-block communication is similar to
external asynchronous interfacing.
Non-recommended Circuits
Not recommended is any circuit using a complicated feed-
back loop to capture an asynchronous input. The function
of such circuits is obscure, and they run the risk of creating
more problems than they solve. They are also very sensi-
tive to noise, and their function can be altered by place-
ment and routing delays.
Recommended Circuits
There are two recommended approaches to the problem of
capturing an asynchronous input signal:
1. Use two (or more) D-type flip-flops in series to reduce
the probability of metastability (Figure 24).
2. Use an asynchronous handshake circuit (Figure
25).
In all cases, the asynchronous event is a rising edge on the
d (external) input to the first flip-flop. The pulse width of this
signal is indeterminate, but is at least one clock cycle. The
asynchronous event may occur simultaneously with a rising
clock edge.
A general point which applies to all situations where meta-
stability is possible is as follows:
The rise and fall times of both the clock and data signals
are significant: fast edges reduce the probability of
metastability.
Two D-type flip-flops in series to capture an
asynchronous input
If the first flip-flop goes into a metastable state, the proba-
bility that it will still be in that state at the next rising clock
edge is low. Should this, however, occur, the metastable
state is propagated to the d (internal) output and into the
rest of the circuit. The probability of this situation is reduced
by additional flip-flops in series.
Figure 24. Two D-type flip-flops in series to capture an asynchronous input
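The logical behaviour of the two-stage circuit can be sketched as below. This Python model is an illustration only: metastability is an analogue effect, so the model simply lets the caller state which value the first flip-flop settles to (the `resolved` parameter, an assumption of this sketch) when the input changes at the clock edge.

```python
class TwoFlopSynchronizer:
    """Two D-type flip-flops in series on the same clock."""

    def __init__(self):
        self.q1 = 0   # first flip-flop, captures d (external)
        self.q2 = 0   # second flip-flop, drives d (internal)

    def clock(self, d_ext, resolved=None):
        """Apply one rising clock edge and return d (internal).

        resolved=None models a clean capture. resolved=0 or 1 models a
        metastable event on the first flip-flop that has settled to the
        given value before the next clock edge.
        """
        self.q2 = self.q1                                # old q1 moves on
        self.q1 = d_ext if resolved is None else resolved
        return self.q2


clean = TwoFlopSynchronizer()
clean_trace = [clean.clock(1), clean.clock(1)]

late = TwoFlopSynchronizer()
late_trace = [late.clock(1, resolved=0), late.clock(1), late.clock(1)]
```

In the clean case the event reaches d (internal) on the second clock edge. When the first flip-flop settles to 0, the event is recognised one cycle later, and only because d (external) is still high at the next edge.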
The common characteristics of circuits of this nature are as
follows:
In order for the d (external) rising edge to cause a rising
edge on the d (internal) output, there must be at least
one clock cycle between asynchronous inputs during
which d (external) is low. This reduces the maximum
frequency for the recognition of external events to half
that of the internal clock frequency.
If the flip-flop which receives the asynchronous d
(external) rising edge settles (after a period of
metastability) into the state with q = 0, the external input
is lost unless it persists beyond the next rising clock
edge.
Metastability can be caused by a rising or a falling edge
on the d (external) input.
Asynchronous handshake circuit
A circuit of the type shown in Figure 25 can be used to
detect an asynchronous event: a rising edge on d (exter-
nal). These events must occur at longer time intervals than
two clock cycles.
The external event (d) drives the clock input of the first flip-
flop. This is the only flip-flop in the circuit which has a clock
input not driven by the system clock (clk). The d-input to
this flip-flop is tied to logic 1. It has an asynchronous reset
input driven from the system reset (r) and from the qb outputs of
the second and third flip-flops.
Figure 25. Asynchronous handshake circuit
In reset mode (r = 0), the first flip-flop is reset asynchro-
nously. This state takes two clock cycles to propagate to
the d (internal) signal. In active mode (r = 1), a rising edge
on d (external) immediately drives the q-output from the
first flip-flop high. After one rising clock edge, this propa-
gates to the q-output from the second flip-flop, and after a
second clock edge, to the d (internal) output from the third
flip-flop. At this time, the qb outputs from the second and
third flip-flops are both low. This logic level propagates
through the OR and the AND gates in the feedback loop,
forcing a reset on the first flip-flop, which is now ready to
receive another rising edge on the d (external) input. The
circuit function is illustrated in Figure 26.
Figure 26. Operation of asynchronous handshake circuit
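The sequence described above can be followed in a behavioural model. The Python sketch below is one reading of the description, not a transcription of Figure 25: it assumes that the active-low asynchronous reset of the first flip-flop is r AND (qb2 OR qb3), that the second and third flip-flops have no reset of their own, and that all delays are zero. Class and method names are illustrative.

```python
class Handshake:
    """Behavioural model of the asynchronous handshake circuit."""

    def __init__(self):
        self.q1 = 0   # first flip-flop, clocked by d (external)
        self.q2 = 0   # second flip-flop, clocked by clk
        self.q3 = 0   # third flip-flop, drives d (internal)

    def _apply_async_reset(self, r):
        """Evaluate the feedback loop; returns the active-low reset level."""
        reset_n = r & ((1 - self.q2) | (1 - self.q3))
        if not reset_n:
            self.q1 = 0
        return reset_n

    def event(self, r=1):
        """Rising edge on d (external): clocks a 1 into the first flip-flop,
        unless that flip-flop is currently held in reset."""
        if self._apply_async_reset(r):
            self.q1 = 1

    def clock(self, r=1):
        """Rising edge on clk; returns d (internal)."""
        self._apply_async_reset(r)
        self.q2, self.q3 = self.q1, self.q2   # both sample simultaneously
        self._apply_async_reset(r)            # feedback acts after the edge
        return self.q3


h = Handshake()
h.event()
d_internal = [h.clock() for _ in range(4)]
```

In this model, d (internal) goes high on the second clock edge after the event, the first flip-flop is then reset through the feedback loop, and d (internal) returns low two clock edges later. A second event arriving while the feedback reset is still asserted is not registered, which is why events must be spaced by more than two clock cycles.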
The d (internal) signal can be used as an acknowledge sig-
nal to the external system which is supplying the d (exter-
nal) inputs.
The risk of metastability is at the second flip-flop: caused
by simultaneous rising edges on the (asynchronous) q-out-
put from the first flip-flop and the system clock. If this
occurs, there are three possibilities:
The second flip-flop settles into a q = 1 state before the
next rising clock edge. This is then clocked by the third
flip-flop, and the circuit functions normally.
The second flip-flop settles into a q = 0 state before the
next rising clock edge. This causes no change to the
third flip-flop, and the feedback loop to the first flip-flop is
unaffected. Therefore the first flip-flop retains its q = 1
value to be clocked by the second flip-flop on the next
rising clock edge. The effect of this is to delay the
recognition of the asynchronous event by one clock
cycle.
The metastable state persists until the next rising clock
edge. In this case there is a possibility of the third flip-flop
entering a metastable state as well. However, the
probability of a metastable state persisting for an entire
clock cycle, and forcing the third flip-flop into a similar
state, is extremely low. This risk can be further reduced
by inserting additional flip-flops, at the expense of an
additional clock cycle as the minimum delay between
recognized inputs.
Note: Metastability can only be caused by a rising edge of the
d (external) input, whereas in the previous two circuits it
can be caused by either edge. The only restriction on
pulse width for the asynchronous handshake circuit is
the minimum pulse width of the first flip-flop.
This circuit will enter an unknown state if it receives simul-
taneous rising edges on the d (external) and reset (r) sig-
nals.
Delay Lines and Monostables
There is often an apparent requirement to create a short
pulse within a circuit, of duration less than a clock cycle.
This generally requires the use of a delay line within a
monostable element, as shown in Figure 29 below. A multi-
vibrator circuit (Figure 31) is based on a similar principle.
More generally, asynchronous circuits often rely on delay
lines for their correct operation, for example in an attempt
to overcome race conditions.
The practice of delay-line dependent circuits is not recom-
mended, as the actual timing of the delay line is difficult to
predict, and is highly sensitive to temperature and process
spread.
In particular, due to simulation model constraints it is not
permitted to short two inputs of a logic gate to the same
source signal (Figure 27). The problem is that the gate
delays are characterized with one signal changing. For a
NAND3 driven to a one (Figure 28), if two signals change
simultaneously there are two transistors pulling the output
high, instead of one. This will reduce the delay time by
about 50% compared to the simulation model.
Non-recommended Circuits
In general, any circuit which relies on delays for its opera-
tion is not recommended. All gates in series which are not
used for buffering must be considered as delay lines. Five
specific examples are given below:
NAND2 gate used as delay element
Figure 27. NAND2 gate used as a delay element
NAND3 gate with two inputs connected together
Figure 28. NAND3 gate with two inputs connected
together
Monostable pulse generator
Figure 29. Monostable pulse generator
Pulse generator using a flip-flop
Figure 30. Pulse generator using a flip-flop
Multivibrator
Figure 31. Multivibrator
Care must be taken not to create inadvertently an equiva-
lent circuit to this one, for example, in the (synchronous)
reset loop of a counter.
Recommended Circuit
If at all possible, delay-line dependent circuits should be
avoided completely. The safe alternatives are as
follows:
1. Use a higher clock speed. The best time resolution
available in a circuit is the duration of one clock cycle.
2. Use a synchronous pulse generator, as illustrated in
Figure 32 below.
Synchronous pulse generator
Figure 32. Synchronous pulse generator
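A common way of building a synchronous pulse generator is a two-flip-flop rising-edge detector. The Python sketch below assumes that structure (pulse = q of the first flip-flop AND qb of the second); the exact gate arrangement of Figure 32 is not reproduced here, so this should be read as one plausible equivalent. Names are illustrative.

```python
class SyncPulseGenerator:
    """Two flip-flops in series; the pulse is high while the first holds
    the trigger and the second has not yet seen it."""

    def __init__(self):
        self.q1 = 0
        self.q2 = 0

    def clock(self, trigger):
        """Apply one active clock edge and return the pulse output."""
        self.q2 = self.q1                  # second stage takes the old q1
        self.q1 = trigger                  # first stage samples the trigger
        return self.q1 & (1 - self.q2)     # q1 AND qb2


gen = SyncPulseGenerator()
pulse = [gen.clock(t) for t in [0, 1, 1, 1, 0, 0]]
```

However long the trigger stays high, the output is a pulse exactly one clock cycle wide, with no dependence on gate delays.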
Authorization
Delay-dependent circuitry is only accepted by Atmel when
it is accompanied by post-layout (H)Spice simulation
results of the relevant circuit elements.
Bistable Elements
Data storage elements should not be created by cross-cou-
pling NAND or NOR gates to form bistable elements. There
are a number of problems associated with bistable ele-
ments of this nature, including asynchronous operation,
unknown output states for certain input combinations, sen-
sitivity to input spikes, and the lack of timing constraint
checking in simulation.
Non-recommended Circuits
Non-recommended circuits include cross-coupled NAND or
NOR gates and RS flip-flops. These are illustrated in Figure
33, Figure 34 and Figure 35 below.
It is important to avoid the inadvertent creation of cross-
coupled NAND/NOR gates by means of feedback loops
within combinational logic.
Cross-coupled NAND gates
Figure 33. Cross-coupled NAND gates forming bistable
storage element
Cross-coupled NOR gates
Figure 34. Cross-coupled NOR gates forming bistable
storage element
RS flip-flop
Figure 35. Asynchronous RS flip-flop
Recommended Circuits
The recommended methods of overcoming the problems
listed in the previous section are as follows:
1. Use D-types with gated set/reset as required.
2. Use a latch configured as an RS flip-flop. See the
example circuit in Figure 36 below.
3. Avoid R-S races in the control of RS flip-flops.
Latch configured as RS flip-flop
Figure 36. Latch configured as RS flip-flop
RAMs/ROMs in Synchronous Circuits
The problem of interfacing RAMs and dual-port RAMs into
synchronous circuits is that they are double-edge triggered:
the address is latched on the opposite clock edge to the
data. This scheme is shown in relation to the ME and
WEbar signals used by RAM and dual-port RAM in Figure
37 below. The ROM ME signal also latches the address on
the rising edge.
Figure 37. ME and WEbar (RAM/DPRAM) timing scheme
Recommended Circuits
ME and WEbar Generation
To achieve synchronicity with the rest of the circuit, connect
the RAM or dual-port RAM ME signal to an inverted system
clock. One method of generating the WEbar signal is to use
a D-type flip-flop, with the inverted ME signal driving the
clock, and an active-high external write request (wext) driving
the d-input. The WEbar signal is taken from the qb output.
This produces the required delay of WEbar with
respect to ME. This configuration is shown in Figure 38,
and the resulting waveforms for a write cycle in Figure 39.
Figure 38. Interfacing RAM/DPRAM into a synchronous circuit
Figure 39. ME and WEbar timing scheme using flip-flop
for WEbar generation
A consequence is that the clock duty cycle needs to be
checked: the shorter phase needs to be longer than the
setup and hold times and maximum propagation delay in
the RAM, ROM, dual-port RAM and interfacing circuitry.
Avoiding Floating Outputs during Write Phase
During a write cycle, the output of a RAM/DPRAM (with
tristate outputs) is floating. The propagation of this state
can be avoided by means of the circuitry shown in Figure
40.
Figure 40. Avoiding floating RAM/DPRAM output propagation
[Figure 39 labels: CLK, ME, WEXT, WEBAR; Data and Address Ready, Write Request, Latch Address, Latch Data]
[Figure 40 labels: ADD, WEXT, DIN, ME, DOUT; RAM with ADD, WEB, ME, DI, DO]
Internal Tristates
Internal tristates for data bus access within a circuit must
be used with care, and should be avoided if possible.
Potential problems are an undriven bus (particularly at
initialization time) and conflicting bus drivers. An undriven
bus floats to an intermediate state, causing high static currents.
Non-recommended Circuit
The general configuration of a circuit which is susceptible
to problems of tristate control is shown in Figure 41 below.
Local control of tristate enables
Figure 41. Tristate bus with no central control of tristate
enables. Do not use the Hzpull cell as a memory device.
The tristate enables are controlled locally, with no means of
ensuring that there is no conflict (two drivers driving
simultaneously) and no undriven state (no driver switched on).
The Hzpull part retains the existing state of the bus, but it
cannot initialize a tristated bus, and it creates asynchronous
storage.
Recommended Circuits
1. Decode tristate control through a central control
decoder. It is recommended that the operation of
this decoder is documented by means of a truth
table or Karnaugh map.
2. Provide one driver which is activated on non-controlled
states. In particular, ensure that this driver is
active during the reset state of the circuit.
3. Do not rely on Hzpull as a memory device. Its function
is to prevent static dissipation, and it has a poor
timing check.
4. Eliminate the tristates altogether by using multiplexed
data bus lines. See Multiplexers vs tristates
on page 26.
The first three points are illustrated in Figure 42 below.
Central control of tristate enables
Figure 42. Tristate bus with central control of tristate enables and additional driver activated on non-controlled states
Note: The Hzpull part is not strictly necessary in the above
schematic. It is included for additional security during
control transitions.
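The central decoder and the additional driver can be documented as an exhaustive truth table. The Python sketch below is one possible model under stated assumptions: the 3-bit integer control code, the name `decode_enables` and the position of the default driver in the enable list are hypothetical and are not taken from Figure 42. It enumerates every control state, including reset, and checks that exactly one driver is enabled in each, so that there is neither a conflict nor an undriven bus.

```python
N_DRIVERS = 4  # data drivers E0..E3, as labelled in the figures


def decode_enables(ctrl, reset):
    """Return active-low enables [E0, E1, E2, E3, E_default] for one state.

    ctrl:  hypothetical integer select code; 0..3 pick drivers 0..3.
    reset: True while the circuit is in its reset state.
    Any other code, and the reset state, enable the default driver only.
    """
    enables = [1] * (N_DRIVERS + 1)      # 1 = driver off (active-low enables)
    if not reset and 0 <= ctrl < N_DRIVERS:
        enables[ctrl] = 0
    else:
        enables[N_DRIVERS] = 0           # driver for non-controlled states
    return enables


def truth_table(ctrl_bits=3):
    """Enumerate every (reset, ctrl) state with its decoded enables."""
    rows = []
    for reset in (False, True):
        for ctrl in range(2 ** ctrl_bits):
            rows.append((reset, ctrl, decode_enables(ctrl, reset)))
    return rows


def exactly_one_driver(rows):
    """True if no state has a conflict (two on) or an undriven bus (none on)."""
    return all(enables.count(0) == 1 for _, _, enables in rows)


if __name__ == "__main__":
    for reset, ctrl, enables in truth_table():
        print(f"reset={int(reset)} ctrl={ctrl} enables={enables}")
    print("exactly one driver per state:", exactly_one_driver(truth_table()))
```

The printed table plays the role of the truth table recommended in point 1, and the final check corresponds to points 1 and 2 taken together.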
[Figure 41 labels: E0 to E3, D0 to D3, Data Bus (1 bit), HZPULL; No central control of tristate enables]
[Figure 42 labels: E0 to E3, D0 to D3, Data Bus (1 bit), HZPULL, 0, R, CTRL, Control Decoder (Active Low), Tristate Driver Activated on Non-controlled States]
Multiplexers vs tristates
5. Preferably, multiplex data lines instead of using
tristate-driven buses. The factors to be taken into
account are as follows:
Tristates (disadvantages):
- large area
- limited buffering
- large routing load, consequently slow
Multiplexers (advantages):
- small area
- efficient routing
Note: The control decoding is the same for a tristate-driven bus
as for a multiplexed set of data lines.
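The note above can be illustrated by driving a tristate bus model and a multiplexer model from the same enables. This Python sketch is a simplified behavioural comparison under assumptions of my own: the function names, the 'Z' and 'X' markers for undriven and conflicting states, and the AND-OR form of the multiplexer are illustrative and are not taken from the guidelines.

```python
def tristate_bus(data, enables_n):
    """Resolve one bus bit from tristate drivers with active-low enables.

    Returns the driven value, 'Z' if no driver is on, or 'X' if enabled
    drivers disagree.
    """
    driving = [d for d, en_n in zip(data, enables_n) if en_n == 0]
    if not driving:
        return "Z"
    if len(set(driving)) > 1:
        return "X"
    return driving[0]


def mux(data, enables_n):
    """AND-OR multiplexer using the same active-low enables as select lines."""
    out = 0
    for d, en_n in zip(data, enables_n):
        out |= d & (1 - en_n)
    return out


if __name__ == "__main__":
    data = [1, 0, 1, 0]
    for sel in range(4):
        enables_n = [0 if i == sel else 1 for i in range(4)]
        print(sel, tristate_bus(data, enables_n), mux(data, enables_n))
```

For every one-hot enable pattern the two models give the same value. They differ only in the states the decoder is meant to exclude: with no enable active the bus model floats while the multiplexer still produces a defined output.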
Paralleling Signals
For various reasons it sometimes appears necessary to
include a wired OR or equivalent construction in a circuit, in
order to provide parallel data signals. This practice is not
recommended. The use of wired OR parts should be
avoided wherever possible.
Non-recommended Circuit
Any circuit element which makes implicit or explicit use of
the wired OR part is not recommended. An example is
shown in Figure 43 below.
Figure 43. Wired OR part used to create higher fanout
The function of this circuit may not be modeled properly,
and there are placement and routing hazards.
Recommended Circuit
Use buffers of the appropriate strength and logic
combinations which avoid the use of wired OR gates. The previous
circuit can be replaced by the following equivalent:
Figure 44. Higher-fanout buffer replacing wired OR part
[Figure 43 labels: X, Y, INV3, INV3]
[Figure 44 labels: X, Y, INV6]
Fanout
The relative fanout on any net in a circuit is the ratio of the
total load (due to driven inputs and tracking capacitance) to
the drive strength of the output driving the net. In general
the relative fanout should not exceed 12 (a process-independent
figure derived from Atmel cell characterization
data); otherwise the signals on the net are unacceptably
delayed, and edges are unacceptably slow.
The special case of fanout in clock signals is dealt with in
Clock Buffering on page 4.
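The definition of relative fanout translates directly into a calculation. The sketch below is a minimal illustration, assuming that input loads, tracking capacitance and drive strength are all expressed in the same hypothetical unit; the function names and the example figures are not taken from Atmel cell data.

```python
MAX_RELATIVE_FANOUT = 12  # limit quoted in the guidelines


def relative_fanout(input_loads, tracking_load, drive_strength):
    """Ratio of the total load on a net to the drive strength of its driver.

    input_loads:    loads of the driven inputs (hypothetical unit loads).
    tracking_load:  tracking (routing) capacitance in the same units.
    drive_strength: strength of the driving output in the same units.
    """
    if drive_strength <= 0:
        raise ValueError("drive_strength must be positive")
    return (sum(input_loads) + tracking_load) / drive_strength


def fanout_ok(input_loads, tracking_load, drive_strength):
    """True if the net stays within the recommended relative fanout."""
    fanout = relative_fanout(input_loads, tracking_load, drive_strength)
    return fanout <= MAX_RELATIVE_FANOUT


if __name__ == "__main__":
    loads = [1] * 20  # twenty unit-load inputs (example values)
    print(relative_fanout(loads, tracking_load=4, drive_strength=1))
    print(fanout_ok(loads, tracking_load=4, drive_strength=1))
    print(fanout_ok(loads, tracking_load=4, drive_strength=2))
```

In this example a unit-strength driver sees a relative fanout of 24 and fails the check, while a driver of twice the strength brings the same net to 12, which is just within the limit.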
Non-recommended Circuits
Any circuit which has excessive fanout on a data or control
signal is not recommended. An example is shown in Figure
45.
Figure 45. Excessive fanout on control signal
[Figure 45 label: Tristate Enable]
Recommended Circuits
Use geometric or tree buffering in order to reduce fanout.
Examples of each type are shown in Figure 46 and Figure
47.
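For tree buffering, the number of buffers needed at each level can be estimated from the fanout limit. The sketch below is a rough planning aid under simplifying assumptions that are not in the guidelines: every load and every buffer input counts as one unit load, every buffer and the source have unit drive strength, and tracking capacitance is ignored. The function name `tree_buffer_plan` is hypothetical.

```python
def tree_buffer_plan(n_loads, max_fanout=12):
    """Buffers needed per tree level, listed from the loads back to the source.

    Each level holds enough buffers that none drives more than max_fanout
    unit loads. An empty list means the source can drive the loads directly.
    """
    if n_loads < 1 or max_fanout < 2:
        raise ValueError("need n_loads >= 1 and max_fanout >= 2")
    levels = []
    remaining = n_loads
    while remaining > max_fanout:
        # Integer ceiling of remaining / max_fanout.
        remaining = (remaining + max_fanout - 1) // max_fanout
        levels.append(remaining)
    return levels


if __name__ == "__main__":
    for n in (12, 13, 145, 500):
        print(n, tree_buffer_plan(n))
```

For 500 unit loads this gives 42 buffers next to the loads and 4 buffers driving them, with the source driving those 4. A real design would also account for tracking capacitance and for the polarity of the buffers used.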
Figure 46. Geometric buffering on control signal
[Figure 46 labels: Tristate Enable, INV4]
Figure 47. Tree buffering on control signal
Authorization
Relative fanout affects the speed of operation of a circuit.
Given sufficient time, highly loaded nets will eventually
settle to their correct logical value.
Accordingly, maximum relative fanout may be exceeded if
no clock signals are involved, and data signals have
sufficient time margin on input to clocked elements.