I'd be receiving an LVDS clock pair @ 360 MHz, running part of the
internal logic at 360. This internal logic includes DSP48 slices (but
they need to be pipelined in the fabric, since I need more than a
48-bit 'C' input for the adder). Preliminary testing indicates that it
can go above 360 with light user intervention. One thing I'm cautious
about is that the rest of the logic runs much slower, at 90 MHz.
Initially I was thinking of using a /4 version of the clock, but Peter
Alfke's post regarding added skews due to loading differences in DCM
outputs is making me think about it carefully.
For output, I'd be using ODDR to multiplex the 360 MHz logic, to send
the data out at 360 MHz DDR (so the data can look like a 360 MHz
'clock'). Data is LVDS, and so is the forwarded LVDS clock pair @ 360
MHz. The receiving device will use both edges of the forwarded 360 MHz
clock to sample the data. Clock-to-output delay is not good, 3+ ns,
but since the clock will be forwarded and will incur effectively the
same delay as the data (other than IOB-to-IOB clock skew), as long as
I send out a 180-degree version of the internal 360 clock using ODDR,
it should be OK. Not sure what kind of SI issues there will be,
however.
I have an option of running it at 180Mhz if 360 is risky. External
device will be different. Am I playing too safe by going to 180? Will
360 be a challenge?
I'd appreciate feedback.
Questions about the clock enable:
- Is there an easy way to specify it as a clock enable, so the tool
knows about it for timing analysis, and so you don't have to specify
multi-cycle constraints for gobs of FFs?
- And how do you make the enable signal go on the global clock net?
# Group the FFs driven by the enable net, then relax their FF-to-FF
# paths to one 90 MHz period (4 x the 360 MHz period):
NET "ENABLE_NET_NAME" TNM=FFS "ENABLED_FFS";
TIMESPEC "TS1000" = FROM "ENABLED_FFS" TO "ENABLED_FFS" 11.1 ns;
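For what it's worth, the 11.1 ns figure is just one period of the 90 MHz enable rate, i.e. a 4x relaxation of the 360 MHz period. Plain arithmetic, no tool assumptions:

```python
# A clock enable asserted every 4th cycle of a 360 MHz clock gives the
# enabled FFs an effective 90 MHz update rate, so their FF-to-FF paths
# can be relaxed to one 90 MHz period.

fast_mhz = 360.0
divide   = 4
fast_period_ns    = 1e3 / fast_mhz            # ~2.78 ns
relaxed_period_ns = fast_period_ns * divide   # ~11.1 ns

print(f"360 MHz period : {fast_period_ns:.2f} ns")
print(f"relaxed (x{divide}) : {relaxed_period_ns:.1f} ns")
```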
>
> - And how do you make the enable signal go on the global clock net?
>
You ask someone from Xilinx! I've not yet started my V4 design. I just
remembered that from the marketing spiel we had.
Cheers, Syms.
See
http://www.xilinx.com/products/design_resources/conn_central/resource/ssio_resources.htm
for discussions on the 1 Gbit/s/pin Virtex-4 differential I/O, and you may
get a better feel for the margins you can get in your design.
If all your LVDS signals are at the same clock rate, the DCM skews
shouldn't be a problem, and the data could easily be registered back
into the slower domain if the skews were an issue.
Xilinx went to great lengths to make the Virtex-4 I/O capable of some pretty
high speeds without taxing the designer.
<fastgr...@yahoo.com> wrote in message
news:1115750937....@f14g2000cwb.googlegroups.com...
Altera claims their parts have a better LVDS 'eye' because they have
superior (i.e. less) pin capacitance. The capacitance gives a lot of ISI.
I've been bitten by Xilinx FPGA pin capacitance before, albeit in a slightly
different situation. I guess it's the inevitable consequence of trying to
fit every possible I/O standard onto every pin. If I'm reading the page
properly, Altera mitigate this by having different sets of pins which are
capable of different I/O standards. Separate 'Rocket I/Os' also solve this
problem.
Of course, the OP's data rate is significantly lower than the rate shown in
the link, so it should be no problem! ;-)
Cheers, Syms.
<snip>
> You might also like to look at this link.
>
http://www.altera.com/products/devices/stratix2/utilities/st2-signal_integrity.html?f=tchio&k=g3
>
> Altera claims their parts have a better LVDS 'eye' because they have
> superior (i.e. less) pin capacitance. The capacitance gives a lot of ISI.
<snip>
But doesn't the Altera information use external LVDS terminations and
monitor the waveform external to the part, rather than internal
terminations and the eye seen by the receiver? By looking outside a
package that has an embedded differential termination, isn't the data
skewed?
I am surprised at you. Their white paper clearly shows the simulation is
done with the external termination, and not the internal one.
Use the internal one, and the capacitance does not matter (do the sim
yourself if you do not believe me).
Of course, you really should use the LVDS where it is specified. If you
want 1.3 Gbs, use our MGTs .... oh, I forgot, they do not have 2-GX
parts ....
The fact that their LVDS works up to 1.3 Gbs in simulation is nice, but
can it be used in a real application on a real board?
We have our ML450 Network Board for V4 to demonstrate 1 Gbs DDR
interfaces, and it works just fine. Ask your FAE for a demo, or go
online and buy the board.
Austin
Quoting Symon's last response [1] to such a misleading,
incomplete, and inaccurate claim:
>>
>>Total bollocks
>>
And, just recently, some other Austin from Xilinx wrote [2]:
>>
>>Placing the cap at the receiver is really bad from a signal
>>integrity standpoint: it makes for a huge reflection
>>
Exactly the point I was trying to make a couple years back,
with which you disagreed so virulently. Rather than repeat my
explanation of why high input C is bad, just re-read [3].
[ I agree that if the cap is right at the receiver in a
point-point connection, all it will see is the filtered edge on
the initial transition; but ignoring the aftereffects of that
massive reflection is foolhardy. ]
>
> We have our ML450 Network Board for V4 to demonstrate 1 Gbs DDR
> interfaces, and it works just fine.
>
The last time you said that, I asked [4]:
>> Where in Xilinx's V4 documentation might one find these pictures
>>and eye diagrams, including real world vs. simulated waveforms at
>>the driver, receiver, and points in between ?
Also from that thread, I suggested some other measurements
to make on your "A vs. X" test platform:
>> Since you have that spiffy board at hand, I'd love to see
>> plots of the following:
>>
>> A) X vs. A ICCO for the "Hammer Test" at several toggle rates
>>
>> B1) X vs. A waveforms for a high speed single ended standard (xSTL)
>> B2) X vs. A ICCO for a high speed single ended standard (xSTL)
>>
>> C1) X vs. A waveforms for 1 Gbps differential LVDS
>> C2) X vs. A ICCO for 1 Gbps differential LVDS
>>
>> D) X vs. A differential TDR input waveforms into a DT termination
>> at 100, 200, 500 ps input edge rates
When might we see some published data on those, particularly items C &
D?
Brian
[1]
http://groups-beta.google.com/group/comp.arch.fpga/msg/a6252d7a3566ea3f?hl=en
[2]
http://groups-beta.google.com/group/comp.arch.fpga/msg/7ef4d001e4d8ff65?hl=en
[3]
http://groups-beta.google.com/group/comp.arch.fpga/msg/a044806f313848e6?hl=en
[4]
http://groups-beta.google.com/group/comp.arch.fpga/msg/3619e923a589ef59?hl=en
Get the ML450 board, or ask for the documentation.
Lots of scope shots are available (ask your FAE).
or, http://www.xilinx.com/publications/prod_mktg/pn0010778.pdf (page 2)
Or, go to one of our RocketLabs and measure it for yourself.
As for the reflection, the LVDS transmitter is also a 100 ohm
termination, so reflections are absorbed at the transmitter (when the
LVDS is properly done and meets the specifications, which ours do).
Anyone with an IBIS simulator can see all of the above happening, so I
really don't want to take this any further - demanding to see scope
shots of things is pretty pointless when the simulations are perfectly
good (when they are done correctly).
But, I am sure our Marketing Folks will be rolling out scope shots as
part of pitch-packs, etc. for those who are unable or unwilling to do
the SI engineering that their job requires of them.
Austin
No, you didn't give a bum steer... you were right initially. CE nets can
be put onto a global clock network. Look at the CLB switch box in FPGA
Editor again... each CE pin can be driven by a bounce pip (4 stubs in
the middle-right edge of the switch box), and these 4 bounces can all be
driven by the GCLK pips on the lower-left edge of the switch box.
There's your path.
Cheers,
Ajay Roopchansingh
Xilinx Inc
Here's a quote from National's LVDS manual.
"In a good design the connector contributes 2 pF to 3 pF, the trace
contributes 2 pF to 3 pF, and the device contributes 4 pF to 5 pF. The total
load in such a design is around 10 pF. The flexibility of programmable
devices comes at the cost of capacitance. National Bus LVDS products have an
I/O capacitance of 5 pF. The I/O capacitance of a programmable device is
approximately double or 10 pF. This increase in capacitance will lower the
loaded bus impedance, thereby reducing the available noise margin and
lowering the reliability of operation in the design."
http://www.national.com/appinfo/lvds/files/ownersmanual.pdf
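To put a number on National's "lower the loaded bus impedance" point, here's the classic distributed-loading formula, Zloaded = Z0 / sqrt(1 + Cd/C0), where Cd is the added load capacitance per unit length and C0 is the trace's intrinsic capacitance per unit length. The geometry numbers below are my assumptions for illustration, not from the National document:

```python
import math

# Effect of distributed capacitive loading on a multidrop bus:
#   Zloaded = Z0 / sqrt(1 + Cd/C0)
# Geometry values are illustrative assumptions, not datasheet numbers.

z0_ohm       = 50.0   # assumed unloaded trace impedance
c0_pf_per_in = 3.0    # assumed intrinsic trace capacitance per inch
load_pf      = 10.0   # ~10 pF per drop, per the National figure
spacing_in   = 2.0    # assumed drop-to-drop spacing on the bus

cd_pf_per_in = load_pf / spacing_in
z_loaded = z0_ohm / math.sqrt(1 + cd_pf_per_in / c0_pf_per_in)
print(f"loaded impedance ~ {z_loaded:.1f} ohms")
```

With those assumptions the loaded impedance drops to around 30 ohms, which is exactly the noise-margin erosion the quote is warning about.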
>
> Anyone with an IBIS simulator can see all of the above happening, so I
> really don't want to take this any further - demanding to see scope
> shots of things is pretty pointless when the simulations are perfectly
> good (when they are done correctly).
>
I don't want to see scope shots, I agree I want to see a simulation 'done
correctly'. I think I already have on Altera's website.
Best regards, Syms.
Make up your mind, Austin. On numerous occasions you have recommended
that people run simulations of I/O systems to see what should happen,
and you have recommended the IBIS models. To suggest that Altera does
not know how to run simulations is insulting. Enough!
Philip Freidin
===================
Philip Freidin
philip....@fpga-faq.org
Host for WWW.FPGA-FAQ.ORG
What is 12.5pF in series with 12.5pF?
Yes, that is right, 6.25pF differential load, not 12.5pF.
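Spelled out (the two single-ended pin capacitances appear in series across the differential pair):

```python
# Two capacitors in series: C = C1*C2 / (C1 + C2).
# Seen differentially, the two 12.5 pF single-ended pin capacitances
# are in series across the pair.

def series_c(c1_pf, c2_pf):
    """Capacitance of two capacitors in series, in pF."""
    return c1_pf * c2_pf / (c1_pf + c2_pf)

c_diff = series_c(12.5, 12.5)
print(f"differential load: {c_diff} pF")  # 6.25 pF
```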
Falling for the A FUD is especially embarrassing when you just repeat
things which are factually incorrect.
All these things are taken into account in the simulation.
Austin
1. They ignored the top comment lines of the IBIS model, which instruct
them how to model the package (since package modeling is incorrect in
IBIS 3.2).
2. They used an external resistor instead of the internal termination.
Run it right, or not at all.
Austin
All true.
I would suggest that you should model it as a true differential line,
rather than two single-ended 50 ohm lines, but it doesn't change
anything at all (you still end up being differentially terminated at the
receiver, with 6.25pF across 100 ohms).
The eye is plenty good for up to 1 Gbs (see the ML450).
It does not work up to 1.3 Gbs, because we didn't design it to work up
to there: that is what the MGTs are for.
If there is a 'beauty contest' for the 'best LVDS eye pattern', I will
admit we come in second (due to increased Cpin), but I will not admit
that it matters so far as use, function, or anything important is
concerned. The IDELAY feature, which allows independent skew adjustment
for each I/O pin (pair) to center the eye sampling point to within
+/-78ps, is a far more useful feature than having 'pretty eyes'.
Austin
Best wishes on getting Austin to stop with his
"but it's really half, differentially" handwaving.
I've tried before, with results similar to that
"but it goes to Eleven" bit from "Spinal Tap".
>
>a way to improve things is to drive this capacitance
>with a lower impedance
>
Also, when you've got plenty of drive margin, a differential
attenuator ahead of the FPGA (with internal termination) works
nicely to attenuate the reflection, and also makes for a convenient
differential probe point. If you have 6dB to spare, even the most
horrible of loads presents at least 12dB return loss, with the probe
seeing 1/4 the reflection voltage of the original circuit. (However,
the attenuator doesn't lower the drive impedance, as your suggestion
does.)
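To put numbers on that: a matched attenuator of A dB in front of any load improves the return loss seen looking into the pad by 2A dB, because the reflection traverses the pad twice. A quick sketch, assuming nothing beyond an ideal matched pad:

```python
# A matched pad of 'pad_db' in front of an arbitrary load improves
# return loss by 2*pad_db: the reflection passes through the pad
# twice (in and back out).

def min_return_loss_db(pad_db, load_rl_db=0.0):
    """Worst-case return loss into the pad; load_rl_db = 0 models a
    total reflection (the 'most horrible of loads')."""
    return load_rl_db + 2 * pad_db

def reflection_voltage_ratio(pad_db):
    """Fraction of the original reflection voltage seen at the probe."""
    return 10 ** (-2 * pad_db / 20)

print(min_return_loss_db(6.0))        # 12 dB minimum return loss
print(reflection_voltage_ratio(6.0))  # ~0.25, i.e. 1/4 the voltage
```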
Brian
>
>But, I am sure our Marketing Folks will be rolling out scope shots
>as part of pitch-packs, etc. for those who are unable or unwilling
>to do the SI engineering that their job requires of them.
>
Let's see if I've got this straight [1]:
A) Xilinx publicly posts in FPGA and SI forums touting their
real world X vs. A package testing, and asks for feedback [2]
B) Forum users post some suggested measurements, which a
certain Xilinx employee says they can make
C) Two months later, when asked when said measurements might
be published, the very same Xilinx employee cops an attitude
>
>Get the ML450 board, or ask for the documentation.
>
That would be the same manual (UG077 v1.2) that mentions a
HyperTransport-compliant DUT interface connector, without
pointing out that the specified V4 FPGA Cin is 5x the
allowed HyperTransport max Cin for a 1 Gbps part ???
As to why that matters: a HyperTransport test probe attempting
to monitor the input link to the FPGA can't function properly
because Cin reflections off the FPGA would prevent the probe from
properly clocking the data at the mid T-line probe sampling point.
There are ways around this, but life would be easier if Xilinx
actually bothered to meet the spec in the first place.
Lacking that, proper documentation of your part's shortcomings,
and how and when to work around them, would be appropriate.
Brian
[1] Speaking of those unable to perform the SI engineering that is
required of them : when might we expect publication of characterized
static DCI power and DCI impedance modulation limits for the five year
old Virtex2 FPGA family ?
[2]
http://groups-beta.google.com/group/comp.arch.fpga/msg/d1004ae1fdca9825?hl=en
All I am trying to point out is that the load is 6.25pF + 100 ohms, not
12.5pF + 100 ohms.
When folks wave their arms and state that 12.5pF is the LVDS load, they
are misstating it.
Simple point.
And once you do the simulations, or look at the actual waveforms, you
realize that it is mostly just a beauty contest. In communications
theory, excess bandwidth in the channel only adds to the error rate (due
to noise). Some band limiting is a good thing. Too much is a bad thing
(e.g. using the LVDS at 1.3 Gbs, where it wasn't designed to be used;
that is where our MGTs are to be used).
Austin
Sigh.
See below.
Austin
Brian Davis wrote:
> Austin,
>
>>Lots of scope shots are available (ask your FAE).
>>
>
> Then why not publish them, along with a comparison of IBIS/HSPICE
> simulations versus the real world measurements?
All I can say is that they are coming. It just takes a while. Right now
we have much more important things to do: tout our power advantage, our
static current advantage, our speed advantage, our MGT advantage, our
PPC advantage, our SI packaging breakthrough ...
Showing an IBIS simulation of a five year old interface is just not high
on our list -- too many customers use it, and are perfectly delighted
with it. We do not want to be defocused and stop pointing out the areas
where we are clearly superior.
>
>
>>But, I am sure our Marketing Folks will be rolling out scope shots
>>as part of pitch-packs, etc. for those who are unable or unwilling
>>to do the SI engineering that their job requires of them.
>>
>
>
> Let's see if I've got this straight [1]:
>
> A) Xilinx publicly posts in FPGA and SI forums touting their
> real world X vs. A package testing, and asks for feedback [2]
Sure.
>
> B) Forum users post some suggested measurements, which a
> certain Xilinx employee says they can make
I did. Yes.
>
> C) Two months later, when asked when said measurements might
> be published, the very same Xilinx employee cops an attitude
OK, so I was snippy. I am told that the measurements will be done, but
again, it isn't a high priority.
>
>
>>Get the ML450 board, or ask for the documentation.
>>
>
>
> That would be the same manual (UG077 v1.2) that mentions a
> HyperTransport compliant DUT interface connector, without
> pointing out that the specified V4 FPGA Cin is 5x the
> allowed HyperTransport max Cin for a 1 Gbps part ???
True: we are not an ASIC/ASSP. That is the one area where they win
(they can make these specs as tight as they please). But guess what?
We are growing, increasing sales, and ASICs are not. Our real
competition now is no longer other FPGA companies; it is the ASIC/ASSP
providers. We can supply features and circuits on technologies they
can't (yet). Who has 10 Gbs transceivers? Who has the lowest power
405PPC? Who has the lowest power/highest performance DSP48 blocks for
DSP applications? We do, they don't.
>
> As to why that matters: a HyperTransport test probe attempting
> to monitor the input link to the FPGA can't function properly
> because Cin reflections off the FPGA would prevent the probe from
> properly clocking the data at the mid T-line probe sampling point.
I claim that in a real system, with a compliant transmitter, there will
be sufficient return-loss matching to make the eye visible, and useful.
But I agree that in some cases, what you see is not what you get.
That can happen with a simple single-ended input pin, and is definitely
true above 1 Gbs, where observing it breaks it (often). I think that
there is a whole class of people out there who have to see it to believe
it. OK. But they should get used to the fact that none of the test
equipment is really fast enough to show them what they want to see. And
it is only getting worse.
>
> There are ways around this, but life would be easier if Xilinx
> actually bothered to meet the spec in the first place.
I already explained why we can't do that: supporting 35 I/O standards
on one pin means making some compromises.
>
> Lacking that, proper documentation of your part's shortcomings,
> and how and when to work around them, would be appropriate.
We have all that. That is what the user's guide is for. That is what
the datasheet is for. Should we place a billboard on 101 South that
states the IOB pin capacitance is ~12pF? It is already in the
datasheet. So are the MGT, PPC, DSP48, etc. What do you think we should
spend time on?
>
>
> Brian
>
>
> [1] Speaking of those unable to perform the SI engineering that is
> required of them : when might we expect publication of characterized
> static DCI power and DCI impedance modulation limits for the five year
> old Virtex2 FPGA family ?
I think all this is now covered between data sheets, user's guides, and
technical answers on our website. Let me know if there is something
missing between those three resources.
Generally speaking, if we don't specify it, then you are on your own to
use it there. For example, if you choose to set the resistance to 100
ohms, to match a 100 ohm single-ended line, we are not going to claim we
meet any standard (there isn't any), and we aren't going to spend time
characterizing all the silicon for it. I believe we state the range of
the resistance as 40 ohms to 150 ohms, but when you use it at anything
other than 50 ohms, you are required to check it out (I would run the
SPICE simulations -- you may request DCI SPICE models at impedances
other than 50 ohms; 40, 50, 68, and 75 are the ones we have, if I
recall correctly), as that is not any one of the 35 I/O standards that
we designed the IOB to support.
A small change, such as using the DCI at 68 ohms instead of 50 ohms, is
used by quite a few (to save power). You can characterize it if you
need to, and if you feel there is a benefit you can derive, but unusual
usage of a feature in an area where it was not intended to be used (not
specified) is not guaranteed.
I'm now leaning toward doing it - parts of the core @ 360, DDR data
output @ 360 (720 Mbps, effectively) along with a forwarded clock @
360. I'd be running simulations to make sure there isn't any big issue
at 720 Mbps, but since it's much lower than 1.2 Gbps, I'm optimistic.
Can't say Altera is out of the running, however. I just wanted to make
sure I could do it in some FPGA device before committing to the
interface.
Thanks again.
<snip>
>> There are ways around this, but life would be easier if Xilinx
>> actually bothered to meet the spec in the first place.
>
> Already explained why we can't do that: 35 IO standards in one pin has
> to make some compromises.
Perhaps it is time to make some pins less "jack of all trades, master
of none", and provide some with more focus ?
-jg
It is something we agonize over every time we look at a new family.
Should we add I/O-standard-specific IOBs? How many? How are they to be
organized?
What should the IO/CLB ratio be?
Or, should we continue with the present plans (if it ain't broke, don't
fix it)?
What business did we lose because we could not meet a customer's
requirement? How do we know we even lost any business at all?
We did add MGTs (and PPC's, EMAC's, DSP48's, ECC_BRAM's, FIFO_BRAM's,
etc), so it isn't like we are not looking at adding new things, or
mixing things up (the patented ASMBL architecture for example).
360 MHz, 720 Mbs DDR LVDS is now something that either X or A has
provided with their devices for over five years. One can argue the fine
points, but as a gross capability, it has been there for quite a while.
Austin
Which of the following posts regarding Cin is more helpful
for both Xilinx and its customers:
Austin [1]:
>
> Use the internal one, and the capacitance does not matter
> (do the sim yourself if you do not believe me).
>
Brian [2]:
>
> At no point have I claimed that the V2 inputs are unusable,
> but only that, in the presence of high speed drivers, extra
> engineering effort needs to be expended to both understand the
> impact of the V2 input capacitance on the interconnect, and
> find a work-around that is appropriate for the design at hand.
>
Austin wrote:
>
>When folks wave their arms and state 12.5pF is the LVDS load,
>they are miss-stating it.
>
The only I/O capacitance number published in your datasheet is a
single-ended parameter called Cin (or if you prefer, C_comp from
the IBIS files).
Quoting this published datasheet Cin value is perfectly valid,
and does not require "correction".
Comparing that number against the single ended Cin's of other
devices, or against a single ended spec, is also perfectly valid.
I have never said the differential load is 12.5 pF; it is clear
from my posts that I understand this, and also understand that
the assumption of Cdiff_effective = 1/2 Cin_single_ended applies
only to the differential components of the signals on the T-line.
I find it rather inconsistent that in past discussions of
Xilinx's newly onerous SSO limits for the current-mode output
drivers, you've been quite insistent that real-world paths are
NOT perfectly balanced.
Yet when discussing the effects of high Cin, you posit that
everything is perfectly balanced back to a perfect source
termination, so that a 50-60% voltage reflection off of your
input pins is never a problem.
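To sanity-check that 50-60% figure: the peak reflection magnitude from a shunt capacitor C in the middle of a matched line, driven by a linear edge of rise time tr, is roughly (tau/tr)*(1 - exp(-tr/tau)) with tau = Z0*C/2 (the cap sees the two line halves in parallel). The values below are my assumptions (differential Z0 = 100 ohms, Cdiff = 6.25 pF), not anyone's datasheet:

```python
import math

# Peak reflection magnitude from a shunt capacitor on a matched line,
# for a linear edge of rise time tr:
#   rho_peak = (tau/tr) * (1 - exp(-tr/tau)),  tau = (Z0/2) * C
# Values are illustrative assumptions, not datasheet numbers.

def peak_reflection(z0_ohm, c_farad, tr_sec):
    tau = (z0_ohm / 2) * c_farad
    return (tau / tr_sec) * (1 - math.exp(-tr_sec / tau))

z_diff = 100.0     # assumed differential line impedance, ohms
c_diff = 6.25e-12  # differential load (two 12.5 pF pins in series)

for tr_ps in (300, 400, 500):
    rho = peak_reflection(z_diff, c_diff, tr_ps * 1e-12)
    print(f"tr = {tr_ps} ps -> peak reflection ~ {rho:.0%}")
```

For 400-500 ps edges this lands right in the 50-60% range quoted above; faster edges make it worse.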
If only all FPGA input buffers could live happily ever after
there in Austin's world, where all connections are ideal
differential point-point links, all drivers have perfect back
terminations, and no probing or multidrops are ever allowed.
>
>In communications theory, excess bandwidth in the channel only adds
>to the error rate (due to noise). Some band limiting is a good thing.
>
And massive, coherent input reflections do not fit the AWGN
assumptions of most channel models, now do they?
Brian
p.s. As for your other post, I'll reply once I finish recovering
from a hard drive crash at home and can find my old files again.
[1]
http://groups-beta.google.com/group/comp.arch.fpga/msg/57bbb3ea78e194ed?hl=en
[2]
http://groups-beta.google.com/group/comp.arch.fpga/msg/a044806f313848e6?hl=en
Well, things are getting a little less busy with my day job, so I finally
have time to start replying again... I figured I'd start with an easy one.
> The fact that their LVDS works up to 1.3 Gbs in simulation is nice, but
> can it be used in a real application on a real board?
Yes. Stratix II has LVDS running at 1.3 Gbps reliably across process,
temperature, voltage. Beautiful eye diagrams. In simulation and on the
board. And as noted here
(http://www.altera.com/products/devices/stratix2/features/performance/st2-perf_improvements.html),
we will be increasing the spec to 1.25 Gbps in an upcoming version of
Quartus II.
BTW, our simulations line up very well with board measurements. We offer
accurate IBIS models that we proudly stand behind.
Regards,
Paul Leventis
Altera Corp.
According to our engineer who ran the sims, we did use on-chip termination
for both V4 and Stratix II. I read the whitepaper again
(http://www.altera.com/literature/wp/signal-integrity_s2-v4.pdf) and I can't
find anywhere where it says we didn't use on-chip termination.
> The fact that their LVDS works up to 1.3 Gbs in simulation is nice, but
> can it be used in a real application on a real board?
Sorry to hammer on this again, but the above mentioned whitepaper does show
some beautiful eye diagrams for SII and some ugly ones for V4. It also
shows how nicely our lab measurement (of 1.3 Gbps LVDS on Stratix II)
compares to the IBIS simulation.
That is what all of the wonderful features are for in V4 (SSIO, IDELAY,
DCM, etc.). All of the above go a long way to support the fabric. Even
though the fabric will run at 500 MHz, it is far easier to mux it down
to 200 MHz, or 100 MHz (using the built-in SSIO features), which makes
place and route easier, and also provides a lot of margin.
Just go buy the ML450 board (network interfaces), and you will get a
fully working platform to test out all of your ~ 400 MHz up to 500 MHz
DDR interfaces.
Austin
There is hard serializer/deserializer circuitry available for the
left and right LVDS I/O banks. These SERDES blocks allow you to
deserialize/serialize by any factor between 4x and 10x. For example,
you could bring in a 4x data bus running at 312.5 MHz. Or you can
bypass the SERDES block and use the DDR registers for a 2x SERDES. Or
bypass completely for 1x... but not at 1.25 Gbps. I don't know what
speed the SERDES/DDR I/O clock can or will run at when we update
this specification. I'm sure it will be published at the time.
We also have dedicated Dynamic Phase Alignment (DPA) circuitry for
source-synchronous applications. The DPA block enables you to
eliminate channel-to-channel and clock-to-channel skew. It achieves
this by selecting the best clock phase to use for each I/O pair,
centering the sampling window in the eye.
According to the data sheet, you can run the LVDS I/O up to 500 MHz in
the fastest speed grade part. That would get you 1 Gbps. More likely,
you would use the SERDES. For example, at 130 MHz and using x8
serialization, you get 1.04 Gbps per pair. Here is a link to the DPA
datasheet:
http://www.altera.com/literature/hb/stx2/stx2_sii52005.pdf
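The rate arithmetic in those examples, spelled out (the clock rates and factors come from the posts above; the 500 MHz case assumes DDR, i.e. both clock edges):

```python
# Per-pair LVDS rate arithmetic for the examples quoted above.

def serdes_rate_gbps(clock_mhz, factor):
    """Per-pair rate with an N:1 serializer clocked at clock_mhz."""
    return clock_mhz * factor / 1000.0

def ddr_rate_gbps(clock_mhz):
    """Per-pair rate using DDR registers (both clock edges)."""
    return 2 * clock_mhz / 1000.0

print(serdes_rate_gbps(312.5, 4))  # 1.25 Gbps, the 4x example
print(serdes_rate_gbps(130, 8))    # 1.04 Gbps, the 8x example
print(ddr_rate_gbps(500))          # 1.0 Gbps at the 500 MHz I/O spec
```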
John
"Paul Leventis" <paul.l...@utoronto.ca> wrote in message
news:1116463566.3...@g43g2000cwa.googlegroups.com...
Sorry for taking so long to reply.
> I want to send bits a_1, a_2, a_3, a_4 etc. on I/O LVDS_A
> I want to send bits b_1, b_2, b_3, b_4 etc. on I/O LVDS_B
> I use the serdes to do this. Can I ensure that a_n appears at (more or
> less)
> the same time as b_n? I.e. that the shift registers in the two serdes are
> aligned?
That's what the SERDES block is for. You just need to instantiate an
altlvds_rx (receiver) or altlvds_tx (transmitter) with the number of
channels you want in the link. All of the channels will share a common
PLL. Therefore, they share a common clock, and the enable pulses derived
from that clock.
And if you want to give the manual another stab ;-), I've been told that
volume 2, chapter 5 of the Stratix II handbook, "High-Speed Differential I/O
Interfaces with DPA in Stratix II Devices"
http://www.altera.com/literature/hb/stx2/stx2_sii52005.pdf is helpful.
Figures 5-2, 5-11 and 5-12 are most applicable in this case.