How does the DCM phase shifting circuitry work? Xilinx Spartan 3

Craig Yarbrough

Apr 4, 2006, 5:18:18 PM

Essentially I need to know, for any given DCM configuration, how much
the DCM outputs will shift in phase for each time I nail PSINCDEC. I'm
thinking that if I understand better how the PS part of the DCM circuit
works I can answer this for myself. I've got a case in with Xilinx but
either they're not understanding my question, or they're not sure how
to answer, or who knows. Any help would be greatly appreciated. Here's
the correspondence so far:

----------------------
Me:
I'm using some DCMs in dynamic phase-shift mode in a Spartan 3 and I'm
trying to understand how the granularity of each dynamic phase shift is
tied to the period of CLKIN. Does the phase shift feature use a fixed
tap delay? If so, the phase shift granularity would not be dependent on
the period of CLKIN. Also, after reading XAPP462 I surmised that the
longer the period of CLKIN (the slower the CLKIN frequency), the fewer
effective phase-shift steps there are. This seems counter-intuitive. Can you
help me to understand how the dynamic phase shifting is implemented in
the DCM?
-----------------
Xilinx:
Craig,
The DCM can always delay ~10ns. When your clock is slower than
~100 MHz (10ns period), you will not be able to phase shift the full 360
degrees. Additionally, for slower frequencies, taps are combined as per
XAPP462, such that you may not have 256 tap changes, but will still
have 10ns of delay to work with. It's a little confusing, but the
circuitry basically does not allow the same resolution at slow speeds
as it does at high.
Hope this clears things up.
------------------
Me:
Thanks for your very quick response. I'm beginning to understand a
little better. The goal is for us to know, for any given DCM config,
how much dynamic phase shift occurs each time we nail PSINCDEC. For a
DCM that has a max shift of 10ns and 512 steps or taps, is it safe to
say that each tap is about 20ps? I understand that for slower CLKIN
frequencies the taps could be combined to give 40ps, 60ps, etc. per
step, and thus you'd have fewer than 255 steps in either direction. Is
there a way to tell if the DCM we're using is combining taps, and how
many taps per step? Thanks!
-------------------
Xilinx:
Craig,
There will never be 512 taps; there are at most 256, and each is around
40ps. The weight of each unit of increment will depend on the
frequency of clk in. This is where XAPP462's equations come in.
Sounds like you got it from there.
------------------------
Me:
No not quite. The 512 taps/steps I got from equation 4 (pg 42), where
TCLKIN is less than 10 ns and the phase shift limits are +/-255.
However that's for fixed phase shifting. For variable phase shifting
there's 256 taps/steps when the shift limits are +/-128. Still, are you
sure there's only 256 taps in the delay line, since there's +/-255
steps available in fixed phase shifting?

Also, if CLKIN is 250MHz and I step PSINCDEC once, will the output
clock shift in phase by 40ps (or one tap delay)? What if CLKIN is
300MHz and I step once, will the output clock shift in phase by the
same 40ps, or some multiple? Here I'm still confused as to how each
unit of increment is tied to the frequency of CLKIN. Equation 9 (pg 45)
doesn't hold true.
--------------------
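
The first support reply above says the delay line covers roughly 10 ns regardless of CLKIN. Taking that figure at face value, a rough Python sketch of how much of a full cycle that span can cover might look like the following. The 10 ns constant and the function names are assumptions for illustration only; the exact limits are given by the equations in XAPP462, which are not reproduced here.

```python
# Sketch only: assumes the delay line spans a fixed ~10 ns, as stated in the
# support reply. This is not a data-sheet limit.
DELAY_RANGE_NS = 10.0


def period_ns(clkin_mhz: float) -> float:
    """CLKIN period in nanoseconds."""
    return 1000.0 / clkin_mhz


def max_shift_degrees(clkin_mhz: float,
                      delay_range_ns: float = DELAY_RANGE_NS) -> float:
    """Largest reachable phase shift, capped at one full period (360 degrees)."""
    fraction = min(1.0, delay_range_ns / period_ns(clkin_mhz))
    return 360.0 * fraction


for f in (200.0, 100.0, 50.0, 25.0):
    print(f"{f:6.1f} MHz: up to {max_shift_degrees(f):5.1f} degrees")
```

Under this assumption, clocks slower than about 100 MHz cannot be shifted through a full cycle, which matches the reply's statement.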

John_H

Apr 4, 2006, 5:54:46 PM

The DCM is covered well in the data sheets.

The Spartan3 DC & Switching data sheet specifies a tap resolution of 30 to
60 ps.

The variable phase shift is different between Spartan3 and Spartan3E. Since
you're using Spartan3, the "older" style of variable phase shift is used:
each PSINCDEC event is 1/256 of the CLKIN period, not 1 tap delay. The Spartan3
Functional data sheet has 3 paragraphs on Variable Phase Shift mode that
talk about the 1/256 increment. What might not be clear is that there is a
period after the PSINCDEC event before the next event can be applied,
providing a fundamental limit to the speed of phase adjustment the system
can attain. Those details are also mentioned in the same paragraphs.

Each PSINCDEC event *might not* result in a tap change, since a ~40 ps tap
times 256 steps corresponds to a ~10 ns period. Faster than ~100 MHz, there
will be more than one event per tap change on average. Slower than ~100 MHz
(the DCM is good to 25 MHz), you get more than one tap change per event on
average. The transition point is dependent on the actual tap delay.
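
As a rough illustration of the paragraph above, this Python sketch computes the average number of taps per PSINCDEC event and the frequency at which one event equals one tap. The function names are hypothetical, and the tap delays are simply the 30 to 60 ps data-sheet range mentioned earlier plus the ~40 ps nominal figure.

```python
STEPS_PER_PERIOD = 256  # each PSINCDEC event is 1/256 of the CLKIN period


def step_ps(clkin_mhz: float) -> float:
    """Size of one PSINCDEC event in picoseconds."""
    return 1e6 / clkin_mhz / STEPS_PER_PERIOD


def taps_per_event(clkin_mhz: float, tap_ps: float) -> float:
    """Average number of delay-line taps moved per PSINCDEC event."""
    return step_ps(clkin_mhz) / tap_ps


def crossover_mhz(tap_ps: float) -> float:
    """CLKIN frequency at which one event equals exactly one tap."""
    return 1e6 / (STEPS_PER_PERIOD * tap_ps)


for tap in (30.0, 40.0, 60.0):
    print(f"tap = {tap:.0f} ps: crossover at {crossover_mhz(tap):.1f} MHz, "
          f"{taps_per_event(25.0, tap):.2f} taps/event at 25 MHz")
```

Above the crossover frequency the result of taps_per_event drops below 1, meaning some events produce no tap change at all.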

Happy reading!

"Craig Yarbrough" <hyar...@harris.com> wrote in message
news:1144185498....@e56g2000cwe.googlegroups.com...

Jim Granville

Apr 4, 2006, 5:57:03 PM

Craig Yarbrough wrote:
> Essentially I need to know, for any given DCM configuration, how much
> the DCM outputs will shift in phase for each time I nail PSINCDEC. I'm
> thinking that if I understand better how the PS part of the DCM circuit
> works I can answer this for myself. I've got a case in with Xilinx but
> either they're not understanding my question, or they're not sure how
> to answer, or who knows. Any help would be greatly appreciated. Here's
> the correspondence so far:

Peter A. can probably help.
It sounds like you are looking for a simple controlled delay line,
with predictable behaviour ?

The DCM is a lot more than that: when you see signals like 'locked'
and PSDONE, and mention of negative delays, then there is more
'under the hood' than a simple delay line.

Thus you are likely to see jitter, but ISTR Peter A. has mentioned
there are ways to 'dumb down' the DCM, to a simpler subset, but more
predictable operation.

ie your challenge is likely to be (somehow) turning off the features
you do not need :)

-jg

Austin Lesea

Apr 4, 2006, 5:58:01 PM

Craig,

The delay line has many taps, but the phase shifter has only 256
possible settings (from 0/256 to 255/256 of one period).

So, if you increment, or decrement, you will phase shift by 1/256 of a
period, or by nothing at all (if 1/256 of a period is less than one tap
of the physical delay line).

Take a "simple" case of 39.063 MHz (25.6 ns period):

1/256 of 25.6 ns = 100 ps

Each increment or decrement will phase shift by 100 ps (or the nearest
tap granularity to the desired phase).

Austin

Craig Yarbrough

Apr 4, 2006, 6:31:59 PM

Thanks for the responses. That clears things up quite a bit. One
followup question, is the ratio of tap increment/decrement to the CLKIN
frequency fixed at DCM creation, or is it dynamic? If during normal
operation I increase CLKIN from 39.063MHz to 51.723MHz will the ratio
remain the same? Or will I see a proportionally larger phase shift with
the faster clock?

- Craig

Steve Knapp (Xilinx Spartan-3 Generation FPGAs)

Apr 4, 2006, 7:29:43 PM

For Spartan-3 FPGAs, the VARIABLE phase shift is _always_ a fraction
(1/256th) of the input clock period--the equivalent of about 1.4
degrees or pi/128 radians. In Spartan-3 FPGAs, the DCM logic converts
this value to the appropriate number of tap delays, with each tap being
between 30 to 60 ps.

Assume a 166.667 MHz clock, which has a 6 ns clock period. Each
PSINCDEC increment/decrement step is 6/256 ns = ~23 ps, less than a tap
delay. The DCM control logic will decide whether or not to actually
shift when the shift value falls below the tap resolution.

Or, let's take your specific example.

If CLKIN is 39.063 MHz, then the clock period is ~26 ns. Each PSINCDEC
value is 26/256 ns = ~102 ps.

If you changed the input clock to 51.723 MHz (please reset the DCM when
changing input frequencies), then the clock period shrinks to
~19.33 ns. Now each PSINCDEC value is 19.33/256 ns = ~75.5 ps. On
Spartan-3, the size of the PSINCDEC value changes according to the
input clock frequency. Spartan-3E FPGAs behave differently.
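
To make the frequency-change case concrete, here is a small Python sketch of the accumulated shift after a number of PSINCDEC steps at the two frequencies discussed. The function name is hypothetical, and the sketch ignores tap quantization, so it shows the requested shift rather than the exact tap the DCM settles on.

```python
def shift_after(n_steps: int, clkin_mhz: float):
    """Return (time shift in ps, phase shift in degrees) after n_steps."""
    period_ps = 1e6 / clkin_mhz
    fraction = n_steps / 256.0          # each step is 1/256 of a period
    return fraction * period_ps, fraction * 360.0


for f in (39.063, 51.723):
    t_ps, deg = shift_after(64, f)
    print(f"{f} MHz: 64 steps = {t_ps:.0f} ps = {deg:.1f} degrees")
```

The phase angle per step stays the same at both frequencies, while the shift measured in time shrinks with the shorter period.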

Did this sufficiently answer your question?

---------------------------------
Steven K. Knapp
Applications Manager, Xilinx Inc.
General Products Division
Spartan-3/-3E FPGAs
http://www.xilinx.com/spartan3e
---------------------------------
The Spartan(tm)-3 Generation: The World's Lowest-Cost FPGAs.

Craig Yarbrough

Apr 4, 2006, 7:52:35 PM

Perfect Steve thanks. Thanks everyone!

- Craig

Jim Granville

Apr 4, 2006, 8:43:04 PM

Would it be correct to also add this ?
- Chooses the nearest physical [~40 ps] tap to the desired delay (N x
102ps, or N x 75.5ps)
- Saturates at nom 10ns

>Spartan-3E FPGAs behave differently.

Whilst we are on this subject, to this detail,
can you give some info on how does Spartan 3E differ, and why ?

-jg

Austin Lesea

Apr 4, 2006, 9:43:20 PM

Jim,

Yes, nearest tap to actual value, and as for how Spartan stuff works, I
have to defer to those engineers (as they did things differently).

Austin

Austin Lesea

Apr 4, 2006, 9:41:54 PM

Craig,

Remains the same.

Austin

PeterC

Apr 5, 2006, 12:02:31 AM

A somewhat related question: I have simulated (behavioural) a Spartan 3
DCM with the PHASE_SHIFT parameter initialized to zero. I am using the
DCM in Variable phase shift mode. Input frequency is 100 MHz.

However, the behavioural simulation shows that once the core asserts
LOCKED_OUT high, there is still a significant (approx. 3 ns) phase
shift between the input clock and clk0_out.

Is this clk_in to clk0_out phase relationship shown in the behavioural
simulation accurate?

(Apologies for tagging onto the end of this thread, I wanted to get an
answer efficiently and a few of the DCM gurus are likely to still be
looking at this thread.)

Allan Herriman

Apr 5, 2006, 1:24:52 AM

It's meant to make the phase of the feedback input (clkfb) the same as
the clock input (clkin). If you have an external (to the DCM) delay
between the clk0 output and the feedback input, you will see a phase
shift between clkin and clk0.
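
A minimal Python sketch of that relationship, assuming an ideal DLL that aligns clkfb exactly with clkin and a purely hypothetical feedback-path delay: if clkfb is clk0 delayed by some amount, then clk0 must lead clkin by that same amount. The function name and the example numbers are illustrative only and are not taken from any simulation model.

```python
def clk0_offset_ns(feedback_delay_ns: float, period_ns: float) -> float:
    """Time from a clkin rising edge to the next clk0 rising edge.

    Assumes the DLL has locked so that clkfb edges coincide with clkin
    edges, and that clkfb is clk0 delayed by feedback_delay_ns.
    """
    return (-feedback_delay_ns) % period_ns


# Example: 100 MHz clock (10 ns period) with a hypothetical 1.5 ns delay
# between clk0 and clkfb. clk0 leads clkin by 1.5 ns, which on a waveform
# viewer looks like clk0 lagging the previous clkin edge by 8.5 ns.
print(clk0_offset_ns(1.5, 10.0))
```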

Does this match your configuration?

Regards,
Allan

PeterC

Apr 5, 2006, 1:46:31 AM

Thanks Allan.

That makes sense. I did not have the clkfb input enabled as I am not
trying to compensate for skew. I have just added it to the core,
re-compiled and re-simulated and there is still the same phase shift
between clkin and clk0 (=clkfb). Admittedly I do not reset the core (as
is recommended following the inclusion of clkfb).

Must I reset the core?

Obviously I am still missing something or there is a discrepancy
between the behavioural model of the DCM and its real operation - but I
very much doubt that this is the case given the maturity of the S3
device.

Any other hints?

Brian Davis

Apr 5, 2006, 7:49:18 AM

PeterC wrote:
>
>Is this clk_in to clk0_out phase relationship shown in the behavioural
>simulation accurate?
>

Allan Herriman replied:

>
> It's meant to make the phase of the feedback input (clkfb) the same as
> the clock input (clkin)
>

One caution: the default DCM configuration inserts an intentional
delay in the DCM feedback path, making the clkfb LEAD the input
clock by about 1.5 ns in the V2 family (not sure about S3 numbers).

This is done to ensure zero hold at IOB inputs in the default
SYSTEM_SYNCHRONOUS mode.

This also used to cause 'clock creep' in cascaded DCMs, but the
latest S/W might set the DCM to SOURCE_SYNCHRONOUS if it
sees a cascade.

The V2 delay element is described nicely in XAPP259 v1.0 (pp 4-5);
however, the equivalent documentation for S3, XAPP462 v1.1 (pp 32-34),
is horribly confused by the notion that delaying feedback makes the
output clock happen earlier.

Steve Knapp (Xilinx Spartan-3 Generation FPGAs)

Apr 5, 2006, 1:31:48 PM

Jim Granville wrote:
> Steve Knapp (Xilinx Spartan-3 Generation FPGAs) wrote:

[ ... snip ...]

> >Spartan-3E FPGAs behave differently.
>
> Whilst we are on this subject, to this detail,
> can you give some info on how does Spartan 3E differ, and why ?
>
> -jg

The only difference is in the DLL phase shifter feature included with
the DCM. Most everything else is identical between Spartan-3 and
Spartan-3E DCMs.

There's a summary of the differences in the following Answer Record,
but I'll follow up here with the abbreviated version.
http://www.xilinx.com/xlnx/xil_ans_display.jsp?getPagePath=23004

In FIXED phase shift mode, the difference depends on which version of ISE
you are using, as described in the data sheet and the Answer
Record. Physically, the Spartan-3 DLL performs a fixed phase shift by
as much as a full clock cycle forward or backward. The Spartan-3E DLL
performs a fixed phase shift by as much as _half_ a clock cycle forward
or backward. For nearly all applications, the Spartan-3E half-clock
shift provides the same flexibility as the full clock shift, but with
significantly less silicon.

In VARIABLE phase shift mode, the difference is that the Spartan-3 DLL
performs a variable phase shift in fractions of a clock period, 1/256th
of a full circle. Think degrees, angles, radians, using your favorite
angular unit. Extra logic within the Spartan-3 DLL calculates the
delay line change. The Spartan-3E DLL also performs a variable phase
shift using a delay line. However, in Spartan-3E, you have raw control
over the delay. The shift is always in time, not in some angular unit.
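
To illustrate the distinction, here is a small Python sketch (not from the data sheet). On Spartan-3 one step is a fixed angle, so its duration scales with the CLKIN period; on Spartan-3E one step is a fixed time, so its angle scales with the CLKIN frequency. The function names are hypothetical, and the 40 ps used for the Spartan-3E step is only a placeholder assumption, since the actual value is not given in this thread.

```python
ASSUMED_S3E_STEP_PS = 40.0  # placeholder, not a documented Spartan-3E figure


def s3_step(clkin_mhz: float) -> tuple:
    """Spartan-3 style: fixed angle per step. Returns (ps, degrees)."""
    period_ps = 1e6 / clkin_mhz
    return period_ps / 256.0, 360.0 / 256.0


def s3e_step(clkin_mhz: float, step_ps: float = ASSUMED_S3E_STEP_PS) -> tuple:
    """Spartan-3E style: fixed time per step. Returns (ps, degrees)."""
    period_ps = 1e6 / clkin_mhz
    return step_ps, 360.0 * step_ps / period_ps


for f in (50.0, 100.0, 200.0):
    print(f"{f:5.1f} MHz  S3: {s3_step(f)}  S3E (assumed step): {s3e_step(f)}")
```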

Jim Granville

Apr 5, 2006, 5:33:11 PM

Thanks,
When you say time for 3E, do you mean 'calibrated time', or some
multiple of a ~30-60ps delay chain ?
I can see that the extra logic in the -3, (should?) track temp/Vcc
changes - or does it grab the multiplier only when the DCM is reset ?

How does the -3E manage temp/vcc/process variations, or does the
user do that ?

-jg

Austin Lesea

Apr 5, 2006, 5:54:52 PM

Jim,

The tap state machine is always trying to keep one entire period in one
of the delay lines. This way, the unit is always self calibrating, it
always "knows" how many taps equals one period.

So when you ask for 23/256 of a period shift, the arithmetic unit solves
for the closest tap (truncating).
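
A minimal Python sketch of that arithmetic, under the assumption of a uniform tap delay and a perfectly calibrated taps-per-period count. The real circuit is described only in the patents mentioned below, so the function name and the 40 ps tap value here are purely illustrative.

```python
def tap_setting(n_steps: int, period_ps: float, tap_ps: float) -> int:
    """Tap index for a requested shift of n_steps/256 of a period.

    taps_per_period stands in for what the calibration loop "knows";
    int() truncates toward zero, mirroring the truncation described.
    """
    taps_per_period = period_ps / tap_ps
    return int(n_steps * taps_per_period / 256)


# 166.667 MHz example (6 ns period) with an assumed 40 ps tap:
# consecutive PSINCDEC steps do not always move to a new tap.
print([tap_setting(n, 6000.0, 40.0) for n in range(9)])
print(tap_setting(23, 6000.0, 40.0))  # the 23/256 request from the text
```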

To make the silicon take up less space, the delay line itself is
optimized to not change over the PVT (as much as it would otherwise).

How this is done is the subject of issued patents, so for those that are
curious, they can look these up.

Austin

Jim Granville

Apr 5, 2006, 6:50:02 PM

Austin Lesea wrote:
> Jim,
>
> The tap state machine is always trying to keep one entire period in one
> of the delay lines. This way, the unit is always self calibrating, it
> always "knows" how many taps equals one period.
>
> So when you ask for 23/256 of a period shift, the arithmetic unit solves
> for the closest tap (truncating).
>
> To make the silicon take up less space, the delay line itself is
> optimized to not change over the PVT (as much as it would otherwise).
>
> How this is done is the subject of issued patents, so for those that are
> curious, they can look these up.

Yes, I can follow that for the -3, but Steve was suggesting the 3E was
slightly different, so I was wanting to clarify the details.

-jg

PeterC

Apr 5, 2006, 7:07:44 PM

Thank you Brian - pointers to answer records and the past thread
greatly appreciated.

Austin Lesea

Apr 6, 2006, 10:57:52 AM

Jim,

I do not think it is any different in this regard, but Steve will
correct me if I am wrong,

Austin

Brian Davis

Apr 6, 2006, 12:14:17 PM

PeterC wrote:
> Thank you Brian - pointers to answer records and the past thread
> greatly appreciated.

Sure; I can't find my folder of DCM simulation notes right now, but
searching the Xilinx Answer Records for "DCM" or "DCM simulation"
will turn up a boatload of the DCM simulation quirks; I've listed some
more of them below.

You may also want to try running a post-PAR timing simulation to see
what the DCM delay looks like with the back-annotated delays.

11067 SimPrim - ModelSim Simulations: Input and Output clocks
of the DCM and CLKDLL models do not appear to be de-skewed

13213 UniSim, SimPrim, Simulation - How do I simulate the DCM
without connecting the CLK Feedback (CLKFB) port? (VHDL)

11344 UniSim - Variables passed to GENERICs in functional simulation
are not working properly (VHDL)

18390 7.1i Timing Analyzer/TRACE - Changing the DESKEW_ADJUST
parameter does not affect the DCM value (Tdcmino)

20845 6.3i UniSim, Simulation- There is a Delta-cycle difference
between clk0 and clk2x in the DCM model

22064 7.1i UniSim, Simulation - There is a Delta-cycle difference
between CLK0 and CLKDV in the DCM model

6362 UniSim, SimPrim, Simulation - When I simulate a DCM or CLKDLL,
the LOCKED signal does not activate unless simulation is run in
ps time resolution

18115 8.1i/7.1i Simulation - DCM outputs are "0" and the DCM does not
lock (UniSim and SimPrim VHDL models) (DCM reset requirement)

19005 Virtex-II/Virtex-II Pro, Clocking Wizard - The LOCKED signal
does not go high for cascaded DCM when CLKDV is used

have fun,
Brian

PeterC

Apr 9, 2006, 8:20:41 PM

Another point regarding the latency of the dynamic phase shifting - the
data sheet states:

"The phase adjustment may require as many as 100 CLKIN cycles plus
three PSCLK cycles to take effect, at which point the output PSDONE
goes High for one PSCLK cycle."

In reality, what does "may require" mean?

Is there anything that can be done (e.g., through CLKIN or PSCLK
frequency selection) to reduce this 100 CLKIN cycles?

Does this "100 CLKIN cycles" vary between devices, with Vcc, temp?

I would like to avoid experiments with actual devices during my design
phase.
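
For planning purposes, the data-sheet sentence quoted above can at least be turned into an upper bound. This Python sketch takes the "100 CLKIN cycles plus three PSCLK cycles" figure at face value; the function names and the example frequencies are assumptions, and the sketch says nothing about the typical latency or its variation.

```python
def worst_case_latency_ns(clkin_mhz: float, psclk_mhz: float,
                          clkin_cycles: int = 100,
                          psclk_cycles: int = 3) -> float:
    """Upper bound on the time from a phase-shift request to PSDONE."""
    return (clkin_cycles * 1000.0 / clkin_mhz
            + psclk_cycles * 1000.0 / psclk_mhz)


def worst_case_sweep_us(n_steps: int, clkin_mhz: float,
                        psclk_mhz: float) -> float:
    """Upper bound on the time to apply n_steps back-to-back shifts."""
    return n_steps * worst_case_latency_ns(clkin_mhz, psclk_mhz) / 1000.0


# Example: 100 MHz CLKIN with a hypothetical 50 MHz PSCLK.
print(worst_case_latency_ns(100.0, 50.0), "ns per step")
print(worst_case_sweep_us(256, 100.0, 50.0), "us for 256 steps")
```

With these inputs the CLKIN term dominates the bound, so in this model raising PSCLK alone changes the worst case only slightly.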