Reply by edmoore August 5, 2013
Xilinx XAPP224, which describes data recovery using 4x oversampling, might be useful to the OP.
Reply by RCIngham August 5, 2013
>On 8/2/2013 6:35 AM, RCIngham wrote:
>> [snip]
>
>I don't think the frame length is the key parameter, rather it is the 6
>zero, one insertion that guarantees a transition every 7 bits.
>
>--
>
>Rick
>
Simulation or other experiment will indicate which of us (if either) is correct.

---------------------------------------
Posted through http://www.FPGARelated.com
Reply by Richard Damon August 2, 2013
On 8/2/13 6:30 AM, alb wrote:
> Hi Richard,
>
> On 02/08/2013 06:22, Richard Damon wrote:
> []
>>> The bit-stuffing in long sequences of zeroes is almost certainly there to
>>> facilitate a conventional clock recovery method, which I am proposing not
>>> using PROVIDED THAT the clocks at each end are within a sufficiently tight
>>> tolerance. Detect the ones in the as-sent stream first, then decide which
>>> are due to bit-stuffing, and remove them.
>>>
>>> Deciding how tight a tolerance is 'sufficiently tight' is probably
>>> non-trivial, so I won't be doing it for free.
>>
>> Since a 4x clock allows for a 25% data period correction, and we will
>> get an opportunity to do so every 7 data periods, we can tolerate about
>> a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
>> need to know details like jitter and sampling apertures, but this gives
>> us a good ball-park figure). Higher sampling rates can about double
>> this, the key is we need to be able to know which direction the error is
>> in, so we need to be less than a 50% of a data period error including
>> the variation within a sample clock.
>
> According to your math it looks like a 2x clock allows for a 50% data
> period correction and therefore a 50/7 ~6% error in clock frequency,
> which seems to me quite counter intuitive... Am I missing something?
>
The details are that for an Nx sampling clock, every time you see a transition, you can shift by up to N/2-1 high-speed clock cycles per adjustment. For example, with a 16x clock, you can correct for the edge being between -7 and +7 sampling clocks from the expected point. If it is 8 clocks off, you don't know if it should be +8 or -8, so you are in trouble. If N is odd, you can handle up to (N-1)/2 cycles. (Note that this assumes negligible jitter.)

So our final allowable shift in data clocks is (N/2-1)/N, which can also be written as 1/2-1/N. Spread over the 7 data periods between guaranteed transitions, that leads to my 6% for N large (50% correction; 50/7 is closer to 7%) and 3% for N=4 (25/7 is about 3.6%). For N=2 this gives us 0%.
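As a rough check of the arithmetic above, here is a minimal Python sketch. The function names and the `max_run=7` default (the longest gap between transitions claimed in this thread) are assumptions made for illustration, and jitter is ignored, as in the post.

```python
from fractions import Fraction


def max_phase_correction(n):
    """Largest unambiguous correction per edge, as a fraction of a bit period.

    For even n this is (n/2 - 1)/n; for odd n it is ((n - 1)/2)/n.
    """
    if n < 2:
        raise ValueError("need at least a 2x sampling clock")
    steps = (n - 1) // 2  # equals n/2 - 1 for even n, (n - 1)/2 for odd n
    return Fraction(steps, n)


def clock_tolerance(n, max_run=7):
    """Ballpark relative clock error that one correction per max_run bits can absorb."""
    return max_phase_correction(n) / max_run


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"{n}x: {float(clock_tolerance(n)):.2%}")
```

With these assumptions a 2x clock yields no margin at all, a 4x clock yields 1/28 (about 3.6%), and the figure climbs towards 1/14 (about 7%) as N grows.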
> []
>> The protocol, with its 3 one preamble, does seem like there may have
>> been some effort to enable the use of a PLL to generate the data
>> sampling clock, which may have been the original method. This does have
>> the advantage that the data clock out of the sampler is more regular (not
>> having the sudden jumps from the resynchronizing), and getting a
>> burst of 1s helps the PLL to get a bit more centered on the data. My
>> experience though is that with FPGAs (as would be on topic for this
>> group), this sort of PLL synchronism is not normally used, but
>> oversampling clocks with phase correction is fairly standard.
>
> This is indeed what I'm looking for, oversampling (4x or 8x) and phase
> correct.
>
Reply by alb August 2, 2013
On 02/08/2013 16:16, rickman wrote:
[]
>> It is an All Digital PLL:
>>
>> http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf
>>
>> all the elements of a PLL are implemented in the digital domain.
>
> I guess I wasn't aware that a digital PLL wasn't *all* digital. That is
> what I have been referring to as digital.

You might find this article interesting:

http://www.silabs.com/Support%20Documents/TechnicalDocs/AN575.pdf
Reply by rickman August 2, 2013
On 8/2/2013 6:35 AM, RCIngham wrote:
> [snip]
>
> Some form of clock recovery is essential for continuous ('synchronous')
> data streams. It is not required for 'sufficiently short' asynchronous data
> bursts, the classic example of which is RS-232. What I am suggesting is
> that the OP determines - using simulation - whether these frames are too
> long given the relative clock tolerances for a system design without clock
> recovery.
>
> As I previously noted, this is first a 'system design' problem. Only after
> that has been completed does it become an 'FPGA design' problem.
I don't think the frame length is the key parameter; rather, it is the insertion of a one after every 6 zeros, which guarantees a transition at least every 7 bits.

--

Rick
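For readers who want to experiment with the stuffing rule mentioned here, the following is a minimal Python sketch. It assumes the rule is exactly "insert a one after every run of 6 zeros", as described in this thread; the function names `stuff` and `destuff` are made up for illustration and are not taken from any protocol specification.

```python
def stuff(bits, run=6):
    """Insert a 1 after every `run` consecutive 0s."""
    out, zeros = [], 0
    for b in bits:
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
        if zeros == run:
            out.append(1)  # stuffed bit forces a transition
            zeros = 0
    return out


def destuff(bits, run=6):
    """Remove the 1 that follows every `run` consecutive 0s."""
    out, zeros, skip = [], 0, False
    for b in bits:
        if skip:
            if b != 1:
                raise ValueError("expected a stuffed 1 after a run of zeros")
            skip = False
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
        if zeros == run:
            skip = True  # the next received bit is stuffing, not data
            zeros = 0
    return out


if __name__ == "__main__":
    header = [0] * 6 + [1] * 10            # "0000001111111111"
    frame = header + [0] * (16 * 1023)     # worst case discussed in the thread
    print(len(stuff(frame)))               # line symbols after stuffing
```

Note that this rule only bounds runs of zeros; a run of ones (such as the ten ones in the header) is not shortened by it.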
Reply by rickman August 2, 2013
On 8/2/2013 3:49 AM, alb wrote:
> On 02/08/2013 06:19, rickman wrote:
> []
>>>
>>> This is what I meant indeed. I believe I confused DPLL with ADPLL...
>>
>> I am not familiar with ADPLL. What is that?
>
> It is an All Digital PLL:
>
> http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf
>
> all the elements of a PLL are implemented in the digital domain.

I guess I wasn't aware that a digital PLL wasn't *all* digital. That is what I have been referring to as digital.

--

Rick
Reply by RCIngham August 2, 2013
>On 8/1/13 5:56 AM, RCIngham wrote:
>>> On 7/31/13 9:36 AM, RCIngham wrote:
>>>> [snip]
>>>>
>>>> Unless 'length' is limited, your worst case has header "0000001111111111"
>>>> (with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
>>>> have 2728 ones stuffed into them. Total line packet length is 19113
>>>> symbols. If the clocks are within 1/19114 of each other, the same number of
>>>> symbols will be received as sent, ASSUMING no jitter. You can't assume
>>>> that, but if there is 'not much' jitter then perhaps 1/100k will be good
>>>> enough for relative drift to not need to be corrected for.
>>>>
>>>> So, for version 1, use the 'sync' to establish the start of frame and the
>>>> sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
>>>> see whether it works.
>>>>
>>>> BTW, this is off-topic for C.A.F., as it is a system design problem not
>>>> related to the implementation method.
>>>
>>> Since you can resynchronize your sampling clock on each transition
>>> received, you only need to "hold lock" for the maximum time between
>>> transitions, which is 7 bit times. This would mean that if you have a
>>> nominal 4x clock, some sample points will be only 3 clocks apart (if you
>>> are slow) or some will be 5 clocks apart (if you are fast), while most
>>> will be 4 clocks apart. This is the reason for the 1 bit stuffing.
>>
>> The bit-stuffing in long sequences of zeroes is almost certainly there to
>> facilitate a conventional clock recovery method, which I am proposing not
>> using PROVIDED THAT the clocks at each end are within a sufficiently tight
>> tolerance. Detect the ones in the as-sent stream first, then decide which
>> are due to bit-stuffing, and remove them.
>>
>> Deciding how tight a tolerance is 'sufficiently tight' is probably
>> non-trivial, so I won't be doing it for free.
>
>Since a 4x clock allows for a 25% data period correction, and we will
>get an opportunity to do so every 7 data periods, we can tolerate about
>a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
>need to know details like jitter and sampling apertures, but this gives
>us a good ball-park figure). Higher sampling rates can about double
>this, the key is we need to be able to know which direction the error is
>in, so we need to be less than a 50% of a data period error including
>the variation within a sample clock.
>
>To try to gather the data without resynchronizing VASTLY decreases your
>tolerance for clock errors as you need to stay within a clock cycle over
>the entire message.
>
>The protocol, with its 3 one preamble, does seem like there may have
>been some effort to enable the use of a PLL to generate the data
>sampling clock, which may have been the original method. This does have
>the advantage that the data clock out of the sampler is more regular (not
>having the sudden jumps from the resynchronizing), and getting a
>burst of 1s helps the PLL to get a bit more centered on the data. My
>experience though is that with FPGAs (as would be on topic for this
>group), this sort of PLL synchronism is not normally used, but
>oversampling clocks with phase correction is fairly standard.
>
Some form of clock recovery is essential for continuous ('synchronous') data streams. It is not required for 'sufficiently short' asynchronous data bursts, the classic example of which is RS-232. What I am suggesting is that the OP determine, using simulation, whether these frames are too long, given the relative clock tolerances, for a system design without clock recovery.

As I previously noted, this is first a 'system design' problem. Only after that has been completed does it become an 'FPGA design' problem.

---------------------------------------
Posted through http://www.FPGARelated.com
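Before running a full simulation, the "is the frame too long?" question can be estimated with a few lines of Python. This is only a sketch: the frame parameters are the worst-case figures quoted above (16-bit header with one stuffed bit, 1023 words of 16 zero bits, a one stuffed after every 6 zeros), the function names are hypothetical, and jitter is ignored.

```python
from fractions import Fraction


def worst_case_symbols(header_bits=16, max_words=1023, word_bits=16, zero_run=6):
    """Line symbols in the worst-case frame described in the quoted post."""
    payload = max_words * word_bits       # 16368 zeros
    stuffed_header = 1                    # "0000001111111111" takes one stuffed bit
    stuffed_payload = payload // zero_run
    return header_bits + stuffed_header + payload + stuffed_payload


def drift_in_symbols(n_symbols, rel_clock_error):
    """Accumulated timing error, in symbol periods, with no resynchronization."""
    return n_symbols * rel_clock_error


if __name__ == "__main__":
    n = worst_case_symbols()
    for err in (Fraction(1, 19114), Fraction(1, 100_000), Fraction(3, 100)):
        print(f"error {float(err):.6f}: drift {float(drift_in_symbols(n, err)):.3f} symbols")
```

A drift below one symbol only guarantees the symbol count; keeping the sampling point inside the data eye needs a margin that depends on jitter and sampling aperture, which is what the proposed simulation would have to establish.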
Reply by alb August 2, 2013
Hi Richard,

On 02/08/2013 06:22, Richard Damon wrote:
[]
>> The bit-stuffing in long sequences of zeroes is almost certainly there to
>> facilitate a conventional clock recovery method, which I am proposing not
>> using PROVIDED THAT the clocks at each end are within a sufficiently tight
>> tolerance. Detect the ones in the as-sent stream first, then decide which
>> are due to bit-stuffing, and remove them.
>>
>> Deciding how tight a tolerance is 'sufficiently tight' is probably
>> non-trivial, so I won't be doing it for free.
>
> Since a 4x clock allows for a 25% data period correction, and we will
> get an opportunity to do so every 7 data periods, we can tolerate about
> a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
> need to know details like jitter and sampling apertures, but this gives
> us a good ball-park figure). Higher sampling rates can about double
> this, the key is we need to be able to know which direction the error is
> in, so we need to be less than a 50% of a data period error including
> the variation within a sample clock.
According to your math it looks like a 2x clock allows for a 50% data period correction and therefore a 50/7 ~6% error in clock frequency, which seems to me quite counterintuitive... Am I missing something?

[]
> The protocol, with its 3 one preamble, does seem like there may have
> been some effort to enable the use of a PLL to generate the data
> sampling clock, which may have been the original method. This does have
> the advantage that the data clock out of the sampler is more regular (not
> having the sudden jumps from the resynchronizing), and getting a
> burst of 1s helps the PLL to get a bit more centered on the data. My
> experience though is that with FPGAs (as would be on topic for this
> group), this sort of PLL synchronism is not normally used, but
> oversampling clocks with phase correction is fairly standard.
This is indeed what I'm looking for: oversampling (4x or 8x) with phase correction.
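To make the idea concrete, here is a small behavioural model in Python of oversampling with phase correction. It is a sketch under stated assumptions, not the XAPP224 circuit or anyone's actual design: the line is ideal NRZ with no jitter, the receiver restarts its phase counter on every transition and samples at mid-bit, and the names `oversample` and `recover` are invented for this example.

```python
import math
from fractions import Fraction


def oversample(bits, samples_per_bit):
    """Receiver's view of the line: receiver-clock samples of the transmitted bits.

    samples_per_bit is the nominal oversampling factor scaled by the clock
    ratio, e.g. Fraction(412, 100) for a 4x receiver clock running 3% fast.
    """
    spb = Fraction(samples_per_bit)
    total = math.ceil(len(bits) * spb)
    # Sample k falls inside transmitted bit floor(k / spb); integer maths only.
    return [bits[(k * spb.denominator) // spb.numerator] for k in range(total)]


def recover(samples, n=4):
    """Recover bits from nominally n-times oversampled data."""
    if not samples:
        return []
    bits = []
    phase = 0              # receiver samples since the assumed start of the bit
    prev = samples[0]
    for s in samples:
        if s != prev:
            phase = 0      # transition seen: realign the bit boundary here
        if phase == n // 2:
            bits.append(s)  # take the data sample in the middle of the bit
        phase = (phase + 1) % n
        prev = s
    return bits


if __name__ == "__main__":
    data = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0] * 4
    rx = recover(oversample(data, Fraction(412, 100)), n=4)
    print(rx == data)
```

In hardware the same structure would be a small counter reset by an edge detector; the model is only meant for trying out clock offsets and run lengths before writing HDL.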
Reply by alb August 2, 2013
Hi Lasse,

On 01/08/2013 00:03, langwadt@fonz.dk wrote:
[]
>> It does not work not because of Nyquist limit, but because the recovery
>> of a phase shift cannot be done with just two clocks per bit.
>
> may not technically be Nyquist limit, but like so many things in
> nature the same relations are repeated
A signal traveling on a physical channel (be it a cable, a PCB route, an FPGA interconnection...) will have sharp transitions at the beginning of its journey and sloppier ones at the end due to losses, but if you take a comparator and discriminate a '1' or '0', then you do not 'need' frequencies higher than half the data rate itself (or symbol rate, to be precise). If you take a sinusoidal waveform and put a threshold at 0, then you have two symbols per cycle.

Why is sampling at the data rate not sufficient, then? Because there are several other factors. First of all, encoding and decoding are processes which introduce 'noise' as well as 'limitations'. Having a comparator to discriminate 0/1 introduces noise in the time of transition, therefore distorting the phase of the signal. The medium itself might be a source of further jitter, since it is sensitive to the environment (temperature, pressure, humidity, ...).

     TRANSMITTER        MEDIUM         RECEIVER
+-------------------+            +-------------------+
|               +---|            |---+               |
| '10100101' -> |ENC| -\/\/\/\/->|DEC| -> '10101110' |
|               +---|  physical  |---+               |
+-------------------+   signal   +-------------------+
          ^                                ^
          |                                |
       +-----+                          +-----+
       | clk |                          | clk |
       +-----+                          +-----+

You do not care about reconstructing a physical signal (like in ADC sampling), you *do* care about reconstructing a data stream.

Another source of trouble is the two clock generators on the TX and RX. They cannot be assumed to match perfectly, and any difference will lead to a phase drift which eventually will spoil your data sampling.
> and if you take NRZ you'll notice that the highest "frequency"
> (0101010101..) is only half of the data rate
That is why a clock frequency equal to the data rate is sufficient to 'sample' the information.

<nitpick mode on>
NRZ is a line code, i.e. a translation of your data stream into an appropriate physical signal (light, current, sound, ...) for the chosen physical medium (fiber, cable, air, ...), and has nothing to do with a toggling bit.
<nitpick mode off>
Reply by alb August 2, 2013
On 02/08/2013 06:19, rickman wrote:
[]
>>
>> This is what I meant indeed. I believe I confused DPLL with ADPLL...
>
> I am not familiar with ADPLL. What is that?

It is an All Digital PLL:

http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf

all the elements of a PLL are implemented in the digital domain.