FPGARelated.com
Forums

serial protocol specs and verification

Started by alb July 26, 2013
On 7/31/2013 5:37 PM, alb wrote:
> On 31/07/2013 13:44, rickman wrote:
> []
>>> <neatpick mode on>
>>> Nyquist criterion has nothing to do with being able to sample data. As a
>>> matter of fact your internal clock is perfectly capable to sample data
>>> flowing in your fpga without the need to be 2x the data rate.
>>> <neatpick mode off>
>>
>> I don't know what you are talking about. If you asynchronously sample,
>> you very much do have to satisfy the Nyquist criterion. A 2x clock,
>> because it isn't *exactly* 2x, can *not* be used to capture a bitstream
>> so that you can find the the transitions and know which bit is which.
>
> A data stream which is *exactly* flowing with a frequency f can be
> *exactly* sampled with a clock frequency f, it happens continuously in
> your synchronous logic. What happened to Nyquist theorem?
>
> If you have a protocol with data and clock, does it mean that you will
> recognize only half of the bits because your clock rate is just equal to
> your data rate? I'm confused...
>
> IMO calling a signal 'asynchronous' does not make any difference. Mr.
> Nyquist referred to reconstructing an analog signal with a discrete
> sampling (no quantization error involved). How does that applies to
> digital transmission?
Yes, you are right about the rates. I was not thinking of this correctly. The Nyquist theorem looks at *frequency* content which is not the same as bit rate.
>> Otherwise there wouldn't be so many errors in the existing circuit.
>
> It does not work not because of Nyquist limit, but because the recovery
> of a phase shift cannot be done with just two clocks per bit.
>
> []
>> You can use the FPGA PLL to multiply your clock from 2x to 4x to allow
>> the DPLL to work correctly.
>
> This is what I meant indeed. I believe I confused DPLL with ADPLL...
I am not familiar with ADPLL. What is that? -- Rick
On 8/1/13 5:56 AM, RCIngham wrote:
>> On 7/31/13 9:36 AM, RCIngham wrote:
>>> [snip]
>>>>
>>> Unless 'length' is limited, your worst case has header "0000001111111111"
>>> (with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
>>> have 2728 ones stuffed into them. Total line packet length is 19113
>>> symbols. If the clocks are within 1/19114 of each other, the same number of
>>> symbols will be received as sent, ASSUMING no jitter. You can't assume
>>> that, but if there is 'not much' jitter then perhaps 1/100k will be good
>>> enough for relative drift to not need to be corrected for.
>>>
>>> So, for version 1, use the 'sync' to establish the start of frame and the
>>> sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
>>> see whether it works.
>>>
>>> BTW, this is off-topic for C.A.F., as it is a system design problem not
>>> related to the implementation method.
>>
>> Since you can resynchronize your sampling clock on each transition
>> received, you only need to "hold lock" for the maximum time between
>> transitions, which is 7 bit times. This would mean that if you have a
>> nominal 4x clock, some sample points will be only 3 clocks apart (if you
>> are slow) or some will be 5 clocks apart (if you are fast), while most
>> will be 4 clock apart. This is the reason for the 1 bit stuffing.
>
> The bit-stuffing in long sequences of zeroes is almost certainly there to
> facilitate a conventional clock recovery method, which I am proposing not
> using PROVIDED THAT the clocks at each end are within a sufficiently tight
> tolerance. Detect the ones in the as-sent stream first, then decide which
> are due to bit-stuffing, and remove them.
>
> Deciding how tight a tolerance is 'sufficiently tight' is probably
> non-trivial, so I won't be doing it for free.
Since a 4x clock allows for a 25% data period correction, and we will get an opportunity to do so every 7 data periods, we can tolerate about a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will need to know details like jitter and sampling apertures, but this gives us a good ballpark figure.) Higher sampling rates can about double this. The key is that we need to be able to tell which direction the error is in, so the error must stay below 50% of a data period, including the variation within a sample clock.

Trying to gather the data without resynchronizing VASTLY decreases your tolerance for clock errors, as you need to stay within a clock cycle over the entire message.

The protocol, with its 3 one preamble, does seem like there may have been some effort to enable the use of a PLL to generate the data sampling clock, which may have been the original method. This does have the advantage that the data clock out of the sampler is more regular (not having the sudden jumps from the resynchronizing), and getting a burst of 1s helps the PLL to get a bit more centered on the data. My experience though is that with FPGAs (as would be on topic for this group), this sort of PLL synchronism is not normally used, but oversampling clocks with phase correction are fairly standard.
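The oversampling scheme described above can be tried out in a small model. What follows is only a sketch under stated assumptions: ideal edges, no jitter, and a receiver that restarts a modulo-n phase counter on every detected transition and samples when the counter reaches n/2. The names `oversample_receive`, `ticks_per_bit` and `pattern` are made up for this illustration and do not come from the thread.

```python
from fractions import Fraction
from math import ceil


def oversample_receive(bits, ticks_per_bit, n=4):
    """Recover bits from a line sampled with a nominal n-times clock.

    ticks_per_bit is the actual number of receiver clock ticks per
    transmitted bit (exactly n when both clocks match).
    Returns the recovered bits and the tick at which each was sampled.
    """
    recovered, sample_ticks = [], []
    prev = bits[0]
    count = 0  # phase counter: ticks since the last resync, modulo n
    for tick in range(ceil(len(bits) * ticks_per_bit)):
        level = bits[int(tick / ticks_per_bit)]  # line level seen at this tick
        if level != prev:
            count = 0  # transition seen: realign the phase counter
        if count == n // 2:  # nominal middle of the bit
            recovered.append(level)
            sample_ticks.append(tick)
        count = (count + 1) % n
        prev = level
    return recovered, sample_ticks


# Hypothetical test data: never more than 6 equal bits in a row.
pattern = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0] * 20

for error in (Fraction(-3, 100), Fraction(0), Fraction(3, 100)):
    got, ticks = oversample_receive(pattern, 4 * (1 + error))
    gaps = sorted(set(b - a for a, b in zip(ticks, ticks[1:])))
    print(f"error {float(error):+.0%}: ok={got == pattern}, sample gaps={gaps}")
```

In this model a 3% mismatch in either direction still recovers the pattern, and the gaps between sample points are 3 and 4 ticks in one direction and 4 and 5 in the other, which is the behaviour described in the quoted post. Real hardware adds jitter and metastability concerns that the sketch ignores.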
On 02/08/2013 06:19, rickman wrote:
[]
>>
>> This is what I meant indeed. I believe I confused DPLL with ADPLL...
>
> I am not familiar with ADPLL. What is that?
It is an All Digital PLL:

http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf

All the elements of a PLL are implemented in the digital domain.
Hi Lasse,

On 01/08/2013 00:03, langwadt@fonz.dk wrote:
[]
>> It does not work not because of Nyquist limit, but because the recovery
>> of a phase shift cannot be done with just two clocks per bit.
>
> may not technically be Nyquist limit, but like so many things in
> nature the same relations are repeated
A signal traveling on a physical channel (be it a cable, a PCB route, an FPGA interconnection...) will have sharp transitions at the beginning of its journey and sloppier ones at the end due to losses, but if you take a comparator and discriminate a '1' or '0', then you do not 'need' frequencies higher than half the data rate itself (or symbol rate, to be precise). If you take a sinusoidal waveform and put a threshold at 0, then you have two symbols per cycle.

Why is sampling at the data rate not sufficient, then? Because there are several other factors. First of all, encoding and decoding are processes which introduce 'noise' as well as 'limitations'. Having a comparator to discriminate 0/1 introduces noise in the time of the transition, therefore distorting the phase of the signal. The medium itself might be a source of further jitter, since it is sensitive to the environment (temperature, pressure, humidity, ...).

     TRANSMITTER        MEDIUM         RECEIVER
+-------------------+            +-------------------+
|               +---|            |---+               |
| '10100101' -> |ENC| -\/\/\/\/->|DEC| -> '10101110' |
|               +---|  physical  |---+               |
+-------------------+   signal   +-------------------+
          ^                                ^
          |                                |
       +-----+                          +-----+
       | clk |                          | clk |
       +-----+                          +-----+

You do not care about reconstructing a physical signal (as in ADC sampling), you *do* care about reconstructing a data stream.

Another source of trouble is the two clock generators on the TX and RX. They cannot be assumed to match perfectly, and any difference will lead to a phase drift which will eventually spoil your data sampling.
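To put a number on that phase drift, here is a minimal sketch, assuming an ideal line, a receiver that starts sampling exactly mid-bit, and no resynchronization at all. The function names and the ppm values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def sampled_bit_indices(n_bits, rx_period, phase=Fraction(1, 2)):
    """Index of the TX bit seen by each RX sample when there is no resync.

    Time is measured in TX bit periods; rx_period is the RX sampling
    period in those units (1 means perfectly matched clocks).
    """
    t = Fraction(phase)  # first sample lands in the middle of bit 0
    indices = []
    while t < n_bits:
        indices.append(int(t))
        t += rx_period
    return indices


def first_slip(n_bits, rx_period):
    """First sample number that no longer lines up with its own bit, or None."""
    for k, index in enumerate(sampled_bit_indices(n_bits, rx_period)):
        if index != k:
            return k
    return None


for ppm in (10_000, 1_000, 100):
    period = 1 + Fraction(ppm, 1_000_000)  # RX clock slow by `ppm`
    print(ppm, "ppm ->", first_slip(20_000, period))
```

The slip arrives once the accumulated drift reaches half a bit, so the number of bits a free-running receiver can survive scales inversely with the clock mismatch.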
> and if you take NRZ you'll notice that the highest "frequency"
> (0101010101..) is only half of the data rate
That is why a clock frequency equal to the data rate is sufficient to 'sample' the information.

<nitpick mode on>
NRZ is a line code, i.e. a translation of your data stream into an appropriate physical signal (light, current, sound, ...) for the chosen physical medium (fiber, cable, air, ...), and has nothing to do with a toggling bit.
<nitpick mode off>
Hi Richard,

On 02/08/2013 06:22, Richard Damon wrote:
[]
>> The bit-stuffing in long sequences of zeroes is almost certainly there to
>> facilitate a conventional clock recovery method, which I am proposing not
>> using PROVIDED THAT the clocks at each end are within a sufficiently tight
>> tolerance. Detect the ones in the as-sent stream first, then decide which
>> are due to bit-stuffing, and remove them.
>>
>> Deciding how tight a tolerance is 'sufficiently tight' is probably
>> non-trivial, so I won't be doing it for free.
>
> Since a 4x clock allows for a 25% data period correction, and we will
> get an opportunity to do so every 7 data periods, we can tolerate about
> a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
> need to know details like jitter and sampling apertures, but this gives
> us a good ball-park figure). Higher sampling rates can about double
> this, the key is we need to be able to know which direction the error is
> in, so we need to be less than a 50% of a data period error including
> the variation within a sample clock.
According to your math it looks like a 2x clock allows for a 50% data period correction and therefore a 50/7 ~6% error in clock frequency, which seems to me quite counter intuitive... Am I missing something? []
> The protocol, with its 3 one preamble, does seem like there may have
> been some effort to enable the use of a PLL to generate the data
> sampling clock, which may have been the original method. This does have
> the advantage the the data clock out of the sampler is more regular (not
> having the sudden jumps from the resyncronizing), and getting a set a
> burst of 1s helps the PLL to get a bit more centered on the data. My
> experience though is that with FPGAs (as would be on topic for this
> group), this sort of PLL synchronism is not normally used, but
> oversampling clocks with phase correction is fairly standard.
This is indeed what I'm looking for: oversampling (4x or 8x) and phase correction.
>On 8/1/13 5:56 AM, RCIngham wrote:
>>> On 7/31/13 9:36 AM, RCIngham wrote:
>>>> [snip]
>>>>>
>>>> Unless 'length' is limited, your worst case has header "0000001111111111"
>>>> (with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
>>>> have 2728 ones stuffed into them. Total line packet length is 19113
>>>> symbols. If the clocks are within 1/19114 of each other, the same number of
>>>> symbols will be received as sent, ASSUMING no jitter. You can't assume
>>>> that, but if there is 'not much' jitter then perhaps 1/100k will be good
>>>> enough for relative drift to not need to be corrected for.
>>>>
>>>> So, for version 1, use the 'sync' to establish the start of frame and the
>>>> sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
>>>> see whether it works.
>>>>
>>>> BTW, this is off-topic for C.A.F., as it is a system design problem not
>>>> related to the implementation method.
>>>
>>> Since you can resynchronize your sampling clock on each transition
>>> received, you only need to "hold lock" for the maximum time between
>>> transitions, which is 7 bit times. This would mean that if you have a
>>> nominal 4x clock, some sample points will be only 3 clocks apart (if you
>>> are slow) or some will be 5 clocks apart (if you are fast), while most
>>> will be 4 clock apart. This is the reason for the 1 bit stuffing.
>>
>> The bit-stuffing in long sequences of zeroes is almost certainly there to
>> facilitate a conventional clock recovery method, which I am proposing not
>> using PROVIDED THAT the clocks at each end are within a sufficiently tight
>> tolerance. Detect the ones in the as-sent stream first, then decide which
>> are due to bit-stuffing, and remove them.
>>
>> Deciding how tight a tolerance is 'sufficiently tight' is probably
>> non-trivial, so I won't be doing it for free.
>
>Since a 4x clock allows for a 25% data period correction, and we will
>get an opportunity to do so every 7 data periods, we can tolerate about
>a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
>need to know details like jitter and sampling apertures, but this gives
>us a good ball-park figure). Higher sampling rates can about double
>this, the key is we need to be able to know which direction the error is
>in, so we need to be less than a 50% of a data period error including
>the variation within a sample clock.
>
>To try to gather the data without resynchronizing VASTLY decreases your
>tolerance for clock errors as you need to stay within a clock cycle over
>the entire message.
>
>The protocol, with its 3 one preamble, does seem like there may have
>been some effort to enable the use of a PLL to generate the data
>sampling clock, which may have been the original method. This does have
>the advantage the the data clock out of the sampler is more regular (not
>having the sudden jumps from the resyncronizing), and getting a set a
>burst of 1s helps the PLL to get a bit more centered on the data. My
>experience though is that with FPGAs (as would be on topic for this
>group), this sort of PLL synchronism is not normally used, but
>oversampling clocks with phase correction is fairly standard.
>
Some form of clock recovery is essential for continuous ('synchronous') data streams. It is not required for 'sufficiently short' asynchronous data bursts, the classic example of which is RS-232. What I am suggesting is that the OP determines - using simulation - whether these frames are too long given the relative clock tolerances for a system design without clock recovery.

As I previously noted, this is first a 'system design' problem. Only after that has been completed does it become an 'FPGA design' problem.
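Before running a full simulation, the worst-case arithmetic quoted earlier in the thread can be checked in a few lines. This is only a sketch: the frame layout (16-bit header with one stuffed bit, 1023 words of 16 zero bits, one stuffed bit per six zeros) is taken from the quoted post, while the constant names and `drift_in_bits` are hypothetical, and jitter is ignored entirely.

```python
from fractions import Fraction

HEADER_BITS = 16        # "0000001111111111"
HEADER_STUFFED = 1      # one bit stuffed after the six leading zeros
PAYLOAD_BITS = 16 * 1023
STUFF_EVERY = 6         # a one is stuffed after every six zeros

payload_stuffed = PAYLOAD_BITS // STUFF_EVERY
line_symbols = HEADER_BITS + HEADER_STUFFED + PAYLOAD_BITS + payload_stuffed


def drift_in_bits(symbols, relative_error):
    """Sampling-point drift accumulated over a frame, in bit periods."""
    return symbols * relative_error


print("stuffed ones in payload:", payload_stuffed)
print("line symbols per frame: ", line_symbols)
for label, err in (("1/19114", Fraction(1, 19114)), ("1/100k", Fraction(1, 100_000))):
    print(f"drift at {label}: {float(drift_in_bits(line_symbols, err)):.3f} bit periods")
```

How much drift is acceptable without clock recovery depends on where the first sample lands and on jitter, which is exactly what the suggested 'Rx fast' and 'Rx slow' simulations would have to establish.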
On 8/2/2013 3:49 AM, alb wrote:
> On 02/08/2013 06:19, rickman wrote:
> []
>>>
>>> This is what I meant indeed. I believe I confused DPLL with ADPLL...
>>
>> I am not familiar with ADPLL. What is that?
>
> It is an All Digital PLL:
>
> http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf
>
> all the elements of a PLL are implemented in the digital domain.
I guess I wasn't aware that a digital PLL wasn't *all* digital. That is what I have been referring to as digital. -- Rick
On 8/2/2013 6:35 AM, RCIngham wrote:
> [snip]
>
> Some form of clock recovery is essential for continuous ('synchronous')
> data streams. It is not required for 'sufficiently short' asynchronous data
> bursts, the classic example of which is RS-232. What I am suggesting is
> that the OP determines - using simulation - whether these frames are too
> long given the relative clock tolerances for a system design without clock
> recovery.
>
> As I previously noted, this is first a 'system design' problem. Only after
> that has been completed does it become an 'FPGA design' problem.
I don't think the frame length is the key parameter. Rather, it is the insertion of a one after every 6 zeros, which guarantees a transition at least every 7 bits. -- Rick
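The stuffing rule mentioned here (a one inserted after six consecutive zeros) can be sketched as follows. This is an illustration under that single assumption, not the actual protocol definition: the function names are hypothetical, and details such as whether the header is stuffed or what happens at the end of a frame are not settled by the thread.

```python
def stuff(bits, run=6):
    """Insert a 1 after every `run` consecutive 0s."""
    out, zeros = [], 0
    for b in bits:
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
        if zeros == run:
            out.append(1)  # stuffed bit forces a transition on the line
            zeros = 0
    return out


def unstuff(bits, run=6):
    """Drop the bit that follows every `run` consecutive 0s."""
    out, zeros, skip = [], 0, False
    for b in bits:
        if skip:
            # This is the stuffed bit; a real receiver would also check b == 1.
            skip, zeros = False, 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
        if zeros == run:
            skip = True
    return out


payload = [0] * 16 + [1, 0, 1] + [0] * 7
line = stuff(payload)
print("sent:     ", "".join(map(str, line)))
print("recovered:", "".join(map(str, unstuff(line))))
```

With this rule an all-zero payload of 16368 bits picks up 2728 stuffed ones, matching the figure quoted earlier in the thread, and no run of zeros on the line is longer than six bits.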
On 02/08/2013 16:16, rickman wrote:
[]
>> It is an All Digital PLL:
>>
>> http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf
>>
>> all the elements of a PLL are implemented in the digital domain.
>
> I guess I wasn't aware that a digital PLL wasn't *all* digital. That is
> what I have been referring to as digital.
you might find this article interesting: http://www.silabs.com/Support%20Documents/TechnicalDocs/AN575.pdf
On 8/2/13 6:30 AM, alb wrote:
> Hi Richard,
>
> On 02/08/2013 06:22, Richard Damon wrote:
> []
>>> The bit-stuffing in long sequences of zeroes is almost certainly there to
>>> facilitate a conventional clock recovery method, which I am proposing not
>>> using PROVIDED THAT the clocks at each end are within a sufficiently tight
>>> tolerance. Detect the ones in the as-sent stream first, then decide which
>>> are due to bit-stuffing, and remove them.
>>>
>>> Deciding how tight a tolerance is 'sufficiently tight' is probably
>>> non-trivial, so I won't be doing it for free.
>>
>> Since a 4x clock allows for a 25% data period correction, and we will
>> get an opportunity to do so every 7 data periods, we can tolerate about
>> a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
>> need to know details like jitter and sampling apertures, but this gives
>> us a good ball-park figure). Higher sampling rates can about double
>> this, the key is we need to be able to know which direction the error is
>> in, so we need to be less than a 50% of a data period error including
>> the variation within a sample clock.
>
> According to your math it looks like a 2x clock allows for a 50% data
> period correction and therefore a 50/7 ~6% error in clock frequency,
> which seems to me quite counter intuitive... Am I missing something?
>
The details are that for an Nx sampling clock, every time you see a transition, you can shift by up to N/2-1 high-speed clock cycles per adjustment. For example, with a 16x clock, you can correct for the edge being between -7 and +7 sampling clocks from the expected point. If it is 8 clocks off, you don't know if it should be +8 or -8, so you are in trouble. If N is odd, you can handle up to (N-1)/2 cycles. (Note that this assumes negligible jitter.)

So our final allowable shift in data clocks is (N/2-1)/N, which can also be written as 1/2-1/N. Spread over the 7 data periods between guaranteed transitions, this leads to my 6% for N large (approaching a 50% correction) and 3% for N=4. For N=2 this gives us 0%.
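The formula above is easy to tabulate. A minimal sketch, assuming negligible jitter and a worst case of 7 data periods between transitions as stated in the thread; the function names and the choice of N values are only for illustration.

```python
from fractions import Fraction


def max_phase_correction(n):
    """Largest unambiguous correction per resync, as a fraction of a bit."""
    steps = (n - 1) // 2 if n % 2 else n // 2 - 1
    return Fraction(steps, n)


def clock_tolerance(n, max_gap_bits=7):
    """Ballpark tolerable clock error when a resync comes every max_gap_bits."""
    return max_phase_correction(n) / max_gap_bits


for n in (2, 3, 4, 8, 16, 64):
    print(f"N={n:2d}: correction {float(max_phase_correction(n)):6.1%}, "
          f"tolerance {float(clock_tolerance(n)):5.2%}")
```

Under these assumptions N=4 comes out at 1/28 (about 3.6%), N=16 at exactly 1/16 (6.25%), and the limit for large N is 1/14 (about 7.1%), so the 3% and 6% figures are deliberately rounded ballpark values.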
> []
>> The protocol, with its 3 one preamble, does seem like there may have
>> been some effort to enable the use of a PLL to generate the data
>> sampling clock, which may have been the original method. This does have
>> the advantage the the data clock out of the sampler is more regular (not
>> having the sudden jumps from the resyncronizing), and getting a set a
>> burst of 1s helps the PLL to get a bit more centered on the data. My
>> experience though is that with FPGAs (as would be on topic for this
>> group), this sort of PLL synchronism is not normally used, but
>> oversampling clocks with phase correction is fairly standard.
>
> This is indeed what I'm looking for, oversampling (4x or 8x) and phase
> correct.
>