serial protocol specs and verification

Started by alb July 26, 2013
Hi all,

I have the following specs for the physical level of a serial protocol:

> For the communication with Frontend asynchronous LVDS connection is used.
> The bitrate is set to 20 Mbps.
> Data encoding on the LVDS line is NRZI:
> - bit '1' is represented by a transition of the physical level,
> - bit '0' is represented by no transition of the physical level,
> - insertion of an additional bit '1' after 6 consecutive bits '0'.
Isn't there a missing requirement on the reset condition of the line?
The system clock is implicitly defined in a different section of the
specs and is set at 40 MHz.

At the next layer there's a definition of a 'frame' as a sequence of
16-bit words preceded by a 3-bit sync pattern (111) and a header of 16
bits defining the type of the packet and the length of the packet (in
words).

I'm writing a test bench for it and I was wondering whether there's any
recommendation you would suggest. Should I take care to randomly select
the phase between the system clock and the data?

Any pointer is appreciated.
Cheers,

Al

--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
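As an aside, the encoding rules quoted above fit in a few lines of Python. This is only a sketch: the function names are made up, and stuffing before the NRZI mapping is a reading of the spec rather than something it states.

def stuff_bits(bits):
    """Insert a '1' after every run of 6 consecutive '0' bits."""
    out, zeros = [], 0
    for b in bits:
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
        if zeros == 6:
            out.append(1)       # stuffed bit forces a transition
            zeros = 0
    return out

def nrzi_encode(bits, level=0):
    """NRZI: a '1' toggles the line level, a '0' leaves it unchanged."""
    line = []
    for b in stuff_bits(bits):
        if b == 1:
            level ^= 1          # transition on the physical line
        line.append(level)
    return line

# e.g. nrzi_encode([1, 1, 1] + [0] * 16)   # sync pattern followed by some data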
On 7/26/2013 11:22 AM, alb wrote:
> Hi all,
>
> I have the following specs for the physical level of a serial protocol:
>
>> For the communication with Frontend asynchronous LVDS connection is used.
>> The bitrate is set to 20 Mbps.
>> Data encoding on the LVDS line is NRZI:
>> - bit '1' is represented by a transition of the physical level,
>> - bit '0' is represented by no transition of the physical level,
>> - insertion of an additional bit '1' after 6 consecutive bits '0'.
>
> Isn't there a missing requirement on the reset condition of the line?
> The system clock is implicitly defined in a different section of the specs
> and is set at 40 MHz.
>
> At the next layer there's a definition of a 'frame' as a sequence of
> 16-bit words preceded by a 3-bit sync pattern (111) and a header of 16 bits
> defining the type of the packet and the length of the packet (in words).
>
> I'm writing a test bench for it and I was wondering whether there's any
> recommendation you would suggest. Should I take care to randomly
> select the phase between the system clock and the data?
Async, eh? At 2x clock to data? Not sure I would want to design this.
I assume you have to phase lock to the data stream somehow? I think
that is the part I would worry about.

In simulation I would recommend that you both jitter the data clock at a
high bandwidth and also with something fairly slow. The slow variation
will test the operation of your data extraction with a variable phase
and the high bandwidth jitter will check for problems from only having
two samples per bit. I don't know how this can be expected to work myself.

I did something similar where I had to run a digital phase locked loop
on standard NRZ data (no encoding) and used a 4x clock, but I think I
proved to myself I could do it with a 3x clock, it just becomes
impossible to detect when you have a sample error... lol.

--
Rick
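One way to model that kind of stimulus, sketched in Python just to show the idea; the jitter numbers and the 20 Mbps period are placeholders, and in an HDL test bench the same edge times would end up driving per-bit delays on the data line:

import math
import random

BIT_PERIOD = 50e-9        # 20 Mbps nominal

def edge_times(nbits, ppm_offset=100.0, drift_hz=1e3,
               drift_ui=0.05, fast_jitter_ui=0.02, seed=0):
    """Bit-boundary times with a static ppm offset, a slow sinusoidal
    drift and fast Gaussian jitter (drift/jitter given in unit intervals)."""
    rng = random.Random(seed)
    period = BIT_PERIOD * (1.0 + ppm_offset * 1e-6)
    times, t = [], 0.0
    for _ in range(nbits):
        slow = drift_ui * BIT_PERIOD * math.sin(2 * math.pi * drift_hz * t)
        fast = rng.gauss(0.0, fast_jitter_ui * BIT_PERIOD)
        times.append(t + slow + fast)
        t += period
    return times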
On 27/07/2013 01:59, rickman wrote:
> On 7/26/2013 11:22 AM, alb wrote:
>> Hi all,
>>
>> I have the following specs for the physical level of a serial protocol:
>>
>>> For the communication with Frontend asynchronous LVDS connection is
>>> used.
>>> The bitrate is set to 20 Mbps.
>>> Data encoding on the LVDS line is NRZI:
>>> - bit '1' is represented by a transition of the physical level,
>>> - bit '0' is represented by no transition of the physical level,
>>> - insertion of an additional bit '1' after 6 consecutive bits '0'.
>>
[]
>
> Async, eh? At 2x clock to data? Not sure I would want to design this.
> I assume you have to phase lock to the data stream somehow? I think
> that is the part I would worry about.
Currently they are experiencing a large loss of packets as well as many
corrupted packets (CRC errors). I'm not sure the current implementation
is doing any phase locking.
>
> In simulation I would recommend that you both jitter the data clock at a
> high bandwidth and also with something fairly slow. The slow variation
> will test the operation of your data extraction with a variable phase
> and the high bandwidth jitter will check for problems from only having
> two samples per bit. I don't know how this can be expected to work myself.
Since the modules are far apart and likely to be at different temperatures,
I would certainly expect a phase problem. Your idea to have a slow and a
high frequency variation in the phase generation might bring out some
additional info.
>
> I did something similar where I had to run a digital phase locked loop
> on standard NRZ data (no encoding) and used a 4x clock, but I think I
> proved to myself I could do it with a 3x clock, it just becomes
> impossible to detect when you have a sample error... lol.
What do you mean by saying 'it becomes impossible to detect when you
have a sample error'?
On 7/26/13 11:22 AM, alb wrote:
> Hi all,
>
> I have the following specs for the physical level of a serial protocol:
>
>> For the communication with Frontend asynchronous LVDS connection is used.
>> The bitrate is set to 20 Mbps.
>> Data encoding on the LVDS line is NRZI:
>> - bit '1' is represented by a transition of the physical level,
>> - bit '0' is represented by no transition of the physical level,
>> - insertion of an additional bit '1' after 6 consecutive bits '0'.
>
> Isn't there a missing requirement on the reset condition of the line?
> The system clock is implicitly defined in a different section of the specs
> and is set at 40 MHz.
>
> At the next layer there's a definition of a 'frame' as a sequence of
> 16-bit words preceded by a 3-bit sync pattern (111) and a header of 16 bits
> defining the type of the packet and the length of the packet (in words).
>
> I'm writing a test bench for it and I was wondering whether there's any
> recommendation you would suggest. Should I take care to randomly
> select the phase between the system clock and the data?
>
> Any pointer is appreciated.
> Cheers,
>
> Al
>
You don't need to specify a reset state, as either level will work. At
reset the line will be toggling every 7 bit times due to the automatic
insertion of a 1 after 6 0s.

I would be hard pressed to use 40 MHz as a system clock, unless I was
allowed to use both edges of the clock (so I could really sample at a 4x
rate).

For a test bench, I would build something that could be set to work
slightly "off frequency" and maybe even with some phase jitter in the
data clock.

I am assuming that the system clock does NOT travel between devices, or
there wouldn't be as much need for the auto 1 bit, unless this is just
bias leveling, but it isn't really great for that.
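A quick standalone check of the "toggles every 7 bit times" point, assuming the transmitter really does keep shifting out stuffed zeros while idle (the quoted spec doesn't say what is sent between frames):

level, zeros, line = 0, 0, []
for _ in range(28):              # 28 idle data bits (all zeros)
    line.append(level)           # a '0' leaves the line level alone
    zeros += 1
    if zeros == 6:
        level ^= 1               # stuffed '1' toggles the line
        line.append(level)
        zeros = 0
edges = [i for i in range(1, len(line)) if line[i] != line[i - 1]]
print(edges)                     # [6, 13, 20, 27] -> one transition every 7 bit times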
On 29/07/2013 03:05, Richard Damon wrote:
> On 7/26/13 11:22 AM, alb wrote:
>> Hi all,
>>
>> I have the following specs for the physical level of a serial protocol:
>>
>>> For the communication with Frontend asynchronous LVDS connection is used.
>>> The bitrate is set to 20 Mbps.
>>> Data encoding on the LVDS line is NRZI:
>>> - bit '1' is represented by a transition of the physical level,
>>> - bit '0' is represented by no transition of the physical level,
>>> - insertion of an additional bit '1' after 6 consecutive bits '0'.
>>
>> Isn't there a missing requirement on the reset condition of the line?
>> The system clock is implicitly defined in a different section of the specs
>> and is set at 40 MHz.
[]
> You don't need to specify a reset state, as either level will work. At
> reset the line will be toggling every 7 bit times due to the automatic
> insertion of a 1 after 6 0s.
Uhm, since there's a sync pattern of '111', I have to assume that no frame
is being transmitted when only zeros are flowing (with a '1' stuffed after
every 6 zeros).
> I would be hard pressed to use 40 MHz as a system clock, unless I was
> allowed to use both edges of the clock (so I could really sample at a 4x
> rate).
I'm thinking about having the system clock multiplied internally via a PLL
and then going for x4 or x8 sampling in order to center the bit properly.
>
> For a test bench, I would build something that could be set to work
> slightly "off frequency" and maybe even with some phase jitter in the
> data clock.
Rick was suggesting phase jitter with both a high- and a low-frequency
component. That may even be a more realistic case, since it models slow
drifts due to temperature variations... I do not know how critical it
would be to simulate *all* the jitter components of a clock (they may
depend on temperature, power noise, ground noise, ...).
> I am assuming that the system clock does NOT travel between
> devices, or there wouldn't be as much need for the auto 1 bit, unless
> this is just bias leveling, but it isn't really great for that.
Your assumption is correct. No clock distribution between devices.
On 7/28/2013 2:32 PM, alb wrote:
> On 27/07/2013 01:59, rickman wrote:
>> On 7/26/2013 11:22 AM, alb wrote:
>>> Hi all,
>>>
>>> I have the following specs for the physical level of a serial protocol:
>>>
>>>> For the communication with Frontend asynchronous LVDS connection is
>>>> used.
>>>> The bitrate is set to 20 Mbps.
>>>> Data encoding on the LVDS line is NRZI:
>>>> - bit '1' is represented by a transition of the physical level,
>>>> - bit '0' is represented by no transition of the physical level,
>>>> - insertion of an additional bit '1' after 6 consecutive bits '0'.
>>>
> []
>>
>> Async, eh? At 2x clock to data? Not sure I would want to design this.
>> I assume you have to phase lock to the data stream somehow? I think
>> that is the part I would worry about.
>
> Currently they are experiencing a large loss of packets as well as many
> corrupted packets (CRC errors). I'm not sure the current implementation
> is doing any phase locking.
>
>>
>> In simulation I would recommend that you both jitter the data clock at a
>> high bandwidth and also with something fairly slow. The slow variation
>> will test the operation of your data extraction with a variable phase
>> and the high bandwidth jitter will check for problems from only having
>> two samples per bit. I don't know how this can be expected to work myself.
>
> Since the modules are far apart and likely to be at different temperatures,
> I would certainly expect a phase problem. Your idea to have a slow and a
> high frequency variation in the phase generation might bring out some
> additional info.
>
>>
>> I did something similar where I had to run a digital phase locked loop
>> on standard NRZ data (no encoding) and used a 4x clock, but I think I
>> proved to myself I could do it with a 3x clock, it just becomes
>> impossible to detect when you have a sample error... lol.
>
> What do you mean by saying 'it becomes impossible to detect when you
> have a sample error'?
I was assuming that perhaps you were doing something I didn't quite
understand, but I'm pretty sure I am on target with this. You *must* up
your sample rate by a sufficient amount so that you can guarantee you get
a minimum of two samples per bit. Otherwise you have no way to distinguish
a slipped sample due to clock mismatch. Clock frequency mismatch is
guaranteed, unless you are using the same clock somehow. Is that the case?
If so, the sampling would just be synchronous and I don't follow where the
problem is.

It is not just a matter of phase, but of frequency. With a 2x clock,
seeing a transition 3 clocks later doesn't distinguish one bit time from
two bit times.

I'm having trouble expressing myself I think, but I'm trying to say the
basic premise of this design is flawed because the sample clock is only
2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x
the samples have four states: expected timing, fast timing, slow timing
and "error" timing, meaning the loop control isn't working.

Data     ____----____----____----____----____----____----____
SmplClk  --__--__--__--__--__--__--__--__--__--__--__--__--__
SmplData -----____----____----____----____----____----____----

This is how you expect it to work. But if the data is sampled slightly
off it looks like this.

Data     ____---____----____----____----____----____----____
SmplClk  --__--__--__--__--__--__--__--__--__--__--__--__--__
SmplData -----________----____----____----____----____----___

You can't use a locked loop like this because you have no info on whether
you are sampling fast or slow.

The sample clock does not need to be any particular ratio to the data
stream if you use an NCO to control the sample rate. Then the phase
detection will bump the rate up and down to suit.

Do you follow what I am saying? Or have I mistaken what you are doing?

--
Rick
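A toy Python model of the NCO idea, just to illustrate it; the oversampling factor, the bump size and the overall structure are made up for illustration and not taken from any real design:

def recover_levels(samples, oversample=4, bump=0.05):
    """Bang-bang bit recovery: a phase accumulator advances by 1/oversample
    per sample and a bit is taken when it wraps; each observed transition
    nudges the phase so transitions settle half a bit from the sample point."""
    levels, phase, prev = [], 0.0, samples[0]
    for s in samples[1:]:
        if s != prev:
            phase += bump if phase < 0.5 else -bump   # early/late correction
        phase += 1.0 / oversample                     # the NCO step
        if phase >= 1.0:
            phase -= 1.0
            levels.append(s)      # line level taken near mid-bit (nominally)
        prev = s
    return levels                 # NRZI decode afterwards: bit = 1 iff level changed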
On 7/29/2013 5:09 AM, alb wrote:
>
> Rick was suggesting phase jitter with both a high- and a low-frequency
> component. That may even be a more realistic case, since it models slow
> drifts due to temperature variations... I do not know how critical it
> would be to simulate *all* the jitter components of a clock (they may
> depend on temperature, power noise, ground noise, ...).
Just to be clear, my suggestion for simulating with both fast and slow
clock frequency variations is not intended to match any real-world
conditions so much as to exercise the circuit in two ways that I would
expect to detect failures.

If the clock is sampling the data on the edge, it is random which level is
measured. This can be simulated by a fast jitter in the clock. A slow
noise component in the clock frequency would provide for simulation of
mismatched clock frequencies in both the positive and negative directions.

Another way of implementing the slow drift is to just simulate at a very
slightly higher frequency and at a very slightly lower frequency. That
might show errors faster and more deterministically.

--
Rick
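A deterministic flavour of that, as a sketch; the offsets are placeholders and the part that drives the device under test is only hinted at:

NOMINAL_PERIOD = 50e-9                  # 20 Mbps bit period
for ppm in (+200.0, -200.0):
    period = NOMINAL_PERIOD * (1.0 + ppm * 1e-6)
    print(f"run at {ppm:+.0f} ppm -> bit period {period * 1e9:.4f} ns")
    # ...generate a whole frame at this bit period and count CRC/framing errors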
rickman <gnuarm@gmail.com> wrote:
> On 7/28/2013 2:32 PM, alb wrote:
(snip)
>>>>> For the communication with Frontend asynchronous LVDS
>>>>> connection is used.
(snip)
>>> Async, eh? At 2x clock to data? Not sure I would want to
>>> design this.
>>> I assume you have to phase lock to the data stream somehow?
>>> I think that is the part I would worry about.
>>> two samples per bit. I don't know how this can be expected to work myself.
(snip)
>> Since the modules are far apart and likely to be at different temperatures,
>> I would certainly expect a phase problem. Your idea to have a slow and a
>> high frequency variation in the phase generation might bring out some
>> additional info.
(snip)
> I was assuming that perhaps you were doing something I didn't quite
> understand, but I'm pretty sure I am on target with this.
> You *must* up your sample rate by a sufficient amount so that
> you can guarantee you get a minimum of two samples per bit.
> Otherwise you have no way to distinguish a slipped sample due
> to clock mismatch. Clock frequency mismatch is guaranteed,
> unless you are using the same clock somehow.
Everyone's old favorite asynchronous serial RS232 usually uses a clock at 16x, though I have seen 64x. From the beginning of the start bit, it counts half a bit time (in clock cycles), verifies the start bit (and not random noise) then counts whole bits and decodes at that point. So, the actual decoding is done with a 1X clock, but with 16 (or 64) possible phase values. It resynchronizes at the beginning of each character, so it can't get too far off.
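That scheme, sketched in Python rather than HDL and simplified (no stop-bit check, no majority voting; samples is the line sampled at 16x the bit rate, idle high, 8 data bits LSB first):

def uart_rx_char(samples, oversample=16, nbits=8):
    """Decode one async character from a line sampled at oversample x bit rate."""
    i = 0
    while i < len(samples) and samples[i] == 1:
        i += 1                       # hunt for the falling edge of the start bit
    i += oversample // 2             # count half a bit time...
    if i >= len(samples) or samples[i] != 0:
        return None                  # ...and verify a real start bit, not noise
    bits = []
    for _ in range(nbits):
        i += oversample              # whole bit times from the verified centre
        if i >= len(samples):
            return None              # ran out of samples
        bits.append(samples[i])      # sample near the middle of each data bit
    return bits                      # LSB first; stop bit not checked here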
> It is not just a matter of phase, but of frequency. With a 2x clock,
> seeing a transition 3 clocks later doesn't distinguish one bit
> time from two bit times.
For 10Mbit ethernet, on the other hand, as well as I understand it the receiver locks (PLL) to the transmitter. Manchester coding is wasteful of bandwidth, but allows for a simpler receiver. I believe it is usual to feed the transmit clock to the PLL to keep it close to the right frequency until a signal comes in. Speeds up the lock time.
> I'm having trouble expressing myself I think, but I'm trying to say the
> basic premise of this design is flawed because the sample clock is only
> 2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x
> the samples have four states: expected timing, fast timing, slow timing
> and "error" timing, meaning the loop control isn't working.
Seems to me that it should depend on how far off you can get.
For async RS232, you have to stay within about a quarter bit time
over 10 bits, so even if the clock is 2% off, it still works.
But as above, that depends on having a clock of the appropriate
phase.

--
glen
On 7/29/2013 1:40 PM, glen herrmannsfeldt wrote:
> rickman<gnuarm@gmail.com> wrote:
>> On 7/28/2013 2:32 PM, alb wrote:
>
> (snip)
>
>>>>>> For the communication with Frontend asynchronous LVDS
>>>>>> connection is used.
>
> (snip)
>>>> Async, eh? At 2x clock to data? Not sure I would want to
>>>> design this.
>>>> I assume you have to phase lock to the data stream somehow?
>>>> I think that is the part I would worry about.
>>>> two samples per bit. I don't know how this can be expected to work myself.
>
> (snip)
>>> Since the modules are far apart and likely to be at different temperatures,
>>> I would certainly expect a phase problem. Your idea to have a slow and a
>>> high frequency variation in the phase generation might bring out some
>>> additional info.
>
> (snip)
>> I was assuming that perhaps you were doing something I didn't quite
>> understand, but I'm pretty sure I am on target with this.
>> You *must* up your sample rate by a sufficient amount so that
>> you can guarantee you get a minimum of two samples per bit.
>> Otherwise you have no way to distinguish a slipped sample due
>> to clock mismatch. Clock frequency mismatch is guaranteed,
>> unless you are using the same clock somehow.
>
> Everyone's old favorite asynchronous serial RS232 usually uses a
> clock at 16x, though I have seen 64x. From the beginning of the
> start bit, it counts half a bit time (in clock cycles), verifies
> the start bit (and not random noise) then counts whole bits and
> decodes at that point. So, the actual decoding is done with a 1X
> clock, but with 16 (or 64) possible phase values. It resynchronizes
> at the beginning of each character, so it can't get too far off.
Yes, that protocol requires a clock matched to the sender's clock to
within about 2.5% IIRC. The protocol the OP describes has much longer
char sequences, which implies much tighter clock precision at each end,
and I'm expecting it to use a clock recovery circuit... but maybe not.
I think he said they don't use one but get "frequent" errors.
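Putting rough numbers on "much tighter clock precision", with the frame length below being an assumption since the quoted spec gives no maximum:

bits_per_frame = 3 + 16 + 16 * 256     # sync + header + 256 data words (assumed)
tolerated = 0.5 / bits_per_frame       # stay within half a bit over the frame
print(f"{tolerated * 1e6:.0f} ppm")    # ~120 ppm, versus 0.25 / 10 = 2.5% for a
                                       # 10-bit RS232 character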
>> It is not just a matter of phase, but of frequency. With a 2x clock,
>> seeing a transition 3 clocks later doesn't distinguish one bit
>> time from two bit times.
>
> For 10Mbit ethernet, on the other hand, as well as I understand it
> the receiver locks (PLL) to the transmitter. Manchester coding is
> wasteful of bandwidth, but allows for a simpler receiver.
> I believe it is usual to feed the transmit clock to the PLL to keep
> it close to the right frequency until a signal comes in. Speeds up
> the lock time.
>
>> I'm having trouble expressing myself I think, but I'm trying to say the
>> basic premise of this design is flawed because the sample clock is only
>> 2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x
>> the samples have four states: expected timing, fast timing, slow timing
>> and "error" timing, meaning the loop control isn't working.
>
> Seems to me that it should depend on how far off you can get.
> For async RS232, you have to stay within about a quarter bit time
> over 10 bits, so even if the clock is 2% off, it still works.
> But as above, that depends on having a clock of the appropriate
> phase.
Not sure why you mention phase. In 232 type character async you have
*no* phase relationship between clocks. There is no PLL so you aren't
phase locked to the data either. I guess you mean a clock with enough
precision?

I've never analyzed an async design with longer data streams so I don't
know how much precision would be required, but I'm sure you can't do
reliable data recovery with a 2x clock (without a PLL). I think this
would contradict the Nyquist criterion.

In my earlier comments when I'm talking about a PLL I am referring to a
digital PLL. I guess I should have said a DPLL.

--
Rick
rickman <gnuarm@gmail.com> wrote:

(snip, I wrote)

>> Everyone's old favorite asynchronous serial RS232 usually uses a
>> clock at 16x, though I have seen 64x. From the beginning of the
>> start bit, it counts half a bit time (in clock cycles), verifies
>> the start bit (and not random noise) then counts whole bits and
>> decodes at that point. So, the actual decoding is done with a 1X
>> clock, but with 16 (or 64) possible phase values. It resynchronizes
>> at the beginning of each character, so it can't get too far off.
> Yes, that protocol requires a clock matched to the sender's clock to
> within about 2.5% IIRC. The protocol the OP describes has much longer
> char sequences, which implies much tighter clock precision at each end,
> and I'm expecting it to use a clock recovery circuit... but maybe not.
> I think he said they don't use one but get "frequent" errors.
(snip)
>> Seems to me that it should depend on how far off you can get.
>> For async RS232, you have to stay within about a quarter bit time
>> over 10 bits, so even if the clock is 2% off, it still works.
>> But as above, that depends on having a clock of the appropriate
>> phase.
> Not sure why you mention phase. In 232 type character async you have
> *no* phase relationship between clocks. There is no PLL so you aren't
> phase locked to the data either. I guess you mean a clock with enough
> precision?
The reason for the 16x clock is that it can then clock the bits in one at a time with any of 16 different phases. That is, the actual bits are only looked at once (usually).
> I've never analyzed an async design with longer data streams so I don't
> know how much precision would be required, but I'm sure you can't do
> reliable data recovery with a 2x clock (without a PLL). I think this
> would contradict the Nyquist criterion.
If you start from the leading edge of the start bit, choose which cycle of
the 2x clock is closest to the center, and count from there, it seems to
me you do pretty well if the clocks are close enough. Also, the bit times
should be pretty close to correct.
> In my earlier comments when I'm talking about a PLL I am referring to a
> digital PLL. I guess I should have said a DPLL.
I was thinking of an analog one. I still remember when analog (PLL-based)
data separators were better for floppy disk reading. Most likely by now,
digital ones are better, possibly because of a higher clock frequency.

--
glen