Comp.Arch.FPGA | how fast is ... fast.

There are 10 messages in this thread.


how fast is ... fast. - LC - 2010-06-14 08:45:00

Hi,

In a design I'm working on I have a machine that produces
128 bits of data. This data is destined for a 16-bit DAC running
8x faster.

Knowing that I can produce the 128 bits at up to a 120MHz rate,
I want to generate the 8x 16-bit stream as fast as I can
(and I'm using a 500MHz DAC).

The 128-to-16-bit conversion is all done in a component I called front8x,
which counts and selects the 16-bit slice out of the 128-bit data
as below (the clock output is the one used to run the rest of the
circuit).

In practice I found it to run at up to 300MHz.

Should I expect this to be the upper limit of what I can do?
Is there any clever design of this frontend that allows higher speed?

(Note: the phase of the clock out to the DAC is set by another PLL, so
I can set the DAC to sample in the middle of the eye
pattern. So no issues there.)

I would like very much to read some comments, please.

Thanks.

Luis C.

(device CycloneIII-FBGA fastest grade, all outs using LVDS)

--


LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;

ENTITY front8x IS

PORT (	clkin:	IN STD_LOGIC;	-- master frontend clock
	sync:	IN STD_LOGIC;	-- Sync frontend
	clkout:	OUT STD_LOGIC;	-- 1/8 main clock out
	datain:	IN STD_LOGIC_VECTOR (127 downto 0);-- system data bus
	dacout:	OUT STD_LOGIC_VECTOR (15 downto 0)-- DAC data bus
	);

END front8x;


ARCHITECTURE regmux8 OF front8x IS

SIGNAL	dacreg:	STD_LOGIC_VECTOR ( 15 DOWNTO 0 );
SIGNAL	datareg:STD_LOGIC_VECTOR ( 127 DOWNTO 0 );
SIGNAL	cntr:	INTEGER RANGE 0 TO 7;

BEGIN

dacout <= dacreg;


--------------------------
-- Main 8:1 cycle
-- clockout rise at count=4
-- the big data word is fetched at count = last = 7

main: PROCESS(clkin,sync)

BEGIN
	IF (clkin='1' AND clkin'EVENT)
		THEN
			IF (sync='0')	THEN	cntr <= 0;
					ELSE	cntr <= cntr + 1;
			END IF;
			
			case cntr is
			  when 0 => dacreg <= datareg(127 downto 112);
			  when 1 => dacreg <= datareg(111 downto 96);
			  when 2 => dacreg <= datareg(95 downto 80);
			  when 3 => dacreg <= datareg(79 downto 64);
			  when 4 => dacreg <= datareg(63 downto 48);
			  when 5 => dacreg <= datareg(47 downto 32);
			  when 6 => dacreg <= datareg(31 downto 16);
			  when others => dacreg <= datareg(15 downto 0);
					 datareg <= datain;
										
			END CASE;
			
			IF (cntr > 4)	THEN clkout <= '1';
					ELSE clkout <= '0';
			END IF;

	END IF;

END PROCESS main;

END regmux8;



Re: how fast is ... fast. - Nial Stewart - 2010-06-14 10:34:00

> Is there any clever design of this frontend to allow higher speed ?

As you're stepping through the 128 bits to extract the 16 bit output
I would have implemented it as a big shift register and take off the
top/bottom bits.

Then all the tools have to worry about is a single register to register delay
rather than big mux required to 'select' the correct 16 bits.


Nial. 



Re: how fast is ... fast. - Symon - 2010-06-14 14:45:00

On 6/14/2010 1:45 PM, LC wrote:
>
> Should I expect that this would be the right up limit I could do it ?
> Is there any clever design of this frontend to allow higher speed ?
>
Does XAPP265 give you any architectural hints that you can use in your 
Altera part?
HTH., Syms.

Re: how fast is ... fast. - LC - 2010-06-15 07:57:00

Symon wrote:
> On 6/14/2010 1:45 PM, LC wrote:
>>
>> Should I expect that this would be the right up limit I could do it ?
>> Is there any clever design of this frontend to allow higher speed ?
>>
> Does XAPP265 give you any architectural hints that you can use in your 
> Altera part?
> HTH., Syms.

Tks, Symon,
Indeed there are some variations induced by this reading that I'll try.
Thanks.

Luis C.

Re: how fast is ... fast. - LC - 2010-06-15 08:06:00

Nial Stewart wrote:
> 
> As you're stepping through the 128 bits to extract the 16 bit output
> I would have implemented it as a big shift register and take off the
> top/bottom bits.
> 
> Then all the tools have to worry about is a single register to register delay
> rather than big mux required to 'select' the correct 16 bits.
> 
> 
> Nial. 
> 

Nial,

Ok, good idea. Tks, will try.
Not sure what you mean by top/bottom but I presume it is something like
having the long 1 bit 128bit SR with outputs at 0, 16, 32 etc while the 
parallel 128bit word has bit reordering such as each shift would produce 
the next 16bit word to come out. If I'm missing something let me know.

Thanks.
Luis C.


Re: how fast is ... fast. - Nial Stewart - 2010-06-16 06:45:00

> Ok, good idea. Tks, will try.
> Not sure what you mean by top/bottom but I presume it is something like
> having the long 1 bit 128bit SR with outputs at 0, 16, 32 etc while the
> parallel 128bit word has bit reordering such as each shift would produce
> the next 16bit word to come out. If I'm missing something let me know.

Something like this, with a load value....

signal shift_reg             : std_logic_vector(127 downto 0);
signal output                : std_logic_vector(15 downto 0);
signal load_value            : std_logic_vector(127 downto 0);


:
:
:

process(clk,rst)
begin
if(rst = '1') then
    shift_reg      <= (others => '0');
    output         <= (others => '0');
elsif(rising_edge(clk)) then
    if(load = '1') then
        shift_reg <= load_value;
    else
        shift_reg(127 downto 112) <= shift_reg(111 downto 96);
        shift_reg(111 downto 96)  <= shift_reg(95 downto 80);
        shift_reg(95 downto 80)   <= shift_reg(79 downto 64);
        shift_reg(79 downto 64)   <= shift_reg(63 downto 48);
                :
                :
        shift_reg(31 downto 16)   <= shift_reg(15 downto 0);

    end if;

    output <= shift_reg(127 downto 112);

end if;
end process;


Nial 
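
For reference, the shift-register idea in the snippet above can be written out as a complete entity. This is only a minimal sketch under stated assumptions, not Nial's actual code: the entity name shift8x and the port list are made up for illustration, and load is assumed to be pulsed by external logic once every 8 clocks.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity name and ports, for illustration only.
entity shift8x is
  port (
    clk        : in  std_logic;                       -- fast (DAC-rate) clock
    rst        : in  std_logic;                       -- asynchronous reset, active high
    load       : in  std_logic;                       -- assumed high once every 8 clocks
    load_value : in  std_logic_vector(127 downto 0);  -- next 128-bit word
    output     : out std_logic_vector(15 downto 0)    -- 16-bit slice to the DAC
  );
end entity shift8x;

architecture rtl of shift8x is
  signal shift_reg : std_logic_vector(127 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      shift_reg <= (others => '0');
      output    <= (others => '0');
    elsif rising_edge(clk) then
      if load = '1' then
        -- Parallel load of the whole 128-bit word
        shift_reg <= load_value;
      else
        -- Move every 16-bit slice up by one slice position.
        -- The bottom slice (15 downto 0) simply keeps its old value.
        shift_reg(127 downto 16) <= shift_reg(111 downto 0);
      end if;

      -- Registered tap on the top slice
      output <= shift_reg(127 downto 112);
    end if;
  end process;
end architecture rtl;
```

Each register bit is fed by a 2:1 choice between the load value and its neighbour 16 bits below, rather than by an 8:1 mux, which is the point of the suggestion.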



Re: how fast is ... fast. - LC - 2010-06-16 08:30:00

Nial Stewart wrote:
>> Ok, good idea. Tks, will try.
>> Not sure what you mean by top/bottom but I presume it is something like
>> having the long 1 bit 128bit SR with outputs at 0, 16, 32 etc while the
>> parallel 128bit word has bit reordering such as each shift would produce
>> the next 16bit word to come out. If I'm missing something let me know.
> 
> Something like this, with a load value....
> 
> signal shift_reg             : std_logic_vector(127 downto 0);
> signal output                : std_logic_vector(15 downto 0);
> signal load_value            : std_logic_vector(127 downto 0);
> 
> 
> :
> :
> :
> 
> process(clk,rst)
> begin
> if(rst = '1') then
>     shift_reg      <= (others => '0');
>     output         <= (others => '0');
> elsif(rising_edge(clk)) then
>     if(load = '1') then
>         shift_reg <= load_value;
>     else
>         shift_reg(127 downto 112) <= shift_reg(111 downto 96);
>         shift_reg(111 downto 96)  <= shift_reg(95 downto 80);
>         shift_reg(79 downto 64)   <= shift_reg(63 downto 48);
>                 :
>                 :
>         shift_reg(31 downto 16)   <= shift_reg(15 downto 0);
> 
>     end if;
> 
>     output <= shift_reg(127 downto 112);
> 
> end if;
> end process;
> 
> 
> Nial 
> 
> 

Thanks for the clarification.

Yes, now I've tested both: the 1-bit SR of 128 bits with bit
reordering on both sides (which is just shuffling the bit order, so it
should not consume precious time),
and the SR in 16-bit chunks that you suggested.

Both gave identical results (as expected... they are after all not too
different if we think of the data path delay).
Both were indeed a bit faster than my previous counter/mux approach.

Now I'm closer to 400MHz...

I believe that is the best I can do with this technology.

Again, tks,
Luis C.

Re: how fast is ... fast. - Symon - 2010-06-16 09:31:00

On 6/15/2010 12:57 PM, LC wrote:
> Symon wrote:
>> On 6/14/2010 1:45 PM, LC wrote:
>>>
>>> Should I expect that this would be the right up limit I could do it ?
>>> Is there any clever design of this frontend to allow higher speed ?
>>>
>> Does XAPP265 give you any architectural hints that you can use in your
>> Altera part?
>> HTH., Syms.
>
> Tks, Symon,
> Indeed there are some variations induced by this reading that I'll try.
> Thanks.
>
> Luis C.

Hi Luis,
You might want to pay particular attention to the DDR registers in the 
IOBs. I expect your Altera part has the same features, but I dunno for 
sure. The registers mean that your internal logic can run at half the 
speed of the external signals. Which is nice.
HTH, Syms.

Re: how fast is ... fast. - rickman - 2010-06-16 18:58:00

On Jun 16, 9:31 am, Symon <symon_bre...@hotmail.com> wrote:
> On 6/15/2010 12:57 PM, LC wrote:
>
> > Symon wrote:
> >> On 6/14/2010 1:45 PM, LC wrote:
>
> >>> Should I expect that this would be the right up limit I could do it ?
> >>> Is there any clever design of this frontend to allow higher speed ?
>
> >> Does XAPP265 give you any architectural hints that you can use in your
> >> Altera part?
> >> HTH., Syms.
>
> > Tks, Symon,
> > Indeed there are some variations induced by this reading that I'll try.
> > Thanks.
>
> > Luis C.
>
> Hi Luis,
> You might want to pay particular attention to the DDR registers in the
> IOBs. I expect your Altera part has the same features, but I dunno for
> sure. The registers mean that your internal logic can run at half the
> speed of the external signals. Which is nice.
> HTH, Syms.

That's what I would suggest.  By using the DDR registers, the data
stream can be split into odd/even words with parallel paths.  Then
each stream would only need to run at half the rate on the I/O pins.
Since you already have the 500 MHz clock you can just divide that by
two to generate two enables, one for the odd and one for the even data
streams.  I've never used the DDR registers.  You probably want to
look closely at the example code that Altera provides.

Rick
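
To make the odd/even split concrete, here is a rough sketch of the internal half of such a scheme. It is written under assumptions that are not from the thread: the entity name front8x_ddr, the ports clk_half, load, word_h and word_l are hypothetical, a half-rate clock is used instead of the divided enables Rick mentions, and load is assumed to be pulsed once every 4 half-rate clocks. The two outputs would then feed the device's DDR output registers (on Altera parts typically via a vendor megafunction such as ALTDDIO_OUT), which is not shown here; which word goes out on which clock edge depends on that primitive and should be checked against the vendor documentation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: splits a 128-bit word into two half-rate
-- 16-bit streams (even and odd words) for DDR output registers.
entity front8x_ddr is
  port (
    clk_half : in  std_logic;                       -- half the DAC sample rate
    load     : in  std_logic;                       -- assumed high once every 4 clocks
    datain   : in  std_logic_vector(127 downto 0);  -- system data bus
    word_h   : out std_logic_vector(15 downto 0);   -- words 0, 2, 4, 6
    word_l   : out std_logic_vector(15 downto 0)    -- words 1, 3, 5, 7
  );
end entity front8x_ddr;

architecture rtl of front8x_ddr is
  signal even_reg : std_logic_vector(63 downto 0);
  signal odd_reg  : std_logic_vector(63 downto 0);
begin
  process (clk_half)
  begin
    if rising_edge(clk_half) then
      if load = '1' then
        -- Word 0 is datain(127 downto 112), as in the original front8x.
        even_reg <= datain(127 downto 112) & datain(95 downto 80)
                  & datain(63 downto 48)   & datain(31 downto 16);
        odd_reg  <= datain(111 downto 96)  & datain(79 downto 64)
                  & datain(47 downto 32)   & datain(15 downto 0);
      else
        -- Each stream is its own 16-bit-wide shift register.
        even_reg(63 downto 16) <= even_reg(47 downto 0);
        odd_reg(63 downto 16)  <= odd_reg(47 downto 0);
      end if;

      -- Registered taps: one word pair per half-rate clock
      word_h <= even_reg(63 downto 48);
      word_l <= odd_reg(63 downto 48);
    end if;
  end process;
end architecture rtl;
```

With this arrangement a 500MHz sample rate on the pins needs only a 250MHz internal clock, since two 16-bit words are presented per internal clock cycle.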

Re: how fast is ... fast. - LC - 2010-06-22 06:46:00

rickman wrote:
> On Jun 16, 9:31 am, Symon <symon_bre...@hotmail.com> wrote:
>> On 6/15/2010 12:57 PM, LC wrote:
>>
>>> Symon wrote:
>>>> On 6/14/2010 1:45 PM, LC wrote:
>>>>> Should I expect that this would be the right up limit I could do it ?
>>>>> Is there any clever design of this frontend to allow higher speed ?
>>>> Does XAPP265 give you any architectural hints that you can use in your
>>>> Altera part?
>>>> HTH., Syms.
>>> Tks, Symon,
>>> Indeed there are some variations induced by this reading that I'll try.
>>> Thanks.
>>> Luis C.
>> Hi Luis,
>> You might want to pay particular attention to the DDR registers in the
>> IOBs. I expect your Altera part has the same features, but I dunno for
>> sure. The registers mean that your internal logic can run at half the
>> speed of the external signals. Which is nice.
>> HTH, Syms.
> 
> That's what I would suggest.  By using the DDR registers, the data
> stream can be split into odd/even words with parallel paths.  Then
> each stream would only need to run at half the rate on the I/O pins.
> Since you already have the 500 MHz clock you can just divide that by
> two to generate two enables, one for the odd and one for the even data
> streams.  I've never used the DDR registers.  You probably want to
> look closely at the example code that Altera provides.
> 
> Rick

Many thanks, folks,
Very good tips.

tks,
Luis C.