FPGARelated.com
Forums

Inferring Dynamic shift registers in XST

Started by Josh Graham April 25, 2004
Hello all,
I am trying to get XST (ISE 6.1) to infer a dynamic shift register
implemented using Virtex-II LUTs. I have used the VHDL model shown
below.
However, XST does not use LUTs; flip-flops are used instead. When I
change the line to srout <= sr(n), where n is a static value, XST manages
to use LUTs for the shift register. Can anybody please tell me what I
am doing wrong? This is the same code as given in the XST Synthesis and
Verification guide.
Thanks
Josh

library ieee;
use ieee.std_logic_1164.all;

entity addressablesr is
   generic(Depth_g : natural := 96);
   port(ck    : in  std_logic;
        en    : in  std_logic;
        srin  : in  std_logic;
        -- tap select; range follows Depth_g instead of a hard-coded 95
        addr  : in  integer range Depth_g-1 downto 0;
        srout : out std_logic);
end entity;

architecture behaviour of addressablesr is
   signal sr : std_logic_vector(Depth_g-1 downto 0);
begin

   -- shift in one bit per enabled rising clock edge; no reset
   process(ck)
   begin
      if (ck'event and ck = '1') then
         if en = '1' then
            sr <= sr(Depth_g-2 downto 0) & srin;
         end if;
      end if;
   end process;

   -- dynamic tap: sr(0) is the newest bit
   srout <= sr(addr);
end architecture;
Howdy Josh,

Does the guide say that it will use only LUTs for an addressable (or dynamic) shift register? My understanding and experience is that it will take all consecutive and, more importantly, unused bits, and roll them into a LUT-based SRL (assuming it doesn't have a reset). I'm pretty certain that it is configured during synthesis/map/P&R and can't be addressable or dynamic.

Ray Andraka uses these a lot, so I'll bet he has an efficient way to do what you want (although I would plan on it being larger than you originally thought).

Good luck,

Marc

I use instantiation inside a generate for the dynamic shift registers.
I've found that inference is too dependent on the particular synthesis
tool, and even the version of the tool.  Much less hassle to just instantiate the
structure you want, except of course if it is retargeted to a device family
that doesn't have SRL16's.  I built mine as a separate component
parameterized for width, placement and a few other things so that I could
easily reuse it in other designs.
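
The instantiate-inside-a-generate approach described above might look roughly like the sketch below. It is not the component from this post: the entity name dyn_srl16_bank, the generic Width_g and the port names are made up for illustration, and placement attributes are left out. Only the SRL16E primitive and the unisim library are Xilinx names.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;                 -- Xilinx primitive library
use unisim.vcomponents.all;

-- Hypothetical reusable component: one SRL16E per data bit,
-- all sharing the same 4-bit tap address.
entity dyn_srl16_bank is
   generic (Width_g : positive := 8);               -- assumed parameter name
   port (ck   : in  std_logic;
         en   : in  std_logic;
         addr : in  std_logic_vector(3 downto 0);   -- delay = addr + 1 enabled clocks
         din  : in  std_logic_vector(Width_g-1 downto 0);
         dout : out std_logic_vector(Width_g-1 downto 0));
end entity;

architecture structural of dyn_srl16_bank is
begin
   gen_bits : for i in 0 to Width_g-1 generate
      srl_i : SRL16E
         generic map (INIT => X"0000")
         port map (D   => din(i),
                   CE  => en,
                   CLK => ck,
                   A0  => addr(0),
                   A1  => addr(1),
                   A2  => addr(2),
                   A3  => addr(3),
                   Q   => dout(i));   -- unregistered dynamic tap
   end generate;
end architecture;
```

Because the primitive is named explicitly, the result does not depend on what a given synthesis tool or version decides to infer, at the cost of tying the source to device families that have SRL16s.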

-- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759
I had the opposite problem in Verilog. When I had a deep pipeline (more than 4 bits deep, I think) with no logic between the stages, XST did roll it into a wide SRL16 and halved my clock rate in doing so. There is an option in the XST preferences display for turning this feature on/off. I think if I can tune the SRL with an extra true FF at the end it will be usable at full speed; it is just that the timing for SRLs is slower on the output bit. There are also some appnotes on it, and there is some template code for these if you look in the right place.

I would suggest not coding them directly unless you really know that they must be used. If you suddenly decide to insert logic into the middle, you will have to redo that.

regards

johnjakson_usa_com

Hi Marc,
Thanks for your input. After further experimentation I found that a
LUT-based addressable SR was only being inferred for an SR of length 16 bits
or less. I managed to build a 128-bit addressable SR by cascading 8
SRLC16E primitives and using MUXF5, MUXF6 and MUXF7 to get the addressed bit.
But this makes my HDL source Xilinx-specific.
Josh
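
A cascade like the one described above could be sketched as follows. This is an illustration rather than the code from this post: the entity name addressablesr128 and the internal signal names are invented, and the final 8:1 selection is written as a behavioural mux instead of explicit MUXF5/MUXF6/MUXF7 instances, so whether the tool maps it onto those dedicated muxes is an assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

library unisim;                 -- Xilinx primitive library
use unisim.vcomponents.all;

entity addressablesr128 is      -- hypothetical name
   port (ck    : in  std_logic;
         en    : in  std_logic;
         srin  : in  std_logic;
         addr  : in  std_logic_vector(6 downto 0);   -- 0 .. 127
         srout : out std_logic);
end entity;

architecture structural of addressablesr128 is
   signal chain : std_logic_vector(8 downto 0);  -- chain(i) feeds stage i
   signal tap   : std_logic_vector(7 downto 0);  -- addressed bit of each stage
begin
   chain(0) <= srin;

   -- Eight 16-bit stages; Q15 of one stage drives D of the next.
   gen_stage : for i in 0 to 7 generate
      srl_i : SRLC16E
         generic map (INIT => X"0000")
         port map (D   => chain(i),
                   CE  => en,
                   CLK => ck,
                   A0  => addr(0),
                   A1  => addr(1),
                   A2  => addr(2),
                   A3  => addr(3),
                   Q   => tap(i),
                   Q15 => chain(i+1));   -- chain(8) is left unused
   end generate;

   -- Upper address bits pick the stage, lower four bits pick the tap within it.
   srout <= tap(to_integer(unsigned(addr(6 downto 4))));
end architecture;
```

With this arrangement addr = 0 returns the most recently shifted-in bit, matching sr(0) in the original behavioural model.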
The OP stated that he was trying to infer a DYNAMIC shift register, which implies an SRL16.
Otherwise, he needs a register file with a mux, which is both bulky and slow.  The synthesis tools
are notoriously inconsistent and picky about the style needed for this structure to be inferred as
an SRL16, and I've seen different tools wind up with different results, even when going between
versions of the same tool.  In order to get the dynamic shift register reliably it unfortunately
needs to be instantiated.  The tools are much better with static shift registers (ones whose length
does not change in the operating circuit), but still care is needed to make sure a flip-flop is
inserted after each SRL16.  The synthesis tools that recognize SRL16's generally do now put a
flip-flop after a single SRL16.  They are not so good at putting a flip-flop after every SRL16 when
several are cascaded together to get delays of more than 17 clocks.  As long as you have the
flip-flops after each SRL, and they are properly placed (which implies that they do not use the
reset), the SRL16's will not be the limiting factor in clock speed.  Leave those registers off, and
you'll severely limit the clocking.  BTW, don't attempt to use both the dynamic and Q15 outputs of
the same SRL16 in a high performance design.  The Q15 output is just as slow as the Y output if it
does not connect through a flip-flop in the same slice.
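
As a rough illustration of the flip-flop-after-SRL16 advice, here is a minimal sketch of a fixed 17-clock delay: one SRL16E at its full depth followed by a register with no reset. The entity name delay17 and the signal names are hypothetical, and whether the register is actually packed into the same slice as the SRL16 depends on the tools and on placement, which this sketch does not control.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;                 -- Xilinx primitive library
use unisim.vcomponents.all;

entity delay17 is               -- hypothetical name
   port (ck : in  std_logic;
         en : in  std_logic;
         d  : in  std_logic;
         q  : out std_logic);
end entity;

architecture structural of delay17 is
   signal srl_q : std_logic;
begin
   -- Address "1111" selects the last of the 16 stages (16 enabled clocks).
   srl_i : SRL16E
      generic map (INIT => X"0000")
      port map (D   => d,
                CE  => en,
                CLK => ck,
                A0  => '1',
                A1  => '1',
                A2  => '1',
                A3  => '1',
                Q   => srl_q);

   -- Output register adds the 17th clock. No reset, so it stays
   -- compatible with being placed next to the SRL16.
   process (ck)
   begin
      if rising_edge(ck) then
         if en = '1' then
            q <= srl_q;
         end if;
      end if;
   end process;
end architecture;
```

For longer delays the same pattern would be repeated, with a register after every SRL16 in the chain rather than only after the last one.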


--Ray Andraka
It also limits the clock speed because of the slow clock to out when using
either the Y or Q15 outputs without a register.

--Ray Andraka
Ray Andraka <ray@andraka.com> wrote in message news:<408DD446.5727B22@andraka.com>...
> The OP stated that he was trying to infer a DYNAMIC shift register, which implies an SRL16.
Whoops, missed the DYNAMIC part.
> Otherwise, he needs a register file with a mux, which is both bulky and slow. The synthesis tools
> are notoriously inconsistent and picky about the style needed for this structure to be inferred as
Indeed.
> an SRL16, and I've seen different tools wind up with different results, even when going between
I'll be adding FFed SRLs later on when the dust has settled. Perhaps I should look at dynamic ones too for queues, although RAM-based will probably give me about the same cost/perf/area.

regards

johnjakson_usa_com

Depends on the depth.  For small ranges in depth, the SRL16 is more efficient than RAM based.

--Ray Andraka