Re: Dual Port RAM Inference

Started by Jonathan Bromley May 9, 2009
On Fri, 8 May 2009 10:01:30 -0700 (PDT), peter@xilinx.com wrote:

> This is not a Xilinx or Altera circuit design problem,
> nor is it a VHDL problem. It is a systems design issue.

With respect, Peter, it is definitely neither a circuit design nor a system design problem. Speaking for myself (and for Rick too, I'm pretty sure) I know well enough what the capabilities and limitations of the BRAMs are, and how to work with them successfully. I already know what form of BRAM I want, and I can easily enough instantiate it. I have already chosen a set of behaviours that I know are available in both Xilinx and Altera BRAMs.

But I don't want the grotesque non-portable ugliness of instantiated and/or wizard-generated BRAM components.

So I seek a way of writing VHDL and Verilog code that correctly describes the memories' simulation behaviour, at the appropriate level of abstraction, and that will allow a range of synthesis tools to infer correctly the BRAM properties that I need. As has already been said by others, it is not hard to do this for BRAM configurations with only one write port. As soon as you add a second write port, things get much more vexatious and you get significantly less help from the coding guidelines in vendor documentation. There are good reasons for this, as have already been discussed; I made a promise (which I aim to keep) to find out just what can be done, and to write it up in a convenient vendor-neutral form. Systems design it ain't; it's all about finding a valid HDL coding style that reliably gets a desired result out of a range of different vendors' tools.

--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.

On May 9, 5:02 am, Jonathan Bromley <jonathan.brom...@MYCOMPANY.com> wrote:
> With respect, Peter, it is definitely neither a circuit
> design nor a system design problem.
> [...]
> Systems design it ain't; it's all about finding
> a valid HDL coding style that reliably gets a desired
> result out of a range of different vendors' tools.

Why did I call it a systems problem?

The BRAM behaves like a synchronous two-port RAM should, common clock or uncorrelated clocks, as long as you do not perform "simultaneous" write and read operations on the same location. Two writes with conflicting data will leave the content undefined, while a write and a read can result in an undefined output. Two reads are no problem.

Protecting against these system issues is quite complicated, and would sacrifice performance.

What does the user community expect from us (Xilinx)?

Peter Alfke

Peter Alfke wrote:

> Protecting against these system issues is quite complicated, and would
> sacrifice performance.
> What does the user community expect from us (Xilinx)?

What if we wrote you a VHDL and a Verilog model that capture your English description above, plus a testbench to demonstrate that ModelSim agrees? Then you would give the models to the right person and see to it that ISE synthesizes a netlist that passes the same testbench.

-- Mike Treseler

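A rough idea of what such a model could look like in VHDL is sketched here. It is only an illustration of Peter's English description, not a vendor model: the entity name dpram_collision_model and all port names are made up, a common clock is assumed, each port is assumed to read the old data when it writes, and every same-address collision is treated pessimistically as undefined ('X').

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Simulation-only sketch (hypothetical names): common-clock RAM with two
-- read/write ports that turns same-address collisions into 'X'.
entity dpram_collision_model is
  generic (
    g_data_w : natural := 9;
    g_addr_w : natural := 11
    );
  port (
    i_clk   : in  std_logic;
    i_weA   : in  std_logic;
    i_addrA : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_dataA : in  std_logic_vector (g_data_w - 1 downto 0);
    o_dataA : out std_logic_vector (g_data_w - 1 downto 0);
    i_weB   : in  std_logic;
    i_addrB : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_dataB : in  std_logic_vector (g_data_w - 1 downto 0);
    o_dataB : out std_logic_vector (g_data_w - 1 downto 0)
    );
end dpram_collision_model;

architecture sim of dpram_collision_model is
  type t_ram is array (0 to 2**g_addr_w - 1) of
    std_logic_vector (g_data_w - 1 downto 0);
begin
  -- One process owns the memory, so the two ports cannot race each other.
  p_ram : process (i_clk)
    variable v_ram  : t_ram := (others => (others => '0'));
    variable v_a    : natural;
    variable v_b    : natural;
    variable v_same : boolean;
  begin
    if rising_edge(i_clk) then
      v_a    := to_integer(unsigned(i_addrA));
      v_b    := to_integer(unsigned(i_addrB));
      v_same := (v_a = v_b);

      -- Reads return the old contents, unless the other port writes the
      -- same location in this cycle: then the output is undefined.
      if v_same and i_weB = '1' then
        o_dataA <= (others => 'X');
      else
        o_dataA <= v_ram(v_a);
      end if;
      if v_same and i_weA = '1' then
        o_dataB <= (others => 'X');
      else
        o_dataB <= v_ram(v_b);
      end if;

      -- Writes: conflicting data at one address leaves 'X' in the cell.
      if i_weA = '1' and i_weB = '1' and v_same then
        if i_dataA = i_dataB then
          v_ram(v_a) := i_dataA;
        else
          v_ram(v_a) := (others => 'X');
        end if;
      else
        if i_weA = '1' then
          v_ram(v_a) := i_dataA;
        end if;
        if i_weB = '1' then
          v_ram(v_b) := i_dataB;
        end if;
      end if;
    end if;
  end process;
end sim;
```

Whether any synthesis tool would map a description like this onto a block RAM is a separate question; the sketch only pins down the simulation behaviour that a testbench could check.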
Peter Alfke wrote:

> Why did I call it a systems problem?
> The BRAM behaves like a synchronous two-port RAM should, common clock
> or uncorrelated clocks, as long as you do not perform "simultaneous"
> write and read operations on the same location.
> Two writes with conflicting data will leave the content undefined,
> while a write and a read can result in an undefined output. Two reads
> are no problem.
> Protecting against these system issues is quite complicated, and would
> sacrifice performance.
> What does the user community expect from us (Xilinx)?

I haven't needed such a feature so far, but with Altera Quartus you can specify, for dual-port RAMs, whether you want to read the old content when simultaneously writing to the same location, or you can specify "I don't care" (which I assume is faster). Maybe this feature makes sense for some projects.

But I don't think it makes sense to specify the behaviour when a BRAM has two write ports and both ports write to the same address simultaneously. And if it does make sense, it should be easy to catch this rare case in user logic, e.g. with a simple priority algorithm in static logic. Implementing this for the write/read case in user logic would be more complicated and maybe slower than what is possible with low-level support.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

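A static priority scheme for the write/write case might look roughly like the following sketch. It assumes both ports share one clock, so the comparison can be purely combinational, and it arbitrarily lets port A win. The entity wr_arbiter and its port names are hypothetical, invented for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper, not from the thread: gives port A static priority
-- when both ports try to write the same address in the same cycle.
entity wr_arbiter is
  generic (
    g_addr_w : natural := 11
    );
  port (
    i_weA     : in  std_logic;
    i_addrA   : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_weB     : in  std_logic;
    i_addrB   : in  std_logic_vector (g_addr_w - 1 downto 0);
    o_weB     : out std_logic;  -- gated write enable for the RAM's port B
    o_collide : out std_logic   -- '1' while port B's write is suppressed
    );
end wr_arbiter;

architecture rtl of wr_arbiter is
  signal s_collide : std_logic;
begin
  -- Collision: both write enables active and both addresses equal.
  s_collide <= '1' when (i_weA = '1' and i_weB = '1' and i_addrA = i_addrB)
               else '0';

  -- Port A's write always goes through; port B's is dropped on a collision.
  o_weB     <= i_weB and not s_collide;
  o_collide <= s_collide;
end rtl;
```

The address comparator sits in the path to the RAM's write enable, so a wrapper like this costs some logic and timing margin.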
On May 9, 2:03 pm, Peter Alfke <al...@sbcglobal.net> wrote:
> Why did I call it a systems problem?
> The BRAM behaves like a synchronous two-port RAM should, common clock
> or uncorrelated clocks, as long as you do not perform "simultaneous"
> write and read operations on the same location.
> [...]
> What does the user community expect from us (Xilinx)?

I'm not sure you are talking about the same problem that we are. We don't have a problem with how the block ram works. We just want to be able to use them without using instantiation, for a number of reasons. Block rams can be inferred as long as they are used in single port or pseudo dual port modes. But if they are needed with two write ports, the tools have a lot of trouble inferring a dual port block ram. The error message I got from XST 10.1 clearly showed that the tools understood that I wanted a ram with two write ports, but it could not figure out that I wanted it to use the block ram.

Or am I missing something about your statements that affect this issue?

Actually, the VHDL description of a block ram that uses two processes to write to the same memory also has undefined behavior. If both processes write to the same location using the same clock edge, it is undefined which process will run first and which will run second; so the result written to the block ram is undefined... maybe not in the same way as the hardware, but it is still undefined.

Rick

Rick,
I hope this can help...
Below you can find the code to infer a dual-port RAM with
both ports sharing the same clock.
I suppose the secret could be using a shared variable (instead of a
signal) as the RAM...

regards
Sandro


library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;     -- conv_std_logic_vector
use ieee.std_logic_unsigned.all;  -- conv_integer on std_logic_vector

entity ramInference is
  generic (
    g_data_w : natural := 9;
    g_addr_w : natural := 11
    );
  port (
    i_clkA  : in  std_logic;
    --i_clkB  : in  std_logic;
    i_enA   : in  std_logic;
    i_weA   : in  std_logic;
    i_addrA : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_dataA : in  std_logic_vector (g_data_w - 1 downto 0);
    o_dataA : out std_logic_vector (g_data_w - 1 downto 0);

    i_enB   : in  std_logic;
    i_weB   : in  std_logic;
    i_addrB : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_dataB : in  std_logic_vector (g_data_w - 1 downto 0);
    o_dataB : out std_logic_vector (g_data_w - 1 downto 0)
    );
end ramInference;


architecture Behavioral of ramInference is

  constant c_ram_sz : natural := 2**(g_addr_w);

  type t_ram is array (c_ram_sz - 1 downto 0) of
    std_logic_vector (g_data_w - 1 downto 0);

  -- Shared variable so that both port processes can access the same array.
  -- Initial values are built to g_data_w bits; an 8-bit literal such as
  -- X"05" would not match the default 9-bit word.
  shared variable v_ram : t_ram := (
    1      => conv_std_logic_vector(16#05#, g_data_w),
    2      => conv_std_logic_vector(16#08#, g_data_w),
    3      => conv_std_logic_vector(16#1A#, g_data_w),
    -- ...
    others => (others => '0')
    );

begin

  -- Port A: the output is sampled before the variable is updated,
  -- so a write cycle returns the old contents.
  p_portA : process (i_clkA)
  begin
    if rising_edge(i_clkA) then
      if (i_enA = '1') then
        -- READ FIRST
        o_dataA(g_data_w - 1 downto 0) <= v_ram(conv_integer(i_addrA));
        -- WRITE AFTER
        if (i_weA = '1') then
          v_ram(conv_integer(i_addrA)) := i_dataA(g_data_w - 1 downto 0);
        end if;
      end if;
    end if;
  end process;

  -- Port B: same clock as port A (i_clkB is commented out above). The
  -- variable is updated first, so a write cycle returns the new data.
  p_portB : process (i_clkA)
  begin
    if rising_edge(i_clkA) then
      if (i_enB = '1') then
        -- WRITE FIRST
        if (i_weB = '1') then
          v_ram(conv_integer(i_addrB)) := i_dataB(g_data_w - 1 downto 0);
        end if;
        -- READ AFTER
        o_dataB(g_data_w - 1 downto 0) <= v_ram(conv_integer(i_addrB));
      end if;
    end if;
  end process;

end Behavioral;

The BRAM does not have the necessary dual address decoders. The best
option is to clock at half speed and multiplex. Read before write is
most usual.

On May 9, 2:26 pm, Jacko <jackokr...@gmail.com> wrote:
> The BRAM does not have the necessary dual address decoders. The best
> option is to clock at half speed and multiplex. Read before write is
> most usual.

All Xilinx BRAMs have dual address decoders, and each port also has the option of read before or after write or retain previous output.
It seems there is no argument about the hardware, but there is about the software...

Peter Alfke

On May 10, 12:15 am, Peter Alfke <al...@sbcglobal.net> wrote:
> All Xilinx BRAMs have dual address decoders, and each port also has
> the option of read before or after write or retain previous output.
> It seems there is no argument about the hardware, but there is about
> the software...
> Peter Alfke

Peter,
This time there is (almost) no argument about the software either (see my previous post).
XST (your software [Xilinx]) infers the BRAM with two r/w ports, with both the "READ FIRST" and the "WRITE FIRST" options...

Maybe the only remaining software (VHDL) argument is "how to infer a dual-port BRAM with different bus sizes for the two ports".

regards
Sandro

On May 9, 4:31 pm, Sandro <sdro...@netscape.net> wrote:
> Peter,
> This time... (quite) no argument about the software too (see my
> previous post).
> XST (your software [xilinx]) infers the bram with two r/w ports both
> with "READ FIRST" and with "WRITE FIRST" options...
>
> Maybe the only software (vhdl) argument could be "how to infer dual
> port BRAM with different bus sizes for the two ports"
>
> regards
> Sandro

Thought I would chime in on some of the comments and observations from this thread.

Starting with the most recent comment: if you need different port widths, either between the read and write sides of the same port or between the two ports of a dual-port RAM, you do need to instantiate. None of XST, Synplify or Precision supports RAMs with different port widths. I can comment from the XST side that we have investigated this and plan to offer it some day; however, to date we have not been able to include this capability.

As Sandro explains, you should be able to infer a common-clock dual-port RAM (assuming the same port widths) in any of the READ_FIRST, WRITE_FIRST or NO_CHANGE modes. It is fairly straightforward to code this in Verilog; for VHDL, as explained, you do need to use a shared variable to accomplish it. I am more familiar with Verilog than VHDL, but my understanding is that the shared variable is necessary for proper simulation when accessing the same array at the same time.

In terms of coding examples for these RAMs, most can be found in the Xilinx Language Templates, which are accessible from Xilinx Project Navigator. Open the Templates and look in VHDL or Verilog --> Synthesis Constructs --> Coding Examples --> RAM to see several examples. In the Single-Port descriptions you can see the differences between the READ_FIRST, WRITE_FIRST and NO_CHANGE modes. Unfortunately, not all of them have been adapted for the dual-port case there, but in theory they should work. I will see if in 11.2 we can get the templates updated to include all of the dual-port examples.

One other note: if you are inferring a BRAM in which you never plan to read from the same port at the time you are writing, describe NO_CHANGE mode. It will save power, but not many realize this.

Memory collisions (writing to a memory address on one port of a dual-port RAM while the other port is reading or writing the same address) are described in the device User Guides and the Synthesis and Simulation Design Guide, so I hope that most understand what they are and what should be done to avoid them. When inferring dual-port BRAM, however, you do need to exercise more caution. A behavioral RTL simulation will not model a collision or alert you to one, so you can very well simulate a collision behaviorally and get a seemingly valid result while the implementation gives something different. This is not covered by static timing analysis, as it is a dynamic situation. It can be covered and flagged by timing simulation. However, many choose not to do timing simulations, so in lieu of that some synthesis tools have decided to arbitrate access to the same memory locations with additional logic around the BRAM. Both Synplicity and Precision do this; XST does not. Most people who are aware of this disable the addition of the collision-avoidance logic using a synthesis attribute, as it can slow the RAM down, use more resources and add power to the FPGA design, and in many cases it is not needed. If you do disable it, however, you need to take extra care to ensure that an undetected collision will not give undesired results in your design.

I too try to avoid instantiation of BRAM; however, one advantage it does give you is that it will alert you to a memory collision, as this is modeled in the UNISIM. As mentioned before, a timing simulation (no matter how the RAM was entered) can also detect this. In-system testing cannot detect it. The reason is that collisions are as unpredictable as a timing error: while a system may behave one way in one device under one environmental condition (temperature or voltage) during a collision, it may behave differently in another device or under a different environmental condition. So I would not trust in-system testing for this any more than I would for a timing violation.

Hopefully this clears up some of the issues identified in this thread. I often do infer RAMs in my designs; however, there are certain circumstances (such as different port widths) that necessitate instantiation, so we are still not in a full RTL world when it comes to RAMs. However, more situations than most know can be inferred with relative ease (i.e. dual-port, byte enables, read modes and initialization from an external file can all be inferred now).

Regards,

-- Brian Philofsky
-- Xilinx Applications
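
For readers unfamiliar with the NO_CHANGE mode mentioned above, a minimal single-port sketch follows. It is written from the general idea that the output register is left alone during a write cycle and is not copied from the Xilinx templates; the entity name ram_no_change and its port names are assumptions, and whether a particular tool maps it onto a block RAM in NO_CHANGE mode should be checked in that tool's synthesis report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port RAM described in a NO_CHANGE style:
-- the read output only updates in cycles where no write takes place.
entity ram_no_change is
  generic (
    g_data_w : natural := 9;
    g_addr_w : natural := 11
    );
  port (
    i_clk  : in  std_logic;
    i_en   : in  std_logic;
    i_we   : in  std_logic;
    i_addr : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_data : in  std_logic_vector (g_data_w - 1 downto 0);
    o_data : out std_logic_vector (g_data_w - 1 downto 0)
    );
end ram_no_change;

architecture rtl of ram_no_change is
  type t_ram is array (0 to 2**g_addr_w - 1) of
    std_logic_vector (g_data_w - 1 downto 0);
  signal s_ram : t_ram := (others => (others => '0'));
begin
  p_ram : process (i_clk)
  begin
    if rising_edge(i_clk) then
      if i_en = '1' then
        if i_we = '1' then
          -- Write cycle: store the data and leave o_data untouched.
          s_ram(to_integer(unsigned(i_addr))) <= i_data;
        else
          -- Read cycle: only now does the output register change.
          o_data <= s_ram(to_integer(unsigned(i_addr)));
        end if;
      end if;
    end if;
  end process;
end rtl;
```

The read sits in the else branch of the write enable, so o_data holds its previous value during writes, which is what distinguishes this style from the READ FIRST and WRITE FIRST descriptions in Sandro's code.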