Memory Handling in Altera Cyclone devices

Started by Vazquez September 29, 2003
Hello,

I am trying to turn a functional behavioral description of a controller
module into real hardware in a Cyclone device (EP1C6C256C7).

The VHDL model of the controller controls the write and read
transactions to a data field (a two-dimensional array), which could be,
for example, an SRAM block.

The problem is that the write and read addresses and the next-addresses
of the SRAM are calculated in smaller sub-arrays, which also have a
two-dimensional structure. However, these sub-arrays are too large to be
synthesized exclusively in the logic resources of the FPGA. So the
memory bits of the Cyclone device, which are available in the form of
RAM, ROM and FIFOs, will have to be used to make synthesis feasible.

The problem is how to split the behavioral description into submodules
so that the functionality stays the same while the memory bits are used.
An essential question is where and how to place the control logic
cleverly.

Maybe someone has a good or basic idea. I would be grateful for any
suggestion.

Below is some exemplary VHDL code from the functional description of the
module. The signal row_write represents the current block number of the
SRAM block and is taken from a FIFO (tb_data_out) that holds block
numbers from 0 to 255.
The SRAM block is described as if it were inside the controller file.
The question here is how to "detach" or split this array and the other
sub-arrays.

----------------------------------------------------------------------
----------------------------------------------------------------------
generic( ROW_ADDR_BITS: integer:= 8;          
           ROWS         : integer:= 256;      
           COL_ADDR_BITS: integer:= 3;        
           COLS         : integer:= 8           
         );

type t_matri is array (0 to COLS*ROWS-1) of std_logic_vector(7 downto 0);
signal data : t_matri;
type t_addr is array (0 to ROWS-1) of std_logic_vector(ROW_ADDR_BITS-1 downto 0);
signal next_addr : t_addr;

type t_col is array (0 to ROWS-1) of std_logic_vector(COL_ADDR_BITS-1 downto 0);
signal last_col  : t_col;

signal row_write      : integer range 0 to 255; 

begin
.
.
.
     if (write='1' and writing='0' and lsfull='0') then
         if tb_empty='0' then
            row_write <= conv_integer(tb_data_out);
            last_col(row_write) <= "000";   

            data(row_write*COLS + conv_integer(last_col(row_write))) 
                                                           <=data_in; 
            -- Here the index of data is the sum of two values
            -- that come from two different sub-arrays.
            -- Question: how to handle this split in real hardware blocks?
     .
     .
     .        
     elsif (write='1' and writing='1' and lsfull='0') then
              .
              .
              .
              next_addr(row_write)  <= tb_data_out;
     end if;
---------------------------------------------------------------------
---------------------------------------------------------------------
Thank you very much for your help.

Best regards

Andrés Vázquez
G & D - Digital System Development 
email: andres.vazquez@gmx.de
Hi Andres,

You should take a look at the lpm_ram component provided as part of Quartus.
There is a good explanation of it provided in the Quartus Help file.

LPM_RAM provides you with a parameterized memory; you should be able to make
your own wrapper around it if you wish.  Quartus will automatically map your
RAM into the correct set of M4K memories on Cyclone.  If you target Stratix,
Quartus will automatically select the best memory type (M512, M4K, MegaRAM)
to implement your memory.
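
As a rough sketch of such a wrapper for the data array in the original
post: because COLS is 8 (a power of two), the index
row_write*COLS + last_col is the same as the row bits concatenated with
the column bits, so no multiplier or adder is needed in front of the
memory. The entity name data_ram and its port names are hypothetical,
and the memory is written here as a plain clocked array standing in for
an lpm_ram instance; whether a given Quartus version maps it to M4K
blocks is an assumption to check in the fitter report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper: one 2048 x 8 memory addressed by (row, column).
entity data_ram is
  generic (
    ROW_ADDR_BITS : integer := 8;   -- 256 rows, as in the original post
    COL_ADDR_BITS : integer := 3    -- 8 columns per row
  );
  port (
    clk     : in  std_logic;
    we      : in  std_logic;
    wr_row  : in  std_logic_vector(ROW_ADDR_BITS-1 downto 0);
    wr_col  : in  std_logic_vector(COL_ADDR_BITS-1 downto 0);
    data_in : in  std_logic_vector(7 downto 0);
    rd_row  : in  std_logic_vector(ROW_ADDR_BITS-1 downto 0);
    rd_col  : in  std_logic_vector(COL_ADDR_BITS-1 downto 0);
    q       : out std_logic_vector(7 downto 0)
  );
end entity data_ram;

architecture rtl of data_ram is
  constant ADDR_BITS : integer := ROW_ADDR_BITS + COL_ADDR_BITS;
  type t_mem is array (0 to 2**ADDR_BITS - 1) of std_logic_vector(7 downto 0);
  signal mem     : t_mem;
  signal wr_addr : std_logic_vector(ADDR_BITS-1 downto 0);
  signal rd_addr : std_logic_vector(ADDR_BITS-1 downto 0);
begin
  -- With COLS = 2**COL_ADDR_BITS, row*COLS + col equals row & col.
  wr_addr <= wr_row & wr_col;
  rd_addr <= rd_row & rd_col;

  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        mem(to_integer(unsigned(wr_addr))) <= data_in;
      end if;
      -- Registered read: q is valid one clock after rd_addr is applied.
      q <= mem(to_integer(unsigned(rd_addr)));
    end if;
  end process;
end architecture rtl;
```

With this split, last_col and next_addr would each become their own
small memory of the same style. Since the read is clocked, a value
fetched from last_col can only be used as a column address on the
following clock, so the control logic has to allow for that extra cycle.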

Regards,

Paul Leventis
Altera Corp.

Dear Mr Leventis,

thank you for your answer.

The problem seems to be how the Quartus II compiler recognizes
RAM structures.
The following source code is synthesized using memory bits
when the signal writing is not used.
When the signal writing is used, Quartus II tries to synthesize
it without any memory bits.
What could be the problem with using the signal writing?

Kind regards
Andrés
G&D System Development - FPGA design


-------------------------------------------------
-------------------------------------------------
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

PACKAGE test_ram_package IS

   CONSTANT ram_width : INTEGER := 8;
   CONSTANT ram_depth : INTEGER := 2048;
   
   TYPE word IS ARRAY(0 to ram_width - 1) of std_logic;
   TYPE ram IS ARRAY(0 to ram_depth - 1) of word;
   SUBTYPE address_vector IS INTEGER RANGE 0 to ram_depth - 1;

   CONSTANT xram_width : INTEGER := 12;
   CONSTANT xram_depth : INTEGER := 16;
   
   TYPE xword IS ARRAY(0 to xram_width - 1) of std_logic;
   TYPE xram IS ARRAY(0 to xram_depth - 1) of address_vector;
   SUBTYPE xaddress_vector IS INTEGER RANGE 0 to xram_depth - 1;
   
END test_ram_package;

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;
USE work.test_ram_package.ALL;


ENTITY test_inferred_ram IS
   PORT
   (  --reset  : IN   std_logic;
      clock1 : IN   std_logic;
      clock2 : IN   std_logic;
      data   : IN   word;
      write_address: IN  address_vector;
      read_address:  IN  xaddress_vector;
      write_xaddress : IN xaddress_vector;
      xdata          : IN address_vector;
      we     : IN   std_logic;
      q      : OUT  word
   );
END test_inferred_ram;

ARCHITECTURE rtl OF test_inferred_ram IS

  
   SIGNAL ram_block : RAM;
   SIGNAL xram_block : XRAM;
   SIGNAL read_address_reg : xaddress_vector;
   SIGNAL writing : std_logic;
   
BEGIN

   PROCESS (clock1)
   BEGIN
      --IF reset='1' then
      --   ram_block <= (others => (others=>'0'));
      --   xram_block <= (others=>0);
      --   writing <= '0';
      IF (clock1'event AND clock1 = '1') THEN
         IF (we = '1' and writing='0') THEN
            ram_block(write_address) <= data;
            xram_block(write_xaddress) <= xdata;
            writing <= '1';
         ELSIF (we='1' and writing='1') THEN
            ram_block(write_address) <= data;
            xram_block(write_xaddress) <= xdata;
            writing <= '0';
         END IF;
      END IF;
   END PROCESS;

   PROCESS (clock2)
   BEGIN
      IF (clock2'event AND clock2 = '1') THEN
         q <= ram_block(xram_block(read_address_reg));
         read_address_reg <= read_address;
      END IF;
   END PROCESS;
   
END rtl;
Hi Andres,

The problem isn't your "writing" signal -- it is the asynchronous clearing
of the memory.  Memories are not clearable -- if you want to clear your
memory, you must do so by iterating through it and writing 0s.
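
A minimal sketch of clearing by iteration, assuming a simple dual-port
memory of 2048 x 8 as in the posted test case: a counter takes over the
write port and writes zero to one address per clock. The entity
ram_with_clear and the clear and clearing ports are hypothetical names
used only for illustration, and the power-up values of clr_cnt and
clr_busy rely on the tool honoring signal initializers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_with_clear is
  generic (
    ADDR_BITS : integer := 11;  -- 2048 words, matching ram_depth above
    DATA_BITS : integer := 8
  );
  port (
    clk      : in  std_logic;
    clear    : in  std_logic;  -- one-cycle pulse starts the zeroing sweep
    we       : in  std_logic;
    wr_addr  : in  std_logic_vector(ADDR_BITS-1 downto 0);
    data_in  : in  std_logic_vector(DATA_BITS-1 downto 0);
    rd_addr  : in  std_logic_vector(ADDR_BITS-1 downto 0);
    q        : out std_logic_vector(DATA_BITS-1 downto 0);
    clearing : out std_logic   -- high while the sweep is in progress
  );
end entity ram_with_clear;

architecture rtl of ram_with_clear is
  type t_mem is array (0 to 2**ADDR_BITS - 1)
    of std_logic_vector(DATA_BITS-1 downto 0);
  signal mem      : t_mem;
  signal clr_cnt  : unsigned(ADDR_BITS-1 downto 0) := (others => '0');
  signal clr_busy : std_logic := '0';
  signal mem_we   : std_logic;
  signal mem_addr : std_logic_vector(ADDR_BITS-1 downto 0);
  signal mem_data : std_logic_vector(DATA_BITS-1 downto 0);
begin
  clearing <= clr_busy;

  -- While the sweep runs it owns the write port; user writes are ignored.
  mem_we   <= '1' when clr_busy = '1' else we;
  mem_addr <= std_logic_vector(clr_cnt) when clr_busy = '1' else wr_addr;
  mem_data <= (others => '0') when clr_busy = '1' else data_in;

  -- Sweep control: count through every address exactly once.
  process (clk)
  begin
    if rising_edge(clk) then
      if clr_busy = '1' then
        if clr_cnt = 2**ADDR_BITS - 1 then
          clr_busy <= '0';          -- last address is written this cycle
        end if;
        clr_cnt <= clr_cnt + 1;
      elsif clear = '1' then
        clr_busy <= '1';
        clr_cnt  <= (others => '0');
      end if;
    end if;
  end process;

  -- Memory itself: no reset branch, only clocked write and read.
  process (clk)
  begin
    if rising_edge(clk) then
      if mem_we = '1' then
        mem(to_integer(unsigned(mem_addr))) <= mem_data;
      end if;
      q <= mem(to_integer(unsigned(rd_addr)));
    end if;
  end process;
end architecture rtl;
```

The sweep takes 2**ADDR_BITS clocks (2048 here). The surrounding logic
would be expected to hold off its own writes while clearing is high.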

Regards,

Paul Leventis
Altera Corp.



Hello Mr Leventis,

You said that to clear the memory I have to iterate through it and write 0s.
In Cyclone devices it is possible to create FIFO and RAM structures
with the MegaWizard Plug-In Manager. These structures have a signal called 'aclr'.
What does this signal do in
a) FIFOs
b) RAMs?
Is the content of the memory set to 0, or only the surrounding registers?

Thank you very much

Kind regards
Andrés Vázquez
G&D System Development


I believe aclr just clears the output latch.

In a FIFO it also resets the counters.  This is as good as clearing the
RAM, since the FIFO only ever outputs data that it has already stored.
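
To illustrate why resetting the counters is enough, here is a small
single-clock FIFO sketch in which aclr touches only the read and write
pointers and never the storage. It is a hypothetical model, not the
implementation of the MegaWizard FIFO; the entity name simple_fifo is
invented, and the clearing of the output latch mentioned above is not
modeled.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity simple_fifo is
  generic (
    ADDR_BITS : integer := 8;   -- 256 entries
    DATA_BITS : integer := 8
  );
  port (
    clk   : in  std_logic;
    aclr  : in  std_logic;      -- asynchronous clear of the pointers only
    wrreq : in  std_logic;
    rdreq : in  std_logic;
    data  : in  std_logic_vector(DATA_BITS-1 downto 0);
    q     : out std_logic_vector(DATA_BITS-1 downto 0);
    empty : out std_logic;
    full  : out std_logic
  );
end entity simple_fifo;

architecture rtl of simple_fifo is
  type t_mem is array (0 to 2**ADDR_BITS - 1)
    of std_logic_vector(DATA_BITS-1 downto 0);
  signal mem      : t_mem;
  -- One extra pointer bit distinguishes "full" from "empty".
  signal wr_ptr   : unsigned(ADDR_BITS downto 0) := (others => '0');
  signal rd_ptr   : unsigned(ADDR_BITS downto 0) := (others => '0');
  signal is_empty : std_logic;
  signal is_full  : std_logic;
begin
  is_empty <= '1' when wr_ptr = rd_ptr else '0';
  is_full  <= '1' when wr_ptr(ADDR_BITS) /= rd_ptr(ADDR_BITS)
                   and wr_ptr(ADDR_BITS-1 downto 0) = rd_ptr(ADDR_BITS-1 downto 0)
              else '0';
  empty <= is_empty;
  full  <= is_full;

  -- Pointers: the only state affected by aclr.
  process (clk, aclr)
  begin
    if aclr = '1' then
      wr_ptr <= (others => '0');
      rd_ptr <= (others => '0');
    elsif rising_edge(clk) then
      if wrreq = '1' and is_full = '0' then
        wr_ptr <= wr_ptr + 1;
      end if;
      if rdreq = '1' and is_empty = '0' then
        rd_ptr <= rd_ptr + 1;
      end if;
    end if;
  end process;

  -- Storage: no reset; old contents remain but cannot be reached,
  -- because a read is only allowed for entries written since the clear.
  process (clk)
  begin
    if rising_edge(clk) then
      if wrreq = '1' and is_full = '0' then
        mem(to_integer(wr_ptr(ADDR_BITS-1 downto 0))) <= data;
      end if;
      if rdreq = '1' and is_empty = '0' then
        q <= mem(to_integer(rd_ptr(ADDR_BITS-1 downto 0)));
      end if;
    end if;
  end process;
end architecture rtl;
```

After aclr the FIFO reports empty, so every word it later delivers was
written after the clear, regardless of what the memory cells still hold.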

In a RAM you will need a state machine to clear the entire memory,
unless you can tolerate unwritten locations.

You might, however, be able to set up the RAM as dual-port: one port with
the 1k x 4 or whatever you want, the other set to maximum width, say 256 x 16.
This means it can be cleared in a minimum of 256 clocks instead of 1024.

I am not sure about the Cyclone, but with Xilinx it is only the output
latch of the RAM that is cleared.  This means the RAM always returns 0x00
until the clear is released, but you can still write to it.  So you can
clear it in the background while it already appears to be clear on the
second port.

Someone might be able to say whether this is true for the Cyclone too, but
from what I have read I believe it is.


Simon


Vazquez, Simon:

Yes, the M4K blocks in Cyclone (and the M512 and M512K blocks in Stratix)
are asynchronously clearable on their input and/or output registers (depends
on the single/dual-port setup, etc.).  And the M4K block can be configured
in a mixed-width mode as described by Simon.

- Paul Leventis
Altera Corp.