comp.arch.fpga | Altera's altsyncram MAXIMUM_DEPTH

What does this generic mean?

I am wondering if I am missing out on a possible memory optimization.

Altera's docs are decidedly vague and a search on their website brings up nothing.

-- Pete

Reply by Manfred Mücke ●November 17, 2003

Hi Peter,

> I am wondering if I am missing out on a possible memory optimization.
Yes, you are.

Quartus allocates memory depth-first; 512x8bit therefore uses two M4Ks 
in 512x4 mode. If your memory width and depth are powers of two, the 
allocation order doesn't matter except for some speed details. But a 
700x8bit memory is much better allocated by width than by depth (because 
only 3 M4Ks are needed for the former compared to 4 for the latter). (See 
http://www.altera.com/support/kdb/rd03292002_9305.html for further details.)
MAXIMUM_DEPTH should help you force Quartus not to waste this additional 
memory block.

Unfortunately it doesn't work, not even the way Altera thinks it should 
work. I had a long (and somewhat bizarre) service request, the last entry 
of which was the following:

-- Altera wrote
This is to let you know that a software problem request has been filed in 
order to reflect this issue.  I will let you know as soon as the software 
group gets back to me with any information or when a resolution is made.
-- Altera wrote much more, but [snip]

This was written on the 25th of August, and the service request was closed 
without further comment. About one month ago I posted an additional request 
asking for the current state of the problem request and did not receive 
any answer. Either Altera doesn't care, or they don't want to state that 
this is an issue before they are able to ship the new Quartus 
4.0 (hopefully fixing this and a lot of other things) - who knows?
If anyone in the group thinks they can help on this topic or has further 
details, I would be thankful to hear about it, as Quartus wastes a lot of my 
memory and this has to change!

I have to say that my experience with Altera mySupport is mixed. 
Answers are generally quick and friendly (which is already a lot) but 
usually only helpful when problems are simple. Whenever the problem gets 
more complex or there is a bug, things get very slow (or even stop).

Regards, Manfred

BTW: "Release notes for Service Pack 2 will be released on Friday, October 
24, 2003." (seen on 
https://www.altera.com/support/software/download/service_packs/quartus/dnl-qii30sp2.jsp 
on the 17th of November)




======= Service Request Detail (reordered for your convenience)
 Request #: 10363308
 Status: Closed
 Date Opened (PDT): 8/19/03 9:03 AM
 Date Closed (PDT): 9/4/03 6:52 PM
 Inquiry Type: Product Question
 Device Family: CYCLONE
 Device:
 Title: FIFO implementation size

 Description: I have created a 1300-word by 8-bit FIFO (sfifo). The 
implementation of this needs 16384 memory bits. Why?

The FIFO size should result in about 1300x8=10400 memory bits. As the 
block size of the embedded RAM in Cyclone is 4096 bits, which can be 
organized as 512x8, I expect Quartus to use three M4Ks, resulting in 
4096*3=12288 bits. Obviously it uses a fourth block. Why?

Regards, Manfred
 ------ 8/19/03 3:17 PM
  To Customer
  Hello Manfred,

This is to let you know that I am currently looking into this.  I will let 
you know as soon as I am able to verify the problem as you have described 
it and come to a resolution.

------ 8/19/03 4:20 PM
  To Customer
  Hello Manfred,

Since 1300 is larger than 1k, it'll use 2kx2 mode for best performance.  To 
get the x8 mode you'll need 4 M4Ks.  Click Custom (on page 6 of 8 of 
the MegaWizard); you then get an option to set the maximum depth, and if 
you set it to 512, it'll use that mode and should only need 3 M4Ks.

For more information on this, you may refer to the following link:

http://www.altera.com/support/kdb/rd03292002_9305.html

------ 8/20/03 12:36 AM
  From Customer
  Hello Marlon,

thanks for your quick and helpful reply. Now the behaviour of Quartus is 
clear to me.
Unfortunately setting the parameter max. block depth to 512 in the 
Megawizard Plug-In Manager as you proposed does not result in a smaller 
memory consumption. I have attached the packed project for your 
convenience.
Setting this parameter adds the following line in the scfifo instantiation 
code:           maximum_depth => 512,
however this parameter is not described in the Quartus II help page for the 
scfifo-Megafunction. Why?

Regards, Manfred

------ 8/20/03 9:47 AM
  To Customer
  Hello Manfred,

The MAXIMUM_DEPTH parameter is an internal parameter so there won't be any 
information on this in the Quartus II Help or Megawizard.

------ 8/20/03 11:26 PM
  From Customer
  Hello Marlon,

again: Unfortunately setting the parameter max. block depth to 512 in the 
Megawizard Plug-In Manager as you proposed does NOT result in a smaller 
memory consumption. Why? Please check with the attached project file.

Regards, Manfred
 ------ 8/21/03 5:08 PM
  To Customer
  Hello Manfred,

Sorry for the inconvenience, but actually, in order to get the x8 mode 
you'll need 4 M4Ks.

------ 8/21/03 11:49 PM
  From Customer
  Hello Marlon,

Could you please specify why it is not possible to implement a 1300x8 FIFO 
in 3 M4K blocks? This information contradicts both your first 
advice and the mentioned support database page 
(http://www.altera.com/support/kdb/rd03292002_9305.html).
What exactly is the parameter maximum block depth for, then?

Regards, Manfred

 ------ 8/25/03 6:50 PM
  To Customer
  Hello Manfred,

This is to let you know that a software problem request has been filed in 
order to reflect this issue.  I will let you know as soon as the software 
group gets back to me with any information or when a resolution is made.
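
The block counts argued over in the service request above can be reproduced with a little arithmetic. The following is a simplified sketch, not Quartus's actual allocation algorithm: it assumes the six M4K data-only configurations listed in `M4K_MODES`, and the function names are hypothetical.

```python
import math

# Assumed M4K configurations without parity bits: slice depth -> data width.
M4K_MODES = {4096: 1, 2048: 2, 1024: 4, 512: 8, 256: 16, 128: 32}
M4K_BITS = 4096  # data bits per block in this model


def m4k_blocks(depth, width, slice_depth):
    """Blocks needed when the memory is built from slices of slice_depth words."""
    rows = math.ceil(depth / slice_depth)             # slices stacked in depth
    cols = math.ceil(width / M4K_MODES[slice_depth])  # blocks side by side per slice
    return rows * cols


def default_slice_depth(depth):
    """Depth-first choice: the smallest supported slice depth covering the whole memory."""
    candidates = [d for d in M4K_MODES if d >= depth]
    return min(candidates) if candidates else max(M4K_MODES)


depth, width = 1300, 8  # the FIFO from the service request
for slice_depth in (default_slice_depth(depth), 512):
    blocks = m4k_blocks(depth, width, slice_depth)
    print(f"{slice_depth}x{M4K_MODES[slice_depth]} mode: {blocks} M4Ks, "
          f"{blocks * M4K_BITS} memory bits for {depth * width} data bits")
```

Under these assumptions the depth-first choice is the 2048x2 mode with four blocks (16384 bits), while 512-word slices need three blocks (12288 bits), which are the two figures quoted in the request.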

Reply by Subroto Datta ●November 17, 2003

petersommerfeld@hotmail.com (Peter Sommerfeld) wrote in message news:<5c4d983.0311170541.5bd0c1db@posting.google.com>...
> What does this generic mean?
> 
> I am wondering if I am missing out on a possible memory optimization.
> 
> Altera's docs are decidedly vague and a search on their website brings up nothing.
> 
> -- Pete

MAXIMUM_DEPTH controls the underlying RAM block size that will be used
to construct the user's altsyncram memory.  By default, the altsyncram
megafunction will round up the memory depth to the next power-of-2,
and use that as a RAM block size.  For example, if you ask for a
3K-word memory, altsyncram will normally construct it from 4K RAM
blocks, because this gives the best performance.  If you are running
short of RAM blocks, you could specify MAXIMUM_DEPTH=1024 for this
example, and the altsyncram megafunction will construct the 3K memory
from 1K-word RAM blocks, which might potentially use 1/4 fewer RAM
blocks.  The penalty for doing this is that the 3K-word memory
constructed from 1K-word RAM blocks will need LEs to mux and de-mux
the data, and will also run slower as a result.

In summary, MAXIMUM_DEPTH is a control to increase memory efficiency
for non-power-of-2 memory depths, but at a cost of lower memory
performance, and a few LEs to stitch the smaller RAM blocks together. 
MAXIMUM_DEPTH can only take power-of-2 values, with 32 being the
smallest meaningful value, since it corresponds to the shallowest M512
memory block configuration.

- Subroto Datta
Altera Corp.
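
A minimal sketch of the trade-off described in the post above, applied to the 3K-word example. It is a simplified model and not the altsyncram implementation: the 8-bit width is an assumption (the post gives none), the M4K data-only configurations in `M4K_WIDTH_AT_DEPTH` are assumed, and `plan` is a hypothetical helper.

```python
import math

# Assumed M4K data-only configurations: block depth -> data width.
M4K_WIDTH_AT_DEPTH = {4096: 1, 2048: 2, 1024: 4, 512: 8, 256: 16, 128: 32}


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def plan(depth, width, maximum_depth=None):
    """Return (block_depth, slices, blocks, select_bits) for a depth x width RAM."""
    block_depth = next_power_of_two(depth)          # default: round the depth up
    if maximum_depth is not None:
        block_depth = min(block_depth, maximum_depth)
    block_depth = max(128, min(block_depth, 4096))  # clamp to the M4K range
    slices = math.ceil(depth / block_depth)
    blocks = slices * math.ceil(width / M4K_WIDTH_AT_DEPTH[block_depth])
    select_bits = (slices - 1).bit_length()         # address bits feeding the output mux
    return block_depth, slices, blocks, select_bits


print(plan(3072, 8))                      # default rounding to 4K-word blocks
print(plan(3072, 8, maximum_depth=1024))  # 1K-word blocks stitched together
```

With these assumptions the default plan uses eight blocks and no mux, while `maximum_depth=1024` uses six blocks in three slices selected by two address bits. That is the "1/4 fewer RAM blocks" case, paid for with the extra mux logic.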

Reply by Peter Sommerfeld ●November 18, 2003

Hi Manfred, Subroto:

Thank you very much for your in-depth replies. I'm happy to see that
MAXIMUM_DEPTH does what I was hoping it does, because I need many RAMs
with non-power-of-2 sizes, and I'm feeling a little too lazy to
write my own muxing logic.

Manfred, I compiled a design that had one depth-first and one
width-first RAM block, each being 1,089 x 32 bits. The depth-first
used 16 M4Ks as 4096x2, and the width-first used 9 M4Ks as 128x32,
so the functionality appears to be working for me. Perhaps certain
memory configurations work properly with MAXIMUM_DEPTH, while others
(i.e. yours) do not?

As expected the critical path was in the width-first logic, but was
still 220 MHz+.

I am using Quartus II 3.0 SP2. I found the release notes at
http://www.altera.com/literature/rn/rn_qts.pdf.

Thanks again,

-- Pete
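
To see where block counts like these come from, the sketch below sweeps every slice depth for a 1,089 x 32 memory. It is a simplified model under the assumption that an M4K offers the six data-only configurations listed in `M4K_MODES`; it does not reproduce what the Quartus fitter actually does, and `sweep` is a hypothetical helper.

```python
import math

# Assumed M4K data-only configurations as (block depth, data width) pairs.
M4K_MODES = [(4096, 1), (2048, 2), (1024, 4), (512, 8), (256, 16), (128, 32)]


def sweep(depth, width):
    """Return (block_depth, block_width, blocks) for every assumed M4K mode."""
    results = []
    for block_depth, block_width in M4K_MODES:
        blocks = math.ceil(depth / block_depth) * math.ceil(width / block_width)
        results.append((block_depth, block_width, blocks))
    return results


for block_depth, block_width, blocks in sweep(1089, 32):
    print(f"{block_depth:>4} x {block_width:<2} slices: {blocks:>2} M4Ks")
```

In this model the 128 x 32 row gives 9 blocks. A 4096-bit block that is 2 bits wide is 2048 words deep, so the 16-block figure corresponds to the 2048 x 2 row.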


Reply by Subroto Datta ●November 19, 2003

Hi Manfred, Peter,

The MAXIMUM_DEPTH description that was posted in my previous reply
applies to the altsyncram megafunction, and indirectly to scfifo and
dcfifo megafunctions.  The FIFO megafunctions do not support
non-power-of-2 depths, so the memory example I gave does not apply. 
In Quartus II 4.0, the FIFO MegaWizard plug-in will not allow you to
enter non-power-of-2 depths.

The only reason for specifying a MAXIMUM_DEPTH parameter in a FIFO
megafunction in pre-4.0 versions of Quartus would be to enforce a
smaller RAM block size to give added freedom to the fitter. 
MAXIMUM_DEPTH values of 128, 256, and 512 can fit in either M512
blocks or M4K blocks.  A MAXIMUM_DEPTH value of 4096 can fit in either
an M4K block or an M-RAM.

Here's an example:  I have a 2K word FIFO, and I don't care if it goes
into M4K blocks or M512 blocks.  If I set MAXIMUM_DEPTH=512, the FIFO
will be constructed from 512-word RAM slices, which gives the fitter
the flexibility to place the FIFOs in either M512 blocks or M4K
blocks.

- Subroto Datta
Altera Corp.
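
The point about fitter freedom can be illustrated with a small lookup of which block types can hold a slice of a given depth. The depth ranges in `BLOCK_DEPTH_RANGE` are assumptions for Stratix-class devices (data bits only), chosen to be consistent with the values named in the post above; `candidate_blocks` is a hypothetical helper, not an Altera API.

```python
# Assumed shallowest and deepest configurations (in words) per block type.
BLOCK_DEPTH_RANGE = {
    "M512": (32, 512),
    "M4K": (128, 4096),
    "M-RAM": (4096, 65536),
}


def candidate_blocks(maximum_depth):
    """List the block types that could implement a slice of maximum_depth words."""
    if maximum_depth < 1 or maximum_depth & (maximum_depth - 1):
        raise ValueError("MAXIMUM_DEPTH must be a power of two")
    return [name for name, (lo, hi) in BLOCK_DEPTH_RANGE.items()
            if lo <= maximum_depth <= hi]


for depth in (32, 128, 512, 4096):
    print(depth, candidate_blocks(depth))
```

A slice depth of 512 leaves both M512 and M4K open to the fitter, whereas a device family that only has M4K blocks gains nothing from this particular freedom.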

Reply by Manfred Mücke ●November 20, 2003

Hi Subroto,

> The FIFO megafunctions do not support non-power-of-2 depths, so the 
> memory example I gave does not apply.

This is a very clear answer to a very long service request issue; it would 
have saved me a lot of time to get the very same answer from Altera 
mySupport. Instead they left me with a dangling service request and the 
information that there is a potential bug in Quartus. Are you able to 
look into that, or to share your knowledge with your support 
team? I would appreciate getting an official answer from mySupport that 
really closes my service request.
BTW: Why do you restrict FIFO depths to powers of two? Allowing other depths 
would permit trading memory usage against implementation speed (as with 
altsyncram).

Regards, Manfred

Reply by Mike Treseler ●November 20, 2003

Manfred Mücke wrote:

> BTW: Why do you restrict FIFO depths to powers of two? That would allow
> trading memory usage versus implementation speed (like with altsyncram).

Probably because FIFO storage is based on a ram,
and ram comes in increments of one address bit.

As Subroto said, the extra space from altsyncram MAXIMUM_DEPTH
to the top could not be used as RAM in any case.

  -- Mike Treseler

Reply by Manfred Mücke ●November 21, 2003

>> BTW: Why do you restrict FIFO depths to powers of two? That would allow 
>> trading memory usage versus implementation speed (like with altsyncram).

> Probably because FIFO storage is based on a ram,
> and ram comes in increments of one address bit.

True as long as the size of the memory/FIFO is smaller than the memory 
blocks available in the device. A Cyclone, for example, uses M4K memory 
blocks with 4096 bits each (as the name suggests). So for RAMs/FIFOs <4096 
bits you will always pay with a full M4K (as long as they are implemented in 
memory blocks), but for RAMs/FIFOs >4096 bits the M4K block is the smallest 
building unit, allowing you to implement a RAM/FIFO using 3*4096=12288 bits 
from 3 M4K blocks (depending on the FIFO width). Because address decoding 
is easier when aligning by depth, and to improve speed, it can make sense to 
use more (four in our example) M4K blocks, wasting some memory, but it is by 
no means a necessity.
The power-of-two restriction does not apply to RAMs but only to FIFOs, and 
will be introduced in Quartus 4.0, as Subroto said. However, RAMs and FIFOs 
are both implemented in the very same memory blocks, so it's up to the 
wizard/module designer to allow or restrict the depth. Restricting FIFO 
depths to powers of two is a choice, but as long as there is no special 
FIFO RAM block, it is not a must. My question was why this limitation, which 
restricts potential savings on memory bit consumption, will be introduced.

Regards, Manfred

Reply by ●November 23, 2003

Followup to:  <opryznd7oggdoir8@news.inode.at>
By author:    Manfred Mücke <manfred.getmuecke@ridgmxof.thisat>
In newsgroup: comp.arch.fpga
> 
> True as long as the size of the memory/FIFO is smaller than the memory 
> blocks available in the device. A Cyclone for example uses M4K memory 
> blocks with 4096bit each (as the name suggests). So for RAM/FIFOs <4096bit 
> you will always pay with a full M4K (as long as they are implemented in 
> memory blocks), but for RAM/FIFOs >4096bits the M4K-block is the smallest 
> building unit, allowing you to implement a RAM/FIFO using 3*4096=12288bits 
> from 3 M4K-blocks (depending on the FIFO width). Because address decoding 
> is easier when aligning by depth and to improve speed, it can make sense to 
> use more (four in our example) M4K-blocks wasting some memory, but it is by 
> no means a necessity.
> 

There is another issue, which is that the RAMs are actually 4608 bits,
not 4096.  I have seen Quartus refuse to use those extra bits in
situations where it could have, because it prefers to organize by
depth, and there is apparently no way to work around this.

I would really like to see:

(a) support of non-power-of-two memory sizes;
(b) ability to optimize for RAM consumption at the expense of timing.

This in particular was an issue when I tried to create a 16384 x 9 bit
ROM, and yes, I needed all 9 bits...

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
If you send me mail in HTML format I will assume it's spam.
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64
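
The cost of ignoring the extra bits can be sketched for the 16384 x 9 ROM mentioned in the post above. This is a simplified model, not Quartus's behaviour: it assumes that the ninth bit per byte is only reachable in the 512x9, 256x18 and 128x36 configurations of an M4K, as listed in `M4K_PARITY_MODES`, and `blocks` is a hypothetical helper.

```python
import math

# Assumed M4K configurations including the extra bits (4608 bits per block).
# The deep, narrow modes are assumed not to expose the ninth bit.
M4K_PARITY_MODES = {4096: 1, 2048: 2, 1024: 4, 512: 9, 256: 18, 128: 36}


def blocks(depth, width, block_depth):
    """M4K blocks needed for a depth x width memory built from block_depth-word slices."""
    rows = math.ceil(depth / block_depth)
    cols = math.ceil(width / M4K_PARITY_MODES[block_depth])
    return rows * cols


depth, width = 16384, 9
by_depth = blocks(depth, width, 4096)  # deepest slices, one bit wide
by_width = blocks(depth, width, 512)   # 512x9 slices using all 4608 bits
print(f"organized by depth: {by_depth} M4Ks, by width: {by_width} M4Ks")
```

In this model the 512x9 organization stores the 147456 ROM bits in 32 blocks with nothing left over, while the 4096x1 organization needs 36.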

Reply by Manfred Mücke ●December 6, 2003

Hi Subroto,

I would like to renew my question: Why do you restrict FIFO depths to 
powers of two? I can't see the need for that.

Regards, Manfred
