FPGARelated.com

What to do with an improved algorithm?

Started by Mike Field September 3, 2018
On Wednesday, 5 September 2018 23:43:28 UTC+12, Gene Filatov  wrote:
> I don't know the literature well, but I think it would be cool if you
> actually write an article detailing your approach!
>
> Gene

I'll work on writing one up over the next few days, as well as posting a sample implementation.

Mike
> What do you think?
That's good stuff. I wonder why you think using a BRAM is bad? It's good to use the hard cores in an FPGA instead of the fabric--it's less power and deterministic routing times.

Is the CORDIC still advantageous in a modern FPGA? The last time I needed to find sine, I used a coarse BRAM lookup that output sine on one port and cos on another. Those were used as derivatives for a 2nd-order Taylor. Two multipliers (more hard cores) are used (using Horner's Rule) for the 1st and 2nd-order interpolations. I don't remember how many digits of accuracy this yields, but the latency is low.
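A rough sketch of the lookup-plus-Taylor idea described above, written in floating-point Python rather than fixed-point HDL. The table size (256 entries over a full turn) and all names (`SIN_TAB`, `COS_TAB`, `sin_taylor2`) are assumptions for illustration, not details from the post. The two tables stand in for the two BRAM ports, and the Horner form needs two multiplies per result (the halving is a shift in hardware).

```python
import math

TABLE_BITS = 8                     # assumed: 256 coarse entries per full turn
N = 1 << TABLE_BITS
STEP = 2 * math.pi / N

# Two tables standing in for the two BRAM ports (sine on one, cosine on the other)
SIN_TAB = [math.sin(i * STEP) for i in range(N)]
COS_TAB = [math.cos(i * STEP) for i in range(N)]

def sin_taylor2(theta):
    """Approximate sin(theta) from a coarse lookup plus a 2nd-order Taylor correction."""
    idx = round(theta / STEP)      # nearest coarse table point
    d = theta - idx * STEP         # residual angle, |d| <= STEP / 2
    idx %= N                       # wrap the index into the table
    s, c = SIN_TAB[idx], COS_TAB[idx]
    # sin(a + d) ~= sin(a) + d*cos(a) - (d^2 / 2)*sin(a)
    # Horner form: s + d * (c - (d / 2) * s)
    return s + d * (c - 0.5 * d * s)
```

With these assumed sizes the residual is at most pi/256, so the neglected third-order term is bounded by roughly d^3/6, a few parts in 10^7. A larger table or a third-order term would trade BRAM or multipliers for more digits.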
On Wednesday, 10 October 2018 21:52:06 UTC-4, Kevin Neilson wrote:
> > What do you think?
>
> That's good stuff. I wonder why you think using a BRAM is bad? It's good to use the hard cores in an FPGA instead of the fabric--it's less power and deterministic routing times.
>
> Is the CORDIC still advantageous in a modern FPGA? The last time I needed to find sine, I used a coarse BRAM lookup that output sine on one port and cos on another. Those were used as derivatives for a 2nd-order Taylor. Two multipliers (more hard cores) are used (using Horner's Rule) for the 1st and 2nd-order interpolations. I don't remember how many digits of accuracy this yields, but the latency is low.
Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).
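To put numbers on the direct-lookup estimate above, here is a small back-of-the-envelope calculation. The 16-bit output word and the 36 Kb block RAM size are assumptions chosen for illustration; the factor of 16 for symmetries is taken from the post as-is.

```python
IN_BITS = 16                 # two 16-bit inputs, as in the post
OUT_BITS = 16                # assumed width of each stored angle
BRAM_BITS = 36 * 1024        # assumed capacity of one block RAM (36 Kb)

full_entries = 1 << (2 * IN_BITS)      # 2^32 addresses for a direct (x, y) lookup
folded_entries = full_entries // 16    # "maybe 16 times less" using symmetries

def table_bytes(entries):
    """Storage in bytes for a table with OUT_BITS-wide entries."""
    return entries * OUT_BITS // 8

def brams_needed(entries):
    """Number of BRAM_BITS-sized blocks needed, rounded up."""
    return -(-entries * OUT_BITS // BRAM_BITS)

print(table_bytes(full_entries) / 2**30, "GiB for the full table")
print(table_bytes(folded_entries) / 2**20, "MiB after folding")
print(brams_needed(folded_entries), "block RAMs after folding")
```

Even the folded table is far beyond on-chip memory under these assumptions, which is the point being made about a plain lookup for atan2.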
> Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).
I think you missed the part about the Taylor interpolation.
On Thursday, October 11, 2018 at 7:37:37 AM UTC-6, Benjamin Couillard wrote:

> Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).
Sorry; I didn't see that you were talking about arctan. It's not quite as easy as sin/cos, but there is still the question as to whether a Farrow-type architecture using a coarse lookup and Taylor interpolations would be better than a CORDIC, and I am guessing that with the BRAMs and multipliers in present-day FPGAs, the answer would be yes.
On Friday, 12 October 2018 00:42:18 UTC-4, Kevin Neilson wrote:
> On Thursday, October 11, 2018 at 7:37:37 AM UTC-6, Benjamin Couillard wrote:
>
> > Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).
>
> Sorry; I didn't see that you were talking about arctan. It's not quite as easy as sin/cos, but there is still the question as to whether a Farrow-type architecture using a coarse lookup and Taylor interpolations would be better than a CORDIC, and I am guessing that with the BRAMs and multipliers in present-day FPGAs, the answer would be yes.

Yes, I think you're right, it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), and a BRAM + Taylor interpolation.

For atan, you'd simply skip the divider and sign look-up part.

On the other hand, Xilinx and Altera offer plug-and-play CORDIC cores for atan/atan2.
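A minimal floating-point sketch of the divider + sign look-up + table-with-Taylor structure outlined above. Everything specific here is an assumption for illustration: the 256-step table over ratios in [0, 1], the swap of x and y to keep the ratio in that range, the second-order correction, and the names `ATAN_TAB`, `D1_TAB`, and `atan2_lut`. In hardware the division would be the expensive part and the arithmetic would be fixed-point.

```python
import math

TABLE_BITS = 8
N = 1 << TABLE_BITS                    # assumed: 256 steps over ratio range [0, 1]
# Coarse atan values and first derivatives 1 / (1 + r^2) at each table point
ATAN_TAB = [math.atan(i / N) for i in range(N + 1)]
D1_TAB = [1.0 / (1.0 + (i / N) ** 2) for i in range(N + 1)]

def atan2_lut(y, x):
    """Approximate atan2(y, x) using a ratio, a coarse table, and a Taylor correction."""
    if x == 0 and y == 0:
        return 0.0
    ax, ay = abs(x), abs(y)
    swap = ay > ax                     # keep the ratio in [0, 1]
    r = ax / ay if swap else ay / ax   # this is the divider
    idx = round(r * N)                 # nearest table point
    base = idx / N
    d = r - base                       # residual, |d| <= 1 / (2N)
    f1 = D1_TAB[idx]
    # atan(base + d) ~= atan(base) + d*f1 - d^2 * base * f1^2, in Horner form
    a = ATAN_TAB[idx] + d * (f1 - d * base * f1 * f1)
    if swap:
        a = math.pi / 2 - a            # undo the x/y swap
    if x < 0:
        a = math.pi - a                # sign look-up: left half-plane
    if y < 0:
        a = -a                         # sign look-up: lower half-plane
    return a
```

For plain atan of a ratio already in [0, 1], only the table and correction lines are needed, which matches the remark about skipping the divider and sign look-up.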
> Yes I think you're right it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), a BRAM + Taylor interpolation.
>
> For atan, you'd simply skip the divider and sign look-up part.
>
> On the other hand, Xilinx and Altera offer plug-and-play Cordic cores for atan/atan2.

It's probably better to multiply y/x by x to get a normalized ratio rather than use a divider, which requires a lot of resources.

I recall that I once had to implement the atan2 function for a QAM carrier/symbol recovery circuit. I didn't need great precision, so I split one quadrant into a grid and put the angle of the center of each grid square into a 2-dimensional lookup ROM. Then I could put X,Y into the ROM and get the coarse angle (which was then adjusted for the quadrant) and could use that for carrier recovery.
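The grid-ROM approach described above might look roughly like this. The 16x16 grid, the 8-bit input magnitudes, and the names `ROM` and `coarse_atan2` are assumptions for the sake of a runnable example; the post does not give the actual sizes.

```python
import math

GRID_BITS = 4                      # assumed: 16 x 16 cells covering one quadrant
G = 1 << GRID_BITS

# ROM[iy][ix] holds the angle of the center of grid cell (ix, iy) in the first quadrant
ROM = [[math.atan2(iy + 0.5, ix + 0.5) for ix in range(G)] for iy in range(G)]

def coarse_atan2(x, y, in_bits=8):
    """Coarse angle for signed integers x, y with |x|, |y| < 2**in_bits."""
    ax, ay = abs(x), abs(y)
    ix = ax >> (in_bits - GRID_BITS)   # top bits of |x| select the column
    iy = ay >> (in_bits - GRID_BITS)   # top bits of |y| select the row
    a = ROM[iy][ix]                    # first-quadrant coarse angle
    if x < 0:
        a = math.pi - a                # adjust for the left half-plane
    if y < 0:
        a = -a                         # adjust for the lower half-plane
    return a
```

The angular resolution is poor for points near the origin, where one cell spans a wide range of angles, and improves toward the edge of the grid. That fits a use where the signal level is roughly controlled and only a coarse angle is needed.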
On Tuesday, 16 October 2018 22:36:56 UTC-4, Kevin Neilson wrote:
> > Yes I think you're right it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), a BRAM + Taylor interpolation.
> >
> > For atan, you'd simply skip the divider and sign look-up part.
> >
> > On the other hand, Xilinx and Altera offer plug-and-play Cordic cores for atan/atan2.
>
> It's probably better to multiply y/x by x to get a normalized ratio rather than use a divider, which requires a lot of resources.
>
> I recalled that I had to implement the atan2 function once for a QAM carrier/symbol recovery circuit. I didn't need great precision, so I split one quadrant into a grid and put the angle of the center of each grid square into a 2-dimensional lookup ROM. Then I could put X,Y into the ROM and get the coarse angle (which was then adjusted for the quadrant) and could use that for carrier recovery.

One case that comes to mind: you have 2 quadrature signals

x = A(t) * cos(wt + some phase) + noise_x
y = A(t) * sin(wt + some phase) + noise_y

Atan2(y, x) = wt + some phase. The variations of A(t) (as long as they are slow-ish) will cancel each other out.

You can filter or average "wt + some phase" to extract the phase. Or differentiate "wt + some phase", then filter, to get the filtered frequency.

So multiplying "y/x by x" would not make much sense in this case.
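A small noise-free illustration of the quadrature case above: atan2 recovers "wt + some phase" regardless of the amplitude, and differencing the unwrapped phase gives the frequency. The test signal (w = 0.05 rad/sample, phase 0.3 rad, a slowly varying amplitude) and the helper name `phase_track` are made up for this sketch; the noise terms and the filtering step are left out.

```python
import math

def phase_track(xs, ys):
    """Return the unwrapped phase atan2(y, x) for paired quadrature samples."""
    out = []
    prev = None
    offset = 0.0
    for x, y in zip(xs, ys):
        p = math.atan2(y, x)
        if prev is not None:
            step = p - prev
            if step > math.pi:         # wrapped from -pi side to +pi side
                offset -= 2 * math.pi
            elif step < -math.pi:      # wrapped from +pi side to -pi side
                offset += 2 * math.pi
        prev = p
        out.append(p + offset)
    return out

# Hypothetical test signal with slowly varying amplitude A(t)
w, phi, n = 0.05, 0.3, 400
amp = [1.0 + 0.5 * math.sin(0.002 * t) for t in range(n)]
xs = [a * math.cos(w * t + phi) for t, a in enumerate(amp)]
ys = [a * math.sin(w * t + phi) for t, a in enumerate(amp)]

phase = phase_track(xs, ys)
# First difference of the phase is the per-sample frequency estimate
freq = [b - a for a, b in zip(phase, phase[1:])]
est_w = sum(freq) / len(freq)          # plain average standing in for the filter
```

With noise added, the average (or a proper low-pass filter) is what pulls the frequency estimate back toward w.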
On Wednesday, October 17, 2018 at 9:26:38 AM UTC-6, Benjamin Couillard wrote:
> On Tuesday, 16 October 2018 22:36:56 UTC-4, Kevin Neilson wrote:
> > > Yes I think you're right it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), a BRAM + Taylor interpolation.
> > >
> > > For atan, you'd simply skip the divider and sign look-up part.
> > >
> > > On the other hand, Xilinx and Altera offer plug-and-play Cordic cores for atan/atan2.
> >
> > It's probably better to multiply y/x by x to get a normalized ratio rather than use a divider, which requires a lot of resources.
> >
> > I recalled that I had to implement the atan2 function once for a QAM carrier/symbol recovery circuit. I didn't need great precision, so I split one quadrant into a grid and put the angle of the center of each grid square into a 2-dimensional lookup ROM. Then I could put X,Y into the ROM and get the coarse angle (which was then adjusted for the quadrant) and could use that for carrier recovery.
>
> One case that comes to mind: you have 2 quadrature signals
>
> x = A(t) * cos(wt + some phase) + noise_x
> y = A(t) * sin(wt + some phase) + noise_y
>
> Atan2(y, x) = wt + some phase. The variations of A(t) (as long as they are slow-ish) will cancel each other out.
>
> You can filter or average "wt + some phase" to extract the phase. Or differentiate "wt + some phase", then filter, to get the filtered frequency.
>
> So multiplying "y/x by x" would not make much sense in this case.
No, multiplying by x doesn't make sense. Perhaps using a ROM for 1/x and a multiplier would be better than a full divider.
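One way the reciprocal-ROM suggestion could be sketched, assuming an 8-bit magnitude for x and 16 fractional bits per stored reciprocal. These widths and the names `RECIP_ROM` and `ratio_q16` are illustrative assumptions. For wider inputs some normalization of x would be needed first, which is not shown here.

```python
X_BITS = 8                     # assumed width of |x|
FRAC_BITS = 16                 # assumed fractional bits stored per reciprocal

# ROM contents: round(2^16 / x) for x = 1 .. 255; address 0 is unused
RECIP_ROM = [0] + [round((1 << FRAC_BITS) / x) for x in range(1, 1 << X_BITS)]

def ratio_q16(y, x):
    """Approximate y / x as a Q16 fixed-point integer for 0 <= y <= x < 2**X_BITS.

    One ROM read and one multiply replace the divider. Because the stored
    reciprocal is rounded, the result can be off by a few LSBs and may land
    slightly above 1 << FRAC_BITS when y == x.
    """
    if not 0 < x < (1 << X_BITS):
        raise ValueError("x out of range for the reciprocal ROM")
    if not 0 <= y <= x:
        raise ValueError("expected 0 <= y <= x")
    return y * RECIP_ROM[x]
```

The rounding error in each stored reciprocal is at most half an LSB, so the ratio error grows with y and stays below about 0.002 for these assumed widths.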