On Wednesday, 5 September 2018 23:43:28 UTC+12, Gene Filatov wrote:
> I don't know the literature well, but I think it would be cool if you actually write an article detailing your approach!
>
> Gene

I'll work on writing one up over the next few days, as well as posting a sample implementation.

Mike
What to do with an improved algorithm?
Started by ●September 3, 2018
Reply by ●September 10, 2018
Reply by ●October 10, 2018
> What do you think?

That's good stuff. I wonder why you think using a BRAM is bad? It's good to use the hard cores in an FPGA instead of the fabric--it's less power and deterministic routing times.

Is the CORDIC still advantageous in a modern FPGA? The last time I needed to find sine, I used a coarse BRAM lookup that output sine on one port and cos on another. Those were used as derivatives for a 2nd-order Taylor. Two multipliers (more hard cores) are used (using Horner's Rule) for the 1st and 2nd-order interpolations. I don't remember how many digits of accuracy this yields, but the latency is low.
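A minimal sketch of the coarse-lookup-plus-Taylor idea described above, written in floating-point Python rather than HDL. The table size (`TABLE_BITS`), the names, and the use of floats instead of fixed point are all assumptions made for illustration; the post does not give those details. The sine and cosine read from the two table ports serve as the function value and its first derivative, and the second derivative of sine is just the negated sine, so no third table is needed.

```python
import math

TABLE_BITS = 8                       # assumed: 256 coarse entries per full turn
N = 1 << TABLE_BITS
STEP = 2.0 * math.pi / N

# Stand-ins for the two BRAM ports: one returns sine, the other cosine.
SIN_ROM = [math.sin(k * STEP) for k in range(N)]
COS_ROM = [math.cos(k * STEP) for k in range(N)]

def sin_taylor2(theta):
    """Approximate sin(theta) with a coarse lookup and a 2nd-order Taylor term."""
    k = int(round(theta / STEP))     # nearest coarse table index
    d = theta - k * STEP             # residual angle, about STEP/2 at most
    s = SIN_ROM[k % N]
    c = COS_ROM[k % N]
    # sin(a + d) ~= s + d*c - (d*d/2)*s, evaluated with Horner's rule
    # so that only two multiplications by d are needed.
    return s + d * (c - 0.5 * d * s)
```

With these assumed parameters the truncation error is bounded by roughly `d**3 / 6`, so a larger table or a third-order term would be the knobs to turn for more accuracy.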
Reply by ●October 11, 2018
On Wednesday, 10 October 2018 21:52:06 UTC-4, Kevin Neilson wrote:
> > What do you think?
>
> That's good stuff. I wonder why you think using a BRAM is bad? It's good to use the hard cores in an FPGA instead of the fabric--it's less power and deterministic routing times.
>
> Is the CORDIC still advantageous in a modern FPGA? The last time I needed to find sine, I used a coarse BRAM lookup that output sine on one port and cos on another. Those were used as derivatives for a 2nd-order Taylor. Two multipliers (more hard cores) are used (using Horner's Rule) for the 1st and 2nd-order interpolations. I don't remember how many digits of accuracy this yields, but the latency is low.

Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).
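To put numbers on the table size mentioned above, here is a small arithmetic sketch. The 16-bit output word (`WORD_BYTES`) is an assumption, and the factor of 16 is the poster's rough estimate rather than a derived figure.

```python
IN_BITS = 16            # width of each atan2 input, taken from the post
WORD_BYTES = 2          # assumed: one 16-bit angle stored per address
SYMMETRY_FACTOR = 16    # the poster's rough estimate of the symmetry savings

# A direct lookup is addressed by both inputs concatenated.
full_addresses = 1 << (2 * IN_BITS)
folded_addresses = full_addresses // SYMMETRY_FACTOR

def table_bytes(addresses, word_bytes=WORD_BYTES):
    """Storage needed for a table with the given number of addresses."""
    return addresses * word_bytes
```

Under these assumptions the full table needs 8 GiB and the folded one 512 MiB, both far beyond on-chip block RAM.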
Reply by ●October 11, 2018
> Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).

I think you missed the part about the Taylor interpolation.
Reply by ●October 12, 2018
On Thursday, October 11, 2018 at 7:37:37 AM UTC-6, Benjamin Couillard wrote:
> Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).

Sorry; I didn't see that you were talking about arctan. It's not quite as easy as sin/cos, but there is still the question as to whether a Farrow-type architecture using a coarse lookup and Taylor interpolations would be better than a CORDIC, and I am guessing that with the BRAMs and multipliers in present-day FPGAs, the answer would be yes.
Reply by ●October 12, 2018
On Friday, 12 October 2018 00:42:18 UTC-4, Kevin Neilson wrote:
> On Thursday, October 11, 2018 at 7:37:37 AM UTC-6, Benjamin Couillard wrote:
> > Cordic is useful to compute high-precision atan2. Otherwise for 2 16-bit inputs, you'd need a ram with 2^32 addresses (maybe 16 times less if you take advantage of the symmetries).
>
> Sorry; I didn't see that you were talking about arctan. It's not quite as easy as sin/cos, but there is still the question as to whether a Farrow-type architecture using a coarse lookup and Taylor interpolations would be better than a CORDIC, and I am guessing that with the BRAMs and multipliers in present-day FPGAs, the answer would be yes.

Yes, I think you're right, it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), and a BRAM + Taylor interpolation.

For atan, you'd simply skip the divider and sign look-up part.

On the other hand, Xilinx and Altera offer plug-and-play Cordic cores for atan/atan2.
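The divider, sign look-up, and table-plus-Taylor stages described above can be sketched as follows. This is a floating-point Python illustration under stated assumptions, not a hardware design: the table size (`TABLE_BITS`), the three table names, and in particular the octant swap (dividing the smaller magnitude by the larger so the ratio stays in [0, 1]) are additions of this sketch and are not in the post.

```python
import math

TABLE_BITS = 8                        # assumed: 257 entries covering r in [0, 1]
N = 1 << TABLE_BITS

# Stand-ins for BRAM contents: atan(r), its first derivative 1/(1+r^2),
# and half its second derivative, -r/(1+r^2)^2, at each coarse point.
ATAN_ROM = [math.atan(k / N) for k in range(N + 1)]
D1_ROM = [1.0 / (1.0 + (k / N) ** 2) for k in range(N + 1)]
D2_ROM = [-(k / N) / (1.0 + (k / N) ** 2) ** 2 for k in range(N + 1)]

def atan_unit(r):
    """atan(r) for 0 <= r <= 1: table value plus a 2nd-order Taylor correction."""
    k = int(round(r * N))             # nearest coarse table index
    d = r - k / N                     # residual, about 1/(2N) at most
    return ATAN_ROM[k] + d * (D1_ROM[k] + d * D2_ROM[k])   # Horner form

def atan2_lut(y, x):
    """atan2(y, x) in radians, built from a divide, a table, and sign logic."""
    if x == 0 and y == 0:
        return 0.0
    ax, ay = abs(x), abs(y)
    # Octant swap (an assumption of this sketch): keep the ratio within [0, 1].
    if ay <= ax:
        a = atan_unit(ay / ax)
    else:
        a = math.pi / 2 - atan_unit(ax / ay)
    # Sign look-up: map the first-quadrant angle to the full circle.
    if x < 0:
        a = math.pi - a
    return -a if y < 0 else a
```

For plain atan of an argument already in [0, 1], `atan_unit` alone would be enough, which matches the remark that the divider and sign look-up can be skipped.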
Reply by ●October 16, 2018
> Yes I think you're right it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), a BRAM + Taylor interpolation.
>
> For atan, you'd simply skip the divider and sign look-up part.
>
> On the other hand, Xilinx and Altera offer plug-and-play Cordic cores for atan/atan2.

It's probably better to multiply y/x by x to get a normalized ratio rather than use a divider, which requires a lot of resources.

I recalled that I had to implement the atan2 function once for a QAM carrier/symbol recovery circuit. I didn't need great precision, so I split one quadrant into a grid and put the angle of the center of each grid square into a 2-dimensional lookup ROM. Then I could put X,Y into the ROM and get the coarse angle (which was then adjusted for the quadrant) and could use that for carrier recovery.
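The grid-ROM approach described above might look like the following sketch. The grid resolution (`GRID_BITS`), the input width (`IN_BITS`), and the function name are hypothetical; the post does not say what sizes were used.

```python
import math

IN_BITS = 8                 # assumed: |x| and |y| are below 2**IN_BITS
GRID_BITS = 4               # assumed: 16 x 16 grid squares in one quadrant
G = 1 << GRID_BITS
SHIFT = IN_BITS - GRID_BITS

# 2-D ROM holding the angle of the center of each first-quadrant grid square.
ROM = [[math.atan2(iy + 0.5, ix + 0.5) for ix in range(G)] for iy in range(G)]

def coarse_atan2(y, x):
    """Coarse atan2 for signed integers with |x|, |y| < 2**IN_BITS."""
    ix = abs(x) >> SHIFT    # top bits of each magnitude address the ROM
    iy = abs(y) >> SHIFT
    a = ROM[iy][ix]         # coarse first-quadrant angle
    # Quadrant adjustment from the input signs.
    if x < 0:
        a = math.pi - a
    return -a if y < 0 else a
```

The angular error grows as the point approaches the origin, because a grid square then spans a wide range of angles, so a sketch like this only suits inputs of reasonable magnitude.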
Reply by ●October 17, 2018
On Tuesday, 16 October 2018 22:36:56 UTC-4, Kevin Neilson wrote:
> > Yes I think you're right it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), a BRAM + Taylor interpolation.
> >
> > For atan, you'd simply skip the divider and sign look-up part.
> >
> > On the other hand, Xilinx and Altera offer plug-and-play Cordic cores for atan/atan2.
>
> It's probably better to multiply y/x by x to get a normalized ratio rather than use a divider, which requires a lot of resources.
>
> I recalled that I had to implement the atan2 function once for a QAM carrier/symbol recovery circuit. I didn't need great precision, so I split one quadrant into a grid and put the angle of the center of each grid square into a 2-dimensional lookup ROM. Then I could put X,Y into the ROM and get the coarse angle (which was then adjusted for the quadrant) and could use that for carrier recovery.

One case that comes to mind: you have 2 quadrature signals

x = A(t) * cos(wt + some phase) + noise_x
y = A(t) * sin(wt + some phase) + noise_y

Ignoring the noise, atan2(y, x) = wt + some phase. The variations of A(t) (as long as they are slow-ish) will cancel each other out.

You can filter or average "wt + some phase" to extract the phase, or differentiate "wt + some phase" and then filter to get the filtered frequency.

So multiplying "y/x by x" would not make much sense in this case.
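A small sketch of the quadrature case above, showing that the recovered phase does not depend on A(t). The signal parameters (`W`, `PHASE`, the shape of `amplitude`) are made up for illustration, the noise terms are left out, and the phase-unwrapping step is an addition of this sketch, needed because atan2 only returns angles in (-pi, pi].

```python
import math

W = 0.05        # assumed angular frequency, radians per sample
PHASE = 0.3     # assumed constant phase offset, radians

def amplitude(n):
    """Hypothetical slowly varying, strictly positive envelope A(t)."""
    return 1.0 + 0.5 * math.sin(0.01 * n)

def unwrap(phases):
    """Remove the 2*pi jumps from a sequence of wrapped phases."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        d -= 2.0 * math.pi * round(d / (2.0 * math.pi))
        out.append(out[-1] + d)
    return out

# Noise-free quadrature pair; A(t) scales both components equally.
raw = [math.atan2(amplitude(n) * math.sin(W * n + PHASE),
                  amplitude(n) * math.cos(W * n + PHASE))
       for n in range(1000)]

phase = unwrap(raw)                              # recovers W*n + PHASE
freq = [b - a for a, b in zip(phase, phase[1:])] # sample-to-sample difference
freq_est = sum(freq) / len(freq)                 # averaging as a crude filter
```

Here `freq_est` comes out equal to `W` and `phase[0]` equal to `PHASE` up to rounding, regardless of the envelope. With noise added, the averaging step would be replaced by whatever filter the application calls for.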
Reply by ●October 18, 2018
On Wednesday, October 17, 2018 at 9:26:38 AM UTC-6, Benjamin Couillard wrote:
> On Tuesday, 16 October 2018 22:36:56 UTC-4, Kevin Neilson wrote:
> > > Yes I think you're right it could work for an atan or atan2. You could implement a divider for atan2 (y/x), a sign look-up (to get the whole 360 degrees), a BRAM + Taylor interpolation.
> > >
> > > For atan, you'd simply skip the divider and sign look-up part.
> > >
> > > On the other hand, Xilinx and Altera offer plug-and-play Cordic cores for atan/atan2.
> >
> > It's probably better to multiply y/x by x to get a normalized ratio rather than use a divider, which requires a lot of resources.
> >
> > I recalled that I had to implement the atan2 function once for a QAM carrier/symbol recovery circuit. I didn't need great precision, so I split one quadrant into a grid and put the angle of the center of each grid square into a 2-dimensional lookup ROM. Then I could put X,Y into the ROM and get the coarse angle (which was then adjusted for the quadrant) and could use that for carrier recovery.
>
> One case that comes to mind: you have 2 quadrature signals
>
> x = A(t) * cos(wt + some phase) + noise_x
> y = A(t) * sin(wt + some phase) + noise_y
>
> Atan2(y, x) = wt + some phase. The variations of A(t) (as long as they are slow-ish) will cancel each other out.
>
> You can filter or average wt + some phase to extract the phase. Or differentiate "wt + some phase" then filter to get the filtered frequency.
>
> So multiplying "y/x by x" would not make much sense in this case.

No, multiplying by x doesn't make sense. Perhaps using a ROM for 1/x and a multiplier would be better than a full divider.
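The reciprocal-ROM idea in the last sentence can be sketched like this. The bit widths (`IN_BITS`, `FRAC_BITS`) and names are assumptions chosen for illustration; a real design would size the ROM and the fixed-point format to the precision it needs.

```python
IN_BITS = 8         # assumed: x is an unsigned magnitude below 2**IN_BITS
FRAC_BITS = 16      # assumed: fraction bits kept in each stored reciprocal

# ROM of 1/x in fixed point; entry 0 is a placeholder since 1/0 is undefined.
RECIP_ROM = [0] + [round((1 << FRAC_BITS) / v) for v in range(1, 1 << IN_BITS)]

def ratio_fixed(y, x):
    """Return y/x scaled by 2**FRAC_BITS using one lookup and one multiply."""
    return y * RECIP_ROM[x]

def ratio_float(y, x):
    """Same ratio converted back to a float, for checking the sketch."""
    return ratio_fixed(y, x) / (1 << FRAC_BITS)
```

Each stored reciprocal is off by at most half an LSB, so the error in the product grows with `y`. That rounding error is the price of replacing the divider with a table and a hard multiplier.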