FPGARelated.com
Forums

Downsizing Verilog synthesization.

Started by eromlignod August 6, 2008
On Aug 6, 12:31 pm, eromlignod <eromlig...@aol.com> wrote:
> On Aug 6, 11:50 am, John McCaskill <jhmccask...@gmail.com> wrote:
>
> > If you can map this onto a block ram, you will save quite a bit of
> > registers. Whether or not you can do this depends on if you can write
> > the vectors in one (or a few) at a time, and process them sequentially
> > in the time you have available. How much time do you have to process
> > the vectors? Ns, us, ms ?
>
> Ah, I think I'm following along now. Are you talking about sending
> the numbers over a single 8-bit vector wire one-at-a-time? Hmmm.
>
> The vectors are actually independent from each other and refresh at
> various random rates, so a few usec here or there shouldn't make a
> difference. I'll give it a try!
>
> Don
I changed the individual vectors to a single vector and now my slice count is down to 80% with 0% unrelated logic. Thanks! Don
Jim Granville wrote:
(snip)

> Read what I wrote carefully. "Read up on reciprocal Frequency counters"
> A Reciprocal Frequency counter is not an ordinary frequency counter :)
> It counts in two domains: Whole cycles and time.
>
> Your values above are a dT of 17.05us.
> Suppose you measure 3 whole cycles, giving ~9 readings a second
> then a (relatively low) 1MHz reciprocal Counter will give a 63 count
> difference on your 15.9mHz dF example. - so is about 6 bits more
> precise than you need.
I think that is what he is doing, but averaging 32 counts. But averaging 32 counts is the same as counting 32 times as long, which might take less logic. Putting analog PLL frequency multipliers on each string would be interesting, but take a lot of extra circuitry. I don't know DLL enough to say, but it might work. I would read up on DLL (that is, the digital version of the analog PLL). -- glen
Jeff Cunningham wrote:
(snip)

>> The device is a self-tuning piano. You can read/listen about it
>> here...
>>
>> New York Times:
>> http://query.nytimes.com/gst/fullpage.html?res=9800E1D8133FF931A35752C0A9659C8B63
>>
>> NPR:
>> http://www.npr.org/templates/story/story.php?storyId=878091
>>
>> New Scientist Magazine:
>> http://www.newscientist.com/article/dn3143-hotwired-piano-tunes-itself.html
(snip)
>> I first convert the wave to a "period" wave that has an "on" time
>> equal to one period of the string's vibration. I then use this wave
>> to enable counting of the 50-MHz system clock. So I get a count of
>> how many clock ticks of the system clock occur for one period of
>> string vibration. This takes up to 21 bits for the low strings. I
>> average 32 of these numbers and calculate an error based on a stored
>> setpoint. Currently I'm using a theoretical setpoint, but eventually
>> I will want to add the feature whereby a piano tech can hand-tune the
>> piano and then "store" his tuning numbers for subsequent use.
Instead of averaging 32 values, why not count for 32 cycles? Probably more cycles for higher notes, and fewer for lower notes. (snip)
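To put rough numbers on the two approaches, here is a minimal Python sketch of the counter widths involved. It assumes the 50 MHz clock from the quoted post and standard A4 = 440 Hz tuning; the note frequencies and helper names are illustrative and not taken from the OP's design.

```python
CLOCK_HZ = 50_000_000  # 50 MHz system clock, as stated in the quoted post


def counts_per_period(note_hz, cycles=1):
    """Clock ticks counted while `cycles` whole string periods elapse."""
    return int(CLOCK_HZ * cycles / note_hz)


def counter_bits(count):
    """Width of a binary counter able to hold `count`."""
    return count.bit_length()


A0_HZ = 27.5                     # lowest piano note (assumes A4 = 440 Hz)
C8_HZ = 440.0 * 2 ** (39 / 12)   # highest piano note, about 4186 Hz

for name, hz in (("A0", A0_HZ), ("C8", C8_HZ)):
    one = counts_per_period(hz)
    many = counts_per_period(hz, cycles=32)
    print(name, one, counter_bits(one), many, counter_bits(many))
```

For A0 a single period needs 21 bits, which agrees with the figure in the quoted post, and counting straight through 32 periods needs 26 bits. Under these assumptions the longer count costs five extra counter bits but no separate accumulator for the average.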
> Are you concerned about the 600W of heat drying out the piano wood? I
> know some pianos have humidifiers inside them so it seems like a
> potential issue.
It seems that the heat is only when it is in use, and maybe not much different from the sunlight that many pianos will receive. The mechanical tuning has to be close enough that each string can be tuned at below 95F, and equal to or above the ambient temperature. With reduced tension when it is not being used, it might stay "in tune" longer. Then again, I don't know how much string stretching increases with temperature. I would say that magnetic coupling to the strings was pretty obvious, but temperature controlled tuning is not. My thought was stepper motors on each tuning peg. (Small and geared down a lot.) -- glen
glen herrmannsfeldt wrote:
> Jim Granville wrote:
> (snip)
>
>> Read what I wrote carefully. "Read up on reciprocal Frequency counters"
>> A Reciprocal Frequency counter is not an ordinary frequency counter :)
>> It counts in two domains: Whole cycles and time.
>
>> Your values above are a dT of 17.05us.
>> Suppose you measure 3 whole cycles, giving ~9 readings a second
>> then a (relatively low) 1MHz reciprocal Counter will give a 63 count
>> difference on your 15.9mHz dF example. - so is about 6 bits more
>> precise than you need.
>
> I think that is what he is doing, but averaging 32 counts.
> But averaging 32 counts is the same as counting 32 times as long,
> which might take less logic.
>
> Putting analog PLL frequency multipliers on each string would be
> interesting, but take a lot of extra circuitry. I don't know
> DLL enough to say, but it might work. I would read up on
> DLL (that is, the digital version of the analog PLL).
Why add more jitter/noise?

In my example, a low 1 MHz timebase reciprocal counter is 63 TIMES more accurate than the OP's 15.9 mHz spec, so it has 6 extra bits of precision. (500 kHz would be my trade-off target.)

On a more practical note, I can't see a stored-cal approach working very well (too much thermal drifting about). This has to be able to work closer to real time, or at least have a mode that allows 'verify during play', so the design target should be reading ALL channels.

Block RAM arrays would be one way to pack this into a smaller FPGA, and just clone the Array-Scan as needed. E.g. if you have the time-budget to scan 50 into one block RAM, then 3 repeats supports 150 channels... 18-bit counters are good, as the reciprocal Cycle Ctr 'scales' these to similar times (so precision is largely independent of frequency).

The dual-port access to Block RAMs would allow the Cycle Ctr and TimeCtr to share a block. A typical block is 1K x 18, so 50 T_Ctrs, 50 captures, and 50 C_Ctrs + Preloads is only 200 RAM lines. Must be something to do with the spare space? ;)

Could do a reading FIFO, but the capture rate is similar on all channels with a reciprocal counter, and ~10/sec, even over 120 channels, is only 1200 Ch/s reading rate: this could stream the numbers over an SPI or even an RS232 link to a PC.

Should fit in the smallest FPGA? Did the OP mention which FPGA he uses now?

-jg
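The 63-count figure can be checked with a short Python sketch of the reciprocal-counter arithmetic. It assumes the note in question is A0 at 27.5 Hz (one cent at that pitch is about 15.9 mHz, which matches the numbers in the post); the function and constant names are illustrative only.

```python
def reciprocal_frequency(cycle_count, time_count, timebase_hz):
    """Frequency from a reciprocal counter: whole cycles divided by elapsed time."""
    return cycle_count * timebase_hz / time_count


def time_count(signal_hz, cycles, timebase_hz):
    """Ideal (unquantized) timebase ticks during `cycles` whole signal cycles."""
    return cycles * timebase_hz / signal_hz


TIMEBASE_HZ = 1_000_000          # the "relatively low" 1 MHz timebase
CYCLES = 3                       # whole cycles per reading
A0_HZ = 27.5                     # assumed test note: lowest piano string
CENT = 2 ** (1 / 1200)           # 1/100 of a semitone as a frequency ratio

in_tune = time_count(A0_HZ, CYCLES, TIMEBASE_HZ)
sharp = time_count(A0_HZ * CENT, CYCLES, TIMEBASE_HZ)

delta_f_mhz = A0_HZ * (CENT - 1) * 1000   # frequency step in millihertz
count_difference = in_tune - sharp        # ticks separating the two pitches
readings_per_second = A0_HZ / CYCLES

print(f"dF = {delta_f_mhz:.1f} mHz")
print(f"count difference = {count_difference:.1f}")
print(f"readings per second = {readings_per_second:.1f}")
```

Under these assumptions a one-cent shift moves the 3-cycle time count by roughly 63 ticks at about 9 readings per second, and log2(63) is just under 6, which is where the "6 extra bits" comes from.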
Jim Granville wrote:

> Block RAM arrays would be one way to pack this into a smaller
> FPGA, and just clone the Array-Scan as needed.
> eg if you have the time-budget to scan 50 into one block RAM,
> then 3 repeats supports 150 channels... 18 bit counters
> are good, as the reciprocal Cycle Ctr 'scales' these to
> similar times.
> (so precision is largely independent of frequency)
>
> The Dual-port access to Block Rams, would allow
> the Cycle Ctr, and TimeCtr to share a block,
> typical block is 1K x 18, so 50 T_Ctrs, 50 captures,
> and 50 C_Ctrs + Preloads, is only 200 RAM-lines.
I see a good, recent Xilinx WP here:
"Creative Uses of Block RAM" by Peter Alfke
http://www.xilinx.com/support/documentation/white_papers/wp335.pdf

So Peter may be able to comment on how many counters you could practically 'pack into' Xilinx BlockRAM on a 1-2 us frame time? [Though even the smallest Xilinx device has 4 block RAMs.]

The OP may find pin count dictates the FPGA, if he wants to do a single-package parallel wiring design. Practical wiring decisions might point to a fast serial backplane to collect the signals? Something like a clone of the 4-bit SPI, done in 4-5 very bottom-end CPLDs, would feed 24-26-30 bits into practical 'ribbon-cable' widths and speeds. That would avoid paying for a bigger FPGA just to get the pins...

-jg
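As a rough illustration of the array-scan idea, the following Python sketch is a behavioral model (not HDL, and not taken from WP335) of several period counters kept in one shared memory and updated one channel at a time per frame. The class, the square-wave test input, and the periods are all hypothetical. If the scan visited one channel per clock, which is an assumption, a 1 us frame at 50 MHz would leave room for about 50 channels.

```python
class CounterBank:
    """Behavioral model of many period counters sharing one RAM (not HDL)."""

    def __init__(self, channels):
        self.time_ctr = [0] * channels      # running frame count per channel
        self.capture = [None] * channels    # last completed period, in frames
        self.last_level = [0] * channels    # previous input sample per channel

    def scan(self, inputs):
        """One frame: visit every channel once, as a sequential RAM scan would."""
        for ch, level in enumerate(inputs):
            self.time_ctr[ch] += 1
            if level and not self.last_level[ch]:   # rising edge ends a period
                self.capture[ch] = self.time_ctr[ch]
                self.time_ctr[ch] = 0
            self.last_level[ch] = level


def square_wave(frame, period_frames):
    """Hypothetical test input: square wave with its period given in frames."""
    return 1 if (frame % period_frames) < period_frames // 2 else 0


bank = CounterBank(channels=3)
periods = [10, 16, 25]                       # made-up periods, in frames
for frame in range(200):
    bank.scan([square_wave(frame, p) for p in periods])
print(bank.capture)
```

Each capture register holds the most recent whole period in units of the frame time, so the resolution of this scheme is one frame rather than one system clock. The very first capture on each channel is a partial period and is overwritten by the next edge.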
On Aug 8, 9:25 pm, Jim Granville <no.s...@designtools.maps.co.nz> wrote:
> Jim Granville wrote:
> > Block RAM arrays would be one way to pack this into a smaller
> > FPGA, and just clone the Array-Scan as needed.
> > eg if you have the time-budget to scan 50 into one block RAM,
> > then 3 repeats supports 150 channels... 18 bit counters
> > are good, as the reciprocal Cycle Ctr 'scales' these to
> > similar times.
> > (so precision is largely independent of frequency)
>
> > The Dual-port access to Block Rams, would allow
> > the Cycle Ctr, and TimeCtr to share a block,
> > typical block is 1K x 18, so 50 T_Ctrs, 50 captures,
> > and 50 C_Ctrs + Preloads, is only 200 RAM-lines.
>
> I see a good, recent Xilinx WP here
> "Creative Uses of Block RAM" By: Peter Alfke
>
> http://www.xilinx.com/support/documentation/white_papers/wp335.pdf
>
> So Peter may be able to comment on how many Counters, you
> could practically 'pack into' Xilinx BlockRam, on a 1-2us Frame time ?
> [Tho even the smallest Xilinx device has 4 block rams..]
>
> The OP may find pin-count dictates the fpga, if he wants to do a
> single-package parallel wiring design.
> Practical wiring decisions might point to a fast serial backplane
> to collect the signals ?
> Something like a clone of the 4 bit SPI, done in 4-5 very bottom-end
> cpld's, would feed 24-26-30 bits into practical 'ribbon-cable'
> widths and speeds. That would avoid paying for a bigger fpga, just
> to get the pins....
>
> -jg
It's a Spartan-3 and I picked it because of my large number of inputs. Connecting an input from each of the 88 notes lets me avoid an elaborate tree of external multiplexers. I only sustain 44 of the strings at a time (every other note) because otherwise I get interference from adjacent driving magnets.

The system does not run all the time. It only actively adjusts pitches for the first couple of minutes after you turn it on. Once the strings stabilize at their in-tune pitch, all of the PWM duty cycles are frozen and maintained from that point on and the magnetic sustaining ceases. When the musician is done playing, he switches it off. The next time it is turned on, you get a new tuning for that day's conditions. This also keeps the system simple, since there is only one button.

Don
Kansas City
eromlignod wrote:
> It's a Spartan-3 and I picked it because of my large number of
> inputs. Connecting an input from each of the 88 notes lets me avoid
> an elaborate tree of external multiplexers. I only sustain 44 of the
> strings at a time (every other note) because otherwise I get
> interference from adjacent driving magnets.
>
> The system does not run all the time. It only actively adjusts
> pitches for the first couple of minutes after you turn it on. Once
> the strings stabilize at their in-tune pitch, all of the PWM duty
> cycles are frozen and maintained from that point on and the magnetic
> sustaining ceases. When the musician is done playing, he switches it
> off. The next time it is turned on, you get a new tuning for that
> day's conditions. This also keeps the system simple, since there is
> only one button.
>
> Don
> Kansas City
So the tuning mode results in a constant square wave for each of the 44 channels? I was wondering how the system would decide when the square wave was "stable enough" to have reliable measurements. The continuous feed simplifies many decisions.

The "Spartan-3" mention supplies a family, but the curiosity was more around the size of the part for the number of BlockRAMs and LUT resources (which is also different for S3, S3E, and S3A).

If you have a table full of count values, do you then just want to read these with an external processor? Do you want to have processing done on the signals inside the FPGA? Personally, I liked the earlier suggestion of defining the target period count and just measuring the delta value: the period error.

The approach I'd enjoy coding up would use the distributed SelectRAM memories as counters and use an "octave counter" configuration (my concept for the unique requirements of your project) which would count the C8 period every 2 clocks, C7 every 4, C6 every 8, all the way down to C1 and C2 each counted only every 128 clocks. All eight octaves count with the same percent resolution relative to the period and use significantly less logic than one counter per key. 6 channels of the octave counter then give you the 44 channels at a time with a 2:1 mux to select odd or even keys.

The octave counter only needs to be used for the 3 LSbits, with a single BlockRAM used to coordinate the counts and accumulate the 44 channels by adding the LSbits. Two BlockRAMs would make the whole configuration easier: a dual-port for the accumulator on one, and the full count (plus scaling?) into one port of the 2nd BlockRAM while the processor side has full access to the other port of the 2nd BlockRAM.

In the period counter arrangement, the largest error would be on the high frequency side, so a 1/100 semitone error on C8 gives a 138 ns error for a single cycle. The "every other clock" at 50 MHz for this highest octave easily gives a single-cycle measurement that would give the accuracy you need (80 ns error from both sides of a single count); at this point your system error is probably much more an issue of the noise (measurement jitter) in the square waves from the sensors. The octave counter approach gives you the same percentage error on all octaves for a single key.

As pointed out by someone earlier, counting multiple consecutive periods rather than averaging multiple counts gives you a much higher precision, spreading the 80 ns error I mentioned above over multiple periods rather than just one, for a per-period error of 80ns/N for an N period count.

Using BlockRAMs to hold the main part of the counters, you have 36 bits to play with for free. Since the octave counter scales by a factor of 2 for each octave, all the counts are the same relative size. If you wanted to pursue the error count rather than the raw count values for each period, you could even multiply the errors by a defined constant for each key to give you a scaled error that's appropriate to feed your system adjustments and have all your processing easily handled within the single FPGA.

This kind of stuff is just fun.
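The scaling argument for the octave counter can be checked numerically. The Python sketch below assumes a 50 MHz clock (20 ns per tick) and A4 = 440 Hz tuning, and uses only the C of each octave as a representative key; the helper names are illustrative and this is arithmetic only, not a model of the proposed logic.

```python
CLOCK_NS = 20.0                       # one tick of the 50 MHz system clock
C8_HZ = 440.0 * 2 ** (39 / 12)        # about 4186 Hz, assuming A4 = 440 Hz


def octave_divisor(octave):
    """Clocks between count steps: 2 for C8, 4 for C7, ... capped at 128."""
    return min(2 ** (9 - octave), 128)


def c_note_period_ns(octave):
    """Period of the C in the given octave (C1..C8), in nanoseconds."""
    return 1e9 / (C8_HZ / 2 ** (8 - octave))


def counts_per_period(octave):
    """Count steps accumulated over one period at that octave's count rate."""
    return c_note_period_ns(octave) / (CLOCK_NS * octave_divisor(octave))


def cent_error_ns(octave):
    """Period change caused by a 1/100 semitone (one cent) pitch rise."""
    return c_note_period_ns(octave) * (1 - 2 ** (-1 / 1200))


def per_period_error_ns(octave, n_periods):
    """Quantization error per period (both ends) when n_periods are counted in one go."""
    return 2 * CLOCK_NS * octave_divisor(octave) / n_periods


for octave in range(8, 0, -1):
    print(octave, octave_divisor(octave), round(counts_per_period(octave)))
```

With these assumptions the C of every octave from C2 to C8 lands on the same count per period, C1 gets twice that because it shares the 128-clock rate with C2, one cent at C8 is about 138 ns, and the 80 ns single-period error at C8 shrinks to 80/N ns when N periods are counted together.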
eromlignod wrote:
> It's a Spartan-3 and I picked it because of my large number of
> inputs. Connecting an input from each of the 88 notes lets me avoid
> an elaborate tree of external multiplexers.
Done right, the external fan-in design can save a lot of bulky wiring, and keeps the looming in simple ribbon cable.

We have done serial/parallel chainable fan-in/fan-out systems in designs, and they are very expandable and simple to loom up. They also have natural crosstalk advantages.

You will already have 'remote PCBs' doing the front-end signal conditioning?

How much you 'production engineer' this depends on how many of these you hope to make.
> I only sustain 44 of the
> strings at a time (every other note) because otherwise I get
> interference from adjacent driving magnets.
>
> The system does not run all the time. It only actively adjusts
> pitches for the first couple of minutes after you turn it on. Once
> the strings stabilize at their in-tune pitch, all of the PWM duty
> cycles are frozen and maintained from that point on and the magnetic
> sustaining ceases. When the musician is done playing, he switches it
> off. The next time it is turned on, you get a new tuning for that
> day's conditions. This also keeps the system simple, since there is
> only one button.
Hmm... has this been proven on a working unit, in the field?

Better would be a system that can _prove_ what you stated, by tracking the frequency even during a performance. It does not have to adjust, but knowing just how many ppm the thing has drifted would seem a strong selling point.

-jg
John_H wrote:

> The octave counter only needs to be used for the 3 LSbits and a single
> BlockRAM used to coordinate the counts and accumulate the 44 channels by
> adding the LSbits. Two BlockRAMs would make the whole configuration
> easier to allow a dual-port for the accumulator on one and the full
> count (plus scaling?) into one port of the 2nd BlockRAM while the
> processor side has full access to the other port of the 2nd BlockRAM.
With counters, normally you have a capture register, so that gives the processor something relatively static to read.
> In the period counter arrangement, the largest error would be on the
> high frequency side so a 1/100 semitone error on C8 gives a 138ns error
> for a single cycle. The "every other clock" at 50MHz for this highest
> octave easily gives a single-cycle measurement that would give the
> accuracy you need (80ns error from both sides of a single count);
Correct. What you describe, with variable capture cycles, is a reciprocal frequency counter.

There are three choices for the Cycles-to-time-over count determination in a reciprocal frequency counter:

a) It can be hard coded (which I think is what you describe).
b) It can be preloaded via SW, which allows elasticity in the gate times (one might try varying times to reduce beat effects).
c) It is auto-set by a desired gate time, and smarter logic takes the first whole cycle number larger than the gate time.

Option c) is the full benchtop reciprocal counter, and b) is the variant I'd suggest for this project. It lets SW have full control, without a recompile of the FPGA, but saves some logic and data bandwidth compared with c).
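A minimal Python sketch of how the cycle count relates to a target gate time may help here. Under option b) software would compute a number like this and preload it; under option c) the logic would arrive at it on the fly. The 1 MHz timebase, the 0.1 s gate, the note list, and the function names are assumptions chosen for illustration.

```python
import math


def cycles_for_gate(signal_hz, gate_s):
    """Smallest whole number of signal cycles lasting at least gate_s."""
    return max(1, math.ceil(gate_s * signal_hz))


def time_counts(signal_hz, cycles, timebase_hz):
    """Timebase ticks elapsed during `cycles` whole signal cycles (truncated)."""
    return int(cycles * timebase_hz / signal_hz)


TIMEBASE_HZ = 1_000_000   # assumed 1 MHz timebase
GATE_S = 0.1              # assumed target gate time, about 10 readings/s

for name, hz in (("A0", 27.5), ("C4", 261.626), ("C8", 4186.009)):
    n = cycles_for_gate(hz, GATE_S)
    print(name, n, time_counts(hz, n, TIMEBASE_HZ))
```

With these assumed values the time count stays close to 100,000 ticks for every note while the cycle count absorbs the pitch difference, so an 18-bit time counter would be wide enough across the keyboard.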
> As pointed out by someone earlier, counting multiple consecutive periods
> rather than averaging multiple counts gives you a much higher precision,
> spreading the 80ns error I mentioned above over multiple periods rather
> than just one for a per-period error of 80ns/N for an N period count.
> Using BlockRAMs to hold the main part of the counters, you have 36 bits
> to play with for free. Since the octave counter scales by a factor of 2
> for each octave, all the counts are the same relative size.
Yes, with a little design care, on a reciprocal counter you can extend the precision across multiple readings. The trick is to not lose time-counts or edge-counts. If your HE does that, then [Cycles/Time] can be added downstream for very high precisions.

This would enable research into the milli-kelvin thermal shifts going down, long before they became audible.

Perhaps a variant of this capture pathway could be a 'wide tuner' for general piano use (more sales?). It would slash the skill and time needed to manually tune a piano.
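One way to picture "not losing counts" is the Python sketch below. It assumes a free-running time counter that is sampled at each capture edge without being reset, so the fraction of a tick truncated from one reading reappears in the next. The signal frequency, timebase, and function names are illustrative assumptions.

```python
import math


def contiguous_readings(signal_hz, cycles, timebase_hz, n):
    """n back-to-back (cycles, ticks) readings from a free-running time counter."""
    edges = [math.floor(k * cycles * timebase_hz / signal_hz) for k in range(n + 1)]
    return [(cycles, later - earlier) for earlier, later in zip(edges, edges[1:])]


def combined_frequency(readings, timebase_hz):
    """Total cycles over total time, summed downstream across all readings."""
    total_cycles = sum(c for c, _ in readings)
    total_time = sum(t for _, t in readings)
    return total_cycles * timebase_hz / total_time


TIMEBASE_HZ = 1_000_000      # assumed 1 MHz timebase
SIGNAL_HZ = 27.5             # assumed string frequency (A0)

readings = contiguous_readings(SIGNAL_HZ, 3, TIMEBASE_HZ, 4)
single = combined_frequency(readings[:1], TIMEBASE_HZ)
combined = combined_frequency(readings, TIMEBASE_HZ)
print(readings)
print(abs(single - SIGNAL_HZ), abs(combined - SIGNAL_HZ))
```

Because the summed time is still off by less than one tick in total, the error of the combined estimate falls roughly in proportion to the number of readings added together.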
On Aug 9, 6:09 pm, Jim Granville <no.s...@designtools.maps.co.nz> wrote:
> eromlignod wrote:
> > It's a Spartan-3 and I picked it because of my large number of
> > inputs. Connecting an input from each of the 88 notes lets me avoid
> > an elaborate tree of external multiplexers.
>
> Done right, the external fan-in design can save a lot of bulky wiring,
> and keeps the looming in simple ribbon cable.
>
> We have done Serial/parallel chainable fan-in/fan-out systems
> in designs, and it is very expandable, and simple to loom-up.
> Also has natural cross talk advantages.
>
> You will already have 'remote PCBs' doing the front-end signal
> conditioning ?
>
> How much you 'production engineer' this, depends on how many of these
> you hope to make.
>
> > I only sustain 44 of the
> > strings at a time (every other note) because otherwise I get
> > interference from adjacent driving magnets.
>
> > The system does not run all the time. It only actively adjusts
> > pitches for the first couple of minutes after you turn it on. Once
> > the strings stabilize at their in-tune pitch, all of the PWM duty
> > cycles are frozen and maintained from that point on and the magnetic
> > sustaining ceases. When the musician is done playing, he switches it
> > off. The next time it is turned on, you get a new tuning for that
> > day's conditions. This also keeps the system simple, since there is
> > only one button.
>
> Hmm... has this been proven on a working unit, in the field ?
>
> Better would be a system that can _prove_ what you stated, by
> tracking the frequency even during a performance. It does not
> have to adjust, but knowing just how many ppm the thing
> has drifted, would seem a strong selling point
>
> -jg
This would be difficult since no song ever plays all 88 notes. In fact the top and bottom few notes may never get played (I am a pianist myself and have never seen music written for A0 or C8). And if a note is played too quickly, like with staccato, there may not be enough clean vibrations to get an average.

There is also the fact that a note's pitch changes slightly with volume. A louder note sounds a little sharp since there is more string excursion and thus more tension. My system magnetically sustains each string at a steady, repeatable volume, which is the same volume it is sustained at when the tuning is "recorded".

The whole idea of the system is to get a fresh tuning in a few seconds on a daily basis. Every time you press that button, you could have spent $100 and over an hour of inconvenience.

Don
Kansas City