FPGARelated.com
Forums

My invention: Coding wave-pipelined circuits with buffering function in HDL

Started by Weng Tianxiang January 10, 2018
On Saturday, January 20, 2018 at 4:17:02 PM UTC-8, rickman wrote:
> Jan Coombs wrote on 1/20/2018 2:20 PM:
> > On Fri, 19 Jan 2018 17:42:57 -0500
> > rickman <gnuarm.deletethisbit@gmail.com> wrote:
> >
> > ...
> >
> >> I think I understand the concept of wave pipelining. It is
> >> just eliminating the intermediate registers of a pipeline
> >> circuit and designing the combinational logic so that the
> >> delays are even enough across the many paths so the output can
> >> be clocked at a given time and will receive a stable result
> >> from the input N clocks earlier. In other words, the logic is
> >> designed so that the changes rippling through the logic never
> >> catch up to the changes created by the data entered 1 clock
> >> cycle earlier. Nice if you can do it.
> >
> > Thanks, interesting, but sounds complex to get reliable
> > operation.
> >
> >> I can see where this would be useful in an ASIC. In ASICs FFs
> >> and logic compete for space within the chip. In FPGAs the
> >> ratio between FFs and logic are fixed and predetermined. So
> >> using logic without using the FFs that are already there is
> >> not of much value.
> >
> > Generally true, but
> >
> > 1) You might be able to combine three stages that require 2/3 of
> > a clock cycle for maximum propagation delay, and get the result
> > in in the time of two clock cycles.
>
> If your stages are only using 2/3 of a clock, you can regroup the logic to
> make it 1 clock each in two stages. There is supposed to be software to
> handle that for you although I've never used it.
>
> > 2) If the Microsemi/Actel Igloo/Smartfusion FPGAs are used then
> > each tile can be a latch or a LUT, so flops are not wasted.
>
> There's your first mistake, no one uses Actel/Microsemi FPGAs. They long
> for the day they are as big as Lattice, lol!
>
> > Either way there must be a great deal of complex floor planning
> > and/or timing constraints needed to make this work. Automating
> > this would be amazing?
>
> Isn't that what the OP is claiming? I'm surprised he could make this work
> over PVT. The actual stable time has to be on a clock edge, the same clock
> edge under all conditions. I wouldn't want to try that manually in a simple
> circuit.
>
> --
>
> Rick C
>
> Viewed the eclipse at Wintercrest Farms,
> on the centerline of totality since 1998
Rick,

SMB stands for Series Master component with Buffering function, one of the 2 WPCs (Wave-Pipelining Components).

I don't understand what you are saying: "Isn't that what the OP is claiming? I'm surprised he could make this work over PVT."

What do OP and PVT stand for?

My attention on this topic is centered on introducing my inventions to the public and asking for critical comments, challenges or suspicions from a technical point of view, not especially on whether or not they are useful.

Personally I have never had a chance to write a pipelined circuit, not to mention design a wave-pipelined circuit.

What I did is the result of my observation that such an important problem can be perfectly resolved by my insight as a person outside the wave-pipelined design circle, based entirely on only one reference:
[1] IEEE Transactions on VLSI Systems
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.1783&rep=rep1&type=pdf

Weng
On Sun, 21 Jan 2018 08:22:45 -0800 (PST)
Weng Tianxiang <wtxwtx@gmail.com> wrote:

[much irrelevant stuff snipped - please help with this]

> My attention on this topic is centered on introduction of my
> inventions to public and asking for their critical comments,
> challenge or suspicion from technical point of view, not
> specially on whether or not they are useful.

I was unable to quickly understand the "2 fast reading materials" which you sent me.

> Personally I never have a chance to write a pipelined circuit,
> not mention designing for a wave-pipelined circuit.

Why do you have patents? A patent should disclose the method of the novelty, so it would need an implementation. Perhaps this is what I am missing?

> What I did is a result of my observation that such an
> important problem can be perfectly resolved by my insight as a
> person outside the wave-pipelined design circle, fully based
> on only one reference [1] IEEE Transactions on VLSI Systems
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.1783&rep=rep1&type=pdf .

Perhaps if you follow wave-pipelined techniques to the limit, you will find yourself looking at asynchronous (or self clocked) logic. There is also much historical work on this, and it may be easier to test on FPGA chips[1].

Jan Coombs
--
[1] or at least drum up some business for Microsemi/Actel
On 21/01/2018 00:16, rickman wrote:
> Jan Coombs wrote on 1/20/2018 2:20 PM:
>> 2) If the Microsemi/Actel Igloo/Smartfusion FPGAs are used then
>> each tile can be a latch or a LUT, so flops are not wasted.
>
> There's your first mistake, no one uses Actel/Microsemi FPGAs. They
> long for the day they are as big as Lattice, lol!
Microsemi has been in the number 3 spot for as long as I have used FPGAs (about 28 years, starting with Actel's A1010). They are twice as large as Lattice.

Here is a reference:

https://www.eetimes.com/author.asp?doc_id=1331443

Hans
www.ht-lab.com
On 1/21/18 11:22 AM, Weng Tianxiang wrote:
> What do OP and PVT stand for?

OP = Original Poster, the person who started the topic

PVT = Process / Voltage / Temperature (I presume)

The issue is that gate delay isn't a hard fixed value, but changes slightly (or not so slightly) from device to device and under varying operating conditions. That calls into question the design of a gate tree that presents results stably and reliably two clock cycles after application, even with the inputs changing after one clock cycle.
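The stability concern can be sketched as a small timing check. This is a minimal Python sketch, assuming a single combinational path launched at clock edge 0 and captured N edges later; the function name `wave_capture_ok`, its parameters, and the delay values are illustrative assumptions, not taken from the patent or from any vendor tool. Clock skew and constraints on internal nodes are ignored.

```python
def wave_capture_ok(t_clk, n_cycles, d_min, d_max, t_setup=0.0, t_hold=0.0):
    """Return True if data launched at edge 0 can be captured at edge n_cycles.

    All times share one unit (e.g. ns). d_min and d_max are the shortest and
    longest propagation delays through the logic (hypothetical values).
    """
    capture = n_cycles * t_clk
    # The slowest path of this wave must settle before the capture edge.
    latest_ok = d_max + t_setup <= capture
    # The fastest path of the NEXT wave (launched at t_clk) must not
    # disturb the output until after the hold window.
    earliest_ok = t_clk + d_min >= capture + t_hold
    return latest_ok and earliest_ok


# Nominal delays of 6..9 ns, captured two 5 ns cycles after launch.
nominal = wave_capture_ok(5.0, 2, 6.0, 9.0, 0.5, 0.5)
# Same logic with every delay 30% slower, then 30% faster.
slow = wave_capture_ok(5.0, 2, 6.0 * 1.3, 9.0 * 1.3, 0.5, 0.5)
fast = wave_capture_ok(5.0, 2, 6.0 * 0.7, 9.0 * 0.7, 0.5, 0.5)
print(nominal, slow, fast)
```

In this made-up case the nominal delays pass, but the slower part fails because the wave arrives too late, and the faster part fails because the following wave arrives too early. The capture edge has to fit in both directions at once.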
On Sunday, January 21, 2018 at 8:44:09 AM UTC-8, Jan Coombs wrote:
> I was unable to quickly understand the "2 fast reading
> materials" which you sent me.
>
> Perhaps if you follow wave-pipelined techniques to the limit, you
> will find yourself looking at asynchronous (or self clocked)
> logic. There is also much historical work on this, and it may
> be easier to test on FPGA chips[1].
>
> Jan Coombs
> --
> [1] or at least drum up some business for Microsemi/Actel
Jan,

I don't think you are right: "Perhaps if you follow wave-pipelined techniques to the limit, you will find yourself looking at asynchronous (or self clocked) logic."

I have studied asynchronous circuits, but found them to be a dead end because of their structural inefficiency and the current commercial trend. Coding or synthesizing a wave-pipelined circuit has nothing to do with its counterpart for an asynchronous circuit, and the former is much more complex than the latter!

Synthesizing a wave-pipelined circuit needs much more complex algorithms, which, based on my observation, have been maturing since 1969.

My design never considers PVT; it belongs to another specialty field and I have zero knowledge of it.

From my point of view, building a bridge between a code designer and a synthesizer is a very important step in publicizing the technology for wave-pipelined circuits:

In 1980 Intel published and developed the 8087 with a 32-bit floating multiplier; more than 10 years later, in 1997, they announced MMX technology, including a second version, a 64-bit floating multiplier. From my point of view, the second version of the 64-bit floating multiplier using MMX technology is nothing but a technology using a wave-pipelined circuit.

Regular engineers never have a chance to implement a wave-pipelined circuit because of the complexity of all the related PVT issues.

But according to my scheme, the most complex part of generating a wave-pipelined circuit is left entirely to the synthesizer manufacturers. A code designer working in HDL only needs to focus on how to code it, with zero knowledge of how a wave-pipelined circuit is synthesized and generated. Hopefully that leads to a situation in which any college student with basic knowledge of HDL can generate the second version of the 64-bit floating multiplier within half an hour.

As far as the 2 fast reading materials are concerned, please contact me by private email and let me know what you want: specification, drawings, or source code in VHDL. Sorry, I mistakenly thought you were a lawyer, not an engineer.

Thank you.

Weng
Weng Tianxiang wrote on 1/21/2018 11:22 AM:
> On Saturday, January 20, 2018 at 4:17:02 PM UTC-8, rickman wrote:
>> Isn't that what the OP is claiming? I'm surprised he could make this work
>> over PVT. The actual stable time has to be on a clock edge, the same clock
>> edge under all conditions. I wouldn't want to try that manually in a simple
>> circuit.
>
> Rick,
>
> SMB stands for Series Master component with Buffering function, one of 2 WPC (Wave-Pipelining Component).
>
> I don't understand what you are saying:
> "Isn't that what the OP is claiming? I'm surprised he could make this work
> over PVT."
>
> What do OP and PVT stand for?
OP means "original poster" and is a common abbreviation in newsgroups.

PVT means Process, Voltage, Temperature; these are the three main factors causing variations in delay times in silicon chips. If you don't account for these effects in your timing calculations, your wave pipelining idea won't work. If you aren't aware of this, I suspect you don't really understand how to design with FPGA devices. It isn't all textbook analysis.
> My attention on this topic is centered on introduction of my inventions to public and asking for their critical comments, challenge or suspicion from technical point of view, not specially on whether or not they are useful.
>
> Personally I never have a chance to write a pipelined circuit, not mention designing for a wave-pipelined circuit.
>
> What I did is a result of my observation that such an important problem can be perfectly resolved by my insight as a person outside the wave-pipelined design circle, fully based on only one reference
> [1] IEEE Transactions on VLSI Systems
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.1783&rep=rep1&type=pdf .
Then I think you have not solved anything. The problem with wave pipelining is that the timing can vary so much that the output of the combinational circuit won't be stable during the clock edges. If you haven't tested your ideas by designing a circuit and running it on an FPGA, you don't know any of this will work in the real world.

--

Rick C

Viewed the eclipse at Wintercrest Farms,
on the centerline of totality since 1998
Weng Tianxiang wrote on 1/21/2018 2:15 PM:
> in 1980 Intel published and developed 8087 for 32-bit floating multiplier; 10 and more years later, in 1997 they claimed MMX technology, including a second version of 64-bit floating multiplier. From my point of view the second version of 64-bit floating multiplier using MMX technology is none but a technology using wave-pipelined circuit.
>
> Regular engineers never have a chance to implement a wave-pipelined circuit because of the complexity of all related PVT.
>
> But according to my scheme, the most complex part of generating a wave-pipelined circuit is fully left to synthesizer manufacturers and a code designer in HDL only focuses his attention to how to code it with zero knowledge about how a wave-pipelined circuit is synthesized and generated that hopefully leads to a situation that any college student with basic knowledge in HDL can generate the second version of 64-bit floating multiplier within half an hour.
The multiplier is not a good example to use, as many FPGAs contain multiplier blocks. But then they are pipelined and so won't work in a non-pipelined solution, so maybe you can show your technique even if it has little practical value in this case.

The problem is "the most complex part of generating a wave-pipelined circuit is fully left to synthesizer manufacturers". Your method leaves me wondering what your software is doing. Asking the synthesizer companies to solve your problems of making it work is a bit of a stretch. What makes you think they will even take on your idea rather than provide their own solution?

If your patent only covers the idea of writing simple HDL to describe the circuit desired and leaving the implementation details to the synthesis companies, I don't think you have actually patented anything. This part is very obvious. The *real* work is in synthesizing a circuit that will work in the FPGA.

--

Rick C
Richard Damon wrote on 1/21/2018 1:24 PM:
> On 1/21/18 11:22 AM, Weng Tianxiang wrote:
>> What do OP and PVT stand for?
>
> OP = Original Poster, the person who started the topic
>
> PVT = Process / Voltage / Temperature (I presume)
>
> The issue being that gate delay isn't a hard fixed value, but changes
> slightly (or not so slightly) from device to device and under varying
> operating conditions, which brings in to question the designing of a gate
> tree that presents results stably and reliably two clock cycles after
> application, even with the inputs changing after one clock cycles.
MUCH more than slightly. The figure I have been told is that 2:1 is not uncommon. That's why overclockers can get CPU chips to run *much* faster than they are rated. They provide excellent cooling, tweak the PSU voltage and hand-select their chips.

This is also why we use synchronous logic with registers for pipelines.

--

Rick C
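The effect of a 2:1 spread can be put into numbers. This is a rough sketch under a simplified single-path model, assuming a fixed clock and no delay compensation; the function name `max_waves` and the delay figures are hypothetical. A wave launched at edge 0 and captured at edge N needs a period T with d_max + t_setup <= N*T and (N-1)*T + t_hold <= d_min. Eliminating T gives the bound N <= (d_max + t_setup) / (d_max - d_min + t_setup + t_hold).

```python
import math


def max_waves(d_min, d_max, t_setup=0.0, t_hold=0.0):
    """Largest N (clock edges between launch and capture) for which some
    clock period T satisfies both capture conditions.

    d_min must be the smallest delay over ALL operating conditions and
    d_max the largest, since one clock has to work everywhere.
    """
    if d_min < t_hold:
        raise ValueError("d_min must be at least t_hold")
    spread = d_max - d_min + t_setup + t_hold
    if spread <= 0:
        raise ValueError("delay spread plus setup/hold must be positive")
    return math.floor((d_max + t_setup) / spread)


# One corner only: well-balanced paths, 9..10 ns.
print(max_waves(9.0, 10.0, 0.25, 0.25))  # 6
# Same logic, but the slow corner doubles the longest delay to 20 ns.
print(max_waves(9.0, 20.0, 0.25, 0.25))  # 1
```

With these made-up numbers a tightly balanced path could in principle hold six waves at a single corner, but once the worst-case delay is twice the best-case delay the bound drops to N = 1, which is an ordinary registered stage.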
HT-Lab wrote on 1/21/2018 1:19 PM:
> On 21/01/2018 00:16, rickman wrote:
>> There's your first mistake, no one uses Actel/Microsemi FPGAs. They long
>> for the day they are as big as Lattice, lol!
>
> Microsemi has been at the number 3 spot for as long as I use FPGA's (+/- 28
> years starting with Actel's A1010). They are twice as large as Lattice.
>
> Here is a reference:
>
> https://www.eetimes.com/author.asp?doc_id=1331443
There's some BS somewhere...

http://www.fpgadeveloper.com/2011/07/list-and-comparison-of-fpga-companies.html

More importantly, look at the numbers in your link. The Actel/Microsemi numbers are going in the wrong direction! X, A and L are headed upward year-to-year and Actel is headed down!

While looking this up I found a link indicating that the JTAG interface of the ProASIC3 devices has a back door which would allow their security to be bypassed. Security was their claim to fame and this could be a major blow to the company.

--

Rick C
> The multiplier is not a good example to use as many FPGAs contain multiplier
> blocks. But then they are pipelined and so won't work in a non-pipelined
> solution, so maybe you can show your technique even if it has little
> practical value in this case.
>
> Rick C
What my patents cover is a method by which a circuit designer can code a wave-pipelined circuit in HDL (not only in VHDL, but in all HDLs), nothing else. If you slightly change the code, a 64x64-bit floating multiplier can be generated!!!

Anybody who codes in HDL has nothing to do with PVT and never takes PVT into consideration. Not me, not you, nobody does it!!! That is other people's business.

Based on my method, all you need to do is describe the logic for the critical path and call a library to finish the job. Everything else is left to Xilinx or Altera!

If you are really interested in a good FPGA example, I recommend reading the following paper:

Wave-pipelined intra-chip signaling for on-FPGA communications
http://www.doc.ic.ac.uk/~wl/papers/10/integration10tm.pdf

There are numerous circuits in an FPGA that are worth implementing as wave-pipelined circuits.

Weng