My invention: Coding wave-pipelined circuits with buffering function in HDL

Started by Weng Tianxiang January 11, 2018
Hi,

A wave-pipelined circuit has the same logic as its pipelined counterpart, except that the wave-pipelined circuit has only one stage: a critical path from the input register, through a piece of computational logic, to the output register, with no intermediate registers.
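For contrast with the paragraph above, here is a minimal sketch of what a conventional register-pipelined counterpart might look like. It is not part of the proposed system: the entity name pipelined_mult and the generic stages are hypothetical, and the sketch assumes a synthesizer with register retiming enabled, so that the trailing registers can be moved into the multiplier logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical contrast example: a conventional pipelined multiplier.
-- The intermediate registers in "pipe" are exactly what a wave-pipelined
-- circuit does without.
entity pipelined_mult is
   generic (
      input_data_width : positive := 64;
      stages           : positive := 3   -- registers after the multiply (assumed value)
   );
   port (
      CLK  : in  std_logic;
      Da_i : in  signed(input_data_width-1 downto 0);
      Db_i : in  signed(input_data_width-1 downto 0);
      Dc_o : out signed(2*input_data_width-1 downto 0)
   );
end pipelined_mult;

architecture rtl of pipelined_mult is
   type pipe_t is array (0 to stages-1)
      of signed(2*input_data_width-1 downto 0);
   signal Ra, Rb : signed(input_data_width-1 downto 0);  -- input registers
   signal pipe   : pipe_t;                               -- pipeline registers
begin
   Dc_o <= pipe(stages-1);

   process(CLK)
   begin
      if rising_edge(CLK) then
         Ra      <= Da_i;
         Rb      <= Db_i;
         pipe(0) <= Ra * Rb;             -- full product, then delayed
         for i in 1 to stages-1 loop
            pipe(i) <= pipe(i-1);        -- candidates for retiming
         end loop;
      end if;
   end process;
end rtl;
```

This version accepts one pair of operands per cycle and delivers each product stages + 1 clock edges after the operands are presented, at the cost of the extra registers in pipe.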

The kernel idea of my invention is this: a designer provides the least possible information and logic code about the critical path and leaves all the complex logic design to a synthesizer and a system library, which is what an HDL should do.

All coding takes 3 steps:
1. Write a Critical Path Component (CPC) with a defined interface;

2. Call a Wave-Pipelining Component (WPC) provided by a system library;

3. Call one of 3 link statements to link a CPC instantiation with a paired WPC instantiation to specify what your target is.

Here is all the code for a 64*64-bit signed integer multiplier C <= A*B.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.wave_pipeline_package.all;

-- CPC code for wave-pipelined 64-bit signed integer multiplier C <= A*B
-- CPC_1_2 is linked with SMB by link1() / link2() if "wave" is accepted in VHDL
-- link1(): generation would fail if the circuit cannot accept 1 data per cycle
-- link2(): generation never fails and the circuit is capable of accepting
--          1 data per INPUT_CLOCK_NUMBER cycles

entity CPC_1_2 is
   generic (
      input_data_width  : positive  := 64;                  -- optional
      output_data_width : positive  := 128                  -- optional
   );
   port (
      CLK   :  in std_logic;
      WE_i  :  in std_logic;     -- '1': write enable to input registers A & B
      Da_i  :  in signed(input_data_width-1 downto 0);      -- input data A
      Db_i  :  in signed(input_data_width-1 downto 0);      -- input data B
      WE_o_i:  in std_logic;  -- '1': write enable to output register C
      Dc_o  :  out unsigned(output_data_width -1 downto 0)  -- output data C
   );
end CPC_1_2;

architecture A_CPC_1_2 of CPC_1_2 is
   signal   Ra :  signed(input_data_width-1 downto 0);  -- input register A
   signal   Rb :  signed(input_data_width-1 downto 0);  -- input register B
   signal   Rc :  signed(output_data_width-1 downto 0); -- output register C
   signal   Cl :  signed(output_data_width-1 downto 0); -- combinational logic

begin
   Cl    <= Ra * Rb;             -- combinational logic output, key part of CPC
   Dc_o  <= unsigned(Rc);        -- output through output register

   p_1 : process(CLK)
   begin
      if Rising_edge(CLK) then
         if WE_i = '1' then      -- WE_i = '1' : latch input data
            Ra <= Da_i;
            Rb <= Db_i;
         end if;

         if WE_o_i = '1' then    -- WE_o_i = '1': latch output data
            Rc <=3D Cl;
         end if;
      end if;
   end process;

--------------------------------------------------------------------------------

end A_CPC_1_2;
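
To check the CPC above as an ordinary registered multiplier, a small self-checking testbench along the following lines might be used. This is only a sketch under stated assumptions: the testbench name tb_CPC_1_2, the clock period, and the test operands are made up for illustration, and it assumes that CPC_1_2 has been compiled into library work (which in turn requires work.wave_pipeline_package to be available). It exercises only the two write enables and the product. It does not demonstrate wave-pipelined operation, which depends on the WPC and the link statements.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench for CPC_1_2; names and timing are illustrative.
entity tb_CPC_1_2 is
end tb_CPC_1_2;

architecture sim of tb_CPC_1_2 is
   constant W          : positive := 64;
   constant CLK_PERIOD : time     := 10 ns;   -- arbitrary simulation clock

   signal CLK    : std_logic := '0';
   signal WE_i   : std_logic := '0';
   signal WE_o_i : std_logic := '0';
   signal Da_i   : signed(W-1 downto 0) := (others => '0');
   signal Db_i   : signed(W-1 downto 0) := (others => '0');
   signal Dc_o   : unsigned(2*W-1 downto 0);
   signal done   : boolean := false;
begin
   -- Free-running clock that stops once the stimulus process is finished.
   CLK <= not CLK after CLK_PERIOD/2 when not done else '0';

   dut : entity work.CPC_1_2
      generic map (input_data_width => W, output_data_width => 2*W)
      port map (CLK => CLK, WE_i => WE_i, Da_i => Da_i, Db_i => Db_i,
                WE_o_i => WE_o_i, Dc_o => Dc_o);

   stim : process
   begin
      Da_i <= to_signed(-3, W);
      Db_i <= to_signed(7, W);
      WE_i <= '1';
      wait until rising_edge(CLK);    -- Ra and Rb capture the operands
      WE_i   <= '0';
      WE_o_i <= '1';
      wait until rising_edge(CLK);    -- Rc captures Ra * Rb
      WE_o_i <= '0';
      wait until rising_edge(CLK);    -- Dc_o has settled by this edge

      -- Dc_o is declared unsigned, so reinterpret it as signed to compare.
      assert signed(Dc_o) = to_signed(-21, 2*W)
         report "CPC_1_2: unexpected product" severity failure;

      done <= true;
      wait;
   end process;
end sim;
```

In this sketch WE_o_i is asserted exactly one cycle after WE_i, which is simply the one-cycle behaviour of the CPC in simulation. In the proposed system the timing of WE_o_i would presumably be generated by the WPC.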

In summary, once HDLs adopt my system, writing a wave-pipelined circuit is as simple as writing a one-cycle logic circuit.

Thank you.

Weng

On Wednesday, January 10, 2018 at 5:56:45 PM UTC-8, Weng Tianxiang wrote:
Hi,

The following information is from Wikipedia:

1. The Intel 8087, announced in 1980, was the first x87 floating-point coprocessor for the 8086 line of microprocessors.

2. MMX is a single instruction, multiple data (SIMD) instruction set designed by Intel, introduced in 1997 with its P5-based Pentium line of microprocessors, designated as "Pentium with MMX Technology". It developed out of a similar unit introduced on the Intel i860, and earlier the Intel i750 video pixel processor. MMX is a processor supplementary capability that is supported on recent IA-32 processors by Intel and other vendors. MMX has subsequently been extended by several programs by Intel and others: 3DNow!, Streaming SIMD Extensions (SSE), and ongoing revisions of Advanced Vector Extensions (AVX).

The 8087's 64-bit floating-point multiplier needs 5 cycles to finish one data processing and accepts one set of input data per cycle. The MMX 64-bit floating-point multiplier needs 4 cycles to finish one data processing and accepts one set of input data per 2 cycles.

Because each multiplication needs one multiplicand A and one multiplier B to produce the result C, many benchmarks naturally claim that the MMX 64-bit floating-point multiplier is 20% faster than the 8087's (4 cycles vs. 5 cycles).

With my invention, any college student with knowledge of HDL can write an MMX wave-pipelined 64-bit floating-point multiplier within half an hour, under the following conditions:

1. My invented system is fully accepted into HDL;

2. Synthesizer manufacturers have updated their products to handle the generation of the related wave-pipelined circuits. All the related technology and algorithms are available off the shelf.

3. It needs time.

One wonderful wave-pipelined circuit, I think, may be a 16-channel FFT processor built with wave-pipelined technology: the benefits are a faster running frequency and large savings in logic area and power consumption.

Thank you.

Weng
On Saturday, January 13, 2018 at 1:31:17 PM UTC-8, Rick C. Hodgin wrote:
> Do you have a YouTube example? And an example that will
> synthesize in Icarus? So we can see how your method compares to a
> standard example.
>
> --
> Rick C. Hodgin
Hi Rick,

Actually, I have had 3 patents issued on the subject:

1. 9,747,252: Systematic method of coding wave-pipelined circuits in HDL.
2. 9,734,127: Systematic method of synthesizing wave-pipelined circuits in HDL.
3. 9,575,929: Apparatus of wave-pipelined circuits.

All 3 patents share the same specification, drawings, and abstract, with different claims.

Here is my new non-provisional patent application 15,861,093 (the application, hereafter), "Coding wave-pipelined circuits with buffering function in HDL", filed with the USPTO on 2018/01/03.

The application has a *.txt (*.vhd) file attached, so its contents are not secret. Anyone interested in the subject can email me to ask for what they want, and I will email the file set to them, even though the full application set will be published 18 months later.

The following is part of my sales-promotional file for some big companies:

"The new application can be viewed, to some extent, as the logical continuation of the 3 patents, but legally it is a brand new invention devoting its main attention to coding the buffering function for wave-pipelined circuits in HDL, a topic never mentioned in the 3 patents, while still paying great attention to improving the 3 patents to make them more robust, friendlier and more complete from the point of view of coding designers."

The 3 previous patents had a first version of the source code attached; the new application provides the second version. With the 2nd version of the VHDL source code, you can use a VHDL-2002 or later simulator to simulate all the workings and generate waveforms. The source file is also well annotated, with debugging function code inserted.

Please email me what you want me to send:

For the 3 patents:
1.1. Specification.
1.2. 3 sets of claims.
1.3. Drawings.
1.4. Source code.
1.5. ZIP file of all the above.

For the new application:
2.1. Specification.
2.2. Claims.
2.3. Drawings.
2.4. Abstract.
2.5. Source code.
2.6. ZIP file of all the above.

For the new application, the specification has 81 pages, the 48 claims take 15 pages, and the drawings take 24 pages. If you lack time, the best way to learn all the working structures needs only 2.1. Specification, 2.3. Drawings, and 2.4. Abstract.

Because the target of my patents and the new application is a) to make my invented system part of HDL (not only VHDL, but all HDL languages), and b) to make the source code part of the system library in HDL, I am willing to distribute my code and all related files to anyone who is really interested in how I did it.

Through CPC_1_2 you can see that my scheme needs the least logic information and coding from a designer to resolve a very difficult problem, one that has been open for almost 50 years.

My email address is wtx wtx @ gmail . com (please remove the spaces between characters).

Thank you.

Weng