Computing Fixed-Point Square Roots and Their Reciprocals Using Goldschmidt Algorithm
Michael Morris presents a practical, FPGA-friendly fixed-point implementation of the Goldschmidt algorithm to compute sqrt and 1/sqrt. The post shows how an msb-indexed Y_est table and an N_adj scaling factor produce a reliable initial inverse-square-root estimate for an FP32B16 format, enabling five-iteration convergence. It also covers fixed-point normalization, multiplier/shift tradeoffs, and why this fits a real-time motion-controller use case.
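The iteration summarized above can be sketched in a few lines. This is a floating-point illustration only, not the post's FP32B16 fixed-point code: the function name `goldschmidt_sqrt` is hypothetical, and the power-of-two starting estimate taken from `math.frexp` is a stand-in assumption for the msb-indexed Y_est table and N_adj factor.

```python
import math

def goldschmidt_sqrt(s, iterations=5):
    """Return (sqrt(s), 1/sqrt(s)) for s > 0 using Goldschmidt iteration."""
    if s <= 0:
        raise ValueError("s must be positive")
    # Stand-in initial estimate (assumption, not the post's table):
    # s = m * 2**e with m in [0.5, 1), so y0 = 2**(-e/2) gives s * y0**2 = m.
    _, e = math.frexp(s)
    y0 = 2.0 ** (-e / 2)
    x = s * y0   # converges toward sqrt(s)
    h = y0 / 2   # converges toward 1 / (2 * sqrt(s))
    for _ in range(iterations):
        r = 0.5 - x * h   # residual; shrinks roughly quadratically
        x += x * r
        h += h * r
    return x, 2 * h

root, inv_root = goldschmidt_sqrt(2.0)
```

Each pass uses only multiplies, adds, and a subtraction from a constant, which is part of why the method maps well onto FPGA multipliers.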
Dealing With Fixed Point Fractions
Fixed-point fractional math is easy to botch, and this post lays out pragmatic ways to avoid those mistakes. It clarifies the difference between integer and fractional overflow, shows how Q notation helps track binary-point scaling, and explains why multiplies add sign bits that may require shifting. Read for concrete FPGA strategies: keeping the full bit growth, shifting selectively, or normalizing aggressively, plus testing tips.
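As a rough illustration of the extra sign bit, here is a Q15 multiply sketch. The helper names `to_q15` and `q15_mul` are hypothetical, and saturating the result is one assumed policy among several.

```python
Q15_MIN, Q15_MAX = -32768, 32767

def to_q15(value):
    """Convert a float in [-1, 1) to a Q15 integer, clamping at the limits."""
    return max(Q15_MIN, min(Q15_MAX, round(value * 32768)))

def q15_mul(a, b):
    """Multiply two Q15 integers and return a saturated Q15 result."""
    product = a * b          # Q30: 30 fraction bits plus a duplicated sign bit
    result = product >> 15   # drop 15 fraction bits to return to Q15
    # -1.0 * -1.0 is the one case that overflows the fractional range.
    return max(Q15_MIN, min(Q15_MAX, result))

quarter = q15_mul(to_q15(0.5), to_q15(0.5))   # 8192, which is 0.25 in Q15
```

The right shift by 15 is where the binary point is restored; keeping the full 32-bit product instead corresponds to the "keep the bit growth" strategy.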
Polynomial Inverse
One of the important steps of computing point addition over elliptic curves is a division of two polynomials.
One Clock Cycle Polynomial Math
Error correction codes and cryptographic computations are most easily performed when working in GF(2^n).
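For readers new to GF(2^n), a bit-serial software multiply shows the arithmetic involved: addition is XOR, and a product is reduced by an irreducible polynomial. This is a sketch, not the single-cycle hardware design from the post; the function name `gf_mul` and the default polynomial 0x11B (the AES polynomial for GF(2^8)) are choices made for illustration.

```python
def gf_mul(a, b, poly=0x11B, n=8):
    """Multiply a and b in GF(2^n), reducing by the given polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^n) is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if a >> n:           # degree reached n, so reduce
            a ^= poly
    return result

product = gf_mul(0x57, 0x83)   # 0xC1, the worked example in FIPS 197
```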
Tools of the Trade: reading PDFs (and keeping bookmarks)
Victor Yurkovsky recommends MuPDF as a fast, no-frills PDF viewer and shares a tiny C tweak that makes bookmarks persistent and portable. The mod writes a 40-byte file next to each PDF containing ten 4-byte page numbers, so bookmarks follow your files across machines. If you juggle dozens of datasheets and manuals, this keyboard-driven workflow makes reading and navigation dramatically less painful.
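The file layout described above (ten 4-byte page numbers, 40 bytes total) is simple enough to read from other tools. The sketch below assumes little-endian signed integers, which the summary does not specify, and the function names are hypothetical.

```python
import struct

# Assumption: ten little-endian signed 32-bit page numbers (40 bytes).
BOOKMARK_FORMAT = "<10i"

def pack_bookmarks(pages):
    """Encode exactly ten page numbers into a 40-byte record."""
    if len(pages) != 10:
        raise ValueError("expected exactly ten page numbers")
    return struct.pack(BOOKMARK_FORMAT, *pages)

def unpack_bookmarks(data):
    """Decode a 40-byte record back into a list of ten page numbers."""
    return list(struct.unpack(BOOKMARK_FORMAT, data))

record = pack_bookmarks([1, 12, 40, 0, 0, 0, 0, 0, 0, 0])
```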
Inside the Spartan-6: Using LUTs to optimize circuits
Victor Yurkovsky hit poor synthesis packing while building a J1 CPU on Spartan-6 and traced the problem to an 18-bit logic ALU that mapped to many slices. He demonstrates a practical fix: instantiate LUT6 primitives with carefully chosen INIT values, then use RLOC placement to stack the per-bit LUTs and collapse the design down to five slices. This is a hands-on guide to Xilinx-specific optimization when synthesis falls short.
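Choosing a LUT6 INIT value amounts to writing out a 64-entry truth table: bit k of INIT is the output when inputs I5..I0 spell the number k. A small helper can compute it from any six-input function. The name `lut6_init` is hypothetical, and the functions shown are examples, not the ALU equations from the post.

```python
def lut6_init(fn):
    """Return the 64-bit INIT value for a six-input boolean function.

    fn is called as fn(i0, i1, i2, i3, i4, i5) with each argument 0 or 1.
    """
    init = 0
    for address in range(64):
        inputs = [(address >> bit) & 1 for bit in range(6)]  # inputs[0] is I0
        if fn(*inputs):
            init |= 1 << address
    return init

# Example: a LUT that XORs I0 and I1 and ignores the other inputs.
xor_init = lut6_init(lambda i0, i1, i2, i3, i4, i5: i0 ^ i1)
print(f"64'h{xor_init:016X}")
```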
Makefiles for Xilinx Tools
Building a bitstream from HDL is messy, and Victor Yurkovsky lays out a minimal, practical makefile workflow for Xilinx ISE and XST. He shows a simple project layout, techniques to tame ISE's generated logs and temps, and a ready-to-clone repo; an LED blinker example builds to bitstream in under 20 seconds on his machine. Use it as a pragmatic starting point for command-line FPGA builds.
Use Microprogramming to Save Resources and Increase Functionality
Microprogramming can rescue an overfull FPGA, Michael Morris shows, by compressing control logic and time-multiplexing FIFO storage. He replaces an ABEL state machine with a small microprogram ROM that uses block RAM for deep Rx/Tx FIFOs and LUT RAM for pointers and counters, freeing about 25 percent of the device. The article includes Verilog comparisons, resource tables, and a microassembler link to reproduce the approach.
I don’t often convert VHDL to Verilog but when I do ...
Converting VHDL to Verilog is tedious, and Christopher Felton lays out a pragmatic, repeatable workflow using vhd2vl to do most of the heavy lifting. He walks through the iterate-run-comment-fix cycle, highlights frequent failure points like arrays, records and packages, and explains why many open-source projects favor Verilog for better FOSS simulator support.
MyHDL Interface Example
Christopher Felton shows how MyHDL 0.9 interfaces bundle Signals into a single bus object to cut connector clutter and simplify module connections. The post walks through a pedagogical example where button presses drive a memory-mapped BareBoneBus read-modify-write that inverts LEDs, with a TDD-style testbench and notes on converting to Verilog/VHDL and loading the example on supported boards.
BGA and QFP at Home 1 - A Practical Guide.
It's a myth that BGAs and fine-pitch QFPs can't be soldered at home. Victor Yurkovsky lays out a practical, no-frills approach for hobbyists to design and assemble FPGA boards using 2-layer PCBs, breakout modules, and low-cost reflow methods like toaster ovens or hotplates. The article focuses on manufacturable PCB choices, netlist-driven workflows, and power/decoupling tactics that make high-density parts approachable for amateurs.
An Editor for HDLs
If you prefer Notepad++ over Emacs, Dave Vandenbout shows how to turn it into a capable HDL editor using templates, a Perl package generator, and Emacs run in batch mode for beautification. He covers FingerText snippets for VHDL skeletons, binding a Perl script to auto-create/update package component declarations, and invoking Emacs from a hotkey to format files with one keystroke.
VGA Output in 7 Slices. Really.
Victor Yurkovsky shows how to generate VGA timing on a Xilinx Spartan-3 using clever SRL16 tricks to squeeze the generator into just a few slices. By using 32-bit SRLs for line pulses, two mutually prime SRL lengths as a divide-by-99 timebase, and tapped SRLs to combine HSYNC and HBLANK, the approach achieves accurate-enough horizontal and vertical timing with minimal LUT usage.
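The divide-by-99 idea can be checked with a quick simulation: two recirculating shift registers whose lengths share no common factor line up only once per product of their lengths. The lengths 9 and 11 below are an assumption chosen because 9 x 11 = 99; the post's actual SRL configuration may differ, and `coincidence_ticks` is a hypothetical name.

```python
from collections import deque

def coincidence_ticks(len_a=9, len_b=11, cycles=300):
    """Return the clock ticks at which both shift-register outputs are high."""
    # Each register recirculates a single 1 bit.
    reg_a = deque([1] + [0] * (len_a - 1))
    reg_b = deque([1] + [0] * (len_b - 1))
    ticks = []
    for tick in range(cycles):
        if reg_a[-1] and reg_b[-1]:   # AND of the two register outputs
            ticks.append(tick)
        reg_a.rotate(1)               # shift one position per clock
        reg_b.rotate(1)
    return ticks

pulses = coincidence_ticks()   # [98, 197, 296]: one pulse every 99 clocks
```

Two short registers plus one AND gate replace a 7-bit counter and its comparator, which is where the slice savings come from.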
MyHDL FPGA Tutorial II (Audio Echo)
Christopher Felton demonstrates how to build an FPGA audio echo using MyHDL by storing delayed samples in BRAM and mixing them back with incoming audio. The project shows parameterizable sample rate, sample width, buffer depth, and conversion from MyHDL to Verilog, with a strong emphasis on test-driven verification and simulation-based resource reports. Read on to see how delay, scaling, and BRAM usage affect real-time audio.
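In software terms, the echo reduces to a circular buffer and a scaled add. The sketch below is plain Python rather than MyHDL, stores the incoming sample (a feed-forward echo) as an assumption, and omits clipping to the sample width; `echo`, `delay`, and `atten_shift` are hypothetical names.

```python
def echo(samples, delay, atten_shift=1):
    """Mix each sample with a copy delayed by `delay` samples.

    The delayed copy is attenuated by a right shift, as hardware would do.
    """
    buffer = [0] * delay   # stands in for the BRAM delay line
    index = 0
    output = []
    for sample in samples:
        delayed = buffer[index]                  # oldest stored sample
        output.append(sample + (delayed >> atten_shift))
        buffer[index] = sample                   # overwrite with the newest
        index = (index + 1) % delay
    return output

impulse_response = echo([1000, 0, 0, 0, 0], delay=3)   # [1000, 0, 0, 500, 0]
```

Buffer depth sets the echo time (depth divided by sample rate), so longer delays cost proportionally more BRAM.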










