> Even with your example of a matrix multiplication, there is still a lot to
> figure out. For one thing, you usually have to fold the multiplication
> because you don't have the resources to do the whole thing in parallel.
> The Matlab->gates tool that I used once is long gone.
>
> I just spent several days implementing a design that I modeled in two
> lines of Matlab code. For a tool to have converted those two lines into a
> good hardware implementation would've been difficult.
I agree, you can't expect the tool to just do it - there are many things
that need to be tweaked. But those are architectural design choices, which
are different from writing Verilog or VHDL. By all means expose the
tradeoffs to the designer in a way they understand, just don't make them
write HDL. Software folks are familiar with the idea of different data
structures having different performance qualities. But by and large a
serial 'program' isn't a helpful way to expose tradeoffs to either the
software or hardware engineer.
As I said, this ignores the elephant in the room that is communication.
Your C code has this illusion that it lives in a flat uniform memory space
because it's behind a cache, an MMU, a prefetcher and a standard library,
that do a lot of work (in terms of time, area and power) to make it look
that way. To get good performance, you need both a way to write the compute
and a way to move the data around to the right place at the right time.
'HLS' tools are usually poor at handling that: I don't know of an HLS 'C to
gates' tool in which it would make sense to write a cache, for example.
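
To put rough numbers on the data-movement point, here is a minimal sketch in Python. It is not taken from any HLS tool. `ExternalMemory` is a hypothetical stand-in for off-chip memory that simply counts reads, and the two function names are made up for illustration. The same square matrix multiply is done twice: once fetching every operand from external memory on each use, and once copying the operands into local buffers first.

```python
class ExternalMemory:
    """Hypothetical off-chip memory that counts every word read from it."""

    def __init__(self, data):
        self.data = data
        self.reads = 0

    def read(self, row, col):
        self.reads += 1
        return self.data[row][col]


def matmul_no_buffer(mem_a, mem_b, n):
    """Fetch both operands from external memory every time they are used."""
    return [[sum(mem_a.read(i, p) * mem_b.read(p, j) for p in range(n))
             for j in range(n)]
            for i in range(n)]


def matmul_local_buffer(mem_a, mem_b, n):
    """Copy B on chip once and each row of A once, then reuse the copies."""
    local_b = [[mem_b.read(p, j) for j in range(n)] for p in range(n)]
    out = []
    for i in range(n):
        local_row = [mem_a.read(i, p) for p in range(n)]  # one fetch per word
        out.append([sum(local_row[p] * local_b[p][j] for p in range(n))
                    for j in range(n)])
    return out
```

For n x n inputs the unbuffered version performs 2*n^3 external reads and the buffered one 2*n^2, with identical arithmetic in both. Deciding where the local copies live and when they are filled is exactly the part the compute description does not capture.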
Theo
Reply by Kevin Neilson●July 27, 2016
The high-level design tools I've used weren't very abstracted. To make something work well, I had to keep moving to lower levels of abstraction. In a tool that was supposed to be high-level, I found myself instantiating DSP48s. Not very abstract.
Reply by Kevin Neilson●July 27, 2016
Even with your example of a matrix multiplication, there is still a lot to figure out. For one thing, you usually have to fold the multiplication because you don't have the resources to do the whole thing in parallel. The Matlab->gates tool that I used once is long gone.
I just spent several days implementing a design that I modeled in two lines of Matlab code. For a tool to have converted those two lines into a good hardware implementation would've been difficult.
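
A rough illustration of what folding means here, as a Python sketch rather than anything a real tool produces. The function name `folded_matmul` and the parameter `num_macs` are hypothetical. The model assumes a fixed pool of multiply-accumulate units that is reused over several time steps, and it ignores how partial sums landing on the same output in one step would actually be combined in hardware.

```python
def folded_matmul(a, b, num_macs):
    """Multiply a (n x k) by b (k x m) with only num_macs MAC units.

    Returns (result, cycles), where cycles is the number of time steps
    the shared MAC units need to get through all the work.
    """
    n, k, m = len(a), len(b), len(b[0])
    result = [[0] * m for _ in range(n)]
    # Every output element needs k multiply-accumulate operations.
    work = [(i, j, p) for i in range(n) for j in range(m) for p in range(k)]
    cycles = 0
    # Each time step issues at most num_macs operations.
    for start in range(0, len(work), num_macs):
        for i, j, p in work[start:start + num_macs]:
            result[i][j] += a[i][p] * b[p][j]
        cycles += 1
    return result, cycles
```

A 2x2 by 2x2 multiply needs 8 multiply-accumulates, so `num_macs=8` finishes in one step and `num_macs=2` takes four. Picking that number, and the order in which the work is scheduled, is the architectural decision that two lines of Matlab say nothing about.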
Reply by rickman●July 27, 2016
On 7/27/2016 1:33 PM, Jecel wrote:
> On Wednesday, July 27, 2016 at 11:41:11 AM UTC-3, rickman wrote:
>> On 7/26/2016 8:11 PM, Kevin Neilson wrote:
>>> I think Celoxica is defunct.
>>
>> So there it is!
>
> Celoxica was a very successful company that was bought out by
> a much larger competitor and immediately shut down. This left
> many customers who loved the product without any good options.
>
> And this is why I refuse to use non open source tools no matter
> what advantages they claim to have. You can't know if they will
> still be available tomorrow. The exception (for now) is the
> vendor tools for generating the FPGA bitfiles.
At least with open source tools you can't be disappointed by a total
lack of support. My installation of Lattice Diamond errors out when I
perform synthesis and I can't get any help from the vendor support... at
all!
--
Rick C
Reply by Jecel●July 27, 2016
On Wednesday, July 27, 2016 at 11:41:11 AM UTC-3, rickman wrote:
> On 7/26/2016 8:11 PM, Kevin Neilson wrote:
> > I think Celoxica is defunct.
>
> So there it is!
Celoxica was a very successful company that was bought out by
a much larger competitor and immediately shut down. This left
many customers who loved the product without any good options.
And this is why I refuse to use non open source tools no matter
what advantages they claim to have. You can't know if they will
still be available tomorrow. The exception (for now) is the
vendor tools for generating the FPGA bitfiles.
-- Jecel
Reply by rickman●July 27, 2016
On 7/26/2016 8:11 PM, Kevin Neilson wrote:
> I think Celoxica is defunct.
So there it is!
--
Rick C
Reply by Theo Markettos●July 27, 2016
Mark Curry <gtwrek@sonic.net> wrote:
> They're quite capable of this. Problem is they DON'T WANT to. They'd prefer
> to be moving their software coding to a higher level of abstraction (through
> advances in SW languages and techniques). Then leave all these
> "fiddly hardware details" to the hardware designers.
Indeed. It puzzles me why hardware designers would think that a pile of
nested for loops constitutes a high-level abstraction.
If we're concentrating on compute to the exclusion of all else (as HLS seems
to), the algorithm might be defined in terms of matrix operations, so surely
it's that which should be the input to the HLS toolchain? In a matrix
multiply, say, the parallelism is inherent and it is friendly to the
programmer: they want to compute matA*matB and all the other details can be
left to the tool to figure out. At the very least it leaves plenty more
scope for the tool to improve, rather than trying to unpick C loops that
represent matA*matB.
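
To make the contrast concrete, here is a small Python sketch of the two styles. Both function names are made up for illustration, and plain Python lists stand in for whatever matrix type a real flow would use. The first version is the C-style loop nest, which commits to one particular evaluation order. The second describes each output element as a dot product with no dependence on any other element.

```python
def matmul_loops(a, b):
    """C-style: three nested loops spell out one specific evaluation order."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_declarative(a, b):
    """Each output element is the dot product of a row of a and a column of b.

    No element depends on any other, so an implementation is free to
    compute them in any order, or all at once.
    """
    cols = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]
```

Python still evaluates the second form sequentially, of course. The point is only that nothing in the description requires it to, whereas a tool given the first form has to prove that the loop iterations are independent before it can reorder them.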
Matlab isn't a great language for many reasons, but it does make it possible
to write code with implicit parallelism pretty easily, without even thinking
about it. That would be heading towards my definition of 'high level'.
(I'm not familiar with Simulink-to-gates flows, because I'm not a great fan
of schematics. Perhaps the tools are better in this space)
Theo
Reply by Kevin Neilson●July 26, 2016
I think Celoxica is defunct.
Reply by Tom Gardner●July 26, 2016
On 26/07/16 18:43, Kevin Neilson wrote:
> That's what they like to say. It sounds nice. But software engineers still can't program parallel processor arrays well, let alone FPGAs.
>
> These tools can all make a functional FPGA, but if it uses too many gates and has too many levels of logic, you're better off using software.
Be aware that the "high frequency trading" mob put
trading algorithms (i.e. if X then buy shares Y)
into FPGAs to shave off the odd millisecond latency.
Obviously development turnaround time for the
algorithms is very important.
That's a good use-case for software->gates.
They've also laid their own $600m trans-Atlantic
fibre optic cable to avoid contention and latency,
and have bought up all the microwave transmission
towers between Chicago and New York because the
speed of light in fibres is noticeably slower than
that in air.
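
A back-of-envelope check of the fibre versus air point, as a short Python sketch. All the inputs other than the speed of light are assumptions for illustration: a straight-line Chicago to New York distance of roughly 1,150 km, a refractive index of about 1.47 for silica fibre, and about 1.0003 for air. Real fibre routes are longer than the straight line, which this ignores.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum

# Assumed values, for illustration only.
DISTANCE_KM = 1150.0   # approximate straight-line Chicago to New York
N_FIBRE = 1.47         # typical refractive index of silica fibre
N_AIR = 1.0003         # refractive index of air near sea level


def one_way_latency_ms(distance_km, refractive_index):
    """Propagation delay in milliseconds through a medium of the given index."""
    return distance_km * refractive_index / C_VACUUM_KM_PER_S * 1000.0


fibre_ms = one_way_latency_ms(DISTANCE_KM, N_FIBRE)
air_ms = one_way_latency_ms(DISTANCE_KM, N_AIR)
saving_ms = fibre_ms - air_ms
```

Under those assumptions the one-way saving comes out at a little under 2 ms, which is the same order as the "odd millisecond" mentioned above.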
Reply by rickman●July 26, 2016
On 7/26/2016 1:30 PM, Kevin Neilson wrote:
> They did claim that software engineers with no hardware experience could be designing FPGAs after a very short training period. Which might have been true. But there's a difference between doing FPGAs and doing them well.
I suppose the proof of the pudding is in the eating. Who is using this
tool for production in this way?
--
Rick C