
fastest FPGA

Started by hypermodest August 23, 2006
Summary

A user seeks advice on selecting the fastest FPGA to implement a brute-force cryptographic hash cracker on a $1,000–$2,000 budget. The discussion explores the trade-offs between high-end single chips like the Virtex-5 and massively parallel arrays of lower-cost devices like the Spartan-3.

Experts recommend focusing on parallel hardware logic rather than soft processors and highlight that power consumption and thermal management are often more significant bottlenecks than raw clock frequency when saturating a device with active logic.

  • High-performance brute-forcing is better achieved through massive parallelism in logic rather than using soft processors like Nios II or MicroBlaze.
  • The Xilinx Virtex-5 is suggested as a top-tier performance choice, while Spartan-3 arrays offer a more cost-effective solution for distributed cracking.
  • FPGA 'overclocking' is debated, with manufacturers emphasizing that designs must meet static timing analysis to ensure reliability across temperature and voltage.
  • Dense, high-speed designs may require aggressive cooling and power derating because standard thermal specs assume lower logic toggle rates.
  • A search space of 2^48 can be processed relatively quickly on a single FPGA if the hash function is implemented efficiently in parallel blocks.
Tags: Cryptography, Thermal Management, Hardware Acceleration, Xilinx Virtex
Hello.
I've a task to make attempt to crack some cryptographical hash-function
by using brute-force attack. So I wish to implement it in FPGA.
How can I get fastest FPGA at the modern market?
Altera Nios II dev kit stratix 2 edt (EP2S60) is the right choice?
By the way, are these devices (EP2S60) can be overclocked? If yes, how?

"hypermodest" <hypermodest@gmail.com> writes:
> How can I get fastest FPGA at the modern market?
To a first approximation, by spending the most money.
> Altera Nios II dev kit stratix 2 edt (EP2S60) is the right choice?
No.
> By the way, are these devices (EP2S60) can be overclocked?
Yes.
> If yes, how?
By increasing the clock frequency beyond the Fmax reported by the development tools.
Eric Smith wrote:
> "hypermodest" <hypermodest@gmail.com> writes:
> > How can I get fastest FPGA at the modern market?
>
> To a first approximation, by spending the most money.
OK, but what if I want a single-chip solution?
"hypermodest" <hypermodest@gmail.com> wrote in message 
news:1156353676.802741.145830@75g2000cwc.googlegroups.com...
> Hello.
> I've a task to make attempt to crack some cryptographical hash-function
> by using brute-force attack. So I wish to implement it in FPGA.
> How can I get fastest FPGA at the modern market?
> Altera Nios II dev kit stratix 2 edt (EP2S60) is the right choice?
> By the way, are these devices (EP2S60) can be overclocked? If yes, how?
Do you intend the brute force method to use many, many parallel units? How much do you want to spend? You may run into power issues before you can consider overclocking. You could design so much high speed logic in a huge part that you can only run x% of the part at Fmax or all of the part at x% of Fmax.
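A rough way to see this trade-off is a toy model in which power scales linearly with both the fraction of the part that is toggling and the clock frequency. This is only a sketch: the linear model is a simplification, and the 10 W budget and 40 W full-load figure are made-up numbers, not data for any real device.

```python
def max_clock_fraction(power_budget_w, full_load_power_w, active_fraction):
    """Return the fraction of Fmax that fits within the power budget.

    Assumed model: power = full_load_power_w * active_fraction * clock_fraction,
    where full_load_power_w is the draw with 100% of the logic toggling at Fmax.
    """
    if not 0.0 < active_fraction <= 1.0:
        raise ValueError("active_fraction must be in (0, 1]")
    return min(1.0, power_budget_w / (full_load_power_w * active_fraction))


# Hypothetical numbers: 10 W can be delivered and cooled, 40 W needed at full load.
BUDGET_W, FULL_LOAD_W = 10.0, 40.0

for active in (0.25, 0.5, 1.0):
    clock = max_clock_fraction(BUDGET_W, FULL_LOAD_W, active)
    # Relative throughput: share of the part in use times its relative clock.
    print(f"{active:.0%} of the part at {clock:.0%} of Fmax "
          f"-> relative throughput {active * clock:.2f}")
```

Under this model all three configurations deliver the same relative throughput, which is the point: once the power ceiling is reached, it sets the aggregate rate, not Fmax.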
First, you have to decide how much logic you need, i.e. how much money
you want to spend.
Then you have to look at the two leading manufacturers, which are -in
order of size and speed- Xilinx and Altera.
From Xilinx, I would recommend the appropriate size Virtex-4 LX part,
or -if you need lots of multipliers and/or accumulators- the appropriate
size Virtex-4 SX part.
If you are after max speed, you hardly need a microprocessor, but both
companies offer a soft microprocessor, it's called MicroBlaze in Xilinx.
Good luck, sounds like a fun project.
Peter Alfke, Xilinx
H.Modest,

First, any processor, soft or hard is far too slow to be of any use (be
it NIOSII or 405PPC).  The design will have to be done massively
parallel, all in logic.

Virtex 5 is sampling now (65nm), and represents the latest that
technology can offer.

There are the 5VLX30, 50, 85 and 110 sampling...

With the 5VLX110 you have:

17280 slices (Virtex-5 slices are organized differently from previous
generations. Each Virtex-5 slice contains four 6-input LUTs and four
flip-flops.)

64 DSP48E (Each DSP48E slice contains a 25 x 18 multiplier, an adder,
and an accumulator.)

128 36-Kbit BRAM blocks (Virtex-5 block RAMs are fundamentally 36 Kbits in
size. Each block can also be used as two independent 18-Kbit blocks.)

etc.

With the clock tree supporting up to 550 MHz speeds (of course the
design has to meet timing, and so on to find the speed it can really
operate at).

http://direct.xilinx.com/bvdocs/publications/ds100.pdf

With the 6-input LUT able to be an SRL32 (32-bit shift register), there are
all sorts of tricks one can use to speed up crypto algorithms.  Combined
with the DSP48E, I suspect that Virtex 5 will see some of the speediest
encryption and decryption cores in the future.

Farms of FPGAs to do brute force decryption have been proposed, and some
have actually been built (or so the conference papers claim).  The use
of FPGA farms for decryption is no longer "new" and may be the reason
why triple DES is no longer recommended for new equipment (cracking
2^112 brute force is now considered "too easy"?).  Even AES128 is being
skipped for new designs by those who feel that the difference between
2^112 and 2^128 is just not enough...(for example, we use AES256 for
our bitstream decryptor).

There is even a cracking farm that proposes the use of low cost Spartan
3 FPGAs (cracking on a budget?) at modest speeds.  The Spartan 3 is
probably three times slower (at best), but since a cracking farm
requires very little communications, being massively parallel means just
having many devices.  Why not pick the least expensive device, and just
use a ton of them?

If you desire the fastest possible logic, with the lowest power, right
now the Virtex 5 65nm FPGA can not be beat (as there is no other 65nm
offering at this time by anyone, anywhere).

Until there is something to compare it with, you really have only one
choice.

There is no such thing as "over-clocking" an FPGA:  either it meets
timing and works, or it doesn't.  You may have to have very exotic
cooling in order not to melt down the device, at speeds like 550 MHz
with all of the logic toggling.  The industrial temp spec requires that
the junction be kept below 100C.  Commercial grade must be kept below 85C.

You could increase the clock rate till the device fails to operate
correctly, or can not be cooled, but in this application it would be
very difficult to know if it wasn't operating correctly!  Best to design
it to work where it is supposed to work.

Austin
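One common way to notice a marginal, silently failing search engine is to plant a known answer in every work unit and to re-check every reported hit in software. The sketch below only illustrates that idea on a host PC: `toy_hash` (SHA-256 truncated to 4 bytes) and `search` are hypothetical stand-ins for the unspecified hash function and for the hardware search, not anything described in this thread.

```python
import hashlib


def toy_hash(candidate: int) -> bytes:
    """Stand-in hash (assumption): SHA-256 of an 8-byte value, truncated."""
    return hashlib.sha256(candidate.to_bytes(8, "big")).digest()[:4]


def search(start, stop, targets):
    """Stand-in for the hardware: report candidates whose hash is a target."""
    return [c for c in range(start, stop) if toy_hash(c) in targets]


def search_with_canary(start, stop, target):
    """Search [start, stop) and fail loudly if a planted known answer is missed."""
    canary = start + (stop - start) // 2
    canary_hash = toy_hash(canary)  # computed by trusted host software
    hits = search(start, stop, {target, canary_hash})
    if canary not in hits:
        raise RuntimeError("canary missed: results for this range are suspect")
    # Re-verify every reported hit in software before believing it.
    return [c for c in hits if toy_hash(c) == target]


if __name__ == "__main__":
    secret = 1234
    print(search_with_canary(0, 5000, toy_hash(secret)))
```

A missed canary does not say which candidates were evaluated wrongly, only that the range has to be searched again at a clock rate where timing is met.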




> (cracking
> 2^112 brute force is now considered "too easy"?)
Woah. Is that really the case? These "farms" must be enormous: if you have 2 million instances of your cracking units running at, say, 500 MHz (both of which seem sort of optimistic to me), you're still "only" doing a quadrillion attempts per second. That's still many, many orders of magnitude off the search space in any reasonable amount of time. Obviously, I'm not a crypto guy, but are these proposals really for "brute force" farms?
--Josh
PS- too lazy to g
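Josh's back-of-the-envelope estimate is easy to check. The sketch below uses his figures (2 million units at 500 MHz) and assumes, optimistically, that each unit tests one key per clock cycle; the function name and that one-key-per-clock assumption are illustrative, not something stated in the thread.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(key_bits, units, clock_hz, tests_per_clock=1):
    """Years needed to try every key in a 2**key_bits search space."""
    rate = units * clock_hz * tests_per_clock  # keys tested per second
    return (2 ** key_bits) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    units, clock_hz = 2_000_000, 500e6
    print(f"aggregate rate: {units * clock_hz:.1e} keys/s")
    print(f"2^112 exhaustive search: {years_to_exhaust(112, units, clock_hz):.1e} years")
```

The aggregate rate is indeed 10^15 keys per second, and covering 2^112 keys at that rate takes on the order of 10^11 years, so the objection stands for a pure exhaustive search.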
hypermodest wrote:
> Hello.
> I've a task to make attempt to crack some cryptographical hash-function
> by using brute-force attack. So I wish to implement it in FPGA.
> How can I get fastest FPGA at the modern market?
> Altera Nios II dev kit stratix 2 edt (EP2S60) is the right choice?
> By the way, are these devices (EP2S60) can be overclocked? If yes, how?
I have a few questions for you:
1st: do you really need to brute force it?
2nd: how much time do you have?
3rd: budget? I assume you are not working for NSA or a similar YAT (Yet Another TLA).
4th: do you really have the ambition to learn FPGA development for a simple homework?

Given the simplicity of the algorithm and given that your search space contains "only" 2**64 keys, yes, it can be cracked... but I am pretty sure that your simple hash function can be cracked by other means than brute force.

Having said that, there are a lot of people on the Internet (some even on this newsgroup) doing this kind of thing with very cheap FPGAs.

I am sure the Xilinx/Altera/Lattice/Actel/Quicklogic/Atmel guys are more than happy to point out that their latest FPGA is the best one on the market. But the question is: even if you could afford the 10,000 USD it would cost, would you really need it? Could you really handle that beast?

regards,
-Burns
John_H wrote:
> You may run into power issues before you can consider overclocking. You
> could design so much high speed logic in a huge part that you can only run
> x% of the part at Fmax or all of the part at x% of Fmax.
Right on John ... :)

As I've noted before, you seriously need to "derate" large Xilinx FPGAs for designs that have a very high percentage of active logic. Since they assume around 15-20% of the design will be active, it's very easy to be unable to get power into the devices within the spec'ed voltage margins, or to keep them cool if you do. Doing single-point thermal monitoring on the die may not be enough to readily identify that other portions of the die are well above the temperature limit once you are aggressively cooling the part.

Packing the device from edge to edge with active logic will cause problems. At both high speeds and high density, many of the larger parts are simply not usable.
Hej,

> There is even a cracking farm that proposes the use of low cost Spartan
> 3 FPGAs (cracking on a budget?) at modest speeds.
That farm Austin is talking about sounds like our project ;-) Have a look at it at http://www.copacobana.org. There you can find some conference papers as well.
> The Spartan 3 is
> probably three times slower (at best),
Right. But it's cheap.
> but since a cracking farm
> requires very little communications, being massively parallel means just
> having many devices. Why not pick the least expensive device, and just
> use a ton of them?
Exactly. We have a system built of 120 Spartan3-1000 FPGAs doing all communication in full parallel on a 64-bit backplane.

[promotion]
Currently we are working on an application development framework for both the FPGAs and the host computer. If you are interested in the platform, feel free to drop me an email or contact Jan Pelzl instead (pelzl @ crypto.rub.de without the blanks, of course); he is responsible for the project.
[/promotion]

Cheers
/Chris
--
Christian Schleiffer
Communication Security (COSY)
Dept. of Electr. Eng. & Information Science
Ruhr-University Bochum, Germany
http://www.crypto.rub.de
cschleiffer@crypto.rub.de
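The reason such a farm needs so little communication is that the key space can be cut into independent ranges up front: each device gets a start and an end and only reports back on a hit. The following is a minimal sketch of that partitioning on the host side, using the 120 devices mentioned above and the 2**64 key space mentioned earlier in the thread. The function name and the contiguous-range scheme are assumptions for illustration, not a description of how the COPACOBANA host software works.

```python
def partition(total_keys, n_devices):
    """Split [0, total_keys) into n_devices contiguous, near-equal ranges."""
    base, extra = divmod(total_keys, n_devices)
    ranges = []
    start = 0
    for i in range(n_devices):
        # The first `extra` devices take one additional key each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    ranges = partition(2 ** 64, 120)
    first_start, first_stop = ranges[0]
    print(f"device 0 searches [{first_start}, {first_stop})")
    print(f"keys per device: about {first_stop - first_start:.3e}")
```

Because the ranges never overlap and together cover the whole space, a device that fails or is suspected of errors can simply have its range reassigned without touching the others.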