Kudos to NVIDIA for CUDA. It’s fast, really fast.
CUDA allows you to harness the parallel power of a CUDA-capable GPU for (semi-) general computation. The result? Hundreds of cores running your binomial option pricing models, Monte Carlo simulations, or Black-Scholes computations.
I started learning CUDA yesterday and wrote my first simple CUDA program today. The library does have a learning curve, but it is not steep. It is largely a matter of learning the most efficient ways to work with CUDA (e.g. when to use shared, local, or constant memory). Happily, this is an incremental process: you can write bad yet working CUDA applications while slowly learning to write them better. And, as a bonus, even your bad code is likely to run laps around your CPU (for finance apps anyway).
I am currently writing a poker hand simulator (to compare against OpenHoldem’s speed), but once I am comfortable with the library, I will port my C++ option pricing algorithm. With CUDA, my algorithm will be (closer to) real-time!
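For readers who have not seen one, here is a minimal sketch of the kind of CPU-side code such a port might start from. It assumes a Cox-Ross-Rubinstein binomial tree for a European call with no dividends, checked against the closed-form Black-Scholes price. The function names (`blackScholesCall`, `binomialCall`) and parameters are illustrative only; this is neither the option pricing algorithm mentioned above nor the SDK sample's source.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Standard normal CDF, written in terms of the complementary error function.
double normCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Closed-form Black-Scholes price of a European call (assumes no dividends).
double blackScholesCall(double S, double K, double r, double sigma, double T) {
    double d1 = (std::log(S / K) + (r + 0.5 * sigma * sigma) * T) /
                (sigma * std::sqrt(T));
    double d2 = d1 - sigma * std::sqrt(T);
    return S * normCdf(d1) - K * std::exp(-r * T) * normCdf(d2);
}

// Cox-Ross-Rubinstein binomial tree for the same European call.
// Illustrative sketch: one option, one thread, no early exercise.
double binomialCall(double S, double K, double r, double sigma, double T,
                    int steps) {
    double dt = T / steps;
    double u = std::exp(sigma * std::sqrt(dt));   // up factor (down is 1/u)
    double d = 1.0 / u;
    double p = (std::exp(r * dt) - d) / (u - d);  // risk-neutral up probability
    double disc = std::exp(-r * dt);              // one-step discount factor

    // Payoffs at expiry: node i has i up moves and (steps - i) down moves,
    // so the price there is S * u^(2i - steps).
    std::vector<double> v(steps + 1);
    for (int i = 0; i <= steps; ++i) {
        double ST = S * std::pow(u, 2 * i - steps);
        v[i] = std::max(ST - K, 0.0);
    }

    // Roll the tree back to the root, discounting expected values.
    for (int n = steps - 1; n >= 0; --n)
        for (int i = 0; i <= n; ++i)
            v[i] = disc * (p * v[i + 1] + (1.0 - p) * v[i]);

    return v[0];
}
```

With 2048 time steps, `binomialCall(100, 100, 0.05, 0.2, 1.0, 2048)` should agree with `blackScholesCall(100, 100, 0.05, 0.2, 1.0)` to within about a cent. Each option is priced independently of the others, which is why a batch of them is a natural fit for a GPU.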
P.S. I bought the GeForce 9600 GSO, which was only $99 at Best Buy. This low-end NVIDIA card achieved the following results for the binomialOptions.exe sample included in the SDK:
    Using single precision...
    Using device 0: GeForce 9600 GSO
    Generating input data...
    Running GPU binomial tree...
    Options count : 512
    Time steps : 2048
    binomialOptionsGPU() time: 92.526009 msec
    Options per second : 5533.579236
    Running CPU binomial tree...
    Comparing the results...
    GPU binomial vs. Black-Scholes L1 norm: 1.484960E-004
    CPU binomial vs. Black-Scholes L1 norm: 1.045247E-004
    CPU binomial vs. GPU binomial L1 norm: 4.464579E-005
    TEST PASSED
    Shutting down...
Not bad, eh?
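The figures in the log can be sanity-checked with a little arithmetic. The sketch below uses two hypothetical helpers: `optionsPerSecond` divides the batch size by the elapsed time, and `relativeL1Norm` assumes the "L1 norm" lines report the sum of absolute differences divided by the sum of absolute reference values. That definition is an assumption on my part, not something the log itself confirms.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Options priced per second, given a batch size and elapsed time in milliseconds.
double optionsPerSecond(int optionCount, double elapsedMsec) {
    return optionCount / (elapsedMsec / 1000.0);
}

// Relative L1 error between computed values and a reference set.
// Assumed definition: sum(|value - ref|) / sum(|ref|).
// Both vectors are expected to have the same length.
double relativeL1Norm(const std::vector<double>& values,
                      const std::vector<double>& reference) {
    double sumDelta = 0.0;
    double sumRef = 0.0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        sumDelta += std::fabs(values[i] - reference[i]);
        sumRef += std::fabs(reference[i]);
    }
    return sumDelta / sumRef;
}
```

Plugging in the numbers from the log, `optionsPerSecond(512, 92.526009)` comes out at roughly 5533.58, which matches the reported figure and works out to a little under 0.2 msec per option.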