<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Path Dependent &#187; CUDA</title>
	<atom:link href="http://pathdependent.com/tag/cuda/feed/" rel="self" type="application/rss+xml" />
	<link>http://pathdependent.com</link>
	<description>Programming, Complex Systems, Trading, and Introspection</description>
	<lastBuildDate>Sun, 22 Aug 2010 19:08:23 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Warning: NVIDIA CUDA memset bug</title>
		<link>http://pathdependent.com/2009/05/09/warning-nvidia-cuda-memset-bug/</link>
		<comments>http://pathdependent.com/2009/05/09/warning-nvidia-cuda-memset-bug/#comments</comments>
		<pubDate>Sat, 09 May 2009 10:42:44 +0000</pubDate>
		<dc:creator>John Nelson</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[GPU]]></category>

		<guid isPermaLink="false">http://pathdependent.com/?p=58</guid>
		<description><![CDATA[Calling cudaMemset() on some platforms does nothing. It took me two hours to figure out why the result of my computation was so bizarre. My experience was similar to that of many people who posted on this informative thread. In emulator mode, the function performs as expected; when running on the device, the memory is not set.
]]></description>
			<content:encoded><![CDATA[<p>Calling cudaMemset() on some platforms does nothing. It took me two hours to figure out why the result of my computation was so bizarre. My experience was similar to that of many people who posted on <a title="cudaMemset thread" href="http://forums.nvidia.com/lofiversion/index.php?t29225.html">this informative thread</a>. In emulator mode, the function performs as expected; when running on the device, the memory is not set.</p>
]]></content:encoded>
			<wfw:commentRss>http://pathdependent.com/2009/05/09/warning-nvidia-cuda-memset-bug/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CUDA for Financial Modeling</title>
		<link>http://pathdependent.com/2008/12/30/cuda-for-financial-modeling/</link>
		<comments>http://pathdependent.com/2008/12/30/cuda-for-financial-modeling/#comments</comments>
		<pubDate>Tue, 30 Dec 2008 15:42:22 +0000</pubDate>
		<dc:creator>John Nelson</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[CUDA]]></category>

		<guid isPermaLink="false">http://pathdependent.com/?p=66</guid>
		<description><![CDATA[
Kudos to NVIDIA for CUDA: it’s fast, really fast.
CUDA allows you to harness the parallel power of a CUDA-capable GPU for (semi-) general computation. The result? Hundreds of cores running your binomial option pricing models, Monte Carlo simulations, or Black-Scholes computations.
I started learning CUDA yesterday; I wrote my first simple CUDA program today. The [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>Kudos to <a title="NVidia" href="http://www.nvidia.com/">NVIDIA</a> for <a title="Learn CUDA" href="http://www.nvidia.com/object/cuda_learn.html">CUDA</a>: it’s fast, really fast.</p>
<p>CUDA allows you to harness the parallel power of a <a href="http://www.nvidia.com/object/cuda_learn_products.html">CUDA-capable GPU</a> for (semi-) general computation. The result? Hundreds of cores running your <a title="Binomial Options" href="http://en.wikipedia.org/wiki/Binomial_options_pricing_model">binomial option pricing models</a>, <a title="Monte Carlo Methods in Finance" href="http://en.wikipedia.org/wiki/Monte_Carlo_methods_in_finance">Monte Carlo simulations</a>, or <a title="Black-Scholes" href="http://en.wikipedia.org/wiki/Black-Scholes">Black-Scholes</a> computations.</p>
<p>I started learning CUDA yesterday and wrote my first simple CUDA program today. The library has a non-negligible learning curve, but it is not steep. It is largely a matter of learning the most efficient ways to work with CUDA (e.g., when to use shared, local, or constant memory). Happily, this is an incremental process: you can write bad yet working CUDA applications while slowly learning to write better ones. As a bonus, even your bad code is likely to run laps around your CPU (for finance apps, anyway).</p>
<p>I am currently writing a poker hand simulator (to compare against <a title="Open Holdem" href="http://code.google.com/p/openholdembot/">OpenHoldem</a>’s speed), but once I am comfortable with the library, I will port my C++ option pricing algorithm. With CUDA, my algorithm will be (closer to) real-time!</p>
<p>P.S. I bought the GeForce 9600 GSO, which was only $99 at Best Buy. This low-end NVIDIA card achieved the following results for the binomialOptions.exe sample included in the SDK:</p>
<pre>Using single precision...
Using device 0: GeForce 9600 GSO
Generating input data...
Running GPU binomial tree...
Options count            : 512
Time steps               : 2048
binomialOptionsGPU() time: 92.526009 msec
Options per second       : 5533.579236
Running CPU binomial tree...
Comparing the results...
GPU binomial vs. Black-Scholes
L1 norm: 1.484960E-004
CPU binomial vs. Black-Scholes
L1 norm: 1.045247E-004
CPU binomial vs. GPU binomial
L1 norm: 4.464579E-005
TEST PASSED
Shutting down...</pre>
<p>Not bad, eh?</p></div>
]]></content:encoded>
			<wfw:commentRss>http://pathdependent.com/2008/12/30/cuda-for-financial-modeling/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
