Tuesday, December 15, 2009

CPU post processing

Out of curiosity, I tried to replicate the post processing effect on the CPU instead of using shaders on the GPU. This is complicated because we're not supposed to access the graphics data directly, but uh-hmm, whatever.

To get the rendered graphics data, I use GetRenderTargetData, which copies the render target into the system memory pool. But MSDN says: "system memory is not typically accessible by the Direct3D device". So we must wait on a lock to copy from system memory to default memory. This is a really costly operation and is basically why my experiment failed. Running in exclusive (fullscreen) mode, it takes about 50ms to run. At 50ms per frame the best possible rate is 1000 / 50 = 20fps, so the framerate is killed to <20fps.

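// Texture surfaces with given CreateTexture parameters
// renderTexture is D3DUSAGE_RENDERTARGET, D3DPOOL_DEFAULT
// systemMemTexture is D3DUSAGE_DYNAMIC, D3DPOOL_SYSTEMMEM
// defaultMemTexture is D3DUSAGE_DYNAMIC, D3DPOOL_DEFAULT

D3DLOCKED_RECT source, dest;

// Read the render target back into system memory (waits for the GPU to finish)
pd3dDevice->GetRenderTargetData(renderTextureSurface, systemMemSurface);
systemMemSurface->LockRect(&source, NULL, D3DLOCK_READONLY);
defaultMemSurface->LockRect(&dest, NULL, D3DLOCK_DISCARD);

// Copy row by row, since the two surfaces may have different pitches.
// rowBytes (width * bytes per pixel) and height are assumed to come from the surface description.
for (UINT y = 0; y < height; ++y)
    memcpy((BYTE*)dest.pBits + y * dest.Pitch,
           (const BYTE*)source.pBits + y * source.Pitch, rowBytes);

systemMemSurface->UnlockRect();
defaultMemSurface->UnlockRect();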
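
To make the CPU side of this concrete without needing a Direct3D device, here is a minimal sketch of a pitch-aware copy followed by a simple per-pixel effect. Everything in it is an assumption for illustration: LockedRect is a hypothetical stand-in for D3DLOCKED_RECT, copyRows and invertColors are made-up helper names, the pixels are assumed to be 32 bits with alpha in the last byte, and color inversion is just a placeholder effect, not the one I was actually replicating.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for D3DLOCKED_RECT: a base pointer plus the
// number of bytes between the starts of two consecutive rows.
struct LockedRect {
    std::uint8_t* bits;
    std::size_t pitch;
};

// Copy `height` rows of `rowBytes` bytes each. The source and destination
// may have different pitches, so each row is addressed separately.
void copyRows(const LockedRect& src, const LockedRect& dst,
              std::size_t rowBytes, std::size_t height) {
    assert(src.pitch >= rowBytes && dst.pitch >= rowBytes);
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(dst.bits + y * dst.pitch,
                    src.bits + y * src.pitch,
                    rowBytes);
    }
}

// Placeholder CPU "post process": invert the three color channels of
// every 32-bit pixel in place and leave the fourth byte (alpha) alone.
void invertColors(const LockedRect& rect,
                  std::size_t width, std::size_t height) {
    assert(rect.pitch >= width * 4);
    for (std::size_t y = 0; y < height; ++y) {
        std::uint8_t* row = rect.bits + y * rect.pitch;
        for (std::size_t x = 0; x < width; ++x) {
            std::uint8_t* px = row + x * 4;
            px[0] = static_cast<std::uint8_t>(255 - px[0]);
            px[1] = static_cast<std::uint8_t>(255 - px[1]);
            px[2] = static_cast<std::uint8_t>(255 - px[2]);
        }
    }
}
```

In the real thing the two LockedRect values would come from the LockRect calls above, and the effect would run on the destination after the copy, or be folded into the copy loop to avoid touching every pixel twice.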



Apparently, GPGPU programming is enabled by both the hardware and the software. For example, nVidia CUDA lets programmers control the shared memory on the graphics hardware (see Tom's Hardware: CUDA from the Hardware Point of View).



The GPGPU FAQ says this about the shared memory:

"Texture read operations are routed through a cache structure to allow efficient exploitation of reference locality. Read operations also factor into shader thread scheduling as read operations that miss the cache will block the thread and cause other ready threads to run while waiting for the read to memory to complete. Writes are routed through different logic that does some amount of buffering to maximize the efficiency of writes."
