Fast demosaicing software on NVIDIA GPU
Processing RAW data from a color camera starts with converting the raw sensor output into a 24/48-bit color image. This is usually done by interpolation, and the process is called demosaicing. There are several color filter patterns, but the most frequently used is the Bayer CFA (color filter array), which comes in four variants: RGGB, BGGR, GBRG, GRBG. Since each pixel registers only one color component, the sensor delivers just one third of the data needed for a full color image, and the missing two thirds must be interpolated. This interpolation strongly affects image quality, which makes the demosaicing algorithm probably the most important part of a raw image processing pipeline when high quality results are required.
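To make the "one component per pixel" point concrete, here is a minimal Python sketch of how the four Bayer variants map pixel coordinates to a color, and how a sensor effectively discards two of the three components at each pixel. The names `BAYER_TILES`, `cfa_color` and `mosaic` are hypothetical and chosen for illustration only; they are not part of any SDK.

```python
# Filter color at each position of the repeating 2x2 Bayer tile, row-major:
# (top-left, top-right, bottom-left, bottom-right).
BAYER_TILES = {
    "RGGB": ("R", "G", "G", "B"),
    "BGGR": ("B", "G", "G", "R"),
    "GBRG": ("G", "B", "R", "G"),
    "GRBG": ("G", "R", "B", "G"),
}


def cfa_color(pattern, row, col):
    """Return 'R', 'G' or 'B': the single component sampled at (row, col)."""
    tile = BAYER_TILES[pattern]
    return tile[(row % 2) * 2 + (col % 2)]


def mosaic(rgb, pattern):
    """Simulate the sensor: keep one component per pixel of an RGB image.

    rgb is a list of rows, each row a list of (r, g, b) tuples.
    The result is a list of rows of single values, i.e. one third of the data.
    """
    channel = {"R": 0, "G": 1, "B": 2}
    return [
        [px[channel[cfa_color(pattern, y, x)]] for x, px in enumerate(row)]
        for y, row in enumerate(rgb)
    ]
```

Every 2×2 tile holds two green samples, one red and one blue, so a demosaicing algorithm has to reconstruct two missing components at every pixel.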
High quality demosaicing is usually time-consuming and requires a lot of computation. This makes it difficult to achieve low latency and realtime processing, which is a must for video and digital cinema applications. Fast demosaicing algorithms for CPU offer low quality due to various artifacts (noise, blur, false colors, zipper effect, etc.). There are more sophisticated, higher quality demosaicing algorithms, but they can hardly reach realtime performance on a good CPU, even at 2K resolution. Demosaicing, however, can be implemented with parallel computations, which is where the GPU comes in. For a parallel algorithm this is the right approach: demosaicing on GPU can run much faster than realtime, even at 4K resolution and above.
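The reason demosaicing parallelizes well can be seen in a small CPU-side sketch. The code below uses plain bilinear interpolation over a 3×3 neighborhood, which is a simpler technique than the HQLI, DFPD and MG algorithms discussed on this page and is used here only for illustration. The function names are hypothetical, and the input is assumed to be at least 2×2 pixels. Each output pixel is computed from a small fixed window of the raw frame and does not depend on any other output pixel, so on a GPU each call to `demosaic_pixel` could in principle be assigned to its own thread.

```python
def cfa_color(pattern, row, col):
    """Color sampled at (row, col); pattern is e.g. 'RGGB' (2x2 tile, row-major)."""
    return pattern[(row % 2) * 2 + (col % 2)]


def demosaic_pixel(raw, pattern, y, x):
    """Bilinear estimate of (R, G, B) at one pixel from its 3x3 neighborhood.

    Reads only raw[y-1..y+1][x-1..x+1] and writes nothing shared, so all
    pixels can be processed independently and in any order.
    """
    h, w = len(raw), len(raw[0])
    out = []
    for ch in "RGB":
        if cfa_color(pattern, y, x) == ch:
            # This component was measured by the sensor: keep it as is.
            out.append(float(raw[y][x]))
            continue
        # Otherwise average the neighbors that did measure this component
        # (the window is clipped at the image borders).
        samples = [
            raw[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))
            if cfa_color(pattern, j, i) == ch
        ]
        out.append(sum(samples) / len(samples))
    return tuple(out)


def demosaic(raw, pattern):
    """Sequential driver; a GPU version would launch one thread per pixel."""
    return [
        [demosaic_pixel(raw, pattern, y, x) for x in range(len(raw[0]))]
        for y in range(len(raw))
    ]
```

Higher quality algorithms use larger windows and direction-dependent decisions, which raises the cost per pixel but keeps the same per-pixel independence.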
Features of demosaicing on CUDA
- Support of all mosaic patterns: BGGR, RGGB, GRBG, GBRG
- Read 10/12/14/16-bit RAW image data from SSD or RAM in DNG, CinemaDNG, CinemaDNG RAW and MLV formats
- Optional support of 12-bit RAW SDI from BMD Micro Studio 4K camera
- HQLI demosaic quality (High Quality Linear Interpolation algorithm, 5×5) – average PSNR ~ 36.5 dB (SSIM ~ 0.965) for Kodak data set
- DFPD demosaic quality (Directional Filtering and a Posteriori Decision algorithm, 11×11) – average PSNR ~ 39 dB (SSIM ~ 0.978) for Kodak data set
- MG demosaic quality (Multiple Gradients algorithm, 23×23) – average PSNR ~ 40.5 dB for Kodak data set
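For readers unfamiliar with the quality metric quoted above, PSNR compares a demosaiced image with the original full-color reference via the mean squared error. Below is a minimal sketch of the standard definition for flat sequences of sample values. The function name and the default peak value of 255 (8-bit data) are assumptions, and the exact evaluation protocol behind the figures in the list (for example border handling or per-channel averaging) is not stated here.

```python
import math


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    PSNR = 10 * log10(max_value^2 / MSE), where MSE is the mean squared
    difference between corresponding samples.
    """
    squared_error = 0
    count = 0
    for a, b in zip(reference, test):
        squared_error += (a - b) ** 2
        count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical inputs: no error at all
    return 10 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.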
Maximum performance for CUDA demosaicing software on NVIDIA Quadro P6000
- 16-bit frame with 4K resolution, without batch, no streaming, GPU computations only
- HQLI demosaicing performance ~ 50 GPix/s
- DFPD demosaicing performance ~ 18 GPix/s
- MG demosaicing performance ~ 4 GPix/s
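The throughput figures above can be translated into per-frame time with simple arithmetic. The sketch below assumes a 4K frame of 4096×2160 pixels (the exact frame size used in the benchmark is not stated) and uses hypothetical helper names. As the benchmark conditions say, this covers GPU computation only, without host-device transfers.

```python
def frame_time_ms(width, height, gpix_per_s):
    """Milliseconds needed to demosaic one frame at the given throughput."""
    return width * height / (gpix_per_s * 1e9) * 1e3


def frames_per_second(width, height, gpix_per_s):
    """Frames per second sustained at the given throughput."""
    return gpix_per_s * 1e9 / (width * height)


if __name__ == "__main__":
    # Throughputs in GPix/s taken from the list above.
    for name, rate in (("HQLI", 50), ("DFPD", 18), ("MG", 4)):
        ms = frame_time_ms(4096, 2160, rate)
        fps = frames_per_second(4096, 2160, rate)
        print(f"{name}: {ms:.3f} ms per frame, about {fps:.0f} fps")
```

Under these assumptions even the slowest of the three algorithms stays well above typical video frame rates at 4K.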
Fastvideo GPU demosaicing software is intended to deliver high quality results and to demonstrate the power of parallel computing on GPU with CUDA technology. We have also implemented a GPU-based SDK for various camera applications. It includes raw export, dark frame subtraction, FFC (flat-field correction), WB (white balance), demosaic, denoise, color correction, 1D and 3D LUT, tone mapping, various image filters, rotation, crop, resize, remap, sharpening, OpenGL and FFmpeg interoperability, JPEG and JPEG 2000 codecs, etc.