High performance demosaicing software on NVIDIA GPU
Processing RAW footage from color cameras starts with converting raw image sensor data into a color image. This is usually done by interpolation, and the process is called demosaicing. Several color filter patterns exist, but the most widely used is the Bayer CFA (color filter array), which comes in four layouts: RGGB, BGGR, GBRG and GRBG. Because each pixel registers only one color component, the sensor delivers just one third of the data a full color image requires, so the missing two thirds have to be interpolated from the raw samples. This interpolation strongly affects image quality, which makes the demosaicing algorithm probably the most important part of a raw image processing pipeline.
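To make the idea concrete, here is a minimal sketch of the simplest possible approach, plain bilinear interpolation in pure Python. It is far simpler than the algorithms listed further down and is not how the GPU software is implemented; the names `PATTERNS`, `cfa_color` and `demosaic_bilinear` are hypothetical, and the image is assumed to be at least 2 × 2 pixels so that every 3 × 3 window contains all three colors.

```python
# Layout of the repeating 2x2 Bayer tile: PATTERNS[name][row % 2][col % 2].
PATTERNS = {
    "RGGB": ("RG", "GB"),
    "BGGR": ("BG", "GR"),
    "GBRG": ("GB", "RG"),
    "GRBG": ("GR", "BG"),
}


def cfa_color(pattern, row, col):
    """Return 'R', 'G' or 'B': the single component the sensor pixel records."""
    return PATTERNS[pattern][row % 2][col % 2]


def demosaic_bilinear(raw, pattern="RGGB"):
    """Naive bilinear demosaic (illustration only).

    raw is a list of rows of numbers, at least 2x2 in size.
    Returns a list of rows of (r, g, b) tuples.
    """
    height, width = len(raw), len(raw[0])
    out = []
    for y in range(height):
        out_row = []
        for x in range(width):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            # Collect same-color samples from the 3x3 window, clipped at borders.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        color = cfa_color(pattern, ny, nx)
                        sums[color] += raw[ny][nx]
                        counts[color] += 1
            own = cfa_color(pattern, y, x)
            pixel = {}
            for color in "RGB":
                if color == own:
                    pixel[color] = raw[y][x]  # measured value, kept as is
                else:
                    pixel[color] = sums[color] / counts[color]  # interpolated
            out_row.append((pixel["R"], pixel["G"], pixel["B"]))
        out.append(out_row)
    return out
```

Every output pixel keeps its one measured component and receives the other two as averages of neighboring samples. The fact that each output pixel depends only on a small fixed window of the input is also what makes the task easy to parallelize.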
Demosaicing interpolation is usually time-consuming: a lot of computation is needed to produce the result. This makes it hard to achieve low latency and real-time processing, which are a must for video and digital cinema applications. Fast demosaicing algorithms for CPU give low quality because of various artifacts (noise, blur, false colors, zipper effect, etc.). More sophisticated, high quality demosaicing algorithms exist, but they can hardly deliver real-time performance on a good CPU even at 2K resolution. Demosaicing, however, lends itself to parallel computation, and this is where the GPU comes in. For a parallel algorithm this is the right approach: demosaicing on a GPU can reach really high performance, much faster than real time, even at 4K resolution and above.
Features of demosaicing on CUDA
- All Bayer mosaic patterns for input data supported (RGGB, BGGR, GBRG, GRBG)
- Read 16-bit input CFA image data from HDD/SSD/RAID in DNG, CinemaDNG, CinemaDNG RAW, MLV and PGM formats
- Minimum image resolution 128 × 128 pixels
- Maximum image resolution: 16,000 × 16,000 pixels or more
- HQLI demosaic (High Quality Linear Interpolation, window 5×5) – average PSNR ~ 36.5 dB (SSIM ~ 0.965) for Kodak data set
- DFPD demosaic (Directional Filtering and a Posteriori Decision, window 11×11) – average PSNR ~ 39 dB (SSIM ~ 0.978) for Kodak data set
- MG demosaic (Multiple Gradients, window 23×23) – average PSNR ~ 40.7 dB for Kodak data set
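For readers unfamiliar with the quality metric quoted above, PSNR compares a demosaiced image with the original full-color reference. The sketch below uses the standard textbook definition; the function name `psnr` is hypothetical, and the exact procedure behind the figures above (per-channel averaging, border handling, bit depth) is not stated here, so 8-bit data with a peak value of 255 is an assumption.

```python
import math


def psnr(reference, test, max_value=255.0):
    """PSNR in dB between two equally sized images given as flat sequences."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to halving the mean squared error.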
Peak performance for demosaicing software on NVIDIA Quadro P6000
- 16-bit frame with 4K resolution, no batch, no streaming, GPU computations only
- HQLI demosaic ~ 50 GPix/s
- DFPD demosaic ~ 18 GPix/s
- MG demosaic ~ 4 GPix/s
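To put these rates in perspective, the sketch below converts a throughput in GPix/s into time per frame and frames per second. It assumes that "4K" means 3840 × 2160 pixels, which is not specified above, and the helper names are hypothetical. The results are upper bounds for the demosaicing stage alone, since the figures above cover GPU computations only and exclude disk I/O and host-to-device transfers.

```python
def frame_time_ms(width, height, gpix_per_s):
    """Time in milliseconds to process one frame at the given throughput."""
    return width * height / (gpix_per_s * 1e9) * 1e3


def frames_per_second(width, height, gpix_per_s):
    """Frames per second sustained at the given throughput."""
    return gpix_per_s * 1e9 / (width * height)


# Peak rates quoted above, in GPix/s.
RATES = {"HQLI": 50.0, "DFPD": 18.0, "MG": 4.0}

# Assumption: a 4K frame is 3840 x 2160 pixels.
WIDTH, HEIGHT = 3840, 2160

for name, rate in RATES.items():
    ms = frame_time_ms(WIDTH, HEIGHT, rate)
    fps = frames_per_second(WIDTH, HEIGHT, rate)
    print(f"{name}: {ms:.3f} ms per frame, {fps:.0f} fps")
```

Under these assumptions even the slowest algorithm, MG, needs only about 2 ms per 4K frame, well inside the roughly 16.7 ms budget of 60 fps video.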
Fastvideo GPU demosaicing software is a tool that demonstrates the power of high performance parallel computation on GPU with NVIDIA CUDA technology. It is hard to imagine how fast a parallel GPU implementation of demosaicing can be for high quality image processing. We have also designed a fast image processing SDK on GPU for high speed and high resolution cameras: dark frame subtraction, flat-field correction, white balance, demosaicing, denoising, color correction, 1D and 3D LUT, tone mapping, image filtering, rotating, cropping, resizing, remapping, sharpening, OpenGL rendering and output, JPEG/JPEG2000/Bayer/H.264/H.265 encoding and decoding, FFmpeg integration, etc.