GPU Denoiser for Non-Local Means (NLM) Algorithm

Image: Fastcompression

Many camera applications need a denoiser, and there are now quite a few AI-based denoising solutions on the market that deliver high image quality. In general, these AI solutions share one problem: they require a very powerful GPU and are still very slow. Fastvideo has implemented a GPU-based NLM denoiser. NLM is a standard algorithm that has proven practical in many camera applications, especially in low-light conditions. The company has applied low-level acceleration to the NLM algorithm and can now denoise a 12 MP image (color, 16 bits per channel) in less than 0.4 ms on the Nvidia GeForce RTX 4090. This corresponds to a throughput of more than 30 GPix/s and makes NLM useful in many real-time camera applications.
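The quoted figures can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation, not a benchmark. It assumes that 1 MP means 10^6 pixels and that "color" means three channels per pixel. The helper names are made up for this illustration. Because 0.4 ms is an upper bound on the processing time, the resulting rates are lower bounds.

```python
# Back-of-the-envelope check of the quoted denoiser figures.
# Assumptions (not stated in the article): 1 MP = 10**6 pixels,
# and a "color" image has 3 channels per pixel.

MEGAPIXEL = 1_000_000


def throughput_gpix_per_s(pixels: int, seconds: float) -> float:
    """Return the pixel throughput in gigapixels per second."""
    return pixels / seconds / 1e9


def frame_bytes(pixels: int, channels: int = 3, bits_per_channel: int = 16) -> int:
    """Return the uncompressed size of one frame in bytes."""
    return pixels * channels * bits_per_channel // 8


pixels = 12 * MEGAPIXEL   # 12 MP image
latency_s = 0.4e-3        # quoted upper bound: 0.4 ms per image

gpix = throughput_gpix_per_s(pixels, latency_s)
size_mb = frame_bytes(pixels) / 1e6
frames_per_s = 1.0 / latency_s

print(f"throughput: at least {gpix:.1f} GPix/s")
print(f"frame size: {size_mb:.0f} MB uncompressed")
print(f"frame rate: at least {frames_per_s:.0f} frames/s for the denoising step alone")
```

Under these assumptions, 12 million pixels in 0.4 ms works out to 30 GPix/s, which matches the figure in the text. The frame-rate value covers only the denoising step and ignores data transfer and every other stage of a camera pipeline.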