Celebrating SRC-B's Multiple Championships at the CVPR 2024 Competitions

The New Trends in Image Restoration and Enhancement (NTIRE 2024) workshop and its associated challenges, together with Mobile Intelligent Photography & Imaging (MIPI 2024), are international competitions held in conjunction with the Computer Vision and Pattern Recognition Conference (CVPR 2024).

NTIRE 2024 aims to foster advances in computer vision and image processing, focusing on image denoising, super-resolution, deblurring, inpainting, and colorization. It also showcases innovative techniques that improve image quality by enhancing detail, texture, and visual appearance.

MIPI 2024 emphasizes the integration of novel image sensors and imaging algorithms, addressing the increasing demand for computational photography and imaging on mobile platforms. Although the field is held back by a lack of high-quality research data and limited opportunities for in-depth exchange between industry and academia, the competition serves as a platform for advancing mobile intelligent photography and imaging.

We are pleased to announce that the Samsung R&D Institute China - Beijing (SRC-B) won or placed in the top two in several challenges at these events. Details are as follows:

Few-shot RAW Image Denoising Challenge (MIPI @ CVPR2024) – Runner-up
Bracketing Image Restoration and Enhancement Challenge - Track 1 - BracketIRE (NTIRE @ CVPR2024) – Champion
Bracketing Image Restoration and Enhancement Challenge - Track 2 - BracketIRE+ (NTIRE @ CVPR2024) – Champion
RAW Image Super-Resolution (NTIRE @ CVPR2024) – Champion
HR Depth from Images of Specular and Transparent Surfaces - Track 1 Stereo (NTIRE @ CVPR2024) – Top 2
Night Photography Rendering Challenge (NTIRE @ CVPR2024) – Top 2

For the first four challenges, SRC-B cooperated with the Mobile eXperience (MX) Business. Drawing on MX’s rich experience in commercializing camera Image Signal Processor (ISP)-related tasks, the two groups worked as one team to analyze issues, process data, design models, and conduct experiments, which led to these results.

The Bracketing Image Restoration and Enhancement (BracketIRE and BracketIRE+) challenges use bracketing photography to capture high-quality photos with clear content in low-light environments. SRC-B and MX’s method, the Refine with Two-Stage Image Restoration and Enhancement Network (RT-IRENet), consists of two modules. The first, based on the Temporally Modulated Recurrent Network (TMRNet) [1], fuses the raw images into a coarsely restored result. The second, based on the Nonlinear Activation Free Network (NAFNet) [2], refines the first module’s output into the detailed final image. RT-IRENet ranked first on both tracks.

For the RAW Image Super-Resolution challenge, SRC-B and MX introduced a two-stage network that follows a divide-and-conquer strategy. The first stage recovers the image structure from the low-resolution, degraded RAW image, and the second stage recovers detail for a refined reconstruction. For synthetic data generation, they extended existing methods by studying hardware-specific RAW image degradations and proposing new definitions of the relevant device-specific noise profiles, along with new blur kernels that match typical real-world scenarios. They also designed a randomized degradation model that simulates different interactions between these simulated defects. Finally, SRC-B proposed a novel Focal Pixel Loss, which improved performance during the model fine-tuning stage. The solution achieved state-of-the-art performance on this challenging task.
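The idea of a randomized degradation model can be illustrated with a minimal sketch. This is not SRC-B's pipeline: the Gaussian blur kernel, the shot/read noise approximation, the parameter ranges, the random ordering of blur and noise, and the function names (gaussian_kernel, blur, degrade) are all assumptions chosen for illustration, and the image is treated as a single plane rather than a Bayer mosaic.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian kernel (size should be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def blur(img, kernel):
    """Filter a single-channel image with reflect padding."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="reflect")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def degrade(hr, rng, scale=2):
    """Make a low-resolution, noisy sample from a clean plane in [0, 1].

    Assumed (hypothetical) parameter ranges; height and width must be
    divisible by `scale`.
    """
    sigma = rng.uniform(0.5, 2.0)    # blur strength
    shot = rng.uniform(1e-4, 1e-2)   # signal-dependent noise gain
    read = rng.uniform(1e-4, 1e-2)   # signal-independent noise std
    blur_first = rng.random() < 0.5  # randomize how the defects interact

    kernel = gaussian_kernel(5, sigma)

    def add_noise(x):
        # Gaussian approximation of shot + read noise.
        var = shot * np.clip(x, 0.0, None) + read ** 2
        return x + rng.normal(0.0, 1.0, x.shape) * np.sqrt(var)

    img = hr.astype(np.float64)
    img = add_noise(blur(img, kernel)) if blur_first else blur(add_noise(img), kernel)

    # Downsample by averaging scale x scale blocks.
    h, w = img.shape
    lr = img.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    return np.clip(lr, 0.0, 1.0)
```

Calling degrade(hr, np.random.default_rng(0)) repeatedly on the same clean image yields differently degraded low-resolution samples, which is the property a randomized degradation model is meant to provide for training.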

The Few-shot RAW Image Denoising Challenge involves training neural networks to denoise raw images with limited paired data. SRC-B and MX collected additional data from DSLR cameras as a pretraining dataset and designed a specialized color loss to train the deep neural network, producing results with superior perceptual quality.

In the stereo track of the HR Depth from Images of Specular and Transparent Surfaces challenge, monocular relative depth from large pretrained monocular depth estimation (MDE) models is used as guidance. The relative depth is first aligned with the metric disparity of the stereo network and then fed, together with the cost volume, into a multiscale gated recurrent unit (GRU) network for iterative refinement. Integrating prior information from the MDE model significantly improves the performance of the stereo-matching network in transparent, specular, and even textureless areas. The results could benefit future scene understanding applications in robotics and smartphones.
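The alignment step can be sketched as follows. The article does not say how the relative depth is aligned, so this example assumes a common choice: a least-squares fit of one scale and one shift that maps the MDE output onto the stereo network's disparity. The function name align_relative_to_metric and its arguments are hypothetical.

```python
import numpy as np

def align_relative_to_metric(rel, metric, valid=None):
    """Fit scale s and shift t so that s * rel + t approximates metric.

    rel    : relative prediction from a monocular model (any shape)
    metric : metric disparity from the stereo network (same shape)
    valid  : optional boolean mask of pixels to use in the fit
    Assumes the two maps are related by an affine transform.
    """
    rel = np.asarray(rel, dtype=np.float64)
    metric = np.asarray(metric, dtype=np.float64)
    if valid is None:
        valid = np.isfinite(rel) & np.isfinite(metric)

    x = rel[valid]
    y = metric[valid]
    # Solve min over (s, t) of || s * x + t - y ||^2.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * rel + t, s, t
```

In practice the valid mask would likely exclude pixels where the stereo disparity is unreliable, such as the transparent or specular regions the challenge targets, so that the fit is driven by trustworthy pixels.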

The Night Photography Rendering Challenge focuses on processing methods for night photography. It addresses the complexities of rendering images taken at night (especially images captured this year using primitive mobile phones) and evaluates the results on perceived quality and computational efficiency. To handle the complex mix of light sources in night scenes, SRC-B used a multi-light-source white balance algorithm to generate high-quality images, which were then used for ISP training. The approach could be applied to mobile camera ISPs in the future.

Throughout these challenges, SRC-B faced fierce competition from teams around the world. Months of hard work and dedication to refining their methods allowed the team to overcome the difficulties they encountered and ultimately win the top spots.

SRC-B’s team members

[1] Zhang Z, Zhang S, Wu R, et al. Bracketing is All You Need: Unifying Image Restoration and Enhancement Tasks with Multi-Exposure Images[J]. arXiv preprint arXiv:2401.00766, 2024.

[2] Chen L, Chu X, Zhang X, et al. Simple baselines for image restoration[C]//European conference on computer vision. Cham: Springer Nature Switzerland, 2022: 17-33.