Track 3: Atmospheric Turbulence Mitigation
Register for this track
Imaging in environments that degrade image quality places a hard constraint on where computer vision algorithms can be applied effectively. Often, an image-domain model of the degradation is used with great success: points of interest can be used to estimate the global motion blur, which is then fed into a deblurring framework, or a patch-matching scheme can be used to denoise the image. More recently, large amounts of data, typically in the form of input-output pairs, have been used to train deep networks to remove effects that are difficult to model in the image domain. In spite of these successes, there remain real-world scenarios in which neither methodology is feasible. Imaging over long distances through the atmosphere presents such a problem: there is no adequate image-domain model, and realistic input-output pairs cannot be acquired without wave-propagation simulation, which does not scale to generating a large training dataset. This leaves us with the problem of incorporating the image formation process into our pipelines in novel ways, a solution that has yet to be developed.
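To make the image-domain approach above concrete, here is a minimal sketch of non-blind Wiener deconvolution: it assumes the global blur kernel has already been estimated (the estimation step, the function name, and the parameter values are illustrative assumptions, not part of the challenge), and recovers a sharp-image estimate in the frequency domain.

```python
import numpy as np

def wiener_deblur(blurred, kernel, noise_to_signal=1e-2):
    """Non-blind Wiener deconvolution on a 2D grayscale image.

    Assumes `blurred` is a float 2D array and `kernel` is an already
    estimated blur kernel (hypothetical inputs for illustration).
    """
    # Pad the kernel to the image size and shift it so its centre sits at
    # the origin, matching the circular convolution implied by the FFT.
    kernel_padded = np.zeros_like(blurred, dtype=np.float64)
    kh, kw = kernel.shape
    kernel_padded[:kh, :kw] = kernel
    kernel_padded = np.roll(kernel_padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))

    H = np.fft.fft2(kernel_padded)
    G = np.fft.fft2(blurred)
    # Wiener filter: F_hat = H* G / (|H|^2 + noise-to-signal ratio).
    F_hat = np.conj(H) * G / (np.abs(H) ** 2 + noise_to_signal)
    return np.real(np.fft.ifft2(F_hat))
```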
The theories of turbulence and of light propagation through random media have been studied for the better part of a century. Yet progress on the associated image reconstruction algorithms has been slow, because the turbulence mitigation problem has not yet received the modern treatment of advanced image processing approaches (e.g., deep learning) that has positively impacted many other imaging domains (e.g., classification). This is due, in part, to the lack of standardized testing procedures and to limited simulation tools. Many simulation approaches are either accurate but relatively slow and often not open source, or rely on models too simple to generalize to real-world sequences. Data generation is made harder still by the difficulty of capturing real-world sequences that are true to reality, and the difficulty of obtaining ground truth for image reconstruction precludes the use of objective quality metrics. Some approaches bypass this difficulty with so-called "hot-air" sequences, though the general validity of such sequences is subject to debate. Furthermore, when testing on data obtained in a real-world setting, the lack of a large set of standardized testing sequences and quality evaluation metrics leaves the analysis insufficient and circumstantial.
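As an example of the "too simple a model" the paragraph above cautions against, the following sketch simulates turbulence as a smooth random tilt field followed by a global Gaussian blur. The function name and parameter values are assumptions for illustration; real turbulence produces spatially varying, higher-order aberrations that this model ignores, which is precisely why such data tends not to generalize.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def naive_turbulence(image, tilt_strength=2.0, tilt_scale=8.0, blur_sigma=1.5, rng=None):
    """Deliberately simplistic 'tilt + blur' turbulence model (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    # Spatially smooth random displacement fields in x and y,
    # rescaled so their standard deviation equals tilt_strength pixels.
    dx = gaussian_filter(rng.standard_normal((h, w)), tilt_scale)
    dy = gaussian_filter(rng.standard_normal((h, w)), tilt_scale)
    dx *= tilt_strength / (dx.std() + 1e-8)
    dy *= tilt_strength / (dy.std() + 1e-8)
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Warp the image with the tilt field, then apply a single global blur
    # standing in for the (actually spatially varying) turbulence PSF.
    warped = map_coordinates(image, [yy + dy, xx + dx], order=1, mode="reflect")
    return gaussian_filter(warped, blur_sigma)
```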
UG2+ Challenge 3 aims to promote the development of new image reconstruction algorithms for imaging through atmospheric turbulence. Participants will be asked to develop image reconstruction algorithms that target arbitrary generic images. A state-of-the-art atmospheric turbulence simulator [1] will be provided for participants to generate training data. Data from other sources may also be used, but their use must be clearly stated in the final submission. The submitted algorithms will be evaluated on two different datasets:
- The turbulence text dataset: this dataset contains text patterns collected in hot weather from 300 meters away. The final ranking of the challenge will be based on the average accuracy of three existing scene text recognition algorithms on the reconstructed images from this dataset.
- The hot-air dataset: this dataset contains generic turbulence/clean image pairs collected through heated air but over a much shorter distance. Its purpose is to encourage participants to develop algorithms for generic scenes. To be considered in the final ranking, the same model used for processing dataset 1 must achieve a PSNR above a threshold (to be set) on dataset 2; a minimal PSNR computation is sketched after this list.
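For reference, below is a minimal sketch of the PSNR metric mentioned for dataset 2. It assumes 8-bit images compared frame by frame against the clean reference; the organizers' exact evaluation script and the threshold value are not specified here.

```python
import numpy as np

def psnr(reference, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between a clean reference image and a
    reconstruction. Illustrative only; not the official evaluation code."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```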
If you have any questions about this challenge track, please feel free to email cvpr2022.ug2challenge@gmail.com.
References:
[1] Z. Mao, N. Chimitt, and S. H. Chan, "Accelerating Atmospheric Turbulence Simulation via Learned Phase-to-Space Transform," in Proc. IEEE/CVF International Conference on Computer Vision (ICCV), Oct. 2021, pp. 14759-14768.