Underexposed Image Correction via Hybrid Priors Navigated Deep Propagation
Risheng Liu, Member, IEEE, Long Ma, Yuxi Zhang, Xin Fan, Member, IEEE, and Zhongxuan Luo
Abstract—Enhancing the visual quality of underexposed images is a task of wide concern that plays an important role in many areas of multimedia and computer vision. Most existing methods fail to generate high-quality results with appropriate luminance and abundant details. To address these issues, we develop a novel framework that integrates knowledge from physical principles with implicit distributions learned from data to solve the underexposed image correction task. More concretely, we propose a new perspective that formulates this task as an energy-inspired model with advanced hybrid priors. A propagation procedure navigated by the hybrid priors is designed to simultaneously propagate the reflectance and illumination toward the desired results. We conduct extensive experiments to verify the necessity of integrating both underlying principles (i.e., knowledge) and distributions (i.e., data) as navigated deep propagation. Extensive experimental results on underexposed image correction demonstrate that our proposed method performs favorably against state-of-the-art methods in both subjective and objective assessments. Additionally, we perform a face detection task to further verify the naturalness and practical value of underexposed image correction. Moreover, we apply our method to single image haze removal, and the experimental results further demonstrate its superiority.
Index Terms—Deep learning; Hybrid priors; Underexposed image correction; Face detection.
I. INTRODUCTION
High-visibility images with sufficient details of the target scene are essential for many multimedia and computer vision applications. In most real-world scenarios, however, captured images suffer from low visibility due to nighttime acquisition, backlighting, or other light-limited conditions. Underexposed image correction is therefore demanded in many practical fields, and various correction techniques have been proposed during the past few years.

In the early stage, researchers tended to design methods based on histogram modification ([1], [2]) for underexposed image correction. This type of technique indeed improves the luminance, but it cannot handle non-uniform illumination well. Retinex-based image decomposition approaches ([3], [4], [5]) are widely used for this task at the present stage; they follow the physical law of image formation and therefore achieve excellent performance. However, they always require complex prior regularizations to narrow down the solution space, which not only increases the time cost but also results in insufficient improvement of luminance and detail. Network-based techniques ([6], [7]) can undoubtedly be adopted to settle this task as well. Unfortunately, the difficulty of obtaining training pairs limits the development of deep networks for this problem; in fact, most existing network-based works tend to generate unnatural enhanced results in real scenarios.

R. Liu, L. Ma, Y. Zhang, X. Fan, and Z. Luo are with the DUT-RU International School of Information Science & Engineering, Dalian University of Technology, and also with the Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian 116024, China. E-mail: [email protected], [email protected], [email protected], [email protected], [email protected]. Manuscript received April 19, 2005; revised August 26, 2015.
A. Our Contributions
As discussed above, to compensate for the ill-posedness of the image decomposition, strong priors on both the reflectance and the illumination are required to regularize the solution space. However, designing such exact priors by hand is challenging and requires considerable mathematical skill. More importantly, purely hand-crafted priors may only suit data with particular distributions, which limits their applicability in more complex real-world scenarios (see Fig. 1 (b)). Additionally, due to the lack of exact references for training, networks learned in an end-to-end manner may fail to enhance the details in dark regions (see Fig. 1 (c)) or may generate unnatural results (see Fig. 1 (d)).

In this work, we propose a novel underexposed image correction framework in which domain knowledge and training data are integrated to generate hybrid priors for the Retinex decomposition. Specifically, we establish a generic energy-inspired deep propagation framework based on the image decomposition model in Eq. (1). By introducing a schematic alternating half-quadratic splitting scheme, the fundamental propagations of the reflectance and the illumination are established. To navigate the coupled iterations toward the desired solutions, we develop hybrid priors that consist of both explicitly designed distribution constraints (i.e., knowledge) and implicitly trained deep architectures (i.e., data). The advantage of the proposed methodology is initially verified by comparing it with one representative decomposition-based method (i.e., LIME [3]) and two end-to-end discriminative learning approaches (i.e., HDRNet [6], RetinexNet [7]) on an example image in Fig. 1.

Fig. 1. Underexposed image correction results on an example image: (a) Input, (b) LIME, (c) HDRNet, (d) RetinexNet, (e) Ours. Severe overexposure appears and some details cannot be recovered in the result of LIME [3] (Retinex decomposition based method). Most details in the dark regions cannot be properly recovered by HDRNet [6] (end-to-end network). RetinexNet [7] (Retinex-based end-to-end network) produces the worst visual quality. By comparison, our result achieves the best visual performance among the compared methods, with clear details and appropriate contrast.

The main contributions of our proposed method can be summarized in the following four aspects:
• We provide a generic energy-inspired deep propagation perspective to formulate the image decomposition problem. We demonstrate that both underexposed image correction and other related vision tasks (e.g., dehazing) can be addressed within this framework.
• Thanks to the flexibility of our framework, we can combine domain knowledge (e.g., the Retinex principle, structure priors) and data-dependent architectures (e.g., learnable descent directions) to navigate the propagations of the coupled image components.
• To indicate the naturalness and practical value of our proposed method, we not only conduct extensive experiments on challenging underexposed images, but also execute face detection on the enhanced results.
• To evaluate the scalability of the built framework, the task of single image haze removal is also considered. We present visual comparisons on challenging hazy images in real-world scenarios, which indicate our superiority.

II. RELATED WORKS
In this section, we give a brief review of the related works. Generally, existing underexposed image correction techniques can be divided into three categories: methods based on histogram modification, image decomposition, and discriminative learning.
Histogram Modification:
Histogram-based methods modify the histogram distribution to recover the visibility of dark regions. Histogram equalization [8] is one of the most commonly used histogram modification techniques; however, it tends to cause over-enhancement. Different constraints have also been designed for brightness preservation [1], [2], [9] and weight adjustment [10]. Nevertheless, these methods cannot handle non-uniform illumination well.
Image Decomposition:
Retinex theory assumes that the scene perceived by the human eye is the product of reflectance and illumination layers [11], in which the illumination represents the light intensity and the reflectance denotes the physical characteristics of the objects [12]. Given the observation O ∈ R^N, this model can be formulated as

O = R ⊙ I,   (1)

where "⊙" denotes pixel-wise multiplication, and R, I ∈ R^N are the reflectance and illumination parts, respectively. Following basic physical principles, it is also necessary to assume that the pixel values of R and I lie in definite ranges, i.e., R ∈ Ω_R := {R | 0 ≤ R_i ≤ 1, i = 1, ..., N} and I ∈ Ω_I := {I | O_i ≤ I_i ≤ 1, i = 1, ..., N}.

Even with these additional value constraints, the decomposition above is highly ill-posed [13]; thus strong priors are required on both the reflectance and the illumination to regularize the solution space. The work in [14] adopted an ℓ2 regularizer to estimate the illumination in the logarithmic domain. The work in [15] adopted a Total Variation (TV) based model in the logarithmic domain for intrinsic image decomposition. Fu et al. directly designed probabilistic formulations to simultaneously estimate reflectance and illumination in the image domain [16]. Furthermore, they considered a weighted variational decomposition formulation in the logarithmic domain [17]. In [3], the illumination is refined by preserving only the main contours of an initial illumination estimate. Similarly, [4] proposed a perceptually bidirectional similarity to produce natural-looking results based on illumination optimization. The work in [12] combined different priors to build a complex energy model. All of these methods adopt hand-crafted priors to constrain their decompositions. However, it is hard for human-designed priors to capture the intrinsic structures of the underlying illumination and reflectance, especially in the image domain, because the exact distributions of these latent components are still unknown.
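To make the decomposition model and its range constraints concrete, the following minimal sketch recomposes an observation from a candidate pair (R, I) and projects arbitrary estimates back onto the feasible sets. It assumes images are normalized to [0, 1] and that Ω_I = {I | O ≤ I ≤ 1}; the function and variable names are illustrative only.

```python
import numpy as np

def recompose(R, I):
    """Forward Retinex model of Eq. (1): O = R (pixel-wise product) I."""
    return R * I

def project_to_feasible(R, I, O):
    """Project estimates onto Omega_R and Omega_I.
    Assumes images normalized to [0, 1] and Omega_I = {I | O <= I <= 1}."""
    R = np.clip(R, 0.0, 1.0)
    I = np.clip(I, O, 1.0)
    return R, I
```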
Discriminative Learning:
Very recently, discriminative learning based methods have been proposed, especially ones built on Convolutional Neural Networks (CNNs), which have been shown to learn realistic natural image distributions from large numbers of images [18]. Thus, several approaches apply implicit CNN priors to low-level vision tasks, such as super-resolution ([19], [20]), deconvolution ([21], [22]), dehazing ([23]), and others ([24], [25], [26]). However, for the decomposition model in Eq. (1) the variables are highly coupled and the constraints are complex, so both synthetic and real-world datasets are difficult to obtain, which makes it extremely challenging to apply CNNs to infer the decomposition. Up to now, there do exist some CNN-based approaches for the underexposed image correction task ([6], [7], [27], [28], [29]), but the training dataset becomes the key restraint on their practical performance. In [6], professional photographers were employed to generate the training pairs. The work in [27] trains the designed network on a synthetic dataset generated by Gamma correction. Considering multi-exposure images, the work in [28] builds a large-scale multi-exposure image dataset and generates high-quality reference images based on 13 MEF and HDR algorithms for network training. A turning point arises in [7], which builds a new dataset, i.e., the LOw-Light dataset (LOL), generated by adjusting the exposure time; this paper also proposes an end-to-end Retinex-based deep network that combines the Retinex theory with a learnable architecture. Lately, a practical network-based algorithm was proposed in [29] based on the LOL dataset.
Overall, these network-based approaches tend to produce unnatural results with insufficient details and under- or over-exposure. The reason is that the training pairs do not accurately depict the real distribution; e.g., LOL only considers the exposure time, whereas many other physical factors (e.g., the illumination condition) exist in real scenarios.

In summary, training pairs are hard to generate with existing techniques for underexposed image correction, which severely limits the development of deep learning in this area. It is therefore intuitive to employ learnable architectures in a more deductive way, so as to sidestep the difficulty of directly generating training pairs for this task.

Fig. 2. Illustration of our propagations with hybrid priors navigation. The dashed rectangles in the left column depict the core principles used to design our method: the Retinex Principle (RP), Spatial Smoothness (SS), Edge Preservation (EP), Illumination Adjustment (IA), and the Learnable Descent Direction (LDD) built from Conv, ReLU, and BN layers. The bottom-right dashed rectangle represents the iteration block at the t-th stage, i.e., the details of our algorithm. The top-right row shows the underexposed input, the enhanced output, and the obtained decomposition components.

III. THE PROPOSED FRAMEWORK
Most existing decomposition-based models [14], [15], [17] rely on a logarithmic transformation. However, the side effect is that undesired structures are amplified in low-magnitude stimulus areas and edges may become blurry. Therefore, in this work we directly formulate the Retinex decomposition in the image domain as the following regularized variational energy model:

min_{R ∈ Ω_R, I ∈ Ω_I}  f(I, R) + Φ(I) + Ψ(R),   (2)

where f(I, R) = ‖R ⊙ I − O‖² is the fidelity term derived from the model in Eq. (1), and Φ and Ψ are the prior regularization terms of the illumination and reflectance, respectively.

Notice that, different from most existing decomposition-based methods, which design their prior penalties purely from intuition, we provide a new way to integrate knowledge and data to obtain more effective hybrid priors for our decomposition problem, as described in the next section.

A. Hybrid Priors Navigated Deep Propagation
Since explicit formulations of Φ and Ψ in Eq. (2) are hard to obtain, it is challenging to adopt standard iteration schemes to optimize this variational energy. Next, we provide a new propagation framework that integrates an alternating half-quadratic splitting scheme with hybrid priors to obtain the desired illumination and reflectance, respectively.
1) Illumination Propagation:
We first consider the prior term of the illumination as the combination of one principled assumption (i.e., spatial smoothness) and one implicit data-dependent term:

Φ(I) = µ_I ‖∇I‖² + D_I(I),   (3)

where µ_I is a trade-off parameter, ‖∇I‖² enforces the smoothness constraint, and D_I denotes our implicit prior submodule (learned from data).

Then, utilizing the half-quadratic splitting technique with an auxiliary variable Ĩ (and penalty parameter λ_I^t), we obtain the following subproblem for the illumination update:

(I^{t+1}, Ĩ^{t+1}) = argmin_{I ∈ Ω_I, Ĩ}  f(I, R^t) + µ_I ‖∇I‖² + D_I(Ĩ) + λ_I^t ‖I − Ĩ‖².   (4)

Rather than explicitly formulating and minimizing D_I, we directly update the auxiliary variable Ĩ^{t+1} via the following learnable descent scheme:

Ĩ^{t+1} = I^t − N(I^t; Θ),   (5)

where N denotes the parameterized descent direction (e.g., a CNN architecture) with parameters Θ. We discuss the details of this learnable architecture in the next part. It should be emphasized that we thereby provide a way to learn guidance from training data to navigate the illumination propagation. We are then ready to update I for the (t+1)-th stage. By further reformulating the fidelity ‖R ⊙ I − O‖² as ‖I − O/R‖², we obtain the updating scheme of I as

I^{t+1} = P_{Ω_I}( (O/R^t + λ_I^t Ĩ^{t+1}) / (λ_I^t + µ_I ∇⊤∇ + 1) ),   (6)

where P_{Ω_I} denotes the projection onto Ω_I.
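The following sketch shows how one illumination update could be implemented. It assumes periodic boundary conditions so that the operator (λ_I^t + µ_I ∇⊤∇ + 1) in Eq. (6) becomes diagonal in the Fourier domain, uses a clipping projection for Ω_I (assumed to be {O ≤ I ≤ 1}), and treats the learnable descent module `net` of Eq. (5) as an arbitrary callable; none of these implementation details are specified by the paper.

```python
import numpy as np

def illumination_step(O, R_t, I_t, net, mu_I, lam_I, eps=1e-6):
    """One illumination update: Eq. (5) for the auxiliary variable, Eq. (6) for I.
    O, R_t, I_t are 2-D arrays (e.g., the V channel); `net` returns the learned descent direction."""
    I_aux = I_t - net(I_t)                              # Eq. (5): the auxiliary variable
    rhs = O / np.maximum(R_t, eps) + lam_I * I_aux      # numerator of Eq. (6)
    h, w = O.shape
    wy = 2.0 * np.pi * np.fft.fftfreq(h)[:, None]
    wx = 2.0 * np.pi * np.fft.fftfreq(w)[None, :]
    lap = (2.0 - 2.0 * np.cos(wy)) + (2.0 - 2.0 * np.cos(wx))  # eigenvalues of grad^T grad
    denom = lam_I + mu_I * lap + 1.0
    I_next = np.real(np.fft.ifft2(np.fft.fft2(rhs) / denom))
    return np.clip(I_next, O, 1.0), I_aux               # projection onto Omega_I
```

The Fourier-domain solve is just one convenient way to invert the screened-Poisson-type operator; a conjugate-gradient solver with reflective boundaries would serve equally well.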
2) Reflectance Propagation:
As for the reflectance R, we would like to preserve sharp edge structures during the enhancement process. Thus we consider the following hybrid regularization term Ψ:

Ψ(R) = µ_R Σ_i log(1 + θ [∇R]_i²) + D_R(R),   (7)

where the first term is a widely used non-convex potential function (with a sparsity-controlling parameter θ) that helps reveal sharp edge structures [30], µ_R is the trade-off parameter, and D_R denotes the data-dependent prior for the reflectance. Here [∇R]_i denotes the i-th element of ∇R. Using the half-quadratic reformulation technique, we obtain the R-subproblem as

(R^{t+1}, R̃^{t+1}) = argmin_{R ∈ Ω_R, R̃}  f(I^{t+1}, R) + µ_R Σ_i log(1 + θ [∇R]_i²) + D_R(R̃) + λ_R^t ‖R − R̃‖²,   (8)

where R̃ is an auxiliary variable and λ_R^t is the penalty parameter.

Intuitively, we could follow the idea of the Ĩ-subproblem and introduce another network to compute R̃. However, by recalling the physical rule in Eq. (1), we can obtain a much simpler updating scheme for R̃:

R̃^{t+1} = (η O/Ĩ^{t+1} + R^t) / (η + 1),   (9)

where η is a weight coefficient.

However, due to the non-convex potential function, the R-subproblem in Eq. (8) has no closed-form solution. Thus we adopt a projected gradient type rule to update R:

R^{t+1} = P_{Ω_R}( R^t − ∇_R g(I^{t+1}, R^t, R̃^{t+1}, λ_R^t) ),   (10)

where g(I, R, R̃, λ_R) = f(I, R) + λ_R ‖R − R̃‖² + µ_R Σ_i log(1 + θ [∇R]_i²) and P_{Ω_R} is the projection onto Ω_R.

Algorithm 1 Underexposed Image Correction via Hybrid Priors Navigated Deep Propagation
Input: O, and some necessary parameters.
Initialization: I⁰ = O, R⁰ = 1.
for t = 0 : t_max − 1 do
  Update Ĩ^{t+1} using Eq. (5).
  Update I^{t+1} using Eq. (6).
  Update R̃^{t+1} using Eq. (9).
  Update R^{t+1} using Eq. (10).
end for
Obtain the final enhanced result O_e (i.e., Eq. (11)).
Output: O_e.
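A minimal sketch of one reflectance update (Eqs. (9)-(10)) is given below. The forward-difference gradient with periodic wrap, the fixed step size of the projected gradient step, and the small stabilizing constants are assumptions not specified by the paper; Ω_R is taken to be [0, 1].

```python
import numpy as np

def grad(x):
    """Forward differences (vertical, horizontal) with periodic wrap."""
    return np.stack([np.roll(x, -1, axis=0) - x, np.roll(x, -1, axis=1) - x])

def divergence(p):
    """Negative adjoint of `grad` (div = -grad^T) under the same periodic wrap."""
    return (p[0] - np.roll(p[0], 1, axis=0)) + (p[1] - np.roll(p[1], 1, axis=1))

def reflectance_step(O, I_next, R_t, eta, lam_R, mu_R, theta, step=0.1, eps=1e-6):
    """One reflectance update: Eq. (9) for the auxiliary variable, then the
    projected gradient step of Eq. (10). The step size is an assumed value."""
    R_aux = (eta * O / np.maximum(I_next, eps) + R_t) / (eta + 1.0)          # Eq. (9)
    dR = grad(R_t)
    grad_fid = 2.0 * I_next * (R_t * I_next - O)                             # from ||R*I - O||^2
    grad_pen = 2.0 * lam_R * (R_t - R_aux)                                   # from lam_R*||R - R_aux||^2
    grad_pot = -mu_R * divergence(2.0 * theta * dR / (1.0 + theta * dR ** 2))  # log-potential term
    R_next = R_t - step * (grad_fid + grad_pen + grad_pot)
    return np.clip(R_next, 0.0, 1.0), R_aux                                  # projection onto Omega_R
```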
3) Illumination Adjustment:
It is known that the illumination contains the lightness information, so the underexposed image correction task now reduces to the problem of adjusting the illumination to generate visually pleasing reconstructions. Gamma correction is a common way to encode and decode luminance that exploits the non-linear manner in which humans perceive light and color [31]. We therefore adopt the following Gamma correction operation to adjust the obtained illumination:

O_e = R ⊙ I^{1/γ},   (11)

where O_e denotes the final enhanced result and γ is a tuning parameter (empirically set to 2.2). We are now ready to summarize our algorithm in Alg. 1 and Fig. 2.
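Putting the pieces together, a minimal sketch of the whole Algorithm 1 pipeline could look as follows. It reuses the `illumination_step` and `reflectance_step` helpers sketched above, treats `net` as any callable on arrays (e.g., a wrapper around the trained descent module), uses illustrative placeholder parameter values rather than the paper's settings, and applies the gamma adjustment of Eq. (11) as I^{1/γ}.

```python
import numpy as np

def enhance_v_channel(O, net, t_max=10, mu_I=0.5, lam_I=1.0, eta=1.0,
                      lam_R=1.0, mu_R=0.1, theta=10.0, gamma=2.2):
    """Sketch of Algorithm 1 on a single-channel image O in [0, 1].
    All numeric defaults are placeholders, not the paper's settings."""
    I = O.copy()                      # initialization: I^0 = O
    R = np.ones_like(O)               # R^0 = 1, so that R^0 * I^0 = O
    for _ in range(t_max):
        I, _ = illumination_step(O, R, I, net, mu_I, lam_I)        # Eqs. (5)-(6)
        R, _ = reflectance_step(O, I, R, eta, lam_R, mu_R, theta)  # Eqs. (9)-(10)
    return np.clip(R * I ** (1.0 / gamma), 0.0, 1.0)               # Eq. (11)
```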
Fig. 3. Visual comparisons of our method with different prior strategies: (a) "(RP)"+"(EP)"+"(SS)"; (b) "(RP)"+"(EP)"+"(SS)"+"(IA)"; (c) "(RP)"+"(LDD)"+"(IA)"; (d) Ours ("(RP)"+"(EP)"+"(SS)"+"(LDD)"+"(IA)"). For each strategy, the illumination, reflectance, output, and a zoomed-in region are shown. All the abbreviations in this figure come from Fig. 2.

To demonstrate the necessity of our propagation with hybrid priors navigation, we provide an illustrative comparison of different updating strategies for the decomposition and the corresponding enhancement performance in Fig. 3. Concretely, we consider three variations of the priors, i.e., only data-dependent (i.e., "(LDD)"), only knowledge-based (i.e., "(EP)"+"(SS)"), and our proposed hybrid prior (i.e., "(LDD)"+"(EP)"+"(SS)"). Moreover, we also consider the knowledge-based prior without illumination adjustment (i.e., "(IA)"). As subfigures (a) and (b) of Fig. 3 show, the illumination adjustment is essential for adjusting the luminance of the illumination. In addition, some details are missing in the reflectance generated by the knowledge-based prior, so some details cannot be recovered in the corresponding enhanced result. We also observe that the reflectance estimated by the data-dependent prior alone is over-smooth, which leads to the loss of structural information in the enhanced result. In contrast, the reflectance obtained by the proposed hybrid prior is well balanced and the enhanced result shows the most distinguished enhancement effect (see the zoomed-in regions).
B. Learnable Architecture
Regarding the learnable architecture, we point out that we actually produce a learnable descent direction derived from the data distribution (see "(LDD)" in Fig. 2) to assist the search for the desired solution. Since the knowledge-based submodule in our hybrid priors can already roughly estimate the latent image structures, the main task left to the data-dependent submodule is refining the rich details and removing small corruptions. Therefore, it is natural to adopt a denoising-type strategy for our learnable architecture: we generate training image pairs by adding different levels of Gaussian noise to simulate the corruptions and take the clean images as the targets of the architecture.

Specifically, a simple CNN is adopted as our learnable architecture, consisting of 7 dilated convolution layers with 64 kernels of size 3 × 3. A ReLU nonlinear activation is placed between every two convolution layers, and batch normalization is applied to the convolution operations from the 2nd to the 6th layers. We adopt the mean squared error as the training loss. As for the training data, we randomly select 800 images from the ImageNet database [32] and crop them into small patches of size 35 × 35.
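The paper specifies this module only at a high level (7 dilated 3 × 3 convolutions with 64 channels, ReLU between layers, batch normalization on the 2nd-6th convolutions, MSE loss). A PyTorch sketch under those constraints could look as follows; the dilation rates, the single-channel input, and the optimizer settings are assumptions.

```python
import torch
import torch.nn as nn

class LearnableDescentNet(nn.Module):
    """Sketch of the data-dependent submodule: 7 dilated 3x3 convolutions with 64
    channels, ReLU between layers, and batch normalization on the 2nd-6th convolutions.
    Dilation rates and single-channel input are assumptions, not the paper's spec."""
    def __init__(self, channels=1, features=64, dilations=(1, 2, 3, 4, 3, 2, 1)):
        super().__init__()
        layers = []
        last = len(dilations) - 1
        for k, d in enumerate(dilations):
            in_ch = channels if k == 0 else features
            out_ch = channels if k == last else features
            layers.append(nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d))
            if k < last:
                if 1 <= k <= 5:                 # BN on the 2nd-6th convolutions
                    layers.append(nn.BatchNorm2d(out_ch))
                layers.append(nn.ReLU(inplace=True))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

# Denoising-style training setup: noisy 35x35 patches as inputs, clean patches as targets.
net = LearnableDescentNet()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
```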
Fig. 4. Comparing the image components of Retinex-decomposition-based methods (i.e., SRIE [16], WVM [17], JIEP [12], and Ours). The illumination, reflectance, and enhanced components are plotted on the top, middle, and bottom rows, respectively.

IV. EXPERIMENTAL RESULTS
In this section, we conduct a series of experiments to evaluate our algorithm. Since reference images are unavailable, it is hard to assess quantitative performance using standard metrics (e.g., PSNR). We therefore follow most existing works and adopt the Natural Image Quality Evaluator (NIQE) [33] as the quantitative metric in all experiments; note that a lower NIQE value indicates higher image quality. The decomposition is applied to the V channel in the HSV (Hue, Saturation, Value) color space, and the result is then transformed back to the RGB domain. All experiments are conducted on a PC with an Intel Core i7-8700 CPU at 3.70 GHz, 32 GB RAM, and an NVIDIA GeForce GTX 1080 Ti GPU with 11 GB of memory.
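A small wrapper for the V-channel processing described above could look like the following sketch; `enhance_v` is a placeholder for the full propagation pipeline (e.g., the `enhance_v_channel` sketch above with a trained descent module bound in), and the use of scikit-image for the color conversion is simply one convenient choice.

```python
import numpy as np
from skimage import color

def correct_underexposure(rgb, enhance_v):
    """V-channel wrapper: enhance only the HSV value channel, then map back to RGB.
    `enhance_v` is any callable mapping a [0, 1] gray image to its enhanced version."""
    img = rgb.astype(np.float64) / 255.0 if rgb.dtype == np.uint8 else rgb
    hsv = color.rgb2hsv(img)
    hsv[..., 2] = np.clip(enhance_v(hsv[..., 2]), 0.0, 1.0)
    return color.hsv2rgb(hsv)
```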
A. Methodology Comparisons
We provide a range of experiments to compare the performance of different methodologies on underexposed image correction and to illustrate the roles of the data-driven and knowledge-based submodules in our deep model. In Fig. 4, we compare our proposed method with state-of-the-art decomposition-based enhancement approaches [16], [17], [12] by illustrating the decomposed components and the final enhanced results.
Obviously, the illumination obtained by our method is smoother than that of the other state-of-the-art approaches, so the reflectance and the enhanced result preserve most details and are clearer than the results of the compared methods. These results verify that our method provides more realistic constraints for the Retinex-type intrinsic image decomposition and is therefore more suitable for underexposed image correction.

Furthermore, we conduct experiments to compare with recent discriminative deep learning approaches (i.e., HDRNet [6], RetinexNet [7]). Notice that since no physical knowledge is considered in HDRNet, this method can only obtain enhanced results by learning its heuristically designed network from synthesized training data. RetinexNet considers the Retinex decomposition, but due to the naive way its training data are generated (i.e., by changing the exposure time), its predicted results usually contain too many details and lack naturalness in real-world scenarios. As shown in Fig. 5, HDRNet fails to recover many details in the enhanced result; the result of RetinexNet contains more details, but it is extremely unnatural. Our method successfully recovers most of the details in the dark regions and keeps the naturalness to the greatest extent. Moreover, our result shows a distinct improvement in brightness, presenting much higher visibility.

Fig. 5. Comparing the performance of two kinds of end-to-end learning-based methods (i.e., HDRNet [6], RetinexNet [7]) and our proposed hybrid prior method.

In the third experiment, we explore the stability of our hybrid priors navigated deep propagation (based on the ensemble of two different methodologies). In Fig. 6, we plot the visual performance and quantitative results of our method with varied algorithmic parameters µ_I and λ_I. The parameter µ_I balances the principled and data-driven priors. We tune this parameter over a range of values and plot the visual performance and quantitative results in subfigure (a); the performance of our method is stable and the NIQE scores change only slightly within a small interval. The parameter λ_I penalizes the auxiliary variable (computed via the network propagation); we observe in subfigure (b) that the performance is also stable across the tested values of λ_I. We argue that this stability of our hybrid prior arises mainly because the knowledge-based submodule provides a baseline performance guarantee, while the data-driven submodule enriches the details to further improve the performance.

Fig. 6. The visual performance and quantitative results of our method w.r.t. the settings of the algorithmic parameters. The NIQE (lower is better) curves for µ_I and λ_I are illustrated in subfigures (a) and (b), respectively. For each parameter setting, the illumination (top), reflectance (middle), and final enhanced result (bottom) are shown in the dashed rectangles.

Fig. 7. Quantitative performance (i.e., NIQE, lower is better) of HE, MSRCR, GOLW, NPEA, SRIE, WVM, LIME, JIEP, HDRNet, RRM, RetinexNet, and Ours on three different benchmark databases: (a) NASA, (b) NPE, (c) LIME.
B. Underexposed Image Correction
In this part, we first make a series of quantitative and qualitative comparisons with state-of-the-art approaches to underexposed image correction. Then, experiments on real-world scenarios are conducted to test the visual performance. Finally, face detection based on YOLOv3 [34] is executed to further verify the naturalness of our results.
Challenging Benchmarks
In this section, we evaluate the performance of our method against state-of-the-art methods, including HE [8], MSRCR [35], GOLW [36], NPEA [37], LIME [3], SRIE [16], WVM [17], JIEP [12], HDRNet [6], RRM [5], and RetinexNet [7], on different benchmarks: NASA (23 images of indoor and outdoor scenes, available at https://dragon.larc.nasa.gov/retinex/pao/news/), NPE [37] (130 images of different natural scenes), and LIME [3] (10 images of different challenging scenes).

Fig. 7 reports the averaged NIQE values on the different datasets. It is obvious that our method obtains better quantitative performance than the other state-of-the-art methods. We also show visual comparisons on example images in Figs. 8-9. Most methods can partially improve the visual quality of the given observations. However, the results of SRIE, WVM, and JIEP still exhibit low visibility, especially on the most challenging example in Fig. 9. Although the contrast is improved, LIME tends to produce images with severe over-exposure. Additionally, all the compared methods fail to recover the detailed information in the dark, such as the rose flower in the second zoomed-in region of Fig. 9. The learning-based approaches (i.e., HDRNet, RetinexNet) generate unrealistic results with color distortion, especially RetinexNet. In contrast, our proposed method not only enhances the visibility but also preserves most of the details, providing much better enhancement performance.
Fig. 8. Comparisons on an example from the Non-uniform dataset: Input, HE (3.9318), SRIE (4.0469), WVM (3.9181), JIEP (4.2309), LIME (4.3105), HDRNet (4.1184), Ours. The NIQE scores are reported with each image.
TABLE I
AVERAGE RUNNING TIME (IN SECONDS) ON DIFFERENT BENCHMARKS. THE BEST AND SECOND BEST ARE HIGHLIGHTED IN RED AND BLUE, RESPECTIVELY.

Dataset   LIME     JIEP     HDRNet    RetinexNet   Ours
NASA      0.0336   0.9858   4.8329    0.1618       0.0477
NPE       0.1902   5.0584   13.6522   0.2948       0.1112
LIME      0.2149   2.5536   14.9201   0.3356       0.1758

We also compare our method with four recently proposed methods to evaluate the computational cost. Table I shows the average running time in seconds on the different benchmarks. Our method is the fastest among the compared methods on all benchmarks except NASA, which indicates that our method has a significant advantage in terms of time cost.
Real-world Scenarios
We also evaluate our method on real-world underexposed scenarios. We select an example image from the HDR+ Burst Photography dataset [38], which was captured by Android mobile cameras using the public Android Camera2 API. As Fig. 10 shows, our method, LIME, and RetinexNet all perform better than the other state-of-the-art methods in the dark regions. However, the zoomed-in regions of LIME are over-exposed and contain color distortion, and RetinexNet generates an unnatural enhanced result that looks like style transfer.
In contrast, our proposed method obtains a more natural visual quality with clear details on the test image.

TABLE II
QUANTITATIVE COMPARISON OF FACE DETECTION.

Metric            Input    LIME     HDRNet   RetinexNet   Ours
mAP (%)           16.77    53.49    34.25    37.10
Avg. Recall (%)   12.67    74.17    35.67    42.99
Face Detection Based on YOLOv3
We note that the naturalness of enhanced results cannot be precisely characterized by the NIQE value, which is derived from statistical regularities. Visual inspection of the enhanced results further supports naturalness, but it is subjective and depends on personal preference. To address these problems, we evaluate the naturalness property from the perspective of the performance of a face detection task.

To be specific, we adopt a well-known object detection framework, YOLOv3 [34], to perform face detection. Following most existing face detection works, we use the WIDER FACE dataset [39] as our training data. It should be noted that illumination variation is also present in this dataset, but the objects in those images are still easy to recognize by eye. To fully verify the capability of underexposed image correction algorithms, we select 100 challenging images from the DARK FACE dataset, which comes from the sub-challenge of the UG2+ Prize Challenge held at CVPR 2019 (https://flyywh.github.io/CVPRW2019LowLight/).
Fig. 9. Comparisons on an example from the LIME dataset: Input, HE (3.0200), SRIE (3.0925), WVM (2.8731), LIME (2.9322), JIEP (2.8731), HDRNet (3.0753), RetinexNet (2.8091), Ours. The NIQE scores are reported in the brackets.

Fig. 10. Visual comparisons of underexposed image correction on a real-world scenario: Input, LIME, JIEP, HDRNet, RetinexNet, Ours.

Fig. 11. Some sample images selected from the DARK FACE dataset.

Fig. 12. Comparison of face detection based on YOLOv3 [34] among LIME [3] (representative Retinex-based method), HDRNet [6] (end-to-end network), RetinexNet [7] (Retinex-based end-to-end network), and our method: underexposed image with labels, underexposed input (0 / 0), LIME (60.00 / 42.86), HDRNet (1.00 / 42.86), RetinexNet (66.67 / 57.14), Ours (83.33 / ). The Precision / Recall scores are reported with each image. Clearly, our method can detect almost all objects with the best quantitative performance. This experiment not only manifests the naturalness of our result, but also reflects the potential of our proposed method in real vision applications.
We show some sample images from this dataset in Fig. 11 to illustrate the difficulty of recognition and detection.

As Table II shows, our method achieves the best quantitative performance on all metrics (i.e., mAP and average recall) against the other state-of-the-art methods. Specifically, the end-to-end network based methods (i.e., HDRNet, RetinexNet) obtain the worst quantitative performance. It is worth noting that RetinexNet is superior to HDRNet; the reason may be that RetinexNet adopts the Retinex decomposition, which yields a more favorable result for detection. The representative Retinex-based method (i.e., LIME), which relies only on knowledge-driven designed priors, indeed presents good numerical results. In contrast, the average recall of our method is higher than that of LIME by about four percentage points, which reflects that our method can detect more objects. In fact, a detection network trained on a large number of natural images performs best when its input follows the distribution of natural images; in this view, our method indeed produces more effective naturalness. Our numerical results are nevertheless not outstanding for the face detection task in absolute terms, possibly because noise and artifacts produced during the enhancement procedure influence the detection (as Fig. 12 shows). We will consider a noise removal procedure to further improve the enhancement performance in our future work.
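For reference, per-image precision and recall figures of the kind reported in Fig. 12 can be computed from detected and ground-truth boxes in the following simplified way. This sketch uses a single IoU threshold and greedy matching without confidence ranking, so it does not reproduce the exact mAP protocol; all names and the threshold value are illustrative assumptions.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def precision_recall(pred_boxes, gt_boxes, iou_thr=0.5):
    """Greedy matching of predicted boxes to ground-truth faces at a fixed IoU threshold."""
    matched, tp = set(), 0
    for p in pred_boxes:
        best, best_iou = None, iou_thr
        for j, g in enumerate(gt_boxes):
            if j in matched:
                continue
            v = iou(p, g)
            if v >= best_iou:
                best, best_iou = j, v
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / max(len(pred_boxes), 1)
    recall = tp / max(len(gt_boxes), 1)
    return 100.0 * precision, 100.0 * recall
```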
C. Single Image Haze Removal
Finally, we conduct an experiment on single image haze removal to verify the generality of our proposed framework. We follow the duality between Retinex and image dehazing [40] to execute this task. Visual comparisons with three representative methods (i.e., the classical dehazing method DCP [41], the traditional optimization based method NLD [42], and the end-to-end network AODNet [23]) are presented in Fig. 13. Obviously, the traditional methods (i.e., DCP, NLD) generate dehazing results with many artifacts and color distortion, which indicates that relying only on prior regularization makes it extremely hard to achieve the desired results in complex real scenarios, while AODNet (a network-based method) produces results with unclear details and underexposure. By comparison, our results clearly show a more prominent visual appearance and look more natural than those of the other state-of-the-art approaches.
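The paper does not spell out how the duality of [40] is instantiated; a common reading is that dehazing an image is approximately equivalent to applying a Retinex-style enhancement to the photometrically inverted image and inverting back, as in the sketch below (which reuses the `correct_underexposure` wrapper sketched earlier and assumes images in [0, 1]).

```python
def dehaze(hazy_rgb, enhance_v):
    """Sketch of the Retinex/dehazing duality [40]: enhance the inverted image
    with the underexposure-correction pipeline, then invert the result back."""
    return 1.0 - correct_underexposure(1.0 - hazy_rgb, enhance_v)
```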
Fig. 13. Visual comparisons of single image haze removal on a real-world scenario: Input, DCP, NLD, AODNet, Ours.

V. CONCLUSIONS
In this paper, we provided a new perspective that formulates the underexposed image correction problem as an energy-inspired model with hybrid priors (i.e., knowledge and learnable architectures). We designed a propagation procedure navigated by the hybrid priors to simultaneously propagate the reflectance and illumination. Extensive experiments on underexposed image correction validated the effectiveness and superiority of our method against other state-of-the-art approaches, both in enhancement quality and in running time. A face detection task was then executed to further verify the naturalness and practical value of our hybrid priors navigated deep propagation. Finally, the single image haze removal task was also performed to illustrate our advantage over other state-of-the-art approaches.

ACKNOWLEDGMENT
This work is partially supported by the National Natural Science Foundation of China (Nos. 61672125, 61300086, and 61632019) and the Fundamental Research Funds for the Central Universities.

REFERENCES
[1] C. Wang and Z. Ye, "Brightness preserving histogram equalization with maximum entropy: a variational perspective," IEEE Transactions on Consumer Electronics, vol. 51, no. 4, pp. 1326-1334, 2005.
[2] D. Sheet, H. Garud, A. Suveer, M. Mahadevappa, and J. Chatterjee, "Brightness preserving dynamic fuzzy histogram equalization," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, 2010.
[3] X. Guo, Y. Li, and H. Ling, "LIME: Low-light image enhancement via illumination map estimation," IEEE Transactions on Image Processing, vol. 26, no. 2, pp. 982-993, 2017.
[4] Q. Zhang, W.-S. Zheng, G. Yuan, C. Xiao, and L. Zhu, "High-quality exposure correction of underexposed photos," in ACM Multimedia, 2018.
[5] M. Li, J. Liu, W. Yang, X. Sun, and Z. Guo, "Structure-revealing low-light image enhancement via robust Retinex model," IEEE Transactions on Image Processing, vol. 27, no. 6, pp. 2828-2841, 2018.
[6] M. Gharbi, J. Chen, J. T. Barron, S. W. Hasinoff, and F. Durand, "Deep bilateral learning for real-time image enhancement," ACM Transactions on Graphics, vol. 36, no. 4, p. 118, 2017.
[7] C. Wei, W. Wang, W. Yang, and J. Liu, "Deep Retinex decomposition for low-light enhancement," in British Machine Vision Conference, 2018.
[8] H. Cheng and X. Shi, "A simple and effective histogram equalization approach to image enhancement," Digital Signal Processing, vol. 14, no. 2, pp. 158-170, 2004.
[9] M. J. Power, C. Whitlock, P. Bartlein, and L. R. Stevens, "Multi-histogram equalization methods for contrast enhancement and brightness preserving," IEEE Transactions on Consumer Electronics, vol. 53, no. 3, pp. 1186-1194, 2011.
[10] S. H. Yun, H. K. Jin, and S. Kim, "Contrast enhancement using a weighted histogram equalization," in IEEE International Conference on Consumer Electronics, vol. 10, no. 11, 2011, pp. 203-204.
[11] J. McCann, Retinex Theory. Springer New York, 2016.
[12] B. Cai, X. Xu, K. Guo, K. Jia, B. Hu, and D. Tao, "A joint intrinsic-extrinsic prior model for Retinex," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[13] E. H. Land and J. J. McCann, "Lightness and Retinex theory," Journal of the Optical Society of America, vol. 61, no. 1, pp. 1-11, 1971.
[14] R. Kimmel, M. Elad, D. Shaked, R. Keshet, and I. Sobel, "A variational framework for Retinex," International Journal of Computer Vision, vol. 52, no. 1, pp. 7-23, 2003.
[15] M. K. Ng and W. Wang, "A total variation model for Retinex," SIAM Journal on Imaging Sciences, vol. 4, no. 1, pp. 345-365, 2011.
[16] X. Fu, Y. Liao, D. Zeng, Y. Huang, X.-P. Zhang, and X. Ding, "A probabilistic method for image enhancement with simultaneous illumination and reflectance estimation," IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 4965-4977, 2015.
[17] X. Fu, D. Zeng, Y. Huang, X.-P. Zhang, and X. Ding, "A weighted variational model for simultaneous reflectance and illumination estimation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[18] D. Ulyanov, A. Vedaldi, and V. S. Lempitsky, "Deep image prior," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[19] J. Kim, J. K. Lee, and K. M. Lee, "Accurate image super-resolution using very deep convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1646-1654.
[20] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, "Residual dense network for image super-resolution," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 2472-2481.
[21] R. Liu, X. Fan, S. Cheng, X. Wang, and Z. Luo, "Proximal alternating direction network: A globally converged deep unrolling framework," in Association for the Advancement of Artificial Intelligence, 2018.
[22] R. Liu, Y. He, S. Cheng, X. Fan, and Z. Luo, "Learning collaborative generation correction modules for blind image deblurring and beyond," in ACM Multimedia, 2018.
[23] B. Li, X. Peng, Z. Wang, J. Xu, and D. Feng, "AOD-Net: All-in-one dehazing network," in Proceedings of the IEEE International Conference on Computer Vision, 2017.
[24] R. Liu, L. Ma, Y. Wang, and L. Zhang, "Learning converged propagations with deep prior ensemble for image enhancement," IEEE Transactions on Image Processing, vol. 28, no. 3, pp. 1528-1543, 2018.
[25] R. Liu, S. Cheng, L. Ma, X. Fan, and Z. Luo, "Deep proximal unrolling: Algorithmic framework, convergence analysis and applications," IEEE Transactions on Image Processing, 2019.
[26] R. Liu, S. Cheng, Y. He, X. Fan, Z. Lin, and Z. Luo, "On the convergence of learning-based iterative methods for nonconvex inverse problems," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[27] K. G. Lore, A. Akintayo, and S. Sarkar, "LLNet: A deep autoencoder approach to natural low-light image enhancement," Pattern Recognition, vol. 61, pp. 650-662, 2017.
[28] J. Cai, S. Gu, and L. Zhang, "Learning a deep single image contrast enhancer from multi-exposure images," IEEE Transactions on Image Processing, vol. 27, no. 4, pp. 2049-2062, 2018.
[29] Y. Zhang, J. Zhang, and X. Guo, "Kindling the darkness: A practical low-light image enhancer," in ACM Multimedia, 2019.
[30] S. Roth and M. J. Black, "Fields of experts," International Journal of Computer Vision, vol. 82, no. 2, pp. 205-229, 2009.
[31] C. Poynton, Digital Video and HD: Algorithms and Interfaces, 2012.
[32] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Neural Information Processing Systems, 2012.
[33] A. Mittal, R. Soundararajan, and A. C. Bovik, "Making a completely blind image quality analyzer," IEEE Signal Processing Letters, vol. 20, no. 3, pp. 209-212, 2013.
[34] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
[35] Z.-u. Rahman, D. J. Jobson, and G. A. Woodell, "Retinex processing for automatic image enhancement," Journal of Electronic Imaging, vol. 13, no. 1, pp. 100-111, 2004.
[36] Q. Shan, J. Jia, and M. S. Brown, "Globally optimized linear windowed tone mapping," IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 4, pp. 663-675, 2010.
[37] S. Wang, J. Zheng, H.-M. Hu, and B. Li, "Naturalness preserved enhancement algorithm for non-uniform illumination images," IEEE Transactions on Image Processing, vol. 22, no. 9, pp. 3538-3548, 2013.
[38] S. W. Hasinoff, D. Sharlet, R. Geiss, A. Adams, J. T. Barron, F. Kainz, J. Chen, and M. Levoy, "Burst photography for high dynamic range and low-light imaging on mobile cameras," ACM Transactions on Graphics, vol. 35, no. 6, 2016.
[39] S. Yang, P. Luo, C. C. Loy, and X. Tang, "WIDER FACE: A face detection benchmark," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[40] A. Galdran, A. Alvarez-Gila, A. Bria, J. Vazquez-Corral, and M. Bertalmío, "On the duality between Retinex and image dehazing," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[41] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341-2353, 2011.
[42] D. Berman, S. Avidan et al., "Non-local image dehazing," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1674-1682.
Risheng Liu (M'12-) received the B.Sc. and Ph.D. degrees, both in mathematics, from the Dalian University of Technology in 2007 and 2012, respectively. He was a visiting scholar at the Robotics Institute of Carnegie Mellon University from 2010 to 2012. He served as a Hong Kong Scholar Research Fellow at the Hong Kong Polytechnic University from 2016 to 2017. He is currently an associate professor with the Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, International School of Information and Software Technology, Dalian University of Technology. His research interests include machine learning, optimization, computer vision, and multimedia. He was a co-recipient of the IEEE ICME Best Student Paper Award in both 2014 and 2015. Two papers were also selected as Finalists of the Best Paper Award at ICME 2017. He is a member of the IEEE and ACM.
Long Ma received the B.E. degree in Information and Computing Science from Northeast Agricultural University, Harbin, China, in 2016, and the M.S. degree in software engineering from the Dalian University of Technology, Dalian, China, in 2019. He is currently pursuing the Ph.D. degree in software engineering at the Dalian University of Technology. His research interests include computer vision, image enhancement, and machine learning.
Yuxi Zhang received the B.E. degree in software engineering from the Dalian University of Technology, Dalian, China, in 2017. She is currently pursuing the M.S. degree in software engineering at the Dalian University of Technology. Her research interests include computer vision, image enhancement, and machine learning.
Xin Fan was born in 1977. He received the B.E. and Ph.D. degrees in information and communication engineering from Xi'an Jiaotong University, Xi'an, China, in 1998 and 2004, respectively. He was with Oklahoma State University, Stillwater, from 2006 to 2007, as a post-doctoral research fellow. He joined the School of Software, Dalian University of Technology, Dalian, China, in 2009. His current research interests include computational geometry and machine learning, and their applications to low-level image processing and DTI-MR image analysis.