<p><b>Foreign-language source material</b></p>
<p>Detecting ground shadows in outdoor consumer photographs</p>
<p>Jean-Francois Lalonde, Alexei A. Efros, and Srinivasa G. Narasimhan</p>
<p>School of Computer Science, Carnegie Mellon University</p>
<p>Project webpage: http://graphics.cs.cmu.edu/projects/shadows</p>
<p>Abstract. Detecting shadows from images can significantly improve the performance of several vision tasks such as object detection and tracking. Recent approaches have mainly used illumination invariants, which can fail severely when the quality of the images is not very good, as is the case for most consumer-grade photographs, like those on Google or Flickr. We present a practical algorithm to automatically detect shadows cast by objects onto the ground, from a single consumer photograph. Our key hypothesis is that the types of materials constituting the ground in outdoor scenes are relatively limited, most commonly including asphalt, brick, stone, mud, grass, concrete, etc. As a result, the appearances of shadows on the ground are not as widely varying as general shadows and thus can be learned from a labelled set of images. Our detector consists of a three-tier process including (a) training a decision tree classifier on a set of shadow sensitive features computed around each image</p>
<p>Introduction</p>
<p>Shadows are everywhere! Yet, the human visual system is so adept at filtering them out that we never give shadows a second thought; that is, until we need to deal with them in our algorithms. Since the very beginning of computer vision, the presence of shadows has been responsible for wreaking havoc on a wide variety of applications, including segmentation, object detection, scene analysis, stereo, tracking, etc. On the other hand, shadows play a crucial role in determining the type of illuminat</p>
<p>is because the appearances and shapes of shadows outdoors depend on several hidden factors such as the color, direction and size of the illuminants (sun, sky, clouds), the geometry of the objects that are casting the shadows, and the shape and material properties of the objects onto which the shadows are cast.</p>
<p>Most works for detecting shadows from a single image are based on computing illumination invariants that are physically-based and are functions of individual pixel values [10–14] or the values in a local image neighborhood [15]. Unfortunately, reliable computation of these invariants requires high quality images with wide dynamic range and high intensity resolution, where the camera radiometry and color transformations are accurately measured and compensated for. Even slight perturbations (imper</p>
<p>Our goal is to build a reliable shadow detector for consumer photographs of outdoor scenes. While detecting all shadows is expected to remain hard, we explicitly focus on the shadows cast by objects onto the ground plane. Fortunately, the types of materials constituting the ground in typical outdoor scenes are (relatively) limited, most commonly including concrete, asphalt, grass, mud, stone, brick, etc. Given this observation, our key hypothesis is that the appearances of shadows on the ground </p>
<p><b>Overview</b></p>
<p>Our approach consists of three stages depending on the information in the image used. In the first stage, we exploit local information around edges in the image. For this, we compute a set of shadow sensitive features that include the ratios of brightness and color filter responses at different scales and orientations on both sides of the edge. These features are then used with a trained decision tree classifier to detect whether an edge is a shadow or not. The idea is that while any single feature may not be useful for detecting all ground shadows, the classifier is p</p>
<p>In the second stage, we enforce a grouping of the shadow edges using a Conditional Random Field (CRF) to create longer contours. This is similar in spirit to the classical constrained label propagation used in mid-level vision tasks [17]. This procedure connects likely shadow edges, discourages T-junctions, which are highly unlikely on shadow boundaries, and removes isolated weak edges. But how do we detect the ground in an image? For this, in the third stage, we incorporate a global scene layout</p>
<p>We demonstrate successful shadow detection on several images of natural scenes that include beaches, meadows and forest trails, as well as urban scenes that include numerous pedestrians, vehicles, trees, roads and buildings, captured under a variety of illumination conditions (sunny, partly cloudy, overcast). Similarly to the approach of Zhu et al. [19], our method relies on learning the appearance of shadows based on image features, but does so by using full color information. We found that usi</p>
<p>2 Learning local cues for shadow detection</p>
<p>Our approach relies on a classifier which is trained to recognize ground shadow edges by using features computed over a local neighborhood around the edge. We show that it is indeed possible to obtain good classification accuracy by relying on local cues, and that the classifier can be used as a building block for subsequent steps. In this section, we describe how to build, train, and evaluate such a classifier.</p>
<p>2.1 From pixels to boundaries</p>
<p>We first describe the underlying representation on which we compute features. Since working with individual pixels is prone to noise and computationally expensive, we propose to instead reason about boundaries, or groups of pixels along an edge in the image. To obtain these boundaries, we first smooth the image with a bilateral filter [20], compute gradient magnitudes on the filtered image, and then apply the watershed segmentation algorithm on the gradient map. Fig. 1(b) shows a close-up exampl</p>
<p>(a) Input image (b) Boundaries (c) Strong boundaries (d) Output</p>
<p>Fig. 1. Processing stages for the local classifier. The input image (a) is over-segmented into thousands of regions to obtain boundaries (b). Weak boundaries are filtered out by a Canny edge detector (c), and the classifier is applied on the remainder. (d) shows the boundaries $i$ for which $P(y_i = 1 \mid x) > 0.5$. Note the correct classification of occlusion contours around the person's legs and the reflectance edges in the white square between the person's feet.</p>
<p>An undesirable consequence of the watershed segmentation is that it generates boundaries in smooth regions of the image (Fig. 1(b)). To compensate for this, we retain only those boundaries which align with the strong edges in the image. For this, we use the Canny edge detector at 4 scales to account for blurry shadow edges ($\sigma^2 = \{1, 2, 4, 8\}$), with a high threshold empirically set to $t = 0.3$. Under these conditions, we verified that the initial set of boundaries contains more than 97% of the tru</p>
<p>2.2 Local shadow features</p>
<p>We now describe the features computed over each boundary in the image. A useful feature to describe a shadow edge is the ratio of color intensities on both sides of the edge (e.g. min divided by max) [21]. The intuition is that shadows should have a specific ratio that is more or less the same across an image, since it is primarily due to the differences in natural lighting inside and outside the shadow. Since it is hard to manually determine the best color space [22] or best scale to compute fea</p>
<p>For a pixel along a boundary, we compute the intensity on one side of the edge (say, the left) by evaluating a weighted sum of pixels on the left of the edge. But which pixels to choose? We could use the watershed segments, but they do not typically extend very far. Instead, we use an oriented Gaussian derivative filter of variance $\sigma^2$, but keep only its values which are greater than zero. We align the filter with the boundary orientation such that its positive weights lie on the left of the bo</p>
<p>We also employ two features suggested in [19] which capture the texture and intensity distribution differences on both sides of a boundary. The first feature computes a histogram of textons at 4 different scales, and compares them using the $\chi^2$-distance. The texton dictionary was computed on a non-overlapping set of images. The second feature computes the difference in skewness of pixel intensities, again at the same 4 scales.</p>
<p>Finally, we concatenate the absolute value of the minimum filter response computed over the intensity channel to obtain the final, overcomplete, 48-dimensional feature vector at every pixel. The feature vector of a boundary is obtained by averaging the features of all pixels that belong to it.</p>
<p>2.3 Classifier</p>
<p>Having computed the feature vector $x_i$ at each strong boundary in the image, we can now use these vectors to train a classifier to learn the probability $P(y_i \mid x_i)$ that boundary $i$ is due to a shadow (which we denote with label $y_i$). We estimate that distribution using a logistic regression version of Adaboost [24], with twenty 16-node decision trees as weak learners. This classification method provides good feature selection and outputs probabilities, and has been successfully used in a variety of other </p>
<p>To train the classifier, we selected 170 images from LabelMe [16], Flickr, and the dataset introduced in [19], with the only conditions being that the ground must be visible and that there must be shadows. The positive training set contains manually labelled shadow boundaries, while the negative training set is populated with an equal number of strong non-shadow boundaries on the ground (e.g. street markings) and occlusion boundaries.</p>
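<p>The training step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: plain discrete AdaBoost over one-level decision stumps stands in for the logistic AdaBoost with 16-node trees, the two-dimensional feature vectors are invented toy data (the real vectors are 48-dimensional), and the names train_stump, adaboost and shadow_probability are hypothetical. The mapping from boosted score to probability assumes the usual logistic link.</p>

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                # A stump predicts `pol` when the feature is >= thr, else `-pol`.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[f] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] >= thr else -pol

def adaboost(X, y, rounds=20):
    """Discrete AdaBoost; labels are +1 (shadow) and -1 (non-shadow)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, stump))
        if stump[0] < 1e-10:  # training set already separated
            break
        # Re-weight: misclassified boundaries gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def shadow_probability(model, row):
    """Map the boosted score F(x) to a probability with a logistic link."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1.0 / (1.0 + math.exp(-2.0 * score))

# Toy, balanced training set (assumed values): [min/max intensity ratio, texture difference].
X = [[0.35, 0.10], [0.40, 0.20], [0.30, 0.15], [0.45, 0.10],   # shadow boundaries
     [0.90, 0.60], [0.85, 0.70], [0.95, 0.20], [0.80, 0.90]]   # non-shadow boundaries
y = [1, 1, 1, 1, -1, -1, -1, -1]
model = adaboost(X, y)
```

<p>On this toy set, shadow_probability(model, [0.38, 0.10]) lies above 0.5 and shadow_probability(model, [0.90, 0.50]) lies below it. Equal numbers of positive and negative examples are used here to mirror the balanced training set described in the text.</p>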
<p>We obtain a per-boundary classification accuracy of 79.7% (chance is 50%; see Fig. 5 for a breakdown per class). See Fig. 1(d) for an example. This result supports our hypothesis: while the appearance of shadows on any type of material in any condition might be impossible to learn, the space of shadow appearances on the ground in outdoor scenes may not be that large after all!</p>
<p>3 Creating shadow contours</p>
<p>Despite encouraging results, our classifier is limited by its locality since it treats each boundary independently of the next. However, the color ratios of a shadow boundary should be consistent with those of its neighbors, since the sources illuminating nearby scene points should also be similar. Thus, we can exploit higher order dependencies across local boundaries to create longer shadow contours as well as remove isolated or spurious ones.</p>
<p>Fig. 2. Creating shadow contours by enforcing local consistency. Our CRF formulation may help to (a) bridge the gap across X-junctions where the local shadow classifier might be uncertain, and (b) remove spurious T-junctions which should not be caused by shadows.</p>
<p>To model these dependencies, we construct a graph with individual boundaries as nodes (such as those in Fig. 1(b)) and draw an edge between boundaries which meet at a junction point. We then define a CRF on that graph, which expresses the negative log-likelihood of a particular labeling $y$ (i.e. assignment of shadow/non-shadow to each boundary) given observed data $x$ as a sum of unary potentials $\phi_i(y_i)$ and pairwise potentials $\psi_{i,j}(y_i, y_j)$:</p>
<p>$$-\log P(y \mid x; \lambda, \beta) = \sum_{i \in B} \phi_i(y_i) + \lambda \sum_{(i,j) \in E} \psi_{i,j}(y_i, y_j) + \log Z_{\lambda,\beta} \qquad (1)$$</p>
<p>where $B$ is the set of boundaries, $E$ the set of edges between them, and $\lambda$ and $\beta$ are model parameters. In particular, $\lambda$ is a weight controlling the relative importance of the two terms. $Z_{\lambda,\beta}$ is the partition function, which depends on the parameters $\lambda$ and $\beta$ but not on the labeling $y$ itself. Intuitively, we would like the unary potentials to penalize the assignment of the "shadow" label to boundaries which are not likely to be shadows according to our local classifier. This can be modeled using</p>
<p>$$\phi_i(y_i) = -\log P(y_i \mid x_i) \qquad (2)$$</p>
<p>We would also like the pairwise potentials to penalize the assignment of different labels to neighboring boundaries that have similar features, which can be written as</p>
<p>$$\psi_{i,j}(y_i, y_j) = \mathbf{1}(y_i \neq y_j) \exp\left(-\beta \lVert x_i - x_j \rVert_2^2\right), \qquad (3)$$</p>
<p>where $\mathbf{1}(\cdot)$ is the indicator function, and $\beta$ is a contrast-normalization constant as suggested in [26]. In other words, we encourage neighboring boundaries which have similar features and strong local probabilities to be labelled as shadows.</p>
<p>The negative log-likelihood in (1) can be efficiently minimized using graph cuts [27–29]. The free parameters were assigned the values $\lambda = 0.5$ and $\beta = 16$, obtained by 2-fold cross-validation on a non-overlapping set of images.</p>
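<p>To make the roles of the unary and pairwise terms concrete, the following Python sketch evaluates such an energy on a three-boundary chain. It is an illustration under stated assumptions rather than the authors' code: exhaustive search over all labelings stands in for graph cuts (feasible only for a handful of nodes), the pairwise term is assumed to be a contrast-sensitive penalty on differing labels, and the local probabilities and feature vectors are invented toy values. The values 0.5 and 16 for the two parameters are those quoted in the text; all function names are hypothetical.</p>

```python
import math
from itertools import product

LAMBDA, BETA = 0.5, 16.0  # parameter values quoted in the text

def unary(p_shadow, label, eps=1e-12):
    """-log of the local classifier probability for the chosen label (1 = shadow)."""
    p = p_shadow if label == 1 else 1.0 - p_shadow
    return -math.log(max(p, eps))

def pairwise(xi, xj, yi, yj, beta=BETA):
    """Penalty for giving different labels to neighbors with similar features."""
    if yi == yj:
        return 0.0
    dist2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-beta * dist2)

def energy(labels, probs, feats, edges, lam=LAMBDA):
    """Sum of unary terms plus lambda times the sum of pairwise terms."""
    u = sum(unary(p, y) for p, y in zip(probs, labels))
    pw = sum(pairwise(feats[i], feats[j], labels[i], labels[j]) for i, j in edges)
    return u + lam * pw

def minimize_brute_force(probs, feats, edges):
    """Try every labeling; a stand-in for graph cuts on tiny graphs only."""
    return min(product((0, 1), repeat=len(probs)),
               key=lambda labels: energy(labels, probs, feats, edges))

# Toy chain of three boundaries meeting at two junctions (assumed values).
probs = [0.90, 0.45, 0.80]                           # local P(shadow) per boundary
feats = [[0.40, 0.10], [0.42, 0.10], [0.41, 0.12]]   # similar features
edges = [(0, 1), (1, 2)]

independent = tuple(int(p > 0.5) for p in probs)     # thresholding each boundary alone
joint = minimize_brute_force(probs, feats, edges)    # labeling with the lowest energy
```

<p>In this toy case, thresholding each boundary independently leaves a gap at the uncertain middle boundary, while the joint labeling assigns the shadow label to all three, because splitting similar neighbors costs more than overriding a weak local score.</p>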
<p>Applying the CRF on our test images results in an improvement of roughly 1% in total classification accuracy, for a combined score of 80.5% (see Fig. 5(b)). But more importantly, in practice, the way the CRF is set up encourages continuity and crossing through X-junctions, and discourages T-junctions, as shown in Fig. 2. Since shadows are usually signaled by the presence of X-junctions and the absence of T-junctions [30], this reduces the number of false positives.</p>
<p>(a) Input (b) Local classifier (c) Shadow contours (d) Ground likelihood [18] (e) Combining (c) and (d)</p>
<p>Fig. 3. Incorporating scene layout for detecting cast shadows on the ground. Applying our shadow detector on a complex input image (a) yields false detections in the vertical structures because of complex effects like occlusion boundaries, self-shadowing, etc. (b) and (c). Recent work in scene layout extraction from single images [18] can be used to estimate the location of the ground pixels (d). We show how we can combine scene layout information with our shadow contour classifier to automatically detect cast shadows on the ground (e).</p>
<p>4 Incorporating scene layout</p>
<p>Until now, we have been considering the problem of detecting cast shadow boundaries on the ground with a classifier trained on local features and a CRF formulation which defines pairwise constraints across neighboring boundaries. While both approaches provide good classification accuracy, we show in Fig. 3 that applying them on the entire image generates false positives in the vertical structures of the scene. Reflections, transparency, occlusion boundaries, self-shadowing, and complex geometry [</p>
<p>The advent of recent approaches which estimate a qualitative layout of the scene from a single image (e.g. splitting an image into three main geometric classes: the sky, vertical surfaces, and ground [18]) may provide explicit knowledge of where the ground is. Since such a scene layout estimator is specifically trained on general features of the scene and not the shadows, combining its output with our shadow detector should reduce the number of false positive (non-shadow) detections outside the</p>
<p>4.1 Combining scene layout with local shadow cues</p>
<p>To combine the scene layout probabilities with our local shadow classifier, we can marginalize the probability of shadows over the three geometric classes sky $S$, ground $G$, and vertical surfaces $V$:</p>
<p>$$P(y_i \mid x_i) = \sum_{c_i \in \{G, V, S\}} P(y_i \mid c_i, x_i)\, P(c_i \mid x_i) \qquad (4)$$</p>
<p>where $c_i$ is the geometric class label of boundary $i$, $P(y_i \mid c_i, x_i)$ is given by our local shadow classifier, and $P(c_i \mid x_i)$ by the scene layout classifier (we use the geometric context algorithm [18]). Unfortunately, this approach does not actually improve classification results: while it gets rid of false positives in the vertical structures, it also loses true positives on the ground along the way. This is due to the fact that shadow likelihoods get down-weighted by low-confidence gr</p>
<p>4.2 Combining scene layout with shadow contours</p>
<p>Intuitively, we would like to penalize an assignment to the shadow class when the probability of being on the ground is low. When it is high, however, we should let the shadow classifier decide. We can encode this behavior simply by modifying the unary potentials $\phi_i(y_i)$ from (2) in our CRF formulation:</p>
<p><b>(5)</b></p>
<p>Here, $\lambda = 0.5$ and $\beta = 16$ were found by cross-validation. They yield a good compromise between local evidence and smoothness constraints. This approach effectively combines local and mid-level shadow cues with high-level scene interpretation results, and yields an overall classification accuracy of 84.8% on our test set (see Fig. 5) without adding to the complexity of training our model. Observe how the results are significantly improved in Fig. 3(e) as compared to the other scenarios in Fig. 3(b)</p>
<p>References</p>
<p>1. Lalonde, J.F., Efros, A.A., Narasimhan, S.G.: Illumination estimation from a single outdoor image. In: IEEE International Conference on Computer Vision. (2009)</p>
<p>2. Sato, I., Sato, Y., Ikeuchi, K.: Illumination from shadows. IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (2003)</p>
<p>3. Matsushita, Y., Nishino, K., Ikeuchi, K., Sakauchi, M.: Illumination normalization with time-dependent intrinsic images for video surveillance. IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (2004)</p>
<p>4. Finlayson, G.D., Fredembach, C., Drew, M.S.: Detecting illumination in images. In: IEEE International Conference on Computer Vision. (2007)</p>
<p>5. Weiss, Y.: Deriving intrinsic images from image sequences. In: IEEE International Conference on Computer Vision. (2001)</p>
<p>6. Huerta, I., Holte, M., Moeslund, T., Gonzàlez, J.: Detection and removal of chromatic moving shadows in surveillance scenarios. In: IEEE International Conference on Computer Vision. (2009)</p>
<p>7. Wu, T.P., Tang, C.K.: A Bayesian approach for shadow extraction from a single image. In: IEEE International Conference on Computer Vision. (2005)</p>
<p>8. Bousseau, A., Paris, S., Durand, F.: User-assisted intrinsic images. ACM Transactions on Graphics (SIGGRAPH Asia 2009) 28 (2009)</p>
<p>9. Shor, Y., Lischinski, D.: The shadow meets the mask: pyramid-based shadow removal. Computer Graphics Forum Journal (Eurographics 2008) 27 (2008)</p>
<p>10. Finlayson, G.D., Hordley, S.D., Drew, M.S.: Removing shadows from images. In: European Conference on Computer Vision. (2002)</p>
<p>11. Finlayson, G.D., Drew, M.S., Lu, C.: Intrinsic images by entropy minimization. In: European Conference on Computer Vision. (2004)</p>
<p>12. Finlayson, G.D., Drew, M.S., Lu, C.: Entropy minimization for shadow removal. International Journal of Computer Vision 85 (2009)</p>
<p>13. Maxwell, B.A., Friedhoff, R.M., Smith, C.A.: A bi-illuminant dichromatic reflection model for understanding images. In: IEEE Conference on Computer Vision and Pattern Recognition. (2008)</p>
<p>14. Tian, J., Sun, J., Tang, Y.: Tricolor attenuation model for shadow detection. IEEE Transactions on Image Processing 18 (2009)</p>
<p>15. Narasimhan, S.G., Ramesh, V., Nayar, S.K.: A class of photometric invariants: Separating material from shape and illumination. In: IEEE International Conference on Computer Vision. (2005)</p>
<p>16. Russell, B.C., Torralba, A., Murphy, K.P., Freeman, W.T.: LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision 77 (2008)</p>
<p>17. Freeman, W.T., Pasztor, E.C., Carmichael, O.T.: Learning low-level vision. International Journal of Computer Vision 40 (2000)</p>
<p>18. Hoiem, D., Efros, A.A., Hebert, M.: Recovering surface layout from an image. International Journal of Computer Vision 75 (2007)</p>
<p>19. Zhu, J., Samuel, K.G.G., Masood, S.Z., Tappen, M.F.: Learning to recognize shadows in monochromatic natural images. In: IEEE Conference on Computer Vision and Pattern Recognition. (2010)</p>
<p>20. Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Proceedings of the 6th International Conference on Computer Vision. (1998)</p>
<p>21. Barnard, K., Finlayson, G.D.: Shadow identification using colour ratios. In: Proc. IS&T/SID 8th Color Imaging Conf. Color Science, Systems and Applications. (2000)</p>
<p>22. Khan, E.A., Reinhard, E.: Evaluation of color spaces for edge classification in outdoor scenes. In: IEEE International Conference on Image Processing. (2005)</p>
<p>23. Chong, H.Y., Gortler, S.J., Zickler, T.: A perception-based color space for illumination-invariant image processing. ACM Transactions on Graphics (SIGGRAPH 2008) (2008)</p>