Object Class Detection and Classification using Multi Scale Gradient and Corner Point based Shape Descriptors
Fernando Basura, Karaoglu Sezer*, Saha Sajib Kumar†
Faculty of Science, University Jean Monnet, France

Abstract: This paper presents two novel shape descriptors: a multi-scale gradient based descriptor and a corner point based descriptor. The multi-scale gradient based descriptor is combined with generic Fourier descriptors to extract both contour and region based shape information. An object class detection and classification technique that relies on this shape information and a random forest classifier has been optimized. The proposed integrated descriptor is robust to rotation, scale, translation, affine deformations, noisy contours and noisy shapes. The new corner point based interpolated shape descriptor is used for fast object detection and classification, with higher accuracy than elliptic and centroid based Fourier descriptors.
Key words: gradient based descriptor; corner based descriptor; steerable filter; multi-scale edge response

I. INTRODUCTION

According to the review by Zhang and Lu [9], shape descriptors fall into two classes: contour based and region based. Curvature scale space (CSS) methods [14], [17], [18], global shape descriptors, geometric invariants and spectral descriptors are based on the contour information of the shape, while geometric moments, Legendre moments, Zernike moments and pseudo-Zernike moments are some of the commonly used region based shape descriptors. According to the surveys [8], [3] and the review [9], the global approach to shape information extraction outperforms the structural approach. Fourier descriptors outperform many contour based shape descriptors, and generic Fourier descriptors outperform all region based shape descriptors [9]. Centroid distance based Fourier descriptors perform better than other contour based signatures. A detailed study of several Fourier descriptors and signatures can be found in [15].

In [11] Çapar, Kurt and Gökmen proposed a gradient based shape descriptor. Bandera et al. [16] proposed an adaptive approach for affine-invariant 2D shape description. As pointed out in [2], the region based approach and the contour based approach each have their own pros and cons, and using both region and contour information is the better approach [2].

Our interest is to develop a descriptor, as well as an algorithm, that uses both region and contour properties of shapes for object class recognition. In addition, the proposed method has to be invariant to rotation, scale, orientation, affine deformations and noisy contours. The suggested integrated shape descriptor is a combination of the Generic Fourier Descriptor (GFD) [1] and a novel Multi Scale Gradient Based Descriptor (MSGBD). For rapid classification of object shapes, a Corner Based Interpolated Descriptor (CBID) is proposed. CBID is based on the concept proposed in [8].
In [19] a machine learning algorithm based on edit distance and tree matching was proposed. Similarly, an approach for learning class-specific explicit shape models was suggested in [20]. We propose to use a Random Forest (RF) classifier [22] for object recognition. Similar concepts in shape analysis and classification have been recommended in [23], [24] and [25].

The rest of the paper is organized as follows. The proposed method is described in Section II. Experimental results are presented in Section III. Finally, the conclusion is presented in Section IV.

II. METHODOLOGY
A. Generic Fourier Descriptor
GFD is a region based shape descriptor which outperforms the Zernike moments proposed in MPEG-7. As pointed out in [9], GFD has many advantages: it is easy to implement, robust, and less sensitive to deformations and noise. GFD is invariant to rotation, translation and scale. A detailed analysis of GFD is presented in [9]. The authors of [1] suggest using 9 radial and 4 angular frequencies as the optimum choice for classification. In our proposed approach, the N lower radial frequency components and the L lower angular frequency components are ordered in the following format to obtain a vector:

$$\left[\, GFD(0,1),\ GFD(0,2),\ \ldots,\ GFD(0,N),\ \ldots,\ GFD(1,0),\ \ldots,\ GFD(L,0) \,\right]$$
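To make the construction of the GFD vector concrete, the following Python sketch computes a polar Fourier transform of a binary shape around its centroid and flattens the coefficient magnitudes into one vector. It is a minimal illustration under stated assumptions, not the implementation used in this paper: the function name `gfd`, the radial-major ordering of the output, and the normalization (first coefficient divided by the bounding-circle area, all others divided by the first magnitude) follow our reading of [1].

```python
import cmath
import math


def gfd(image, n_radial, n_angular):
    """Sketch of a generic Fourier descriptor for a binary image (nested lists).

    Assumptions (not taken from this paper): polar coordinates are measured
    from the shape centroid, and magnitudes are normalized by the DC term.
    """
    pts = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("image contains no shape pixels")
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    # Polar coordinates (r, theta) of every shape pixel.
    polar = [(math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx)) for x, y in pts]
    max_r = max(r for r, _ in polar) or 1.0

    mags = []
    for rho in range(n_radial):          # radial frequency
        for phi in range(n_angular):     # angular frequency
            coeff = sum(
                cmath.exp(-1j * (2 * math.pi * rho * r / max_r + phi * t))
                for r, t in polar
            )
            mags.append(abs(coeff))      # magnitude discards the rotation phase

    dc = mags[0]
    return [dc / (math.pi * max_r ** 2)] + [m / dc for m in mags[1:]]
```

Because only magnitudes are kept, rotating the shape changes the phase of each coefficient but not the resulting vector, which is the rotation invariance stated above.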
B. Multi Scale Gradient Based Descriptor (MSGBD)
The Gradient Based Shape Descriptor proposed by Çapar et al. in [11] uses directional steerable filters [6]. The authors of [11] used gradient information near the contours instead of the location of the contour, and they used G-steerable filters to obtain the gradient information in many directions. Our proposed methodology is based on the concept of steerable filter responses at the boundary of the shape. We use the Canny edge detector to detect edges. Instead of G filters, Gaussian filters are used to obtain the directional derivatives. The concept of the gradient based shape descriptor [11] has been enhanced to give a better description of the object.

The directional filter response is computed based on the concept of steerable filters. As pointed out by Mikolajczyk and Schmid in [12], the edge response varies significantly with the scale. To overcome this issue we propose to use steerable filter responses computed at different scales and orientations. At a given boundary point, the maximum edge response and the corresponding direction of the maximum response are taken at several scales. This information is used to create the new shape signature. The maximum edge response computed over multiple orientations and scales is robust to affine transformations, viewpoint changes and noisy boundaries.

The Gaussian function and its partial derivatives along the first and second major directions are given by

$$G^{\sigma}(x,y) = e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}, \qquad G_{x}^{\sigma} = \frac{\partial}{\partial x}\, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}, \qquad G_{y}^{\sigma} = \frac{\partial}{\partial y}\, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$

Here the partial derivative along the x dimension is said to be the directional derivative at zero degrees, and the one along the y dimension the directional derivative at 90 degrees.
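As an illustration of these three functions, the sketch below samples an unnormalized Gaussian and its two first-order partial derivatives on a small square grid. The function name `gaussian_kernels`, the 3σ truncation radius and the omission of a normalizing constant are assumptions made for this example only.

```python
import math


def gaussian_kernels(sigma, radius=None):
    """Return (G, Gx, Gy) sampled on a (2*radius+1) x (2*radius+1) grid.

    Rows index y and columns index x. The kernels are left unnormalized,
    which is an assumption of this sketch.
    """
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))  # assumed truncation at 3 sigma
    coords = list(range(-radius, radius + 1))
    g = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2)) for x in coords] for y in coords]
    # d/dx exp(-(x^2+y^2)/(2 sigma^2)) = -(x / sigma^2) * G, and likewise for y.
    gx = [[-x / sigma ** 2 * g[j][i] for i, x in enumerate(coords)] for j in range(len(coords))]
    gy = [[-y / sigma ** 2 * g[j][i] for i in range(len(coords))] for j, y in enumerate(coords)]
    return g, gx, gy
```

The x-derivative kernel is antisymmetric along x and the y-derivative kernel is its transpose, which is why they serve as the 0 degree and 90 degree basis filters.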
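Let $R_{x}^{\sigma} = G_{x}^{\sigma} * I$ and $R_{y}^{\sigma} = G_{y}^{\sigma} * I$, where $I$ is the image and $*$ denotes convolution. The directional filter response $R_{\theta}^{\sigma}$ and the filter $G_{\theta}^{\sigma}$ at scale σ and direction θ are given by

$$R_{\theta}^{\sigma} = \cos(\theta) \times R_{x}^{\sigma} + \sin(\theta) \times R_{y}^{\sigma} \quad \text{and} \quad G_{\theta}^{\sigma} = \cos(\theta) \times G_{x}^{\sigma} + \sin(\theta) \times G_{y}^{\sigma}$$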
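The steering relation means that only the two basis responses are needed at each scale; the response in any other direction is a weighted sum of them. The sketch below applies the relation at a single boundary pixel and then picks the maximum response and its direction over a set of sampled orientations. The helper names `steer` and `max_response`, and the choice of directions evenly spaced over [0, 2π), are assumptions of this example.

```python
import math


def steer(rx, ry, theta):
    """Directional response from the two basis responses at one pixel."""
    return math.cos(theta) * rx + math.sin(theta) * ry


def max_response(rx, ry, n_directions=10):
    """Return (maximum response, direction of the maximum) over sampled angles.

    rx and ry stand for the x- and y-derivative responses at one boundary
    pixel and one scale. Even angular spacing is an assumption.
    """
    thetas = [2 * math.pi * k / n_directions for k in range(n_directions)]
    responses = [steer(rx, ry, t) for t in thetas]
    best = max(range(n_directions), key=responses.__getitem__)
    return responses[best], thetas[best]


def multi_scale_sigmas(start=0.1, factor=1.4, levels=5):
    """Geometric sequence of scales at which the responses are evaluated."""
    return [start * factor ** i for i in range(levels)]
```

With densely sampled directions the maximum approaches the gradient magnitude, the square root of rx² + ry², and the selected direction approaches the gradient orientation; repeating the call for each value returned by `multi_scale_sigmas` gives one pair per scale.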
Let the image be a function f(x, y); the directional derivative in direction θ is defined as

$$g_{\theta}(x,y) = f(x,y) * \left( \cos(\theta) \times \frac{d}{dx} G(x,y) + \sin(\theta) \times \frac{d}{dy} G(x,y) \right)$$

We can write the directional filter response using the steerable filter with a given scale σ and direction θ as

$$g_{\theta}^{\sigma}(x,y) = f(x,y) * G_{\theta}^{\sigma}(x,y)$$

The gradient of the image I in multiple directions $\theta_{m}$, m = 1, ..., M, for a given scale σ is calculated at boundary point k, located at $(x_{k}, y_{k})$, as

$$GBD_{m}^{\sigma}(k) = \left( I * G_{\theta_{m}}^{\sigma} \right)(x_{k}, y_{k})$$

The filter response magnitude $f(k)$ and the directional filter response $f_{d}(k)$ are defined as

$$f(k) = \max_{m = 1, \ldots, M} \left( GBD_{m}^{\sigma}(k) \right) \quad \text{and} \quad f_{d}(k) = \underset{m = 1, \ldots, M}{\arg\max} \left( GBD_{m}^{\sigma}(k) \right)$$

After that, the polar coordinates r and θ are computed for each boundary point (x, y). The proposed shape signature based on polar coordinates is then expressed in the form

$$f(r, \theta) = f + i \times f_{d}, \qquad \forall (x, y) \in \text{Boundary}$$

A Fourier transform is applied to this signature, followed by FD normalization, and K × L = X low frequencies are taken as the response:

$$\hat{f}(\lambda, \mu) = \sum_{r} \sum_{\theta} f(r, \theta) \times e^{-j 2\pi \left( \frac{r}{R} \lambda + \frac{\theta}{T} \mu \right)}$$

where, as in the polar Fourier transform of [1], R and T denote the radial and angular resolutions. The output of the MSGBD is a vector of size X, obtained by a 2-D to 1-D transformation. MSGBD encapsulates shape information in a compact and robust way. It uses centroid based location information and gradient magnitude in multiple directions and scales. It is invariant to rotation, scale and affine deformations because edge responses are quite stable over multiple scales.

C. Integrated shape descriptor
The proposed integrated shape descriptor combines GFD and MSGBD, and is therefore based on both region and contour properties of the shape. To obtain better results and to control how much each component contributes to the overall descriptor, two weighting parameters α and β are used. The combined descriptor is a one-dimensional vector of length (N × L) + X, so the + in the expression below denotes concatenation of the two weighted vectors. The integrated shape descriptor ISD is written in the form

$$ISD = \alpha \times GFD + \beta \times MSGBD$$
The distance between two shape objects is calculated based on city block distance. However for the classification task we use a Random Forest classifier.
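The weighting and the city block comparison can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the combination is implemented as concatenation of the two weighted vectors, which is our reading of the stated descriptor length (N × L) + X.

```python
def integrated_descriptor(gfd_vec, msgbd_vec, alpha=1.0, beta=1.0):
    """Concatenate the weighted GFD and MSGBD vectors into one ISD vector."""
    return [alpha * v for v in gfd_vec] + [beta * v for v in msgbd_vec]


def city_block(a, b):
    """City block (L1) distance between two descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(p - q) for p, q in zip(a, b))
```

For example, `integrated_descriptor([1.0, 2.0], [3.0], alpha=0.5, beta=2.0)` gives `[0.5, 1.0, 6.0]`, and its city block distance to the zero vector of the same length is 7.5. Raising α relative to β makes differences in region content dominate the distance, and vice versa.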
D. Corner based interpolated descriptor (CBID)
The corner based interpolated descriptor is designed to find probable classes of shapes in multiclass object scenes. Being a compact descriptor, it can capture significant shape information and rapidly classify objects into classes. In the proposed approach, the image is first binarized by Otsu's method; the Canny edge detector is then applied to the binarized image to obtain robust contour information. After that, the Harris-Stephens corner detector is applied to the edge image to find corner points on the object boundary. Once the corner points are localized, they are transformed to a polar representation. The centroid of the shape object is computed from the detected corner points. Suppose the set of corner points is given by

$$C = \{ (x_{1}, y_{1}), (x_{2}, y_{2}), (x_{3}, y_{3}), \ldots, (x_{n}, y_{n}) \}$$

Then the radial distance and the angle of each corner point are calculated as follows:

$$R_{n} = \sqrt{(x_{n} - x_{c})^{2} + (y_{n} - y_{c})^{2}}, \qquad \theta_{n} = \operatorname{atan}\left( \frac{y_{n} - y_{c}}{x_{n} - x_{c}} \right)$$

where $(x_{c}, y_{c})$ are the coordinates of the centroid. Normalization of the radial distance is done by

$$R_{n} = \frac{R_{n}}{\mathrm{Max}(R)}$$

and the signature is given by

$$f(\theta_{n}) = R_{n}$$

When we need to compare two shapes for similarity, we interpolate the shape signature for every θ from 0 to 360 degrees using nearest neighbor interpolation. A Fourier descriptor is then computed on the interpolated shape signature. Other interpolation techniques can also be used. For each shape class we extract the radial signatures based on the corner points. Since very few corner points are enough to capture the shape signature, this descriptor is very compact and can be used with any machine learning technique. We have used a Random Forest classifier, but KNN or any other machine learning approach can also be used.

E. Object Recognition and Classification
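The first stage of this procedure relies on the CBID of Section II(D). As groundwork, a minimal Python sketch of that descriptor is given here, starting from corner points that have already been detected. It is not the implementation used in the experiments: the function name `cbid` is hypothetical, `atan2` is used instead of a plain arctangent so that the quadrant of each corner is resolved, the signature is sampled at integer degrees, and the Fourier magnitudes are divided by the DC term as one possible normalization.

```python
import cmath
import math


def cbid(corners, n_coeffs=10):
    """Sketch of a corner based interpolated descriptor from (x, y) corners."""
    cx = sum(x for x, _ in corners) / len(corners)
    cy = sum(y for _, y in corners) / len(corners)
    radii = [math.hypot(x - cx, y - cy) for x, y in corners]
    # Angles in degrees, mapped to [0, 360); atan2 is an assumption (see text).
    angles = [math.degrees(math.atan2(y - cy, x - cx)) % 360 for x, y in corners]
    r_max = max(radii) or 1.0
    radii = [r / r_max for r in radii]  # scale normalization

    def angular_distance(a, b):
        d = abs(a - b) % 360
        return min(d, 360 - d)

    # Nearest neighbor interpolation of the signature at every integer degree.
    signature = []
    for deg in range(360):
        nearest = min(range(len(corners)), key=lambda i: angular_distance(angles[i], deg))
        signature.append(radii[nearest])

    # Magnitudes of the first Fourier coefficients of the signature.
    n = len(signature)
    mags = [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(signature)))
        for k in range(n_coeffs + 1)
    ]
    dc = mags[0] or 1.0
    return [m / dc for m in mags[1:]]
```

Measuring distances from the centroid and dividing by the largest radius makes the vector independent of translation and scale, so the same shape at two sizes yields the same descriptor. The resulting short vector is what a classifier would receive to rank the candidate classes.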
First, the corner based interpolated descriptor is applied to recognize the probable classes. The classifier designed for CBID gives the probability of the shape belonging to each class, and we select the N classes with the highest probability. Once the probable shape classes are recognized, the actual classification is done based on the Integrated Shape Descriptor (ISD) proposed in Section II(C). In this way the search space for ISD is reduced. Finally, in conjunction with the RF classifier, this yields a classification that is robust to noise.

III. EXPERIMENTAL RESULTS

A GFD [1] of size 9 × 4 is used for the integrated descriptor. MSGBD starts with σ = 0.1, an increment factor of 1.4 (up to 5 scale levels) and 10 directional derivatives. According to our observation and analysis, 36 low frequency components are sufficient to represent the shape information. For the experimental analysis, the MPEG-7 CE Shape-1 Part B shape database with 70 shape classes and 1400 images is used. The Random Forest classifier used in this paper has 10 trees, each constructed while considering 6 random features. The corner point based interpolated descriptor uses a maximum of 40 corner points with nearest neighbor interpolation. To represent the shape information in CBID, 10 Fourier descriptors were used.

For the comparison with other algorithms we have used the Centroid Based Fourier Descriptor (CBFD) with 36 FD coefficients. The Elliptic Fourier Descriptor (EFD) [13], GFD, CBFD, CBID and MSGBD were selected for the comparisons. All selected descriptors use 36 coefficients except CBID, and the classifier used with each shape descriptor is the RF classifier. The summary of the experimental results for the shape descriptors on the MPEG-7 shape dataset is as follows.

TABLE 1: EXPERIMENTAL RESULTS
| Method | GFD | EFD | CBID | MSGBD | CBFD |
|---|---|---|---|---|---|
| Accuracy | 80% | 65% | 78.33% | 85.50% | 63.33% |
| Average Precision | 0.80 | 0.64 | 0.80 | 0.86 | 0.67 |
| Average Recall | 0.80 | 0.65 | 0.78 | 0.85 | 0.63 |

IV. CONCLUSION

Although the proposed corner based interpolated descriptor is fast and compact, it still achieves better accuracy than elliptic Fourier descriptors and centroid based Fourier descriptors. The corner point based interpolated descriptor performs well on largely deformed shapes and when noise is present in the contour of the shape. In this approach we used corner points; in future experiments we want to explore visual saliency based descriptors.

Gradient information near object boundaries is very useful for extracting shape information as well as region based shape properties. Since contour information varies with the scale, it is very important to extract gradient information at different scales and orientations. The proposed Multi Scale Gradient Based Descriptor extracts shape information in multiple orientations and scales, and it encapsulates centroid based radial information. As a result, MSGBD reaches an accuracy of 85.50%, compared with 80% for GFD and 63.33% for CBFD. It is also robust to noisy contours and invariant to rotation, scale and affine deformations.

Finally, the integrated shape descriptor is an effective shape descriptor which captures both region based and contour based shape properties and gives better classification with satisfactory retrieval time. It suffers when the shape object is very small, which is the only noticeable drawback of this approach. In future work we want to incorporate visual saliency information to address that issue.

REFERENCES

[1] D. Zhang and G. Lu, “Generic Fourier Descriptors for Shape-based Image Retrieval”, IEEE International Conference on Multimedia and Expo, Lausanne, Switzerland, August 26-29, 2002.
[2] A. Sajjanhar, G. Lu, D. Zhang, W. Zhou, “A Composite Descriptor for Shape Retrieval”, 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), pp. 795-800, 2007.
[3] D. Zhang, G. Lu, “A Comparative Study of Three Region Shape Descriptors”, In Proc. of the Sixth Digital Image Computing: Techniques and Applications (DICTA02), pp. 86-91, Melbourne, Australia, January 21-22, 2002.
[4] D. Zhang and M. Lim, “An Efficient and Robust Technique for Region Based Shape Representation and Retrieval”, In Proc. of 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS2007), ISBN: 0-7695-2841-4, pp. 801-806, Melbourne, 11-13 July, 2007.
[5] A. Çapar, B. Kurt, M. Gökmen, “Affine Invariant Gradient Based Shape Descriptor”, MRCS 2006: 514-521.
[6] W. T. Freeman, E. H. Adelson, “The design and use of steerable filters”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 13, no. 9, pp. 891-906, September 1991.
[7] D. Zhang, G. Lu, “An Integrated Approach to Shape Based Image Retrieval”, In Proc. of the Fifth Asian Conference on Computer Vision (ACCV02), pp. 652-657.
[8] A. Sajjanhar, G. Lu, D. Zhang, W. Zhou, “Corners-Based Composite Descriptor for Shapes”, In Proc. of the 2008 Congress on Image and Signal Processing (CISP08), Vol. 2, pp. 714-718, ISBN: 978-0-7695-3119-9, 27-30 May, Hainan, China, 2008.
[9] D. Zhang and G. Lu, “Review of Shape Representation and Description Techniques”, Pattern Recognition, 37(1):1-19, 2004.
[10] S. Loncaric, “A Survey of Shape Analysis Techniques”, Pattern Recognition, vol. 31, no. 8, pp. 983-1001, August 1998.
[11] A. Çapar, B. Kurt, M. Gökmen, “Gradient-based shape descriptors”, Mach. Vis. Appl. 20(6): 365-378 (2009).
[12] K. Mikolajczyk, A. Zisserman, C. Schmid, “Shape recognition with edge-based features”, Proceedings of the British Machine Vision Conference (2003).
[13] F. P. Kuhl, C. R. Giardina, “Elliptic Fourier features of a closed contour”, Computer Graphics and Image Processing, Volume 18, Issue 3, March 1982, pp. 236-258.
[14] S. Abbasi, F. Mokhtarian, J. Kittler, “Curvature scale space image in shape similarity retrieval”, Multimedia Systems 7 (1999) 467–476.
[15] D. Zhang, G. Lu, “Study and evaluation of different Fourier methods for image retrieval”, Image and Vision Computing 23 (2005) 33–49.
[16] A. Bandera, E. Antúnez, R. Marfil, “An Adaptive Approach for Affine-Invariant 2D Shape Description”, Pattern Recognition and Image Analysis (2009) pp. 417-424.
[17] F. Mokhtarian, S. Abbasi, J. Kittler, “Robust and efficient shape indexing through curvature scale space”, Proceedings of British Machine Vision Conference, Edinburgh, UK, 1996, pp. 53–62.
[18] S. Abbasi, F. Mokhtarian, J. Kittler, “Enhancing CSS-based shape retrieval for objects with shallow concavities”, Image and Vision Computing 18 (2000) 199–211.
[19] A. Torsello, A. Robles-Kelly, E. R. Hancock, “Discovering Shape Classes using Tree Edit-Distance and Pairwise Clustering”, International Journal of Computer Vision 72(3), 259–285, 2007.
[20] V. Ferrari, F. Jurie, C. Schmid, “From Images to Shape Models for Object Detection”, International Journal of Computer Vision (2010) 87: 284–303.
[21] Y. Li, M. J. Kyan, L. Guan, “Improving Shape-Based CBIR for Natural Image Content Using a Modified GFD”, Image Analysis and Recognition (2005) pp. 593-600.
[22] L. Breiman, “Random Forests”, Machine Learning, 45, 5–32, 2001.
[23] A. Bosch, A. Zisserman, X. Muñoz, “Image Classification using Random Forests and Ferns”, ICCV 2007.
[24] T. K. Ho, “Random Decision Forest”, 3rd Int'l Conf. on Document Analysis and Recognition (1995) pp. 278–282.
[25] Y. Amit, D. Geman, “Shape quantization and recognition with randomized trees”, Neural Computation 9 (7): 1545–1588.