AUTOMATION & CONTROL - Theory and Practice Part 15

SVMs project the data into a space of higher dimension in which a linear separation should be better adapted. This projection is achieved by kernel functions, which must satisfy a number of properties to ensure the effectiveness of the technique, so that no computation has to be carried out explicitly in very large dimensions. Kernel functions therefore make it possible to work in very high-dimensional spaces, where linear separation and linear regression are facilitated. The central idea of SVM is to project the data into the descriptor space and to use an algorithm that maximizes the margin, obtaining a separation that retains a good generalization capacity. For more details on SVMs, we refer interested readers to (Cristianini & Taylor, 2000).

A comparison between multiclass SVM, as supervised classification, and Euclidean-distance-based k-means, as unsupervised classification, is presented in (Kachouri et al., 2008b). The obtained results show that the SVM classifier outperforms the use of similarity measures, chiefly for classifying a heterogeneous image database. Therefore, we integrate the SVM classifier in the image retrieval systems proposed in this chapter.
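As an illustration of this choice, the following minimal sketch trains a multiclass SVM with an RBF kernel on precomputed image feature vectors using scikit-learn. The data, feature dimensions and category labels are hypothetical placeholders, not the database used in this chapter; in practice the feature vectors would come from the descriptors of section 3 and the labels from the image categories.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: one row per image, columns are extracted descriptors
# (shape, color, texture, ...); labels are the image categories.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 64))    # 200 images, 64-dimensional feature vectors
y_train = rng.integers(0, 5, size=200)  # 5 hypothetical image categories
X_query = rng.normal(size=(10, 64))     # feature vectors of query images

# RBF-kernel SVM: the kernel implicitly projects the features into a
# high-dimensional space where a maximum-margin linear separation is sought;
# scikit-learn handles the multiclass case internally.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_train, y_train)
print(clf.predict(X_query))  # predicted category for each query image
```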
5. Image recognition and retrieval results through relevant feature selection

To ensure a good feature selection during image retrieval, we present and discuss the effectiveness of the different feature kinds and of their aggregation, since a heterogeneous image database contains various images with large differences in content. The idea of introducing a system optimization tool became essential when the tests carried out showed that using all the extracted features is heavy to manage: the larger the dimension of the feature vectors, the more difficult their classification becomes. The traditional approach followed in (Djouak et al., 2005a), and found in many CBIR systems, is a scheme in which all the extracted features are used in the classification step. Unfortunately, this method has a major drawback: by using all the features, the classifier handles a very large number of dimensions, which entails a considerable computing time and becomes a real handicap for large image databases. This difficulty is a direct consequence of the high-dimensionality problem, which has been the subject of several works aiming to cure it.

Feature (content) extraction is the basis of CBIR, and recent CBIR systems retrieve images based on visual properties. Since we use a heterogeneous image database, the images belong to various categories and show large differences in their visual properties, so a unique feature, or a unique feature kind, cannot be relevant for describing the whole image database. Moreover, although SVM is a powerful classifier, limitations arise for heterogeneous images, given the complexity of their content: many features may be redundant or irrelevant, because some of them are not responsible for the observed image classification or are similar to each other. In addition, when there are too many irrelevant features in the index dataset, the generalization performance tends to suffer. Consequently, it becomes essential to select the feature subset that is most relevant to the classification problem of interest. Hence a new issue arises, beyond image description: relevant feature selection.

Subsequently, to guarantee the best classification performance, a good content-based image recognition system must mainly be able to determine the most relevant feature set and then to discretize the corresponding spaces well. Feature selection for classification purposes is a well-studied topic (Blum & Langley, 1997), with some recent work related specifically to feature selection for SVMs; the algorithms proposed in this regard form an ample literature spanning several years (Guyon & Elisseeff, 2003). Although the proposed selection methods are quite varied, two main branches are distinguished: wrappers and filters (John et al., 1994), (Yu & Liu, 2004).

Filters are very fast; they rely on theoretical considerations, which generally allow a better understanding of the variable relationships. Linear filters such as PCA (Principal Component Analysis) or FLD (Fisher's Linear Discriminant) (Meng et al., 2002) are widely used, but these methods are satisfactory only if there is some redundancy in the starting data. (Daphne & Mehran, 1996) propose Markov blanket algorithms, which find, for a given variable xi, a set of variables not including xi that renders xi unnecessary; once a Markov blanket is found, xi can safely be eliminated. This remains only a rough approximation, however, because the idea is not implementable in practice. Moreover, since filters do not take the classifier used in the generalization stage into account, selection methods of this kind are generally unable to guarantee high recognition rates.

Although conceptually simpler than filters, wrappers were introduced more recently by (John et al., 1994). This kind of selection uses the classifier as an integral part of the selection process: a feature subset is selected according to its success in classifying test images. The selected feature subset is therefore well suited to the classification algorithm; in other words, high recognition rates are obtained because the selection takes into account the intrinsic bias of the classification algorithm. Some works specifically related to feature selection using the SVM classifier are recorded in the literature (Guyon et al., 2002), (Zhu & Hastie, 2003), (Bi et al., 2003), (Chen et al., 2006). The major inconvenience of this selection technique is the need for expensive computation, especially when the number of variables grows. More details are given in (Guyon & Elisseeff, 2003) and the references therein.

To take advantage of both of these selection kinds, namely the speed of filters and the adaptability of the selected feature subset to the classifier used in wrappers, new selection methods ensuring this compromise are always sought. Recently, (Bi et al., 2003) proposed the use of a 1-norm SVM as a linear classifier for feature selection, so that the computational cost is no longer an issue, while a non-linear SVM is then used for generalization; a sketch of this kind of scheme is given below. Other methods combining filters and wrappers are presented in (Guyon & Elisseeff, 2003). It is within this framework that we propose, in this section, the modular statistical optimization (section 5.1) and the best feature type selection (section 5.2) methods.
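As a rough sketch of this compromise, one possible realization with scikit-learn is shown below: a sparse, L1-penalized linear SVM selects a feature subset, and an RBF-kernel SVM is then trained on that subset for generalization. The estimators, the regularization constant C and the default selection threshold are illustrative assumptions and only approximate the 1-norm SVM formulation of (Bi et al., 2003).

```python
import numpy as np
from sklearn.svm import LinearSVC, SVC
from sklearn.feature_selection import SelectFromModel

# Hypothetical feature matrix (many descriptors per image) and class labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 120))
y = rng.integers(0, 4, size=300)

# Step 1: sparse (L1-penalized) linear SVM; descriptors whose weights are
# (near) zero for every class are discarded by SelectFromModel.
l1_svm = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000).fit(X, y)
selector = SelectFromModel(l1_svm, prefit=True)
X_selected = selector.transform(X)
print("kept", X_selected.shape[1], "of", X.shape[1], "features")

# Step 2: non-linear (RBF) SVM trained on the selected subset for generalization.
clf = SVC(kernel="rbf", gamma="scale").fit(X_selected, y)
```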
5.1 Modular statistical optimization

The modular statistical architecture proposed in figure 9 is based on a feedback loop procedure. The principal idea (Djouak et al., 2006) of this architecture is that, instead of using all the features in the classification step, one categorizes them into several blocks or modules and then tries to reach the optimal precision with the minimum number of blocks.

The introduced modular feature database includes all the features presented in section 3. Using these features, four feature modules are formed, described as follows: the first module (b1) gathers all the shape features, the second module (b2) the color features, the third module (b3) the texture features, and finally the fourth module (b4) the Daubechies features. Table 1 summarizes the feature blocks (B1 to B6) obtained by combining the feature modules (b1 to b4).

Table 1. Used feature blocks: each block B1 to B6 is a combination of the feature modules b1 to b4.

Fig. 9. Modular statistical optimization architecture.

We can observe in figure 10, for query image number 4, that the classification error rate is very high for block B1; however, the error rate decreases progressively as the other feature blocks are used. The presented modular architecture has some disadvantages: the query images must be included in the database, and the experimental error rate is used as prior information. To solve this problem, we propose in the next section a classification procedure based on a hierarchical method using the best feature type selection.

Fig. 10. Average classification error rate obtained for the different feature blocks.

5.2 Best feature type selection method

The hierarchical feature model is proposed to replace the classical employment of aggregated features (Djouak et al., 2005a), (Djouak et al., 2005b). This method is able to select features and organize them automatically according to their kinds and to the image database content. In the off-line stage, the feature extraction step provides, from the image database, the corresponding feature dataset. Then, in the training step, each group of features of the same kind is used separately, and based on the training rate criterion computed with the classifier, the best feature kinds are selected hierarchically. In the on-line stage, each image of the test database is classified using the different feature kinds separately, so that each image yields several clusters as retrieval results. To decide between these various outputs, the outputs of two feature kinds are treated together at a time, according to the hierarchical feature selection model described in figure 11. The process starts with the outputs of the two last-ranked feature kinds and continues until the best one is reached. Each time, according to the two examined feature kind outputs, a comparison block decides whether the Nearest Cluster Center (NCC) block is used or not. The NCC block computes the Euclidean distance between the candidate image and the two cluster centers, the clusters being the outputs of the two feature kinds under comparison, as sketched below.

Fig. 11. Hierarchical best feature type selection and organization architecture using different SVM models.
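A minimal sketch of the NCC decision rule is given below, assuming the two cluster centers have already been computed from the outputs of the two compared feature kinds. The vectors and dimensions are hypothetical, and the comparison block of figure 11 that decides whether NCC is invoked at all is not modeled here.

```python
import numpy as np

def nearest_cluster_center(candidate, center_a, center_b):
    """Assign the candidate image to the closer of the two cluster centers,
    the clusters being the outputs of the two compared feature kinds."""
    d_a = np.linalg.norm(candidate - center_a)  # Euclidean distance to cluster A
    d_b = np.linalg.norm(candidate - center_b)  # Euclidean distance to cluster B
    return ("cluster A", d_a) if d_a <= d_b else ("cluster B", d_b)

# Hypothetical 32-dimensional feature vectors.
rng = np.random.default_rng(2)
candidate_image = rng.normal(size=32)
center_a, center_b = rng.normal(size=32), rng.normal(size=32)
print(nearest_cluster_center(candidate_image, center_a, center_b))
```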
A comparison between classical aggregated features and the proposed hierarchical feature model is carried out. The hierarchical feature model (figure 11) outperforms the use of aggregated features, i.e. the combination of several feature kinds obtained by simply mixing them all together (color + texture + shape). We present, in figures 12 and 13, the first 15 images retrieved for a query image, using respectively aggregated features and hierarchical features. In these two figures, the first image is the query image. We observe that the retrieval accuracy of the hierarchical feature model is clearly better than that obtained with aggregated features. This shows that feature aggregation is not efficient enough when the various feature kinds are simply mixed: the range of each descriptor kind differs from that of the other descriptor kinds, so each feature vector produces a signature that is not uniform with the other feature signatures extracted from the images.

Fig. 12. Retrieval examples using classical aggregated features.

Fig. 13. Retrieval examples using the hierarchical feature model.

Consequently, using the proposed hierarchical feature model is more efficient than using aggregated features in a heterogeneous image retrieval system. Figure 14 confirms this result: the hierarchical feature model reaches a good-classification rate of 0.815, against 0.68 for the aggregated features method.

Fig. 14. Precision-recall graph comparing hierarchical features and aggregated features.

6. Conclusion

In this chapter, we have presented the different stages of an image recognition and retrieval system dedicated to various applications in the computer vision domain. Image description and classification constitute the two important steps of an image recognition system over large heterogeneous databases. We have detailed the principles of feature extraction, the description of the images contained in a large database, and the importance of robustness. After presenting feature extraction and some improvements, we have detailed the importance of the classification task and presented the supervised SVM classifier. To ensure a good feature selection during image retrieval, we have presented and discussed the effectiveness of the different feature kinds and of their aggregation. We have detailed the need for optimization methods in CBIR systems and have proposed two architectures: the modular statistical optimization and the hierarchical feature model. The satisfactory results obtained show the importance of optimization and feature selection in this domain.

Designing CBIR systems remains a challenging problem. Indeed, many domains have been unable to take advantage of image retrieval and recognition methods and systems, in spite of their acknowledged importance in the face of the growing use of image databases in mobile robotics, research, and education. The challenging types of images to be treated and the lack of suitable systems have hindered their acceptance. While it is difficult to develop a single comprehensive system, it may be possible to take advantage of the growing research interest and of several successful systems with techniques developed for image recognition in large databases.

7. References

Antania, S., Kasturi, R. & Jain, R. (2002). A survey on the use of pattern recognition methods for abstraction, indexing and retrieval of images and video, Pattern Recognition, 35(4), 945-965.
Bi, J., Bennett, K., Embrechts, M., Breneman, C. & Song, M. (2003). Dimensionality reduction via sparse support vector machines, Journal of Machine Learning Research, 3, 1229-1243.
Bimbo, A. D. (2001). Visual Information Retrieval, Morgan Kaufmann Publishers, San Francisco, USA.
Blum, A.L. & Langley, P. (1997). Selection of Relevant Features and Examples in Machine Learning, Artificial Intelligence, 97(1-2), 245-271.
Carson, C., Belongie, S., Greenspan, H. & Malik, J. (2002). Blobworld: Image segmentation using Expectation-Maximization and its Application to Image Querying, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 8.
Chen, Y. & Wang, J.Z. (2004). Image Categorization by Learning and Reasoning with Regions, Journal of Machine Learning Research, Vol. 5, 913-939.
Chen, Y., Bi, J. & Wang, J.Z. (2006). MILES: Multiple-Instance Learning via Embedded Instance Selection, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 12, 1931-1947.
Cristianini, N. & Taylor, J.S. (2000). An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press.
Daphne, K. & Mehran, S. (1996). Toward optimal feature selection, In International Conference on Machine Learning, 284-292.
Delingette, H. & Montagnat, J. (2001). Shape and topology constraints on parametric active contours, Computer Vision and Image Understanding, Vol. 83, No. 2, 140-171.
Djouak, A., Djemal, K. & Maaref, H. (2005a). Image retrieval based on features extraction and RBF classifiers, IEEE International Conference on Signals, Systems and Devices, SSD 05, Sousse, Tunisia.
Djouak, A., Djemal, K. & Maaref, H. (2005b). Features extraction and supervised classification intended to image retrieval, In IMACS World Congress: Scientific Computation, Applied Mathematics and Simulation, Paris, France.
Djouak, A., Djemal, K. & Maaref, H. (2006). Modular statistical optimization and VQ method for images recognition, International Conference on Artificial Neural Networks and Intelligent Information Processing, 13-24, ISBN: 978-972-8865-689, Setúbal, Portugal, August.
Djouak, A., Djemal, K. & Maaref, H. (2007). Image Recognition based on features extraction and RBF classifier, Journal Transactions on Signals, Systems and Devices, Issues on Communication and Signal Processing, Shaker Verlag, Vol. 2, No. 3, 235-253.
Egmont-Petersen, M., de Ridder, D. & Handels, H. (2002). Image processing with neural networks - a review, Pattern Recognition, 35(10), 2279-2301.
Faloutsos, C., Equitz, W., Flickner, M., Niblack, W., Petkovic, D. & Barber, R. (1994). Efficient and Effective Querying by Image Content, Journal of Intelligent Information Systems, Vol. 3, No. 3/4, 231-262.
Guyon, I., Weston, J., Barnhill, S. & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines, Machine Learning, 46, 389-422.
Guyon, I. & Elisseeff, A. (2003). An introduction to variable and feature selection, Journal of Machine Learning Research, Vol. 3, 1157-1182.
Hafner, J., Sawhney, H.S., Equitz, W., Flickner, M. & Niblack, W. (1995). Efficient color histogram indexing for quadratic form distance functions, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7), 729-736.
Haralick, R.M., Shanmugam, K. & Dinstein, I. (1973). Textural Features for Image Classification, IEEE Transactions on Systems, Man and Cybernetics, 3(6), 610-621.
Hu, M.K. (1962). Visual pattern recognition by moment invariants, IEEE Transactions on Information Theory, 8, 179-187.
Huang, J., Kumar, S.R., Mitra, M. & Zhu, W.J. (1999). Spatial color indexing and applications, International Journal of Computer Vision, 35(3), 245-268.
Huang, J., Kumar, S.R., Mitra, M., Zhu, W.-J. & Zabih, R. (1997). Image indexing using color correlograms, In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 762-768.
Jacobs, C., Finkelstein, A. & Salesin, D. (1995). Fast multiresolution image querying, In Proc. SIGGRAPH.
John, G.H., Kohavi, R. & Pfleger, K. (1994). Irrelevant features and the subset selection problem, In International Conference on Machine Learning, 121-129. Journal version in AIJ, available at http://citeseer.nj.nec.com/13663.html.
Julesz, B. (1975). Experiments in the visual perception of texture, Scientific American, 232(4), 2-11.
Kachouri, R., Djemal, K., Maaref, H., Sellami Masmoudi, D. & Derbel, N. (2008b). Heterogeneous Image Retrieval System Based On Features Extraction and SVM Classifier, International Conference on Informatics in Control, Automation and Robotics, ICINCO 08, Funchal, Madeira, Portugal.
Kachouri, R., Djemal, K., Sellami Masmoudi, D. & Derbel, N. (2008c). Content Based Image Recognition based on QUIP-tree Model, IEEE International Conference on Signals, Systems and Devices, SSD 08, Amman, Jordan.
Kachouri, R., Djemal, K., Maaref, H., Masmoudi, D.S. & Derbel, N. (2008). Content description and classification for Image recognition system, 3rd IEEE International Conference on Information and Communication Technologies: From Theory to Applications, ICTTA 08, 1-4.
Kadyrov, A. & Petrou, M. (2001). The Trace transform and its applications, IEEE Transactions on Pattern Analysis and Machine Intelligence, 811-828.
Lin, Chun-Yi, Yin, Jun-Xun, Gao, X., Chen, Jian-Y. & Qin, P. (2006). A Semantic Modeling Approach for Medical Image Semantic Retrieval Using Hybrid Bayesian Networks, Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), Vol. 2, 482-487.
Lipson, P., Grimson, E. & Sinha, P. (1997). Configuration based scene classification and image indexing, In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1007-1013.
Manjunath, B.S., Ohm, J.-R., Vasudevan, V.V. & Yamada, A. (2001). Color and texture descriptors, IEEE Transactions on Circuits and Systems for Video Technology, 11, 703-715.
Meng, J.E., Shiqian, W., Juwei, L. & Hock, L.T. (2002). Face Recognition With Radial Basis Function (RBF) Neural Networks, IEEE Transactions on Neural Networks, 13(3).
Press, W.H., Flannery, B.P., Teukolsky, S.A. & Vetterling, W.T. (1987). Numerical Recipes: The Art of Scientific Computing, Cambridge, U.K.: Cambridge University Press.
Ramesh, J.R., Murthy, S.N.J., Chen, P.L.-J. & Chatterjee, S. (1995). Similarity Measures for Image Databases, Proceedings of the IEEE International Joint Conference of the Fourth IEEE International Conference on Fuzzy Systems and The Second International Fuzzy Engineering Symposium, 3, 1247-1254.
Rezai-Rad, G. & Aghababaie, M. (2006). Comparison of SUSAN and Sobel Edge Detection in MRI Images for Feature Extraction, Information and Communication Technologies, ICTTA 06, 1, 1103-1107.
Sastry, Ch. S., Pujari, A.K., Deekshatulu, B.L. & Bhagvati, C. (2004). A wavelet based multiresolution algorithm for rotation invariant feature extraction, Pattern Recognition Letters, 25, 1845-1855.
Sclaroff, S. & Pentland, A. (1995). Modal Matching for Correspondence and Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(6), 545-561.
Serrano, N., Savakis, A.E. & Luo, J. (2004). Improved scene classification using efficient low-level features and semantic cues, Pattern Recognition, 37, 1773-1784.
Smith, J.R. & Chang, S.-F. (1996). Tools and techniques for color image retrieval, In SPIE Proc. Storage and Retrieval for Image and Video Databases, Vol. 2670, 426-437.
Stricker, M. & Dimai, A. (1997). Spectral covariance and fuzzy regions for image indexing, Machine Vision and Applications, 10(2), 66-73.
Stricker, M. & Swain, M. (1994). Capacity and the sensitivity of color histogram indexing, Technical Report 94-05, University of Chicago.
Swain, M. & Ballard, D. (1991). Color indexing, International Journal of Computer Vision, 7(1), 11-32.
Takashi, I. & Masafumi, H. (2000). Content-based image retrieval system using neural networks, International Journal of Neural Systems, 10(5), 417-424.
Vapnik, V. (1998). Statistical learning theory, Wiley-Interscience.
Wu, J. (2003). Rotation Invariant Classification of 3D Surface Texture Using Photometric Stereo, PhD Thesis, Heriot-Watt University.
Yu, L. & Liu, H. (2004). Efficient Feature Selection via Analysis of Relevance and Redundancy, Journal of Machine Learning Research, Vol. 5, 1205-1224.
Zhu, J. & Hastie, T. (2003). Classification of gene microarrays by penalized logistic regression, Biostatistics.