Scientific and Technical Journal


ISSN Print 2221-3937
ISSN Online 2221-3805
Due to the rapid growth in the amount of multimedia information, cataloging, analyzing, searching, and retrieving it has become a challenging task. Video annotation is used to address this problem.
This article briefly analyzes existing systems and concludes that ontologies, in the form of a hierarchically structured vocabulary of semantic video concepts and their basic properties, are a viable tool for describing the contents of a video.
We propose a structure for an automatic video annotation system and then describe each of its subsystems in detail. An improved background subtraction method is used to segment moving objects, after which the objects' features are extracted.
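As a rough illustration of background subtraction followed by feature extraction, the sketch below uses a plain exponential running-average background model on a synthetic grayscale frame. It is not the improved method the article proposes; the function names, the `alpha` and `threshold` values, and the choice of a bounding box as the extracted feature are all assumptions made for the example.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average).
    alpha is an assumed learning rate, not a value from the article."""
    return (1.0 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=25.0):
    """Mark pixels that differ from the background by more than threshold."""
    return np.abs(frame.astype(np.float64) - background) > threshold

def bounding_box(mask):
    """Return (row_min, col_min, row_max, col_max) of foreground pixels,
    or None if the mask is empty. A simple stand-in for object features."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

# Synthetic scene: flat background with one bright 10x10 "moving object".
background = np.full((48, 64), 100.0)
frame = background.copy()
frame[10:20, 30:40] = 200.0

mask = foreground_mask(background, frame)
box = bounding_box(mask)
background = update_background(background, frame)
```

In a real pipeline the mask would usually be cleaned up (for example with morphological filtering) before features are extracted, and the background update would run on every frame.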
An ontology for the domain of human behavior analysis is developed. The ontology defines the basic set of events and scenarios, which are then used to annotate the video sequences and so enable semantic search and retrieval in video content. A Bayesian network that classifies the events and scenarios in the videos is constructed from the domain ontology.
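To make the classification step concrete, here is a minimal sketch of a two-level Bayesian network in which a single event node is the parent of the observed feature nodes. The event names, feature names, and every probability below are hypothetical placeholders; in the described system this structure would come from the domain ontology, and the article does not give these values.

```python
# Hypothetical prior over events, P(event).
PRIOR = {"walking": 0.6, "running": 0.3, "standing": 0.1}

# Hypothetical conditional probability tables, P(feature value | event).
CPTS = {
    "speed": {
        "walking": {"none": 0.1, "slow": 0.7, "fast": 0.2},
        "running": {"none": 0.05, "slow": 0.1, "fast": 0.85},
        "standing": {"none": 0.85, "slow": 0.1, "fast": 0.05},
    },
    "shape": {
        "walking": {"tall": 0.9, "wide": 0.1},
        "running": {"tall": 0.8, "wide": 0.2},
        "standing": {"tall": 0.95, "wide": 0.05},
    },
}

def posterior(evidence):
    """Return P(event | evidence), assuming the features are conditionally
    independent given the event (each feature has the event as its only parent)."""
    scores = {}
    for event, prior in PRIOR.items():
        score = prior
        for feature, value in evidence.items():
            score *= CPTS[feature][event][value]
        scores[event] = score
    total = sum(scores.values())
    return {event: score / total for event, score in scores.items()}

post = posterior({"speed": "fast", "shape": "tall"})
best_event = max(post, key=post.get)
```

A network derived from a full ontology would typically have more layers (for instance, scenarios as parents of events), but the inference step is the same application of Bayes' rule.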
Finally, preliminary testing of the proposed system is carried out on the CAVIAR test video dataset, and the results are promising.
[ © Odessa National Polytechnic University, 2014. Any use of information from the site is permitted only on condition that a link to the source is provided. ]