A Robust General Method for Text Localization in Natural Images

Authors

Faculty of Engineering, Shahed University, Tehran, Iran

Abstract

This paper presents a general, accurate, and fast automatic method for detecting multilingual text in color camera images and video with complex backgrounds. The method is built on edge-based connected components and operates at the block, word, and text-line levels. It is robust to color, text size, camera angle, curvature of the imaged surface, and uneven illumination, and it performs well on images with complex backgrounds and in different languages. In experiments on diverse natural images from standard datasets, it achieves a recall of 0.90, a precision of 0.85, an F-measure of 0.87, and an MDR of 13% at the text-line level, and a recall of 70%, a precision of 74%, and an F-measure of 71% at the word level. Compared with recent methods in the literature, it improves recall, precision, F-measure, and MDR at both the text-line and word levels, and it can detect text in diverse data, which makes it suitable for supporting very large image collections.
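The reported F-measures can be sanity-checked against the reported precision and recall. The sketch below assumes the conventional definition of the F-measure as the (optionally weighted) harmonic mean of precision and recall; the abstract does not state which definition or averaging scheme the authors used, and the function name `f_measure` is purely illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F_beta score: weighted harmonic mean of precision and recall.

    Assumption: this is the conventional definition, not necessarily
    the exact evaluation protocol used in the paper.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0  # both values are zero, so the score is defined as zero
    return (1 + beta ** 2) * precision * recall / denom


# Values quoted in the abstract.
line_f = f_measure(precision=0.85, recall=0.90)  # text-line level
word_f = f_measure(precision=0.74, recall=0.70)  # word level

print(f"text-line F-measure: {line_f:.3f}")
print(f"word F-measure:      {word_f:.3f}")
```

Under this assumption the text-line figures give about 0.874, in line with the reported 0.87. The word-level figures give about 0.719, marginally above the reported 71%; a gap of this size is consistent with the precision and recall having been rounded before publication.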

Keywords


 

Volume 15, Issue 1
Spring and Summer
Ordibehesht 1396 (April/May 2017)