A Reliable General Method for Text Localization in Natural Images

Aminollah Mahabadi, Alireza Zarei

Abstract

This paper presents a general, accurate, fast, and fully automatic method for detecting multilingual text in color camera images and video frames with complex backgrounds. The method is based on edge-based connected components and localizes text at the block, word, and text-line levels. It is robust to variations in color, text size, camera angle, surface curvature, and uneven illumination, and it performs well on images with complex backgrounds and different languages. In experiments on diverse natural images from standard datasets, the method achieves a recall of 0.90, a precision of 0.85, an F-measure of 0.87, and an MDR of 13% at the text-line level, and a recall of 70%, a precision of 74%, and an F-measure of 71% at the word level. Compared with the most recent methods, it improves recall, precision, F-measure, and MDR at both the text-line and word levels, and it can detect text in diverse data, which supports its use on massive image collections.
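As a quick consistency check on the reported figures, the sketch below recomputes the F-measure from precision and recall. It assumes the F index is the usual harmonic mean of the two, which the abstract does not state explicitly; the function name `f_measure` is chosen here for illustration only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F index)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
line_level = f_measure(precision=0.85, recall=0.90)
word_level = f_measure(precision=0.74, recall=0.70)

print(f"text-line level F: {line_level:.3f}")
print(f"word level F:      {word_level:.3f}")
```

Under this assumption the text-line figure comes out at about 0.874, in line with the reported 0.87. The word-level figure comes out at about 0.719, slightly above the reported 71%; the gap may be due to rounding of the published precision and recall, or to a different averaging scheme in the original evaluation.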

Keywords

Image processing, color images, video images, text localization, connected-component-based method, reliable method
