A Model for Text Localization in Big Image Data

Author

Faculty of Engineering, Shahed University, Tehran, Iran

Abstract

Conventional methods for localizing text in natural images must search a large space to achieve detection that is accurate, correct, fast, and efficient, and scaling them to big image data turns the task into a hard, intractable problem. The computational load can be reduced, with the aim of controlling the search space, resource consumption, and cost, by applying software techniques on multi-core hardware, grid structures, and cloud computing. Localizing and optically reading text at the paragraph, text-line, word, and character levels in big data increases this complexity sharply. This paper reviews general non-learning methods for text localization in small-volume images and the conventional methods for large-volume natural image data. It then presents a model for processing big image data, built around a strategic image analyzer that uses intelligent agents and smart bots, so that robots can localize text. The model is described alongside a range of datasets and evaluation metrics.

Volume 18, Issue 1
Spring and Summer
Ordibehesht 1399 (April/May 2020)