License Plate Localization for Low Computation Resources Systems Using Raw Image Input and Artificial Neural Network

Authors

  • Tjong Wan Sen, Faculty of Computing, President University
  • Sinung Suakanto, Electrical Engineering Department, Institut Teknologi Harapan Bangsa
  • Amril Mutoi Siregar, Faculty of Engineering and Computer Science, Buana Perjuangan University

DOI:

https://doi.org/10.61769/telematika.v15i1.349

Keywords:

computer and information processing, image analysis, image processing, object detection, license plate localization

Abstract

License plate localization using computer vision requires substantial computation resources, which makes it hard to deploy on small systems. This paper presents an efficient license plate localization method that uses raw image input and an artificial neural network. Efficiency is achieved by eliminating the feature extraction stage and keeping the neural network architecture as small as possible. The raw images in the dataset were cropped and labelled manually from random car images and video frames. The minimal architecture of the model has only three layers and 32,770 neurons, which is feasible to deploy on most of today's single-chip systems. Across various experiments, the method achieved more than 90% localization accuracy.
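The core idea in the abstract can be sketched as a small fully connected network that maps flattened raw pixel values directly to a plate location, with no hand-crafted feature extraction in between. The sketch below is purely illustrative: the layer sizes, the 64x64 grayscale input, and the two-coordinate output are assumptions for demonstration, not the paper's actual configuration of three layers and 32,770 neurons.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Random weights for a small MLP; sizes = [input, hidden, output]."""
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((m, n)) * 0.01, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """x: flattened raw image, pixel values scaled to [0, 1]."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on the hidden layer only
    return x                        # e.g. (x, y) centre of the plate

# Hypothetical usage: one 64x64 grayscale frame -> 2 output coordinates
params = init_mlp([64 * 64, 32, 2])
coords = forward(params, np.zeros(64 * 64))
print(coords.shape)  # (2,)
```

Because the input is the raw image itself, the per-frame cost is just two matrix multiplications, which is what makes such a model plausible for single-chip deployment.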


Author Biographies

Tjong Wan Sen, Faculty of Computing, President University

He received his Ph.D. degree from Institut Teknologi Bandung in 2009. Since 2010, he has been with the Faculty of Computing, President University, Jababeka, Indonesia, where he is currently a lecturer and researcher. His research interests include automatic speech/speaker recognition, computer vision, artificial intelligence, and embedded systems.

Sinung Suakanto, Electrical Engineering Department, Institut Teknologi Harapan Bangsa

He received his Ph.D. degree from Institut Teknologi Bandung. Since 2006, he has been with Institut Teknologi Harapan Bangsa, Bandung, Indonesia, as a lecturer and researcher. His research interests include application development, computer and telecommunication networks, artificial intelligence, IoT, and big data.

Amril Mutoi Siregar, Faculty of Engineering and Computer Science, Buana Perjuangan University

He is currently a Ph.D. candidate at IPB University (Institut Pertanian Bogor). Since 2018, he has been with the Faculty of Engineering and Computer Science, Buana Perjuangan University, Karawang, Indonesia, where he is currently a lecturer and researcher. His research interests include computational intelligence and optimization, computer vision, data mining, text mining, machine learning, and deep learning.


Published

2020-12-31

Section

Articles