ROBUST AUTOMATIC PHONEME RECOGNITION FEATURES USING COMPLEX WAVELET PACKET TRANSFORM COEFFICIENTS
DOI:
https://doi.org/10.61769/telematika.v6i1.41
Keywords:
automatic phoneme recognition, robust, noise, complex wavelet packet transform
Abstract
To improve the performance of automatic phoneme recognition in noisy environments, we developed a new technique that estimates a clean phoneme feature from its noisy counterpart. These robust features are obtained from Complex Wavelet Packet Transform (CWPT) coefficients. Because the CWPT coefficients represent all the distinct frequency bands of the input signal, decomposing the input signal into a complete CWPT tree covers every frequency involved in the recognition process. Each frequency component of the input signal is placed in exactly one frequency band. For a mixture of time-overlapping signals with different frequency content, e.g. a phoneme with noise, all phoneme coefficients belonging to the same frequency band, that is, all coefficients that pass through the same wavelet filter bank path, change according to the magnitude of the noise frequency components. Thus, if a frequency band contains no noise at all, none of the phoneme coefficients in that band change. The information carried by all coefficients in that band can then be used to estimate the likely clean phoneme. Since the number of phonemes in a language is limited, relatively small, and well known in advance, the technique is computationally feasible. Simulation results showed that the new technique is an efficient feature extractor that not only improves the robustness of automatic phoneme recognition systems under a variety of noisy conditions but also preserves their good performance in clean environments.
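The band-wise argument in the abstract can be illustrated with a small sketch. It is not the authors' CWPT: as a simplifying assumption it uses a real-valued Haar wavelet packet tree, and the toy "phoneme" signal, the noise signal, and all function names are hypothetical. The noise is chosen so that it falls entirely into one packet band, which makes it easy to see that the coefficients of every other band are left unchanged.

```python
import math

def haar_split(x):
    """One Haar analysis step: return the (low-pass, high-pass) halves of x."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def wavelet_packet(x, levels):
    """Full wavelet packet tree: map each filter path ('a'/'d' string) to its leaf coefficients."""
    nodes = {"": list(x)}
    for _ in range(levels):
        next_nodes = {}
        for path, coeffs in nodes.items():
            low, high = haar_split(coeffs)
            next_nodes[path + "a"] = low   # low-pass branch
            next_nodes[path + "d"] = high  # high-pass branch
        nodes = next_nodes
    return nodes

n = 16
# Hypothetical toy "phoneme": two slow sinusoids (an assumption, not real speech).
phoneme = [math.sin(2 * math.pi * i / n) + 0.5 * math.sin(2 * math.pi * 3 * i / n)
           for i in range(n)]
# Hypothetical noise: an alternating +/-0.8 sequence, which the Haar filters
# route entirely into the band reached by the path "d" then "a".
noise = [0.8 * (-1) ** i for i in range(n)]
noisy = [p + q for p, q in zip(phoneme, noise)]

clean_tree = wavelet_packet(phoneme, levels=2)
noisy_tree = wavelet_packet(noisy, levels=2)

# Largest coefficient change per band caused by adding the noise.
max_change = {
    path: max(abs(a - b) for a, b in zip(clean_tree[path], noisy_tree[path]))
    for path in clean_tree
}
unchanged = sorted(path for path, change in max_change.items() if change < 1e-9)
print(unchanged)  # ['aa', 'ad', 'dd']
```

In this toy case only the band `da` is disturbed, so the coefficients in `aa`, `ad`, and `dd` still describe the clean signal. A practical system would not have the clean tree available for comparison; the comparison here only makes the unchanged bands visible.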
License
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.