Big Data Analysis Based on Stream Processing Using Apache Spark
DOI:
https://doi.org/10.61769/telematika.v11i1.145
Keywords:
Big Data, Real-Time, Stream Processing, Apache Spark, Open-source software
Abstract
Big data technology has three main characteristics: volume, high velocity, and complexity. Processing big data is not easy because the data must be processed in real time. Many companies face obstacles in processing big data, including data storage that is unstructured, incomplete, and difficult to access. Several methods exist for processing big data, namely tuple-based, micro-batching, and windowed real-time stream processing. The method used in this study is windowed real-time stream processing. Implementing stream processing requires specific software, in this case Apache Spark. Apache Spark is open-source software used to analyze and process streaming data. Applying Apache Spark to big data begins with an integration stage, whose purpose is to let Apache Spark obtain the data to be analyzed. The end result of applying this method is a system that can help companies process big data.
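As a rough illustration of the windowed approach named in the abstract, the following minimal sketch groups timestamped events into fixed-size, non-overlapping (tumbling) windows and counts events per key in each window. It uses plain Python rather than Apache Spark, and the function name `tumbling_window_counts`, the event layout `(timestamp_seconds, key)`, and the 10-second window are all assumptions made for illustration; none of them come from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event layout: (timestamp in seconds, key)
Event = Tuple[float, str]


def tumbling_window_counts(
    events: Iterable[Event], window_seconds: float
) -> Dict[float, Counter]:
    """Group events into fixed, non-overlapping windows and count keys per window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    windows: Dict[float, Counter] = defaultdict(Counter)
    for timestamp, key in events:
        # An event belongs to the window whose start is the largest
        # multiple of window_seconds that does not exceed its timestamp.
        window_start = (timestamp // window_seconds) * window_seconds
        windows[window_start][key] += 1
    return dict(windows)


if __name__ == "__main__":
    # Illustrative sample data only.
    sample = [
        (1.0, "login"),
        (4.5, "click"),
        (9.9, "click"),
        (12.0, "login"),
        (19.0, "click"),
    ]
    for start, counts in sorted(tumbling_window_counts(sample, 10.0).items()):
        print(start, dict(counts))
```

In a real deployment the stream processing engine performs this grouping itself. Spark Streaming, for example, provides its own window operations, so the sketch only conveys the idea of assigning each event to a time window before aggregating.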
License
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.