Hadoop Ecosystem Enhances Data Analytics for Music Streaming: A Case Study of User Behavior in the Last FM Dataset
DOI:
https://doi.org/10.58776/ijitcsa.v2i3.166
Keywords:
Hadoop ecosystem, HDFS, Apache Pig, Big data, User behavior analysis
Abstract
This paper proposes a big data pipeline for analyzing user behavior on Last.fm, a popular music streaming platform that collects large amounts of user activity data. The aim is to make data-driven recommendations for improving user engagement and attracting new users. The new pipeline uses the Hadoop Distributed File System (HDFS) for data storage and Apache Pig for data transformation, leading to improved data preprocessing and analysis. Several analyses are conducted: identifying the most listened-to artists, the top users by song consumption and social connections, artist popularity by tag, and the most recently tagged artists. The findings give insight into user preferences and current trends, and point to opportunities for improving the recommendation algorithm and user engagement. The paper concludes with recommendations for personalized marketing strategies and curated playlists to increase user satisfaction and revenue.
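As a rough illustration of the "most listened-to artists" analysis, the sketch below mirrors the group, sum, order, and limit steps that such a query would typically take. It is not the paper's Pig script: it is written in plain Python, and the record layout `(user_id, artist_name, play_count)`, the function name `top_artists`, and the sample rows are all assumptions invented for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical record layout (an assumption, not the paper's schema):
# (user_id, artist_name, play_count)
Record = Tuple[str, str, int]


def top_artists(records: Iterable[Record], n: int = 3) -> List[Tuple[str, int]]:
    """Sum play counts per artist and return the n most listened-to artists."""
    totals: Counter = Counter()
    for _user_id, artist, plays in records:
        totals[artist] += plays
    # Order by total plays descending, then by name so ties are deterministic.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Invented sample rows, for illustration only.
sample = [
    ("u1", "Radiohead", 120),
    ("u2", "Radiohead", 30),
    ("u1", "Bjork", 75),
    ("u3", "Portishead", 75),
    ("u2", "Bjork", 10),
]

print(top_artists(sample, n=2))
```

The same shape (group by a key, aggregate, sort, keep the top rows) would apply to the other rankings the abstract lists, such as top users by song consumption, with a different grouping key.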
License
Copyright (c) 2024 Akkord Elizade
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.