https://ejurnal.jejaringppm.org/index.php/jitcsa/issue/feedInternational Journal of Information Technology and Computer Science Applications2026-02-21T19:55:09+07:00Dr. Herison Surbaktieditor@ejurnal.jejaringppm.orgOpen Journal Systems<table border="0" width="100%"> <tbody> <tr> <td align="justify" valign="top"><strong>ISSN Print (2964-3139) based on Decree Number 29643139/II.7.4/SK.ISSN/01/2023 dated January 18, 2023;</strong> <p><strong>ISSN Online (2985-5330) based on Decree Number 29855330/II.7.4/SK.ISSN/02/2023 dated February 15, 2023</strong></p> <p><strong>URL: <a href="https://ejurnal.jejaringppm.org/index.php/jitcsa">https://ejurnal.jejaringppm.org/index.php/jitcsa</a></strong></p> <p><strong>The International Journal of Information Technology and Computer Science Applications (IJITCSA)</strong> is a publication devoted to information technology and computer science; applications of both fields to real-world cases are also welcome. IJITCSA accepts research articles, systematic reviews, literature studies, and other relevant manuscripts. The journal covers information technology and related areas, as well as computer science fields such as artificial intelligence, data science, data mining, machine learning, and deep learning. <br /><br />IJITCSA is published three times a year, in January, May, and September. The first issue, in January 2023, contained eight articles.</p> </td> </tr> </tbody> </table>https://ejurnal.jejaringppm.org/index.php/jitcsa/article/view/204Enhancing Association Rule Mining with Metaheuristic Parameter Optimization: A Transactional Data Analysis in Micro-Enterprise Context2025-07-14T14:20:45+07:00Ferdy Hartanto Primanda Primanda202010225177@mhs.ubharajaya.ac.idTb Ai Munandartb.aimunandar@dsn.ubharajaya.ac.idKhairunnisa Fadhilla Ramdhaniakhairunnisa.fadhilla@dsn.ubharajaya.ac.id<p>Nasi Uduk Mama Ipan is a micro-enterprise that conducts sales through both offline and online platforms. 
However, only online transaction data is available in analyzable form, while the owner lacks the knowledge to process it. This situation highlights the urgency of leveraging data mining techniques to uncover hidden patterns that can inform effective promotional strategies. This study aims to apply association rule mining using the Apriori and FP-Growth algorithms, enhanced through metaheuristic-based hyperparameter tuning, to extract meaningful product bundling insights from transactional data. The research begins with data preprocessing, which involves eliminating irrelevant columns and transforming transactional records into a binary format. Four metaheuristic algorithms (Genetic Algorithm, Ant Colony Optimization, Particle Swarm Optimization, and Simulated Annealing) are employed to determine optimal support and confidence values for both Apriori and FP-Growth. The modeling phase is conducted in Python with the mlxtend.frequent_patterns library, with rules filtered using a lift ratio threshold above 1. Results show that both the Apriori and FP-Growth algorithms produce identical bundling recommendations when using parameters derived from the Genetic Algorithm. Apriori performs faster, while FP-Growth is more memory-efficient. This study demonstrates that combining association rule mining with metaheuristic optimization can effectively support MSMEs in making data-driven marketing decisions.</p>2026-02-25T00:00:00+07:00Copyright (c) 2026 Ferdy Hartanto Primanda Primanda, Tb Ai Munandar, Khairunnisa Fadhilla Ramdhaniahttps://ejurnal.jejaringppm.org/index.php/jitcsa/article/view/212Public Sentiment Analysis on the Service Quality of PT PLN on X Using Naïve Bayes and K-Nearest Neighbor Algorithms.2025-07-16T11:48:03+07:00Nurul Zahra202110715022@mhs.ubharajaya.ac.idWowon Priatnawowon.periatna@dsn.ubharajaya.ac.idTyastuti Sri Lestarityas@ubharajaya.ac.id<p>PT PLN is expected to provide quality services, since electricity is a primary public need. However, numerous complaints still highlight PLN’s lack of responsiveness, especially on the X platform (formerly Twitter). 
This study aims to analyze public sentiment toward PLN’s service quality expressed on X and compare the performance of the Naïve Bayes and K-Nearest Neighbor (KNN) algorithms in classifying sentiments into positive, negative, and neutral categories. The research employs the Knowledge Discovery in Databases (KDD) approach, involving data collection through tweet scraping using Tweet-Harvest, preprocessing (case folding, tokenizing, filtering, stemming), transformation with TF-IDF weighting, and data mining using Naïve Bayes and KNN. Evaluation through a confusion matrix shows that Naïve Bayes achieved an accuracy of 87%, outperforming KNN with an accuracy of 86%. These findings provide insights for PLN to better understand public perception and serve as a reference for future sentiment analysis research using machine learning.</p>2026-02-25T00:00:00+07:00Copyright (c) 2026 Nurul Zahra, Wowon Priatna, Tyastuti Sri Lestarihttps://ejurnal.jejaringppm.org/index.php/jitcsa/article/view/245Data Infrastructure Application in Education: An Integrated Architecture for Secure Learning Analytics and Student Performance Prediction2026-02-21T19:54:16+07:00Dinesh Pranav Mukerjeadin.muker@diu.ac<p>Data infrastructure has become a strategic backbone of contemporary education because digital learning environments continuously generate student traces that can be transformed into actionable evidence for teaching, advising, and institutional planning. Yet the practical value of educational data depends on much more than storage capacity. Institutions must integrate heterogeneous sources, manage raw and curated data simultaneously, enforce privacy constraints, and deliver analytics outputs that are operationally useful and ethically defensible. This study develops a layered educational data infrastructure architecture that connects raw learning data, extract-transform-load processes, governance mechanisms, curated analytics repositories, and machine-learning services. 
This paper includes a reproducible empirical evaluation using the real xAPI-Edu-Data benchmark collected from the Kalboard 360 learning management environment. Three machine-learning models are compared under a common preprocessing pipeline, and an ablation analysis quantifies the incremental value of integrated behavioral, parental, and contextual features. The best-performing model achieves a test macro-F1 of 0.797 and a macro one-vs-rest ROC-AUC of 0.919, while the ablation study shows that the full integrated feature set clearly outperforms demographic-only and behavior-only alternatives. The paper contributes a structured architecture, a mathematical formalization of integrated learning analytics, and empirical evidence that richer, better-governed data pipelines produce more useful predictive signals for educational decision support.</p>2026-03-29T00:00:00+07:00Copyright (c) 2026 Dinesh Pranav Mukerjeahttps://ejurnal.jejaringppm.org/index.php/jitcsa/article/view/244Application of data warehouse and OLAP processes for retail analytics2026-02-21T19:55:09+07:00Victorio Palben Medelvipaldel@mapua.edu.ph<p>Retail organizations increasingly rely on heterogeneous operational platforms, including point-of-sale systems, customer relationship management applications, cloud data stores, and locally administered databases. Although these platforms are valuable for transaction processing, they often generate fragmented, duplicated, and semantically inconsistent data that constrain enterprise reporting, forecasting, and customer intelligence. This paper substantially extends a conceptual SwiftMart case into a full design-and-evaluation study of a retail data warehouse and Online Analytical Processing (OLAP) framework. The proposed artifact combines a Kimball-style dimensional architecture, a governed extract-transform-load (ETL) pipeline, conformed dimensions, and materialized OLAP summaries for managerial analytics. 
To ground the case empirically, the framework is evaluated using the open-access UCI Online Retail dataset, which contains 541,909 transaction records from a UK-based online retailer covering 1 December 2010 to 9 December 2011. The experiment transforms raw transactions into a star schema with 524,878 curated fact rows, 19,960 orders, 4,355 customer members, 4,158 product members, and 38 countries. Four representative analytical workloads are benchmarked across three storage designs: a normalized operational data store, a dimensional warehouse, and materialized aggregate tables. The dimensional warehouse reduces mean latency by 42.3% relative to baseline joins, while materialized aggregates reduce latency by approximately 99.9%. A forecasting demonstration on warehouse-generated daily revenue aggregates further shows that a random forest model outperforms a naive benchmark, achieving an RMSE of 23,715.84 versus 34,055.29. The paper contributes an end-to-end reference architecture for retail analytics, together with dimensional design rationale, mathematical formulations, algorithms, empirical results, and implementation guidance relevant to both academic researchers and practitioners.</p>2026-04-03T00:00:00+07:00Copyright (c) 2026 Victorio Palben Medelhttps://ejurnal.jejaringppm.org/index.php/jitcsa/article/view/248A Lakehouse-Oriented Big Data Infrastructure for Educational Analytics: Integrating Administrative and Assessment Data for Early Student Risk Prediction2026-02-21T19:52:19+07:00Bhairav Kaphlebhairavphle@gmail.comBiswajit Shresthabiswajrestha@gmail.com<p>Educational institutions increasingly depend on heterogeneous digital systems, yet many analytics initiatives remain fragmented across student information, registration, assessment, and learning platforms. 
This paper proposes a lakehouse-oriented big data infrastructure for educational analytics and validates it through a reproducible early-risk prediction study using the Open University Learning Analytics Dataset (OULAD). The study integrates five public OULAD tables (student information, course registration, assessment metadata, student assessment submissions, and course presentation metadata) into temporally valid feature tables aligned to the student–module–presentation level. We define a windowed feature engineering framework that constructs actionable indicators such as submission rate, weighted completion score, average submission lag, and assessment coverage gap at 30%, 50%, 70%, and 100% of the course timeline. Two supervised classifiers, logistic regression and random forest, are evaluated under a stratified 80/20 protocol. The results show that administrative data alone provides weak discrimination (AUC 0.673), whereas integrated mid-course assessment evidence substantially improves performance. At the 50% course window, the random-forest model achieves an AUC of 0.947, F1 of 0.879, and recall of 0.829; even at the 30% window the model already reaches an AUC of 0.904. These findings demonstrate that the value of educational prediction depends not only on model choice but also on data integration architecture. 
The paper contributes (i) a lakehouse-oriented reference architecture for higher-education analytics, (ii) a temporally constrained feature engineering strategy for early-warning systems, and (iii) an empirical ablation showing that multi-source integration yields large and operationally meaningful gains.</p>2026-04-13T00:00:00+07:00Copyright (c) 2026 Bhairav Kaphle, Biswajit Shresthahttps://ejurnal.jejaringppm.org/index.php/jitcsa/article/view/247Revisiting the IBM Retail Data Warehouse: A Governed One-Column Architecture and Reproducible Open-Dataset Validation for Retail Analytics2026-02-21T19:52:52+07:00Nayananda Karunaratnenayarunar11@gmail.comPulasthi Medhanandamedhlasthi21@gmail.com<p>The IBM Retail Data Warehouse (RDW) correctly recognized the importance of integrated retail data, but it remained largely descriptive, did not formalize the underlying architecture, and lacked a reproducible empirical validation. This paper reconstructs and substantially extends that early proposal into a publication-ready research article. We first synthesize the historical IBM RDW, Retail Data Warehouse Model (RDWM), Retail Services Data Model (RSDM), and Retail Business Solution Template (RBST) concepts with contemporary data warehousing, data governance, and retail analytics literature. We then propose a governed, RDW-informed logical architecture that separates ingestion, quality control, conformed dimensional modeling, analytics marts, and decision-support services. To move beyond conceptual discussion, we instantiate the architecture with an open retail dataset from the UCI Machine Learning Repository containing 541,909 transactions. After governance-oriented preprocessing, the final analytical mart contains 392,692 valid rows, 18,532 orders, 4,338 customers, 3,665 products, and 37 countries. 
We formulate the transformation and forecasting workflow mathematically, define an end-to-end algorithmic pipeline, and evaluate a retail revenue forecasting task using naive, seasonal naive, linear regression, ridge regression, random forest, and gradient boosting baselines. On the hold-out test window, the best model (linear regression on warehouse-engineered features) achieves an RMSE of 4,302.61 GBP and R<sup>2</sup>=0.9766, while a raw, ungoverned pipeline yields a much weaker RMSE of 10,068.59 GBP. This corresponds to a 57.27% reduction in RMSE attributable to governance and dimensional integration. The results show that the practical value of an RDW-like architecture is not merely organizational; when implemented as a governed analytical platform, it measurably improves reproducibility, interpretability, and forecasting quality.</p>2026-04-13T00:00:00+07:00Copyright (c) 2026 Nayananda Karunaratne, Pulasthi Medhanandahttps://ejurnal.jejaringppm.org/index.php/jitcsa/article/view/246Healthcare Data Integration Through Enterprise Data Warehousing: Architecture, Conformance Pipeline, and Experimental Validation for Readmission Analytics2026-02-21T19:53:48+07:00La Duy Ngônladuyngon@ctu.edu.vn<p>Healthcare organizations operate a fragmented digital landscape in which hospital information systems (HIS), electronic health records (EHR), laboratory systems, billing platforms, and departmental applications are optimized for transaction processing but not for integrated analysis. The resulting interoperability gaps, semantic inconsistency, duplicated records, and uneven data quality constrain enterprise reporting and limit higher-value analytics. This paper proposes an implementable enterprise data warehouse architecture, formalizes its data-quality and conformance mechanisms, and validates the design through an experimental analytics use case. 
The proposed framework combines an integration layer for ETL/ELT, conformed dimensions, departmental marts, governance controls, and an analytics layer for OLAP and machine learning. To demonstrate practical value, the paper evaluates the framework on a de-identified inpatient diabetes dataset comprising 101,766 encounters and 50 raw attributes. The experimental pipeline performs profiling, conformance mapping, diagnosis grouping, missing-value treatment, and dimensional modeling before training benchmark readmission models. The best ranking performance is obtained by XGBoost with an AUROC of 0.688 and an AUPRC of 0.235, while threshold tuning improves recall-oriented operational utility. The results show that healthcare warehousing should not be framed merely as centralized storage; rather, it is an architectural mechanism for interoperability, data quality control, reproducible analytics, and decision support. The manuscript concludes with implementation guidance and limitations relevant to hospitals seeking a scalable, governance-aware warehousing program.</p>2026-04-14T00:00:00+07:00Copyright (c) 2026 La Duy Ngôn