Application of data warehouse and OLAP processes for retail analytics
DOI: https://doi.org/10.58776/ijitcsa.v4i1.244
Keywords: Data Warehouse, Sales Data, Retail Analytics, Independent Data Mart, OLAP, ETL
Abstract
Retail organizations increasingly rely on heterogeneous operational platforms, including point-of-sale systems, customer relationship management applications, cloud data stores, and locally administered databases. Although these platforms are valuable for transaction processing, they often generate fragmented, duplicated, and semantically inconsistent data that constrain enterprise reporting, forecasting, and customer intelligence. This paper substantially extends a conceptual SwiftMart case into a full design-and-evaluation study of a retail data warehouse and Online Analytical Processing (OLAP) framework. The proposed artifact combines a Kimball-style dimensional architecture, a governed extract-transform-load (ETL) pipeline, conformed dimensions, and materialized OLAP summaries for managerial analytics. To ground the case empirically, the framework is evaluated using the open-access UCI Online Retail dataset, which contains 541,909 transaction records from a UK-based online retailer covering 1 December 2010 to 9 December 2011. The experiment transforms raw transactions into a star schema with 524,878 curated fact rows, 19,960 orders, 4,355 customer members, 4,158 product members, and 38 countries. Four representative analytical workloads are benchmarked across three storage designs: a normalized operational data store, a dimensional warehouse, and materialized aggregate tables. The dimensional warehouse reduces mean latency by 42.3% relative to baseline joins, while materialized aggregates reduce latency by approximately 99.9%. A forecasting demonstration on warehouse-generated daily revenue aggregates further shows that a random forest model outperforms a naive benchmark, achieving an RMSE of 23,715.84 versus 34,055.29. 
The paper contributes an end-to-end reference architecture for retail analytics, together with dimensional design rationale, mathematical formulations, algorithms, empirical results, and implementation guidance relevant to both academic researchers and practitioners.
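As an illustration of the pipeline the abstract describes, the following minimal sketch loads a few raw transactions into a star schema and then materializes a daily revenue aggregate. It is not the paper's implementation: the table and column names (`dim_product`, `fact_sales`, `agg_daily_revenue`, and so on), the cleaning rule (drop rows with non-positive quantity or price), the `UNKNOWN` customer member, and the sample rows are all assumptions made for this example. The raw columns only loosely follow the layout of the UCI Online Retail dataset.

```python
import sqlite3

# Hypothetical sample rows: (invoice_no, stock_code, quantity, invoice_date,
# unit_price, customer_id, country). Values are illustrative only.
RAW = [
    ("536365", "85123A", 6, "2010-12-01", 2.55, 17850, "United Kingdom"),
    ("536365", "71053", 6, "2010-12-01", 3.39, 17850, "United Kingdom"),
    ("536370", "22728", 24, "2010-12-01", 3.75, 12583, "France"),
    ("C536379", "D", -1, "2010-12-01", 27.50, 14527, "United Kingdom"),
    ("536381", "22728", 2, "2010-12-02", 3.75, None, "United Kingdom"),
]

SCHEMA = """
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, stock_code  TEXT UNIQUE);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_id TEXT UNIQUE);
CREATE TABLE dim_country  (country_key  INTEGER PRIMARY KEY, country     TEXT UNIQUE);
CREATE TABLE dim_date     (date_key     INTEGER PRIMARY KEY, full_date   TEXT UNIQUE);
CREATE TABLE fact_sales (
    invoice_no   TEXT,     -- degenerate dimension (order identifier)
    product_key  INTEGER REFERENCES dim_product,
    customer_key INTEGER REFERENCES dim_customer,
    country_key  INTEGER REFERENCES dim_country,
    date_key     INTEGER REFERENCES dim_date,
    quantity     INTEGER,
    revenue      REAL
);
"""


def surrogate_key(con, table, column, value):
    """Insert the natural key if it is new, then return its surrogate key."""
    con.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = con.execute(
        f"SELECT rowid FROM {table} WHERE {column} = ?", (value,)
    ).fetchone()
    return row[0]


def build_warehouse(rows):
    """Transform raw rows into a star schema plus one materialized aggregate."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    for invoice, code, qty, day, price, customer, country in rows:
        # Assumed cleaning rule: skip cancellations and zero-priced lines.
        if qty <= 0 or price <= 0:
            continue
        # Assumed convention: missing customers map to one UNKNOWN member.
        customer_id = str(customer) if customer is not None else "UNKNOWN"
        con.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?, ?)",
            (
                invoice,
                surrogate_key(con, "dim_product", "stock_code", code),
                surrogate_key(con, "dim_customer", "customer_id", customer_id),
                surrogate_key(con, "dim_country", "country", country),
                surrogate_key(con, "dim_date", "full_date", day),
                qty,
                qty * price,
            ),
        )
    # Materialized aggregate: precomputed once so reports avoid the join.
    con.execute(
        """
        CREATE TABLE agg_daily_revenue AS
        SELECT d.full_date AS full_date, SUM(f.revenue) AS revenue
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY d.full_date
        """
    )
    con.commit()
    return con


con = build_warehouse(RAW)
for full_date, revenue in con.execute(
    "SELECT full_date, revenue FROM agg_daily_revenue ORDER BY full_date"
):
    print(full_date, round(revenue, 2))
```

A daily series such as `agg_daily_revenue` is the kind of warehouse-generated aggregate that a forecasting model could consume. The precomputed table also shows, in miniature, why materialized summaries can answer repeated queries far faster than joins over the fact table.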
License
Copyright (c) 2026 Victorio Palben Medel

This work is licensed under a Creative Commons Attribution 4.0 International License.


