Healthcare Data Integration Through Enterprise Data Warehousing: Architecture, Conformance Pipeline, and Experimental Validation for Readmission Analytics
DOI: https://doi.org/10.58776/ijitcsa.v4i1.246
Keywords: Data integration, Healthcare data warehouse, HIS, EHR, Interoperability, Dimensional modeling, ETL, Readmission analytics
Abstract
Healthcare organizations operate in a fragmented digital landscape in which hospital information systems (HIS), electronic health records (EHR), laboratory systems, billing platforms, and departmental applications are optimized for transaction processing but not for integrated analysis. The resulting interoperability gaps, semantic inconsistency, duplicated records, and uneven data quality constrain enterprise reporting and limit higher-value analytics. This paper proposes an implementable enterprise data warehouse architecture, formalizes its data-quality and conformance mechanisms, and validates the design through an experimental analytics use case. The proposed framework combines an integration layer for ETL/ELT, conformed dimensions, departmental marts, governance controls, and an analytics layer for OLAP and machine learning. To demonstrate practical value, the paper evaluates the framework on a de-identified inpatient diabetes dataset comprising 101,766 encounters and 50 raw attributes. The experimental pipeline performs profiling, conformance mapping, diagnosis grouping, missing-value treatment, and dimensional modeling before training benchmark readmission models. The best ranking performance is obtained by XGBoost, with an AUROC of 0.688 and an AUPRC of 0.235, while threshold tuning improves recall-oriented operational utility. The results show that healthcare warehousing should not be framed merely as centralized storage; rather, it is an architectural mechanism for interoperability, data quality control, reproducible analytics, and decision support. The manuscript concludes with implementation guidance and limitations relevant to hospitals seeking a scalable, governance-aware warehousing program.
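The conformance steps named in the abstract (missing-value treatment and diagnosis grouping) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the missing-value sentinels, the coding system, or the grouping scheme, so the sentinel set, the ICD-9 ranges, the column name diag_1, and all function names below are assumptions chosen for illustration.

```python
from typing import Optional

# ASSUMPTION: sentinel strings that stand for "missing" in the raw extract.
MISSING_TOKENS = {"", "?", "NULL", "Unknown/Invalid"}

def conform_missing(value: Optional[str]) -> Optional[str]:
    """Map assumed missing-value sentinels to None; otherwise strip whitespace."""
    if value is None:
        return None
    cleaned = value.strip()
    return None if cleaned in MISSING_TOKENS else cleaned

# ASSUMPTION: diagnoses are ICD-9 codes, grouped by the integer part of the code.
# Ranges are inclusive and purely illustrative of a diagnosis-grouping step.
ICD9_GROUPS = [
    ("Neoplasms", 140, 239),
    ("Circulatory", 390, 459),
    ("Respiratory", 460, 519),
    ("Digestive", 520, 579),
    ("Genitourinary", 580, 629),
    ("Musculoskeletal", 710, 739),
    ("Injury", 800, 999),
]
# Symptom codes that are folded into a body-system group in this sketch.
SYMPTOM_CODES = {785: "Circulatory", 786: "Respiratory",
                 787: "Digestive", 788: "Genitourinary"}

def group_diagnosis(code: Optional[str]) -> str:
    """Collapse a raw diagnosis code into a coarse, conformed category."""
    code = conform_missing(code)
    if code is None:
        return "Missing"
    if code[0] in ("E", "V"):  # supplementary ICD-9 codes
        return "Other"
    try:
        major = int(float(code))  # "250.83" -> 250
    except (ValueError, OverflowError):
        return "Other"
    if major == 250:
        return "Diabetes"
    if major in SYMPTOM_CODES:
        return SYMPTOM_CODES[major]
    for name, low, high in ICD9_GROUPS:
        if low <= major <= high:
            return name
    return "Other"

def conform_encounter(raw: dict) -> dict:
    """Apply both steps to one encounter row (hypothetical column names)."""
    row = {key: conform_missing(val) for key, val in raw.items()}
    row["diag_1_group"] = group_diagnosis(raw.get("diag_1"))
    return row
```

Calling conform_encounter on a row such as {"race": "?", "diag_1": "414"} would set race to None and add a diag_1_group of "Circulatory". Keeping the grouping in a lookup table rather than in branching logic makes the mapping easy to review and version under the governance controls the abstract describes.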
Copyright (c) 2026 La Duy Ngôn

This work is licensed under a Creative Commons Attribution 4.0 International License.


