Optimizing Artificial Intelligence Performance through Data Engineering Architectures and Machine Learning–Driven Analytics
Keywords:
Artificial Intelligence, Data Engineering, Machine Learning Analytics, Data Pipelines, Big Data Architecture, Model Optimization
Abstract
Artificial Intelligence (AI) systems increasingly rely on scalable data infrastructures and advanced analytics to achieve high performance and reliability. The efficiency of AI models is significantly influenced by the design of data engineering architectures and the integration of machine learning–driven analytics pipelines. This research explores how modern data engineering frameworks—such as distributed data processing, scalable storage systems, and automated pipelines—can enhance the accuracy, scalability, and efficiency of AI models. The study also examines how machine learning–driven analytics improves decision-making by optimizing data preparation, feature engineering, and model deployment processes.
The paper synthesizes findings from existing literature and proposes an integrated framework that combines robust data engineering architectures with machine learning analytics for performance optimization. The proposed framework highlights how efficient data ingestion, transformation, and management directly impact AI model training and inference performance. Additionally, the study presents diagrams illustrating architectural workflows and comparative tables demonstrating performance improvements across different data engineering approaches.
License
Copyright (c) 2024 Daniel John McCarthy, (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


