Optimizing Large Language Model Deployment with Scalable Inference and Ensemble Techniques (Gurupriya Adurthy, Trans.). (2025). International Journal of Engineering and Advanced Technology (IJEAT), 15(2), 9-14. https://doi.org/10.35940/ijeat.A4692.15021225