Retrieval-Augmented Generation (RAG) for Large Language Models: A Comprehensive Survey

Authors

  • Tanay Chowdhury, Data Science Lead, Gen AI Center of Innovation, Amazon Web Services, Seattle

DOI:

https://doi.org/10.63503/j.ijaimd.2026.233

Keywords:

Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), Knowledge-Augmented NLP, Hallucination Reduction, Adaptive Retrieval.

Abstract

Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for improving the factual accuracy, reliability, and adaptability of Large Language Models (LLMs) by coupling external information retrieval with text generation. Unlike a standalone LLM, which relies solely on parametric knowledge, RAG dynamically accesses up-to-date, non-parametric knowledge sources at inference time and grounds its responses in the retrieved evidence. This paper summarizes the foundations of RAG, its architecture, its major components (retriever and generator), and the indexing-retrieval-generation pipeline. It critically examines sparse, dense, and hybrid retrieval strategies, comparing their efficiency trade-offs, semantic expressiveness, interpretability, and suitability for domain-specific tasks such as healthcare. The paper further contrasts RAG with fine-tuning, highlighting differences in knowledge updating, customization, and hallucination reduction. Finally, it discusses advanced augmentation strategies, including iterative, recursive, and adaptive retrieval, for complex, multi-step reasoning tasks. Overall, the paper shows that RAG is a scalable and powerful approach to knowledge-intensive AI applications.
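To make the indexing-retrieval-generation pipeline and the sparse/dense/hybrid retrieval strategies described above concrete, the following is a minimal, self-contained Python sketch. It is illustrative only, not the survey's implementation: the feature-hashing embed function stands in for a trained dense encoder, the term-overlap score stands in for BM25-style lexical matching, and generate merely assembles an evidence-grounded prompt instead of calling a real LLM.

import math
from collections import Counter

# --- Indexing: chunk documents and build sparse + dense representations ---

def embed(text, dim=64):
    """Toy deterministic embedding via feature hashing (a real system
    would use a trained dense encoder)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def index(chunks):
    """Store each chunk with its term counts (sparse) and vector (dense)."""
    return [(c, Counter(c.lower().split()), embed(c)) for c in chunks]

# --- Retrieval: score chunks with sparse, dense, and fused strategies ---

def sparse_score(query_terms, chunk_terms):
    """Term-overlap score, a stand-in for BM25-style lexical matching."""
    return float(sum(chunk_terms[t] for t in query_terms))

def dense_score(query_vec, chunk_vec):
    """Cosine similarity; for unit vectors this is just a dot product."""
    return sum(q * c for q, c in zip(query_vec, chunk_vec))

def retrieve(query, idx, k=2, alpha=0.5):
    """Hybrid retrieval: linear fusion of sparse and dense scores."""
    q_terms, q_vec = query.lower().split(), embed(query)
    scored = [
        (alpha * sparse_score(q_terms, terms)
         + (1 - alpha) * dense_score(q_vec, vec), chunk)
        for chunk, terms, vec in idx
    ]
    return [chunk for _, chunk in sorted(scored, reverse=True)[:k]]

# --- Generation: condition the generator on retrieved, non-parametric evidence ---

def generate(query, evidence):
    """Placeholder for an LLM call; shows how retrieved chunks ground
    the response in external knowledge."""
    context = "\n".join(f"- {c}" for c in evidence)
    return f"Answer the question using only this evidence:\n{context}\nQ: {query}"

if __name__ == "__main__":
    chunks = [
        "RAG retrieves external documents at inference time.",
        "Fine-tuning updates model weights offline.",
        "Hybrid retrieval fuses sparse and dense scores.",
    ]
    idx = index(chunks)
    query = "How does RAG access knowledge?"
    print(generate(query, retrieve(query, idx)))

The alpha parameter is the usual fusion knob in hybrid retrieval: alpha = 1 recovers purely lexical (sparse) ranking, while alpha = 0 recovers purely semantic (dense) ranking, reflecting the efficiency and interpretability trade-offs the survey examines.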

Published

2026-04-08

How to Cite

Chowdhury, T. (2026). Retrieval-Augmented Generation (RAG) for Large Language Models: A Comprehensive Survey. International Journal on Engineering Artificial Intelligence Management, Decision Support, and Policies, 3(1), 09–21. https://doi.org/10.63503/j.ijaimd.2026.233

Section

Review Articles