Hybrid Sparse-Dense Retrieval-Augmented Generation: An Empirical Analysis with SPLADE and DPR Enhancing Domain-Specific Answer Generation through Complementary Retriever Synergies

Rathinasamy Muthusami (1), Saritha K (2)
(1) Department of Computer Applications, Dr. Mahalingam College of Engineering and Technology
(2) Department of Mathematics, PA College of Engineering and Technology
How to cite (COMIEN):
Muthusami, R., & Saritha, K. (2026). Hybrid Sparse-Dense Retrieval-Augmented Generation: An Empirical Analysis with SPLADE and DPR: Enhancing Domain-Specific Answer Generation through Complementary Retriever Synergies. International Journal on Computational Engineering, 3(1). Retrieved from https://comien.org/index.php/comien/article/view/56

Retrieval-Augmented Generation (RAG) has become a key technique in open-domain question answering (QA), where a retriever fetches relevant documents that are passed to a language model to generate answers. Traditional retrievers like BM25 rely on exact lexical matches, while dense retrievers such as DPR capture semantic meaning but often struggle with specific terminology and interpretability. To address these limitations, this paper explores the use of SPLADE, a sparse lexical and expansion-based retriever, within a full RAG pipeline. We benchmark SPLADE against BM25 and DPR using datasets from the BEIR benchmark suite, such as FiQA and TREC-COVID. SPLADE consistently outperforms its counterparts in both retrieval and answer generation quality. For example, in FiQA, SPLADE achieves an nDCG@10 of 0.635, compared to 0.591 for BM25 and 0.604 for DPR. In downstream QA evaluation, using GPT-3.5 for generation, SPLADE-based retrieval leads to higher ROUGE-L scores (0.479) than BM25 (0.391) and DPR (0.418), reflecting more accurate and complete answers. To further enhance performance, we implement a hybrid retriever by combining SPLADE and DPR using Reciprocal Rank Fusion (RRF). This hybrid model achieves even stronger retrieval performance (nDCG@10: 0.681, Recall@10: 0.805), demonstrating that sparse and dense signals are complementary. Additionally, SPLADE offers interpretability via token-level expansion visualization, allowing for analysis of which terms influence retrieval. Our findings confirm that SPLADE, especially when combined with dense retrieval, significantly improves the effectiveness and transparency of LLM-based QA systems, paving the way for more accurate and explainable RAG architectures.
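The hybrid retriever described above fuses SPLADE and DPR rankings with Reciprocal Rank Fusion, which scores each document as the sum of 1/(k + rank) over the individual ranked lists (k is a smoothing constant, commonly 60). The paper does not publish its implementation; the following is a minimal sketch of standard RRF, with illustrative document IDs and ranked lists that are purely hypothetical:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs via Reciprocal Rank Fusion.

    Each document's fused score is sum over lists of 1 / (k + rank),
    where rank is its 1-based position in that list. Documents absent
    from a list simply contribute nothing for it.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical top-4 results from a sparse (SPLADE-style) and a
# dense (DPR-style) retriever for the same query.
splade_ranking = ["d3", "d1", "d7", "d2"]
dpr_ranking = ["d1", "d5", "d3", "d9"]

fused = reciprocal_rank_fusion([splade_ranking, dpr_ranking])
```

Documents that rank well in both lists (here "d1" and "d3") rise to the top of the fused ranking, which is exactly the complementarity between lexical and semantic signals that the reported nDCG@10 and Recall@10 gains reflect.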