2024 | Irina Radeva, Ivan Popchev, Lyubka Doukovska, Miroslava Dimitrova
This paper explores the implementation of retrieval-augmented generation (RAG) technology with open-source large language models (LLMs). A web-based application, PaSSER, was developed to integrate RAG with models such as Mistral:7b, Llama2:7b, and Orca2:7b. The application relies on various software tools and evaluates the LLMs using metrics including METEOR, ROUGE, BLEU, perplexity, cosine similarity, Pearson correlation, and F1 score. Two tests were conducted: one assessing LLM performance across different hardware configurations, and another determining which model provides the most accurate and contextually relevant responses within the RAG pipeline. The paper also discusses the integration of blockchain with LLMs to manage and store assessment results. The results show that GPUs are essential for fast text generation; Orca2:7b on the Mac M1 was the fastest, while Mistral:7b performed best on a dataset of 446 question-answer pairs. The paper concludes by outlining future developments, including the use of other LLMs, fine-tuning approaches, and further integration with blockchain and IPFS.
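
To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch (not the authors' PaSSER code) of how a generated answer could be scored against a reference answer using a few of those metrics: BLEU, ROUGE, token-level F1, and cosine similarity. It assumes the nltk, rouge-score, and scikit-learn packages; the example strings, the helper names, and the use of TF-IDF vectors (rather than embeddings) for cosine similarity are illustrative choices, not details taken from the paper.

```python
# Hypothetical scoring sketch; the actual PaSSER implementation may differ.
from collections import Counter

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def token_f1(reference: str, candidate: str) -> float:
    """Token-overlap F1 between the reference and the generated answer."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    common = Counter(ref_tokens) & Counter(cand_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score_answer(reference: str, candidate: str) -> dict:
    """Compute a handful of answer-quality metrics for one QA pair."""
    # BLEU with smoothing, since short answers often lack 4-gram overlap.
    bleu = sentence_bleu(
        [reference.split()],
        candidate.split(),
        smoothing_function=SmoothingFunction().method1,
    )

    # ROUGE-1 and ROUGE-L F-measures.
    rouge = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    rouge_scores = rouge.score(reference, candidate)

    # Cosine similarity over TF-IDF vectors (kept dependency-light here;
    # an embedding model could be substituted).
    tfidf = TfidfVectorizer().fit_transform([reference, candidate])
    cos = cosine_similarity(tfidf[0], tfidf[1])[0][0]

    return {
        "bleu": bleu,
        "rouge1_f": rouge_scores["rouge1"].fmeasure,
        "rougeL_f": rouge_scores["rougeL"].fmeasure,
        "token_f1": token_f1(reference, candidate),
        "cosine_similarity": cos,
    }


if __name__ == "__main__":
    ref = "Crop rotation improves soil fertility and reduces pest pressure."
    hyp = "Rotating crops improves soil fertility and lowers pest pressure."
    print(score_answer(ref, hyp))
```

In a RAG evaluation of the kind described, such a function would be applied to each of the question-answer pairs in the test set, and the per-metric scores averaged per model to compare, for example, Mistral:7b, Llama2:7b, and Orca2:7b.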