A quick-start, locally run tool to test and use as a basis for various document-related use cases:
- RAG Query: Prompt an LLM that uses relevant context to answer your queries.
- Semantic Retrieval: Retrieve relevant passages from documents, showing sources and relevance.
- RAG Chat: Interact with an LLM that uses document retrieval and chat history.
- LLM Chat: Chat with and test a local LLM, without document context.
The interface is divided into tabs so users can select and try the feature for the desired use case. The implementation focuses on simplicity, low-level components, and modularity, in order to show the working principles and core elements and to let developers and Python enthusiasts modify and build upon it.
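As an illustration of the working principle, the RAG-query flow boils down to: embed the question, find the most similar passage, and prompt the local model with that passage as context. The sketch below is a minimal, simplified example, not the tool's actual code; it assumes Ollama is serving its default REST API on localhost:11434 with the llama3.1 model pulled as described further down, and the passages and question are placeholders.

import requests

OLLAMA = "http://localhost:11434"

def embed(text):
    # Ask Ollama for an embedding of the given text (default /api/embeddings endpoint).
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "llama3.1", "prompt": text})
    return r.json()["embedding"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

# Placeholder passages; in the tool they come from the documents in data/.
passages = ["First passage ...", "Second passage ..."]
question = "What does the first passage say?"

# Retrieve the passage most similar to the question (re-embedding per query is fine for a sketch).
q_vec = embed(question)
best = max(passages, key=lambda p: cosine(q_vec, embed(p)))

# Augment the prompt with the retrieved context and query the local LLM.
prompt = f"Context:\n{best}\n\nAnswer using the context above:\n{question}"
r = requests.post(f"{OLLAMA}/api/generate",
                  json={"model": "llama3.1", "prompt": prompt, "stream": False})
print(r.json()["response"])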
Download or clone the repository.
In bash, you can run the following installation script:
$ bin/install.sh
Alternatively, install it manually:
$ python3 -m venv .myvenv
$ source .myvenv/bin/activate
$ pip3 install -r requirements.txt
Install Ollama to run large language models locally:
$ curl -fsSL https://ollama.ai/install.sh | sh
Or follow the Install Ollama instructions for your operating system.
Choose and download an LLM [*]
For example:
$ ollama pull llama3.1
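To confirm the server is reachable and the model was pulled, you can list the locally available models through Ollama's REST API (assuming the default endpoint on localhost:11434):

import requests

# Lists the models available to the local Ollama server; llama3.1 should appear after the pull.
models = requests.get("http://localhost:11434/api/tags").json().get("models", [])
print([m["name"] for m in models])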
- Place your documents in the intended data folder (default: data/).
- Start the tool [†]
$ python3 app.py
- Open the provided URL in your web browser
- Enjoy
- Relevance threshold: Sets the minimum similarity threshold for retrieved passages. Lower values result in more selective retrieval (see the sketch after this list).
- Top n results: Specifies the maximum number of relevant passages to retrieve.
- Top k: Ranks the output tokens in descending order of probability, keeps the first k tokens to form a new distribution, and samples the output from it. Higher values result in more diverse answers; lower values produce more conservative answers.
- Temp: Affects the “randomness” of the answers by scaling the probability distribution of the output tokens. Increasing the temperature makes the model answer more creatively.
- Top p: Works together with Top k, but instead of selecting a fixed number of tokens, it selects enough tokens to cover the given cumulative probability. A higher value produces more varied text; a lower value leads to more focused and conservative answers.
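As a rough illustration of how the retrieval and sampling settings come together, the sketch below (not the tool's actual code) filters hypothetical retrieval results by a threshold and a top-n cap, then passes Top k, Top p and Temp to Ollama's generate endpoint via its options field. Here the threshold is treated as a cutoff on a distance score, where a smaller distance means a closer match, which is consistent with lower values retrieving more selectively; the tool's actual scoring may differ.

import requests

# Hypothetical (passage, distance) pairs from a retrieval step;
# a smaller distance means the passage is closer to the query.
scored = [("passage A", 0.18), ("passage B", 0.45), ("passage C", 0.69)]

relevance_threshold = 0.5   # passages with a distance above this are dropped
top_n = 2                   # at most this many passages are kept as context

kept = [p for p, d in sorted(scored, key=lambda x: x[1])
        if d <= relevance_threshold][:top_n]

# Sampling controls (Top k, Top p, Temp) are passed through Ollama's options field.
r = requests.post("http://localhost:11434/api/generate", json={
    "model": "llama3.1",
    "prompt": "Context:\n" + "\n".join(kept) + "\n\nQuestion: ...",
    "stream": False,
    "options": {"top_k": 40, "top_p": 0.9, "temperature": 0.7},
})
print(r.json()["response"])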
[*] Performance consideration: Notebooks/PCs with dedicated GPUs should be able to run models such as llama3.1, mistral or gemma2 smoothly and rapidly. On a standard notebook, or if you encounter any memory or performance issues, prioritize smaller models such as llama3.2 or qwen2.5:3b.
[†] If you chose the installation with a virtual environment, remember to activate it before starting the application by running $ source .myvenv/bin/activate
Before committing, format the code using Black:
$ black -t py311 -S -l 99 .
Linters:
- Pylance
- flake8 (args: --max-line-length=100 --extend-ignore=E401,E501,E741)
For more detailed logging, set the LOG_LEVEL environment variable:
$ export LOG_LEVEL='DEBUG'
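A common way for a Python application to honour this variable is through the standard logging module; the snippet below is only illustrative and may differ from the tool's actual logging setup.

import logging
import os

# Configure the root logger from LOG_LEVEL, defaulting to INFO when it is unset.
logging.basicConfig(level=os.environ.get("LOG_LEVEL", "INFO"))
logging.getLogger(__name__).debug("Debug logging enabled")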