Chat with your current directory's files using a local or API LLM.
dir-assistant
is a CLI python application available through pip
that recursively indexes all text
files in the current working directory so you can chat with them using a local or API LLM. By
"chat with them", it is meant that their contents will automatically be included in the prompts sent
to the LLM, with the most contextually relevant files included first. dir-assistant
is designed
primarily for use as a coding aid and automation tool.
- Includes an interactive chat mode and a single prompt non-interactive mode.
- When enabled, it will automatically make file updates and commit to git.
- Local platform support for CPU (OpenBLAS), Cuda, ROCm, Metal, Vulkan, and SYCL.
- API support for all major LLM APIs. More info in the LiteLLM Docs.
- Uses a unique method for finding the most important files to include when submitting your prompt to an LLM called CGRAG (Contextually Guided Retrieval-Augmented Generation). You can read this blog post for more information about how it works.
- New Features
- Quickstart
- Examples
- General Usage Tips
- Install
- Embedding Model Configuration
- Optional: Select A Hardware Platform
- API Configuration
- Local LLM Model Download
- Running
- Upgrading
- Additional Help
- Contributors
- Acknowledgements
- Limitations
- Todos
- Additional Credits
- Added support for models that include a
<thinking></thinking>
block in their response. - Added an example script for analyzing stock sentiment on reddit.
In this section are recipes to run dir-assistant
in basic capacity to get you started quickly.
To get started using an API model, you can use Google Gemini 2.0 Flash, which is currently free. To begin, you need to sign up for Google AI Studio and create an API key. After you create your API key, enter the following commands:
pip install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant
Note: The Python.org installer is recommended for Windows. The Windows
Store installer does not add dir-assistant to your PATH so you will need to call it
with python -m dir_assistant
if you decide to go that route.
pip install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant
pip3
has been replaced with pipx
starting in Ubuntu 24.04.
pipx install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant
To get started locally, you can download a default llm model. Default configuration with this model requires 3GB of memory on most hardware. You will be able to adjust the configuration to fit higher or lower memory requirements. To run via CPU:
pip install dir-assistant[recommended]
dir-assistant models download-embed
dir-assistant models download-llm
cd directory/to/chat/with
dir-assistant
To run with hardware acceleration, use the platform
subcommand:
...
dir-assistant platform cuda
cd directory/to/chat/with
dir-assistant
See which platforms are supported using -h
:
dir-assistant platform -h
It is not recommended to use dir-assistant
directly with local LLMs on Windows. This is because
llama-cpp-python
requires a C compiler for installation via pip, and setting one up is not
a trivial task on Windows like it is on other platforms. Instead, it is recommended to
use another LLM server such as LMStudio and configure dir-assistant
to use it as
a custom API server. To do this, ensure you are installing dir-assistant
without
the recommended
dependencies:
pip install dir-assistant
Then configure dir-assistant
to connect to your custom LLM API server:
Connecting to a Custom API Server
For instructions on setting up LMStudio to host an API, follow their guide:
https://lmstudio.ai/docs/app/api
pip3
has been replaced with pipx
starting in Ubuntu 24.04.
pipx install dir-assistant[recommended]
...
dir-assistant platform cuda --pipx
The non-interactive mode of dir-assistant
allows you to create scripts which analyze
your files without user interaction.
To get started using an API model, you can use Google Gemini 2.0 Flash, which is currently free. To begin, you need to sign up for Google AI Studio and create an API key. After you create your API key, enter the following commands:
pip install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant -s "Describe the files in this directory"
pip3
has been replaced with pipx
starting in Ubuntu 24.04.
pipx install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant -s "Describe the files in this directory"
See examples.
./reddit-stock-sentiment.sh
Downloading top posts from r/wallstreetbets...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 173k 100 173k 0 0 353k 0 --:--:-- --:--:-- --:--:-- 353k
Analyzing sentiment of stocks mentioned in subreddits...
Sentiment Analysis Results:
TSLA 6.5
SPY 4.2
NVDA 3.1
BTC 2.8
MSTR 2.5
PLTR -4.5
IONQ -5.2
RGTI -6.1
STRK -7.3
TWLO -8.0
Install with pip:
pip install dir-assistant
You can also install llama-cpp-python
as an optional dependency to enable dir-assistant to
directly run local LLMs:
pip install dir-assistant[recommended]
Note: llama-cpp-python
is not updated often so may not run the latest models or have the latest
features of Llama.cpp. You may have better results with a separate local LLM server and
connect it to dir-assistant using the custom API server
feature.
The default configuration for dir-assistant
is API-mode. If you download an LLM model with download-llm
,
local-mode will automatically be set. To change from API-mode to local-mode, set the ACTIVE_MODEL_IS_LOCAL
setting.
pip3
has been replaced with pipx
starting in Ubuntu 24.04.
pipx install dir-assistant
Dir-assistant is a powerful tool with many configuration options. This section provides some
general tips for using dir-assistant
to achieve the best results.
There are quite literally thousands of models that can be used with dir-assistant
. The best results
in terms of quality for complex coding tasks on large codebases as of writing have been achieved
with voyage-code-3
and gemini-2.0-flash-thinking-exp
. To use these models open the config
file with dir-assistant config open
and modify this optimized configuration to suit your needs:
Note: Don't forget to add your own API keys! Get them via Google AI Studio and Voyage AI.
[DIR_ASSISTANT]
SYSTEM_INSTRUCTIONS = "You are a helpful AI assistant tasked with assisting my coding. "
GLOBAL_IGNORES = [ ".gitignore", ".d", ".obj", ".sql", "js/vendors", ".tnn", ".env", "node_modules", ".min.js", ".min.css", "htmlcov", ".coveragerc", ".pytest_cache", ".egg-info", ".git/", ".vscode/", "node_modules/", "build/", ".idea/", "__pycache__", ]
CONTEXT_FILE_RATIO = 0.9
ACTIVE_MODEL_IS_LOCAL = false
ACTIVE_EMBED_IS_LOCAL = false
USE_CGRAG = true
PRINT_CGRAG = false
OUTPUT_ACCEPTANCE_RETRIES = 2
COMMIT_TO_GIT = true
VERBOSE = false
NO_COLOR = false
LITELLM_EMBED_REQUEST_DELAY = 0
LITELLM_MODEL_USES_SYSTEM_MESSAGE = true
LITELLM_PASS_THROUGH_CONTEXT_SIZE = false
LITELLM_CONTEXT_SIZE = 200000
LITELLM_EMBED_CONTEXT_SIZE = 4000
MODELS_PATH = "~/.local/share/dir-assistant/models/"
LLM_MODEL = "agentica-org_DeepScaleR-1.5B-Preview-Q4_K_M.gguf"
EMBED_MODEL = "nomic-embed-text-v1.5.Q4_K_M.gguf"
[DIR_ASSISTANT.LITELLM_API_KEYS]
GEMINI_API_KEY = "yourkeyhere"
VOYAGE_API_KEY = "yourkeyhere"
[DIR_ASSISTANT.LITELLM_COMPLETION_OPTIONS]
model = "gemini/gemini-2.0-flash-thinking-exp"
timeout = 600
[DIR_ASSISTANT.LITELLM_EMBED_COMPLETION_OPTIONS]
model = "voyage/voyage-code-3"
timeout = 600
[DIR_ASSISTANT.LLAMA_CPP_COMPLETION_OPTIONS]
frequency_penalty = 1.1
presence_penalty = 1.0
[DIR_ASSISTANT.LLAMA_CPP_OPTIONS]
n_ctx = 10000
verbose = false
n_gpu_layers = -1
rope_scaling_type = 2
rope_freq_scale = 0.75
[DIR_ASSISTANT.LLAMA_CPP_EMBED_OPTIONS]
n_ctx = 4000
n_batch = 512
verbose = false
rope_scaling_type = 2
rope_freq_scale = 0.75
n_gpu_layers = -1
You must use an embedding model regardless of whether you are running an LLM via local or API mode, but you can also
choose whether the embedding model is local or API using the ACTIVE_EMBED_IS_LOCAL
setting. Generally local embedding
will be faster, but API will be higher quality. If you wish to use local embedding, you can download a
good default embedding model with:
pip install dir-assistant[recommended]
dir-assistant models download-embed
If you would like to use another local embedding model, download a gguf file and place it in the models directory. The models directory can be opened in a file browser using:
dir-assistant models
Note: The embedding model will be hardware accelerated after using the platform
subcommand. To disable
hardware acceleration, change n_gpu_layers = -1
to n_gpu_layers = 0
in the config.
By default dir-assistant
is installed with CPU-only compute support. It will work properly without this step,
but if you would like to hardware accelerate dir-assistant
, use the command below to compile
llama-cpp-python
with your hardware's support.
dir-assistant platform cuda
Available options: cpu
, cuda
, rocm
, metal
, vulkan
, sycl
Note: The embedding model and the local llm model will be run with acceleration after selecting a platform. To disable
hardware acceleration change n_gpu_layers = -1
to n_gpu_layers = 0
in the config.
pip3
has been replaced with pipx
starting in Ubuntu 24.04.
dir-assistant platform cuda --pipx
System dependencies may be required for the platform
command and are outside the scope of these instructions.
If you have any issues building llama-cpp-python
, the project's install instructions may offer more
info: https://github.com/abetlen/llama-cpp-python
If you wish to use an API LLM, you will need to configure it. To configure which LLM API
dir-assistant uses, you must edit LITELLM_MODEL
and the appropriate API key in your configuration. To open
your configuration file, enter:
dir-assistant config open
Once editing the file, change:
[DIR_ASSISTANT]
LITELLM_CONTEXT_SIZE = 200000
[DIR_ASSISTANT.LITELLM_API_KEYS]
GEMINI_API_KEY = "xxxxxxxxxxxxxxxxxxx"
[DIR_ASSISTANT.LITELLM_COMPLETION_OPTIONS]
model = "gemini/gemini-2.0-flash"
LiteLLM supports all major LLM APIs, including APIs hosted locally. View the available options in the LiteLLM providers list.
There is a convenience subcommand for modifying and adding API keys:
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
If you would like to connect to a custom API server, such as your own ollama, llama.cpp, LMStudio,
vLLM, or other OpenAPI-compatible API server, dir-assistant supports this. To configure for this,
open the config with dir-assistant config open
and make following changes (LMStudio's base_url
shown for this example):
[DIR_ASSISTANT]
ACTIVE_MODEL_IS_LOCAL = false
[DIR_ASSISTANT.LITELLM_COMPLETION_OPTIONS]
model = "openai/mistral-small-24b-instruct-2501"
base_url = "http://localhost:1234/v1"
If you want to use a local LLM directly within dir-assistant
using llama-cpp-python
,
you can download a low requirements default model with:
pip install dir-assistant[recommended]
dir-assistant models download-llm
Note: The local LLM model will be hardware accelerated after using the platform
subcommand. To disable hardware
acceleration, change n_gpu_layers = -1
to n_gpu_layers = 0
in the config.
If you would like to use a custom local LLM model, download a GGUF model and place it in your models directory. Huggingface has numerous GGUF models to choose from. The models directory can be opened in a file browser using this command:
dir-assistant models
After putting your gguf in the models directory, you must configure dir-assistant to use it:
dir-assistant config open
Edit the following setting:
[DIR_ASSISTANT]
LLM_MODEL = "Mistral-Nemo-Instruct-2407.Q6_K.gguf"
Llama.cpp provides a large number of options to customize how your local model is run. Most of these options are
exposed via llama-cpp-python
. You can configure them with the [DIR_ASSISTANT.LLAMA_CPP_OPTIONS]
,
[DIR_ASSISTANT.LLAMA_CPP_EMBED_OPTIONS]
, and [DIR_ASSISTANT.LLAMA_CPP_COMPLETION_OPTIONS]
sections in the
config file.
The options available for llama-cpp-python
are documented in the
Llama constructor documentation.
What the options do is also documented in the llama.cpp CLI documentation.
The most important llama-cpp-python
options are related to tuning the LLM to your system's VRAM:
- Setting
n_ctx
lower will reduce the amount of VRAM required to run, but will decrease the amount of file text that can be included when running a prompt. CONTEXT_FILE_RATIO
sets the proportion of prompt history to file text to be included when sent to the LLM. Higher ratios mean more file text and less prompt history. More file text generally improves comprehension.- If your llm
n_ctx
timesCONTEXT_FILE_RATIO
is smaller than your embedn_ctx
, your file text chunks have the potential to be larger than your llm context, and thus will not be included. To ensure all files can be included, make sure your embed context is smaller thann_ctx
timesCONTEXT_FILE_RATIO
. - Larger embed
n_ctx
will chunk your files into larger sizes, which allows LLMs to understand them more easily. n_batch
must be smaller than then_ctx
of a model, but setting it higher will probably improve performance.
For other tips about tuning Llama.cpp, explore their documentation and do some google searches.
dir-assistant
Running dir-assistant
will scan all files recursively in your current directory. The most relevant files will
automatically be sent to the LLM when you enter a prompt.
dir-assistant
is shorthand for dir-assistant start
. All arguments below are applicable for both.
The following arguments are available while running dir-assistant
:
-i --ignore
: A list of space-separated filepaths to ignore-d --dirs
: A list of space-separated directories to work on (your current directory will always be used)-s --single-prompt
: Run a single prompt and output the final answer-v --verbose
: Show debug information during execution
Example usage:
# Run a single prompt and exit
dir-assistant -s "What does this codebase do?"
# Show debug information
dir-assistant -v
# Ignore specific files and add additional directories
dir-assistant -i ".log" ".tmp" -d "../other-project"
The COMMIT_TO_GIT
feature allows dir-assistant
to make changes directly to your files and commit the changes to git
during the chat. By default, this feature is disabled, but after enabling it, the assistant will suggest file changes
and ask whether to apply the changes. If confirmed, it stages the changes and creates a git commit with the prompt
message as the commit message.
To enable the COMMIT_TO_GIT
feature, update the configuration:
dir-assistant config open
Change or add the following setting:
[DIR_ASSISTANT]
...
COMMIT_TO_GIT = true
Once enabled, the assistant will handle the Git commit process as part of its workflow. To undo a commit,
type undo
in the prompt.
You can include files from outside your current directory to include in your dir-assistant
session:
dir-assistant -d /path/to/dir1 ../dir2
You can ignore files when starting up so they will not be included in the assistant's context:
dir-assistant -i file.txt file2.txt
There is also a global ignore list in the config file. To configure it first open the config file:
dir-assistant config open
Then edit the setting:
[DIR_ASSISTANT]
...
GLOBAL_IGNORES = [
...
"file.txt"
]
Any configuration setting can be overridden using environment variables. The environment variable name should match the configuration key name:
# Override the model path
export DIR_ASSISTANT__LLM_MODEL="mistral-7b-instruct.Q4_K_M.gguf"
# Enable git commits
export DIR_ASSISTANT__COMMIT_TO_GIT=true
# Change context ratio
export DIR_ASSISTANT__CONTEXT_FILE_RATIO=0.7
# Change llama.cpp embedding options
export DIR_ASSISTANT__LLAMA_CPP_EMBED_OPTIONS__n_ctx=2048
# Example setting multiple env vars inline with the command
DIR_ASSISTANT__COMMIT_TO_GIT=true DIR_ASSISTANT__CONTEXT_FILE_RATIO=0.7 dir-assistant
This allows multiple config profiles for your custom use cases.
# Run with different models
DIR_ASSISTANT__LLM_MODEL="model1.gguf" dir-assistant -s "What does this codebase do?"
DIR_ASSISTANT__LLM_MODEL="model2.gguf" dir-assistant -s "What does this codebase do?"
# Test with different context ratios
DIR_ASSISTANT__CONTEXT_FILE_RATIO=0.8 dir-assistant
Some version upgrades may have incompatibility issues in the embedding index cache. Use this command to delete the index cache so it may be regenerated:
dir-assistant clear
Use the -h
argument with any command or subcommand to view more information. If your problem is beyond the scope of
the helptext, please report a Github issue.
We appreciate contributions from the community! For a list of contributors and how you can contribute, please see CONTRIBUTORS.md.
- Local LLMs are run via the fantastic llama-cpp-python package
- API LLMS are run using the also fantastic LiteLLM package
- Dir-assistant only detects and reads text files at this time.
API LLMsRAGFile caching (improve startup time)CGRAG (Contextually-Guided Retrieval-Augmented Generation)Multi-line inputFile watching (automatically reindex changed files)Single-step pip installModel downloadCommit to gitAPI Embedding modelsImmediate mode for better compatibility with custom script automationsSupport for custom APIsSupport for thinking models- Web search
- Daemon mode for API-based use
Special thanks to Blazed.deals for sponsoring this project.