llm

  • This project provides a Dockerized setup for running Ollama as a local LLM service with GPU support. It includes a pre-configured Docker Compose file, automated model management, and support for multiple models such as llama3.1 (chat) and snowflake-arctic-embed2 (embeddings). The container exposes an API for interacting with the models via the CLI or HTTP requests. Configuration is handled via .env, allowing easy customization; a sketch of what such a Compose file can look like follows below. Designed for quick deployment, this setup serves as a flexible starting point for integrating local AI models.
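
    As a rough illustration, here is a minimal docker-compose.yml sketch for running Ollama with NVIDIA GPU support and an .env-driven port. The service name, volume name, and the OLLAMA_PORT variable are illustrative assumptions, not taken from this project's actual files.

    ```yaml
    # Minimal sketch, assuming the ollama/ollama image and the NVIDIA
    # Container Toolkit on the host; names and defaults are illustrative.
    services:
      ollama:
        image: ollama/ollama
        ports:
          # Compose substitutes ${OLLAMA_PORT:-11434} from the shell or .env,
          # so the host port can be changed without editing this file.
          - "${OLLAMA_PORT:-11434}:11434"
        volumes:
          # Persist pulled models (e.g. llama3.1, snowflake-arctic-embed2)
          # across container restarts.
          - ollama_data:/root/.ollama
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: all
                  capabilities: [gpu]

    volumes:
      ollama_data:
    ```

    With the service up, models can be pulled with the Ollama CLI inside the container (for example via docker compose exec) and queried over the HTTP API on the exposed port.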
