ollama (Download and run large language models locally)

Ollama is an application that lets you run large language models
offline. A list of available models can be found at
ollama.com/library.

Optional dependencies such as CUDA or ROCm are detected
automatically at build time, if present:

  CUDA=ON: build with CUDA support (default: CUDA=OFF)
  ROCM=ON: build with ROCm support (default: ROCM=OFF)

Building the ollama server and client requires network access and
the development/google-go-lang package.
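
For example, assuming the usual convention of passing such options
to the build script as environment variables (the script name
ollama.SlackBuild below is an assumption, not confirmed by this
README), a CUDA-enabled build might look like:

  CUDA=ON ./ollama.SlackBuild

Once installed, the server can be started and a model pulled and
run with the standard ollama commands (llama3 is just one example
model from ollama.com/library):

  ollama serve &
  ollama run llama3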