Get an LLM up and running on your local machine in 10 minutes.
Quickstart
To get started, download Ollama and run Llama 3:

```shell
ollama run llama3
```

This will download the Llama 3 model and start a new chat session. If the model has already been downloaded, the terminal should load straight into a new chat.
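You can also pass a prompt directly to `ollama run` for a one-off answer instead of an interactive session (the prompt text here is just an example):

```shell
# Ask a single question and print the response without opening the chat prompt
ollama run llama3 "Summarize why the sky appears blue in two sentences."
```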
Model Library
Ollama supports a variety of popular LLMs; the full list can be found here. Below are some examples of models that can be downloaded:
| Model | Parameters | Size | Download |
|---|---|---|---|
| Llama 3 | 8B | 4.7GB | `ollama run llama3` |
| Llama 3 | 70B | 40GB | `ollama run llama3:70b` |
| Phi 3 Mini | 3.8B | 2.3GB | `ollama run phi3` |
| Phi 3 Medium | 14B | 7.9GB | `ollama run phi3:medium` |
| Gemma | 2B | 1.4GB | `ollama run gemma:2b` |
| Gemma | 7B | 4.8GB | `ollama run gemma:7b` |
| Mistral | 7B | 4.1GB | `ollama run mistral` |
| Moondream 2 | 1.4B | 829MB | `ollama run moondream` |
| Neural Chat | 7B | 4.1GB | `ollama run neural-chat` |
| Starling | 7B | 4.1GB | `ollama run starling-lm` |
| Code Llama | 7B | 3.8GB | `ollama run codellama` |
| Llama 2 Uncensored | 7B | 3.8GB | `ollama run llama2-uncensored` |
| LLaVA | 7B | 4.5GB | `ollama run llava` |
| Solar | 10.7B | 6.1GB | `ollama run solar` |
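Each command in the Download column names an Ollama tag. If you prefer to fetch the weights ahead of time and chat later, `ollama pull` downloads a model without starting a session (the tag below is one of the examples from the table):

```shell
# Download the weights without starting a chat
ollama pull phi3

# Start a chat with the already-downloaded model
ollama run phi3
```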
Advanced
Import from GGUF
Ollama supports importing GGUF models via a Modelfile:
- Create a file named `Modelfile` with a `FROM` instruction pointing to the local filepath of the model you want to import:

  ```
  FROM ./vicuna-33b.Q4_0.gguf
  ```

- Create the model in Ollama:

  ```shell
  ollama create example -f Modelfile
  ```

- Run the model:

  ```shell
  ollama run example
  ```
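Beyond `FROM`, a Modelfile can also set inference parameters and a system prompt before the model is built. A minimal sketch, assuming the same GGUF file as above; the model name `example-assistant`, the temperature value, and the system prompt are all illustrative:

```shell
# Write a Modelfile that layers a parameter and a system prompt on the imported weights
cat > Modelfile <<'EOF'
FROM ./vicuna-33b.Q4_0.gguf
PARAMETER temperature 0.7
SYSTEM """
You are a concise assistant that answers in plain language.
"""
EOF

# Build and run the customized model
ollama create example-assistant -f Modelfile
ollama run example-assistant
```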
CLI Reference
- Create a model: `ollama create mymodel -f ./Modelfile`
- Pull a model: `ollama pull llama3`
- Remove a model: `ollama rm llama3`
- Copy a model: `ollama cp llama3 my-model`
- List models on the computer: `ollama list`
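Put together, a typical session for grabbing a model and making a working copy might look like the sketch below (the model names are just examples; `ollama show --modelfile` prints the Modelfile a model was built from):

```shell
# Download a base model and make a renamed copy to customize later
ollama pull llama3
ollama cp llama3 my-model

# Inspect the copy's Modelfile, then remove the copy when it is no longer needed
ollama show my-model --modelfile
ollama rm my-model
```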
Building Ollama
Check out the developer guide.

- Start the server: `./ollama serve`
- In a separate shell, run a model: `./ollama run llama3`
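Once the server is up, you can check that it is listening before pointing any clients at it (this assumes the default bind address of `localhost:11434`):

```shell
# The root endpoint replies with a short status message when the server is running
curl http://localhost:11434/
```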
REST API
Ollama has a REST API for running and managing models.
Generate a response
```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
```
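By default, `/api/generate` streams the answer back as a series of JSON objects. Setting `"stream": false` returns a single JSON object instead, which is often easier to script against:

```shell
# Request the full response as one JSON object rather than a stream
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```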
Chat with an LLM
```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
```
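The `messages` array can also carry a system message and earlier turns, so the same endpoint handles multi-turn conversations. A sketch with `"stream": false` for a single JSON reply; the conversation content is illustrative:

```shell
# Multi-turn chat: a system message plus prior user/assistant turns
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "system", "content": "You answer in one short paragraph." },
    { "role": "user", "content": "why is the sky blue?" },
    { "role": "assistant", "content": "Sunlight scatters off air molecules, and blue light scatters the most." },
    { "role": "user", "content": "Why are sunsets red, then?" }
  ],
  "stream": false
}'
```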
Want more endpoints? Check out the API documentation.