
Getting Started with Ollama

Published: at 07:00 PM

Get an LLM up and running on your local machine in 10 minutes.

Meta Ollama Llama3



To get started, download Ollama and run Llama 3:

ollama run llama3

This downloads the Llama 3 model and starts a new chat session. If the model is already downloaded, the terminal should load a new chat immediately.
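
The same command can also be driven from a script. The following is a minimal sketch, assuming `ollama` is on your PATH and that passing the prompt as an extra argument to `ollama run` prints one reply and exits; the helper names `build_run_command` and `ask_once` are made up for illustration.

```python
import shutil
import subprocess


def build_run_command(model: str, prompt: str) -> list[str]:
    """Build the argv for a one-shot `ollama run MODEL PROMPT` call."""
    return ["ollama", "run", model, prompt]


def ask_once(model: str, prompt: str) -> str:
    """Send one prompt through the Ollama CLI and return what it printed."""
    # Fail early with a readable message if the CLI is not installed.
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama was not found on PATH; install it first")
    result = subprocess.run(
        build_run_command(model, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Calling `ask_once("llama3", "Why is the sky blue?")` would download the model on first use, just like the interactive command, so the first call can take a while.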

Model Library

Ollama supports a variety of popular LLMs; the full list is available in the Ollama model library. The following are examples of models that can be downloaded:

Model               Parameters  Size    Download
Llama 3             8B          4.7GB   ollama run llama3
Llama 3             70B         40GB    ollama run llama3:70b
Phi 3 Mini          3.8B        2.3GB   ollama run phi3
Phi 3 Medium        14B         7.9GB   ollama run phi3:medium
Gemma               2B          1.4GB   ollama run gemma:2b
Gemma               7B          4.8GB   ollama run gemma:7b
Mistral             7B          4.1GB   ollama run mistral
Moondream 2         1.4B        829MB   ollama run moondream
Neural Chat         7B          4.1GB   ollama run neural-chat
Starling            7B          4.1GB   ollama run starling-lm
Code Llama          7B          3.8GB   ollama run codellama
Llama 2 Uncensored  7B          3.8GB   ollama run llama2-uncensored
LLaVA               7B          4.5GB   ollama run llava
Solar               10.7B       6.1GB   ollama run solar
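
Download size is often the first constraint when choosing a model. As a rough sketch, the sizes listed above can be put in a dictionary and filtered by a disk budget. The numbers are copied from the list in this post (829MB written as 0.829 GB) and may be out of date; `models_within` is a hypothetical helper, not part of Ollama.

```python
# Download sizes in GB, keyed by the tag used with `ollama run`.
# Values are copied from the list in this post and may have changed since.
MODEL_SIZES_GB = {
    "llama3": 4.7,
    "llama3:70b": 40.0,
    "phi3": 2.3,
    "phi3:medium": 7.9,
    "gemma:2b": 1.4,
    "gemma:7b": 4.8,
    "mistral": 4.1,
    "moondream": 0.829,
    "neural-chat": 4.1,
    "starling-lm": 4.1,
    "codellama": 3.8,
    "llama2-uncensored": 3.8,
    "llava": 4.5,
    "solar": 6.1,
}


def models_within(budget_gb: float) -> list[str]:
    """Return the tags whose download fits the budget, smallest first."""
    fitting = [(size, tag) for tag, size in MODEL_SIZES_GB.items() if size <= budget_gb]
    return [tag for _size, tag in sorted(fitting)]
```

This only compares download sizes; it says nothing about the memory a model needs once loaded.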


Import from GGUF

Ollama supports importing GGUF models through a Modelfile:

  1. Create a file named Modelfile containing a FROM instruction that points to the local file path of the model you want to import.

FROM ./vicuna-33b.Q4_0.gguf

  2. Create the model in Ollama.

ollama create example -f Modelfile

  3. Run the model.

ollama run example
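
The import steps above can also be scripted. This is a small sketch that writes the one-line Modelfile and builds the matching `ollama create` command; the function names are illustrative, and the GGUF file is assumed to already exist at the path you pass in.

```python
from pathlib import Path


def write_modelfile(gguf_path: str, directory: str = ".") -> Path:
    """Write a Modelfile whose only instruction is `FROM <gguf_path>`."""
    modelfile = Path(directory) / "Modelfile"
    modelfile.write_text(f"FROM {gguf_path}\n", encoding="utf-8")
    return modelfile


def create_command(name: str, modelfile: Path) -> list[str]:
    """Build the argv for `ollama create NAME -f MODELFILE`."""
    return ["ollama", "create", name, "-f", str(modelfile)]
```

Passing the result of `create_command("example", write_modelfile("./vicuna-33b.Q4_0.gguf"))` to `subprocess.run` would perform step 2; step 3 is then `ollama run example` as shown above.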

CLI Reference

Building Ollama

Check out the developer guide

  1. Start the server: ./ollama serve
  2. In a separate shell, run a model: ./ollama run llama3


Ollama has a REST API for running and managing models.

Generate a response

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
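
For comparison, here is a hedged Python sketch of the same request using only the standard library. It assumes the server is running at the default address from the curl example and that, by default, the endpoint streams newline-delimited JSON objects, each carrying a `response` fragment and a `done` flag. The helper names are made up for this post.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # same address as the curl example


def generate_payload(model: str, prompt: str, stream: bool = True) -> bytes:
    """Encode the JSON body for POST /api/generate."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream}).encode("utf-8")


def join_stream(lines) -> str:
    """Concatenate the "response" fragments from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk is marked done
    return "".join(parts)


def generate(model: str, prompt: str) -> str:
    """Send one prompt to a running Ollama server and return the full reply."""
    request = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=generate_payload(model, prompt),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return join_stream(response)  # iterating the response yields lines
```

Setting `"stream": false` in the body should instead return a single JSON object, which is simpler to handle when you do not need tokens as they arrive.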

Chat with an LLM

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
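
A chat request looks similar from Python. The sketch below assumes the default server address, turns streaming off so the reply arrives as one JSON object, and assumes that object holds the assistant's turn under a `message` key. The function names are illustrative only.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # same address as the curl example


def chat_payload(model: str, messages: list[dict], stream: bool = False) -> bytes:
    """Encode the JSON body for POST /api/chat."""
    return json.dumps({"model": model, "messages": messages, "stream": stream}).encode("utf-8")


def extract_message(body) -> dict:
    """Pull the assistant message out of a non-streaming chat response."""
    return json.loads(body)["message"]


def chat(model: str, messages: list[dict]) -> dict:
    """Send the conversation so far and return the assistant's next message."""
    request = urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=chat_payload(model, messages),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_message(response.read())
```

The server does not remember earlier turns, so to continue a conversation you append the returned message and the next user message to `messages` and send the whole list again.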

Want more endpoints? Check out the API documentation.