Quickstart



Ollama

Ollama simplifies running LLMs locally by handling model downloads, quantization, and execution seamlessly.

  1. Download and install Ollama from the official website (https://ollama.com).

Models

It might sound obvious, but select models that can actually run on your resources. Also bear in mind that models are optimized for certain tasks - reasoning, math, code, tool use, etc ...
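Once a model has been pulled, ollama show reports its parameter count, quantization, and context length - a quick way to judge whether it fits your hardware. The 1.5b tag below is just an example of a smaller variant; check the Ollama model library for the tags that actually exist.

# Show parameter count, quantization, and context length of a pulled model
ollama show deepseek-r1:7b

# Too heavy for your hardware? Try a smaller tag from the same family
ollama run deepseek-r1:1.5b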

  1. On Windows, double-click the downloaded executable to install.

  2. Select a model to download - choose the Models option from the menu.

  3. Test the setup and download the model.

ollama run deepseek-r1:7b
  4. That's it .. you're good to go .. Let's ask the model a question.
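For example, at the >>> prompt that ollama run opens (the reply below is illustrative - your model's wording will differ):

>>> Why is the sky blue?
Sunlight is scattered by molecules in the atmosphere, and shorter blue
wavelengths are scattered much more strongly than longer red ones ...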


Some Useful Commands

  /set            Set session variables
  /show           Show model information
  /load <model>   Load a session or model
  /save <model>   Save your current session
  /clear          Clear session context
  /bye            Exit
  /?, /help       Help for a command
  /? shortcuts    Help for keyboard shortcuts

Use """ to begin a multi-line message.
  1. To start the Ollama server manually (only needed if it is not already running in the background).

ollama serve
  2. Switch to your user directory.

cd $HOME
  3. Check that the Ollama server is up and running (it listens on port 11434 by default).

Get-NetTCPConnection -LocalPort 11434 -ErrorAction SilentlyContinue
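Alternatively, a quick check that works from any shell - a plain GET to the server root returns a short status string:

curl.exe http://localhost:11434/
# Expected response: Ollama is running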
  4. You can also send the question as a JSON object.

$jsonContent = '{
  "model": "deepseek-r1",
  "messages": [{ "role": "user", "content": "Solve: 2 + 2" }],
  "stream": false
}'

[System.IO.File]::WriteAllText("$HOME\request.json", $jsonContent)
curl.exe -X POST http://localhost:11434/api/chat -d "@$HOME\request.json" -H "Content-Type: application/json"
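If you'd rather have the reply as a PowerShell object than raw JSON, Invoke-RestMethod parses it for you - a sketch reusing the $jsonContent defined above:

$response = Invoke-RestMethod -Uri "http://localhost:11434/api/chat" -Method Post -Body $jsonContent -ContentType "application/json"
# With "stream": false the whole reply arrives as a single object
$response.message.content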

Useful, as we now have an endpoint that can be used for our Chat UI frontend.

  1. On Linux, install Ollama by running the official install script.

curl -fsSL https://ollama.com/install.sh | sh
  2. In another terminal, verify that Ollama is running.

ollama -v

Adding Ollama as a startup service (recommended)

  1. Create a user and group for Ollama.

sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)
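A quick optional check that the account and group were created (note that your own new group membership only shows up after you log in again):

id ollama
groups $(whoami)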
  2. Create a service file in /etc/systemd/system/ollama.service.

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target
  3. Then reload systemd and enable the service so it starts on boot.

sudo systemctl daemon-reload
sudo systemctl enable ollama

Start Ollama

  1. Start Ollama and verify it is running:

sudo systemctl start ollama
sudo systemctl status ollama
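If the service fails to come up, the systemd journal is the first place to look:

# Show the most recent log lines for the ollama unit
journalctl -u ollama -n 50 --no-pager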

Commands & Flags

x

Command
Description

ollama list

Displays all installed models on your system.

ollama pull <model name>

Download a New Model.

ollama run <model name>

Run an LLM for Chat or Inference.

ollama update <model name>

Ensures you have the latest version.

ollama remove <model name>

Deletes a model.

ollama status

Displays CPU and GPU utilization, active models, and memory usage.

x

x

Flag
description

--verbose

Displays stats. Net generation time and output time with tokens/seconds
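For example, to see those stats for each reply (using the model pulled earlier):

ollama run deepseek-r1:7b --verbose
# After each response Ollama prints total duration, load duration,
# prompt eval rate, and eval rate (tokens/second)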

Ollama also exposes a REST API, which you can call through curl.

The following example uses Microsoft's Phi-4 model.

curl http://localhost:11434/api/generate -d '{
  "model": "phi4",
  "prompt": "Why is the sky blue?"
}'

The response will be streamed back ..

chatbot@Office:~$ curl http://localhost:11434/api/generate -d '{
  "model": "phi4",
  "prompt": "Why is the sky blue?"
}'
{"model":"phi4","created_at":"2025-02-25T13:01:16.847307208Z","response":"The","done":false}
{"model":"phi4","created_at":"2025-02-25T13:01:16.895178316Z","response":" sky","done":false}
{"model":"phi4","created_at":"2025-02-25T13:01:16.926463821Z","response":" appears","done":false}
{"model":"phi4","created_at":"2025-02-25T13:01:16.957592326Z","response":" blue","done":false}
{"model":"phi4","created_at":"2025-02-25T13:01:16.988690731Z","response":" due","done":false}
...
319,6138,3135,304,1524,810,72916,315,24210,93959,320,12481,323,6307,705,10923,5129,93959,1093,2579,82,323,85138,311,5662,701,6548,382,45600,11,13558,64069,72916,15100,3249,584,1518,264,6437,13180,1234,14595,62182,4787,13],"total_duration":14595395219,"load_duration":3775345351,"prompt_eval_count":16,"prompt_eval_duration":465000000,"eval_count":280,"eval_duration":10352000000}
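To get a single JSON object instead of a stream, set "stream": false - and with jq installed you can extract just the text:

curl -s http://localhost:11434/api/generate -d '{
  "model": "phi4",
  "prompt": "Why is the sky blue?",
  "stream": false
}' | jq -r '.response'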
