-bash/zsh: ollama command not found

# Windows
https://ollama.com/download/OllamaSetup.exe

# MacOS
https://ollama.com/download/Ollama-darwin.zip

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Docker
# https://hub.docker.com/r/ollama/ollama
docker pull ollama/ollama

# Ollama - Docker CPU Only
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
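Whichever install route is used, a quick sanity check is to ask the CLI for its version; with the Docker install the same CLI can be invoked inside the container. This is only a sketch, and the container name "ollama" matches the docker run command above:

# Native install: confirm the binary is on the PATH
ollama --version

# Docker install: run the CLI inside the running container instead
docker exec -it ollama ollama --version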
Ollama is a local inference framework client that can deploy large language models such as DeepSeek R1, Llama 3, Mistral, and LLaVA with a single command. Once Ollama has started successfully, it exposes an API service on local port 11434, reachable at http://localhost:11434.
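A minimal way to confirm the service is up is to hit that port directly; a running server answers with a short status message (a sketch, the exact wording may vary between versions):

curl http://localhost:11434
# Ollama is running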
All available Ollama models can be browsed from the Ollama Models page.
At the time of writing, the latest Ollama release is v0.6.5, which already supports multiple loaded models and concurrent requests (configured via the OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS environment variables).
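These two settings are ordinary environment variables read by the server at startup; the sketch below uses purely illustrative values on Linux/macOS:

# Each loaded model may serve up to 4 requests concurrently
export OLLAMA_NUM_PARALLEL=4
# Keep at most 2 models resident in memory at the same time
export OLLAMA_MAX_LOADED_MODELS=2
ollama serve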
PS: Since March 14, 2024, Ollama also supports AMD GPUs.
# Install CUDA drivers (optional – for Nvidia GPUs)
# Verify that the driver is installed by running the following command,
# which should print detailed information about the GPU:
nvidia-smi
The ollama CLI exposes the following commands and flags (output of ollama --help):

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.
The ollama show command supports the following flags:

  -h, --help         help for show
      --license      Show license of a model
      --modelfile    Show Modelfile of a model
      --parameters   Show parameters of a model
      --system       Show system message of a model
      --template     Show template of a model
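For example, to inspect a local model (llama2 here is just a placeholder for any pulled model):

# Print the Modelfile the local model was built from
ollama show llama2 --modelfile

# Print only its runtime parameters
ollama show llama2 --parameters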
Create a model with ollama:
ollama create mymodel -f ./Modelfile
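A minimal Modelfile for such a custom model might look like the sketch below; the base model, parameter value, and system prompt are illustrative assumptions, not fixed requirements:

# Modelfile
# Base model to build on
FROM llama2
# Sampling temperature (higher = more creative)
PARAMETER temperature 0.7
# System prompt baked into the new model
SYSTEM """You are a concise assistant that answers in short paragraphs."""

After ollama create mymodel -f ./Modelfile completes, the new model shows up in ollama list and can be started with ollama run mymodel.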
Pull an ollama model:
# This command can also be used to update a local model; only the diff is pulled.
ollama pull qwen:0.5b
pulling manifest
pulling fad2a06e4cc7... 100% ▕███████████████████████████████▏ 394 MB
pulling 41c2cf8c272f... 100% ▕███████████████████████████████▏ 7.3 KB
pulling 1da0581fd4ce... 100% ▕███████████████████████████████▏ 130 B
pulling f02dd72bb242... 100% ▕███████████████████████████████▏  59 B
pulling ea0a531a015b... 100% ▕███████████████████████████████▏ 485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Remove an ollama model:
ollama rm llama2
Copy an ollama model:
ollama cp llama2 my-llama2
After ollama run starts, multi-line input can be entered by wrapping the text in triple quotes:
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
Multimodal models in ollama (an image path can be included in the prompt):
>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.
Pass the prompt to ollama as a command-line argument:
$ ollama run llama2 "Summarize this file: $(cat README.md)"
 Ollama is a lightweight, extensible framework for building and running language models on
 the local machine. It provides a simple API for creating, running, and managing models, as
 well as a library of pre-built models that can be easily used in a variety of applications.
ollama list shows all models on the machine:
ollama list
NAME           ID              SIZE      MODIFIED
gemma:2b       b50d6c999e59    1.7 GB    6 days ago
qwen:0.5b      b5dc5e784f2a    394 MB    2 minutes ago
qwen:latest    d53d04290064    2.3 GB    7 days ago
...
Start ollama:
ollama serve
Finally, run a model in a separate shell:
./ollama run llama2
Ollama REST API: Ollama provides a REST API for running and managing models:
# Call the ollama API to generate a response
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
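By default /api/generate streams the answer as a sequence of JSON chunks; adding "stream": false to the request body returns a single JSON object instead, which is often easier to handle in scripts (the model and prompt below are just examples):

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'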
Chat with a model through ollama:
curl http://localhost:11434/api/chat -d '{
  "model": "mistral",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
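The same port also exposes management endpoints; for example, the REST counterpart of ollama list is /api/tags:

# List the locally available models over HTTP
curl http://localhost:11434/api/tags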
If you are using a Windows machine, models are stored on the C: drive by default, which becomes inconvenient as models grow large. You can set an environment variable that points model storage at another drive:
Environment variable name: OLLAMA_MODELS
Value: D:\AI\OllamaModels
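One way to set this persistently is with setx from a regular command prompt; the path is the example value from above, and the Ollama service has to be restarted afterwards so it picks up the change:

REM Persist the variable for the current user (Windows cmd)
setx OLLAMA_MODELS "D:\AI\OllamaModels"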