-bash/zsh: ollama command not found

# Windows
https://ollama.com/download/OllamaSetup.exe

# MacOS
https://ollama.com/download/Ollama-darwin.zip

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Docker
# https://hub.docker.com/r/ollama/ollama
docker pull ollama/ollama

# Ollama - Docker CPU Only
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
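Whichever install route is used, a quick sanity check is to ask the CLI for its version; with the Docker install the same CLI can be invoked inside the container. This is only a sketch, and the container name "ollama" matches the docker run command above:

# Native install: confirm the binary is on the PATH
ollama --version

# Docker install: run the CLI inside the running container instead
docker exec -it ollama ollama --version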
Ollama is a local inference framework client that can deploy large language models such as DeepSeek R1, Llama 3, Mistral, and LLaVA with a single command. Once Ollama has started successfully, it exposes an API service on local port 11434, reachable at http://localhost:11434.
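A minimal way to confirm the service is up is to hit that port directly; a running server answers with a short status message (a sketch, the exact wording may vary between versions):

curl http://localhost:11434
# Ollama is running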
All available Ollama models can be browsed from the Ollama Models page.
At the time of writing, the latest Ollama release is v0.6.5, which already supports multiple loaded models and concurrent requests (configured via the OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS environment variables).
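These two settings are ordinary environment variables read by the server at startup; the sketch below uses purely illustrative values on Linux/macOS:

# Each loaded model may serve up to 4 requests concurrently
export OLLAMA_NUM_PARALLEL=4
# Keep at most 2 models resident in memory at the same time
export OLLAMA_MAX_LOADED_MODELS=2
ollama serve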
PS: Since March 14, 2024, Ollama also supports AMD GPUs.
# Install CUDA drivers (optional – for Nvidia GPUs)
# Verify that the driver is installed by running the following command,
# which should print detailed information about the GPU:
nvidia-smi
The ollama CLI exposes the following commands and flags (output of ollama --help):

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.
The ollama show command supports the following flags:

  -h, --help         help for show
      --license      Show license of a model
      --modelfile    Show Modelfile of a model
      --parameters   Show parameters of a model
      --system       Show system message of a model
      --template     Show template of a model
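For example, to inspect a local model (llama2 here is just a placeholder for any pulled model):

# Print the Modelfile the local model was built from
ollama show llama2 --modelfile

# Print only its runtime parameters
ollama show llama2 --parameters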
Create a model with ollama:
ollama create mymodel -f ./Modelfile
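A minimal Modelfile for such a custom model might look like the sketch below; the base model, parameter value, and system prompt are illustrative assumptions, not fixed requirements:

# Modelfile
# Base model to build on
FROM llama2
# Sampling temperature (higher = more creative)
PARAMETER temperature 0.7
# System prompt baked into the new model
SYSTEM """You are a concise assistant that answers in short paragraphs."""

After ollama create mymodel -f ./Modelfile completes, the new model shows up in ollama list and can be started with ollama run mymodel.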
Pull an ollama model:
# This command can also be used to update a local model; only the diff is pulled.
ollama pull qwen:0.5b
pulling manifest
pulling fad2a06e4cc7... 100% ▕███████████████████████████████▏ 394 MB
pulling 41c2cf8c272f... 100% ▕███████████████████████████████▏ 7.3 KB
pulling 1da0581fd4ce... 100% ▕███████████████████████████████▏ 130 B
pulling f02dd72bb242... 100% ▕███████████████████████████████▏  59 B
pulling ea0a531a015b... 100% ▕███████████████████████████████▏ 485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Remove an ollama model:
ollama rm llama2
Copy an ollama model:
ollama cp llama2 my-llama2
After ollama run starts, multi-line input can be entered by wrapping the text in triple quotes:
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
Multimodal models in ollama (an image path can be included in the prompt):
>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.
Pass the prompt to ollama as a command-line argument:
$ ollama run llama2 "Summarize this file: $(cat README.md)"
 Ollama is a lightweight, extensible framework for building and running language models on
 the local machine. It provides a simple API for creating, running, and managing models, as
 well as a library of pre-built models that can be easily used in a variety of applications.
ollama list shows all models on the machine:
ollama list
NAME           ID              SIZE      MODIFIED
gemma:2b       b50d6c999e59    1.7 GB    6 days ago
qwen:0.5b      b5dc5e784f2a    394 MB    2 minutes ago
qwen:latest    d53d04290064    2.3 GB    7 days ago
...
Start ollama:
ollama serve
Finally, run a model in a separate shell:
./ollama run llama2
Ollama REST API: Ollama provides a REST API for running and managing models:
# Call the ollama API to generate a response
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
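By default /api/generate streams the answer as a sequence of JSON chunks; adding "stream": false to the request body returns a single JSON object instead, which is often easier to handle in scripts (the model and prompt below are just examples):

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'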
Chat with a model through ollama:
curl http://localhost:11434/api/chat -d '{
  "model": "mistral",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
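The same port also exposes management endpoints; for example, the REST counterpart of ollama list is /api/tags:

# List the locally available models over HTTP
curl http://localhost:11434/api/tags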
If you are using a Windows machine, models are stored on the C: drive by default, which becomes inconvenient as models grow large. You can set an environment variable that points model storage at another drive:
Environment variable name: OLLAMA_MODELS
Value: D:\AI\OllamaModels
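One way to set this persistently is with setx from a regular command prompt; the path is the example value from above, and the Ollama service has to be restarted afterwards so it picks up the change:

REM Persist the variable for the current user (Windows cmd)
setx OLLAMA_MODELS "D:\AI\OllamaModels"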