moondream is a tiny vision language model that delivers strong performance and runs anywhere. Its architecture is based on the phi2 model (1.4b).
pip install moondream==0.0.5
The recommended way to use the latest version of Moondream is through the official Python client library:
import moondream as md
from PIL import Image
# Initialize with local model path. Can also read .mf.gz files, but we recommend decompressing
# up-front to avoid decompression overhead every time the model is initialized.
model = md.vl(model="path/to/moondream-2b-int8.mf")
# Load and process image
image = Image.open("path/to/image.jpg")
encoded_image = model.encode_image(image)
# Generate caption
caption = model.caption(encoded_image)["caption"]
print("Caption:", caption)
# Ask questions
answer = model.query(encoded_image, "What's in this image?")["answer"]
print("Answer:", answer)
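The comment in the example above recommends decompressing `.mf.gz` files up-front. A minimal sketch of one way to do that with the Python standard library follows. The helper name `decompress_model` and the file names are hypothetical and are not part of the moondream client.

```python
import gzip
import shutil
from pathlib import Path
from typing import Optional


def decompress_model(src: str, dst: Optional[str] = None) -> Path:
    """Decompress a gzip-compressed model file once and return the output path.

    Hypothetical helper, not part of the moondream package.
    """
    src_path = Path(src)
    if src_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got: {src_path.name}")

    # Default output: same name without the trailing ".gz"
    # (e.g. "model.mf.gz" becomes "model.mf").
    dst_path = Path(dst) if dst is not None else src_path.with_suffix("")

    # Skip the work if a decompressed copy already exists.
    if dst_path.exists():
        return dst_path

    # Stream in chunks so the whole file is never held in memory.
    with gzip.open(src_path, "rb") as f_in, open(dst_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst_path
```

The returned path can then be passed as the `model` argument to `md.vl(...)`, as in the example above, so later initializations read the already decompressed file.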