moondream is a tiny vision language model that delivers strong performance and runs anywhere. Its architecture is based on the phi2 model (1.4b).
pip install moondream==0.0.5
The recommended way to use the latest version of Moondream is through the official Python client library:
import moondream as md
from PIL import Image

# Initialize with local model path. Can also read .mf.gz files, but we recommend decompressing
# up-front to avoid decompression overhead every time the model is initialized.
model = md.vl(model="path/to/moondream-2b-int8.mf")

# Load and process image
image = Image.open("path/to/image.jpg")
encoded_image = model.encode_image(image)

# Generate caption
caption = model.caption(encoded_image)["caption"]
print("Caption:", caption)

# Ask questions
answer = model.query(encoded_image, "What's in this image?")["answer"]
print("Answer:", answer)
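The comment in the example recommends decompressing a `.mf.gz` file up front. One way to do that is sketched below, using only the Python standard library. The helper name `decompress_model` is hypothetical and is not part of the moondream client; it assumes the compressed weights are an ordinary gzip file.

```python
import gzip
import shutil
from pathlib import Path


def decompress_model(gz_path: str) -> Path:
    """Decompress a gzip file next to the original and return the new path.

    Hypothetical helper, not part of the moondream library.
    """
    src = Path(gz_path)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got: {src.name}")

    # Drop only the trailing ".gz", e.g. "model.mf.gz" -> "model.mf".
    dst = src.with_suffix("")

    # Skip the work if a decompressed copy already exists.
    if not dst.exists():
        with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
            # Stream in chunks so large model files are not loaded into memory.
            shutil.copyfileobj(f_in, f_out)
    return dst
```

The returned path can then be converted with `str(...)` and passed as the `model` argument to `md.vl`, so that later initializations read the uncompressed file directly.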