# Image Annotation API Reference
API documentation for the Image Annotation service.
## Overview
The Image Annotation API provides endpoints for annotating images using Vision-Language Models (VLMs).
**Work in Progress:** This API is under active development. See the GitHub repository for the latest updates.
## REST API Endpoints
### POST /api/annotate
Annotate an image using a VLM.
Request:
```json
{
  "image_path": "path/to/image.png",
  "model": "gpt-4-vision",
  "prompt": "Describe this image for HED annotation"
}
```
Response:
```json
{
  "description": "A person riding a bicycle...",
  "objects": ["person", "bicycle", "street"],
  "hed_annotation": "Sensory-event, Visual-presentation, ..."
}
```
### GET /api/annotations/{image_id}
Retrieve stored annotations for an image.
### GET /api/models
List available VLM models.
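A quick sketch of calling the two GET endpoints, again assuming a locally hosted service; `img_0001` is a placeholder `image_id`, not a value from the source:

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed host/port

# List the VLM models the service exposes
models = requests.get(f"{BASE_URL}/api/models", timeout=30).json()
print(models)

# Retrieve stored annotations for a previously annotated image
stored = requests.get(f"{BASE_URL}/api/annotations/img_0001", timeout=30).json()
print(stored)
```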
## Services
### VLM Service
The VLM service handles communication with vision-language models:
- Ollama: Local models via Ollama
- OpenAI: GPT-4 Vision
- Anthropic: Claude Vision
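How the service decides which backend to use is not documented here; the snippet below is only a hypothetical sketch of the kind of model-to-provider dispatch a multi-backend service needs, with made-up model-name prefixes for the Ollama case:

```python
from typing import Dict

# Hypothetical mapping from model-name prefixes to providers; the real
# VLMService routing logic may differ.
PROVIDERS: Dict[str, str] = {
    "gpt-4": "openai",
    "claude": "anthropic",
    "llava": "ollama",  # example of a local model served via Ollama
}

def resolve_provider(model: str) -> str:
    """Return the provider responsible for a given model name."""
    for prefix, provider in PROVIDERS.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"No provider registered for model {model!r}")

print(resolve_provider("gpt-4-vision"))  # -> "openai"
```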
### Annotation Storage
Annotations are stored in JSON format in the `annotations/` directory.
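As a sketch of what a stored record might look like, assuming one JSON file per image named after its image ID (the exact filename convention is not specified in the source), with fields mirroring the `/api/annotate` response:

```python
import json
from pathlib import Path

ANNOTATION_DIR = Path("./annotations")  # matches the ANNOTATION_DIR default
ANNOTATION_DIR.mkdir(parents=True, exist_ok=True)

# Hypothetical record mirroring the /api/annotate response fields
record = {
    "description": "A person riding a bicycle...",
    "objects": ["person", "bicycle", "street"],
    "hed_annotation": "Sensory-event, Visual-presentation, ...",
}

# One file per image, keyed by a placeholder image_id
path = ANNOTATION_DIR / "img_0001.json"
path.write_text(json.dumps(record, indent=2))

# Read it back
loaded = json.loads(path.read_text())
print(loaded["objects"])
```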
## Configuration
Environment variables:
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key | - |
| `OLLAMA_BASE_URL` | Ollama server URL | `http://localhost:11434` |
| `ANNOTATION_DIR` | Output directory | `./annotations` |
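How the service consumes these variables is an implementation detail; the snippet below is only an illustrative sketch of reading them with the defaults from the table:

```python
import os

# Defaults mirror the table above; OPENAI_API_KEY has no default.
openai_api_key = os.environ.get("OPENAI_API_KEY")
ollama_base_url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
annotation_dir = os.environ.get("ANNOTATION_DIR", "./annotations")

if openai_api_key is None:
    print("OPENAI_API_KEY not set; OpenAI-backed models will be unavailable.")
```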
## Python API
```python
import asyncio

from image_annotation.services import VLMService

async def main():
    # Initialize service
    service = VLMService(model="gpt-4-vision")
    # Annotate image (annotate() is a coroutine, so it must be awaited)
    result = await service.annotate("path/to/image.png")
    print(result.description)
    print(result.hed_annotation)

asyncio.run(main())
```
**Full API Reference:** Detailed Python API documentation will be auto-generated once the package structure is finalized. See the source code for the current implementation.