novita/llama-32-11b-vision-instruct

public

Published on 3/5/2025

llama-3.2-11b-vision-instruct

Llama-3.2-11B-Vision-Instruct is an 11B-parameter multimodal model that processes images and text together. It is suited to image captioning, visual question answering, and detailed image analysis.

Models

llama-3.2-11b-vision-instruct

anthropic

chat

edit

apply