@lemonade
Refreshingly fast LLMs on GPUs and NPUs. Install, run LLMs locally, and integrate with apps in minutes! https://lemonade-server.ai/
Llama 4 Scout is a 17B parameter multimodal AI model with 16 experts, offering industry-leading text and image understanding. https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct
State-of-the-art code LLM with 32B parameters, with coding abilities matching GPT-4o's. Enhanced code generation, code reasoning, and code fixing capabilities. https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct-GGUF
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.
Devstral is an agentic LLM for software engineering tasks, built in a collaboration between Mistral AI and All Hands AI 🙌. Devstral excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. https://huggingface.co/mistralai/Devstral-Small-2507_gguf