llama.cpp is an experimental backend in Lemonade Server that runs GGUF models through llama.cpp's Vulkan-powered server, supporting both CPU and GPU execution alongside the default OGA backend. It adds chat, embeddings, and reranking endpoints to Lemonade's unified API.
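
Since the backend is exposed through Lemonade's unified, OpenAI-compatible API, a client talks to it the same way regardless of whether the OGA or llama.cpp backend is serving the model. Below is a minimal sketch of a chat request; the port (8000), base path (`/api/v1`), and model name are assumptions for illustration and may differ in your installation.

```python
# Minimal sketch: sending a chat request to a locally running Lemonade Server.
# The base URL and model name below are assumed defaults, not guaranteed values.
import requests

BASE_URL = "http://localhost:8000/api/v1"  # assumed default; adjust to your setup

response = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "Qwen2.5-0.5B-Instruct-GGUF",  # hypothetical GGUF model name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
response.raise_for_status()

# OpenAI-compatible responses put the reply under choices[0].message.content.
print(response.json()["choices"][0]["message"]["content"])
```

The embeddings and reranking endpoints follow the same pattern: POST a JSON body to the corresponding route under the same base URL.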