The AMD Qwen1.5-7B-Chat hybrid model is a quantized 7B-parameter chat language model built for hybrid execution across the NPU and the integrated GPU on AMD Ryzen AI PCs. Its repository name indicates AWQ quantization to 4-bit integer weights (asymmetric, group size 128) with FP16 and an ONNX export. It is intended for efficient local inference through the ONNX Runtime GenAI (OnnxRuntime GenAI) framework as used in AMD's Ryzen AI software, so that the model runs on consumer AMD hardware without a discrete GPU or a cloud service. Model page: https://huggingface.co/amd/Qwen1.5-7B-Chat-awq-g128-int4-asym-fp16-onnx-hybrid
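The repository name in the link above appears to encode the quantization recipe. The following is a minimal sketch that splits the name into its parts, assuming the usual reading of each token (awq as the quantization method, g128 as group size 128, int4 as 4-bit integer weights, asym as asymmetric quantization, fp16 as 16-bit floating point, onnx as the export format, hybrid as the NPU plus iGPU target). The function name `describe_model_id` and the token table are hypothetical helpers for illustration, not part of any AMD or Hugging Face API.

```python
import re

# Assumed meanings of the fixed name tokens; this is an illustrative reading
# of the repository name, not an official AMD naming specification.
TOKEN_MEANINGS = {
    "awq": ("method", "AWQ"),
    "asym": ("scheme", "asymmetric"),
    "onnx": ("format", "ONNX"),
    "hybrid": ("target", "hybrid (NPU + iGPU)"),
}

MODEL_ID = "amd/Qwen1.5-7B-Chat-awq-g128-int4-asym-fp16-onnx-hybrid"


def describe_model_id(model_id: str) -> dict:
    """Return the quantization hints found in a repository name (hypothetical helper)."""
    name = model_id.split("/")[-1]  # drop the organization prefix
    info = {}
    for token in name.lower().split("-"):
        if token in TOKEN_MEANINGS:
            key, value = TOKEN_MEANINGS[token]
            info[key] = value
        elif m := re.fullmatch(r"g(\d+)", token):
            info["group_size"] = int(m.group(1))   # weights quantized per group
        elif m := re.fullmatch(r"int(\d+)", token):
            info["weight_bits"] = int(m.group(1))  # integer weight width
        elif m := re.fullmatch(r"fp(\d+)", token):
            info["float_bits"] = int(m.group(1))   # floating-point width
        # Tokens such as the model family and size are ignored here.
    return info


info = describe_model_id(MODEL_ID)
print(info)
```

Tokens the sketch does not recognize are skipped rather than guessed at, so it stays conservative when applied to other repository names.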