It took a while (a year or so), but now you can use your AMD AI laptop for local AI on Linux.
FastFlowLM, together with AMD, has brought Linux support for the XDNA2 NPU (50 TOPS) on the AMD Ryzen AI 300 series (Krackan Point).
https://fastflowlm.com/docs/install_lin/
Tried it out, and the initial experience is excellent. It’s working better on Linux than on Windows now, for two reasons:
- It’s about 10% faster.
- There is no memory limit: all of the RAM is accessible to the NPU. So running gpt-oss:20b on a 32 GB machine is no problem, while on Windows it is (https://github.com/FastFlowLM/FastFlowLM/issues/242).
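A quick back-of-envelope check of why gpt-oss:20b fits in 32 GB once the NPU can address all system RAM (the ~4-bit quantization figure here is an illustrative assumption, not a statement about FastFlowLM's exact format):

```python
def model_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB for a quantized model."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * 1e9 * bytes_per_weight / 1e9

# gpt-oss:20b, assuming roughly 4-bit weights (illustrative assumption)
weights = model_weight_gb(20, 4)
print(f"~{weights:.0f} GB of weights, comfortably within 32 GB RAM")
```

On top of the weights you still need room for the KV cache and the OS, which is exactly why a hard NPU memory cap on Windows becomes the bottleneck.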
Translating with TranslateGemma‑4B runs at roughly ~20 T/s, while llama.cpp (Vulkan) achieves about 10 T/s. Prompt processing (prefill) is also faster, at least in Time to First Token (TTFT).
gpt-oss:20b delivers around 19.5 T/s ("decoding_speed_tps": 19.444631769512764).
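The decoding_speed_tps figure is just generated tokens divided by decode-phase time; a minimal sketch (the field name comes from the stats output quoted above, the input numbers below are hypothetical):

```python
def decoding_speed_tps(generated_tokens: int, decode_seconds: float) -> float:
    """Tokens per second during the decode phase (excludes prefill/TTFT)."""
    return generated_tokens / decode_seconds

# e.g. 256 tokens generated in ~13.17 s of decoding
print(round(decoding_speed_tps(256, 13.17), 1))
```

Note that decode speed and TTFT are separate metrics: a model can decode quickly but still feel sluggish if prefill is slow, which is why the faster prefill on Linux matters.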
AMD NPU LLM benchmarks
For official benchmarks see:
https://fastflowlm.com/benchmarks/
Conclusion: AMD NPU on Linux is great for laptops
Yes, the AMD NPU is great on Linux.
Biggest win: it won’t slow down your laptop’s CPU or GPU performance.
Combined with much better power efficiency, it won’t deplete your battery as fast. It is really usable.
The NPU can also be used for speech to text (Whisper). Image generation is in the works, together with support for newer models like Qwen3.5.