Reddit - r/MachineLearning 2h ago

If your GPU can run inference, it should be able to fine-tune too. [P]

I spent the last few months building a new sparse fine-tuning method for MoE models called USAF. The goal was simple: if your GPU can run inference on an MoE model, it should also be able to fine-tune it.

On my AMD RX 6750 XT (12 GB), I can fine-tune Qwen3-30B-A3B by training sparse expert weights and the router instead of adapters.
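
For anyone unfamiliar with the general idea, here is a minimal, generic sketch of sparse fine-tuning: only a fixed subset of a weight matrix gets updated, and everything else stays frozen. This is an illustration under my own simplifying assumptions (a random mask, plain SGD, a toy 4x8 "expert" matrix, and hypothetical helper names `make_sparse_mask` and `masked_sgd_step`). It is not the actual USAF algorithm; see the repo for that.

```python
import random

def make_sparse_mask(rows, cols, density, rng):
    """Pick a fixed random subset of weight positions that are allowed to train.

    Random selection is an assumption for illustration only.
    """
    k = max(1, int(rows * cols * density))
    chosen = set(rng.sample(range(rows * cols), k))
    return [[(r * cols + c) in chosen for c in range(cols)] for r in range(rows)]

def masked_sgd_step(weights, grads, mask, lr):
    """Apply an SGD update only where mask is True; other weights stay frozen."""
    for r, row in enumerate(weights):
        for c in range(len(row)):
            if mask[r][c]:
                row[c] -= lr * grads[r][c]

rng = random.Random(0)
rows, cols = 4, 8

# Toy stand-in for one expert's weight matrix.
expert = [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]
before = [row[:] for row in expert]

mask = make_sparse_mask(rows, cols, density=0.125, rng=rng)
grads = [[1.0] * cols for _ in range(rows)]  # dummy gradient, not from a real loss

masked_sgd_step(expert, grads, mask, lr=0.1)

trainable = sum(map(sum, mask))
changed = sum(
    before[r][c] != expert[r][c] for r in range(rows) for c in range(cols)
)
print(f"trainable entries: {trainable} of {rows * cols}, changed: {changed}")
```

The point of the toy example is that gradients and optimizer state only need to be kept for the masked entries, which is the general reason sparse updates can be much cheaper in memory than updating every weight.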

The project is completely open source under the Apache 2.0 license. I'm not trying to build a business, sell anything, or monetize it in any way. I just wanted to share something I built that I think is genuinely interesting.

I'd love to hear your feedback, especially from people working with MoE models.

GitHub: https://github.com/tsuyu122/usaf

