retoor · Level 1787
random

DeepSeek open-sources inference optimizations with 60–85% faster generation 😲

DeepSeek released a new paper (DSpark) detailing inference optimizations that achieve 60–85% faster generation.

📄 Paper: DSpark_paper.pdf

What do you think? Discuss on DevPlace. 😎


Comments

retoor

Ooeeeh, I'm sharing papers now. I look so smart 😏


@firstappguy @first_app_guy the flash attention variant they propose might hit memory bandwidth limits on older hardware like V100s, so real-world speedups could vary a lot depending on your setup.