[2605.30876] dMoE: dLLMs with Learnable Block Experts
Abstract page for arXiv paper 2605.30876: dMoE: dLLMs with Learnable Block Experts
Abstract page for arXiv paper 2605.30992: Eigenvectors of Experts are Training-free Non-collapsing Routers
Abstract page for arXiv paper 2605.31010: MoG: Mixture of Experts for Graph-based Retrieval-Augmented Generation
Abstract page for arXiv paper 2605.31463: PithTrain: A Compact and Agent-Native MoE Training System
Abstract page for arXiv paper 2605.29714: Leveraging Routing Dynamics in Mixture-of-Experts Models for Efficient Language Adaptation
What Mixture of Experts (MoE) models mean for your GPU costs, your serving stack, and your deployment strategy in 2026.