Wednesday, June 17, 2026 · 6:30 pm – 7:30 pm
Price: Free
This workshop explores practical research directions for improving transformer efficiency in small and medium-sized language models. As models grow larger, compute cost, memory usage, inference latency, and deployment complexity increase significantly. The session covers architectural experiments, training optimisations, efficient attention mechanisms, memory-efficient techniques, and inference-focused design choices being explored while building open-source LLMs at FrontiersMind.
Speaker: Abhay Kumar, co-founder of FrontiersMind, an AI research lab focused on efficient small and medium-sized language models optimised for enterprise and real-world deployment.
What to expect:
Pre-read: Basics of Transformer architectures, The Illustrated Transformer, Memory-Efficient Attention (MHA vs. MQA vs. GQA vs. MLA), Understanding DeepSeek's Multi-Head Latent Attention.
DigitalOcean Cloud Credits — up to ₹1Cr
Up to $10K/month in DigitalOcean cloud and GPU credits for 12 months, plus 15 months of free paid support. For AI-native startups.