ML Engineering 201: Describing “Measurable Impact” from an IC’s Perspective. Use the SWOP(S) method to measure your impact. 2d ago.
Published in AI Advances. Sparse Transformers. From naive sparse attention to Kimi’s ultra-long context model and DeepSeek’s NSA. Mar 1.
Published in AI Advances. Inside DeepSeek V3. A high-level overview of the technologies used in the DeepSeek V3 model. Feb 5.
Published in TDS Archive. 2024 Survival Guide for Machine Learning Engineer Interviews. A year-end summary for junior-level MLE interview preparation. Dec 24, 2024.
Published in TDS Archive. Is ReFT All We Needed? Representation Finetuning: Beyond the PEFT Techniques for fine-tuning LLMs. Nov 21, 2024.
Published in TDS Archive. A Walkthrough of Nvidia’s Latest Multi-Modal LLM Family. From LLaVA, Flamingo, to NVLM. Oct 10, 2024.
Published in TDS Archive. From Set Transformer to Perceiver Sampler. On multi-modal LLM Flamingo’s vision encoder. Oct 8, 2024.
Published in TDS Archive. The Mystery Behind the PyTorch Automatic Mixed Precision Library. How to get a 2X speed-up in model training using three lines of code. Sep 17, 2024.