Published in AI Advances: Inside DeepSeek V3 (Feb 5). A high-level overview of the technologies used in the DeepSeek V3 model.
Published in TDS Archive: 2024 Survival Guide for Machine Learning Engineer Interviews (Dec 24, 2024). A year-end summary for junior-level MLE interview preparation.
Published in TDS Archive: Is ReFT All We Needed? (Nov 21, 2024). Representation Finetuning — beyond PEFT techniques for fine-tuning LLMs.
Published in TDS Archive: A Walkthrough of Nvidia’s Latest Multi-Modal LLM Family (Oct 10, 2024). From LLaVA and Flamingo to NVLM.
Published in TDS Archive: From Set Transformer to Perceiver Sampler (Oct 8, 2024). On the vision encoder of the multi-modal LLM Flamingo.
Published in TDS Archive: The Mystery Behind the PyTorch Automatic Mixed Precision Library (Sep 17, 2024). How to get a 2x training speedup with three lines of code.
Published in TDS Archive: Transformer? Diffusion? Transfusion! (Sep 12, 2024). A gentle introduction to the latest multi-modal Transfusion model.
Published in TDS Archive: A Practical Guide to Contrastive Learning (Jul 30, 2024). How to build your very first SimSiam model with FashionMNIST.
Published in TDS Archive: From MOCO v1 to v3: Towards Building a Dynamic Dictionary for Self-Supervised Learning — Part 1 (Jul 4, 2024). A gentle recap of the momentum contrast learning framework.