Published in Towards Data Science | A Walkthrough of Nvidia’s Latest Multi-Modal LLM Family | From LLaVA, Flamingo, to NVLM | Oct 10
Published in Towards Data Science | From Set Transformer to Perceiver Sampler | On multi-modal LLM Flamingo’s vision encoder | Oct 8
Published in Towards Data Science | The Mystery Behind the PyTorch Automatic Mixed Precision Library | How to get 2X speed up model training using three lines of code | Sep 17
Published in Towards Data Science | Transformer? Diffusion? Transfusion! | A gentle introduction to the latest multi-modal transfusion model | Sep 12
Published in Towards Data Science | A Practical Guide to Contrastive Learning | How to build your very first SimSiam model with FashionMNIST | Jul 30
Published in Towards Data Science | From MOCO v1 to v3: Towards Building a Dynamic Dictionary for Self-Supervised Learning — Part 1 | A gentle recap on the momentum contrast learning framework | Jul 4
Published in AI Advances | The Perspective of Structures | A gentle summary of three CVPR’24 papers on domain adaptation | Jun 24
Published in Towards Data Science | A Patch is More than 16*16 Pixels | On Pixel Transformer and Ultra-long Sequence Distributed Transformer | Jun 17
Published in Towards Data Science | From Masked Image Modeling to Autoregressive Image Modeling | A brief review of the image foundation model pre-training objectives | Jun 10
Published in Towards Data Science | ML Engineering 101: A Thorough Explanation of The Error “DataLoader worker (pid(s) xxx) exited… | A deep dive into PyTorch DataLoader with Multiprocessing | Jun 3