Published in AI Advances: Inside DeepSeek V3 (Feb 5). A high-level overview of the technologies used in the DeepSeek V3 model.
Published in TDS Archive: 2024 Survival Guide for Machine Learning Engineer Interviews (Dec 24, 2024). A year-end summary for junior-level MLE interview preparation.
Published in TDS Archive: Is ReFT All We Needed? (Nov 21, 2024). Representation Finetuning — Beyond the PEFT Techniques for fine-tuning LLMs.
Published in TDS Archive: A Walkthrough of Nvidia’s Latest Multi-Modal LLM Family (Oct 10, 2024). From LLaVA and Flamingo to NVLM.
Published in TDS Archive: From Set Transformer to Perceiver Sampler (Oct 8, 2024). On the vision encoder of the multi-modal LLM Flamingo.
Published in TDS Archive: The Mystery Behind the PyTorch Automatic Mixed Precision Library (Sep 17, 2024). How to get a 2X training speed-up with three lines of code.
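As an aside for this entry: the "three lines" refer to the standard torch.cuda.amp recipe (an autocast context plus a GradScaler). Below is a minimal sketch of that recipe, using a placeholder model, optimizer, and synthetic data rather than the article's actual setup.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = "cuda"  # the autocast/GradScaler recipe shown here targets CUDA
model = torch.nn.Linear(512, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = DataLoader(
    TensorDataset(torch.randn(256, 512), torch.randint(0, 10, (256,))),
    batch_size=32,
)

scaler = torch.cuda.amp.GradScaler()            # line 1: loss scaler to avoid fp16 underflow
for inputs, targets in loader:
    inputs, targets = inputs.to(device), targets.to(device)
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():             # line 2: run the forward pass in mixed precision
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    scaler.scale(loss).backward()               # line 3: backward and step through the scaler
    scaler.step(optimizer)
    scaler.update()
```

The forward pass runs in fp16/bf16 where safe, while the scaler keeps small gradients from underflowing before the optimizer step.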
Published in TDS Archive: Transformer? Diffusion? Transfusion! (Sep 12, 2024). A gentle introduction to the latest multi-modal Transfusion model.
Published in TDS Archive: A Practical Guide to Contrastive Learning (Jul 30, 2024). How to build your very first SimSiam model with FashionMNIST.
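For readers new to SimSiam, the core of such a model is a shared encoder/projector, a small predictor head, and a symmetric negative cosine loss with a stop-gradient. The sketch below is illustrative only: the tiny CNN backbone and dimensions are assumptions for 1×28×28 FashionMNIST inputs, not the article's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimSiam(nn.Module):
    """Minimal SimSiam: shared backbone + projector, plus a predictor head."""
    def __init__(self, dim=256):
        super().__init__()
        # Tiny CNN backbone for 1x28x28 FashionMNIST images (illustrative choice).
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.projector = nn.Sequential(
            nn.Linear(64, dim), nn.BatchNorm1d(dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        self.predictor = nn.Sequential(
            nn.Linear(dim, dim // 4), nn.ReLU(), nn.Linear(dim // 4, dim)
        )

    def forward(self, x1, x2):
        z1 = self.projector(self.backbone(x1))
        z2 = self.projector(self.backbone(x2))
        p1, p2 = self.predictor(z1), self.predictor(z2)
        return p1, p2, z1.detach(), z2.detach()   # stop-gradient on the target branch

def simsiam_loss(p1, p2, z1, z2):
    # Symmetric negative cosine similarity, the SimSiam objective.
    return -(F.cosine_similarity(p1, z2).mean() + F.cosine_similarity(p2, z1).mean()) / 2

# Usage with two augmented views of the same batch (augmentations omitted here):
model = SimSiam()
x1, x2 = torch.randn(8, 1, 28, 28), torch.randn(8, 1, 28, 28)
loss = simsiam_loss(*model(x1, x2))
```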
Published in TDS Archive: From MoCo v1 to v3: Towards Building a Dynamic Dictionary for Self-Supervised Learning — Part 1 (Jul 4, 2024). A gentle recap of the momentum contrast learning framework.
Published in AI Advances: The Perspective of Structures (Jun 24, 2024). A gentle summary of three CVPR’24 papers on domain adaptation.
Published in TDS Archive: A Patch is More than 16*16 Pixels (Jun 17, 2024). On Pixel Transformer and Ultra-long Sequence Distributed Transformer.
Published in TDS Archive: From Masked Image Modeling to Autoregressive Image Modeling (Jun 10, 2024). A brief review of image foundation model pre-training objectives.
Published in TDS Archive: ML Engineering 101: A Thorough Explanation of the Error “DataLoader worker (pid(s) xxx) exited…” (Jun 3, 2024). A deep dive into the PyTorch DataLoader with multiprocessing.
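For context on this entry: the error in question is raised when a DataLoader worker subprocess dies (commonly from shared-memory or RAM exhaustion). The sketch below shows the multiprocessing setup involved and the knobs usually tuned while debugging it; the dataset and sizes are illustrative, not taken from the article.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class RandomImages(Dataset):
    """Illustrative dataset; each __getitem__ call runs inside a worker subprocess."""
    def __len__(self):
        return 1024
    def __getitem__(self, idx):
        return torch.randn(3, 224, 224), idx % 10

if __name__ == "__main__":                 # guard required when workers are spawned
    loader = DataLoader(
        RandomImages(),
        batch_size=32,
        num_workers=4,                     # >0 launches worker processes; if one is killed
                                           # (e.g. OOM), PyTorch raises
                                           # "DataLoader worker (pid(s) ...) exited unexpectedly"
        pin_memory=True,
        persistent_workers=True,           # keep workers alive across epochs
    )
    # Setting num_workers=0 moves loading back into the main process, which is the
    # quickest way to check whether the crash comes from the workers themselves.
    for images, labels in loader:
        pass
```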
Paper Reading (ECCV 2020) — DETR: End-to-End Object Detection with Transformers (Sep 5, 2023). The last paper I read about object detection was Faster R-CNN, published at NeurIPS 2015. The world has changed so much since then…
Paper Reading (CVPR’23): Dynamically Instance-Guided Adaptation: A Backward-Free Approach for… (Aug 21, 2023). Domain generalization is a big topic in deep learning, as real open-world data is always much more complicated than the collected, fixed…
Paper Reading — SwinIR: Image Restoration Using Swin Transformer (Aug 14, 2023). The Vision Transformer (ViT) has gained huge attention and success since its introduction in 2020. One important variant of ViT is the Swin…
Google ELIXR — Has The New Era of AI-assisted Medical Diagnosis Arrived? (Aug 8, 2023). For years, researchers and clinicians have dreamed of a system that could digest large volumes of medical images and reports and…
Paper Reading — Do Vision Transformers See Like Convolutional Neural Networks? (May 20, 2022). The Vision Transformer (ViT) has gained huge popularity since its publication and has shown great potential over CNN-based models such…
Face Mask Detection Using Faster R-CNN (Jan 4, 2021). Faster R-CNN is an efficient tool for detecting objects in 2D color images. The model was first proposed at NeurIPS 2015 and later published in TPAMI, and is an…