36 Classic AI Papers: A Reading List with Links

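Source: WeChat official account post「张小珺访谈｜36篇经典论文演义：探索 AI 发展的所有论文！」(roughly, "Zhang Xiaojun Interviews | The Saga of 36 Classic Papers: Exploring All the Papers Behind AI's Development!")
Compiled: 2026-06-10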

Computer Vision

01. AlexNet — ImageNet Classification with Deep Convolutional Neural Networks (2012)

Paper link: https://papers.nips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html
Published in: NeurIPS (NIPS) 2012
PDF download: NeurIPS 2012.pdf

02. VGGNet — Very Deep Convolutional Networks for Large-Scale Image Recognition (2014)

Paper link: https://arxiv.org/abs/1409.1556
Published in: ICLR 2015
arXiv ID: arXiv:1409.1556
PDF download: ICLR 2015.pdf, arXiv:1409.1556

03. GoogLeNet/Inception — Going Deeper with Convolutions (2014)

Paper link: https://arxiv.org/abs/1409.4842
Published in: CVPR 2015
arXiv ID: arXiv:1409.4842
PDF download: CVPR 2015.pdf, arXiv:1409.4842

04. ResNet — Deep Residual Learning for Image Recognition (2015)

Paper link: https://arxiv.org/abs/1512.03385
Published in: CVPR 2016
arXiv ID: arXiv:1512.03385
PDF download: CVPR 2016.pdf, arXiv:1512.03385

05. U-Net — Convolutional Networks for Biomedical Image Segmentation (2015)

Paper link: https://arxiv.org/abs/1505.04597
Published in: MICCAI 2015
arXiv ID: arXiv:1505.04597
PDF download: MICCAI 2015 (Springer), arXiv:1505.04597

06. Batch Normalization — Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015)

Paper link: https://arxiv.org/abs/1502.03167
Published in: ICML 2015
arXiv ID: arXiv:1502.03167
PDF download: ICML 2015.pdf, arXiv:1502.03167

07. Faster R-CNN — Towards Real-Time Object Detection with Region Proposal Networks (2015)

Paper link: https://arxiv.org/abs/1506.01497
Published in: NeurIPS 2015
arXiv ID: arXiv:1506.01497
PDF download: NeurIPS 2015.pdf, arXiv:1506.01497

08. YOLO — You Only Look Once: Unified, Real-Time Object Detection (2016)

Paper link: https://arxiv.org/abs/1506.02640
Published in: CVPR 2016
arXiv ID: arXiv:1506.02640
PDF download: CVPR 2016.pdf, arXiv:1506.02640

09. MobileNet — Efficient Convolutional Neural Networks for Mobile Vision Applications (2017)

Paper link: https://arxiv.org/abs/1704.04861
Published in: arXiv preprint (2017); the follow-up MobileNetV2 was published at CVPR 2018
arXiv ID: arXiv:1704.04861
PDF download: arXiv:1704.04861

10. EfficientNet — Rethinking Model Scaling for Convolutional Neural Networks (2019)

Paper link: https://arxiv.org/abs/1905.11946
Published in: ICML 2019
arXiv ID: arXiv:1905.11946
PDF download: ICML 2019.pdf, arXiv:1905.11946

Natural Language Processing

11. Transformer — Attention Is All You Need (2017)

Paper link: https://arxiv.org/abs/1706.03762
Published in: NeurIPS 2017
arXiv ID: arXiv:1706.03762
PDF download: NeurIPS 2017.pdf, arXiv:1706.03762

12. BERT — Pre-training of Deep Bidirectional Transformers for Language Understanding (2018)

Paper link: https://arxiv.org/abs/1810.04805
Published in: NAACL-HLT 2019
arXiv ID: arXiv:1810.04805
PDF download: NAACL 2019.pdf, arXiv:1810.04805

13. GPT-1 — Improving Language Understanding by Generative Pre-Training (2018)

Paper link: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
Published in: OpenAI technical report (not formally published)
PDF download: OpenAI Technical Report

14. GPT-2 — Language Models are Unsupervised Multitask Learners (2019)

Paper link: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
Published in: OpenAI technical report (not formally published)
PDF download: OpenAI Technical Report

15. RoBERTa — A Robustly Optimized BERT Pretraining Approach (2019)

Paper link: https://arxiv.org/abs/1907.11692
Published in: arXiv preprint (2019); widely cited, but not formally published at a top-tier conference
arXiv ID: arXiv:1907.11692
PDF download: arXiv:1907.11692

16. T5 — Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (2019)

Paper link: https://arxiv.org/abs/1910.10683
Published in: JMLR 2020
arXiv ID: arXiv:1910.10683
PDF download: JMLR 2020.pdf, arXiv:1910.10683

17. GPT-3 — Language Models are Few-Shot Learners (2020)

Paper link: https://arxiv.org/abs/2005.14165
Published in: NeurIPS 2020
arXiv ID: arXiv:2005.14165
PDF download: NeurIPS 2020.pdf, arXiv:2005.14165

Multimodal and Vision-Language

18. CLIP — Learning Transferable Visual Models From Natural Language Supervision (2021)

Paper link: https://arxiv.org/abs/2103.00020
Published in: ICML 2021
arXiv ID: arXiv:2103.00020
PDF download: ICML 2021.pdf, arXiv:2103.00020

19. DALL-E — Zero-Shot Text-to-Image Generation (2021)

Paper link: https://arxiv.org/abs/2102.12092
Published in: ICML 2021
arXiv ID: arXiv:2102.12092
PDF download: ICML 2021.pdf, arXiv:2102.12092

20. BLIP — Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (2022)

Paper link: https://arxiv.org/abs/2201.12086
Published in: ICML 2022
arXiv ID: arXiv:2201.12086
PDF download: ICML 2022.pdf, arXiv:2201.12086

21. Flamingo — A Visual Language Model for Few-Shot Learning (2022)

Paper link: https://arxiv.org/abs/2204.14198
Published in: NeurIPS 2022
arXiv ID: arXiv:2204.14198
PDF download: NeurIPS 2022.pdf, arXiv:2204.14198

22. LLaVA — Visual Instruction Tuning (2023)

Paper link: https://arxiv.org/abs/2304.08485
Published in: NeurIPS 2023; the follow-up LLaVA-1.5 was accepted at CVPR 2024
arXiv ID: arXiv:2304.08485
PDF download: arXiv:2304.08485

23. GPT-4V — The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) (2023)

Paper link: https://arxiv.org/abs/2309.17421
Published in: arXiv preprint (2023); Microsoft Research technical report
arXiv ID: arXiv:2309.17421
PDF download: arXiv:2309.17421

Generative Models and Diffusion

24. GAN — Generative Adversarial Networks (2014)

Paper link: https://arxiv.org/abs/1406.2661
Published in: NeurIPS 2014
arXiv ID: arXiv:1406.2661
PDF download: NeurIPS 2014.pdf, arXiv:1406.2661

25. VAE — Auto-Encoding Variational Bayes (2013)

Paper link: https://arxiv.org/abs/1312.6114
Published in: ICLR 2014
arXiv ID: arXiv:1312.6114
PDF download: ICLR 2014.pdf, arXiv:1312.6114

26. StyleGAN — A Style-Based Generator Architecture for Generative Adversarial Networks (2019)

Paper link: https://arxiv.org/abs/1812.04948
Published in: CVPR 2019
arXiv ID: arXiv:1812.04948
PDF download: CVPR 2019.pdf, arXiv:1812.04948

27. DDPM — Denoising Diffusion Probabilistic Models (2020)

Paper link: https://arxiv.org/abs/2006.11239
Published in: NeurIPS 2020
arXiv ID: arXiv:2006.11239
PDF download: NeurIPS 2020.pdf, arXiv:2006.11239

28. Stable Diffusion — High-Resolution Image Synthesis with Latent Diffusion Models (2022)

Paper link: https://arxiv.org/abs/2112.10752
Published in: CVPR 2022
arXiv ID: arXiv:2112.10752
PDF download: CVPR 2022.pdf, arXiv:2112.10752

29. DiT — Scalable Diffusion Models with Transformers (2023)

Paper link: https://arxiv.org/abs/2212.09748
Published in: ICCV 2023
arXiv ID: arXiv:2212.09748
PDF download: ICCV 2023.pdf, arXiv:2212.09748

Vision Transformers and Self-Supervised Learning

30. ViT — An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020)

Paper link: https://arxiv.org/abs/2010.11929
Published in: ICLR 2021
arXiv ID: arXiv:2010.11929
PDF download: ICLR 2021.pdf, arXiv:2010.11929

31. MAE — Masked Autoencoders Are Scalable Vision Learners (2021)

Paper link: https://arxiv.org/abs/2111.06377
Published in: CVPR 2022
arXiv ID: arXiv:2111.06377
PDF download: CVPR 2022.pdf, arXiv:2111.06377

32. SimCLR — A Simple Framework for Contrastive Learning of Visual Representations (2020)

Paper link: https://arxiv.org/abs/2002.05709
Published in: ICML 2020
arXiv ID: arXiv:2002.05709
PDF download: ICML 2020.pdf, arXiv:2002.05709

33. MoCo — Momentum Contrast for Unsupervised Visual Representation Learning (2019)

Paper link: https://arxiv.org/abs/1911.05722
Published in: CVPR 2020
arXiv ID: arXiv:1911.05722
PDF download: CVPR 2020.pdf, arXiv:1911.05722

Reinforcement Learning

34. DQN — Playing Atari with Deep Reinforcement Learning (2013)

Paper link: https://arxiv.org/abs/1312.5602
Published in: NeurIPS 2013 Deep Learning Workshop; an extended version appeared in Nature (2015) as "Human-level control through deep reinforcement learning"
arXiv ID: arXiv:1312.5602
PDF download: Nature 2015.pdf, arXiv:1312.5602

35. AlphaGo — Mastering the Game of Go with Deep Neural Networks and Tree Search (2016)

Paper link: https://www.nature.com/articles/nature16961
Published in: Nature 529, 484–489 (2016)
PDF download: Nature 2016.pdf

36. PPO — Proximal Policy Optimization Algorithms (2017)

Paper link: https://arxiv.org/abs/1707.06347
Published in: arXiv preprint (2017); OpenAI technical report, widely cited but not formally published at a top-tier conference
arXiv ID: arXiv:1707.06347
PDF download: arXiv:1707.06347
