100 Must-Read Generative AI Papers of 2024


2024 was a banner year for generative AI research. What surprised us most was how dramatically the field's focus shifted: across 2023 and 2024 the landscape began to look very different. Because large models can already do so much, research attention has increasingly moved to the application layer.

Paper collection: https://github.com/aishwaryanr/awesome-generative-ai-guide

[Figure: classification framework of the paper collection] The framework treats AI research as a system running from input to output, just like a real deployment. It is divided into several layers, each with its own focus:

Input layer: The starting point of LLM applications, covering research on input processing and prompt engineering. By carefully shaping how input is framed, we can get large language models (LLMs) to produce better results.

Data/model layer: This layer concerns the model's "fuel" and "engine". Research here covers improving data quality, generating synthetic data, and ensuring models are trained on rich, diverse datasets. It also includes foundational innovations such as new model architectures, multimodal capabilities (combining text, images, and more), cost and size optimization, model alignment, and extending context length.

Application layer: Research on applying LLMs in the real world. Whether through domain-specific models (code generation, text-to-SQL, or medical applications) or techniques like fine-tuning, retrieval-augmented generation (RAG), and multi-agent systems, this layer is where theory turns into practical tools.

Output layer: How do we make sure model outputs are reliable? Research here centers on evaluation methods, from human-in-the-loop systems to benchmarks and LLM-as-a-judge, offering a range of effective ways to assess AI output.

Challenges: The limitations of generative AI: adversarial attacks, model interpretability, hallucination, and more. These are real obstacles we must overcome to make AI safer and more reliable.

Input Layer

Prompt Engineering

  • DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
  • The Prompt Report: A Systematic Survey of Prompting Techniques
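To make the prompt-engineering idea concrete, here is a toy sketch (a hypothetical illustration, not drawn from either paper above): the same question, framed with an instruction and a few worked examples, typically yields more reliable completions than a bare query. The model call itself is omitted; only the prompt construction is shown.

```python
# Minimal few-shot prompt builder: output quality hinges on how the
# input is framed, so we assemble instruction + examples + query.

def build_few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a prompt with an instruction, worked examples, and the new query."""
    lines = [task, ""]
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
        lines.append("")  # blank line between examples
    lines.append(f"Q: {query}")
    lines.append("A:")  # leave the answer slot open for the model
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    task="Classify the sentiment of each review as positive or negative.",
    examples=[
        ("The battery lasts all day, fantastic.", "positive"),
        ("Broke after two uses.", "negative"),
    ],
    query="Arrived late but works perfectly.",
)
print(prompt)
```

Techniques surveyed in The Prompt Report (few-shot, chain-of-thought, role prompting, and so on) are all variations on this theme of restructuring the input string.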

Data/Model Layer

1. Data Quality / Synthetic Data Generation

  • On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
  • Detecting Pretraining Data from Large Language Models
  • A Survey on Data Synthesis and Augmentation for Large Language Models
  • Scaling Synthetic Data Creation with 1,000,000,000 Personas
  • Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts

2. Foundation Models

  • DeepSeek-V3 Technical Report
  • xLSTM: Extended Long Short-Term Memory
  • Sparks of Artificial General Intelligence: Early Experiments with GPT-4
  • A Survey of Large Language Models
  • SAM 2: Segment Anything in Images and Videos
  • Qwen Technical Report
  • RWKV: Reinventing RNNs for the Transformer Era
  • KAN: Kolmogorov-Arnold Networks
  • The Llama 3 Herd of Models
  • Segment Anything
  • Differential Transformer
  • Foundation Models for Music: A Survey

3. Model Optimization (Size, Cost)

  • A Survey of Small Language Models
  • The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  • TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
  • A Survey on LLM Inference-Time Self-Improvement
  • FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  • LLM Pruning and Distillation in Practice: The Minitron Approach
  • What is the Role of Small Models in the LLM Era: A Survey

4. Multimodality

  • Towards Generalist Biomedical AI
  • MusicLM: Generating Music From Text
  • The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
  • Multimodal Chain-of-Thought Reasoning in Language Models

5. Model Alignment

  • Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
  • RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  • The Capacity for Moral Self-Correction in Large Language Models
  • sDPO: Don't Use Your Data All at Once

6. Long Context

  • LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  • Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction
  • YaRN: Efficient Context Window Extension of Large Language Models
  • LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
  • LongNet: Scaling Transformers to 1,000,000,000 Tokens

Application Layer

1. Domain-Specific Models

  • Qwen2.5-Coder Technical Report
  • A Survey of Large Language Models for Healthcare: From Data, Technology, and Applications to Accountability and Ethics
  • ChemCrow: Augmenting Large-Language Models with Chemistry Tools
  • MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models
  • A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?
  • A Survey on Language Models for Code
  • PMC-LLaMA: Towards Building Open-Source Language Models for Medicine
  • ChemLLM: A Chemical Large Language Model
  • A Survey of Large Language Models in Medicine: Progress, Application, and Challenge
  • Can Large Language Models Unlock Novel Scientific Research Ideas?

2. RAG

  • Corrective Retrieval Augmented Generation
  • HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction
  • Active Retrieval Augmented Generation
  • GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning
  • Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  • Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  • Retrieval-Augmented Generation for Large Language Models: A Survey
  • Text2SQL is Not Enough: Unifying AI and Databases with TAG
  • Searching for Best Practices in Retrieval-Augmented Generation
  • Seven Failure Points When Engineering a Retrieval Augmented Generation System
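To make the retrieve-then-generate pattern these papers study concrete, here is a deliberately tiny sketch (illustrative only: the word-overlap retriever stands in for a real vector index, the two-document corpus is made up, and the generation step is stubbed out):

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank documents by shared words with the query (a stand-in for embedding search)."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return ranked[:k]

def build_rag_prompt(query: str, corpus: list[str]) -> str:
    """Prepend the retrieved context to the question before calling the LLM."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "RWKV reinvents RNNs for the transformer era.",
    "FlashAttention-2 speeds up attention with better parallelism.",
]
print(build_rag_prompt("How does FlashAttention improve parallelism?", corpus))
```

Variants in the list above change individual stages of this loop: Self-RAG and Corrective RAG add critique/retry steps, GNN-RAG and HybridRAG swap in graph-based retrieval, and "Seven Failure Points" catalogs where the pipeline breaks in practice.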

3. Agents

  • The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
  • Large Language Model-Brained GUI Agents: A Survey
  • A Survey on Large Language Model based Autonomous Agents
  • Augmented Language Models: a Survey
  • A Taxonomy of AgentOps for Enabling Observability of Foundation Model based Agents
  • Toolformer: Language Models Can Teach Themselves to Use Tools

4. Multi-Agent Systems

  • Emergent Autonomous Scientific Research Capabilities of Large Language Models
  • OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
  • AutoGen Studio: A No-Code Developer Tool for Building and Debugging Multi-Agent Systems
  • Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
  • AIOS: LLM Agent Operating System
  • AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
  • Large Language Model-Based Agents for Software Engineering: A Survey

5. LLM Fine-Tuning

  • Instruction Tuning with GPT-4
  • LLMs + Persona-Plug = Personalized LLMs
  • Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
  • QLoRA: Efficient Finetuning of Quantized LLMs
  • LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
  • LoRA+: Efficient Low Rank Adaptation of Large Models
  • SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL
  • A Survey on Employing Large Language Models for Text-to-SQL Tasks
  • Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought

Output Layer

LLM Evaluation

  • A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
  • Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models
  • RAGEval: Scenario-Specific RAG Evaluation Dataset Generation Framework
  • Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
  • A Survey on LLM-as-a-Judge
  • AgentBench: Evaluating LLMs as Agents
  • A Survey on Evaluation of Large Language Models
  • Self-Taught Evaluators
  • PromptBench: A Unified Library for Evaluation of Large Language Models
  • A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B
  • Evaluating Large Language Models: A Comprehensive Survey
  • Mathematical Capabilities of ChatGPT
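The LLM-as-a-judge approach several of these papers examine boils down to two steps: build a grading prompt, then parse a structured score out of the judge model's free-text reply. A toy harness (illustrative; the judge call itself is stubbed out, and the prompt wording is a made-up example):

```python
import re
from typing import Optional

def build_judge_prompt(question: str, answer: str) -> str:
    """Grading prompt sent to the judge model."""
    return (
        "Rate the following answer for correctness on a scale of 1-5.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with 'Score: <n>' and a one-line justification."
    )

def parse_score(judge_reply: str) -> Optional[int]:
    """Extract the 1-5 score from the judge's reply, or None if it ignored the format."""
    match = re.search(r"Score:\s*([1-5])", judge_reply)
    return int(match.group(1)) if match else None

print(parse_score("Score: 4 - mostly correct, minor omission."))
```

The "Replacing Judges with Juries" paper above extends this pattern by averaging such scores across a panel of diverse judge models instead of trusting a single one.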

Challenges

Limitations of Generative AI

  • LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
  • A Survey on Hallucination in Large Vision-Language Models
  • A Survey of Hallucination in Large Foundation Models
  • Chain-of-Verification Reduces Hallucination in Large Language Models
  • A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
  • One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era
  • Knowledge Conflicts for LLMs: A Survey
