The Best Large Language Model Resources, Continuously Updated


Contents

  • 数据 Data

  • 微调 Fine-Tuning

  • 推理 Inference

  • 评估 Evaluation

  • 体验 Usage

  • 知识库 RAG

  • 智能体 Agents

  • 搜索 Search

  • 书籍 Book

  • 课程 Course

  • 教程 Tutorial

  • 论文 Paper

  • Tips

Resources available at:

https://github.com/WangRongsheng/awesome-LLM-resourses?tab=readme-ov-file

数据 Data

Note

Although this section is called "Data", it does not provide ready-made datasets; instead it collects tools and methods for acquiring and processing data at scale.

We firmly believe it is better to teach someone to fish than to give them a fish. A toy deduplication sketch in that spirit follows the list below.

  1. AutoLabel: Label, clean and enrich text datasets with LLMs.

  2. LabelLLM: The Open-Source Data Annotation Platform.

  3. data-juicer: A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs!

  4. OmniParser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.

  5. MinerU: MinerU is a one-stop, open-source, high-quality data extraction tool that supports PDF, webpage and e-book extraction.

  6. PDF-Extract-Kit: A Comprehensive Toolkit for High-Quality PDF Content Extraction.

  7. Parsera: Lightweight library for scraping websites with LLMs.

  8. Sparrow: Sparrow is an innovative open-source solution for efficient data extraction and processing from various documents and images.

  9. Docling: Transform PDF to JSON or Markdown with ease and speed.

  10. GOT-OCR2.0: OCR Model.

  11. LLM Decontaminator: Rethinking Benchmark and Contamination for Language Models with Rephrased Samples.

  12. DataTrove: DataTrove is a library to process, filter and deduplicate text data at a very large scale.

  13. llm-swarm: Generate large synthetic datasets like Cosmopedia.

  14. Distilabel: Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

  15. Common-Crawl-Pipeline-Creator: The Common Crawl Pipeline Creator.

  16. Tabled: Detect and extract tables to markdown and csv.

  17. Zerox: Zero shot pdf OCR with gpt-4o-mini.

  18. DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception.

  19. TensorZero: make LLMs improve through experience.

  20. Promptwright: Generate large synthetic data using a local LLM.
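
To make the section concrete, here is a toy sketch of the exact-deduplication step that pipelines such as DataTrove and data-juicer perform at far larger scale and with far better engineering. It only assumes the Hugging Face `datasets` package; the "text" field name and the hash-based strategy are illustrative choices, not any particular library's API.

```python
# Toy illustration of exact deduplication, the kind of cleaning step that
# large-scale pipelines (DataTrove, data-juicer, ...) industrialize.
# Assumes `datasets` is installed; field name and hashing are illustrative.
import hashlib

from datasets import Dataset

raw = Dataset.from_dict({
    "text": [
        "Large language models are trained on web-scale corpora.",
        "Large language models are trained on web-scale corpora.",  # exact duplicate
        "Deduplication improves data quality for pretraining.",
    ]
})

seen = set()

def is_first_occurrence(example):
    # Normalize lightly, hash, and keep only the first copy of each text.
    digest = hashlib.sha256(example["text"].strip().lower().encode("utf-8")).hexdigest()
    if digest in seen:
        return False
    seen.add(digest)
    return True

deduped = raw.filter(is_first_occurrence)
print(f"{len(raw)} -> {len(deduped)} documents after exact dedup")
```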

微调 Fine-Tuning

  1. LLaMA-Factory: Unify Efficient Fine-Tuning of 100+ LLMs.

  2. unsloth: 2-5X faster, 80% less memory LLM fine-tuning.

  3. TRL: Transformer Reinforcement Learning.

  4. Firefly: A large-model training toolkit that supports training dozens of LLMs.

  5. Xtuner: An efficient, flexible and full-featured toolkit for fine-tuning large models.

  6. torchtune: A Native-PyTorch Library for LLM Fine-tuning.

  7. Swift: Use PEFT or full-parameter tuning to finetune 200+ LLMs or 15+ MLLMs (a minimal LoRA sketch of the PEFT approach follows this list).

  8. AutoTrain: A new way to automatically train, evaluate and deploy state-of-the-art Machine Learning models.

  9. OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework (Support 70B+ full tuning & LoRA & Mixtral & KTO).

  10. Ludwig: Low-code framework for building custom LLMs, neural networks, and other AI models.

  11. mistral-finetune: A light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models.

  12. aikit: Fine-tune, build, and deploy open-source LLMs easily!

  13. H2O-LLMStudio: H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs.

  14. LitGPT: Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.

  15. LLMBox: A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.

  16. PaddleNLP: Easy-to-use and powerful NLP and LLM library.

  17. workbench-llamafactory: This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory.

  18. OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral).

  19. TinyLLaVA Factory: A Framework of Small-scale Large Multimodal Models.

  20. LLM-Foundry: LLM training code for Databricks foundation models.

  21. lmms-finetune: A unified codebase for finetuning (full, lora) large multimodal models, supporting llava-1.5, qwen-vl, llava-interleave, llava-next-video, phi3-v etc.

  22. Simplifine: Simplifine lets you invoke LLM finetuning with just one line of code using any Hugging Face dataset or model.

  23. Transformer Lab: Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.

  24. Liger-Kernel: Efficient Triton Kernels for LLM Training.

  25. ChatLearn: A flexible and efficient training framework for large-scale alignment.

  26. nanotron: Minimalistic large language model 3D-parallelism training.

  27. Proxy Tuning: Tuning Language Models by Proxy.

  28. Effective LLM Alignment: Effective LLM Alignment Toolkit.

  29. Autotrain-advanced

  30. Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
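
As a concrete reference point for the toolkits above, below is a minimal LoRA fine-tuning sketch written directly against Hugging Face transformers and peft, the same PEFT technique that frameworks such as LLaMA-Factory, Swift and unsloth wrap with far more features and much better performance. The tiny model name and the three-line dataset are placeholders; this is an illustrative sketch, not any framework's official recipe.

```python
# Minimal LoRA sketch with transformers + peft (illustrative, not a recipe).
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "sshleifer/tiny-gpt2"  # placeholder; swap in a real base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the base model with low-rank adapters; only these small matrices train.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["c_attn"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

texts = ["Instruction: say hi.\nResponse: Hello!",
         "Instruction: add 2+2.\nResponse: 4.",
         "Instruction: name a colour.\nResponse: Blue."]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=64),
    remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=2,
                           num_train_epochs=1, logging_steps=1, report_to=[]),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out/adapter")  # adapters only, a few MB
```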

推理 Inference

  1. ollama: Get up and running with Llama 3, Mistral, Gemma, and other large language models (a minimal client sketch follows this list).

  2. Open WebUI: User-friendly WebUI for LLMs (Formerly Ollama WebUI).

  3. Text Generation WebUI: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  4. Xinference: A powerful and versatile library designed to serve language, speech recognition, and multimodal models.

  5. LangChain: Build context-aware reasoning applications.

  6. LlamaIndex: A data framework for your LLM applications.

  7. lobe-chat: an open-source, modern-design LLMs/AI chat framework. Supports Multi AI Providers, Multi-Modals (Vision/TTS) and plugin system.

  8. TensorRT-LLM: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.

  9. vllm: A high-throughput and memory-efficient inference and serving engine for LLMs.

  10. LlamaChat: Chat with your favourite LLaMA models in a native macOS app.

  11. NVIDIA ChatRTX: ChatRTX is a demo app that lets you personalize a GPT large language model (LLM) connected to your own content—docs, notes, or other data.

  12. LM Studio: Discover, download, and run local LLMs.

  13. chat-with-mlx: Chat with your data natively on Apple Silicon using MLX Framework.

  14. LLM Pricing: Quickly Find the Perfect Large Language Models (LLM) API for Your Budget! Use Our Free Tool for Instant Access to the Latest Prices from Top Providers.

  15. Open Interpreter: A natural language interface for computers.

  16. Chat-ollama: An open source chatbot based on LLMs. It supports a wide range of language models, and knowledge base management.

  17. chat-ui: Open source codebase powering the HuggingChat app.

  18. MemGPT: Create LLM agents with long-term memory and custom tools.

  19. koboldcpp: A simple one-file way to run various GGML and GGUF models with KoboldAI's UI.

  20. LLMFarm: Run llama and other large language models offline on iOS and macOS using the GGML library.

  21. enchanted: Enchanted is an iOS and macOS app for chatting with private, self-hosted language models such as Llama 2, Mistral or Vicuna through Ollama.

  22. Flowise: Drag & drop UI to build your customized LLM flow.

  23. Jan: Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM).

  24. LMDeploy: LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

  25. RouteLLM: A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!

  26. MInference: Speeds up long-context LLM inference with approximate and dynamic sparse attention, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.

  27. Mem0: The memory layer for Personalized AI.

  28. SGLang: SGLang is yet another fast serving framework for large language models and vision language models.

  29. AirLLM: AirLLM optimizes inference memory usage, allowing 70B large language models to run on a single 4GB GPU without quantization, distillation or pruning; it can even run the 405B Llama 3.1 on 8GB of VRAM.

  30. LLMHub: LLMHub is a lightweight management platform designed to streamline the operation and interaction with various language models (LLMs).

  31. YuanChat

  32. LiteLLM: Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, Groq etc.]

  33. GuideLLM: GuideLLM is a powerful tool for evaluating and optimizing the deployment of large language models (LLMs).

  34. LLM-Engines: A unified inference engine for large language models (LLMs) including open-source models (VLLM, SGLang, Together) and commercial models (OpenAI, Mistral, Claude).

  35. OARC: ollama_agent_roll_cage (OARC) is a local python agent fusing ollama llm's with Coqui-TTS speech models, Keras classifiers, Llava vision, Whisper recognition, and more to create a unified chatbot agent for local, custom automation.

  36. g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains.

  37. MemoryScope: MemoryScope provides LLM chatbots with powerful and flexible long-term memory capabilities, offering a framework for building such abilities.

  38. OpenLLM: Run any open-source LLMs, such as Llama 3.1, Gemma, as OpenAI compatible API endpoint in the cloud.

  39. Infinity: The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense embedding, sparse embedding, tensor and full-text.
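
Most of the serving tools above expose an OpenAI-compatible HTTP endpoint, so a single client snippet covers many of them. Below is a minimal sketch that assumes a local ollama server is running with a pulled "llama3" model; the base URL and model name are placeholders for whichever engine you actually serve with (vLLM, LMDeploy, OpenLLM, a LiteLLM proxy, ...).

```python
# Minimal chat call against a local OpenAI-compatible endpoint (assumes
# `ollama serve` is running and `llama3` has been pulled; adjust as needed).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # ollama's OpenAI-compatible route
    api_key="not-needed-for-local",        # local servers usually ignore the key
)

response = client.chat.completions.create(
    model="llama3",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "In one sentence, what is RAG?"},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```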

评估 Evaluation

  1. lm-evaluation-harness: A framework for few-shot evaluation of language models (a minimal invocation sketch follows this list).

  2. opencompass: OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama 3, Mistral, InternLM2, GPT-4, LLaMA 2, Qwen, GLM, Claude, etc.) across 100+ datasets.

  3. llm-comparator: LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side.

  4. EvalScope

  5. Weave: A lightweight toolkit for tracking and evaluating LLM applications.

  6. MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures.

  7. Evaluation guidebook: If you've ever wondered how to make sure an LLM performs well on your specific task, this guide is for you!

  8. Ollama Benchmark: LLM Benchmark for Throughput via Ollama (Local LLMs).
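
For orientation, here is a minimal sketch of invoking lm-evaluation-harness from Python (roughly equivalent to `lm_eval --model hf --model_args pretrained=gpt2 --tasks hellaswag` on the command line). It assumes a recent 0.4.x release; argument names can shift between versions, and the model, task and limit values are only chosen to make a quick smoke test.

```python
# Minimal lm-evaluation-harness smoke test (assumes lm-eval >= 0.4).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                    # Hugging Face backend
    model_args="pretrained=gpt2",  # tiny model so the demo finishes quickly
    tasks=["hellaswag"],
    limit=20,                      # evaluate only 20 examples
)
print(results["results"]["hellaswag"])
```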

体验 Usage

  1. LMSYS Chatbot Arena: Benchmarking LLMs in the Wild

  2. CompassArena: the OpenCompass LLM arena (司南大模型竞技场)

  3. 琅琊榜 (Langya LLM leaderboard)

  4. Huggingface Spaces

  5. WiseModel Spaces

  6. Poe

  7. 林哥的大模型野榜 (Lin Ge's unofficial LLM leaderboard)

  8. OpenRouter

知识库 RAG

  1. AnythingLLM: The all-in-one AI app for any LLM with full RAG and AI Agent capabilities.

  2. MaxKB: A knowledge-base question answering system built on large language models. Ready to use out of the box, and supports rapid embedding into third-party business systems.

  3. RAGFlow: An open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

  4. Dify: An open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

  5. FastGPT: A knowledge-based platform built on LLMs that offers out-of-the-box data processing and model invocation capabilities, and supports workflow orchestration through Flow visualization.

  6. Langchain-Chatchat: Local knowledge-base question answering based on Langchain and LLMs such as ChatGLM.

  7. QAnything: Question and Answer based on Anything.

  8. Quivr: A personal productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users ! Local & Private alternative to OpenAI GPTs & ChatGPT powered by retrieval-augmented generation.

  9. RAG-GPT: RAG-GPT, leveraging LLM and RAG technology, learns from user-customized knowledge bases to provide contextually relevant answers for a wide range of queries, ensuring rapid and accurate information retrieval.

  10. Verba: Retrieval Augmented Generation (RAG) chatbot powered by Weaviate.

  11. FlashRAG: A Python Toolkit for Efficient RAG Research.

  12. GraphRAG: A modular graph-based Retrieval-Augmented Generation (RAG) system.

  13. LightRAG: LightRAG helps developers with both building and optimizing Retriever-Agent-Generator pipelines.

  14. GraphRAG-Ollama-UI: GraphRAG using Ollama with Gradio UI and Extra Features.

  15. nano-GraphRAG: A simple, easy-to-hack GraphRAG implementation.

  16. RAG Techniques: This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses (a minimal sketch of this retrieve-then-generate loop follows this list).

  17. ragas: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines.

  18. kotaemon: An open-source clean & customizable RAG UI for chatting with your documents. Built with both end users and developers in mind.

  19. RAGapp: The easiest way to use Agentic RAG in any enterprise.

  20. TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text.

  21. LightRAG: Simple and Fast Retrieval-Augmented Generation.

  22. TEN: the Next-Gen AI-Agent Framework, the world's first truly real-time multimodal AI agent framework.

  23. AutoRAG: RAG AutoML tool for automatically finding an optimal RAG pipeline for your data.

  24. KAG: KAG is a knowledge-enhanced generation framework based on OpenSPG engine, which is used to build knowledge-enhanced rigorous decision-making and information retrieval knowledge services.
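
The frameworks above industrialize the same basic retrieve-then-generate loop. The sketch below shows that loop in its simplest form using sentence-transformers and numpy; the three-document corpus, embedding model and top-k value are illustrative, and the final generation step is left to whichever chat endpoint you use (for example the client sketch in the Inference section).

```python
# Minimal retrieve-then-generate sketch (illustrative corpus and model).
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "MinerU extracts high-quality text from PDFs, web pages and e-books.",
    "vLLM is a high-throughput inference and serving engine for LLMs.",
    "GraphRAG builds a knowledge graph to support retrieval-augmented generation.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

question = "Which tool helps me pull clean text out of PDFs?"
q_vec = embedder.encode([question], normalize_embeddings=True)[0]

# Cosine similarity reduces to a dot product on normalized vectors.
scores = doc_vecs @ q_vec
top_k = np.argsort(scores)[::-1][:2]
context = "\n".join(f"- {docs[i]}" for i in top_k)

prompt = (
    "Answer using only the context below.\n"
    f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
print(prompt)  # send this prompt to any chat model of your choice
```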

智能体 Agents

  1. AutoGen: AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks. See also AutoGen AIStudio. (A minimal tool-calling sketch, the core loop such frameworks build on, follows this list.)

  2. CrewAI: Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

  3. Coze

  4. AgentGPT: Assemble, configure, and deploy autonomous AI Agents in your browser.

  5. XAgent: An Autonomous LLM Agent for Complex Task Solving.

  6. MobileAgent: The Powerful Mobile Device Operation Assistant Family.

  7. Lagent: A lightweight framework for building LLM-based agents.

  8. Qwen-Agent: Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

  9. LinkAI: A one-stop AI agent building platform.

  10. Baidu APPBuilder

  11. agentUniverse: agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications; through its community, developers can exchange and share patterns and practices across domains.

  12. LazyLLM: A low-code development tool for building multi-agent LLM applications.

  13. AgentScope: Start building LLM-empowered multi-agent applications in an easier way.

  14. MoA: Mixture of Agents (MoA) is a novel approach that leverages the collective strengths of multiple LLMs to enhance performance, achieving state-of-the-art results.

  15. Agently: AI Agent Application Development Framework.

  16. OmAgent: A multimodal agent framework for solving complex tasks.

  17. Tribe: No code tool to rapidly build and coordinate multi-agent teams.

  18. CAMEL: Finding the Scaling Law of Agents. A multi-agent framework.

  19. PraisonAI: PraisonAI application combines AutoGen and CrewAI or similar frameworks into a low-code solution for building and managing multi-agent LLM systems, focusing on simplicity, customisation, and efficient human-agent collaboration.

  20. IoA: An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through internet-like connectivity.

  21. llama-agentic-system: Agentic components of the Llama Stack APIs.

  22. Agent Zero: Agent Zero is not a predefined agentic framework. It is designed to be dynamic, organically growing, and learning as you use it.

  23. Agents: An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents.

  24. FastAgency: The fastest way to bring multi-agent workflows to production.

  25. Swarm: An experimental framework for building, orchestrating and deploying multi-agent systems, managed by the OpenAI Solutions team.
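
Under the hood, most agent frameworks are built around a tool-calling loop: the model decides to call a tool, the runtime executes it, and the result is fed back for a final answer. Below is a minimal single-agent sketch of that loop using the OpenAI chat-completions tools interface; it assumes an OPENAI_API_KEY (or any OpenAI-compatible endpoint), and the get_time tool is a made-up example.

```python
# Minimal tool-calling loop (assumes OPENAI_API_KEY; get_time is a toy tool).
import json
from datetime import datetime

from openai import OpenAI

client = OpenAI()

def get_time(timezone: str) -> str:
    # Toy tool: a real agent would call search, code execution, databases, ...
    return f"{datetime.utcnow().isoformat()} (UTC, requested tz: {timezone})"

tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Get the current time",
        "parameters": {
            "type": "object",
            "properties": {"timezone": {"type": "string"}},
            "required": ["timezone"],
        },
    },
}]

messages = [{"role": "user", "content": "What time is it in UTC?"}]
reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
msg = reply.choices[0].message

if msg.tool_calls:  # the model decided to call the tool
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = get_time(**args)
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(msg.content)
```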

搜索 Search

  1. OpenSearch GPT: SearchGPT / Perplexity clone, but personalised for you.

  2. MindSearch: An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT).

  3. nanoPerplexityAI: The simplest open-source implementation of perplexity.ai (a toy sketch of the same search-then-answer pattern follows this list).

  4. curiosity: Try to build a Perplexity-like user experience.
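
The projects above all follow the same search-then-answer pattern: run a web search, then ask an LLM to answer from the snippets with citations. Here is a toy sketch of that pattern; web_search is a hypothetical stub returning canned results so the example runs offline, and a real implementation would call an actual search API (SearxNG, Bing, DuckDuckGo, ...).

```python
# Toy SearchGPT/Perplexity-style pattern; web_search is a hypothetical stub.
from openai import OpenAI

def web_search(query: str) -> list[dict]:
    # Placeholder returning canned results so the sketch runs offline.
    return [
        {"title": "MindSearch README", "snippet": "MindSearch is an LLM-based multi-agent web search framework."},
        {"title": "nanoPerplexityAI", "snippet": "A ~100-line open-source implementation of perplexity.ai."},
    ]

query = "What is MindSearch?"
results = web_search(query)
sources = "\n".join(f"[{i+1}] {r['title']}: {r['snippet']}" for i, r in enumerate(results))

prompt = (
    "Answer the question using the sources and cite them as [n].\n\n"
    f"Sources:\n{sources}\n\nQuestion: {query}"
)

client = OpenAI()  # or any OpenAI-compatible local endpoint
answer = client.chat.completions.create(
    model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}]
)
print(answer.choices[0].message.content)
```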

书籍 Book

  1. 《大规模语言模型:从理论到实践》

  2. 《大语言模型》

  3. 《动手学大模型Dive into LLMs》

  4. 《动手做AI Agent》

  5. 《Build a Large Language Model (From Scratch)》

  6. 《多模态大模型》

  7. 《Generative AI Handbook: A Roadmap for Learning Resources》

  8. 《Understanding Deep Learning》

  9. 《Illustrated book to learn about Transformers & LLMs》

  10. 《Building LLMs for Production: Enhancing LLM Abilities and Reliability with Prompting, Fine-Tuning, and RAG》

  11. 《大型语言模型实战指南:应用实践与场景落地》

  12. 《Hands-On Large Language Models》

  13. 《自然语言处理:大模型理论与实践》

  14. 《动手学强化学习》

  15. 《面向开发者的LLM入门教程》

  16. 《大模型基础》

课程 Course

  1. Stanford CS224N: Natural Language Processing with Deep Learning

  2. Andrew Ng: Generative AI for Everyone

  3. Andrew Ng: LLM series of courses

  4. ACL 2023 Tutorial: Retrieval-based Language Models and Applications

  5. llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

  6. Microsoft: Generative AI for Beginners

  7. Microsoft: State of GPT

  8. HuggingFace NLP Course

  9. Tsinghua NLP (Liu Zhiyuan's group): Open Course on Large Language Models

  10. Stanford CS25: Transformers United V4

  11. Stanford CS324: Large Language Models

  12. Princeton COS 597G (Fall 2022): Understanding Large Language Models

  13. Johns Hopkins CS 601.471/671 NLP: Self-supervised Models

  14. Hung-yi Lee: GenAI Course

  15. openai-cookbook: Examples and guides for using the OpenAI API.

  16. Hands on llms: Learn about LLM, LLMOps, and vector DBS for free by designing, training, and deploying a real-time financial advisor LLM system.

  17. University of Waterloo CS 886: Recent Advances on Foundation Models

  18. Mistral: Getting Started with Mistral

  19. Coursera: ChatGPT Prompt Engineering

  20. LangGPT: Empowering everyone to become a prompt expert!

  21. mistralai-cookbook

  22. Introduction to Generative AI 2024 Spring

  23. build nanoGPT: Video+code lecture on building nanoGPT from scratch.

  24. LLM101n: Let's build a Storyteller.

  25. Knowledge Graphs for RAG

  26. LLMs From Scratch (Datawhale Version)

  27. OpenRAG

  28. 通往AGI之路 (The Road to AGI)

  29. Andrej Karpathy - Neural Networks: Zero to Hero

  30. Interactive visualization of Transformer

  31. andysingal/llm-course

  32. LM-class

  33. Google Advanced: Generative AI for Developers Learning Path

  34. Anthropic: Prompt Engineering Interactive Tutorial

  35. LLMsBook

  36. Large Language Model Agents

  37. Cohere LLM University

  38. LLMs and Transformers

  39. Smol Vision: Recipes for shrinking, optimizing, customizing cutting edge vision models.

  40. Multimodal RAG: Chat with Videos

  41. LLMs Interview Note

  42. RAG++: From POC to production: Advanced RAG course.

  43. Weights & Biases AI Academy: Fine-tuning, building with LLMs, structured outputs, and more LLM courses.

  44. Prompt Engineering & AI tutorials & Resources

  45. Learn RAG From Scratch – Python AI Tutorial from a LangChain Engineer

  46. LLM Evaluation: A Complete Course

教程 Tutorial

  1. Hands-on LLM Application Development (动手学大模型应用开发)

  2. AI Developer Channel (AI开发者频道)

  3. Bilibili: 五里墩茶社

  4. Bilibili: 木羽Cheney

  5. YouTube: AI Anytime

  6. Bilibili: 漆妮妮

  7. Prompt Engineering Guide

  8. YouTube: AI超元域

  9. Bilibili: TechBeat人工智能社区 (TechBeat AI community)

  10. Bilibili: 黄益贺

  11. Bilibili: 深度学习自然语言处理 (Deep Learning & NLP)

  12. LLM Visualization

  13. Zhihu: 原石人类

  14. Bilibili: 小黑黑讲AI

  15. Bilibili: 面壁的车辆工程师

  16. Bilibili: AI老兵文哲

  17. Large Language Models (LLMs) with Colab notebooks

  18. YouTube: IBM Technology

  19. YouTube: Unify Reading Paper Group

  20. Chip Huyen

  21. How Much VRAM

  22. Blog: 科学空间 (Su Jianlin's Scientific Spaces)

  23. YouTube: Hyung Won Chung

  24. Blog: Tejaswi Kashyap

  25. Blog: 小昇的博客

  26. Zhihu: ybq

  27. W&B articles

  28. Huggingface Blog

  29. Blog: GbyAI

  30. Blog: mlabonne

  31. LLM-Action

论文 Paper

Note

See also: Huggingface Daily Papers, Cool Papers, and ML Papers Explained.

  1. Hermes-3-Technical-Report

  2. The Llama 3 Herd of Models

  3. Qwen Technical Report

  4. Qwen2 Technical Report

  5. Qwen2-vl Technical Report

  6. DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

  7. DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

  8. Baichuan 2: Open Large-scale Language Models

  9. DataComp-LM: In search of the next generation of training sets for language models

  10. OLMo: Accelerating the Science of Language Models

  11. MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

  12. Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

  13. Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

  14. Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

  15. Jamba: A Hybrid Transformer-Mamba Language Model

  16. Textbooks Are All You Need

  17. Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models

  18. OLMoE: Open Mixture-of-Experts Language Models

  19. Model Merging Paper

  20. Baichuan-Omni Technical Report

  21. 1.5-Pints Technical Report: Pretraining in Days, Not Months – Your Language Model Thrives on Quality Data

  22. Baichuan Alignment Technical Report

Tips

  1. What We Learned from a Year of Building with LLMs (Part I)

  2. What We Learned from a Year of Building with LLMs (Part II)

  3. What We Learned from a Year of Building with LLMs (Part III): Strategy

  4. An Easy Introduction to Large Language Models (LLMs)

  5. LLMs for Text Classification: A Guide to Supervised Learning

  6. Unsupervised Text Classification: Categorize Natural Language With LLMs

  7. Text Classification With LLMs: A Roundup of the Best Methods

  8. LLM Pricing

  9. Uncensor any LLM with abliteration

  10. Tiny LLM Universe

  11. Zero-Chatgpt

  12. Zero-Qwen-VL

  13. finetune-Qwen2-VL

  14. MPP-LLaVA

  15. build_MiniLLM_from_scratch

  16. Tiny LLM zh

  17. MiniMind: Train a tiny 26M-parameter GPT completely from scratch in 3 hours; inference and training require as little as a 2GB GPU.

  18. LLM-Travel: Dedicated to deep understanding, discussion and implementation of techniques, principles and applications related to large models.

  19. Knowledge distillation: Teaching LLM's with synthetic data

  20. Part 1: Methods for adapting large language models

  21. Part 2: To fine-tune or not to fine-tune

  22. Part 3: How to fine-tune: Focus on effective datasets

  23. Reader-LM: Small Language Models for Cleaning and Converting HTML to Markdown

  24. Insights from a Year of Building LLM Applications (in Chinese)

  25. LLM Training - pretrain (in Chinese)

  26. pytorch-llama: LLaMA 2 implemented from scratch in PyTorch.

  27. Preference Optimization for Vision Language Models with TRL (supported models)

  28. Fine-tuning visual language models using SFTTrainer (docs)

  29. A Visual Guide to Mixture of Experts (MoE)

  30. Role-Playing in Large Language Models like ChatGPT

  31. Distributed Training Guide: Best practices & guides on how to write distributed pytorch training code.

  32. Chat Templates (a minimal apply_chat_template sketch follows this list)

  33. Top 20+ RAG Interview Questions
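
As a quick illustration of why chat templates matter, the sketch below renders a message list into a model's own prompt format with transformers' apply_chat_template, so you never hand-craft special tokens; the tokenizer name is just one example of a model that ships a chat template.

```python
# Minimal chat-template demo; the model name is only an example.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain chat templates in one sentence."},
]

# add_generation_prompt=True appends the tokens that cue the assistant's turn.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```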
