LLM Awesome

Models
Engine/Framework/Service/Inference
- ggerganov/llama.cpp
  - MIT, C++
  - 有非常多的语言绑定
  - 支持最好最广泛使用的推理引擎
- NVIDIA/TensorRT-LLM
  - Toolbox for Optimized LLM Inference
- vllm-project/vllm
  - https://vllm.ai/
  - vLLM: Easy, Fast, and Cheap LLM Serving
- codelion/optillm
  - Optimizing inference proxy for LLMs
- bklieger-groq/g1
  - Reasoning/CoT/reasoning chain
  - HN
- mlc-ai/web-llm
  - Apache-2.0, TS, WebGPU
  - 直接在浏览器内运行模型 inference
- 商业
  - Perplexity
  - Fireworks AI
  - groq
  - Cloudflare AI Workers
  - Nvidia NIM
- Models
- 价格
Runner/Inference
- jmorganca/ollama
  - MIT, Golang
  - 快速启动任意模型
  - 提供模型管理
  - 提供管理接口
Application/WebUI/UI/Desktop/Consumer
- mudler/LocalAI
  - MIT, C++, Go
- abi/secret-llama
  - Apache-2.0, TS
  - Fully private LLM chatbot that runs entirely with a browser with no server needed. Supports Mistral and LLama 3.
- Yonom/assistant-ui
  - React Components for AI Chat
- lobehub/lobe-chat
  - ~~Apache-2.0~~, TypeScript
- n4ze3m/page-assist
  - MIT, TS
  - 浏览器插件
  - https://chrome.google.com/webstore/detail/page-assist/jfgfiigpkhlkbnfnbobbkinehhfdhndo
- open-webui/open-webui
  - MIT, Svelte, Python
  - WebUI for LLMs
  - open-webui/desktop
    - BSD-3, TS, Svelte, Electron
- janhq/jan
  - AGPLv3, Typescript
  - alternative to ChatGPT that runs 100% offline
  - Multiple engine - llama.cpp, TensorRT-LLM
- a16z-infra/llm-app-stack
- TavernAI
  - MIT, JS
- SillyTavern
  - AGPLv3, JS
  - fork TavernAI
  - LLM Frontend for Power Users
- enricoros/big-agi
  - MIT, TS, JS
- swirlai/swirl-search
  - Apache-2.0, Python
  - AI Search & RAG Without Moving Your Data
- Pythagora-io/gpt-pilot
  - MIT, Python
  - The first real AI developer
- All-Hands-AI/OpenHands
  - MIT, Python, TS
Service/API/Adapter/Gateway
- BerriAI/litellm
  - MIT, Python
  - 将各种 LLM 适配为 OpenAI 的 API 格式
- bricks-cloud/BricksLLM
  - MIT, Golang
  - Enterprise-grade API gateway
  - 提供访问控制、监控
  - 支持 OpenAI, Azure OpenAI, Anthropic, vLLM 等
- Helicone/helicone
  - Apache-2.0, TS
  - observability platform for LLMs
Structure
- outlines-dev/outlines
  - Apache-2.0, Pythone
  - Structured Text Generation
- guidance-ai/guidance
  - MIT, Pythone
Platform/Traing/Playground
- lm-sys/FastChat
  - Apache-2.0
  - training, serving, and evaluating
Web/UI/Chat
- Yidadaa/ChatGPT-Next-Web
- nluxai/nlux
  - MPL-2.0, TS, React
  - UI for any LLM, supporting LangChain / HuggingFace / Vercel AI,
- CopilotKit
  - MIT, TS
  - React UI + elegant infrastructure for AI Copilots
  - framework for building custom AI Copilots
  - npm:@copilotkit/react-core,@copilotkit/react-ui,@copilotkit/react-textarea
- Oneirocom/Magick
  - toolkit for AI builder
- hubtype/botonic
  - MIT, TS, React
- ~~yoctol/bottender~~
  - MIT, TS, React
RAG/Embedding
- vanna-ai/vanna
  - MIT
  - Text-to-SQL
- truefoundry/cognita
  - Apache-2.0, TS, Python
- EZ-hwh/AutoCrawler
- marqo-ai/marqo
  - Apache-2.0, Python
  - Unified embedding generation and search engine.
Service
- dify
Agent
- kingjulio8238/memary
  - Longterm Memory for Autonomous Agents
- a16z-infra/ai-town
  - MIT, TS
- coder-hxl/x-crawl
  - MIT, TS
  - Node.js AI-assisted crawler library
- andrewyng/translation-agent
  - Agentic translation using reflection workflow
Coding
- replit/ReplitLM
- albertan017/LLM4Decompile
  - Decompiling Binary Code
- jehna/humanify
  - Deobfuscate Javascript
  - https://thejunkland.com/blog/using-llms-to-reverse-javascript-minification.html
- facebook/llm-compiler
  - Meta LLM Compiler, a family of models built on Meta Code Llama with additional code optimization and compiler capabilities
Fine-tuning
- unslothai/unsloth
  - Apache-2.0, Python
  - Finetune Llama 3, Mistral, Phi, Gemma
  - QWen2 https://github.com/unslothai/unsloth/issues/149
- meta-llama/llama-recipes
- LoRA
- Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora
- https://huggingface.co/datasets/tatsu-lab/alpaca
Uncensor
- llm-attacks/llm-attacks
- https://huggingface.co/blog/mlabonne/abliteration
  - HN
Promopt
- langgptai/LangGPT
  - 结构化提示词
- linexjlin/GPTs
  - leaked prompts of GPTs
Vision
- vikhyat/moondream
  - tiny vision language model
stanfordnlp/dspy
- DSPy: The framework for programming—not prompting—foundation models
BloopAI/bloop
- Answer questions about your code with an LLM agent
HazyResearch/flash-attention
- Fast and memory-efficient exact attention
refuel-ai/autolabel
- Label, clean and enrich text datasets with LLMs
[cpacker/MemGPT](- https://github.com/cpacker/MemGPT)
- Teaching LLMs memory management for unbounded context
haotian-liu/LLaVA
- Large Language-and-Vision Assistant
- 类似 GPT4V
bionic-gpt/bionic-gpt
Reading/Explain
- LLM Visualization
  - https://bbycroft.net/llm
  - https://github.com/bbycroft/llm-viz
- https://intro-llm.github.io/
  - 大规模语言模型：从理论到实践
- https://spreadsheets-are-all-you-need.ai
tyxsspa/AnyText
- by Alibaba
Hannibal046/Awesome-LLM
imoneoi/openchat
What can LLMs never do?
- HN

Models

facebookresearch/llama
- GPLv3
ymcui/Chinese-LLaMA-Alpaca
- Apache-2.0
- 中文大语言模型
PotatoSpudowski/fastLLaMa
- MIT, C
clue-ai/PromptCLUE
clue-ai/ChatYuan
cocktailpeanut/dalai
belladoreai/llama-tokenizer-js
https://github.com/cogentapps/chat-with-gpt/blob/main/app/src/core/tokenizer/bpe.ts
https://github.com/dqbd/tiktoken
https://github.com/functorism/gpt4-tokenizer-visualizer
run-llama/llama_index

Models​

Models