AI Resources
A selective list of notable AI models, tools, datasets, and experiments organized for easier browsing.
LifeHubber AI Resources is selective, not exhaustive. Use it as a starting point for browsing, not as an endorsement. Availability, access, usage limits, and terms can change, so check the original project materials before relying on a resource.
What this is
A selective collection of AI projects, tools, and resources organized for easier browsing.
How to use it
Each entry includes a short description, source label, and direct link to make scanning simpler.
Editorial approach
The list leans toward open-ish, inspectable, or publicly documented projects, without treating every entry as open-source.
Reader signals
A few entries readers have marked; scroll down for the full list.
google/gemma-4
A family of multimodal models from Google DeepMind that handle text and image input and generate text output.
MiniMaxAI/MiniMax-M2.7
A large MiniMax model focused on agentic work, software engineering, tool use, and complex productivity workflows.
Donchitos/Claude-Code-Game-Studios
A multi-agent game-development studio system for Claude Code, organized around specialized agents, workflow skills, hooks, rules, and templates.
tencent/Hy3-preview
A Tencent Hy Team MoE model positioned around long-context reasoning, instruction following, coding, and agent task evaluation.
paperclipai/paperclip
A Node.js server and React UI for orchestrating teams of AI agents, assigning goals, and tracking work and costs from one dashboard.
Recently added
Fresh AI resources added for browsing.
NVIDIA-AI-Blueprints/aiq
An NVIDIA AI Blueprint for agentic research workflows, with shallow and deep research modes, citation-backed answers, YAML-configured tools and agents, CLI/web/async job paths, evaluation materials, and deployment assets.
Lum1104/Understand-Anything
A codebase and knowledge-base graph tool for AI coding environments, with plugin paths for Claude Code, Codex, Cursor, Copilot, Gemini CLI, and others, plus search, chat, tours, diff impact views, and an interactive dashboard.
nashsu/llm_wiki
A cross-platform desktop app for turning documents into an LLM-maintained wiki, with source traceability, graph search, optional vector retrieval, local API access, and a companion agent skill.
colbymchenry/codegraph
A local codebase knowledge graph for coding agents, with MCP-style tools, symbol relationships, call graphs, framework-aware routes, auto-sync, multi-agent setup paths, and project-reported savings in token use and tool calls.
meituan-longcat/LongCat-Video-Avatar-1.5
A Meituan LongCat audio-driven avatar video model for single- and multi-person generation, with audio-text-to-video, audio-image-text-to-video, video continuation, model weights, GitHub quickstart, and project-reported evaluation materials.
Browse the list
AI Models
louis-e/arnis
Generates real-world locations inside Minecraft with a surprisingly high level of detail.
CohereLabs/command-a-plus-05-2026-w4a4
A Cohere Labs Command A+ model variant with W4A4 quantization, positioned around agentic tool use, multimodal inputs, multilingual work, long context, and Cohere-reported lower hardware requirements.
deepseek-ai/DeepSeek-OCR-2
A newer DeepSeek OCR model release for image/PDF OCR, document-to-Markdown workflows, dynamic resolution, vLLM/Transformers inference, and visual causal flow research.
deepseek-ai/deepseek-v4
A DeepSeek model family release positioned around long-context intelligence, reasoning modes, coding benchmarks, and agentic task evaluation.
google/gemma-4
A family of multimodal models from Google DeepMind that handle text and image input and generate text output.
zai-org/GLM-5.1
A flagship text-generation model positioned around agentic engineering, stronger coding performance, and longer-horizon tool use.
zai-org/GLM-OCR
A multimodal OCR model for complex document understanding, positioned around strong real-world document parsing and efficient deployment.
AngelSlim/Hy-MT1.5-1.8B-1.25bit
A low-bit on-device translation model from AngelSlim, positioned around 33-language offline translation, GGUF access, Android demo use, and 1.25-bit compression.
Tencent-Hunyuan/Hy-MT2
A Tencent-Hunyuan multilingual translation model family with 1.8B, 7B, and 30B-A3B variants, 33-language support, GGUF and FP8 options, IFMTBench, training notes, deployment guidance, and Tencent-reported translation results.
tencent/Hy3-preview
A Tencent Hy Team MoE model positioned around long-context reasoning, instruction following, coding, and agent task evaluation.
moonshotai/Kimi-K2.6
A multimodal agentic model positioned around long-horizon coding, tool use, autonomous execution, and broader software workflows.
bytedance-research/Lance
A ByteDance Research unified multimodal model for image and video understanding, generation, and editing, with model files, demos, inference scripts, Gradio setup, benchmark scripts, and a stated 40GB VRAM inference requirement.
LiquidAI/LFM2.5-350M
A hybrid model in the LFM2.5 family built for on-device deployment, with extended pre-training and reinforcement learning.
inclusionAI/Ling-2.6-flash
An inclusionAI instruct model positioned around faster responses, token efficiency, tool use, multi-step planning, and agent-oriented workloads.
robbyant/lingbot-map
A feed-forward 3D foundation model for streaming scene reconstruction, positioned around geometric consistency, long sequences, and efficient real-time inference.
nv-tlabs/lyra
A series of generative 3D world models from NVIDIA, positioned around explorable scenes, 3D consistency, and world-scale generation workflows.
XiaomiMiMo/mimo-v25
A Xiaomi MiMo model family positioned around multimodal understanding, agentic workflows, long-context use, and Pro variants for harder software and tool-heavy tasks.
MiniMaxAI/MiniMax-M2.7
A large MiniMax model focused on agentic work, software engineering, tool use, and complex productivity workflows.
OpenMOSS-Team/moss-vl
An OpenMOSS vision-language family with Base and Instruct releases for image, video, OCR, and document understanding work.
nvidia/Nemotron-Labs-Diffusion-14B
An NVIDIA 14B text-generation model from the Nemotron-Labs-Diffusion family, focused on switching among autoregressive decoding, diffusion-style parallel decoding, and self-speculation, with project-reported decoding efficiency gains.
allenai/olmoearth
An Ai2 remote-sensing foundation model family for satellite imagery and planetary-scale mapping, with v1.1 Base and BandExtractor models, model weights, training code, a technical report, and Ai2-reported lower compute cost.
Qwen/Qwen3.6-35B-A3B
An open-weight multimodal model positioned around agentic coding, tool use, long-context work, and real-world software workflows.
google-deepmind/tips
A family of vision-language encoders from Google DeepMind, positioned around image-text pretraining, spatial awareness, and general-purpose multimodal applications.
microsoft/TRELLIS.2
A Microsoft 3D generation model for high-fidelity image-to-3D asset creation, using O-Voxel structured latents, PBR materials, inference code, and training tools.
arcee-ai/trinity-large-thinking
A model designed for coherent multi-turn behavior, clean tool use, constrained instruction following, and efficient serving at scale.
allenai/WildDet3D
A promptable 3D detection system for real-world scenes, positioned around text, point, and box prompts for spatial perception workflows.
Zyphra/ZAYA1-8B
A small Zyphra mixture-of-experts reasoning model with public weights, 760M active parameters, 8.4B total parameters, deployment notes, and project-reported math and coding evaluations.
Speech Models
CohereLabs/cohere-transcribe-03-2026
A 2B parameter automatic speech recognition model for audio-in, text-out transcription across 14 languages.
fishaudio/s2-pro
A text-to-speech model with detailed control over prosody and emotional delivery.
KittenML/KittenTTS
A very small text-to-speech model designed to stay lightweight without feeling toy-like.
xzf-thu/Mega-ASR
A robust automatic speech recognition project for messy real-world audio, with code, inference and training paths, Hugging Face weights, a technical report, Voices-in-the-Wild-2M, and project-reported results across difficult acoustic scenarios.
XiaomiMiMo/MiMo-V2.5-ASR
A Xiaomi MiMo speech-recognition model focused on Mandarin, English, Chinese dialects, code-switched speech, noisy audio, songs, and multi-speaker transcription.
OpenMOSS/MOSS-Audio
An audio-understanding model family for speech, sound, music, captioning, time-aware QA, ASR, and reasoning over real-world audio.
OpenMOSS/MOSS-TTS-Nano
A tiny multilingual speech generation model positioned for real-time TTS, CPU-friendly local use, and lightweight deployment.
NVIDIA/personaplex
A real-time full-duplex speech-to-speech conversational model with persona control through role prompts and voice conditioning.
sbintuitions/sarashina2.2-tts
A Japanese-centric text-to-speech system from SB Intuitions, with Japanese and English generation, style transfer, and zero-shot voice generation support.
supertone-inc/supertonic
An on-device multilingual text-to-speech system built around ONNX Runtime, with local inference, 31-language support, expression tags, model assets, and examples across browser, mobile, desktop, and edge runtimes.
HumeAI/tada
A speech-language model that aligns speech and text into a single synchronized stream.
openbmb/VoxCPM2
A multilingual text-to-speech model with voice design, controllable voice cloning, and streaming support.
Music / Image Gen Models
ace-step/ACE-Step-1.5
A local music generation model aimed at fast song creation on consumer hardware, with support across CUDA, AMD, Intel, Mac, and CPU setups.
VAST-AI-Research/AniGen
A framework for generating animatable 3D assets from a single image, with mesh, skeleton, and skinning outputs for downstream animation and simulation workflows.
IGL-HKUST/CoMoVi
A framework for co-generating 3D human motion and realistic videos, with a focus on motion-conditioned video generation and training workflows.
lllyasviel/Fooocus
A local image-generation interface built around prompt-focused SDXL workflows, with Windows downloads, Colab access, inpainting, outpainting, image prompts, and presets.
meituan-longcat/LongCat-Video-Avatar-1.5
A Meituan LongCat audio-driven avatar video model for single- and multi-person generation, with audio-text-to-video, audio-image-text-to-video, video continuation, model weights, GitHub quickstart, and project-reported evaluation materials.
NVlabs/LongLive
An NVIDIA Labs infrastructure codebase for long video generation, with LongLive 2.0 NVFP4 and parallel training/inference support, multi-shot generation, async decoding, model links, docs, configs, and project-reported FPS and VBench results.
GVCLab/PersonaLive
A portrait image-animation framework for live-streaming-style video generation research, with offline and online inference, pretrained weights, a Web UI, and acceleration notes.
NVlabs/Sana
An NVIDIA Labs codebase for efficient high-resolution image and video generation, with Sana, Sana-1.5, Sana-Sprint, Sana-Video, training and inference pipelines, model zoo links, ComfyUI and diffusers paths, and newer world-model work.
HKUDS/ViMax
An agentic video-generation framework for turning ideas, scripts, or longer narratives into planned video workflows, with script generation, storyboards, shot planning, reference selection, consistency checks, and configurable chat, image, and video model providers.
AI Agents
ag2ai/ag2
A Python framework for AI agents and multi-agent workflows, with conversable agents, orchestration patterns, tools, human-in-the-loop flows, code execution options, structured outputs, and an active v1.0 transition note.
Panniantong/Agent-Reach
A CLI that gives AI agents broader web reach across platforms like Twitter, Reddit, YouTube, GitHub, Bilibili, and XiaoHongShu without paid API usage.
agentscope-ai/agentscope
An agent framework with core abstractions, visibility tooling, and built-in support for fine-tuning workflows.
aipoch/medical-research-skills
A curated library of medical research agent skills designed to support evidence review, protocol design, data analysis, and academic writing workflows.
browser-use/browser-use
A browser automation framework for AI agents that can navigate websites, click elements, type into pages, use custom tools, and run browser tasks through code or CLI workflows.
hilash/cabinet
An AI-first knowledge base and workspace system with agents, memory, scheduled jobs, and local file-based storage.
HKUDS/CatchMe
A lightweight, vectorless system for capturing a broader digital footprint as usable context.
Donchitos/Claude-Code-Game-Studios
A multi-agent game-development studio system for Claude Code, organized around specialized agents, workflow skills, hooks, rules, and templates.
colbymchenry/codegraph
A local codebase knowledge graph for coding agents, with MCP-style tools, symbol relationships, call graphs, framework-aware routes, auto-sync, multi-agent setup paths, and project-reported savings in token use and tool calls.
ComposioHQ/composio
An agent tool-integration layer with Python and TypeScript SDKs, toolkits, authentication, sessions, triggers, tool search, and workbench features for connecting agents to external services.
CopilotKit/CopilotKit
A frontend stack for building agent-native applications with chat UI, generative UI, shared state, backend tool rendering, human-in-the-loop workflows, and React or Angular app paths.
trycua/cua
Infrastructure for computer-use agents, with sandboxes, SDKs, benchmarks, and model integrations for agents working across desktop environments.
HKUDS/DeepTutor
An agent-native personalized tutoring system with tutoring workflows, persistent memory, a web app, CLI access, and a broader learning-support architecture.
bytedance/deer-flow
A ByteDance long-horizon agent harness for deep research, coding, file work, report generation, skills, sub-agents, memory, sandboxed execution, and message gateways.
langgenius/dify
A visual platform for building agentic workflows and AI applications with workflow and chatflow builders, model-provider connections, RAG pipelines, tools, APIs, logs, and cloud or self-hosted paths.
nico-martin/gemma4-browser-extension
An independent Chrome extension experiment for running an on-device browser agent with Transformers.js, WebGPU, Gemma 4, page RAG, tab tools, and semantic history search.
open-gitagent/gitagent
A framework-agnostic, git-native standard for defining and sharing AI agents.
google/skills
A public Agent Skills repository for Google products and technologies, including Google Cloud, with installable skills for Gemini API in Agent Platform, cloud basics, onboarding, authentication, observability, and well-architected guidance.
block/goose
An on-machine AI agent for complex development work, including coding, execution, debugging, workflow orchestration, and API interaction.
vectorize-io/hindsight
An agent memory system designed to help agents learn over time rather than only recall conversation history.
holaboss-ai/holaOS
A beta agent-workspace environment for recurring AI work-streams, with living workspaces, memory, history, files, apps, dashboards, runtime state, and sub-agent coordination.
Intelligent-Internet/ii-agent
An AI agent for practical work, built to be run, forked, and extended across solo, team, and internal-tooling use cases.
livekit/agents
A realtime framework for voice, video, and physical AI agents, with Python and Node.js paths, LiveKit room participants, WebRTC clients, telephony support, tools, testing, and deployment options.
nashsu/llm_wiki
A cross-platform desktop app for turning documents into an LLM-maintained wiki, with source traceability, graph search, optional vector retrieval, local API access, and a companion agent skill.
mastra-ai/mastra
A TypeScript framework for building AI agents and applications with model routing, workflows, human-in-the-loop steps, memory, tools, MCP servers, evals, and observability.
mem0ai/mem0
A memory layer for AI agents and assistants, with library, self-hosted, platform, SDK, CLI, cookbook, evaluation, and integration paths for persistent context.
MiniMax-AI/skills
A development skills library for AI coding agents, with structured guidance across frontend, fullstack, Android, iOS, and shader work.
allenai/molmoweb
A multimodal web agent from Ai2 that can navigate browser tasks from natural-language instructions.
HKUDS/nanobot
A lightweight personal AI agent project with packaged WebUI, goal tracking for longer tasks, image generation, provider presets, fallback models, plugin-style tools, channel integrations, and security hardening noted in its v0.2.0 release.
qwibitai/nanoclaw
A lightweight personal agent system that runs agents in isolated containers and connects them to messaging channels, memory, and scheduled jobs.
NVIDIA-AI-Blueprints/aiq
An NVIDIA AI Blueprint for agentic research workflows, with shallow and deep research modes, citation-backed answers, YAML-configured tools and agents, CLI/web/async job paths, evaluation materials, and deployment assets.
NVIDIA/skills
NVIDIA's public catalog of agent skills, framed around NVIDIA-verified skills, skill cards, scanning, signing, product-owned source repositories, and compatibility with the Agent Skills specification.
onyx-dot-app/onyx
An application layer for LLMs with a self-hostable interface and capabilities like RAG, web search, code execution, file creation, and deep research.
openagents-org/openagents
A collaboration project centered on AI agent networks designed to work together across shared workflows.
openai/openai-agents-python
A lightweight framework for multi-agent workflows, with tools, handoffs, guardrails, sessions, tracing, sandbox agents, and realtime voice support.
THU-MAIC/OpenMAIC
A multi-agent interactive classroom designed to offer an immersive learning experience with one-click setup.
rui-ye/OpenSeeker
A search agent system built around released training data, released models, and tool-based web information seeking.
HKUDS/OpenSpace
A framework focused on building agents that are smarter, lower-cost, and able to improve through self-evolving workflows.
alibaba/page-agent
A JavaScript in-page GUI agent for controlling web interfaces with natural language, aimed at browser-based workflows and interface automation.
VectifyAI/PageIndex
A vectorless, reasoning-based RAG framework for long-document retrieval, tree-structured indexing, traceable document search, and agent context workflows.
paperclipai/paperclip
A Node.js server and React UI for orchestrating teams of AI agents, assigning goals, and tracking work and costs from one dashboard.
pipecat-ai/pipecat
A Python framework and ecosystem for real-time voice and multimodal AI agents, with audio/video pipelines, transports, client SDKs, structured flows, and subagent support.
infiniflow/ragflow
A practical RAG and agent-context platform for document ingestion, chunking, retrieval, citations, knowledge workflows, and self-hosted AI applications.
agentscope-ai/ReMe
A memory management framework for AI agents, with file-based and vector-based systems for long-term memory and cross-session recall.
openai/symphony
An OpenAI engineering preview and specification for orchestrating coding agents from project work queues into isolated autonomous implementation runs.
Tencent/TencentDB-Agent-Memory
A local memory plugin for AI agents, with symbolic short-term memory, layered long-term memory, SQLite defaults, OpenClaw integration, Hermes support, and project-reported benchmark results.
tinyfish-io/skills
A public skills repo for TinyFish agent workflows, including web-agent automation and related utility skills.
bytedance/UI-TARS-desktop
A ByteDance GUI-agent desktop app and multimodal agent stack for local or remote computer and browser operation, with vision-language model control, CLI/Web UI paths, and MCP-oriented tooling.
Lum1104/Understand-Anything
A codebase and knowledge-base graph tool for AI coding environments, with plugin paths for Claude Code, Codex, Cursor, Copilot, Gemini CLI, and others, plus search, chat, tours, diff impact views, and an interactive dashboard.
ItzCrazyKns/Vane
A self-hostable AI answering engine for private search-style workflows, with local and cloud model providers, SearxNG-backed web search, cited sources, file uploads, and Docker setup paths.
Embodied / Physical AI
dimensionalOS/dimos
An operating system layer for controlling robots and other hardware platforms with natural-language workflows.
norma-core/hardware/elrobot
A low-cost 3D-printed robotic arm intended for physical AI research and imitation learning.
freemocap/freemocap
A research-grade motion capture system designed to stay low-cost, hardware-agnostic, and accessible for scientific, educational, and training use.
wu-yc/LabClaw
A large package of workflow skills for biomedical and scientific AI work across multiple lab-heavy domains.
unitreerobotics/unifolm-wbt-dataset
A real-world humanoid robot whole-body teleoperation dataset for open environments.
Productivity
nexu-io/open-design
A local-first AI design workspace that connects coding-agent CLIs to prototypes, decks, media outputs, design systems, sandboxed previews, and export workflows.
1weiho/open-slide
An agent-native slide framework for building React-based decks with coding agents, browser preview, comments, assets, present mode, and HTML/PDF export.
yazinsai/OpenOats
A meeting note-taking assistant designed to be more conversational and responsive than passive transcription.
crynta/terax-ai
A lightweight AI-native terminal and developer environment built with Tauri, Rust, and React, with a terminal, code editor, file explorer, web preview, AI side panel, BYOK providers, local model support, and approval-style file tools.
warpdotdev/warp
An agentic development environment born out of the terminal, with built-in coding-agent workflows and support for bringing external CLI agents into developer work.
Ecosystem
VoltAgent/awesome-design-md
A curated collection of DESIGN.md example files inspired by public websites, intended to help AI coding agents understand visual systems, design tokens, layout rules, and UI guardrails.
cocoindex-io/cocoindex
An incremental data engine for keeping AI-agent and LLM-app context fresh, with Python-native pipelines, delta-only processing, lineage, connectors, and targets for vector, graph, relational, and warehouse stores.
TencentCloud/CubeSandbox
Sandbox infrastructure for AI agents, positioned around fast startup, isolation, high concurrency, and self-hosted code-execution workflows.
NVIDIA-NeMo/DataDesigner
A synthetic data generation framework for creating structured datasets from scratch or seed data, with dependency-aware generation, validation, and quality scoring.
google-labs-code/design.md
A format specification and CLI toolkit for describing a design system to coding agents, positioned around persistent visual guidance, linting, and token-level design workflows.
googleworkspace/cli
A command-line interface for Workspace services, with repository notes that it is not officially supported by Google and is still changing.
heygen-com/hyperframes
A video rendering framework for HTML-based compositions, positioned around agent-friendly workflows, previewing, and MP4 rendering.
Vaibhavs10/insanely-fast-whisper
An opinionated CLI for very fast on-device transcription with Whisper.
K-Dense-AI/k-dense-byok
A desktop co-scientist setup built around scientific skills and bring-your-own-key workflows.
yichuan-w/LEANN
A lightweight vector database for personal RAG and semantic search, designed to run locally with much lower storage overhead.
lightpanda-io/browser
A headless browser designed with AI automation use cases in mind.
run-llama/liteparse
A local PDF parsing tool focused on fast, lightweight parsing, bounding boxes, OCR flexibility, and screenshots for agent workflows.
hiyouga/LlamaFactory
A unified fine-tuning and deployment platform for 100+ LLMs and VLMs, with a zero-code CLI, web UI, and support for many training approaches.
mnfst/manifest
A smart model router for personal AI agents, positioned around cost-aware request routing, fallbacks, provider control, and self-hosted agent workflows.
MiniMax-AI/cli
The official MiniMax CLI for terminal and agent workflows, with commands for text, image, video, speech, music, vision, and search.
openai/plugins
A curated collection of Codex plugin examples, manifests, and supporting files for extending Codex-based workflows.
openai/privacy-filter
A privacy-filtering model and local toolkit for detecting and masking personally identifiable information in text, positioned around high-throughput sanitization workflows.
PaddlePaddle/PaddleOCR
A document AI toolkit for turning PDFs and images into structured, LLM-ready data, positioned around multilingual OCR, document parsing, and agent-ready extraction workflows.
yusufkaraaslan/Skill_Seekers
A preprocessing layer for turning raw documentation into reusable inputs for skills, RAG pipelines, and AI coding tools.
github/spec-kit
A toolkit for spec-driven development, positioned around structured workflows, predictable implementation paths, and AI coding agent integrations.
vllm-project/vllm-omni
A framework for serving and running omni-modality models more efficiently.
jamiepine/voicebox
A local-first voice synthesis studio for voice cloning, speech generation, effects, and voice-powered app workflows.
Datasets
evolvent-ai/ClawMark
A living-world benchmark for multi-day, multimodal coworker agents, spanning 100 tasks across professional domains and real tool environments.
meituan-longcat/General365
A manually curated benchmark for general reasoning in LLMs, designed around high difficulty, broad task diversity, K-12-scope knowledge, and hybrid scoring.
meituan-longcat/LARYBench
A benchmark for evaluating latent action representations, with pipelines for action semantics, robotic control regression, and broader vision-to-action alignment.
openai/monitorability-evals
An OpenAI evaluation-data release for studying monitorability, with public eval splits, prompt templates, dataset mappings, and metric code from the Monitoring Monitorability paper.
allenai/olmOCR-bench
A benchmark for evaluating how well OCR systems convert PDFs into useful markdown while preserving structure.
run-llama/ParseBench
A document parsing benchmark for AI-agent workflows, focused on whether parsed PDFs preserve enough structure and meaning for reliable downstream use.
google/WaxalNLP
A large multilingual speech corpus for African languages introduced through the WAXAL paper.
Also in AI
A good next step might be AI Guides for help with choosing and using AI tools well, AI Access for free and low-cost ways to compare AI model access, or AI Ballot for a clearer view of what readers are leaning toward.