LIFEHUBBER

AI Resources

Open-ish AI resources organized for browsing.

A selective list of notable AI models, tools, datasets, and experiments, organized for easy browsing.

LifeHubber AI Resources is selective, not exhaustive. It is an editorial starting point, not an endorsement. Availability, access, usage limits, and terms can vary, so please review original project materials before relying on a resource.

What this is

A selective reference

A selective collection of AI projects, tools, and resources organized for easier browsing.

How to use it

Browse by category

Each entry includes a short description, a source label, and a direct link, so the list is quick to scan.

Editorial approach

Selective, not exhaustive

The aim is a cleaner signal: fewer items, clearer categories, and room for future updates.

Browse the list

Explore by section or filter the page.

102 resources

AI Models

Models and experiments

21

Arnis

louis-e/arnis

GitHub

Generates real-world locations inside Minecraft with a surprisingly high level of detail.

World generation, mapping

DeepSeek-OCR-2

deepseek-ai/DeepSeek-OCR-2

GitHub

A newer DeepSeek OCR model release for image/PDF OCR, document-to-Markdown workflows, dynamic resolution, vLLM/Transformers inference, and visual causal flow research.

OCR, document understanding

DeepSeek-V4

deepseek-ai/deepseek-v4

Hugging Face

A DeepSeek model family release positioned around long-context intelligence, reasoning modes, coding benchmarks, and agentic task evaluation.

Reasoning models, long context

Gemma 4

google/gemma-4

Kaggle

A family of multimodal models from Google DeepMind that handle text and image input and generate text output.

Multimodal models

GLM-5.1

zai-org/GLM-5.1

Hugging Face

A flagship text-generation model positioned around agentic engineering, stronger coding performance, and longer-horizon tool use.

Agentic coding models

GLM-OCR

zai-org/GLM-OCR

GitHub

A multimodal OCR model for complex document understanding, positioned around strong real-world document parsing and efficient deployment.

OCR models, document understanding

Hy-MT1.5-1.8B-1.25bit

AngelSlim/Hy-MT1.5-1.8B-1.25bit

Hugging Face

A low-bit on-device translation model from AngelSlim, positioned around 33-language offline translation, GGUF access, Android demo use, and 1.25-bit compression.

On-device translation, model compression

Hy3 preview

tencent/Hy3-preview

Hugging Face

A Tencent Hy Team MoE model positioned around long-context reasoning, instruction following, coding, and agent task evaluation.

Reasoning models, coding agents

Kimi-K2.6

moonshotai/Kimi-K2.6

Hugging Face

A multimodal agentic model positioned around long-horizon coding, tool use, autonomous execution, and broader software workflows.

Agentic coding models

LFM2.5-350M

LiquidAI/LFM2.5-350M

Hugging Face

A hybrid model in the LFM2.5 family built for on-device deployment, with extended pre-training and reinforcement learning.

On-device models

Ling-2.6-flash

inclusionAI/Ling-2.6-flash

Hugging Face

An inclusionAI instruct model positioned around faster responses, token efficiency, tool use, multi-step planning, and agent-oriented workloads.

Efficient agent model

LingBot-Map

robbyant/lingbot-map

GitHub

A feed-forward 3D foundation model for streaming scene reconstruction, positioned around geometric consistency, long sequences, and efficient real-time inference.

Streaming 3D reconstruction

Lyra

nv-tlabs/lyra

GitHub

A series of generative 3D world models from NVIDIA, positioned around explorable scenes, 3D consistency, and world-scale generation workflows.

3D world models

MiMo-V2.5

XiaomiMiMo/mimo-v25

Hugging Face

A Xiaomi MiMo model family positioned around multimodal understanding, agentic workflows, long-context use, and Pro variants for harder software and tool-heavy tasks.

Multimodal models, agentic workflows

MiniMax-M2.7

MiniMaxAI/MiniMax-M2.7

Hugging Face

A large MiniMax model focused on agentic work, software engineering, tool use, and complex productivity workflows.

Agentic models

MOSS-VL

OpenMOSS-Team/moss-vl

Hugging Face

An OpenMOSS vision-language family with Base and Instruct releases for image, video, OCR, and document understanding work.

Vision-language models, multimodal understanding

Qwen3.6-35B-A3B

Qwen/Qwen3.6-35B-A3B

Hugging Face

An open-weight multimodal model positioned around agentic coding, tool use, long-context work, and real-world software workflows.

Agentic coding models, long context

TIPS / TIPSv2

google-deepmind/tips

GitHub

A family of vision-language encoders from Google DeepMind, positioned around image-text pretraining, spatial awareness, and general-purpose multimodal applications.

Vision-language encoders, spatial understanding

TRELLIS.2

microsoft/TRELLIS.2

GitHub

A Microsoft 3D generation model for high-fidelity image-to-3D asset creation, using O-Voxel structured latents, PBR materials, inference code, and training tools.

3D generation, image-to-3D

Trinity-Large-Thinking

arcee-ai/trinity-large-thinking

Hugging Face

A model designed for coherent multi-turn behavior, clean tool use, constrained instruction following, and efficient serving at scale.

Reasoning models

WildDet3D

allenai/WildDet3D

GitHub

A promptable 3D detection system for real-world scenes, positioned around text, point, and box prompts for spatial perception workflows.

3D perception models

Speech Models

Speech input, output, and interaction

10

Cohere Transcribe

CohereLabs/cohere-transcribe-03-2026

Hugging Face

A 2B-parameter automatic speech recognition model for audio-in, text-out transcription across 14 languages.

STT, ASR

Fish Audio S2 Pro

fishaudio/s2-pro

Hugging Face

A text-to-speech model with detailed control over prosody and emotional delivery.

TTS, expressive speech

KittenTTS

KittenML/KittenTTS

GitHub

A very small text-to-speech model designed to stay lightweight without feeling toy-like.

Compact TTS

MiMo-V2.5-ASR

XiaomiMiMo/MiMo-V2.5-ASR

GitHub

A Xiaomi MiMo speech-recognition model focused on Mandarin, English, Chinese dialects, code-switched speech, noisy audio, songs, and multi-speaker transcription.

ASR, dialects, code-switching

MOSS-Audio

OpenMOSS/MOSS-Audio

GitHub

An audio-understanding model family for speech, sound, music, captioning, time-aware QA, ASR, and reasoning over real-world audio.

Audio understanding, ASR, reasoning

MOSS-TTS-Nano

OpenMOSS/MOSS-TTS-Nano

GitHub

A tiny multilingual speech generation model positioned for real-time TTS, CPU-friendly local use, and lightweight deployment.

TTS, realtime speech

PersonaPlex

NVIDIA/personaplex

GitHub

A real-time full-duplex speech-to-speech conversational model with persona control through role prompts and voice conditioning.

STS, conversational speech

sarashina2.2-tts

sbintuitions/sarashina2.2-tts

Hugging Face

A Japanese-centric text-to-speech system from SB Intuitions, with Japanese and English generation, style transfer, and zero-shot voice generation support.

Japanese TTS, voice generation

TADA

HumeAI/tada

Hugging Face

A speech-language model that aligns speech and text into a single synchronized stream.

Speech-language modeling

VoxCPM2

openbmb/VoxCPM2

Hugging Face

A multilingual text-to-speech model with voice design, controllable voice cloning, and streaming support.

TTS, voice cloning

Music / Image Gen Models

Generative media models

5

ACE-Step 1.5

ace-step/ACE-Step-1.5

GitHub

A local music generation model aimed at fast song creation on consumer hardware, with support across CUDA, AMD, Intel, Mac, and CPU setups.

Music generation

AniGen

VAST-AI-Research/AniGen

GitHub

A framework for generating animatable 3D assets from a single image, with mesh, skeleton, and skinning outputs for downstream animation and simulation workflows.

Animatable 3D asset generation

CoMoVi

IGL-HKUST/CoMoVi

GitHub

A framework for co-generating 3D human motion and realistic videos, with a focus on motion-conditioned video generation and training workflows.

Human motion and video generation

Fooocus

lllyasviel/Fooocus

GitHub

A local image-generation interface built around prompt-focused SDXL workflows, with Windows downloads, Colab access, inpainting, outpainting, image prompts, and presets.

Image generation UI

PersonaLive

GVCLab/PersonaLive

GitHub

A portrait image-animation framework for live-streaming-style video generation research, with offline and online inference, pretrained weights, a Web UI, and acceleration notes.

Portrait animation, video generation

AI Agents

Agents and interfaces

31

Agent-Reach

Panniantong/Agent-Reach

GitHub

A CLI that gives AI agents broader web reach across platforms like Twitter, Reddit, YouTube, GitHub, Bilibili, and XiaoHongShu without paid API usage.

Agent tooling, web access

AgentScope

agentscope-ai/agentscope

GitHub

A production-ready agent framework with core abstractions, visibility tooling, and built-in support for fine-tuning workflows.

Agent frameworks

AIPOCH Medical Research Skills

aipoch/medical-research-skills

GitHub

A curated library of medical research agent skills designed to support evidence review, protocol design, data analysis, and academic writing workflows.

Agent skills, medical research

cabinet

hilash/cabinet

GitHub

An AI-first knowledge base and workspace system with agents, memory, scheduled jobs, and local file-based storage.

Agent workspaces

CatchMe

HKUDS/CatchMe

GitHub

A lightweight, vectorless system for capturing a broader digital footprint as usable context.

Agent memory, context capture

Claude Code Game Studios

Donchitos/Claude-Code-Game-Studios

GitHub

A multi-agent game-development studio system for Claude Code, organized around specialized agents, workflow skills, hooks, rules, and templates.

Agent systems, game development

Cua

trycua/cua

GitHub

Infrastructure for computer-use agents, with sandboxes, SDKs, benchmarks, and model integrations for agents working across desktop environments.

Computer-use agents

DeepTutor

HKUDS/DeepTutor

GitHub

An agent-native personalized tutoring system with tutoring workflows, persistent memory, a web app, CLI access, and a broader learning-support architecture.

Education agents

DeerFlow

bytedance/deer-flow

GitHub

A ByteDance long-horizon agent harness for deep research, coding, file work, report generation, skills, sub-agents, memory, sandboxed execution, and message gateways.

Long-horizon agents, research workflows

Gemma 4 Browser Extension

nico-martin/gemma4-browser-extension

GitHub

An independent Chrome extension experiment for running an on-device browser agent with Transformers.js, WebGPU, Gemma 4, page RAG, tab tools, and semantic history search.

Browser agents, on-device AI

gitagent

open-gitagent/gitagent

GitHub

A framework-agnostic, git-native standard for defining and sharing AI agents.

Agent standards

goose

block/goose

GitHub

An on-machine AI agent for complex development work, including coding, execution, debugging, workflow orchestration, and API interaction.

Coding agents

Hindsight

vectorize-io/hindsight

GitHub

An agent memory system designed to help agents learn over time rather than only recall conversation history.

Agent memory, learning

II-Agent

Intelligent-Internet/ii-agent

GitHub

An AI agent for practical work, built to be run, forked, and extended across solo, team, and internal-tooling use cases.

General-purpose agents

MiniMax Skills

MiniMax-AI/skills

GitHub

A development skills library for AI coding agents, with structured guidance across frontend, fullstack, Android, iOS, and shader work.

Agent skills, coding

MolmoWeb

allenai/molmoweb

GitHub

A multimodal web agent from Ai2 that can navigate browser tasks from natural-language instructions.

Web agents

nanobot

HKUDS/nanobot

GitHub

An ultra-lightweight personal AI agent project focused on core agent workflows, compact implementation, and readable extension points.

Personal agents

NanoClaw

qwibitai/nanoclaw

GitHub

A lightweight personal agent system that runs agents in isolated containers and connects them to messaging channels, memory, and scheduled jobs.

Personal agents, container isolation

Onyx

onyx-dot-app/onyx

GitHub

An application layer for LLMs with a self-hostable interface and capabilities like RAG, web search, code execution, file creation, and deep research.

Agent interfaces

OpenAgents

openagents-org/openagents

GitHub

A collaboration project centered on networks of AI agents that work together across shared workflows.

Agent networks, collaboration

OpenAI Agents SDK

openai/openai-agents-python

GitHub

A lightweight framework for multi-agent workflows, with tools, handoffs, guardrails, sessions, tracing, sandbox agents, and realtime voice support.

Agent frameworks

OpenMAIC

THU-MAIC/OpenMAIC

GitHub

A multi-agent interactive classroom designed to offer an immersive learning experience with one-click setup.

Education agents, multi-agent learning

OpenSeeker

rui-ye/OpenSeeker

GitHub

A search agent system built around released training data, released models, and tool-based web information seeking.

Search agents, information seeking

OpenSpace

HKUDS/OpenSpace

GitHub

A framework focused on building agents that are smarter, lower-cost, and able to improve through self-evolving workflows.

Agent systems, self-evolving

Page Agent

alibaba/page-agent

GitHub

A JavaScript in-page GUI agent for controlling web interfaces with natural language, aimed at browser-based workflows and interface automation.

GUI agents, browser control

PageIndex

VectifyAI/PageIndex

GitHub

A vectorless, reasoning-based RAG framework for long-document retrieval, tree-structured indexing, traceable document search, and agent context workflows.

Vectorless RAG, agent context

Paperclip

paperclipai/paperclip

GitHub

A Node.js server and React UI for orchestrating teams of AI agents, assigning goals, and tracking work and costs from one dashboard.

Agent orchestration, dashboards

RAGFlow

infiniflow/ragflow

GitHub

A practical RAG and agent-context platform for document ingestion, chunking, retrieval, citations, knowledge workflows, and self-hosted AI applications.

RAG, agent context

ReMe

agentscope-ai/ReMe

GitHub

A memory management framework for AI agents, with file-based and vector-based systems for long-term memory and cross-session recall.

Agent memory

Symphony

openai/symphony

GitHub

An OpenAI engineering preview and specification for orchestrating coding agents from project work queues into isolated autonomous implementation runs.

Agent orchestration, coding agents

TinyFish Skills

tinyfish-io/skills

GitHub

A public skills repo for TinyFish agent workflows, including web-agent automation and related utility skills.

Agent skills, web automation

Embodied / Physical AI

Robotics and physical systems

5

dimos

dimensionalOS/dimos

GitHub

An operating system layer for controlling robots and other hardware platforms with natural-language workflows.

Agentic physical systems

elrobot

norma-core/hardware/elrobot

GitHub

A low-cost 3D-printed robotic arm intended for physical AI research and imitation learning.

Robotics hardware

FreeMoCap

freemocap/freemocap

GitHub

A research-grade motion capture system designed to stay low-cost, hardware-agnostic, and accessible for scientific, educational, and training use.

Motion capture, embodied AI

LabClaw

wu-yc/LabClaw

GitHub

A large package of workflow skills for biomedical and scientific AI work across multiple lab-heavy domains.

Scientific workflows

UnifoLM-WBT-Dataset

unitreerobotics/unifolm-wbt-dataset

Hugging Face

A real-world humanoid robot whole-body teleoperation dataset for open environments.

Robotics dataset

Productivity

Useful daily tools

3

open-slide

1weiho/open-slide

GitHub

An agent-native slide framework for building React-based decks with coding agents, browser preview, comments, assets, present mode, and HTML/PDF export.

Agent-authored slides

OpenOats

yazinsai/OpenOats

GitHub

A meeting note-taking assistant designed to be more conversational and responsive than passive transcription.

Meetings, note-taking

Warp

warpdotdev/warp

GitHub

An agentic development environment born out of the terminal, with built-in coding-agent workflows and support for bringing external CLI agents into developer work.

Agentic developer environments

Ecosystem

Tools around the stack

20

CubeSandbox

TencentCloud/CubeSandbox

GitHub

Sandbox infrastructure for AI agents, positioned around fast startup, isolation, high concurrency, and self-hosted code-execution workflows.

Agent execution infrastructure

DataDesigner

NVIDIA-NeMo/DataDesigner

GitHub

A synthetic data generation framework for creating structured datasets from scratch or seed data, with dependency-aware generation, validation, and quality scoring.

Synthetic data tooling

DESIGN.md

google-labs-code/design.md

GitHub

A format specification and CLI toolkit for describing a design system to coding agents, positioned around persistent visual guidance, linting, and token-level design workflows.

Design-system tooling for coding agents

Google Workspace CLI

googleworkspace/cli

GitHub

A single command-line interface for Drive, Gmail, Calendar, Docs, Sheets, Chat, Admin, and related workflows.

Workspace automation

HyperFrames

heygen-com/hyperframes

GitHub

A video rendering framework for HTML-based compositions, positioned around agent-friendly workflows, previewing, and MP4 rendering.

Agent-friendly media tooling

insanely-fast-whisper

Vaibhavs10/insanely-fast-whisper

GitHub

An opinionated CLI for very fast on-device transcription with Whisper.

Transcription, local inference

k-dense-byok

K-Dense-AI/k-dense-byok

GitHub

A desktop co-scientist setup built around scientific skills and bring-your-own-key workflows.

Scientific assistants

LEANN

yichuan-w/LEANN

GitHub

A lightweight vector database for personal RAG and semantic search, designed to run locally with much lower storage overhead.

RAG infrastructure, vector databases

Lightpanda Browser

lightpanda-io/browser

GitHub

A headless browser designed with AI automation use cases in mind.

Automation infrastructure

liteparse

run-llama/liteparse

GitHub

A local PDF parsing tool focused on fast, lightweight parsing, bounding boxes, OCR flexibility, and screenshots for agent workflows.

Document parsing tools

LLaMA Factory

hiyouga/LlamaFactory

GitHub

A unified fine-tuning and deployment platform for 100+ LLMs and VLMs, with a zero-code CLI, web UI, and support for many training approaches.

Fine-tuning tooling

Manifest

mnfst/manifest

GitHub

A smart model router for personal AI agents, positioned around cost-aware request routing, fallbacks, provider control, and self-hosted agent workflows.

Model routing, agent infrastructure

MiniMax CLI

MiniMax-AI/cli

GitHub

The official MiniMax CLI for terminal and agent workflows, with commands for text, image, video, speech, music, vision, and search.

Multimodal CLI

OpenAI Plugins

openai/plugins

GitHub

A curated collection of Codex plugin examples, manifests, and supporting files for extending Codex-based workflows.

Codex plugins, developer tooling

OpenAI Privacy Filter

openai/privacy-filter

GitHub

A privacy-filtering model and local toolkit for detecting and masking personally identifiable information in text, positioned around high-throughput sanitization workflows.

Privacy tooling, PII filtering

PaddleOCR

PaddlePaddle/PaddleOCR

GitHub

A document AI toolkit for turning PDFs and images into structured, LLM-ready data, positioned around multilingual OCR, document parsing, and agent-ready extraction workflows.

OCR and document AI

Skill Seekers

yusufkaraaslan/Skill_Seekers

GitHub

A preprocessing layer for turning raw documentation into reusable inputs for skills, RAG pipelines, and AI coding tools.

Knowledge tooling

Spec Kit

github/spec-kit

GitHub

A toolkit for spec-driven development, positioned around structured workflows, predictable implementation paths, and AI coding agent integrations.

AI coding workflows

vllm-omni

vllm-project/vllm-omni

GitHub

A framework for serving and running omni-modality models more efficiently.

Inference infrastructure

Voicebox

jamiepine/voicebox

GitHub

A local-first voice synthesis studio for voice cloning, speech generation, effects, and voice-powered app workflows.

Voice tooling

Datasets

Benchmarks and corpora

7

ClawMark

evolvent-ai/ClawMark

GitHub

A living-world benchmark for multi-day, multimodal coworker agents, spanning 100 tasks across professional domains and real tool environments.

Agent benchmark, multimodal evaluation

General365

meituan-longcat/General365

GitHub

A manually curated benchmark for general reasoning in LLMs, designed around high difficulty, broad task diversity, K-12-scope knowledge, and hybrid scoring.

Reasoning benchmark

LARYBench

meituan-longcat/LARYBench

GitHub

A benchmark for evaluating latent action representations, with pipelines for action semantics, robotic control regression, and broader vision-to-action alignment.

Vision-to-action benchmark

Monitorability Evals

openai/monitorability-evals

GitHub

An OpenAI evaluation-data release for studying monitorability, with public eval splits, prompt templates, dataset mappings, and metric code from the Monitoring Monitorability paper.

Model monitoring, eval datasets

olmOCR-bench

allenai/olmOCR-bench

Hugging Face

A benchmark for evaluating how well OCR systems convert PDFs into useful markdown while preserving structure.

OCR benchmark

ParseBench

run-llama/ParseBench

GitHub

A document parsing benchmark for AI-agent workflows, focused on whether parsed PDFs preserve enough structure and meaning for reliable downstream use.

Document parsing benchmark

WaxalNLP

google/WaxalNLP

Hugging Face

A large multilingual speech corpus for African languages introduced through the WAXAL paper.

Speech dataset
