01. Analysis
One hundred and thirty-seven stars separate LiveKit Agents (10,766) from TEN Framework (10,629) on GitHub as of May 8, 2026. That gap is statistical noise — meaningless as a selection criterion. Both projects are actively maintained, both target developers building applications where an AI participates live in voice or video conversations, and both are Apache 2.0 licensed. By every surface-level metric, they're interchangeable.
They're not. The architectural decisions each project made carry real consequences for how your application behaves in production, how easily you debug it at 2am, and how quickly your team ships the first working demo. I spent time pulling apart both codebases, and the differences are substantial enough to make this a genuine architectural choice — not just a star-count coin flip.
How we researched this
Our research pipeline queried the GitHub AI-for-video ecosystem on May 8, 2026, surfacing the five most-starred repositories in the space. HeyGen Hyperframes led at 23,135 stars, a wide margin ahead of LiveKit Agents (10,766) and TEN Framework (10,629), with backgroundremover (7,903) and ShortGPT (7,375) rounding out the dataset.
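The gaps between these counts are easy to check by hand. A minimal sketch of the arithmetic, using only the star counts quoted above; the `gap` helper is a name introduced here for illustration:

```python
# Star counts as reported in this article for May 8, 2026.
STARS = {
    "HeyGen Hyperframes": 23_135,
    "LiveKit Agents": 10_766,
    "TEN Framework": 10_629,
    "backgroundremover": 7_903,
    "ShortGPT": 7_375,
}


def gap(leader: str, trailer: str) -> tuple[int, float]:
    """Return (absolute gap, gap as a percentage of the trailer's stars)."""
    diff = STARS[leader] - STARS[trailer]
    return diff, 100 * diff / STARS[trailer]


livekit_vs_ten = gap("LiveKit Agents", "TEN Framework")
hyperframes_vs_livekit = gap("HeyGen Hyperframes", "LiveKit Agents")

if __name__ == "__main__":
    print(f"LiveKit vs TEN: {livekit_vs_ten[0]} stars ({livekit_vs_ten[1]:.1f}%)")
    print(f"Hyperframes vs LiveKit: {hyperframes_vs_livekit[0]} stars")
```

The absolute gap between the two frameworks is 137 stars, a little over one percent of either project's total, while Hyperframes leads LiveKit Agents by 12,369.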
Reddit and Hacker News community discussion returned empty results from our pipeline at time of publication. That's a meaningful absence: it means we have no verbatim practitioner quotes, no upvoted workaround threads, no "don't use this for X" warnings from people who burned time on edge cases. This analysis draws from the GitHub repositories themselves, the architectural patterns each project exposes in its public documentation, and the codebases as they stood on the research date. Official pricing pages for both projects' commercial cloud offerings returned no parseable data; treat any pricing mentioned here as estimates based on publicly documented tiers, not verified current rates.
What LiveKit Agents actually is
LiveKit Agents (GitHub: livekit/agents, 10,766 stars) is not a standalone project — it's the AI layer built on top of LiveKit, one of the most widely deployed open-source WebRTC infrastructures in existence. That parentage matters enormously.
LiveKit the infrastructure project has been around since 2021 and has accumulated serious production deployment history: real-time video conferencing applications, interactive streaming, collaborative tools. When the LiveKit team built their agents framework, they built it on that foundation, inheriting everything that foundation provides: tested WebRTC signaling, production-hardened TURN server integrations, browser SDK compatibility across Chrome, Firefox, and Safari, and a developer ecosystem that already knew how to run LiveKit in production.
The agents framework itself uses a "worker" architecture. You define an agent as a Python (or Node.js) class, and that agent runs as a worker that picks up jobs — typically a new participant joining a room. The framework handles the WebRTC plumbing: connecting your agent to the room, pulling the participant's audio stream, piping it through speech-to-text (Deepgram, Whisper, and others are supported via plugins), feeding the transcript to an LLM, converting the response back through text-to-speech (ElevenLabs, OpenAI TTS, Cartesia), and playing the audio back into the room. The VAD (voice activity detection) pipeline sits in the middle managing turn-taking.
What LiveKit Agents is genuinely good at: reducing the cognitive load of handling realtime media in Python. The plugin system is mature — you swap in a different STT or TTS provider by changing one line, and the rest of the pipeline stays intact. The framework manages the concurrency model for you; you write sequential-looking agent logic even though multiple streams are flowing in parallel.
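To make the "swap one line" idea concrete, here is a toy model of a speech-to-text, LLM, text-to-speech pipeline with interchangeable stages. It is a sketch under stated assumptions, not LiveKit's API: `VoicePipeline`, the stage signatures, and the fake providers are all hypothetical names, and the code uses only the standard library so it runs without network access.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; these are not LiveKit Agents types.
STT = Callable[[bytes], str]   # audio in, transcript out
LLM = Callable[[str], str]     # transcript in, reply text out
TTS = Callable[[str], bytes]   # reply text in, audio out


@dataclass
class VoicePipeline:
    stt: STT
    llm: LLM
    tts: TTS

    def handle_turn(self, audio: bytes) -> bytes:
        """Run one user turn through STT -> LLM -> TTS."""
        transcript = self.stt(audio)
        reply = self.llm(transcript)
        return self.tts(reply)


# Stand-in providers so the sketch runs offline.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def echo_llm(text: str) -> str:
    return f"You said: {text}"


def tts_provider_a(text: str) -> bytes:
    return b"A:" + text.encode("utf-8")


def tts_provider_b(text: str) -> bytes:
    return b"B:" + text.encode("utf-8")


pipeline = VoicePipeline(stt=fake_stt, llm=echo_llm, tts=tts_provider_a)
# Swapping the TTS provider is a one-line change; handle_turn is untouched.
swapped = VoicePipeline(stt=fake_stt, llm=echo_llm, tts=tts_provider_b)
```

The design point is that the agent logic depends only on the stage interface, so changing providers never touches `handle_turn`. The real framework layers streaming, VAD, and concurrency on top of this shape.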
Current limitation worth naming: LiveKit Agents is Python-first. The Node.js SDK exists but has historically lagged the Python implementation in features. If your team is TypeScript-native, check the feature parity before committing — as of May 8, 2026, the Python SDK was the one getting new capabilities first.
What TEN Framework actually is
TEN Framework (GitHub: TEN-framework/ten-framework, 10,629 stars) takes a fundamentally different architectural bet. Where LiveKit grew upward from WebRTC infrastructure, TEN was designed from scratch around a different primitive: a graph of extensions.
In TEN, your application is a directed graph of components called extensions. Each extension has defined input and output ports for different data types: audio frames, video frames, text messages, data blobs, commands. You wire extensions together into a pipeline — an STT extension takes raw audio frames and emits text messages; an LLM extension takes text messages and emits text responses; a TTS extension takes text and emits audio. The TEN runtime manages the message passing between extensions, handling buffering, back-pressure, and scheduling.
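The graph model can be sketched in a few dozen lines of Python. This is a deliberately simplified illustration, not TEN's runtime or API: the `Graph` class, the `(kind, payload)` message shape, and the `expect` helper are hypothetical, and the sketch omits the buffering, back-pressure, and scheduling that the real runtime handles.

```python
from collections import defaultdict, deque
from typing import Any, Callable, Optional

# Hypothetical message shape: (kind, payload). Not TEN's real message types.
Message = tuple[str, Any]
Extension = Callable[[Message], Optional[Message]]


class Graph:
    """Minimal dataflow runtime: routes each extension's output to its successors."""

    def __init__(self) -> None:
        self.extensions: dict[str, Extension] = {}
        self.edges: dict[str, list[str]] = defaultdict(list)
        self.sinks: dict[str, list[Message]] = defaultdict(list)

    def add(self, name: str, ext: Extension) -> None:
        self.extensions[name] = ext

    def connect(self, src: str, dst: str) -> None:
        self.edges[src].append(dst)

    def send(self, target: str, msg: Message) -> None:
        queue = deque([(target, msg)])
        while queue:
            name, incoming = queue.popleft()
            outgoing = self.extensions[name](incoming)
            if outgoing is None:
                continue
            successors = self.edges[name]
            if not successors:
                # No downstream extension: keep the output for inspection.
                self.sinks[name].append(outgoing)
            for nxt in successors:
                queue.append((nxt, outgoing))


def expect(kind: str, msg: Message) -> Any:
    """Reject messages of the wrong kind, mimicking a typed input port."""
    if msg[0] != kind:
        raise TypeError(f"expected {kind!r}, got {msg[0]!r}")
    return msg[1]


def stt_ext(msg: Message) -> Message:
    return ("text", expect("audio_frame", msg).decode("utf-8"))


def llm_ext(msg: Message) -> Message:
    return ("text", f"reply to: {expect('text', msg)}")


def tts_ext(msg: Message) -> Message:
    return ("audio_frame", expect("text", msg).encode("utf-8"))


graph = Graph()
for ext_name, ext in [("stt", stt_ext), ("llm", llm_ext), ("tts", tts_ext)]:
    graph.add(ext_name, ext)
graph.connect("stt", "llm")
graph.connect("llm", "tts")
graph.send("stt", ("audio_frame", b"hi"))
```

After the final `send`, the synthesized audio frame sits in `graph.sinks["tts"]`. Each extension knows only its own input and output kinds; the wiring lives entirely in the `connect` calls, which is the property that lets a real graph mix extensions written in different languages.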
This architecture has a real advantage: it's language-agnostic at the extension level. You can write one extension in C++ for low-latency audio processing, wire it to a Python extension for LLM interaction, and connect that to a Go extension for some custom output — all running in the same graph managed by the TEN runtime. LiveKit Agents doesn't do this. If you need a sub-5ms audio processing component, C++ is the right language for it, and TEN's extension model accommodates that in a way LiveKit's Python-first worker model doesn't.
TEN was developed with backing from Agora, the commercial real-time video infrastructure company (NASDAQ: API). That lineage means TEN has real expertise behind its audio/video handling — Agora has been in the realtime communications business since 2014. The potential downside is the same one you always weigh with corporate-backed open source: long-term roadmap alignment with the backing company's commercial interests.
Current limitation worth naming: TEN's graph-of-extensions architecture is more powerful than LiveKit's worker model, and more complex to debug. When something goes wrong in a LiveKit Agents app, you trace a single Python execution path. When something goes wrong in a TEN graph, you're inspecting message flows across extension boundaries — a harder debugging surface, especially for developers new to dataflow architectures.
Where they actually diverge
The headline numbers are almost identical. The architectures are not. Here's where the real differences land:
Setup time to first working demo
LiveKit Agents wins this decisively. The Python SDK has a livekit-agents package on PyPI, a CLI that scaffolds a starter agent in under a minute, and extensive quickstart documentation with copy-paste code. A developer with no prior LiveKit experience can have a working voice agent in a LiveKit room in under two hours. The framework abstracts enough of the WebRTC complexity that you don't need to understand rooms, tracks, or signaling to get started.
TEN has a steeper initial ramp. The graph configuration uses a JSON schema (manifest.json, property.json) that you must understand before your first extension runs. The TEN runtime is a separate binary that your extensions load into — you're not just running a Python script, you're configuring a process. The payoff of that complexity is power; the cost is that the first afternoon is harder.
Latency profile
Both frameworks target sub-500ms round-trip latency for typical voice AI applications (STT → LLM → TTS cycles). Neither achieves this universally — latency is dominated by the AI service APIs you call, not the framework overhead. A GPT-4o call with streaming and ElevenLabs turbo TTS will perform similarly whether you're in LiveKit Agents or TEN.
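A back-of-the-envelope budget shows why the framework choice barely moves this number. The per-stage figures below are assumed placeholder values chosen for illustration, not measurements of any provider or of either framework:

```python
# Illustrative per-stage latencies in milliseconds. These are assumed
# placeholder values, not measurements of any provider or framework.
ASSUMED_STAGE_MS = {
    "stt_final_transcript": 150,
    "llm_first_token": 200,
    "tts_first_audio": 100,
    "framework_overhead": 10,
}

TARGET_MS = 500  # the sub-500ms round-trip target discussed above


def round_trip_ms(stages: dict[str, int]) -> int:
    """Total round-trip latency as the sum of sequential stages."""
    return sum(stages.values())


def share(stages: dict[str, int], name: str) -> float:
    """Fraction of the round trip spent in one stage."""
    return stages[name] / round_trip_ms(stages)


total = round_trip_ms(ASSUMED_STAGE_MS)
framework_share = share(ASSUMED_STAGE_MS, "framework_overhead")

if __name__ == "__main__":
    print(f"total: {total} ms, within target: {total < TARGET_MS}")
    print(f"framework share: {framework_share:.1%}")
```

Under these assumptions the three service calls account for nearly all of the budget. Substitute your own measured numbers before drawing conclusions about a specific stack.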
Where TEN has a genuine edge: if you need to insert custom signal processing — noise suppression, voice enhancement, echo cancellation — before the STT stage, TEN's C++ extension support lets you run that processing at near-native speed within the same graph. LiveKit Agents requires you to handle that as a separate preprocessing step, typically in Python or via an external service. For most developers building standard voice assistants, this doesn't matter. For teams building applications with challenging audio environments (open office, noisy backgrounds), the difference is real.
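For a sense of what a pre-STT processing step looks like, here is a crude noise gate in pure Python. It is a stand-in for real noise suppression, not an implementation of it, and it is not part of either framework. The function names, the 160-sample frame (10 ms at an assumed 16 kHz sample rate), and the threshold are all illustrative assumptions.

```python
import math
from typing import Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square level of a frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(
    samples: Sequence[int],
    frame_size: int = 160,      # assumed: 10 ms at 16 kHz
    threshold: float = 500.0,   # assumed: tune per microphone and room
) -> list[int]:
    """Zero out frames whose RMS level falls below the threshold."""
    out: list[int] = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if rms(frame) >= threshold:
            out.extend(frame)
        else:
            out.extend([0] * len(frame))
    return out


quiet = [10] * 160     # background hiss, well below the threshold
loud = [2000] * 160    # speech-level signal
gated = noise_gate(quiet + loud)
```

A per-sample Python loop like this is exactly the kind of work that gets expensive at real-time rates, which is the case for running it as a native extension instead.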
Deployment model
LiveKit Agents deploys as a worker process that connects to a LiveKit server, either self-hosted or LiveKit's managed cloud. The managed cloud offers a free tier; we estimate it at roughly 10,000 connection minutes per month, but our pipeline could not verify current pricing. Kubernetes-based deployment is well-documented in the LiveKit ecosystem.
TEN's deployment model is more flexible at the cost of being less opinionated. You deploy the TEN runtime and your extensions — this can be Docker containers, bare metal, or cloud VMs, but you're responsible for more of the infrastructure decisions. TEN doesn't have an equivalent to LiveKit's managed cloud as a first-party option.
Plugin and extension ecosystem
LiveKit Agents has a richer pre-built plugin catalog: Deepgram, AssemblyAI, Whisper (local), OpenAI, Anthropic, Google Gemini on the STT/LLM side; ElevenLabs, OpenAI TTS, Cartesia, PlayHT on the TTS side. The plugin interface is stable and well-documented. You can swap providers without rewriting agent logic.
TEN's extension ecosystem is newer and smaller in terms of officially supported community extensions. The upside is that writing a new TEN extension is genuinely approachable if you're comfortable with the graph model — the architecture is designed for extension. But if you need to ship something this month using off-the-shelf integrations, LiveKit's catalog is deeper today.
Comparison table
| Dimension | LiveKit Agents (10,766 ★) | TEN Framework (10,629 ★) |
|---|---|---|
| Architecture | Worker model on WebRTC infrastructure | Graph of typed extensions |
| Primary language | Python (Node.js supported, less feature-complete) | C, C++, Go, Python (polyglot per extension) |
| Time to first demo | ~2 hours for a working voice agent | ~1 day including graph config learning curve |
| Audio processing | Python/plugin-based | Native C++ extension support |
| Pre-built integrations | Deepgram, OpenAI, Anthropic, ElevenLabs, Cartesia, PlayHT, others | Smaller catalog; extensible architecture |
| Managed cloud option | Yes — LiveKit Cloud (free tier available) | No first-party managed cloud |
| Debug experience | Linear Python traces | Multi-extension message flow inspection |
| Backing | LiveKit (VC-backed startup) | Agora (public company, NASDAQ: API) |
| License | Apache 2.0 | Apache 2.0 |
| Maintenance status (as of May 8, 2026) | Active | Active |
Community sentiment data not available from our research pipeline at time of publication. Pricing tiers unverified.
What we'd use and why
For most developers building voice or video AI agents: LiveKit Agents. The practical advantage is not architectural elegance — TEN's extension graph is arguably a cleaner abstraction for complex pipelines. The advantage is time. The plugin catalog is richer today, the Python SDK is more mature, the managed cloud option removes infrastructure decisions from sprint one, and the debugging model is familiar to anyone who's traced a Python async application. If you're a team of one trying to ship a working AI voice feature in the next two weeks, LiveKit Agents is the path of least resistance.
For teams with specific audio quality requirements or polyglot pipeline needs: TEN Framework. If you need to run custom C++ audio processing within the same pipeline as your LLM calls — noise suppression, acoustic echo cancellation, audio enhancement before STT — TEN's extension architecture handles this cleanly. If your team has components in multiple languages and you want them to communicate through a typed message-passing system rather than an ad-hoc IPC layer, TEN's runtime is purpose-built for that. The initial investment is higher; the ceiling is higher too.
What I'd personally avoid: Using star counts as the decision criterion. At 10,766 vs. 10,629, you're looking at a gap of 137 stars, or about 1.3%, which is within the range of ordinary week-to-week GitHub traffic. The right question isn't "which one is more popular," because they're effectively equally popular. The right question is "which architecture fits the constraints of what I'm building."
Limitations of this analysis
The most significant gap in this analysis is the absence of community experience data. Neither Reddit nor Hacker News returned discussion threads from our research pipeline. This means we have no practitioner quotes about what breaks in production, which edge cases each framework handles poorly, or what the maintainer responsiveness is like when you file an issue. For frameworks at this level of complexity — both involve real-time audio processing, LLM API integration, and concurrent stream handling — community debugging resources matter. Check the GitHub issues and Discord channels for each project before committing; issues pages often surface the failure modes that documentation doesn't mention.
Additionally, both frameworks are moving fast enough that specific feature parity claims in this article may be outdated within months. The Node.js/TypeScript support gap in LiveKit Agents, for instance, is likely narrowing. Verify current SDK feature parity against the respective changelogs before making a technical decision based on this article.
Pricing for managed infrastructure options (LiveKit Cloud, and any commercial TEN-adjacent offerings) was not verified from our pipeline. Get current pricing directly from each vendor's pricing page before budgeting a production deployment.
Bottom line
LiveKit Agents and TEN Framework are the two strongest open-source options for building real-time AI voice and video agents as of May 2026 — and their near-identical GitHub star counts correctly reflect that both are worth taking seriously. The choice between them is not about quality; it's about fit. LiveKit Agents is faster to get to a working demo, has a deeper pre-built integration catalog, and offers a managed cloud option for teams that don't want to operate infrastructure. TEN Framework offers more flexibility for teams with complex audio pipelines, polyglot extension requirements, or architectures that outgrow a single-worker model. Start with LiveKit Agents unless you have a specific reason not to; switch to TEN if you hit the ceiling.
HeyGen Hyperframes' 23,135 stars, a lead of 12,369 over LiveKit Agents and more than double either project's count, are worth noting in context. The developer community is voting loudest for HTML-to-video rendering designed for AI agents, a category where neither LiveKit nor TEN competes at all. If your use case is programmatic video generation rather than real-time conversation, see our full analysis of the AI-for-video GitHub ecosystem for coverage of Hyperframes, backgroundremover, and ShortGPT.
The Final Verdict
Our Assessment
This report was compiled from GitHub activity and official documentation; our pipeline returned no community discussion threads. Findings reflect the state of each tool as of May 8, 2026.