What Actually Shipped Open-Source This Week
Eight GitHub repos took off this week: three inference engines, one Linux kernel local privilege escalation (LPE), and a pattern about where the AI stack is heading.
Three of the eight projects on this list are inference engines or wrappers around them. That's the week telling you something. The frontier-model arms race has cooled enough that the most-starred week-old repos aren't new models — they're the layer right underneath the models: inference runtimes, agent harnesses, virtual filesystems, and one Linux kernel exploit reminding everyone the foundation can still crack open. Here's what shipped, what's actually novel, and the pattern we saw underneath all of it.
ds4 — antirez ships a DeepSeek V4 Flash engine in C and Metal
antirez/ds4 ⭐ 1,916 stars · C
Salvatore "antirez" Sanfilippo (yes, the Redis guy) wrote a stand-alone Metal inference engine for exactly one model: DeepSeek V4 Flash. The bet is narrow on purpose — no GGUF, no framework, no abstraction layer. Just a model-specific graph executor with on-disk KV cache persistence and a 2-bit quantization scheme that gets the 284B-parameter model running on a 128GB MacBook. The novelty isn't the technique; it's the discipline. Antirez calls it a "deliberately narrow bet" against the local-inference urge to chase every new model. This week, the people starring it agree.
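The 128GB claim is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes a uniform bit width across all weights and ignores quantization scales, the KV cache, and activations. ds4's actual scheme may mix precisions, so treat the numbers as a rough lower bound and not as a description of the project's memory layout.

```python
def weight_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Raw weight storage in decimal gigabytes (weights only, uniform bit width)."""
    return params * bits_per_weight / 8 / 1e9


PARAMS = 284e9  # parameter count quoted in the paragraph above

for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit: {weight_footprint_gb(PARAMS, bits):6.1f} GB")
```

Under these assumptions only the 2-bit row (about 71 GB) fits under 128 GB with room left for the KV cache and the OS. Even 4-bit weights alone would come to about 142 GB.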
deepclaude — swap Claude Code's brain for DeepSeek V4 Pro
aattaran/deepclaude ⭐ 1,625 stars · JavaScript
Claude Code is the best autonomous coding agent on the market. At the top subscription tier it also costs $200/month, with usage caps. deepclaude is a small shell wrapper that points Claude Code's API calls at DeepSeek V4 Pro instead of Anthropic: same tool loop, same file-editing UX, ~17x cheaper at the meter. The hack works because Claude Code reads ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN as plain environment variables, and DeepSeek V4 Pro hits 96.4% on LiveCodeBench at $0.87 per million output tokens. Whether you read this as fair-game backend swapping or brutal substitution arbitrage, it's getting starred fast.
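The environment-variable redirect can be sketched in a few lines. This is not deepclaude's code: the endpoint URL is a placeholder, `build_env` and `launch` are hypothetical helper names, and the only details taken from the text above are the two variable names.

```python
import os
import subprocess
from typing import Dict, List, Optional

# Placeholder endpoint (assumption): substitute the Anthropic-compatible URL
# your provider documents.
BACKEND_URL = "https://example.invalid/anthropic"


def build_env(base_url: str, token: str,
              parent: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the parent environment and override the two variables named above."""
    env = dict(os.environ if parent is None else parent)
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = token
    return env


def launch(argv: List[str], env: Dict[str, str]) -> int:
    """Run the agent CLI under the overridden environment; return its exit code."""
    return subprocess.run(argv, env=env).returncode


# Usage (not executed here; assumes the `claude` CLI is on PATH):
#   launch(["claude"], build_env(BACKEND_URL, os.environ["BACKEND_API_KEY"]))
```

Because the override lives in a copied environment, the parent shell is left untouched and switching back is just a matter of launching without the wrapper.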
mirage — one filesystem for every agent backend
strukto-ai/mirage ⭐ 1,308 stars · TypeScript
Mirage mounts S3, Gmail, Slack, GitHub, and Redis side-by-side as a single virtual filesystem so an LLM agent can cat, grep, and pipe across them with the bash vocabulary it already knows. Instead of one tool wrapper per integration — the current N×M nightmare — agents see a unified Unix-like tree. grep alert /slack/general/*.json | wc -l goes through Slack. cat /github/mirage/README.md goes through GitHub. The premise is that you don't need to teach LLMs new APIs; you need to make every API look like one they already understand.
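The core idea, a mount table that routes a path to whichever backend owns its prefix, can be illustrated with a toy router. Everything here is hypothetical: `MountTable`, its methods, and the stub readers are illustrative names, not Mirage's API, and each backend is reduced to a single read callable.

```python
from typing import Callable, Dict, Tuple

# Each backend is reduced to one "read this relative path" callable.
Reader = Callable[[str], str]


class MountTable:
    def __init__(self) -> None:
        self._mounts: Dict[str, Reader] = {}

    def mount(self, prefix: str, reader: Reader) -> None:
        # Normalize to a single leading slash and no trailing slash.
        self._mounts["/" + prefix.strip("/")] = reader

    def resolve(self, path: str) -> Tuple[Reader, str]:
        # Longest-prefix match so nested mounts win over their parents.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self._mounts[prefix], path[len(prefix):].lstrip("/")
        raise FileNotFoundError(path)

    def cat(self, path: str) -> str:
        reader, rest = self.resolve(path)
        return reader(rest)


fs = MountTable()
fs.mount("/slack", lambda rest: f"slack:{rest}")    # stub standing in for a Slack client
fs.mount("/github", lambda rest: f"github:{rest}")  # stub standing in for a GitHub client
print(fs.cat("/github/mirage/README.md"))  # github:mirage/README.md
```

The agent only ever sees paths. Which API gets called is decided entirely by the prefix, which is what collapses N×M tool wrappers into N mounts.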
tokenspeed — speed-of-light inference for agentic workloads
lightseekorg/tokenspeed ⭐ 742 stars · Python
Pitched as TensorRT-LLM performance with vLLM usability, TokenSpeed is built specifically around agent traffic patterns — short-context tool calls, branching, KV churn — instead of the long-context chat-completion shape most engines optimize for. The notable engineering choice is encoding the request lifecycle and KV-cache ownership as a finite-state machine, with safe resource reuse enforced by the type system at compile time. Currently a preview release reproducing their Kimi K2.5 / Blackwell numbers. Not production-ready yet. Worth watching anyway.
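To make the finite-state-machine idea concrete, here is a minimal sketch of a request lifecycle that owns a KV block only between two states. The state names, `KVPool`, and `Request` are assumptions for illustration, not TokenSpeed's design. This version also checks transitions at runtime, whereas the project is described as enforcing them through the type system at compile time.

```python
from enum import Enum, auto


class State(Enum):
    QUEUED = auto()
    PREFILL = auto()
    DECODE = auto()
    FINISHED = auto()


# Legal transitions (hypothetical lifecycle); anything else is rejected.
ALLOWED = {
    State.QUEUED: {State.PREFILL},
    State.PREFILL: {State.DECODE, State.FINISHED},
    State.DECODE: {State.FINISHED},
    State.FINISHED: set(),
}


class KVPool:
    def __init__(self, n_blocks: int) -> None:
        self.free = list(range(n_blocks))

    def acquire(self) -> int:
        if not self.free:
            raise RuntimeError("KV pool exhausted")
        return self.free.pop()

    def release(self, block: int) -> None:
        self.free.append(block)


class Request:
    def __init__(self, pool: KVPool) -> None:
        self.pool = pool
        self.state = State.QUEUED
        self.block = None  # KV block owned by this request, if any

    def advance(self, new_state: State) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {new_state.name}")
        if new_state is State.PREFILL:
            self.block = self.pool.acquire()  # ownership begins
        elif new_state is State.FINISHED and self.block is not None:
            self.pool.release(self.block)     # ownership ends exactly once
            self.block = None
        self.state = new_state
```

Tying acquisition and release to specific transitions means a block cannot be returned to the pool twice or reused while a request is still decoding, which is the property a compile-time encoding would guarantee statically.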
dirtyfrag — a Linux LPE that doesn't need a race window
V4bel/dirtyfrag ⭐ 2,450 stars · C
Hyunwoo Kim's Dirty Frag chains two page-cache write vulnerabilities (xfrm-ESP and RxRPC) into a deterministic Linux root LPE: no race condition, no kernel panic on failure, near-100% success rate. One of the bugs affects nine years' worth of kernel releases. The xfrm-ESP half got CVE-2026-43284 and is patched in mainline; the RxRPC half is reserved as CVE-2026-43500 with no patch in any tree yet. The PoC is one git clone and a gcc invocation. Patch your boxes.
petdex — a public gallery for Codex-compatible animated pets
crafter-station/petdex ⭐ 1,093 stars · TypeScript
Codex pets are animated companions that ride along in the terminal while Codex works — purely cosmetic, purely delightful. Petdex is the public gallery: browse, preview every animation state, validate community submissions in-browser, download a single pet pack or the full archive. It looks like a joke. It's actually a useful demonstration of how third-party content ecosystems form around tool platforms — the same pattern VS Code extensions, Slack apps, and Figma plugins followed at exactly this stage. File under "things that look unserious until they aren't."
how-to-train-your-gpt — every line of an LLM, commented like you're five
raiyanyahya/how-to-train-your-gpt ⭐ 738 stars · Jupyter Notebook
A 12-chapter, 3,900-line interactive textbook that walks you through writing a LLaMA-3-style model from absolute scratch — tokenizer, embeddings, RoPE, attention, training loop, inference engine. Every line annotated with what it does and why. The pitch isn't "build your own GPT in a weekend"; it's "stop using attention as a black box." The variance argument behind 1/√d_k. Why pre-norm beats post-norm for deep networks. Where every gradient flows during backprop. The most honest thing in the AI-explainer space at exactly the moment everything else is getting louder.
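For readers who want the variance argument in one place, the standard derivation runs as follows. It assumes the components of the query and key vectors are mutually independent with zero mean and unit variance. That is the usual textbook assumption, not a quote from the notebook.

```latex
% Assumption: all q_i, k_i are mutually independent, with zero mean and unit variance.
\begin{aligned}
q \cdot k &= \sum_{i=1}^{d_k} q_i k_i, \\
\mathbb{E}[q_i k_i] &= \mathbb{E}[q_i]\,\mathbb{E}[k_i] = 0, \\
\operatorname{Var}(q_i k_i) &= \mathbb{E}[q_i^2]\,\mathbb{E}[k_i^2] - 0 = 1, \\
\operatorname{Var}(q \cdot k) &= \sum_{i=1}^{d_k} \operatorname{Var}(q_i k_i) = d_k, \\
\operatorname{Var}\!\left(\frac{q \cdot k}{\sqrt{d_k}}\right) &= \frac{d_k}{d_k} = 1.
\end{aligned}
```

Without the scaling, the logits' standard deviation grows like √d_k, which pushes the softmax toward saturation and shrinks its gradients. Dividing by √d_k keeps the logits at unit scale regardless of head dimension.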
robotics-skills-suite — 76 Claude skills for industrial robotics compliance
jherrodthomas/robotics-skills-suite ⭐ 508 stars
Thirty-eight builder skills paired with thirty-eight matching reviewer skills, covering ISO 10218, ISO 13849, IEC 62061, ISO 12100, ISO/TS 15066 cobot biomechanics, ISO 3691-4 AMR risk, ROS2 architecture, ISO 9283 performance verification, and IEC 62443 industrial cybersecurity — every standard a robot integrator needs to ship a compliant cell. Each builder produces an audit-ready xlsx workbook; each reviewer cross-checks it with a confirmation dashboard. Skills hand off via stable xlsx contracts, so changing a number in the upstream Risk Assessment cascades through PLr → Compliance Matrix → Declaration of Conformity automatically. This is what Claude skills look like when someone takes them seriously as compliance plumbing.
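The cascade can be pictured as a chain of pure functions where each stage is recomputed from its upstream output. The sketch below is a hypothetical illustration, not the suite's implementation. The function names and status strings are invented, plain dicts stand in for the xlsx workbooks, and the lookup table follows the informative risk graph in ISO 13849-1 Annex A as commonly reproduced, so verify it against the standard before relying on it.

```python
# Required performance level (PLr) from severity S, frequency/exposure F and
# possibility of avoidance P, in the style of the ISO 13849-1 Annex A risk graph.
PLR_GRAPH = {
    ("S1", "F1", "P1"): "a", ("S1", "F1", "P2"): "b",
    ("S1", "F2", "P1"): "b", ("S1", "F2", "P2"): "c",
    ("S2", "F1", "P1"): "c", ("S2", "F1", "P2"): "d",
    ("S2", "F2", "P1"): "d", ("S2", "F2", "P2"): "e",
}


def required_pl(risk: dict) -> str:
    """Stage 1: Risk Assessment row -> PLr."""
    return PLR_GRAPH[(risk["S"], risk["F"], risk["P"])]


def compliance_matrix(plr: str, achieved_pl: str) -> dict:
    """Stage 2: compare achieved PL with PLr (letters a..e order lexically)."""
    return {"PLr": plr, "PL": achieved_pl, "ok": achieved_pl >= plr}


def declaration(matrix: dict) -> str:
    """Stage 3: hypothetical status line for the Declaration of Conformity."""
    return "conformant" if matrix["ok"] else "blocked: PL below PLr"


def run_pipeline(risk: dict, achieved_pl: str):
    plr = required_pl(risk)
    matrix = compliance_matrix(plr, achieved_pl)
    return plr, matrix, declaration(matrix)


print(run_pipeline({"S": "S1", "F": "F2", "P": "P2"}, "c"))
print(run_pipeline({"S": "S2", "F": "F2", "P": "P2"}, "c"))
```

Changing a single upstream value (S1 to S2) raises PLr from c to e, and the downstream stages flip from conformant to blocked without anyone touching them. That is the behavior the stable contracts between skills are meant to guarantee.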
The Pattern
Five of the eight are coordination layers, not models: three inference runtimes (ds4, tokenspeed, and deepclaude as a backend wrapper) and two pieces of agent infrastructure (mirage, robotics-skills-suite). Of the remaining three, one is a content-ecosystem wedge (petdex), one is education (how-to-train-your-gpt), and one is a kernel exploit reminding the whole stack what it sits on. The frontier-model arms race has cooled enough that this week's most-starred new repos are about what to do with the models we already have. Toolchain. Plumbing. Harness. Pick the metaphor: the space below the model is finally getting interesting.
The Bottom Line
If you're picking what to read this weekend: ds4 for the engineering discipline, mirage for the conceptual shift, robotics-skills-suite for the pattern of where Claude skills are actually headed. And patch your kernels against dirtyfrag before Monday.