Decentralization of Compute Infrastructure: When "Surplus Hardware" Beats "Cloud Rental"
We must re-examine the economic attributes of computing power. The moat of closed-source models is built on the scarcity of H100 clusters, but the Moore's-Law march of hardware is relentlessly filling that moat in.
1.1 From Aristocratic Privilege to Common Tool
Take the V100 as an example—a card that once sold for thousands of dollars now flows into the secondary market as "surplus hardware," allowing any willing tinkerer to build a local inference station at minimal cost.
Consumer Silicon Revolution
Apple's M-series chips (like the M3 Ultra) use a Unified Memory Architecture, letting a Mac Studio run DeepSeek 67B smoothly at Q4 quantization. The upcoming iPhone 17 Pro, with its next NPU iteration, means 14B-class models will not just "run" on a phone; they will "reside" there permanently.
Software as Catalyst
Hardware is just the soil; software is the catalyst. llama.cpp, Exo, MLX, and other open-source projects are essentially doing "compute extraction": they let consumer GPUs, and even pure CPUs, reach surprising token generation rates.
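The quantization trick these projects lean on can be sketched in a few lines. Below is a toy version of the block-wise 4-bit ("Q4") idea: store one float scale per block of weights plus a signed 4-bit integer per weight. The block size and rounding scheme here are simplified assumptions for illustration, not the actual GGUF layout llama.cpp uses.

```python
# Toy block-wise 4-bit quantization: one float scale per block, plus a
# signed 4-bit integer (-8..7) per weight. Simplified assumption, not
# the real GGUF format.

def quantize_q4(weights, block_size=32):
    """Quantize floats to signed 4-bit ints with one scale per block."""
    blocks = []
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        scale = max(abs(x) for x in block) / 7.0 or 1.0  # avoid div-by-zero
        q = [max(-8, min(7, round(x / scale))) for x in block]
        blocks.append((scale, q))
    return blocks

def dequantize_q4(blocks):
    return [q * scale for scale, qs in blocks for q in qs]

weights = [0.5, -1.0, 0.25, 0.75] * 16   # 64 fp32 weights = 256 bytes
restored = dequantize_q4(quantize_q4(weights))
# Q4 storage: 64 nibbles (32 bytes) plus 2 block scales, roughly 7x smaller
print(max(abs(a - b) for a, b in zip(weights, restored)))
```

Trading a small reconstruction error for a several-fold memory reduction is exactly what lets a 67B model fit into consumer unified memory.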
Physics-Level Advantage
When running models locally becomes the norm, cloud vendors lose a trump card. For latency-sensitive and privacy-sensitive tasks, local compute has an absolute, physics-level advantage: no network round trip, and no data ever leaves the device.
Shrinking Safe Harbor
Cloud models are left with only one refuge: "handling ultra-large-scale complex logic." And this refuge is shrinking by the day.
Open Source Swarm Tactics: Asymmetric Warfare & the Victory of Micro-Updates
Closed-source models are "Cathedrals"—grand but rigid. Open-source models are "Bazaars"—chaotic but full of vitality.
2.1 Evolution Speed Differential
Closed-Source Iteration
Iteration cycles are measured in quarters, or even years. Every adjustment incurs massive training costs and lengthy alignment testing (RLHF).
Open-Source Evolution
Distributed collective intelligence. Thousands of labs, tens of thousands of hackers, testing in hundreds of directions simultaneously. Breakthroughs (Sparse Attention, Flash Attention, new quantization) spread within 24 hours.
2.2 Small & Beautiful: Dimensional Strike
General large models try to solve all problems with one brain, leading to severe "Alignment Tax"—sacrificing sharpness in specific domains for safety and generality.
Through LoRA (Low-Rank Adaptation) and other fine-tuning techniques, the open-source ecosystem allows countless "specialist geniuses" to emerge:
Legal Specialist Model
A 7B model fine-tuned on a legal corpus can outperform GPT-4 on contract-review tasks.
Medical Diagnosis Assistant
Domain-specific training creates experts that general models cannot match.
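The LoRA mechanics behind these specialist models can be sketched in miniature: freeze the base weight matrix W (d x d) and train only two low-rank factors, A (r x d) and B (d x r), merging W' = W + (alpha / r) * B @ A. All matrices and hyperparameters below are made up for illustration; real fine-tuning would use a library such as Hugging Face PEFT.

```python
# Toy LoRA merge: only A and B are trained; W stays frozen.
# Numbers are invented for illustration.

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_merge(W, A, B, alpha=2, r=1):
    delta = matmul(B, A)        # d x d low-rank update
    scale = alpha / r
    return [[w + scale * dv for w, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

d, r = 4, 1
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base
A = [[0.1, 0.0, 0.0, 0.0]]          # 1 x 4 trained factor
B = [[0.0], [1.0], [0.0], [0.0]]    # 4 x 1 trained factor
merged = lora_merge(W, A, B, alpha=2, r=1)
print(f"full-update params: {d * d}, LoRA params: {2 * d * r}")  # 16 vs 8
```

The parameter savings grow with dimension: at d = 4096 and r = 8, the trainable parameters drop from roughly 16.8 million per matrix to about 65 thousand, which is why a hobbyist GPU can produce a "specialist genius."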
2.3 The Compound Interest of Micro-Updates
Closed-source models cannot "micro-update" just for you. But local models can.
Your local model can perform incremental learning every day based on your new notes and new code. This "daily progress" personalized compound interest is an advantage that centralized behemoths can never achieve.
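Note that "daily progress" does not have to mean retraining weights every night. A lightweight approximation, sketched below, is to fold each day's new notes into a local retrieval index the model consults at inference time. This toy index ranks by bag-of-words overlap; a real setup would use embeddings (and optionally a periodic LoRA refresh). Class and method names are illustrative.

```python
# Toy incremental personalization: a local index updated daily.
# Bag-of-words overlap stands in for real embedding similarity.

class LocalIndex:
    def __init__(self):
        self.docs = []  # list of (original text, token set)

    def add_daily_notes(self, notes):
        """Incremental update: only today's new notes are processed."""
        for note in notes:
            self.docs.append((note, set(note.lower().split())))

    def retrieve(self, query, k=1):
        """Return the k stored notes sharing the most words with the query."""
        q = set(query.lower().split())
        ranked = sorted(self.docs, key=lambda d: len(q & d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

index = LocalIndex()
index.add_daily_notes(["rust borrow checker notes", "call with alice about routing"])
print(index.retrieve("borrow checker"))  # ['rust borrow checker notes']
```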
"Good Enough" Philosophy & Router Architecture: Ending Compute Waste
We are emerging from the superstition of "parameters above all" and entering a sober period of "utility above all."
3.1 The 70-80 Point Free Lunch
Looking back at internet history, most users don't pursue top-tier bandwidth or the most hardcore servers—they just pursue "good enough."
Most daily needs (summarizing emails, polishing copy, simple chat) only require a 70-80 point intelligence level.
Local small models (SLMs) only have to pay for electricity; their marginal cost is near zero. This is a devastating blow to closed-source models that charge per token: nobody pays for an armored truck to deliver a bottle of water.
3.2 Rise of the Model Router (Intelligent Routing)
Future system architecture is not a Single Model but a Hybrid Hierarchy. Deploy an extremely sensitive Router at the edge:
Local: handles simple instructions, private data, and real-time interaction. Cost: $0.
Cloud Medium Model: handles tasks requiring network access or stronger logic. Cost: low.
Cloud Ultra-Large / Agent Cluster: handles extremely difficult breakthrough tasks. Cost: high.
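A minimal sketch of such a router is below. The tier names, keyword lists, and routing rules are illustrative assumptions; a production router would likely use a small classifier model rather than keyword matching.

```python
# Toy hybrid-hierarchy router: privacy pins a task to local compute,
# then task difficulty escalates it upward. Rules are invented for
# illustration only.

def route(task: str, contains_private_data: bool) -> str:
    hard_markers = ("prove", "research plan", "refactor the codebase")
    network_markers = ("search", "latest", "news", "browse")

    if contains_private_data:
        return "local"          # privacy first: data never leaves the device
    text = task.lower()
    if any(m in text for m in hard_markers):
        return "cloud-ultra"    # cost: high, rare breakthrough tasks
    if any(m in text for m in network_markers):
        return "cloud-medium"   # cost: low, needs network or stronger logic
    return "local"              # cost: $0, simple daily tasks

print(route("summarize this email", contains_private_data=False))     # local
print(route("search the latest AI news", contains_private_data=True)) # local
```

The key design choice is that the privacy check runs first: a task touching personal data never escalates to the cloud, no matter how hard it is.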
Under this architecture, closed-source large models will degrade from "daily necessities" to "occasionally-called luxury items."
Rich Context: Why AI Doesn't Understand You
This is the most fatal Achilles' heel of closed-source models. No matter how smart GPT-5 is, it still knows nothing about you.
4.1 The Essence of Alignment is Context
Why does AI-written content feel mediocre? Because it has no Context Alignment with you.
Without Context, AI can only give a "normal distribution average" based on probability. This is why all AI-written articles sound correct but boring, like textbooks.
4.2 Averageistan vs. Extremistan
Averageistan
Closed-source models live in "Averageistan"—they pursue universal correctness. Safe, general, and utterly forgettable.
Extremistan
Human brilliance often comes from "Extremistan"—those biased, positioned, uniquely-assumed Edge Takes. Only local AI with all your data can understand and generate content with "soul."
4.3 The Value Formula
$$\text{Value} = \text{Intelligence} \times \text{Context}$$
Rich Context + Agentic RAG + Small Model > Vague Context + Large Model
This is an irreversible value formula.
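The inequality can be made concrete with a toy instantiation of the formula. The scores below, on a 0..1 scale, are made up purely for illustration; the point is structural, not numerical.

```python
# Toy instantiation of Value = Intelligence x Context, with invented
# scores, showing why "rich context + small model" can beat
# "vague context + large model".

def value(intelligence: float, context: float) -> float:
    return intelligence * context

small_local = value(intelligence=0.70, context=0.90)  # 7B model that knows you
large_cloud = value(intelligence=0.95, context=0.30)  # frontier model that doesn't
assert small_local > large_cloud  # 0.63 > 0.285
```

Because value is multiplicative, a near-zero context term caps the product no matter how large the intelligence term grows, which is exactly the closed-source predicament the section describes.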
The End of Closed-Source History & the Organic Appless Future: From Scenario to Flow
The history of closed-source models is a history of "trying to exhaust the world" and failing. They thought scraping all internet data gave them God's perspective, not realizing the truly valuable data—Personal Context—has never appeared on the public web.
5.1 From Scenario to Flow
Current App development logic is a relic of the industrial age: PMs imagine 50 scenarios, programmers write 50 scripts. This is a "static mapping" of the world. But the real world is dynamic, chaotic.
The future Agentic Web has no Apps—only Generative Experience Flow. Intelligence flows, adapts, and responds to your needs in real-time.
5.2 The Curse of Dimensionality
Trying to exhaust scenarios is mathematically a dead end. A user's Context Cell spans a state space where N approaches infinity; any preset Scripts are like "carving a mark on the boat to find a dropped sword" in this high-dimensional space.
5.3 DNA vs. Zombie
The Zombie
If we limit AI to preset scenarios, it's a "functionally complex zombie"—it crashes when encountering situation #5001.
The DNA
True Agents should be like DNA: they don't encode "Results"—they encode "Rules & Protocols." They don't know what you'll do today, but based on your Context and current resources, they know how to generate solutions in real-time.
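The "rules, not results" idea can be sketched as follows: instead of N preset scripts, the agent holds a handful of protocols and composes a plan at runtime from the current context. The rules below are invented purely for illustration.

```python
# Toy rules-not-results agent: a plan is generated from protocols plus
# the live context, never looked up from a preset script table.

RULES = [
    # (condition on context, step to emit)
    (lambda ctx: ctx.get("private"), "process on local model"),
    (lambda ctx: ctx.get("needs_data"), "retrieve from personal index"),
    (lambda ctx: True, "draft answer and ask for confirmation"),
]

def plan(ctx: dict) -> list[str]:
    """Generate a plan for a never-before-seen situation from rules."""
    return [step for cond, step in RULES if cond(ctx)]

print(plan({"private": True}))
# ['process on local model', 'draft answer and ask for confirmation']
```

A script table crashes on situation #5001; a rule set simply evaluates its conditions against whatever context arrives, which is the DNA-like property the analogy points at.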
5.4 The Transfer of Trust
In this world, the core moat is no longer "whose model has more parameters"—it's Trust.
This is the Main Dish. Closed-source large models completed "general education." Now, the local-first, personal-sovereignty, organically-grown Appless Future is just beginning.