Trend Dynamics
Velocity: +58 (updated daily)
Maturity: 24
Signals: 521 (all-time observed)
Definition
Edge inference is the execution of LLM and multimodal model inference entirely on a user-owned device, without round-trips to a hyperscaler datacenter.
Why It Matters
Edge inference changes the privacy, latency, and unit-economics story for any AI feature that today requires a server — and reopens hardware cycles for Apple, Qualcomm, and AMD.
Signals Feeding This Trend
- On-device model size/quality benchmarks (36%, benchmark)
- NPU TOPS announcements (28%, hardware)
- OS-level AI runtime releases (Apple Intelligence, Copilot+) (36%, release)
Companies Involved
- Apple
- Qualcomm
- AMD
- Microsoft
- Mistral
- Meta
Timeline
- 2024-06
Apple Intelligence announced — on-device 3B model.
- 2024-Q3
Copilot+ PCs launch with 40+ TOPS NPU baseline.
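The timeline's on-device 3B model makes sense once you work out the weight footprint under quantization. A minimal sketch, assuming a dense transformer and counting weight memory only (the function name and the bit-width choices are illustrative, not from the source):

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory for a dense model, ignoring
    KV cache, activations, and runtime overhead."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# A ~3B-parameter model quantized to 4 bits per weight:
print(weight_memory_gb(3, 4))   # 1.5 (GB)
# The same model at fp16 needs four times the memory:
print(weight_memory_gb(3, 16))  # 6.0 (GB)
```

At ~1.5 GB of weights, a 4-bit 3B model fits comfortably in a modern phone's RAM alongside the OS, which fp16 would not.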
Predictions
- 12 months (high confidence)
  A 7B-class model runs at >30 tokens/sec on a mainstream consumer phone.
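Autoregressive decode on-device is typically memory-bandwidth bound: each generated token streams the full weight set from memory once. A back-of-envelope roofline makes the prediction concrete; the bandwidth figures below are illustrative assumptions, not vendor specs:

```python
def decode_tokens_per_sec(params_billions: float, bits_per_weight: float,
                          mem_bandwidth_gbps: float) -> float:
    """Roofline estimate for single-stream decode: throughput is roughly
    memory bandwidth divided by weight bytes per token. Ignores KV-cache
    traffic, compute limits, and batching."""
    weight_gb = params_billions * bits_per_weight / 8  # GB read per token
    return mem_bandwidth_gbps / weight_gb

# A 7B model at 4-bit (~3.5 GB) on an assumed ~60 GB/s phone memory bus:
print(round(decode_tokens_per_sec(7, 4, 60), 1))   # ~17 tok/s
# Clearing 30 tok/s needs roughly double the bandwidth (or fewer bits):
print(round(decode_tokens_per_sec(7, 4, 120), 1))  # ~34 tok/s
```

On this estimate, the prediction hinges on phone memory bandwidth and quantization quality improving together, which is why the NPU and benchmark signals above feed the same trend.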
Related Trends
Track every signal feeding Edge Inference
Steek surfaces individual signals the moment they enter the index.