Competitive Intelligence Report • Custom Silicon

Cerebras Systems: Wafer-Scale Inference Strategy

How a non-GPU chipmaker delivering 40M+ tok/sec is rewriting inference economics, and what the platform must do about it

February 16, 2026 • Analyst: MinjAI Agents • For: AI Infrastructure Strategy & Product Leaders
Threat Level: HIGH • 25 Footnoted Sources

Executive Summary

Cerebras Systems is a custom silicon company that has built the world's largest chip, the Wafer-Scale Engine (WSE), and is now leveraging it to dominate AI inference speed benchmarks.[1] Founded in 2016 by Andrew Feldman and the team behind SeaMicro (sold to AMD for $334M),[2] Cerebras has evolved from a training-focused hardware vendor into an inference-as-a-service platform powering OpenAI, Meta, and the U.S. Department of Energy.[3][4]

The company's WSE-3 chip contains 4 trillion transistors and 900,000 AI-optimized cores on a single 46,225 mm² wafer, delivering 125 petaflops of peak AI performance.[1] This architecture eliminates GPU interconnect overhead, enabling inference speeds up to 20x faster than NVIDIA-based clouds at 32% lower cost.[5]

  • Valuation (Feb 2026): $23B[6]
  • OpenAI Inference Deal: $10B+[3]
  • Est. 2024 Revenue: ~$500M[7]
  • Transistors (WSE-3): 4T[1]
  • AI Cores per Chip: 900K[1]
  • Tok/s (Llama 4 Scout): 2,600[4]
  • Target IPO (CBRS): Q2 2026[8]
  • OpenAI Compute Capacity: 750 MW[3]

Strategic Implications

Cerebras represents both a threat and an opportunity for the inference platform. Its non-GPU architecture delivers inference at speeds and costs that fundamentally challenge GPU-based assumptions. The $10B OpenAI deal[3] and the Meta Llama API partnership[4] validate market demand for ultra-fast, cost-efficient inference infrastructure. The platform should evaluate Cerebras as a potential compute partner or technology licensor alongside its existing NVIDIA/alternative-silicon strategy. If the platform does not partner, it must match the 20x speed advantage or accept a permanent disadvantage in latency-sensitive use cases.

Five Action Items

  1. Evaluate Cerebras as a compute partner. Its CS-3 systems are available for cloud deployment, and the platform's multi-chip strategy already includes alternative silicon. Adding Cerebras could give the platform the fastest inference tier on the market.
  2. Benchmark WSE-3 vs. the current stack. Cerebras claims 32% lower cost per token and 21x faster inference than NVIDIA Blackwell.[5] Validate these claims against the platform's H100/H200 infrastructure.
  3. Study the OpenAI deal structure. The $10B, 750 MW contract through 2028[3] is a model for large-scale inference partnerships. The platform should pursue similar design-partner agreements.
  4. Price competitively against Cerebras Inference. At $0.60/M tokens for Llama 3.3 70B,[9] Cerebras sets a market floor. The platform's pricing must be within striking distance.
  5. Monitor the IPO. Cerebras is targeting a Q2 2026 listing (CBRS on Nasdaq).[8] Post-IPO, Cerebras will have significant capital for aggressive market expansion.

Company Overview and History

Founding and Leadership

Cerebras was founded in 2016 by five co-founders who previously built SeaMicro, a pioneer of energy-efficient microservers acquired by AMD in 2012 for $334M.[2] CEO Andrew Feldman holds degrees in Economics/Political Science and an MBA from Stanford. His track record of building and selling hardware companies gives Cerebras unusual credibility in the custom silicon space.

Name | Title | Background
Andrew Feldman | Co-Founder & CEO[2] | Stanford (Econ/MBA). Co-founded SeaMicro (sold to AMD for $334M). Serial hardware entrepreneur.
Gary Lauterbach | Co-Founder & SVP Engineering[2] | SeaMicro co-founder. Former VP at Sun Microsystems. Chip architecture veteran.
Sean Lie | Co-Founder & Chief Hardware Architect[2] | SeaMicro. Lead architect of the Wafer-Scale Engine.
Michael James | Co-Founder[2] | SeaMicro founding team.
Jean-Philippe Fricker | Co-Founder[2] | SeaMicro founding team.

Company Timeline

2016
Founded in Los Altos, CA by Andrew Feldman, Gary Lauterbach, Sean Lie, Michael James, and Jean-Philippe Fricker.[2] Vision: build the world's largest chip for AI.
Aug 2019
Unveiled the WSE-1, the world's first wafer-scale processor: 1.2 trillion transistors, 400,000 cores, 46,225 mm².[2]
Apr 2021
Launched WSE-2 with 2.6 trillion transistors, 850,000 cores. Shipped inside the CS-2 system.[2]
Jun 2023
Condor Galaxy 1 (CG-1) deployed with G42: 4 exaFLOPS AI supercomputer using 64 CS-2 systems.[10]
Mar 2024
Launched WSE-3 (4T transistors, 900K cores, 125 PFLOPS) and CS-3 system. Named TIME Best Invention of 2024.[1]
Sep 2024
Launched Cerebras Inference cloud. Llama 3.1 70B at 2,100 tok/s. Filed S-1 with SEC (ticker: CBRS).[9][7]
Mar 2025
Announced 6 new data centers across US and Europe. 20x capacity expansion to 40M+ tok/sec.[11]
Apr 2025
Meta partners with Cerebras to power Llama API at LlamaCon. 18x faster than GPU solutions.[4]
Oct 2025
Withdrew S-1. Raised $1.1B Series G at $8.1B valuation from Fidelity, Atreides, Tiger Global.[7]
Jan 2026
$10B OpenAI inference deal for 750 MW compute through 2028.[3] Raised $1B at $23B valuation.[6]
Q2 2026
Targeted IPO on Nasdaq (CBRS). Estimated valuation $22-25B.[8]

Funding History and Financial Profile

Capital Raised

Round | Date | Amount | Valuation | Lead Investors
Seed/Series A | 2016-2018 | ~$112M | -- | Benchmark Capital, Eclipse Ventures[12]
Series B | Nov 2018 | $80M | -- | Benchmark Capital[12]
Series C | Nov 2019 | $60M | -- | Benchmark Capital[12]
Series D | Nov 2020 | $250M | -- | Altimeter Capital, Coatue[12]
Series E | Nov 2021 | $250M | $4.0B | Alpha Wave Ventures, Abu Dhabi Growth Fund[12]
Series F | 2023 | ~$250M | -- | G42, Alpha Wave[12]
Series G | Oct 2025 | $1.1B | $8.1B | Fidelity, Atreides, Tiger Global[7]
Series H | Feb 2026 | $1.0B | $23B | Tiger Global, Benchmark, AMD[6]
Total | | ~$2.55B+ | |

Revenue Performance[7]

Period | Revenue | YoY Growth | Net Loss | Notes
FY 2022 | $24.6M | -- | ($177.7M) | Early commercial stage
FY 2023 | $78.7M | +220% | ($127.2M) | G42 = 83% of revenue
H1 2024 | $136.4M | +935% (vs H1'23) | ($66.6M) | G42 = 87% of revenue
FY 2024 (Est.) | ~$500M | +535% | -- | Rapid diversification begun
FY 2025 (Est.) | >$1B | +100% | -- | OpenAI, Meta, DOE contracts
Customer Concentration Risk

Cerebras historically relied on G42 (UAE) for 83-87% of revenue.[7] This triggered a CFIUS national security review that delayed the IPO through much of 2025. By early 2026, Cerebras restructured its investor base, moving G42 out of its primary stakeholder list to satisfy U.S. regulators.[8] The OpenAI and Meta deals have materially reduced this concentration risk, but it remains a factor to watch.

Key Investors and Board


Product Architecture: WSE-3 Deep Dive

The Wafer-Scale Engine is Cerebras's core innovation: an entire silicon wafer used as a single chip, rather than being cut into hundreds of individual dies.[1] This architectural approach eliminates the multi-GPU interconnect bottleneck that limits inference speed in conventional systems.

WSE Generational Comparison

Specification | WSE-1 (2019) | WSE-2 (2021) | WSE-3 (2024)
Process Node | 16nm (TSMC) | 7nm (TSMC) | 5nm (TSMC)[1]
Transistors | 1.2 trillion | 2.6 trillion | 4 trillion[1]
AI Cores | 400,000 | 850,000 | 900,000[1]
On-Chip SRAM | 18 GB | 40 GB | 44 GB[1]
Memory Bandwidth | 9.6 PB/s | 20 PB/s | 21 PB/s[1]
Peak AI Performance | -- | -- | 125 PFLOPS[1]
Die Area | 46,225 mm² | 46,225 mm² | 46,225 mm²[1]
System | CS-1 | CS-2 | CS-3
Why Wafer-Scale Matters for Inference
  • No interconnect overhead. A single WSE-3 replaces thousands of GPUs. All data stays on-chip, eliminating PCIe/NVLink bottlenecks that slow GPU clusters.
  • 21 PB/s memory bandwidth. This is orders of magnitude higher than GPU HBM bandwidth (~3.35 TB/s for H100), making memory-bound inference dramatically faster.[1]
  • 44 GB on-chip SRAM. Entire model weights for smaller LLMs fit on-chip. For larger models, MemoryX provides external memory with minimal latency.[13]
  • Simplified programming model. Cerebras claims 97% reduction in programming complexity vs. multi-GPU clusters.[5]
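
To make the memory-bandwidth bullet above concrete, here is a rough back-of-the-envelope sketch (not a benchmark) of the batch-1 decode ceiling implied by weight streaming alone, using the bandwidth figures cited in this report. The model size assumes 16-bit weights; real systems batch requests, quantize, and overlap compute with data movement, so treat the outputs as illustrative bounds only.

```python
# Rough bound: if every generated token requires streaming all model weights once,
# memory bandwidth caps batch-1 decode speed. Illustrative only; real deployments
# batch requests, quantize weights, and overlap compute with data movement.

def decode_ceiling_tok_per_s(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on tokens/sec when decoding is purely weight-bandwidth-bound."""
    return bandwidth_bytes_per_s / weight_bytes

LLAMA_70B_FP16 = 70e9 * 2       # ~140 GB of weights at 16-bit precision (assumption)
H100_HBM_BW    = 3.35e12        # ~3.35 TB/s HBM bandwidth, single H100 (from the bullet above)
WSE3_SRAM_BW   = 21e15          # ~21 PB/s on-chip SRAM bandwidth (from the spec table)

for name, bw in [("Single H100 (HBM)", H100_HBM_BW), ("WSE-3 (on-chip SRAM)", WSE3_SRAM_BW)]:
    print(f"{name:22s} ~{decode_ceiling_tok_per_s(LLAMA_70B_FP16, bw):,.0f} tok/s ceiling")
```

The point is not the absolute numbers (Cerebras's measured 2,100 tok/s sits far below the theoretical SRAM ceiling) but that single-device memory bandwidth, not FLOPS, is the binding constraint for low-batch decoding.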

CS-3 System Architecture

Cerebras Software Stack
  • Cerebras Inference API (OpenAI-compatible)
  • Cerebras SDK (PyTorch, TensorFlow)
  • Model Compiler & Optimizer
  • Runtime Scheduler
SwarmX -- Cluster Interconnect[13]
  • Weight Broadcast (MemoryX to WSEs)
  • Gradient Reduction (training)
  • Scale to 2,048 CS-3 systems
  • Quarter-zettaflop clusters
MemoryX -- External Memory System[13]
  • 24 TB / 36 TB (Enterprise SKUs)
  • 120 TB / 1,200 TB (Hyperscaler SKUs)
  • Flash + DRAM + custom pipelining
  • Up to 24T parameter models
WSE-3 -- Wafer-Scale Engine[1]
  • 4T transistors / 900K cores
  • 44 GB on-chip SRAM
  • 21 PB/s memory bandwidth
  • 125 PFLOPS peak AI
  • 15U rack / ~23 kW / water-cooled
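
Because the Inference API layer above is advertised as OpenAI-compatible, integration testing can reuse the standard `openai` client by pointing it at Cerebras's endpoint. This is a minimal sketch; the base URL and model identifier shown are assumptions for illustration and should be confirmed against Cerebras's documentation.

```python
# Minimal sketch of calling an OpenAI-compatible endpoint with the standard client.
# The base_url and model id are assumed values for illustration, not confirmed here.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",   # assumed endpoint
    api_key="YOUR_CEREBRAS_API_KEY",         # placeholder
)

resp = client.chat.completions.create(
    model="llama-3.3-70b",                   # assumed model identifier
    messages=[{"role": "user", "content": "In one sentence, what is wafer-scale inference?"}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```

If the endpoint is truly drop-in compatible, the same client code used for the platform's GPU-backed serving can be pointed at a Cerebras tier with only configuration changes, which lowers the cost of the partnership evaluation described in the action items.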
Opportunity: MemoryX Integration

The MemoryX external memory system supports models up to 24 trillion parameters.[13] If the platform partnered with Cerebras, it could offer customers access to the largest open-source models (Llama 4, DeepSeek, Qwen3-235B) at speeds no GPU-based competitor can match, without building custom silicon in-house.


Technical Specifications and Benchmarks

Inference Speed Benchmarks[9][4][5]

Model | Cerebras (tok/s) | GPU Cloud (tok/s) | Speedup | Source
Llama 3.1 8B | 1,800 | ~90 | 20x | Cerebras[9]
Llama 3.1 70B | 2,100 | ~105 | 20x | Cerebras[9]
Llama 3.1 405B | 969 | ~50 | ~19x | Cerebras[14]
Llama 4 Scout | 2,600 | ~137 | 19x | Artificial Analysis[4]
TTFT (405B) | 240 ms | ~4,000 ms | ~17x | Cerebras[14]
Benchmark Caveat

Most speed benchmarks are self-reported by Cerebras or verified by Artificial Analysis (which is not fully independent). Third-party benchmarks from SemiAnalysis confirm the cost advantage (32% lower than Blackwell B200)[5] but independent latency verification at production scale is limited. The platform should request direct benchmark access before making strategic decisions.
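
Pending direct benchmark access, a first-pass validation can be run today against the public API. The sketch below measures time-to-first-token (TTFT) and streamed throughput for any OpenAI-compatible endpoint; the endpoints, keys, and model ids are placeholders, and counting content chunks only approximates tokenizer-exact token counts.

```python
# First-pass latency probe for any OpenAI-compatible streaming endpoint.
# Measures time-to-first-token (TTFT) and approximate output tokens/sec.
import time
from openai import OpenAI

def probe(base_url: str, api_key: str, model: str, prompt: str, max_tokens: int = 256) -> dict:
    client = OpenAI(base_url=base_url, api_key=api_key)
    start = time.perf_counter()
    ttft, chunks = None, 0
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            if ttft is None:
                ttft = time.perf_counter() - start
            chunks += 1                       # ~1 token per content chunk (approximation)
    elapsed = time.perf_counter() - start
    return {"ttft_s": ttft, "approx_tok_per_s": chunks / elapsed if elapsed > 0 else 0.0}

# Placeholder targets: one Cerebras-style endpoint, one internal GPU baseline.
targets = [
    ("cerebras",  "https://api.cerebras.ai/v1",            "CEREBRAS_KEY", "llama-3.3-70b"),
    ("gpu-stack", "https://inference.example.internal/v1", "PLATFORM_KEY", "llama-3.3-70b"),
]
for name, url, key, model in targets:
    print(name, probe(url, key, model, "Briefly explain KV caching."))
```

Median and p99 figures over repeated runs and realistic prompt lengths matter more than a single probe; the point is that the comparison requires no special tooling.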

Cerebras CS-3 vs. NVIDIA DGX B200 (Blackwell)[5]

Specification | Cerebras CS-3 | NVIDIA DGX B200
Architecture | Wafer-Scale (single chip) | Multi-GPU (8x B200)
AI Performance | 125 PFLOPS | ~144 PFLOPS (FP4)
Memory Bandwidth | 21 PB/s (on-chip SRAM) | ~64 TB/s (HBM3e)
Inference Speed | 21x faster (claimed) | Baseline
Cost per Token | 32% lower (SemiAnalysis) | Baseline
Power | ~23 kW per system | ~14.3 kW per system
Programming | Cerebras SDK (smaller ecosystem) | CUDA (dominant ecosystem)
Scaling | SwarmX (up to 2,048 nodes) | NVLink/InfiniBand (proven)

Aggregate Inference Capacity

As of March 2025, Cerebras operates data centers in Dallas, Oklahoma, Minnesota, Montreal, and California,[4] with a combined inference capacity exceeding 40 million tokens per second.[11] The OpenAI deal will add 750 MW of additional capacity through 2028.[3]

Strategic Implication: The Speed Gap is Real

Even if Cerebras's claims are exaggerated by 50%, they are still 10x faster than standard GPU inference. The platform's ultra-low-latency target is achievable with standard NVIDIA hardware, but Cerebras is operating at a different order of magnitude. For latency-sensitive use cases (real-time agents, voice AI, autonomous systems), Cerebras creates a market tier the platform cannot reach with GPUs alone.


Customer Analysis and Partnerships

Major Customers and Deals

Customer | Relationship | Deal Value | Details
OpenAI | Inference Compute[3] | $10B+ | 750 MW compute through 2028. Cerebras builds/leases DCs filled with WSE chips. OpenAI pays for cloud inference.
Meta | Llama API Partner[4] | Undisclosed | Powers Llama API inference at 2,600 tok/s. Announced at LlamaCon (Apr 2025).
G42 (UAE) | Supercomputer + Investor[10] | $500M+ | Condor Galaxy network (CG-1, CG-2, CG-3). 16 exaFLOPS total. 83% of 2023 revenue.
U.S. Dept. of Energy | National Lab Deployments[8] | Undisclosed | Scientific computing and AI research applications at national laboratories.
University of Edinburgh | EPCC Supercomputing[15] | Undisclosed | 4x CS-3 cluster deployed at EPCC. 70% faster than GPU solutions for research.
IBM | Enterprise Compute[8] | Undisclosed | Enterprise AI infrastructure contracts.

Condor Galaxy Supercomputer Network[10]

System | Location | Specs | Status
CG-1 | Santa Clara, CA | 64x CS-2, 4 exaFLOPS, 54M cores | Operational (Jun 2023)
CG-2 | Undisclosed | 4 exaFLOPS, 54M cores | Operational (Nov 2023)
CG-3 | Dallas, TX | 64x CS-3, 8 exaFLOPS, 58M cores | Under construction
CG-4 through CG-9 | Various | Planned total: 36 exaFLOPS | Planned

Models Trained on Condor Galaxy

Customer Diversification Trajectory

Cerebras's customer base has transformed dramatically in 12 months, from 87% G42 concentration in H1 2024[7] to a portfolio that now includes OpenAI ($10B),[3] Meta,[4] IBM, the DOE, and academic institutions.[8] This rapid diversification signals that the inference cloud product has found real market demand beyond the G42 anchor relationship.


Pricing Analysis

Cerebras Inference Cloud Pricing[9][16]

Model | Input ($/1M tokens) | Output ($/1M tokens) | Context Window | Notes
Llama 3.1 8B | $0.10 | $0.10 | 128K | Free tier: 1M tokens/day
Llama 3.3 70B Instruct | $0.60 | $0.60 | 128K | Core offering
Llama 3.1 405B | $6.00 | $12.00 | 128K | Largest Llama model
Qwen3-235B (A22B) | ~$0.22 | ~$0.80 | 131K | MoE architecture[16]
Qwen3-32B | Free tier | Free tier | 64K | Developer on-ramp
DeepSeek R1 | $1.35 | $5.40 | 164K | Reasoning model
GPT-OSS 120B | $0.15 | $0.60 | 131K | OpenAI open-weight

Pricing Comparison: Cerebras vs. GPU-Based Providers

Provider | Llama 3.3 70B Input | Llama 3.3 70B Output | Speed (tok/s) | Cost Advantage
Cerebras | $0.60 | $0.60 | 2,100 | Baseline
Together AI | $0.88 | $0.88 | ~100 | Cerebras 32% cheaper
Fireworks AI | $0.90 | $0.90 | ~120 | Cerebras 33% cheaper
AWS Bedrock | $2.50 | $3.50 | ~60 | Cerebras 76-83% cheaper
Azure OpenAI | $2.68 | $3.50 | ~70 | Cerebras 78-83% cheaper
Pricing Implication

Cerebras's pricing of $0.60/M tokens for Llama 3.3 70B[9] sets a market floor for high-speed inference. At these price points, with a 20x speed advantage, GPU-based providers face a structural disadvantage in cost-per-token. The platform must either: (1) partner with Cerebras to offer this speed tier, (2) match on price through operational efficiency with H100/H200, or (3) differentiate on sovereignty, compliance, and customization, where Cerebras has no presence. Option 3 is the most defensible near-term strategy; Option 1 is the most aggressive.
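
As a quick sanity check on the comparison table above, the sketch below recomputes the cost-advantage percentages under an assumed 3:1 input-to-output token mix. The mix ratio is an assumption; substitute the platform's observed workload mix before drawing pricing conclusions.

```python
# Recompute blended $/1M-token costs from the pricing comparison table above.
# The 75/25 input/output mix is an assumption; actual workloads vary.

def blended(input_per_m: float, output_per_m: float, input_share: float = 0.75) -> float:
    """Blended price per 1M tokens for a given input/output token mix."""
    return input_per_m * input_share + output_per_m * (1 - input_share)

providers = {
    "Cerebras":     (0.60, 0.60),
    "Together AI":  (0.88, 0.88),
    "Fireworks AI": (0.90, 0.90),
    "AWS Bedrock":  (2.50, 3.50),
    "Azure OpenAI": (2.68, 3.50),
}

base = blended(*providers["Cerebras"])
for name, (inp, out) in providers.items():
    cost = blended(inp, out)
    if name == "Cerebras":
        print(f"{name:13s} ${cost:.2f}/M tokens (baseline)")
    else:
        print(f"{name:13s} ${cost:.2f}/M tokens  Cerebras ~{1 - base / cost:.0%} cheaper")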

Free Tier Strategy

Cerebras offers 1 million free tokens per day with no waitlist.[16] This developer-acquisition strategy mirrors what worked for OpenAI and Anthropic. Available models on the free tier include Qwen3-32B and Llama 3.1 8B. Pay-as-you-go options are also available via OpenRouter and Hugging Face integrations.

Opportunity: Enterprise Premium

Cerebras's pricing is developer-focused and API-first. There is no enterprise tier with SLAs, dedicated capacity, compliance certifications (SOC 2, HIPAA, FedRAMP), or data residency guarantees. The platform's sovereign-ready positioning can command a 2-3x premium over Cerebras's public API pricing if paired with enterprise features that regulated industries require.


Competitive Positioning

Custom Silicon Landscape: Where Cerebras Fits

Company | Architecture | Key Advantage | Primary Use | Threat to the Platform
Cerebras | Wafer-Scale Engine | Speed (20x GPU) | Inference + Training | HIGH
Groq | LPU (Tensor Streaming) | Deterministic latency | Inference only | HIGH
SambaNova | RDU (Reconfigurable) | Enterprise features | Training + Inference | MEDIUM (Partner)
Etched | Sohu ASIC | Transformer-specific | Inference only | MEDIUM (Partner)
Google TPU | Custom ASIC (v5p/v6) | Integration with GCP | Training + Inference | LOW (captive)
AWS Trainium/Inferentia | Custom ASIC | AWS ecosystem lock-in | Training + Inference | LOW (captive)

Cerebras Strengths

  1. Raw speed leadership. 20x faster inference than GPU clouds, validated by Artificial Analysis and SemiAnalysis.[5]
  2. Anchor customers. OpenAI ($10B), Meta Llama API, DOE. These are not speculative contracts.[3][4]
  3. Cost advantage. 32% lower cost-per-token than NVIDIA Blackwell at the hardware level.[5]
  4. Capital access. $23B valuation, ~$2.55B raised, pending IPO will unlock more.[6]
  5. Proven team. SeaMicro exit gives founders credibility. 10 years of chip development.[2]

Cerebras Weaknesses

  1. TSMC dependency. WSE-3 requires an entire 300mm wafer at 5nm. TSMC yield and capacity constraints directly limit supply.
  2. Narrow software ecosystem. CUDA dominates. Cerebras SDK adoption is tiny compared to NVIDIA's developer base.
  3. No enterprise sales infrastructure. API-first model. No field sales team, no compliance certifications, no managed services for regulated industries.
  4. Power consumption. ~23 kW per CS-3 system. Water cooling required. Not suitable for edge or air-cooled deployments.
  5. Customer concentration legacy. G42/CFIUS history creates lingering regulatory risk.[7]
  6. No training at scale (yet). Training on Condor Galaxy systems has produced only small models (7B-30B). Large-scale training validation is missing.
Cerebras vs. Groq: The Speed War

Both Cerebras and Groq are competing for the "fastest inference" positioning. Meta's Llama API gives developers both options: Cerebras for wafer-scale speed, Groq for LPU-based deterministic latency.[4] This dual-sourcing by Meta suggests the market wants multiple non-GPU inference options. The platform's multi-chip strategy is aligned with this trend but lacks a wafer-scale option.


IPO Analysis and Market Outlook

IPO Timeline[8][7]

Date | Event | Significance
Sep 2024 | Confidential S-1 filed with SEC[7] | Ticker: CBRS, Nasdaq. First disclosed revenue figures.
Late 2024 | CFIUS review launched[8] | G42's 87% revenue concentration triggered national security review.
Oct 2025 | S-1 withdrawn. $1.1B Series G raised.[7] | Pivoted to private round at $8.1B. Began restructuring G42 stake.
Jan 2026 | $10B OpenAI deal announced[3] | Materially reduced customer concentration. Cleared regulatory concerns.
Feb 2026 | $1B raised at $23B valuation[6] | Pre-IPO round. Tiger Global led. AMD participated as strategic investor.
Q2 2026 | Target IPO date[8] | Expected valuation: $22-25B. Would be largest AI chip IPO since ARM.

Valuation Trajectory

Date | Valuation | Multiple (Revenue) | Event
Nov 2021 | $4.0B | ~160x (FY22 revenue) | Series E
Oct 2025 | $8.1B | ~16x (est. FY24 rev) | Series G
Feb 2026 | $23.0B | ~23x (est. FY25 rev) | Series H
Q2 2026 (est.) | $22-25B | -- | IPO target
Post-IPO Risk to the Platform

A successful Cerebras IPO at $22-25B would give the company significant capital for: (1) building out data center infrastructure for the OpenAI contract, (2) aggressive pricing to capture inference market share, (3) hiring enterprise sales and go-to-market teams, and (4) potential acquisitions in the inference stack. Post-IPO Cerebras will be a more formidable competitor than pre-IPO Cerebras. The platform should accelerate its own go-to-market before Cerebras has public-market capital to compete on enterprise sales.

Key Risks for Cerebras


Strategic Implications

Option Analysis: Four Strategic Postures

Option A: Partner with Cerebras

Action: License CS-3 systems or buy Cerebras Inference API capacity. Offer this as the platform's "ultra-speed" inference tier.

Pro: Instant 20x speed advantage. Differentiates from GPU-only competitors.

Con: Dependency on single-chip vendor. Margin compression. Limited customization.

Fit: High -- Aligns with the platform's multi-chip strategy (H100 + alternative silicon + Cerebras).

Option B: Compete Head-On

Action: Optimize GPU inference stack to minimize latency gap. Compete on price with operational efficiency.

Pro: Full control. No vendor dependency. Proven GPU ecosystem.

Con: Cannot close 20x speed gap with software alone. Cerebras has structural advantage.

Fit: Medium -- Viable for enterprise workloads where speed is less critical than compliance.

Option C: Differentiate on Sovereignty

Action: Position the platform for regulated industries (healthcare, finance, government). SOC 2, HIPAA, FedRAMP. Data residency guarantees.

Pro: Cerebras has zero enterprise compliance infrastructure. Clear whitespace.

Con: Smaller TAM. Cerebras will eventually build compliance. Time-limited moat.

Fit: High -- Aligns with "sovereign-ready" positioning. Defensible for 12-18 months.

Option D: Hybrid (Recommended)

Action: Partner with Cerebras for speed tier (Option A) + build sovereignty moat (Option C). Offer tiered service: Standard (GPU), Fast (alternative silicon), Ultra (Cerebras).

Pro: Best of both worlds. A multi-chip strategy is the platform's stated approach.

Con: Execution complexity. Multiple vendor relationships to manage.

Fit: Highest -- Maximizes market coverage and differentiation.

Recommended Actions for Platform Leadership
  1. Initiate exploratory conversations with Cerebras's BD team (Q1 2026). Understand partnership models: reseller, API capacity buyer, or co-location of CS-3 systems in the platform's data centers. The OpenAI deal structure provides a template.
  2. Run a head-to-head benchmark: WSE-3 vs. the platform's current stack (Q1 2026). Use Cerebras's free inference tier (1M tokens/day) to benchmark Llama 3.3 70B latency against the platform's H100/H200 deployment. Document the gap.
  3. Accelerate compliance certifications (Q2 2026). Cerebras has no SOC 2, HIPAA, or FedRAMP. The platform can lock in regulated enterprise customers before Cerebras builds these capabilities post-IPO.
  4. Design a tiered product architecture (Q2 2026). Standard Tier (GPU, competitive per-token pricing), Fast Tier (alternative silicon), Ultra Tier (Cerebras WSE, premium per-token pricing). Let customers choose latency vs. cost; a routing sketch follows this list.
  5. Secure design partners before the Cerebras IPO (Q2 2026). Every enterprise customer the platform signs before Cerebras has public-market capital is a customer Cerebras must outspend to win. Speed matters.
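
A minimal sketch of the tiered routing referenced in action item 4, under stated assumptions: the tier names, endpoints, prices, and throughput targets below are hypothetical placeholders used only to illustrate the decision logic, not committed product parameters.

```python
# Hypothetical tier catalog and routing rule for a Standard/Fast/Ultra lineup.
# All endpoints, prices, and throughput targets are illustrative placeholders.
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    name: str
    endpoint: str                 # placeholder URL, not a real service
    price_per_m_tokens: float     # hypothetical blended $/1M tokens
    target_tok_per_s: int         # hypothetical throughput target

TIERS = {
    "standard": Tier("Standard (GPU)",       "https://gpu.inference.example/v1",   0.70,   100),
    "fast":     Tier("Fast (alt. silicon)",  "https://altsi.inference.example/v1", 0.90,   400),
    "ultra":    Tier("Ultra (Cerebras WSE)", "https://ultra.inference.example/v1", 1.20, 2_000),
}

def route(latency_sensitive: bool, budget_per_m_tokens: float) -> Tier:
    """Fastest affordable tier for latency-sensitive work, cheapest otherwise."""
    affordable = [t for t in TIERS.values() if t.price_per_m_tokens <= budget_per_m_tokens]
    if not affordable:
        return TIERS["standard"]
    if latency_sensitive:
        return max(affordable, key=lambda t: t.target_tok_per_s)
    return min(affordable, key=lambda t: t.price_per_m_tokens)

print(route(latency_sensitive=True,  budget_per_m_tokens=1.50).name)  # Ultra (Cerebras WSE)
print(route(latency_sensitive=True,  budget_per_m_tokens=0.80).name)  # Standard (GPU)
print(route(latency_sensitive=False, budget_per_m_tokens=1.50).name)  # Standard (GPU)
```

The same structure extends naturally to per-model availability and compliance flags, for example routing regulated workloads only to sovereign-certified tiers.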

Platform vs. Cerebras: Positioning Matrix

Dimension | The Inference Platform | Cerebras | Platform Advantage?
Inference Speed | Sub-120 µs/token (target) | 20x faster than GPU | No
Cost per Token | 30-50% below hyperscalers | 32% below NVIDIA Blackwell | Parity
Compute Platforms | 3+ platforms | 1 (WSE only) | Yes
Enterprise Compliance | Building (SOC 2, HIPAA target) | None | Yes
Data Sovereignty | Sovereign-ready, modular DCs | US/Canada only | Yes
Model Support | Open-source LLMs | Llama, Qwen, DeepSeek, GPT-OSS | Parity
Go-to-Market | Design partner phase | API cloud + $10B anchor deal | No
Capital | Platform capital | $23B valuation, IPO imminent | No
Bottom Line

Cerebras is not a direct competitor to the platform today. It is a potential compute supplier that could become the platform's most powerful inference accelerator. The risk is that Cerebras builds enterprise sales and compliance capabilities post-IPO, at which point it becomes a direct competitor with a 20x speed advantage. The platform's window to establish sovereign-ready, compliance-first positioning is 12-18 months. The recommended path: partner on speed, compete on everything else.

Sources & Footnotes

  1. [1] Cerebras, "Cerebras Announces Third-Generation Wafer-Scale Engine," Mar 2024. cerebras.ai/press-release/cerebras-announces-third-generation-wafer-scale-engine
  2. [2] Wikipedia, "Cerebras," accessed Feb 2026. en.wikipedia.org/wiki/Cerebras
  3. [3] TechCrunch, "OpenAI signs deal, worth $10B, for compute from Cerebras," Jan 14, 2026. techcrunch.com/2026/01/14/openai-signs-deal
  4. [4] VentureBeat, "Meta unleashes Llama API running 18x faster than OpenAI: Cerebras partnership delivers 2,600 tokens per second," Apr 2025. venturebeat.com
  5. [5] Cerebras, "Cerebras CS-3 vs. Nvidia DGX B200 Blackwell," 2024. cerebras.ai/blog/cerebras-cs-3-vs-nvidia-dgx-b200-blackwell
  6. [6] Bloomberg, "Nvidia Rival Cerebras Raises $1 Billion in Funding at $23 Billion Valuation," Feb 4, 2026. bloomberg.com
  7. [7] SiliconANGLE, "Wafer-scale chip startup Cerebras withdraws IPO filing after $1.1B round," Oct 2025. siliconangle.com
  8. [8] MarketWise, "Cerebras IPO: Why This Nvidia Rival Could Go Public in 2026," 2026. marketwise.com/investing/cerebras-ipo-nvidia-rival
  9. [9] Cerebras, "Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s," 2024. cerebras.ai/blog/cerebras-inference-3x-faster
  10. [10] Cerebras, "Introducing Condor Galaxy 1: a 4 exaFLOP Supercomputer for Generative AI," Jun 2023. cerebras.ai/blog/introducing-condor-galaxy-1
  11. [11] Cerebras, "Cerebras Launches the World's Fastest AI Inference," Mar 2025. cerebras.ai/press-release/cerebras-launches-the-worlds-fastest-ai-inference
  12. [12] Crunchbase, "Cerebras Systems - Funding Rounds," accessed Feb 2026. crunchbase.com
  13. [13] Cerebras, "Cerebras CS-3: the world's fastest and most scalable AI accelerator," Mar 2024. cerebras.ai/blog/cerebras-cs3
  14. [14] Cerebras, "Cerebras Delivers Record-Breaking Performance with Meta's Llama 3.1-405B Model," 2024. cerebras.ai/press-release/cerebras-inference-llama-405b
  15. [15] Data Center Dynamics, "University of Edinburgh deploys Cerebras CS-3 cluster at EPCC supercomputing center," 2024. datacenterdynamics.com
  16. [16] Cerebras, "Pricing," accessed Feb 2026. cerebras.ai/pricing
  17. [17] CNBC, "Cerebras scores OpenAI deal worth over $10 billion ahead of AI chipmaker's IPO," Jan 14, 2026. cnbc.com
  18. [18] Next Platform, "Cerebras Inks Transformative $10 Billion Inference Deal With OpenAI," Jan 15, 2026. nextplatform.com
  19. [19] Cerebras, "Cerebras Wafer-Scale Engine Overview," cerebras.ai/chip
  20. [20] Seeking Alpha, "Cerebras: Nvidia Rival Gearing Up For IPO," 2025. seekingalpha.com
  21. [21] Artificial Analysis, "Cerebras - Intelligence, Performance & Price Analysis," accessed Feb 2026. artificialanalysis.ai/providers/cerebras
  22. [22] GlobeNewsWire, "Cerebras Systems Global WSE/CS Deployment Analysis Report 2026," Feb 2026. globenewswire.com
  23. [23] Fortune, "CEO of $8 billion AI company says it's 'mind-boggling' that people think you can work 38 hours a week," Oct 2025. fortune.com
  24. [24] HPCwire, "Cerebras Integrates Qwen3-235B into Cloud Platform for Scalable AI Supercomputing," 2025. hpcwire.com
  25. [25] Data Center Dynamics, "OpenAI signs $10 billion deal with Cerebras, with 750MW of big-chip compute," Jan 2026. datacenterdynamics.com