QumulusAI and vCluster Partner to Accelerate Enterprise AI Development and Launch AI Infrastructure Lab
Atlanta, GA – March 26, 2026 – QumulusAI, a provider of GPU-powered cloud infrastructure for artificial intelligence, today announced a partnership with vCluster, creators of virtual Kubernetes cluster technology, to enable developers to quickly and cost-effectively create secure, isolated Kubernetes environments for AI development.
The companies have also established the vCluster AI Lab, a new environment built to accelerate vCluster’s product innovation for the rapidly evolving AI ecosystem. The AI Lab runs on QumulusAI’s distributed GPU infrastructure, giving vCluster engineers direct access to scalable GPU resources to rapidly prototype new product features, experiment with emerging AI workloads, and refine orchestration capabilities as GPU architectures and AI frameworks continue to evolve.
With the rapid adoption of generative AI, enterprise development teams face a familiar dilemma. They must choose between waiting weeks to provision dedicated infrastructure and piling teams into shared Kubernetes environments with no real isolation, creating security, governance, and resource contention problems. The result is delayed projects and GPU capacity that sits underutilized while teams wait for access. What organizations need is a way to instantly provision secure, isolated Kubernetes environments on top of existing GPU infrastructure, giving each team dedicated access without the overhead of standing up entirely separate clusters.
Through the partnership, QumulusAI will offer a managed Kubernetes solution powered by vCluster technology, enabling enterprises and AI developers to deploy isolated Kubernetes environments on shared GPU infrastructure. The solution enables AI development at hyperspeed: teams can rapidly spin up development, testing, and production environments without duplicating infrastructure, while maintaining secure separation and optimal utilization of GPU resources at scale.
The environments run on QumulusAI infrastructure powered by NVIDIA's Blackwell-based B300 and RTX PRO 6000 platforms, designed to support modern AI training, inference, and experimentation workloads.
“AI teams need infrastructure that moves as fast as their ideas,” said Ryan DiRocco, CTO of QumulusAI. “By combining vCluster’s trusted Kubernetes virtualization technology with QumulusAI’s distributed GPU cloud, organizations can spin up isolated environments in minutes and begin building quickly. We believe this partnership will give enterprises the flexibility, access, and speed required to move AI from experimentation to production.”
“AI infrastructure is evolving at an extraordinary pace, and platform tooling must evolve with it,” said Lukas Gentele, CEO of vCluster. “Our new AI Lab, powered by QumulusAI infrastructure, gives us the ability to test new ideas quickly and ensure our platform is ready for the next generation of AI workloads. At the same time, customers benefit from enterprise-grade Kubernetes environments optimized for GPU-accelerated development.”
“This partnership reflects a broader shift in the market toward more flexible and efficient AI infrastructure models,” said Steven Dickens, CEO and Principal Analyst, HyperFRAME Research. “The ability to rapidly spin up isolated environments on shared GPU resources addresses a real gap for enterprises trying to move from experimentation into production.”
About vCluster
vCluster Labs is the leading platform for operating GPU infrastructure, enabling AI cloud providers to deliver a hyperscaler-like experience to their customers and AI factories to build the same experience for their internal teams. Its technology delivers the full operational stack operators need to run their GPU data centers — managed Kubernetes, fast isolated tenant provisioning, and automated node provisioning and lifecycle management — enabling them to accelerate time to value, reduce operational burden, and maximize the ROI of every GPU. Trusted by fast-growing AI cloud providers and NVIDIA Cloud Partners, with an NVIDIA-validated reference architecture for DGX systems, vCluster helps operators turn GPU hardware into scalable AI factories. Outside of AI infrastructure, enterprises including FICO, GoFundMe, and Aussie Broadband use vCluster to deliver consistent, self-service Kubernetes platforms across multi-cloud and hybrid environments. Learn more at www.vcluster.com.
About QumulusAI
QumulusAI is a vertically integrated AI infrastructure company focused on delivering a distributed AI cloud by innovating around power, data center, and GPU-based cloud services. The company delivers fast access to high-performance computing with enhanced cost control, reliability, and flexibility. Machine learning teams, AI startups, research institutions, and growing enterprises can now scale their AI training and inference workloads quickly and cost effectively. For more information, visit https://www.qumulusai.com
Press: media@qumulusai.com
Investors: investors@qumulusai.com
Disclaimer
This press release contains certain “forward-looking statements” that are based on current expectations, forecasts and assumptions that involve risks and uncertainties, and on information available to QumulusAI as of the date hereof. QumulusAI’s actual results could differ materially from those stated or implied herein, due to risks and uncertainties associated with its business. Forward-looking statements include statements regarding QumulusAI’s expectations, beliefs, intentions or strategies regarding the future, and can be identified by forward-looking words such as “anticipate,” “believe,” “could,” “continue,” “estimate,” “expect,” “intend,” “may,” “should,” “will” and “would” or words of similar import. Forward-looking statements include, without limitation, statements regarding anticipated results of QumulusAI’s partnership with vCluster, QumulusAI’s plans, objectives, expectations and intentions, and other statements that are not historical facts. QumulusAI expressly disclaims any obligation or undertaking to disseminate any updates or revisions to any forward-looking statement contained in this press release to reflect any change in QumulusAI’s expectations with regard thereto or any change in events, conditions or circumstances on which any such statement is based in respect of its business, the strategic partnership or otherwise.
Infrastructure Friction Isn't Slowing Your AI. It's Shaping It.
The most damaging effect of infrastructure friction on enterprise AI isn't delay. It's selection.
When teams know that GPU provisioning takes weeks, that scaling requires procurement cycles, and that capacity is uncertain, they adjust. Not by pushing harder against the constraints, but by internalizing them. They stop proposing the experiments they know will get stuck in queues. They default to workloads that fit within existing commitments. They pursue the incremental project over the transformative one.
The AI strategy that reaches the boardroom isn't a reflection of what's possible. It's a reflection of what the infrastructure will permit. And the gap between those two things is where competitive advantage lives.
The Projects That Never Got Proposed
Consider two scenarios, drawn from a recent HyperFRAME Research analysis of enterprise AI infrastructure challenges.
The first: an AI development firm building customized language models for enterprise customers. Their business model depends on speed and margin flexibility — demonstrating product-market fit with minimized capital outlay, then scaling capacity as demand materializes. Under legacy infrastructure, this team faces a fundamental innovation roadblock. The small-scale experimentation needed for rapid iteration is expensive relative to results. Scaling commitments require capital they can't deploy until product-market fit is proven. The result is a cash flow squeeze that delays time-to-market and constrains the very experimentation needed to get there.
The second: an enterprise with multiple business units pursuing independent AI initiatives. Each unit needs capacity for experimentation. None can justify dedicated infrastructure until use cases are validated. Under centralized procurement models, teams queue for shared resources. Internal SLAs create multi-week delays. Budget cycles prohibit rapid scaling. The teams that move fastest are the ones using shadow IT — fragmenting the organization's vendor leverage and creating infrastructure sprawl that will need to be consolidated later, compounding the original delays.
Different organizations. Same underlying constraint: infrastructure that dictates the pace and scope of AI ambition rather than supporting it.
The Compounding Cost of Stop-Start Development
Infrastructure friction doesn't just add time to a development cycle. It degrades the cycle itself.
When a team submits a GPU allocation request and waits weeks for capacity, they don't pause productively. They context-switch. Engineers pick up other work. Subject-matter experts move to different priorities. When capacity finally arrives, the team re-assembles and re-ramps — rebuilding context that was fresh weeks earlier.
Training runs generate insights that demand immediate follow-up. Under a stop-start model, those insights sit in a queue instead. By the time the team can act on what they learned, the learning has cooled. The model architecture conversation has moved on. The follow-up experiment, designed in the momentum of discovery, gets redesigned from a standing start.
Projects that should take months stretch into years. Many are abandoned. And when enough projects stall, the organizational narrative shifts. "AI doesn't work for us." Not because the technology failed — because the infrastructure imposed a rhythm incompatible with how AI development actually progresses.
Three Layers of Speed
Speed in AI development isn't one thing. It's three.
Provisioning speed determines how quickly teams can start. When provisioning is measured in hours rather than weeks, the gap between "approved" and "running" collapses. Teams maintain context. Momentum carries forward.
Iteration speed determines how quickly teams can learn. When infrastructure supports rapid follow-up — scaling a training run, testing a hypothesis, adjusting architecture based on results — the learning cycle tightens. More iterations per quarter means more insights per quarter.
Scaling speed determines how quickly teams can capitalize on success. When an experiment shows promise, the ability to scale from prototype to production without procurement cycles or commitment renegotiation is the difference between capturing a market window and watching it close.
Bottlenecks at any of these layers constrain the entire development cycle. And general-purpose cloud infrastructure, optimized for steady-state enterprise workloads, can introduce friction at each one.
Infrastructure as Accelerator
At QumulusAI, we built our architecture around a conviction: speed isn't a secondary consideration — it's a first-order design constraint. Our distributed model, with GPU capacity continuously replenished across colocation partnerships, is designed to eliminate the centralized allocation queues that create the stop-start patterns described above. Provisioning measured in hours. Seamless scaling as experiments succeed. Infrastructure that adapts to the pace of learning rather than the pace of procurement.
This is what we mean by hyperspeed compute: infrastructure velocity as competitive differentiation.
The Access and Speed dimensions of our FACTS framework directly address these challenges. But diagnosing the specific friction points in your own infrastructure requires a structured approach. HyperFRAME Research's latest brief provides that structure — a set of diagnostic questions and a decision lens for evaluating whether your infrastructure is enabling your AI development velocity or quietly constraining it.
Heading into GTC, the infrastructure conversation will be louder than it's been in years. The question worth asking before you get there: is your infrastructure keeping pace with your team's ability to learn?
The Most Expensive GPU Is the One You're Not Using
There's a cost conversation happening across enterprise AI that's focused on the wrong number.
Teams compare price-per-GPU-hour across providers. Procurement builds spreadsheets modeling committed versus on-demand pricing. Finance asks whether the cloud bill is growing faster than the AI roadmap can justify. All of these are reasonable questions — and none of them capture the actual economic drag on enterprise AI development.
The real cost problem is structural, not transactional. And it's compounding quietly in three places most organizations aren't measuring.
The Capacity You're Paying for But Not Using
AI workloads don't behave like traditional enterprise applications. They oscillate between dormant windows and intensive bursts — a training run that demands every available GPU for 72 hours, followed by weeks of analysis, architecture adjustment, and preparation for the next run. Inference workloads spike with product launches and user adoption curves, then stabilize at a fraction of peak demand.
Commitment-based pricing models, designed for the steady-state resource consumption of information-scale workloads, force a binary choice on AI teams. Over-provision, and pay for capacity that sits idle during dormant periods. Or under-provision, and wait for capacity during the intensive windows when speed matters most.
Both options carry real cost — one measured in direct spend, the other in delayed value creation and lost development momentum. The industry conversation tends to focus on the former because it shows up on an invoice. But the latter is usually more expensive.
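To make the trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. All rates and the utilization pattern are invented for illustration; they are not QumulusAI or market prices.

```python
# Illustrative only: hypothetical rates and a bursty utilization pattern,
# not actual provider pricing.
COMMITTED_RATE = 2.00   # $/GPU-hour, reserved capacity billed around the clock
ON_DEMAND_RATE = 3.50   # $/GPU-hour, billed only for hours actually used

HOURS_IN_MONTH = 730
GPUS = 64

# A bursty AI workload: one intensive 72-hour training run per month,
# plus light development use the rest of the time.
burst_hours = 72
idle_utilization = 0.10  # fraction of the fleet busy outside the burst

busy_gpu_hours = (GPUS * burst_hours
                  + GPUS * idle_utilization * (HOURS_IN_MONTH - burst_hours))

committed_cost = COMMITTED_RATE * GPUS * HOURS_IN_MONTH  # pays for idle hours too
on_demand_cost = ON_DEMAND_RATE * busy_gpu_hours         # pays only for busy hours

print(f"GPU-hours actually used: {busy_gpu_hours:,.0f}")
print(f"Committed (over-provisioned): ${committed_cost:,.0f}/month")
print(f"On-demand (right-sized):      ${on_demand_cost:,.0f}/month")
```

Under these assumed numbers, the committed model costs roughly three times the on-demand one, because the fleet sits mostly idle between runs. What the sketch cannot price is the under-provisioned path's hidden cost: waiting for capacity during the burst window, which is exactly the delayed value creation described above.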
The Budget Uncertainty That Kills Experimentation
GPU infrastructure pricing in the current market features complex structures with layered egress fees, variable storage charges, and commitment penalties that make even medium-term forecasting difficult. When teams can't predict what a month of development will actually cost, they respond rationally: they optimize for conservative cost predictability rather than needed performance.
This isn't a failure of discipline. It's a natural consequence of cost opacity. And its downstream effect is significant: teams that can't forecast confidently don't experiment ambitiously. They default to workloads they know will stay within budget guardrails. They pursue the safe project over the transformative one.
The irony is that AI development demands exactly the kind of iterative, exploratory approach that cost unpredictability discourages. The fail-fast methodology that drives breakthroughs requires the freedom to spin up capacity for a hypothesis, test it quickly, and redirect resources based on what you learn. When every experiment carries budget uncertainty, organizations build in caution that slowly erodes their competitive edge.
The Innovation You Never Attempted
This is the cost that doesn't appear on any balance sheet.
When infrastructure economics are unpredictable, teams don't just slow down — they self-censor. Projects that would require uncertain scaling commitments never get proposed. Use cases that demand intensive experimentation get deprioritized in favor of workloads with clearer infrastructure cost profiles. The AI strategy itself becomes shaped by what the infrastructure budget can absorb rather than what the business opportunity demands.
The result: organizations operating well below their AI potential, not because their teams lack ideas or capability, but because the infrastructure cost model has made ambition feel financially irresponsible. Leadership sees "we don't have strong enough AI use cases" when the real issue is "our infrastructure economics are filtering out the strongest ones before they reach the proposal stage."
This is the invisible innovation tax. And it compounds. Every quarter of constrained experimentation is a quarter where competitors with more transparent, flexible infrastructure economics are testing hypotheses your team never proposed.
Reframing Cost as a Strategic Variable
At QumulusAI, we see GPU infrastructure pricing as a strategic variable — not just a line item. Our approach centers on two principles: cost and flexibility.
Cost means total price visibility. No hidden egress fees. No unpredictable storage charges. When teams can forecast their monthly costs with confidence, they regain the freedom to experiment — to pursue the ambitious workload alongside the safe one, to iterate quickly without worrying about budget surprise.
Flexibility means the ability to right-size infrastructure to the actual rhythm of AI development. Scale up for intensive training runs. Scale down during analysis and preparation windows. Move from fractional GPU prototyping to bare-metal production clusters without being locked into capacity tiers designed for a different workload pattern.
These two dimensions — Cost and Flexibility — are central to the FACTS framework we use to evaluate infrastructure alignment with AI development needs. The full framework, developed in collaboration with HyperFRAME Research, provides specific diagnostic questions for assessing whether your current infrastructure economics are enabling your AI strategy or quietly constraining it.
Enterprise Doesn't Have an AI Problem. It Has an Infrastructure Problem.
Enterprise AI has a problem, and it's not where most organizations are looking.
The conversation in boardrooms, analyst briefings, and conference keynotes centers on models — which architecture, which parameters, which training approach will yield the next breakthrough. Meanwhile, the infrastructure layer has quietly become the primary constraint on competitive velocity. Not because it's broken, but because it was built for a fundamentally different kind of work.
The Architecture Mismatch
The hyperscale model that powers most enterprise cloud infrastructure was designed for what we'd call information-scale workloads: serving web pages, processing transactions, streaming content, storing documents. It was built when the bottleneck was storage and networking capacity, and it solved those problems brilliantly. Massive centralized data centers serving virtualized workloads across global networks delivered economies of scale that transformed enterprise IT.
AI-native workloads operate under a completely different set of demands. Training runs require burst GPU capacity that can spike overnight and go dormant for weeks. Inference at scale needs sustained throughput with tight latency requirements. Experimentation — the fail-fast iteration that drives real AI progress — needs low-commitment flexibility to spin resources up and down as learning dictates.
These are intrinsically different optimization requirements. And when organizations try to run intelligence-scale workloads on information-scale infrastructure, the friction shows up in predictable ways: multi-week provisioning delays, commitment-based pricing that forces teams to pay for idle capacity or wait for what they need, cost structures so opaque that teams optimize for conservative predictability rather than performance, and ecosystem lock-in that compounds switching costs over time.
None of this is a failure of the hyperscale model. It reflects architectural priorities designed for breadth of service at global scale. But for organizations pursuing AI-native development, that architectural mismatch has become the chokepoint.
What Infrastructure Friction Actually Looks Like
The symptoms are familiar to any enterprise AI team. Engineers submit GPU allocation requests and wait weeks for capacity. When it arrives, they execute an intensive training run, generate insights that require immediate follow-up — and then wait again. Between cycles, institutional knowledge decays. Team members context-switch to other projects. Momentum dissipates.
Projects that should take months take years. Many are abandoned entirely.
What's less visible is the second-order effect: infrastructure friction doesn't just slow the projects that get approved. It shapes which projects get proposed in the first place. Teams internalize provisioning constraints. They pre-filter their ambitions based on what they believe the infrastructure will support. The most costly outcome isn't a delayed project — it's the breakthrough experiment that never made it past a whiteboard because someone knew the queue would kill it.
This dynamic creates what HyperFRAME Research describes as a growing divide between "AI-mature" organizations that can navigate these overheads and those stuck in "pilot purgatory" — trapped in cycles of experimentation that never reach production because the infrastructure model won't support the transition.
A Diagnostic Lens, Not Just a Diagnosis
At QumulusAI, we've built our architecture around what we call the FACTS framework: Flexibility, Access, Cost, Trust, and Speed. These five dimensions serve as both a design philosophy and a diagnostic tool for infrastructure decision-makers.
Flexibility addresses the over-provision / under-provision trap — the ability to right-size infrastructure to current workflow rather than committing to capacity tiers designed for steady-state operations.
Access measures how quickly teams can actually get capacity in their hands.
Cost evaluates transparency and predictability, not just price.
Trust captures whether SLAs and security models can adapt to specific regulatory and operational requirements.
And Speed — not just provisioning speed, but iteration speed and scaling speed — determines whether infrastructure accelerates or constrains the development cycle.
These dimensions aren't theoretical. They represent the specific friction points where enterprise AI initiatives stall, where budgets burn without delivering value, and where competitive velocity is quietly lost.
Why This Matters Now
Infrastructure decisions made in the next twelve months establish the foundation for the most critical years of enterprise AI development ahead. Organizations that secure infrastructure velocity advantages early benefit from a compounding effect: faster provisioning drives faster iteration, which drives faster learning, which drives faster deployment. Each cycle reinforces the next.
Organizations still locked into legacy infrastructure models will find the gap widening — not because their teams are less talented, but because their infrastructure imposes a speed limit their competitors have removed.
HyperFRAME Research recently published an independent research brief examining this infrastructure velocity gap in depth. The brief introduces the FACTS framework as a structured diagnostic for enterprise AI infrastructure and presents a practical approach to evaluating where your current setup may be creating drag.
QumulusAI Introduces "Hyperspeed Compute" as a New Model for Enterprise AI Infrastructure
New HyperFRAME research finds infrastructure velocity is now the primary constraint on enterprise AI progress
ATLANTA, GA / ACCESS Newswire / March 5, 2026 / QumulusAI, a distributed AI infrastructure provider, today announced the release of a new research brief developed in collaboration with HyperFRAME Research, The Hyperspeed Compute Era: Reclaiming AI Velocity for Enterprise Teams. The report examines why enterprise AI initiatives are increasingly stalled by infrastructure constraints and outlines a new approach designed to eliminate lengthy GPU access delays, rigid capacity commitments, and cost opacity.
According to the research, enterprise AI has entered a "flight to efficiency" phase. Rather than large, monolithic model builds, teams are prioritizing smaller, fine-tuned models and faster iteration cycles. Further, most infrastructure environments remain optimized for "information-scale" workloads - serving web pages, processing transactions, streaming content, storing documents - instead of "intelligence-scale" workloads - training models, running inference, fine-tuning on proprietary data. The result is a widening infrastructure velocity gap that separates AI-mature organizations from those stuck in prolonged pilots.
"The biggest limiter on enterprise AI today isn't models or ambition, it's access," said Mike Maniscalco, CEO of QumulusAI. "Teams are waiting weeks, if not months, for GPU capacity, paying for idle commitments, and losing momentum while procurement and provisioning catch up. Infrastructure has become a strategic bottleneck - and teams should be looking to augment hyperscale infrastructure with hyperspeed compute."
Infrastructure Is Now the Competitive Choke Point
The HyperFRAME report identifies three structural issues shaping enterprise AI outcomes in 2026:
Provisioning latency: Multi-week waits for GPU access slow iteration and kill a fail-fast development strategy.
Architectural misalignment: Hyperscale environments optimized for steady workloads struggle with burst-driven AI development.
Cost uncertainty: Complex pricing models and commitment structures discourage experimentation.
These constraints not only slow projects, they shape which AI initiatives are attempted at all.
"Infrastructure choice now directly determines AI velocity," said Steven Dickens, CEO and Principal Analyst at HyperFRAME Research. "Organizations that remove friction early gain a compounding advantage across every development cycle that follows."
Introducing Hyperspeed Compute and the FACTS Framework
The research introduces QumulusAI's FACTS framework - Flexibility, Access, Cost, Trust, and Speed - as a diagnostic lens for evaluating AI infrastructure readiness. The framework is designed to help enterprises identify where legacy infrastructures create friction and where alternative architectures can restore development momentum.
QumulusAI's approach, described in the report as "hyperspeed compute," is built around:
Flexible scaling from fractional GPUs to dedicated clusters
Access to distributed GPU capacity across colocation partners
Cost transparency without hidden egress or storage fees
Trust-based partnership model focused on capacity planning, not one-time transactions
Speed - rapid deployments designed to bring compute online for clients in weeks vs months
The report recommends a portfolio approach, combining hyperscale environments for steady-state workloads with hyperspeed infrastructure for experimentation, burst capacity, and early production phases.
From Waiting to Iterating
The report concludes that infrastructure decisions made in early 2026 will shape enterprise AI competitiveness for years to come. Organizations that prioritize infrastructure velocity can iterate faster, learn faster, and deploy faster - creating a flywheel effect that compounds over time.
The full research brief, The Hyperspeed Compute Era, is available from QumulusAI (ADD LINK). Enterprises interested in validating the approach can participate in QumulusAI's pilot program, designed to test provisioning speed, cost predictability, and iteration velocity with real workloads.
About HyperFRAME Research
HyperFRAME Research provides independent analysis of AI, cloud, and infrastructure markets, helping enterprises and technology providers understand emerging architectures and their business impact.
About QumulusAI
QumulusAI is a vertically integrated AI infrastructure company focused on delivering a distributed AI cloud by innovating around power, data center, and GPU-based cloud services. The company delivers immediate access to high-performance computing with enhanced cost control, reliability, and flexibility. Machine learning teams, AI startups, research institutions, and growing enterprises can now scale their AI training and inference workloads quickly and cost effectively. For more information, visit https://www.qumulusai.com
For more information on QumulusAI:
Press: media@qumulusai.com
Investors: investors@qumulusai.com
Follow QumulusAI on social media: https://www.linkedin.com/company/qumulusai
This press release contains certain "forward-looking statements" that are based on current expectations, forecasts and assumptions that involve risks and uncertainties, and on information available to QumulusAI as of the date hereof. QumulusAI's actual results could differ materially from those stated or implied herein, due to risks and uncertainties associated with its business and leadership changes. Forward-looking statements include statements regarding QumulusAI's expectations, beliefs, intentions or strategies regarding the future, and can be identified by forward-looking words such as "anticipate," "believe," "could," "continue," "estimate," "expect," "intend," "may," "should," "will" and "would" or words of similar import. Forward-looking statements include, without limitation, statements regarding future operating and financial results, QumulusAI's plans, objectives, expectations and intentions, and other statements that are not historical facts. QumulusAI expressly disclaims any obligation or undertaking to disseminate any updates or revisions to any forward-looking statement contained in this press release to reflect any change in QumulusAI's expectations with regard thereto or any change in events, conditions or circumstances on which any such statement is based in respect of its business, the strategic partnership or otherwise.
QumulusAI Announces Leadership and Board Updates to Support Next Phase of Growth
QumulusAI today announced leadership and board updates to support execution at scale as the company expands its hyper-distributed AI cloud platform and advances toward the public markets.
Steve Gertz has stepped down as Chairman of the Board to focus fully on his operational role as Chief Growth Officer. Mike Maniscalco, Chief Executive Officer, has been appointed Chairman of the Board. The company also announced the appointment of Dr. Homaira Akbari to its Board of Directors.
Gertz served as Chairman from February 2025 to February 2026 during a formative period in which QumulusAI established its operating foundation across GPU cloud services, modular data center infrastructure, and power-aligned deployment models. His transition to Chief Growth Officer in December 2025 formalized his day-to-day leadership role as the company entered a more execution-driven phase.
As Chief Growth Officer, Gertz is responsible for capital markets engagement, strategic customer and partner development, executive leadership support, and evaluation of inorganic growth opportunities.
“Steve has been operating as part of the leadership team for some time,” said Maniscalco. “This change clarifies the separation between governance and execution while keeping continuity where it matters.”
“As QumulusAI scales, growth has to keep pace with infrastructure velocity,” said Gertz. “My conviction in the company’s vision and execution is what led me to join the executive team, where I can focus on accelerating revenue, building strategic relationships, and expanding the ecosystems around our platform.”
QumulusAI also announced the appointment of Akbari to its Board of Directors, where she will serve on the Audit Committee and the Nominating and Corporate Governance Committee.
Akbari is President and CEO of AKnowledge Partners, advising Fortune 1000 companies and private equity firms on AI, cybersecurity, IoT, and energy transition. She previously served as President and CEO of SkyBitz and held senior leadership roles at Microsoft and Thales. She currently serves on the boards of Banco Santander, Landstar System, and Babcock & Wilcox Enterprises.
Akbari holds a Ph.D. in particle physics from Tufts University and is the author of The Cyber Savvy Boardroom.
“Homaira brings deep operating experience and strong public-company governance perspective,” said Maniscalco. “Her background in AI and cybersecurity is highly relevant as we scale infrastructure for enterprise workloads.”
“QumulusAI is tackling one of the defining infrastructure challenges of this decade,” said Akbari. “I’m pleased to guide the company as it scales responsibly and prepares for its next stage of growth.”
About QumulusAI
QumulusAI is a vertically integrated AI infrastructure company focused on delivering a distributed AI cloud by innovating around power, data center and GPU-based cloud services—the company delivers immediate access to high-performance computing with enhanced cost control, reliability, and flexibility. Machine learning teams, AI startups, research institutions, and growing enterprises can now scale their AI training and inference workloads quickly and cost effectively. For more information, visit https://www.qumulusai.com
For more information on QumulusAI:
Press: media@qumulusai.com
Investors: investors@qumulusai.com
Follow QumulusAI on social media:
https://www.linkedin.com/company/qumulusai
This press release contains certain “forward-looking statements” that are based on current expectations, forecasts and assumptions that involve risks and uncertainties, and on information available to QumulusAI as of the date hereof. QumulusAI’s actual results could differ materially from those stated or implied herein, due to risks and uncertainties associated with its business and leadership changes. Forward-looking statements include statements regarding QumulusAI’s expectations, beliefs, intentions or strategies regarding the future, and can be identified by forward-looking words such as “anticipate,” “believe,” “could,” “continue,” “estimate,” “expect,” “intend,” “may,” “should,” “will” and “would” or words of similar import. Forward-looking statements include, without limitation, statements regarding future operating and financial results, QumulusAI’s plans, objectives, expectations and intentions, and other statements that are not historical facts. QumulusAI expressly disclaims any obligation or undertaking to disseminate any updates or revisions to any forward-looking statement contained in this press release to reflect any change in QumulusAI’s expectations with regard thereto or any change in events, conditions or circumstances on which any such statement is based in respect of its business, the strategic partnership or otherwise.
QumulusAI Deploys 1,144 NVIDIA Blackwell GPUs Through Drawdown Under $500M USD.AI Facility
Drawdown under innovative financing marks initial phase of QumulusAI's 2026 GPU expansion roadmap targeting more than 23,000 GPUs by year-end
QumulusAI, a vertically integrated AI infrastructure company delivering hyper-distributed compute at hyperspeed, today announced the deployment of 1,144 NVIDIA Blackwell GPUs, representing its first drawdown under its previously announced $500 million non-recourse financing facility with USD.AI.
The first phase of the deployment consists of 760 NVIDIA Blackwell GPUs and marks QumulusAI's first large-scale implementation of its innovative capital model. The structure aligns flexible financing with next-generation GPU infrastructure to accelerate time to market for enterprise AI customers.
QumulusAI has also funded the second phase of deployment with a deposit for its next 384-GPU B300 cluster scheduled for late-March delivery, with the remaining balance expected to be funded in part through a subsequent draw under the USD.AI facility.
"This deployment demonstrates how AI infrastructure must be built in this era. It needs to be fast, modular, and capital-efficient," said Mike Maniscalco, Chief Executive Officer of QumulusAI. "By combining NVIDIA's Blackwell platform with a flexible financing structure, we are bringing meaningful compute capacity online at hyperspeed while maintaining capital discipline."
Blackwell-Powered Infrastructure at Scale
The deployment includes Blackwell-based server platforms powered by NVIDIA's next-generation architecture, designed to support increasingly complex AI training and inference workloads. Blackwell delivers significant improvements in performance, memory bandwidth, and energy efficiency. These gains enable customers to train larger models, accelerate inference pipelines, and improve cost-per-token economics.
By integrating Blackwell infrastructure into its hyper-distributed cloud model, QumulusAI gives customers enterprise-grade GPU access without hyperscaler bottlenecks, long procurement cycles, or rigid multi-year commitments.
Innovative Financing as a Growth Engine
The deployment represents the first drawdown under QumulusAI's $500 million USD.AI financing facility, announced earlier this year. The structure enables phased infrastructure activation aligned with customer demand. Traditional data center financing models are built around long construction timelines and large upfront capital commitments. QumulusAI's model allows capacity to scale incrementally.
This approach allows QumulusAI to:
Deploy GPU capacity in phases
Accelerate time to revenue
Maintain balance sheet flexibility
Scale infrastructure alongside customer demand
By aligning capital velocity with deployment velocity, QumulusAI is redefining how AI infrastructure reaches the market.
The original announcement of the $500 million financing facility can be found here.
Part of a Broader 2026 GPU Expansion Roadmap
The 760 Blackwell GPUs represent the initial phase of QumulusAI's broader 2026 capacity expansion plan. The company expects total GPU inventory to exceed 20,000 GPUs by the end of 2026, driven by phased deployments across its distributed network of data center partners throughout the year.
Planned 2026 deployments include scaled rollouts of B300 and RTX PRO 6000 platforms, with additional Blackwell-based capacity expected in successive phases.
As AI demand accelerates globally, QumulusAI's distributed model positions the company to respond rapidly to training and inference workloads across industries including healthcare, financial services, media, automotive, and advanced research.
Building the Hyper-Distributed AI Cloud
QumulusAI's infrastructure strategy is built on five core pillars: Flexibility, Access, Cost, Trust, and Speed. These principles enable customers to scale AI workloads with predictable performance and enterprise-grade reliability.
By combining next-generation GPU platforms, innovative capital structures, modular data center partnerships, and distributed geographic deployment, QumulusAI continues executing on its mission of Breaking AI's Biggest Barriers.
Additional deployments under the USD.AI facility are expected throughout 2026 as the company advances its roadmap.
About QumulusAI
QumulusAI is building the next-generation AI cloud through a hyper-distributed, modular infrastructure model that integrates power, data centers, and GPU-as-a-Service. The company delivers enterprise-grade compute at hyperspeed, enabling AI developers, enterprises, and research institutions to scale training and inference workloads without traditional infrastructure constraints.
About Permian Labs
Permian Labs is the developer of USD.AI, building the infrastructure that connects institutional capital with real-world AI compute. Permian Labs designs the legal, financial, and technical systems that transform GPUs into collateral and make them accessible through blockchain-based credit markets. By bridging traditional asset finance with DeFi innovation, Permian Labs enables AI operators to scale efficiently while creating new opportunities for investors to access yield from real-world infrastructure.
Visit: https://www.gpuloans.com
About USD.AI
USD.AI is the world's first blockchain-native credit market for GPU-backed infrastructure. The protocol turns AI hardware into tokenized collateral, unlocking financing markets with deep liquidity, attractive cost of capital and instant settlement for emerging AI operators who require capital to scale. Through its dual-token model, USDai (a stablecoin with deep liquidity) and sUSDai (its yield-bearing counterpart), USD.AI creates new liquidity pathways for operators while offering investors scalable, real-world yields. Developed by Permian Labs, USD.AI combines DeFi principles with institutional-grade securitization standards to accelerate the financing of AI infrastructure worldwide.
For more information on QumulusAI:
Press: media@qumulusai.com
Investors: investors@qumulusai.com
Follow QumulusAI on social media: https://www.linkedin.com/company/qumulusai
For more information on USD.AI
Email: hello@usd.ai
This press release contains certain "forward-looking statements" that are based on current expectations, forecasts and assumptions that involve risks and uncertainties, and on information available to QumulusAI as of the date hereof. QumulusAI's actual results could differ materially from those stated or implied herein, due to risks and uncertainties associated with its business and/or the strategic partnership, which include, without limitation, the company's ability to complete future deployments and continued availability under the USD.AI facility, market volatility and/or regulatory conditions. Forward-looking statements include statements regarding QumulusAI's expectations, beliefs, intentions or strategies regarding the future, and can be identified by forward-looking words such as "anticipate," "believe," "could," "continue," "estimate," "expect," "intend," "may," "should," "will" and "would" or words of similar import. Forward-looking statements include, without limitation, statements regarding future operating and financial results, QumulusAI's plans, objectives, expectations and intentions, and other statements that are not historical facts. QumulusAI expressly disclaims any obligation or undertaking to disseminate any updates or revisions to any forward-looking statement contained in this press release to reflect any change in QumulusAI's expectations with regard thereto or any change in events, conditions or circumstances on which any such statement is based in respect of its business, the strategic partnership or otherwise.
Moonshot and QumulusAI Announce Strategic Agreement with Connected Nation Internet Exchange Points to Deploy a Nationally Distributed AI Compute and Internet Exchange Platform
The AI IXP platform will accelerate low-latency AI deployments to the network edge across the U.S.
Lewisville, TX – January 15, 2026 – Moonshot Energy, a Texas-based manufacturer of critical electrical and modular infrastructure for AI, together with QumulusAI, Inc., a provider of inference-optimized GPU-as-a-Service, today announced that they and Connected Nation Internet Exchange Points (dba IXP.us) have entered into a Strategic Commercial Agreement. Through this joint venture, Moonshot and QumulusAI (QAI Moon) will design and deploy a nationally distributed, fully integrated platform with IXP.us that pairs carrier-neutral Internet Exchange Points (IXP) with modular AI Pods at 25 initial sites, scaling to 125 across U.S. university research campuses and municipalities.
This collaboration brings together carrier-neutral interconnection, modular AI infrastructure, and QumulusAI’s GPU-as-a-Service platform into a repeatable, scalable national architecture purpose-built for next-generation inference and AI workloads that reduce latency and extend AI compute access beyond centralized hyperscale data centers.
QumulusAI provides the GPU orchestration, workload delivery, and commercial operating model that enables AI compute to be deployed closer to networks, data sources, and end users, directly addressing the latency, cost, and sovereignty constraints currently challenging centralized hyperscale data centers. The initial QAI Moon deployment will begin by July 2026 at the IXP.us “alpha site”, located at 2205 N Fountain Street, Wichita, Kansas, 67220, on the Wichita State University campus, with expansion planned across additional IXP.us markets currently in development.
“This partnership represents the physical convergence of power, compute, and interconnection at the exact point where AI demand is moving,” said Ethan Ellenberg, CEO of Moonshot. “By pairing Moonshot’s modular electrical and AI infrastructure with the IXP.us carrier-neutral interconnection model and QumulusAI’s GPU platforms, we are creating a repeatable national architecture that delivers ultra-low-latency AI without the constraints of hyperscale data centers.”
QAI Moon AI Pod Benefits
Each QAI Moon AI Pod deployment is designed to operate as a network-dense, low-latency-optimized inference platform, architected to support QumulusAI’s GPU-as-a-Service delivery model, requiring:
Dual, geographically diverse 400G IP transit connections from four independent ISPs
Redundant 400G IX ports on the DE-CIXaaS switch at each IXP.us facility
Direct adjacency for high-count dark fiber between IXP interconnection infrastructure and modular AI compute
~2,000 kW initial module sizing by market with flexible, customer application-driven GPU series deployment
Together, these elements enable ultra-low-latency access to GPU resources while maintaining full carrier neutrality and operational resilience.
Strategic Partnership Value
A first-of-its-kind national platform combining IXPs with distributed AI Pods
Reduced inference latency by colocating GPUs at the network interconnection edge
Modular, repeatable deployments aligned with real-world power availability
A capital-efficient and scalable alternative to hyperscale, centralized AI data centers
Open access for network operators, enterprises, and AI customers alike
Enables QumulusAI to deliver scalable, network-adjacent GPU-as-a-Service for inference-heavy and real-time AI workloads across emerging rural markets
“AI workloads are increasingly inference-driven, latency-sensitive, and distributed, but the infrastructure hasn’t kept pace,” said Mike Maniscalco, CEO of QumulusAI. “This partnership allows us to place GPU compute directly at the network edge, where data moves and decisions happen. Together with Moonshot and IXP.us, we’re building a national platform that makes high-performance AI compute practical, scalable, and economically viable beyond hyperscale data centers.”
In January 2025, Moonshot formed Moonshot Energy, a dedicated GPU-as-a-Service operating company, to expand beyond manufacturing into AI compute infrastructure. In mid-2025, the strategy accelerated with the formation of QAI Moon, a joint venture between Moonshot Energy and QumulusAI, created to commercialize distributed, inference-optimized GPU-as-a-Service at national scale.
“With this strategic relationship, we will enable the first scalable low-latency compute infrastructure directly adjacent to our network-dense interconnection facilities, including on many university campuses,” said Hunter Newby, CEO of Newby Ventures and Co-CEO, IXP.us. “We could not be more pleased to work with QumulusAI and Moonshot to build what will be the foundation of the AI economy.”
In November 2025, QAI Moon executed a Memorandum of Understanding with IXP.us to serve as the anchor AI compute customer across the IXP.us national IXP footprint, with QumulusAI providing the initial GPU-as-a-Service workloads and customer demand driving phased deployment. QAI Moon is currently targeting an initial deployment of AI compute infrastructure at 25 of the CN IXP 125 sites with a goal of full-scale, national deployment within five years.
“Building upon Connected Nation’s original mission, IXP.us and our strategic relationship with Moonshot and QumulusAI will ensure that no state gets left behind in the AI revolution,” said Tom Ferree, CEO of Connected Nation, Inc. & Co-CEO, IXP.us. “With this relationship, we will build the low-latency network interconnection ecosystem that has been a missing piece of the AI puzzle.”
IXP.us’s existing partnerships with DE-CIX as the Internet Exchange (IX) operator, TOWARDEX for the diverse manhole and conduit access design and construction, and Connectbase for the orderly assembly and transparency of all Outside Plant Fiber, transport, and IP transit networks entering the IXP uniquely position the platform to meet the stringent low-latency connectivity requirements of next-generation AI workloads, physically distributed and at scale.
“As AI workloads become increasingly latency-sensitive, the convergence of interconnection and compute is a natural and necessary evolution of digital infrastructure,” said Ivo Ivanov, CEO of DE-CIX. “By bringing AI Pods directly to Internet Exchanges, this initiative demonstrates how neutral, distributed platforms can enable the next generation of real-time AI services at scale.”
“Having developed the Hub Express System in Boston, the nation’s first large-scale Meet-Me Street network, TOWARDEX has established itself in the utility construction and fiber infrastructure sectors for its operational excellence and world-class construction techniques for building large-scale multi-tenant fiber networks,” said James Jun, CEO of TOWARDEX. “We believe strongly that abundant access to dark fiber is the key to unlocking growth for AI and inferencing. Through our partnership with IXP.us and QAI Moon, we share in our collective vision to make AI more accessible to communities across the nation.”
“This partnership represents a new model for AI infrastructure, placing GPU compute directly at carrier-neutral Internet Exchange Points to minimize latency and maximize network choice,” said Ben Edmond, CEO & Founder of Connectbase. “Connectbase is proud to support IXP.us by providing full transparency and system-of-record visibility for all fiber, transport, and IP transit entering each site. As AI workloads move closer to the network edge, operational clarity across physical and commercial connectivity becomes foundational to scale.”
Network operators interested in bidding on and provisioning dark fiber and/or lit circuits for QAI Moon should visit the IXP.us profile on connectbase.com.
Project infographic provided by Percepture.
###
About Moonshot
Moonshot is currently constructing AI facilities for Neoclouds and Hyperscalers in the U.S. and has recently commissioned a 500,000-square-foot advanced manufacturing facility in Lewisville, Texas. This expansion enables Moonshot to deliver integrated electrical systems and modular AI infrastructure at scale, supporting the rapid deployment of low-latency, network-adjacent AI facilities nationwide, incorporating end-to-end mechanical cooling solutions and expert commissioning services through a strategic collaboration with Data Air Flow on all deployments. For more information, visit https://moonshotus.com.
About QumulusAI
QumulusAI is a vertically integrated AI infrastructure company focused on delivering a distributed AI cloud by innovating around power, data center and GPU-based cloud services — the company delivers immediate access to high-performance computing with enhanced cost control, reliability, and flexibility. Machine learning teams, AI startups, research institutions, and growing enterprises can now scale their AI training and inference workloads quickly and cost effectively. For more information, visit https://www.qumulusai.com.
About Connected Nation Internet Exchange Points (IXP.us)
Connected Nation Internet Exchange Points (IXP.us) was launched in 2022 to develop IXP facilities in 125 regional hub communities across the United States and its territories, focusing primarily on research university campuses. IXP.us is a joint venture between non-profit Connected Nation, Inc. and Newby Ventures. The company is focused on the development of neutral, physical real estate to facilitate low-latency network interconnection in unserved and underserved markets, with no monthly recurring cross-connect fees. For more information, visit https://www.ixp.us.
IXP.us is currently developing its “alpha” Internet Exchange Point facility at Wichita State University as its community anchor institution partner to establish the first-ever neutral, network interconnection facility for the city of Wichita. This IXP.us facility will be the archetype for similar deployments elsewhere.
Media Inquiries
Connectbase
Melissa Frank
mfrank@connectbase.com
DE-CIX
Carsten Titt
carsten.titt@de-cix.net
IXP.us
Jessica Denson
502-341-2024
jessica.denson@ixp.us
Moonshot/TOWARDEX
Nick Mitsis
202-361-5789
nmitsis@percepture.com
QumulusAI
Tannis Baldock
650-576-4782
media@qumulusai.com
NVIDIA B200 vs. McLaren 750S: Understanding Performance Through Two Extremes
If you were to spend roughly $350,000 on high performance today, you could choose between a McLaren 750S Spider and an NVIDIA B200 server. It’s an unusual comparison, but also a useful one. Both machines sit at the extreme of what modern engineering can deliver. Both depend on tight coordination of power, heat, materials, and control systems. And both reveal something about where the frontier of performance is heading.
A Tale of Two Machines
The McLaren represents mechanical performance refined to its sharpest point. Its twin-turbo 4.0-liter V8 generates around 740 horsepower, and the frame around it is optimized for stiffness, airflow, weight distribution, and stability. Every decision in the car’s design — from the carbon-fiber monocoque to the intercooling architecture — serves a single purpose: convert combustion into acceleration, cornering, and control with as little waste as possible. When everything is working, the result is immediate and unmistakable. You feel the performance the instant you ask for it.
An NVIDIA B200 server expresses performance differently but pursues it with the same intensity. In a QumulusAI configuration, each node includes eight Blackwell B200 GPUs with 180 GB of HBM3e per GPU, dual Intel Xeon 6960P processors offering 144 threads each, more than three terabytes of high-speed system memory, and significant local NVMe storage. Instead of managing airflow over brakes or torsional stiffness under load, the server manages the thermals of densely packed silicon, the stability of multi-kilowatt power draw, and the bandwidth required to keep all eight GPUs operating near their limits. It’s built not for bursts; it’s tuned for steady, uninterrupted precision at scale.
| Metric | McLaren 750S | NVIDIA B200 Server |
|---|---|---|
| Cost | ~$350,000 | ~$350,000 |
| Power | 740 HP | ~80,000 TFLOPS (AI compute) |
| Top Speed / Throughput | 206 mph | ~3,200 GB/s bandwidth |
| Energy Use | Premium gasoline | ~10–12 kW under load |
| Form Factor | 1 vehicle | 8 GPUs per node |
| Primary Purpose | Acceleration & handling | High-throughput computation |
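For intuition on what those rows imply at the node level, a quick arithmetic sketch using the figures quoted above (actual configurations vary by deployment):

```python
# Back-of-the-envelope aggregates for one 8-GPU B200 node, using the
# per-GPU and node-level figures quoted above; real configurations vary.
gpus_per_node = 8
hbm_per_gpu_gb = 180        # HBM3e per GPU in this configuration
node_draw_kw = 12           # upper end of the ~10-12 kW load figure

total_hbm_gb = gpus_per_node * hbm_per_gpu_gb   # pooled GPU memory
daily_energy_kwh = node_draw_kw * 24            # energy under sustained load

print(f"GPU memory per node: {total_hbm_gb} GB (~{total_hbm_gb / 1000:.2f} TB)")
print(f"Energy at full load: {daily_energy_kwh} kWh per day")
```

That steady 288 kWh a day is the computational analogue of the McLaren's fuel burn, except it never pits: the node is designed to sustain it for months at a time.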
How Performance Manifests in Each Domain
Neither machine is immune to time. New GPU generations will push past the B200, and future supercars will continue climbing the refinement curve. But the way their engineering expresses value could not be more different. A car concentrates its performance into moments. Think a straightaway, an apex, a brief stretch where the driver can access what the machine was built to do. A B200 system expresses its performance continuously. Every hour it is powered and properly fed, it produces something: training runs completed, tokens generated, simulations accelerated, or product cycles shortened.
This isn’t about which machine is “better.” It’s about how distinct engineering disciplines solve the same underlying problem — transforming energy into performance — under completely different constraints and goals.
Where the Engineering Frontier Has Moved
Seen through this lens, the analogy becomes less about novelty and more about intuition. Both machines depend on managing energy in a way that pushes their architectures to the limit. A McLaren draws power in rapid spikes, demanding airflow, fuel, and cooling that respond instantly to changes in throttle and load. The design is built around short, intense bursts of energy and the mechanical choreography that turns them into motion.
A B200 server draws power steadily and relentlessly. A well-loaded node consumes multiple kilowatts in a constant flow, and the entire system exists to keep that energy moving through memory channels, interconnect fabric, and silicon without interruption. The thermal and electrical constraints are as real and as strict as anything in motorsport; they simply unfold over hours or months instead of seconds.
This shift in how energy is used, from peak bursts to sustained throughput, reflects where many of today’s most significant engineering challenges now lie. The frontier is no longer defined only by aerodynamics, combustion, and mechanical stresses, but also by memory bandwidth, thermal envelopes, voltage stability, and the orchestration required to keep computational workloads saturated at scale. Performance hasn’t abandoned the mechanical world it defined a century ago. It has expanded into new domains for the 21st century.
Conclusion
Comparing a McLaren to a B200 server is unconventional, but it highlights something essential. These machines are built for entirely different outcomes, yet both represent the limits of engineering in their respective domains. One expresses performance through speed and handling. The other through throughput and reliability across vast, continuous workloads.
Bruce McLaren once said, “Life is measured in achievement, not in years alone.” It’s a fitting thought to end on. A supercar achieves what it was meant to achieve: moments of precision and power, perfectly executed. A B200 achieves something different: the steady, compounding work behind the systems and applications shaping the future of AI. Both are milestones of human capability. One measures its achievements in seconds, the other in sustained computation that accumulates over time.
As the demands for computation grow, the center of performance continues to move toward the systems built to transform power into scalable, uninterrupted intelligence — the infrastructure built to break AI’s biggest barriers.
Kubernetes Is Becoming the Standard for AI Infrastructure at KubeCon 2025
The message at KubeCon North America 2025 was loud and clear: Kubernetes is no longer just an option for running AI; it’s rapidly becoming the standard underlying infrastructure for it. The Cloud Native Computing Foundation (CNCF) is laser-focused on building the serving infrastructure needed to power the growth of AI workloads, especially the massive demand for inference. That means the ad-hoc, experimental days of running AI on Kubernetes are over. We’re coalescing around new, enterprise-grade solutions designed to make AI workloads stable, portable, and scalable.
1. Standardization for Smarter GPU Scheduling
The community's immediate priority, made abundantly clear at KubeCon, is to formalize how we manage specialized hardware — moving GPUs from complex, vendor-specific resources to easily shared infrastructure. This mandate is being realized through two key initiatives. First, Dynamic Resource Allocation (DRA) has reached general availability, providing an official Kubernetes API for managing GPUs and ensuring a stable foundation across all vendors. Second, smarter GPU scheduling and tenancy is maturing: GPUs are becoming shareable in much the way CPUs long have been. Kubernetes-native batch queues spanning multiple clusters ensure multi-GPU workloads stay within the fastest interconnect domains. Combined with hardware partitioning and virtual isolation, these capabilities let teams dramatically cut queue times, achieve higher utilization, and offer safe self-service environments.
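To make the DRA piece tangible, here is a minimal sketch of what requesting a GPU through the new API can look like, using the official Kubernetes Python client. The device class name and namespace are placeholders rather than any vendor's real configuration, and the field names follow the resource.k8s.io/v1 schema that accompanied GA; verify them against your cluster's version:

```python
# Minimal sketch (placeholder names, not vendor config): creating a DRA
# ResourceClaim with the official `kubernetes` Python client. DRA went GA
# as resource.k8s.io/v1 in Kubernetes 1.34; check field names against your
# cluster's API version.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() inside a cluster
api = client.CustomObjectsApi()

claim = {
    "apiVersion": "resource.k8s.io/v1",
    "kind": "ResourceClaim",
    "metadata": {"name": "single-gpu"},
    "spec": {
        "devices": {
            "requests": [{
                "name": "gpu",
                # "gpu.example.com" is a placeholder DeviceClass; a real
                # cluster uses the class published by its GPU driver.
                "exactly": {"deviceClassName": "gpu.example.com", "count": 1},
            }]
        }
    },
}

# ResourceClaim lives in a built-in (non-core) API group, so the generic
# /apis/{group}/{version} path of CustomObjectsApi works for it.
api.create_namespaced_custom_object(
    group="resource.k8s.io", version="v1",
    namespace="default", plural="resourceclaims", body=claim,
)
# A Pod then references the claim via spec.resourceClaims and
# containers[].resources.claims, instead of legacy device-plugin limits.
```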
2. Optimizing Multi-Cloud Inference Serving
The need for enterprise AI to run anywhere from the public cloud to on-prem was a recurring and critical theme at KubeCon 2025, making multi-cloud architecture a non-negotiable requirement. Keynotes highlighted the strategy of treating Kubernetes as the abstraction layer, enabling organizations to package their entire AI stack into a single, repeatable deployment (like a composite Helm chart) that runs unchanged from large bare-metal clusters down to single nodes in any environment. This guarantees performance without vendor lock-in. Furthermore, the production serving layer is standardizing around Kubernetes Custom Resource Definitions (CRDs), which enable model-aware traffic routing for canary releases, A/B testing, and cost-based routing. Leveraging engine-level optimizations such as compilation, quantization (FP8/FP4), and speculative decoding, production models can shrink cold-start times and maximize tokens per second, moving the scaling metric from simple GPU utilization to predictable latency and throughput.
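The declarative CRD layer is where this logic lives in production, but the underlying idea of model-aware canary routing is simple enough to sketch in a few lines of Python. The endpoints and traffic weights here are hypothetical:

```python
# Minimal sketch of model-aware canary routing (hypothetical endpoints and
# weights; real deployments express this declaratively via CRDs rather
# than application code).
import random

ROUTES = [
    # (model endpoint, traffic share)
    ("http://llm-stable.default.svc:8000/v1/completions", 0.95),  # current model
    ("http://llm-canary.default.svc:8000/v1/completions", 0.05),  # candidate model
]

def pick_endpoint() -> str:
    """Pick an endpoint with probability proportional to its traffic share."""
    r = random.random()
    cumulative = 0.0
    for endpoint, weight in ROUTES:
        cumulative += weight
        if r < cumulative:
            return endpoint
    return ROUTES[-1][0]  # guard against floating-point rounding

# Routing 5% of live traffic to the canary lets a team compare latency,
# cost, and quality before promoting the new model to 100%.
print(pick_endpoint())
```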
3. GPU-Centric FinOps and Observability
At KubeCon, solving the challenges of GPU efficiency and cost (FinOps) was what tied platform maturity to profitability. The key takeaway from the sessions: connecting GPU cost directly to business value and workload performance is paramount. This means utilizing LLM-aware tracing alongside raw GPU telemetry (utilization, memory, power) to quickly explain performance regressions and prevent over-provisioning. Cost allocation is now tied to measured GPU usage, providing visibility into idle spend by team and workload.
Finally, the importance of AI-ready storage was underscored, specifically using object and parallel file systems that support direct-to-GPU I/O. This removes data bottlenecks that often masquerade as "GPU problems." For operators, the final step is simple: instrument the process end-to-end, allocate resources by utilization, and optimize continuously to drive lower cost-per-thousand-tokens and stronger SLOs.
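As a rough illustration of that cost-allocation loop, here is a toy roll-up that converts measured GPU-hours and utilization into idle spend and cost per thousand tokens. Every number below is invented for the example:

```python
# Hypothetical FinOps roll-up: allocating GPU cost by measured usage and
# computing cost per thousand tokens. All figures are made up for
# illustration; real pipelines pull these from GPU telemetry and billing.

GPU_HOUR_RATE = 3.50  # assumed blended $/GPU-hour for the cluster

teams = {
    # team: (GPU-hours consumed, avg utilization, tokens served)
    "search":    (1_200, 0.82, 450_000_000),
    "assistant": (2_400, 0.41, 510_000_000),
}

for team, (gpu_hours, utilization, tokens) in teams.items():
    cost = gpu_hours * GPU_HOUR_RATE
    idle_spend = cost * (1 - utilization)   # cost of unused GPU cycles
    cost_per_1k = cost / (tokens / 1_000)
    print(f"{team:10s} cost=${cost:,.0f}  idle=${idle_spend:,.0f}  "
          f"$ per 1k tokens={cost_per_1k:.4f}")
```

Even this toy version surfaces the pattern the sessions emphasized: a team with low utilization (here, "assistant") can dominate idle spend despite serving a similar token volume.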
The Key Consensus from KubeCon 2025
Ultimately, the key consensus from KubeCon 2025 confirmed that the cloud-native ecosystem is no longer just hosting AI; it is actively becoming the operating system for AI. By focusing on unified standardization (DRA), guaranteeing portability, and implementing rigorous FinOps based on deep GPU observability, the CNCF and its projects are solving the critical challenges that have historically plagued enterprise machine learning at scale. For organizations invested in the cloud-native approach, this maturity means greater efficiency, unparalleled flexibility, and a clear path to running complex, mission-critical AI workloads predictably and profitably.
The Hyperspeed Compute Era Is Here
Why Developers Should Care and Why the Infrastructure Choice Matters
You can ship models faster than ever—but you still can’t get compute when you need it. Every AI developer knows the feeling: GPUs on backorder, latency creeping up, budgets burning while workloads wait. The slowdown isn’t your code. It’s the infrastructure.
The Internet scaled information. AI scales intelligence. Hyperscalers built for the former. QumulusAI builds for the latter. This fundamental difference defines the Hyperspeed Compute Era.
The New Reality: Infrastructure Itself Is a Barrier
A few years ago, the hardest part of building AI was the modeling itself—wrangling data, tuning architectures, keeping experiments reproducible. Those challenges haven’t disappeared, but they’re no longer the thing slowing progress down. Frameworks have matured, open-source tools are stable, and model weights are widely available. What’s harder now is getting reliable compute when you need it. As open models proliferate and workloads scale, infrastructure—not algorithms—is what determines who ships first.
The numbers tell the story:
The global AI infrastructure market is projected to grow from $135.81 billion in 2024 to $394.46 billion by 2030 (Markets and Markets).
Data centers designed for AI processing will require about $5.2 trillion in capital expenditure by 2030 (McKinsey & Company).
GPU-as-a-Service markets are expanding at 26.5% annually (Markets and Markets).
Your skill isn’t slowing you down. You’re waiting for compute to catch up to your code.
Why Infrastructure Is Now a Developer Problem
Infrastructure choice directly shapes your ability to ship. At QumulusAI, we’ve identified five pillars that define next-generation GPU infrastructure. We call these the F.A.C.T.S.
Flexibility
Whether you need shared GPUs for experimentation, dedicated GPUs for production workloads, or bare-metal servers for full control, infrastructure should adapt to your workflow—not the other way around. Deploy on-grid or off, burst or reserve, all within a unified framework that scales as you do.
Access
Compute without compromise. Developers shouldn’t wait weeks for procurement or compete for capacity. Our distributed architecture and ready-to-run GPU nodes provide availability when and where you need it, from short-term projects to continuous inference.
Cost
Performance shouldn’t mean unpredictable bills. By integrating power, data center, and compute operations, we maintain predictable economics that scale with you. Energy efficiency and disciplined asset cycles make sustained experimentation possible without financial guesswork.
Trust
AI progress depends on confidence in your infrastructure. From secure data isolation to SLA-backed reliability, every layer of our stack—power, network, and GPU—is designed for integrity. You retain ownership of your workloads and visibility into every node you touch.
Speed
The defining factor of the Hyperspeed Compute Era. Our modular infrastructure enables rapid provisioning and consistent performance so your ideas can move from concept to deployment without delay. Speed isn’t just a metric—it’s the rhythm of innovation.
These pillars underpin our philosophy for how infrastructure should behave in an AI-driven world: adaptive, available, affordable, transparent, and fast.
What Hyperspeed Infrastructure Delivers
Modern AI development demands systems that match its velocity. Hyperspeed infrastructure delivers exactly that:
Instant provisioning. From shared GPU hours to dedicated bare metal servers, resources are available when you need them—not after the window of inspiration closes.
Transparent operations. See what’s running, where it lives, and what it costs. No black boxes, no hidden throttles.
AI-optimized architecture. Purpose-built for training and inference workloads rather than repurposed from general computing.
Scalable foundation. Start small with shared resources, grow into dedicated GPUs, and expand into bare metal—all on a single, consistent platform.
The result: infrastructure that accelerates development instead of constraining it.
Your Next Move in the Hyperspeed Era
You’ve got the models, the vision, the roadmap. The question is whether your infrastructure can keep pace.
Infrastructure has become the strategic differentiator between teams that ship and teams that wait. Between prototypes that stall and products that scale.
If you’re ready to move beyond infrastructure bottlenecks and into the Hyperspeed Compute Era, it’s time to explore solutions built for how AI actually works.
Explore QumulusAI Cloud, Cloud Pro, and Cloud Pure today and see how hyperspeed infrastructure can move your next build from “just deployed” to “already scaling.”
QumulusAI Secures $500M Non-Recourse Financing Facility through USD.AI to Accelerate AI Infrastructure Growth
Partnership with Permian Labs, the developer of the USD.AI protocol, unlocks blockchain-based credit markets for scalable GPU deployments
ATLANTA, GA / October 9, 2025 / QumulusAI, a provider of GPU-powered cloud infrastructure for artificial intelligence, today announced a $500 million non-recourse financing facility arranged by Permian Labs and distributed through the USD.AI Protocol.
The facility allows QumulusAI to finance up to 70% of approved GPU deployments with stablecoin liquidity from USD.AI’s blockchain-based credit market; a $10 million approved deployment, for example, could draw up to $7 million in credit. This structure offers faster access to capital than traditional alternatives such as bank loans or private credit, with flexible terms enabling a non-dilutive path to scale AI infrastructure.
QumulusAI’s facility reflects a broader shift in how compute infrastructure is financed. Global demand for AI infrastructure is projected to surpass $6.7 trillion by 2030, yet capital remains concentrated among a handful of giants such as OpenAI, Google, and Meta. USD.AI’s financing model opens new pathways for emerging operators like QumulusAI, linking real-world hardware directly to blockchain-based credit markets for accelerated, more transparent scaling.
Permian Labs developed the financing framework behind USD.AI, which treats GPUs as a financeable commodity. Permian Labs issues GPU Warehouse Receipt Tokens (GWRTs), and USD.AI serves as the on-chain DeFi protocol that enables those tokens to be used as collateral for borrowing stablecoin-based credit, unlocking capital for the next generation of AI builders.
This structure creates yield-bearing opportunities for on-chain depositors, while giving operators fast, transparent access to non-dilutive financing.
For QumulusAI, the $500 million facility signals institutional confidence in its infrastructure growth strategy and provides a repeatable model for scaling deployments with blockchain-native financing rails. For Permian Labs and USD.AI, it represents the continued expansion of real-world assets into on-chain credit markets, bridging institutional capital with income-generating compute infrastructure.
"This partnership represents a paradigm shift in AI infrastructure financing," said Mike Maniscalco, CEO of QumulusAI. "By leveraging Permian Labs' tokenization framework, we can scale faster and more flexibly – meeting the surge in AI compute demand without the constraints of legacy financing.”
"QumulusAI is exactly the type of innovative AI operator we built USD.AI to serve," said Conor Moore, Permian Labs Co-Founder and COO. "Their integrated approach to AI supercompute—combining HPC cloud, purpose-built data centers, and controlled power generation—fits seamlessly with our tokenized financing model, proving how blockchain can unlock institutional capital for real-world infrastructure."
About QumulusAI
QumulusAI is a vertically integrated AI infrastructure company focused on delivering a distributed AI cloud by innovating around power, data centers, and GPU-based cloud services. The company delivers immediate access to high-performance computing with enhanced cost control, reliability, and flexibility. Machine learning teams, AI startups, research institutions, and growing enterprises can now scale their AI training and inference workloads quickly and cost-effectively.
For more information, visit https://www.qumulusai.com
About Permian Labs
Permian Labs is the developer of USD.AI, building the infrastructure that connects institutional capital with real-world AI compute. Permian Labs designs the legal, financial, and technical systems that transform GPUs into collateral and make them accessible through blockchain-based credit markets. By bridging traditional asset finance with DeFi innovation, Permian Labs enables AI operators to scale efficiently while creating new opportunities for investors to access yield from real-world infrastructure.
About USD.AI
USD.AI is the world’s first blockchain-native credit market for GPU-backed infrastructure. The protocol turns AI hardware into tokenized collateral, unlocking financing markets with deep liquidity, attractive cost of capital and instant settlement for emerging AI operators who require capital to scale. Through its dual-token model, USDai (a stablecoin with deep liquidity) and sUSDai (its yield-bearing counterpart), USD.AI creates new liquidity pathways for operators while offering investors scalable, real-world yields. Developed by Permian Labs, USD.AI combines DeFi principles with institutional-grade securitization standards to accelerate the financing of AI infrastructure worldwide.
For more information on QumulusAI:
Press: media@qumulusai.com
Investors: investors@qumulusai.com
Follow QumulusAI on social media: https://www.linkedin.com/company/qumulusai
For more information on Permian Labs:
Email: hello@permianlabs.xyz
This press release contains certain “forward-looking statements” that are based on current expectations, forecasts and assumptions that involve risks and uncertainties, and on information available to QumulusAI as of the date hereof. QumulusAI’s actual results could differ materially from those stated or implied herein, due to risks and uncertainties associated with its business and/or the strategic partnership, which include, without limitation, integration challenges between the companies, market volatility and/or regulatory conditions. Forward-looking statements include statements regarding QumulusAI’s expectations, beliefs, intentions or strategies regarding the future, and can be identified by forward-looking words such as “anticipate,” “believe,” “could,” “continue,” “estimate,” “expect,” “intend,” “may,” “should,” “will” and “would” or words of similar import. Forward-looking statements include, without limitation, statements regarding future operating and financial results, QumulusAI’s plans, objectives, expectations and intentions, and other statements that are not historical facts. QumulusAI expressly disclaims any obligation or undertaking to disseminate any updates or revisions to any forward-looking statement contained in this press release to reflect any change in QumulusAI’s expectations with regard thereto or any change in events, conditions or circumstances on which any such statement is based in respect of its business, the strategic partnership or otherwise.
HyperFRAME Research: Why QumulusAI Is Built Different in the AI Compute Race
What does it actually take to compete in AI infrastructure when everyone claims to have "high-performance compute"? That's the question Steven Dickens, CEO and Principal Analyst at HyperFRAME Research, tackles in his recent analysis of QumulusAI. After sitting down with our leadership team, Dickens digs into the real differentiators that matter in a market where simply having GPUs isn't enough—it's about how you architect them, who you serve, and how quickly you can move.
The piece cuts through the noise around AI infrastructure providers by highlighting a fundamental truth: the hyperscalers are built for massive scale, but not every workload needs that. Training a large language model has completely different requirements than running real-time inference or handling sensitive regulated data. Dickens points out that QumulusAI's modular architecture lets us serve customers who fall between the cracks—companies that need something more tailored than what AWS or Google offer, but with the reliability and performance they'd expect from tier-one infrastructure. He draws comparisons to neocloud providers like CoreWeave and Lambda Labs, noting that success in this space comes down to being developer-friendly and operationally nimble in ways that larger incumbents simply can't be.
What really stands out in Dickens' analysis is his focus on our deployment strategy. While others are building gigawatt-scale data center campuses that take years to come online, QumulusAI is taking a different approach—deploying compute in smaller, distributed pockets that can reach customers in months instead of years. This matters because the AI infrastructure market is supply-constrained, and speed wins. Dickens recognizes that our ability to move fast and deploy flexibly across multiple geographies isn't just an operational advantage—it's a fundamental differentiator that lets us serve customers the hyperscalers can't reach quickly enough. Check out the video interview with CEO Michael Maniscalco below, and read the full HyperFRAME Research analysis for his complete take on our growth strategy.
SiliconANGLE: Neocloud Infrastructure and QumulusAI’s Compute Strategy
SiliconANGLE recently published an in-depth profile of QumulusAI and our approach to GPU-powered cloud infrastructure for artificial intelligence workloads. Written by Zeus Kerravala, principal analyst at ZK Research, the piece explores what sets neoclouds apart in today's AI infrastructure landscape and why enterprises are increasingly looking beyond traditional hyperscalers for their GPU compute needs.
The article digs into our full-stack ownership model—from power and data centers to GPU-accelerated cloud services—and how we're addressing the massive gap between AI compute supply and demand. CEO Michael Maniscalco shares our strategy of deploying smaller, modular compute clusters that can reach the market faster and more cost-effectively than gigawatt-scale data center campuses. While hyperscalers serve the OpenAIs of the world, there are millions of businesses that need flexible, affordable access to GPU infrastructure for model training, inference, and AI development. That's the opportunity we're going after.
Kerravala also highlights what makes our approach different: flexibility in architecture, transparent pricing, and a developer-friendly experience that abstracts away data center complexity. Whether you're a consulting firm scaling AI solutions for clients or a development team building the next generation of AI-powered applications, the goal is the same—getting reliable GPU compute into your hands quickly without the usual barriers. Read the complete article on SiliconANGLE for Maniscalco's full thoughts on the neocloud market, our partnership strategy, and where AI infrastructure is headed.
QumulusAI Appoints Former Applied Digital CTO Michael Maniscalco as CEO to Lead Growth in AI Infrastructure Market
CTO and CMO appointments round out team with the experience to bring enterprise-grade AI supercomputing infrastructure to market
ATLANTA, GA / September 25, 2025 / QumulusAI, a provider of GPU-powered cloud infrastructure for artificial intelligence, today announced the appointment of Michael Maniscalco as Chief Executive Officer to propel the company through a rapid growth phase.
Maniscalco, formerly CTO of Applied Digital, brings deep expertise in scaling high-performance computing platforms; under his leadership there, his team deployed 6,000 state-of-the-art GPUs in 12 months. At QumulusAI, he will drive expansion of the company’s differentiated approach of owning the full stack, from energy and data centers to GPU-accelerated cloud services, delivering cost-efficient, enterprise-grade AI infrastructure with the speed to move fast and the scale to grow with customers.
The company also announced two additional executive appointments: Ryan DiRocco as Chief Technology Officer and Stephen Hunton as Chief Marketing Officer. DiRocco was previously CTO at Performive LLC, a leading VMware-focused managed multicloud provider. In his new role, he will oversee QumulusAI’s technical strategy, ensuring products are secure, high-performing, and aligned with customer needs, while guiding clients’ smooth, cost-effective adoption of AI.
Hunton, who most recently served as Head of Global Social and Content Experience at IBM, adds global marketing expertise from Google, YouTube, and Chevrolet. In this role, Hunton will focus on establishing the brand as the category leader in AI infrastructure, driving market visibility, accelerating enterprise adoption, and building the momentum that will fuel long-term value for customers, partners, and investors.
The strengthened leadership team will focus on expanding market presence, accelerating product innovation, and building strategic partnerships as QumulusAI advances its mission to make enterprise-grade AI supercomputing more accessible.
“These appointments mark a pivotal inflection point for QumulusAI,” said Steve Gertz, Chairman of the Board. “AI adoption is accelerating across every industry, and the ability to deliver scalable, cost-efficient infrastructure has become a critical enabler. Michael, Ryan, and Stephen bring proven expertise in building technology platforms, scaling infrastructure, and creating global brands. This team has the vision and execution experience needed to establish QumulusAI as a premier AI infrastructure provider.”
“The demand for scalable AI infrastructure is one of the fastest-growing markets in tech,” said Steven Dickens, CEO & Principal Analyst at HyperFRAME Research. “QumulusAI’s model of controlling the full stack positions it to deliver performance and economics that many enterprises simply can’t get from hyperscalers. Adding Michael Maniscalco as CEO is a strong signal the company is ready to scale.”
View the original press release on ACCESS Newswire
Yotta 2025: Our Key Takeaways
Yotta 2025 brought together leaders from energy, data centers, hardware, and software. The central message was unmistakable: data centers are now critical infrastructure, and AI is accelerating demand at a pace unlike anything before. Analysts estimate $7 trillion will be invested in the next five years. Many compared this moment to the arrival of the steam engine — a turning point for how human work is organized and scaled.
1. Power Stole the Show
The defining bottleneck is no longer GPUs but electricity. Yotta 2025 often felt like a power conference. Attendees agreed that the only true competitive advantage today is speed to market. Not everyone will win, but those who act fast will be rewarded.
On-site cogeneration is emerging as the reality, from natural gas turbines to nuclear, geothermal, and microgrids. Cully Cavness of Crusoe Energy highlighted Oracle’s Abilene campus, where a 350-megawatt natural gas plant is bridging grid delays while replacing traditional diesel backup. The grid is not ready for the AI boom, and reliable behind-the-meter solutions will dominate the near term.
QumulusAI Take: We see power as inseparable from compute. That’s why our roadmap integrates natural gas generation with sub-50 MW facilities, letting us move quickly while staying modular. Rather than betting on single gigawatt campuses, we’re aligning with distributed builds that can be deployed near stranded or excess energy, tightening the loop between power and compute availability.
2. Speed Defines Competitiveness
The urgency of speed was felt everywhere. Modular power capacity, fast turbine sourcing, and flexible microgrids allow players to bypass interconnection queues and deliver capacity years faster. In this market, bold moves can dethrone incumbents overnight. Speed is not just an advantage, it is survival.
QumulusAI Take: Speed doesn’t just mean turbine procurement — it’s about bringing usable compute online. We leverage colocation partners with idle or excess capacity, which enables customers to scale into environments already built and powered. That accelerates delivery while we stand up additional purpose-built campuses.
3. Cooling Innovation Is Inevitable
Power density and heat go hand in hand. Cooling is becoming a limiting factor, sparking debate between liquid-to-chip, hybrid air and liquid, and immersion cooling. Advances in cold plate manufacturing, new conducting materials, and non-PFAS fluids hint at an innovation wave reminiscent of cooling ICBMs. We are still in the first inning, with huge opportunities for next-generation clean tech and climate tech.
4. Talent Is a Hidden Bottleneck
Amid all the focus on hardware and megawatts, one sobering reality came up repeatedly: we need people to build. Skilled labor is dwindling, and without a workforce that can deploy turbines, lay cables, and assemble racks, even the most advanced plans stall.
5. Navigating Uncertainty
To work in this space is to embrace risk. Technology is evolving almost as fast as it can be deployed. Within 12 months, we’ll see rack densities for AI factories surge from 130 kW to 600 kW with NVIDIA’s Rubin Ultra. Cooling, power, rack density, and chips all shift underfoot. Regional regulations, viral usage spikes, and divergent workloads make forecasting complex. Yet these hurdles create opportunities for faster iteration and collaboration. The industry is laying track as the train speeds forward.
6. Optimizing Inference
With inference workloads surging, optimization is becoming as critical as raw scale. OpenAI described multi-layered strategies:
Inference-side efficiencies like caching and smart routing to cut latency.
Model-side efficiencies like downsizing and unifying architectures.
Adaptive routing to direct queries to the right type of model.
Strict latency targets are now treated as non-negotiable. Every optimization is ultimately about user experience.
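Two of those inference-side ideas, caching and adaptive routing, are simple enough to sketch. The model names and the complexity heuristic below are invented for illustration; production systems use far more sophisticated semantic caching and learned routers:

```python
# Illustrative sketch of response caching plus adaptive routing.
# Model names and the "complexity" heuristic are invented for the example.
from functools import lru_cache

MODELS = {
    "small": "fast, cheap model for routine queries",
    "large": "slower, more capable model for hard queries",
}

def route(prompt: str) -> str:
    """Adaptive routing: send short, simple prompts to the small model."""
    looks_hard = len(prompt) > 200 or "step by step" in prompt.lower()
    return "large" if looks_hard else "small"

@lru_cache(maxsize=10_000)
def cached_answer(prompt: str) -> str:
    """Cache exact-match prompts so repeats never touch a GPU.

    Real systems go further (semantic and KV caching), but the effect is
    the same: lower latency and lower spend on repeated queries.
    """
    model = route(prompt)
    return f"[answer from {model} model]"  # stand-in for a real inference call

print(cached_answer("What's our refund policy?"))  # first call hits a model
print(cached_answer("What's our refund policy?"))  # repeat served from cache
```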
7. Push for Hardware Diversity
NVIDIA remains the market leader, leveraging its software ecosystem and performance to justify premium pricing. Competitors like AMD, Intel, and newer accelerator companies face difficulty matching NVIDIA’s scale, though some succeed in niche use cases.
At the same time, there is broad recognition that no single chip architecture can address all workloads. Blended environments that combine CPUs, GPUs, and accelerators are increasingly seen as the practical approach to balancing training, inference, and real-time applications.
8. Edge and Regional Growth
While gigawatt-scale data centers attract attention, smaller regional deployments are also gaining traction. Proximity to renewable energy sources and lower land costs are driving some operators away from traditional hubs. In certain regions, former industrial towns are being redeveloped as data center sites, creating new employment pipelines through retraining programs.
QumulusAI Take: We believe the future isn’t just in hyperscale hubs but in a mesh of regional deployments. Sub-50 MW campuses can align with regional grids, sit closer to enterprises, and adapt to new workloads faster than monolithic builds. This approach also unlocks optionality: combining our own infrastructure with colocation partnerships ensures flexibility while maintaining enterprise-grade reliability.
Yotta 2025 underscored how far the industry has come and how much uncertainty remains. Power shortages, cooling constraints, labor gaps, and shifting workloads are immediate challenges. Yet with trillions of dollars flowing into the sector, the momentum is clear. The companies that act quickly, adapt to uncertainty, and integrate land, power, and compute into unified strategies are most likely to shape the next decade of digital infrastructure.
Modular Designs Are the Starting Point for the Future of AI Infrastructure
“Data centers are evolving to become AI-optimized, modular, purpose-built ecosystems.” — Pipeline Magazine, June 2025
The recent piece from Pipeline makes a compelling case for modular data center design in the AI era. They highlight the rapid shift toward prefabricated builds, new cabinet geometries, high-density liquid cooling, and pre-integrated power systems—and how all of it is converging to meet the demands of AI.
We agree. That’s why QumulusAI’s latest facilities in Oklahoma and Texas are being built around the very modular design principles Pipeline describes.
But we also believe modularity alone won’t get us where we need to go.
What AI workloads require isn’t just faster construction or tighter thermal envelopes—it’s orchestration. The real barrier to AI isn’t just the time it takes to build. It’s aligning every layer of the stack: energy, power distribution, compute, cooling, and deployment timelines.
That’s where the QumulusAI approach builds on what Pipeline calls out.
We deploy modular designs—but we tie them directly to:
Behind-the-meter natural gas with fixed 10-year pricing to eliminate energy volatility
Real-time GPU inventory access for priority deployment of H200s and B200s
Cluster designs optimized around pulse-load behavior
Factory-tested cooling subsystems that drop in without delay
Immersion cooling built into the spec from day one, not retrofitted later
Modular construction builds the site. Integrated infrastructure gets it to revenue.
And that’s the part most headlines miss.
As the Pipeline article concludes, “deep collaboration across the supply chain” is the only way forward. At QumulusAI, we’ve taken that a step further: we’ve compressed the supply chain into a single delivery model—from molecules to models, from megawatts to machines.
Not Hyperscale. Hyperspeed.
There’s something awe-inspiring about a 500 MW data center. Until you remember how long it takes to build. The tech that goes in often changes faster than the permits clear. And by the time power comes online? The workloads it was designed for may be obsolete.
That’s the hyperscale dilemma: chasing AI growth with industrial-age momentum.
QumulusAI is built to move differently.
Forget Massive. Think Modular.
While the industry celebrates ever-larger campuses, we’re focused on sub-50 MW facilities deployed where they’re actually needed. These aren’t proof-of-concepts or pop-up sheds—they’re fully redundant, GPU-optimized data centers, designed from day one for AI performance and next-gen cooling.
By staying under the 50 MW threshold, we avoid years-long approval cycles. We co-locate with gas and fiber. And we activate faster than most teams can even negotiate a hyperscale contract.
The Cost of Overbuilding
What’s often left out of hyperscale headlines is the cost—not just in dollars, but in friction:
Communities face rising opposition: noise, water consumption, and grid strain have turned public sentiment.
Companies face lock-in: rigid contracts for compute that may no longer serve their evolving models.
And regulators are playing catch-up with energy realities that hyperscalers helped create.
Meanwhile, investors wait. Clients stall. Innovation slows.
AI Moves Fast. So Should Infrastructure.
We’re not anti-scale. We’re anti-lag.
QumulusAI is proving that scale doesn’t have to mean sprawl. By deploying purpose-built facilities faster, closer to where the demand lives, we give our clients access to compute without the drag. No twelve-month waitlist. No fifteen-year amortization gamble.
Just energy-efficient, AI-tuned, revenue-generating infrastructure—in months, not years.
From Molecules to Models
Our approach is vertically integrated: power gen, data center, compute. That means fewer intermediaries, more predictability, and complete control across the stack. It also means we can pass savings to clients and reinvest faster in the tech that matters.
This isn’t just about facilities. It’s about philosophy. QumulusAI believes infrastructure should evolve at the pace of innovation—not slow it down.
Public Backlash Against Data Centers Is Emerging. Here’s Our Plan.
Public pushback against data centers is rising—and not without reason. When massive mega- and giga-scale AI factories threaten to overwhelm local grids, or quietly shift infrastructure costs onto ratepayers, communities are right to demand better.
In New Jersey, electric rates jumped 20%, and lawmakers say hyperscale data centers are overloading infrastructure without covering the costs (NJ101.5).
In Pennsylvania, grid operators say surging AI and data center demand is tipping the balance—leaving power supplies potentially short under extreme summer conditions (WESA).
In Illinois, new legislation would require data centers to report energy and water use—aiming to uncover whether residents are unknowingly footing the bill for AI growth (Capitol News Illinois).
QumulusAI: Built for the Long Term
At QumulusAI, we’re building for the long term: a more strategic, more nimble, and more measured approach to AI infrastructure. Our plans align with local capacity, not against it, and sustain real growth.
Right-sized for the region, not oversized for the headline: We build nimble, sub-50 MW facilities designed to match local capacity—not overwhelm it.
Built with diverse, sustainable power—including behind-the-meter natural gas: Our model reduces grid stress, improves resiliency, and aligns with long-term environmental planning.
Live in months, not years: Our modular data centers deploy fast—without dragging down utilities or forcing costly upgrades on ratepayers.
In step with the communities we serve: We work directly with policymakers, utilities, and local leaders to align infrastructure growth with public interest—not just private demand.
Sustainable by design: Our energy-efficient clusters are optimized for AI workloads from day one—minimizing waste, maximizing performance, and staying accountable to the regions that host us.
The data center industry is at a crossroads. We can keep bulldozing through communities with oversized projects that privatize profits and socialize costs—or we can prove that AI infrastructure can actually strengthen the places that host it. The choice we make now will determine whether communities welcome the next wave of technology or fight it at every turn.