How PremAI hands GPU rental sourcing to Spheron and stays focused on building confidential AI
A long-term H200 rental at the best price in the market, negotiated through a vetted Tier 3+ data center partner and validated before PremAI committed. Spheron stays on as PremAI's GPU rental sourcing partner for future capacity.
About PremAI
PremAI is an applied AI research lab based in Lugano, Switzerland. The team builds private, sovereign AI for enterprises that cannot hand their data to a public LLM API. The product line covers three offerings: Fluso, a private AI workspace; a Confidential API for developers; and Sovereign AI for teams that need to run the stack inside their own perimeter. Across all three, the pitch is the same: customers keep full ownership of the models, the data, and the deployment environment.
The customer base sits in regulated industries: healthcare providers, financial institutions, government bodies, and RegTech teams in the EU. They all need the same thing: provable isolation of customer data, hardware-signed proofs of how it was processed, and an architecture that holds up under audit. PremAI's stack runs inference inside secure enclaves with post-quantum encryption. Data never touches a disk. Inference runs in volatile memory, and the data vanishes the moment it is done.
The challenge
PremAI needed H200 capacity for the engineering work behind their stack. H200 rentals are easy to find on the surface. Finding the right one is a different search: a rental that sits inside a Tier 3+ facility with the right power, cooling, network paths, and operational maturity, at a market-leading price, on long-term commitment terms that actually fit the workload. That offer rarely lands at the top of a Google search.
Doing that search in-house would mean running a sourcing project across multiple operators: comparing rental pricing across data centers, checking hardware lineage, negotiating commitment length and rate, and vetting operational quality before signing. Then, once the rental was live, PremAI would own the operational relationship with the data center directly any time an issue came up. None of that is the work PremAI's engineers should be doing. The team's job is the AI stack: the runtime, the privacy architecture, the customer-facing API. Comparison shopping for GPU rentals is overhead, and overhead at the wrong stage of a company kills momentum.
"GPU rentals are everywhere, but the right rental at the right price on long-term commitment terms is a search you do not want eating engineering hours. The first listings we found were never the ones we ended up wanting. We needed someone whose actual job it was to know the market."
Why PremAI chose Spheron
PremAI brought Spheron in as their GPU rental sourcing partner. Spheron already runs the work PremAI would otherwise have to build a team for: knowing which data center partners have H200 capacity available, knowing the going market rate, vetting the facility before recommending it, and negotiating long-term commitment terms that match the shape of the workload.
The H200 rental Spheron put in front of PremAI came from a vetted Tier 3+ data center partner at a price well below the on-demand rates the hyperscalers quote for the same hardware. Commitment terms were negotiated on PremAI's behalf, so the team got long-term pricing without sitting through a procurement cycle. Before PremAI committed, Spheron ran validation on the rental: power-on diagnostics, GPU stress tests, network checks, the usual sweep that catches early-life failures before they hit a production workload. PremAI went live on a rental that already worked, not on one that needed three rounds of support tickets to come up clean.
When something does come up on the rental, PremAI does not chase the data center directly. They go through Spheron's in-house help desk. A network issue, a hardware swap, a configuration tweak, a question about an incident: all of it goes to Spheron's engineers, who escalate to the data center on PremAI's behalf. The same engineers who vetted the partner facility are the ones picking up the message, so nothing gets lost in translation.
"Spheron came back with a specific H200 rental at a specific price, with long-term commitment terms already negotiated and the data center already vetted. That is the work we did not want to do ourselves. When an issue comes up on the rental, we message their team and they handle the data center side."
The setup
Spheron found and validated the H200 rental inside the Tier 3+ partner facility, negotiated long-term commitment terms on PremAI's behalf, and handed off a working rental with the configuration the deployment needed: bare metal where it matters, with networking and storage pre-configured for the workload.
When PremAI needs more GPU capacity, the request goes through Spheron. We work the market for them: which partner has the right rental available, what the current rate looks like, what commitment length and price PremAI should hold out for. The team gets long-term rental pricing without spending engineering time on comparison spreadsheets, and sourcing runs at the pace of the product roadmap instead of a quarterly procurement cycle.
The same help desk handles anything operational that comes up on the rental: a network issue, a hardware swap, a configuration change, a question about an incident, or additional spare capacity for a fine-tuning run. PremAI messages Spheron, Spheron talks to the data center, and the same engineers who vetted the partner facility see it through.
The outcome
PremAI ships confidential AI to regulated industries without spending engineering time on GPU rental sourcing. The H200 capacity is in place at a price that fits the unit economics, on commitment terms that match a long-term workload. The data center relationship is handled. When more capacity is needed, the same channel scales.
PremAI's engineers spend their time on the AI stack, not on rental comparison spreadsheets, contract negotiation, or data center tickets. Spheron stays on as the rental sourcing partner for whatever comes next, whether that is more H200 capacity, a different GPU class, or rentals in another region.
"We did not want to spend engineering time comparison-shopping GPU rentals. Spheron found us H200 capacity at the best price in the market, from a Tier 3+ partner they had already vetted, with long-term commitment terms negotiated on our behalf and the rental validated before we went live. When we need more capacity, or anything operational with the data center, we message their team directly. That lets us stay focused on the product."
Need GPU capacity sourced for you?
Tell us what you're building, the GPU type you need, and the region. We source it from one of our vetted data center partners and get you on-demand access with per-minute billing.