Not the Model, but Sovereignty: The New Line of Division for Artificial Intelligence in Banking

Not the Model, but a Matter of Sovereignty

For the past two years, conversations about generative AI have been driven by sheer horsepower: faster responses, better accuracy, larger context windows… Banking, however, quietly shifted this debate onto a different plane. Because in a bank, data is not an output; it is a legal responsibility. Customer information, transaction history, financial behavioral data… All are valuable only under the condition of auditability. That is precisely why generative AI functions not as “a new application,” but as a mirror that makes a bank’s risk appetite visible. In banking today, the critical question is not “Which LLM should we use?” but “Where and how will we run this AI?” This decision, which looks technical, is in fact a management decision that affects everything—from budget to governance, from scaling strategy to supplier risk.

The cloud brought speed; banking demanded sovereignty

The cloud was the first acceleration lane for generative AI: easy to try, fast for proofs of concept, elastically scalable. But in banking, the word “easy” has never been a decision driver on its own. Because regulation tells banks this: Data must be protected, traceable, and reversible when necessary. In Europe, DORA brings not only security controls but also critical third-party dependencies and concentration risk onto the board agenda; the AI Act introduces a tighter framework, especially for high-risk use cases in financial services. In Türkiye, BDDK regulations and KVKK practices convey the same message in a different tone: Control will remain with the bank.

That is why the trend we see in the field is not a fashionable debate like “cloud vs on-prem,” but a sovereignty-by-design approach. Global examples use different models, yet show the same reflex: running the LLM behind the bank’s firewall, in a way that leaves an audit trail. Whether it’s HSBC discussing a self-hosted approach in partnership with Mistral AI, or Wells Fargo building a privacy-first line in its Fargo assistant that manages millions of interactions “without PII going to the LLM,” the signal is the same: Productivity will increase, but auditability will not be compromised. It may look like a technology preference, but it is fundamentally a choice in enterprise risk management.

“If there is no auditability, productivity is not sustainable either.”

The on-premise shift is not explained by a single reason: three forces converge

At CBOT, we see one thing clearly as GenAI projects scale within banks: the on-premise/self-hosted approach rests on a foundation too rational to be dismissed as mere “caution.” Three forces meet at the same point.

The first is regulatory pressure and third-party risk. With DORA, supplier concentration is no longer an SLA discussion; it is an operational resilience issue. Banks ask not only “Where does the data reside?” but also “How critical is our model dependency?” Because generative AI is not merely a service call; it is a system that carries context, processes data, and influences decisions. That is why many institutions design the architecture “within a secure network” and restrict access based on roles; in some cases, this is precisely why deployments are limited to private networks accessible only to employees.

The second force is economic rationality. Token-based costs may look flexible at the outset; but in banking, GenAI usage is not “occasional.” Call-center summarization, document processing, internal knowledge assistants, risk and compliance workflows… These are engines that run 24/7. Usage at the intensity of Wells Fargo’s scale, which surpasses hundreds of millions of interactions, or of the knowledge assistants embedded into advisors’ daily workflows at institutions like Morgan Stanley, shows that GenAI is no longer an “experiment” but an “operation.” Once it becomes operational, the math changes: instead of unit cost, a predictable financial model that eliminates surprise costs becomes more valuable.
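How the math changes can be sketched with a simple break-even comparison between per-token (usage-based) billing and a fixed self-hosted budget. All figures below are illustrative assumptions for the sake of the sketch, not vendor pricing or benchmark data:

```python
# Hypothetical break-even sketch: usage-based (per-token) vs fixed (self-hosted)
# GenAI cost. Every number here is an assumption chosen for illustration only.

def monthly_cloud_cost(interactions: int, tokens_per_interaction: int,
                       usd_per_million_tokens: float) -> float:
    """Usage-based cost: grows linearly with interaction volume."""
    return interactions * tokens_per_interaction * usd_per_million_tokens / 1_000_000


def breakeven_interactions(fixed_monthly_cost: float, tokens_per_interaction: int,
                           usd_per_million_tokens: float) -> int:
    """Volume above which a fixed self-hosted budget beats per-token billing."""
    cost_per_interaction = tokens_per_interaction * usd_per_million_tokens / 1_000_000
    return int(fixed_monthly_cost / cost_per_interaction)


# Assumptions: 2,000 tokens per interaction, $10 per million tokens,
# $200,000/month fully loaded self-hosted cost (hardware amortization + ops).
be = breakeven_interactions(200_000, 2_000, 10.0)
print(f"Break-even: {be:,} interactions/month")  # 10,000,000
```

Under these assumed figures, the fixed model wins beyond roughly ten million interactions a month; the exact crossover point matters less than the shape of the curve: usage-based cost scales with volume, while the self-hosted budget is flat and therefore predictable.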

“Controlled cost is not only a budget advantage; it is decision speed.”

The third force is technological maturity. Two or three years ago, running an on-premise LLM looked like something only a few major players could do. Today, the open-source ecosystem, LLMOps practices, fine-tuning techniques, and hybrid patterns have matured substantially. Examples such as BloombergGPT, which demonstrate that specialized model approaches are possible in domains like finance, have weakened the notion that “this can only be done by renting frontier models.” In other words, the question is no longer “Can we do it?” but “What level of control do we want?”

Banks are not moving AI back; they are moving it to the center

Those who misread the on-premise debate say “banks are going backwards.” The opposite is true: Banks are taking GenAI from the periphery and bringing it closer to the core. In a world where fraud detection competes in milliseconds, latency is not just a performance metric; it is a financial-loss metric. In compliance and KYC processes, speed is not merely operational efficiency; it is the management of regulatory risk. In document processing and reporting, automation is not only time saved; it means a standardized audit trail. Deutsche Bank’s approach to shortening report-production times while increasing analytical depth, or Garanti BBVA making its Ugi digital assistant part of operations at scale, tells us one thing: In banking, the question “Where do we run it?” directly turns into “Which processes do we authorize it for?”

That is why we at CBOT treat architecture not as an IT decision but as a governance decision. The model’s location reveals the institution’s risk appetite. If generative AI is moving beyond a layer that improves customer experience and entering credit, risk, compliance, and operational processes, then sovereignty-by-design becomes not “nice to have,” but “must-have.”

Conclusion: The new dividing line is not the model, but ownership

In the period ahead, the divide in banking will not be between those who “use LLMs” and those who don’t. The divide will be between those who position AI as strategic infrastructure and those who use it as a tactical tool. Because data sovereignty is not only a security topic; it also means cost sovereignty, scaling sovereignty, and strategy sovereignty.

Whoever controls the data controls the cost.
Whoever controls the infrastructure controls scaling.
Whoever controls the model controls strategy.

And in 2026, banking’s clearest question becomes this:
Not whether we use AI, but whether we truly own it.