The landscape of enterprise Artificial Intelligence (AI) is shifting gears. Over the past few years, many companies have experimented with models, assistants, and advanced analytics in isolated environments. However, once AI moves into production, the debate shifts from “which model to use” to a far more complex issue: how to provide secure, reliable, and governed access to data where it already resides.
In this context, Cloudera is expanding its portfolio so that part of its inference and analytics stack can also run on-premises, within customers' own data centers. On February 9, 2026, the company announced that Cloudera AI Inference and Cloudera Data Warehouse with Trino will be available in local environments, along with improvements to Cloudera Data Visualization to “seamlessly connect” AI and analytics workflows across cloud, edge, and data centers.
This move aims to address a tension that has become evident in recent projects: moving sensitive data to the cloud to feed models can mean greater risk, more regulatory exposure, and often more operational friction. Cloudera's answer inverts the usual flow: bring AI to where the data is, rather than the data to the AI.
From “pilot” AI to “industrial” AI: governance and security as success conditions
Cloudera frames its announcement around a recurring theme in executive and data office discussions: when AI integrates into critical processes, the priority becomes governed access to information. The company references its report “The State of Enterprise AI and Data Architecture” and highlights that nearly half of organizations store data in a data warehouse, underscoring the importance of ensuring AI applications can query data without removing it from protected environments.
Practically, this promise is attractive for regulated sectors (finance, healthcare, industry, government): less data movement, fewer blind spots, and a more manageable risk surface. It also acknowledges a factor weighing ever more heavily on budgets: keeping costs sustainable as AI moves from experimentation to a stable, always-on service that consumes resources continuously.
Cloudera AI Inference on-premises: inference with NVIDIA stack inside the data center
The announcement positions Cloudera AI Inference as a solution for deploying and scaling models directly within the data center. The company emphasizes that this layer is powered by NVIDIA technology and can run “any AI model,” including open models such as NVIDIA Nemotron, covering use cases like large language models (LLMs), fraud detection, computer vision, and speech.
In terms of infrastructure and serving, Cloudera points to acceleration with the NVIDIA stack and the use of NVIDIA Blackwell GPUs, along with NVIDIA Dynamo-Triton (the inference server formerly known as Triton Inference Server) and NVIDIA NIM microservices for high-performance, scalable model serving.
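To make the serving model concrete: NIM microservices typically expose OpenAI-compatible HTTP endpoints, so application code written against that API can stay portable between cloud and on-premises deployments. Below is a minimal sketch using the open-source `openai` Python client; the endpoint URL, model identifier, and token are illustrative assumptions, not values from Cloudera's announcement.

```python
# Minimal sketch: calling an on-premises, OpenAI-compatible inference
# endpoint of the kind NIM microservices typically expose.
# The base_url, model name, and token are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://inference.example.internal/v1",  # hypothetical on-prem endpoint
    api_key="YOUR_SERVICE_TOKEN",                      # issued by your platform, not OpenAI
)

response = client.chat.completions.create(
    model="nvidia/nemotron-example",  # hypothetical model identifier
    messages=[{"role": "user", "content": "Summarize yesterday's fraud alerts."}],
    temperature=0.2,
)
print(response.choices[0].message.content)
```

Because the API surface is the same in both environments, switching from a cloud endpoint to a data-center one is, in principle, a configuration change rather than a rewrite.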
The rationale behind this stack is twofold: first, providing more predictable economics by avoiding the cost volatility some organizations associate with cloud inference; second, maintaining latency control, compliance, and privacy by keeping data and execution within the data center’s perimeter.
Data Warehouse with Trino in the data center: a layer to accelerate queries without losing control
The second key aspect of the announcement is Cloudera Data Warehouse with Trino, now also available for on-premises deployments. This approach is especially relevant for hybrid architectures: centralizing security, governance, and observability over the “data estate” (the complete set of corporate data) while speeding up access to insights.
Cloudera describes this evolution as a way to turn complex data into actionable results without compromising security, compliance, or operations. In an era where many companies operate across multiple clouds, edge environments, and legacy infrastructure, this promise addresses a real challenge: the costs of managing dispersed data, inconsistent policies, and limited visibility.
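To ground the query layer: Trino speaks standard SQL and has open-source clients for most languages, which is what makes the “query in place” model practical. A minimal sketch with the open-source `trino` Python client, assuming a hypothetical on-premises coordinator, catalog, and schema:

```python
import trino

# Connect to a hypothetical on-premises Trino coordinator.
# Host, catalog, and schema below are illustrative assumptions.
conn = trino.dbapi.connect(
    host="trino.example.internal",
    port=8080,
    user="analyst",
    catalog="iceberg",
    schema="sales",
)
cur = conn.cursor()
# The SQL runs where the data lives; results, not raw data, come back.
cur.execute("SELECT region, sum(amount) AS total FROM orders GROUP BY region")
for region, total in cur.fetchall():
    print(region, total)
```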
Data Visualization enhanced with AI features: automatic summaries and traceability
Alongside the inference and warehouse pieces, Cloudera introduces improvements to Cloudera Data Visualization aimed at simplifying “AI-driven” workflows and enriching analysis within the data center and beyond. Notable enhancements include:
- AI annotation: instant generation of summaries and contextual insights for charts and visualizations, removing the need for manual writing.
- More resilient AI functions: graceful handling of transient failures, plus usage analytics for monitoring and optimization.
- Query traceability and logging: each query records a message ID, timestamp, and the question asked, to ensure transparency and facilitate troubleshooting (a sketch of such a record follows this list).
- Simplified administration: easier role assignment through updated configuration parameters, streamlining scenarios based on SSO without hardcoded credentials or manual user promotion.
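To illustrate the traceability point above, here is a hypothetical sketch of what such a query-trace record could look like, built only from the fields the announcement names (message ID, timestamp, question); the shape and field names are assumptions, not Cloudera's actual schema.

```python
# Hypothetical query-trace record, modeled on the fields named in the
# announcement. Field names and structure are illustrative assumptions.
import json
import uuid
from datetime import datetime, timezone

trace_record = {
    "message_id": str(uuid.uuid4()),
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "question": "Which regions drove Q4 revenue growth?",
}
print(json.dumps(trace_record, indent=2))
```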
In a governed data environment, traceability becomes essential: a requirement for audits, compliance, and understanding how specific results are generated—particularly when AI is used in reporting or sensitive decision-making.
Statements: control and flexibility as core messages
Leo Brunnick, Cloudera’s Chief Product Officer, frames the announcement as a way to give customers more control and flexibility, enabling deployment of inference, Trino-based warehouse, and visualization where critical data resides, without sacrificing security or operational efficiency.
From NVIDIA, Pat Lee (Vice President of Strategic Enterprise Alliances) highlights the collaboration as a pathway to scale inference with Blackwell, Dynamo-Triton, and NIM, promising to combine control, predictable economics, and efficiency in the data center.
Next stop: DeveloperWeek and the focus on open lakehouse architecture
Cloudera is also expanding its presence at DeveloperWeek (February 18-20, 2026), where it plans to host a session on designing a cloud-native open lakehouse architecture with Apache Iceberg. The message is clear: the market increasingly favors open, portable architectures that combine governance, performance, and flexibility without locking organizations into a single deployment model.
Frequently Asked Questions
What does “AI inference on-premises” mean, and why is it relevant for companies with sensitive data?
It means running inference (model serving) inside the data center, keeping data and processing in controlled environments, reducing compliance risk, and avoiding moving sensitive information outside the corporate perimeter.
What does Cloudera Data Warehouse with Trino offer compared to a cloud-only approach?
According to Cloudera, it accelerates access to insights while maintaining centralized security, governance, and observability—especially useful when data and workloads are distributed across cloud, edge, and data centers.
What role do NVIDIA Blackwell, Triton, and NIM play in this announcement?
Cloudera states that its AI Inference leverages NVIDIA’s stack, including Blackwell GPUs, Dynamo-Triton Inference Server, and NIM microservices, to deliver high-performance, scalable model serving in enterprise environments.
Why is query traceability in AI visualization and analytics becoming critical?
Because it enables auditing of what was asked, when, and in what context—essential for compliance, transparency, and troubleshooting when AI is integrated into reporting or sensitive decision-making processes.

