Dell Reinforces Its AI Data Platform: More Performance for RAG, Vector Search, and Unstructured Data with Elastic, Starburst, and NVIDIA

Dell Technologies has announced new capabilities in its AI Data Platform, the data pillar of the Dell AI Factory ecosystem, aiming to help companies transform distributed and siloed data into measurable AI results. The proposal focuses on a critical point for any organization moving from pilots to production: separating storage and processing to eliminate bottlenecks, maintaining on-prem control, and providing flexibility for workloads that range from training and fine-tuning to RAG and inference.

The update revolves around four blocks: storage engines, data engines, integrated cyber-resilience, and data management services. The package aligns with the NVIDIA AI Data Platform reference design and relies on partnerships with Elastic and Starburst to accelerate semantic search, federated analytics, and agentic workflows.

Why it matters: from “data swamp” to real-time AI results

Enterprise AI adoption keeps running into a stubborn fact: data is scattered, arrives in many formats, and changes constantly. With its AI Data Platform, Dell proposes an open, modular base to unify pipelines, activate metadata, and avoid bottlenecks between storage layers, compute, and models. The declared goal is for data teams to connect NAS and S3 repositories seamlessly, orchestrate incremental ingestion to keep indices and vector stores up to date, and deploy GPU-accelerated hybrid (keyword + vector) search with on-prem control.

Storage engines: PowerScale and ObjectScale, tuned for AI

The platform’s storage engines, Dell PowerScale (NAS) and Dell ObjectScale (S3-native object storage), are the foundation for placing data where it performs best and moving it smoothly between stages (preparation, training, inference, RAG).

PowerScale: Parallel NAS for AI, validated at GPU scale

  • PowerScale offers multi-protocol access and parallel performance for training, fine-tuning, inference, and RAG pipelines.
  • With the new integration of NVIDIA GB200 and GB300 NVL72 and software updates, Dell promises consistent performance, simplified management at scale, and compatibility with common AI applications and stacks.
  • The PowerScale F710 has achieved NVIDIA Cloud Partner (NCP) certification for high-performance storage. Dell states that the system scales to more than 16,000 GPUs with up to 5× less rack space, 88% fewer network switches, and up to 72% less energy consumption than comparable alternatives.

ObjectScale: S3 object with acceleration and improvements for small objects

  • ObjectScale, which Dell touts as “the highest-performance object platform,” provides scalable, S3-native object storage for massive AI workloads. It can be deployed as an appliance or as a new software-defined option on Dell PowerEdge servers that, according to Dell, is up to 8 times faster than the previous all-flash object generation.
  • Key advancements:
    • S3 over RDMA (tech preview in December 2025): up to 230% more throughput, 80% less latency, and 98% less CPU usage than traditional S3.
    • Performance/efficiency improvement in small objects (10 KB): up to 19% more throughput and up to 18% less latency in large deployments.
    • Deeper integration with AWS S3 and bucket-level compression to reduce data surface area and storage/movement costs.
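Because ObjectScale preserves standard S3 semantics, existing S3 tooling should keep working unchanged. As a hedged illustration (the endpoint URL, bucket, and credentials below are invented placeholders, not Dell-documented values), pointing the stock boto3 client at an S3-compatible endpoint might look like this; the RDMA transport sits below the S3 API, so no client-side code change is implied:

```python
# Illustrative sketch: standard AWS SDK (boto3) against an S3-compatible
# endpoint. Endpoint URL, bucket, keys, and object names are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectscale.example.internal",  # placeholder endpoint
    aws_access_key_id="ACCESS_KEY",        # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)

# Write one document into a hypothetical RAG corpus bucket, then read it back.
s3.put_object(Bucket="rag-corpus", Key="docs/manual-001.txt", Body=b"...")
resp = s3.get_object(Bucket="rag-corpus", Key="docs/manual-001.txt")
print(resp["Body"].read())
```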

The combination of PowerScale and ObjectScale lets data teams choose the appropriate layer for each stage (for example, NAS for training scratch space and object storage for data lakes and RAG repositories) while maintaining common policies for security, telemetry, and cyber-resilience.

Data engines: semantic search with Elastic and federated analytics with Starburst

Beyond storage, Dell expands its data engines: specialized tools to organize, query, and activate information in AI workflows.

Data Search Engine (with Elastic): RAG and semantic search at scale

Developed in collaboration with Elastic, the Data Search Engine targets RAG, semantic search, and generative pipelines. It integrates with MetadataIQ to discover and catalog data, and can search billions of files across PowerScale and ObjectScale using granular metadata. Operational advantages include:

  • Incremental ingestion: only changed files are reprocessed, saving compute and keeping vector databases current.
  • Familiar SDKs: developers can build RAG apps in LangChain or other frameworks (see the sketch after this list).
  • Governance: anchors search in IT-managed, controlled repositories, with integrated auditing and security.
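As a rough sketch of what such a pipeline could look like, the snippet below indexes changed files into an Elasticsearch-backed vector store and runs the retrieval step of a RAG app with LangChain. Endpoints, index names, the embedding model, and the document list are illustrative assumptions, not part of Dell’s or Elastic’s documented integration:

```python
# Hypothetical sketch: incremental indexing into an Elasticsearch vector
# store plus RAG retrieval. All names and endpoints are placeholders.
from langchain_elasticsearch import ElasticsearchStore
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"  # assumed model
)

store = ElasticsearchStore(
    es_url="https://search.example.internal:9200",  # placeholder endpoint
    index_name="rag-docs",
    embedding=embeddings,
)

# Incremental ingestion: re-embed only files a metadata catalog flags as
# changed (MetadataIQ plays that cataloging role in Dell's description).
changed_docs = [
    {"path": "/ifs/docs/manual-001.txt", "text": "Updated maintenance steps..."},
]
store.add_texts(
    texts=[d["text"] for d in changed_docs],
    metadatas=[{"path": d["path"]} for d in changed_docs],
)

# Retrieval step of a RAG pipeline: fetch the top-5 relevant chunks.
retriever = store.as_retriever(search_kwargs={"k": 5})
docs = retriever.invoke("How do I rotate the telemetry encryption keys?")
print([d.metadata["path"] for d in docs])
```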

Data Analytics Engine (with Starburst): from federation to “agentic SQL”

With Starburst, Dell strengthens its Data Analytics Engine to enable querying without data movement across spreadsheets, databases, cloud warehouses, and lakehouses. Innovations include:

  • Agentic Layer: an LLM-powered layer that automatically documents data, extracts insights, and embeds AI into SQL workflows within seconds.
  • Unified access to vector stores (Iceberg, the Data Search Engine, PostgreSQL + PGVector, etc.) to facilitate RAG and search from a single gateway.
  • Model monitoring and governance at enterprise level: tracking, auditing, and controlling AI use.
  • MCP Server for Data Analytics Engine (February 2026): enabling multi-agent development and AI applications on the analytics layer.
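Starburst’s engine is built on open-source Trino, so one plausible access path is the standard Trino client. The sketch below is an assumption about what a federated query could look like (host, catalogs, schemas, and tables are invented for illustration), joining a lakehouse table with a PostgreSQL table in a single SQL statement:

```python
# Hedged sketch: federated SQL via the Trino Python client. Host, catalog,
# schema, and table names are illustrative placeholders.
import trino

conn = trino.dbapi.connect(
    host="analytics.example.internal",  # placeholder coordinator endpoint
    port=443,
    user="data-engineer",
    http_scheme="https",
    catalog="lakehouse",
    schema="sales",
)

cur = conn.cursor()
# A single statement spans two systems: the lakehouse and PostgreSQL.
cur.execute("""
    SELECT c.region, SUM(o.amount) AS revenue
    FROM lakehouse.sales.orders AS o
    JOIN postgres.crm.customers AS c ON o.customer_id = c.id
    GROUP BY c.region
    ORDER BY revenue DESC
""")
for region, revenue in cur.fetchall():
    print(region, revenue)
```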

GPU-accelerated hybrid search with NVIDIA cuVS

The platform will include integration with NVIDIA cuVS for GPU-accelerated vector and hybrid (keyword + vector) search. The promise is next-generation vector-search performance with turnkey deployment: a ready-to-operate environment for IT, on-premises and at scale, leveraging Dell’s secure infrastructure. For teams currently juggling multiple search and vectorization stacks, the benefit lies in consolidation and standardization without sacrificing capabilities.
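For orientation, the open-source cuVS library already exposes GPU-accelerated nearest-neighbor search in Python. The minimal sketch below uses that standalone API (the CAGRA graph index; shapes and parameters are arbitrary) and is not the turnkey integration Dell describes:

```python
# Minimal sketch of GPU vector search with NVIDIA cuVS (CAGRA index).
# Dataset size and embedding dimension are arbitrary illustration values.
import cupy as cp
from cuvs.neighbors import cagra

# 100k vectors of dimension 384 (e.g. sentence-embedding output), on GPU.
dataset = cp.random.random((100_000, 384), dtype=cp.float32)
queries = cp.random.random((8, 384), dtype=cp.float32)

# Build a CAGRA graph index, then run a batched top-5 neighbor search.
index = cagra.build(cagra.IndexParams(metric="sqeuclidean"), dataset)
distances, neighbors = cagra.search(cagra.SearchParams(), index, queries, k=5)

print(cp.asarray(neighbors)[:2])  # neighbor ids for the first two queries
```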

What industry players say: from “fragmented data” to “production-ready platforms”

From Dell, Arthur Lewis (ISG) frames the announcement around a cross-cutting need: simplifying data complexity, unifying pipelines, and delivering AI-ready data at scale, with practical use cases ranging from real-time diagnostics in healthcare to predictive maintenance in industry. At NVIDIA, Justin Boitano emphasizes that the platform pushes a new generation of intelligent storage capable of interpreting the semantics of data. Elastic and Starburst highlight, respectively, gains in search/discovery and in analytic federation, addressing the typical bottleneck of data spread across too many places. Meanwhile, Maya HTT points to real-time telemetry and AI efficiency in fields like aerospace and maritime, leveraging PowerScale and NVIDIA infrastructure.

Availability: milestone-based schedule

  • PowerScale with NVIDIA GB200 / GB300 NVL72 integration and NCP validation: available.
  • ObjectScale S3 over RDMA: Tech Preview in December 2025.
  • ObjectScale software updates: December 2025.
  • Data Analytics Engine Agentic Layer: February 2026.
  • MCP Server for Data Analytics Engine: February 2026.
  • Data Search Engine in Dell AI Data Platform: first half of 2026.
  • NVIDIA cuVS integration: first half of 2026.

Key questions for the CIO: five considerations before deployment

  1. Where are my data and what SLA does each use case require?
    Map NAS, S3, relational/NoSQL databases, spreadsheets, warehouses, and lakehouses to decide placement (PowerScale vs. ObjectScale) and movement (what, when, how).
  2. How do I keep indices and vector DBs fresh without burning compute?
    Enable incremental ingestion and metadata (via MetadataIQ and/or catalogs) to recalculate only what’s needed for RAG and search.
  3. Which search and vectorization stack should I consolidate?
    Assess hybrid search with cuVS and unified vector access (Iceberg, PGVector, etc.) to reduce fragmentation.
  4. What guardrails (security and compliance) do I need?
    Define perimeters, encryption, auditing, and retention; leverage built-in cyber-resilience and model monitoring to prevent ungoverned AI use.
  5. How can I measure ROI for RAG and semantic search?
    Use KPIs like response latency, hit rate, cost per query, hours saved in support, and accuracy against ground truth to prioritize investments.
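As a toy illustration of that last point, the KPI arithmetic itself is simple; every number below is an invented placeholder, not a benchmark:

```python
# Toy ROI arithmetic for a RAG deployment. All figures are invented
# placeholders for illustration only.
monthly_cost_usd = 12_000      # assumed infra + licensing
queries_per_month = 400_000    # assumed traffic
relevant_hits = 352_000        # queries whose retrieved context held the answer
hours_saved_per_query = 0.05   # assumed support time saved (3 minutes)
loaded_hourly_rate_usd = 60    # assumed cost of an engineer-hour

cost_per_query = monthly_cost_usd / queries_per_month
hit_rate = relevant_hits / queries_per_month
monthly_value_usd = queries_per_month * hours_saved_per_query * loaded_hourly_rate_usd

print(f"cost per query: ${cost_per_query:.4f}")
print(f"hit rate:       {hit_rate:.1%}")
print(f"value/cost:     {monthly_value_usd / monthly_cost_usd:.1f}x")
```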

Critical perspective: promised performance vs. operational maturity

The figures accompanying the announcement (230% more throughput and 80% less latency for S3 over RDMA, up to 8× for software-defined ObjectScale, and the PowerScale F710 scaling to more than 16,000 GPUs with significant reductions in space, networking, and energy) paint a very attractive scenario for data-hungry AI workloads. The key will be bringing that performance into daily operations: governance, SLAs for ingestion and indexing, the total cost of vectorization at scale, and end-to-end observability to keep the ‘lake’ from becoming a swamp.

Simultaneously, the push toward agentic workflows on SQL and governed semantic search suggests Dell aims to bring AI capabilities closer to data teams, not just MLOps. If the schedule holds and integrations with Elastic/Starburst/cuVS are production-ready, the platform could save months in implementations that currently get stuck adapting tools and interfaces.


Frequently Asked Questions

What differentiates Dell AI Data Platform from a traditional data lake?
It’s not just storage: it combines data engines (semantic search, federated analytics, model governance) with storage engines (parallel NAS and native S3 object) and cyber-resilience. It explicitly separates data and processing to prevent bottlenecks in training, RAG, and inference.

What benefits does S3 over RDMA in ObjectScale bring for AI workloads?
The tech preview, due in December 2025, targets higher throughput and lower latency and CPU usage by transporting S3 traffic over RDMA. In scenarios with intensive inference, heavy re-indexing, or large-scale RAG pipelines, it can reduce costs and help meet tighter SLAs.

How does NVIDIA cuVS fit into the platform?
It enables vector search and hybrid (keyword + vector) search accelerated by GPU in on-premises environments, with turnkey deployment. For teams managing multiple search stacks, it helps standardize and improve performance without losing control.

When will new features be available and what does their adoption require?
Integration of PowerScale + GB200/GB300 validated by NCP is already available. ObjectScale S3 over RDMA and related updates are expected in December 2025. The Agentic Layer and MCP Server for the Data Analytics Engine will launch in February 2026. The Data Search Engine and cuVS integration are planned for the first half of 2026. Adoption will involve coordinating network, security, metadata, and existing pipelines.

via: dell
