Artificial intelligence has become the central focus of digital transformation in businesses, but the technological reality reveals a concerning gap between expectations and the actual capacity of organizations to leverage it. A new report developed by Harvard Business Review Analytic Services in collaboration with Cloudera shows that most companies still lack a sufficiently mature data foundation to support large-scale AI projects.
The study, titled Taming the Complexity of AI Data Readiness, is based on a survey conducted with 231 professionals involved in data and AI decision-making within their organizations. Its findings point to a structural problem in the business tech ecosystem: although AI adoption is accelerating, the data infrastructure needed to sustain it remains inadequate.
Most companies are still unprepared for AI
The results make this disconnect clear: only 7% of organizations say their data is fully prepared for AI implementation.
The rest break down as follows:
- 15% believe their data is nearly ready
- 51% state that their data is partially ready
- 27% admit their data is unprepared or barely so
These numbers reflect a common situation in many organizations: AI experimentation advances rapidly, but the quality, governance, and availability of data do not keep pace.
The issue isn’t a lack of data. Modern companies generate enormous volumes of information from business systems, IoT sensors, financial transactions, customer interactions, and social media. The challenge lies in transforming this scattered data into assets usable by AI algorithms.
Data silos remain the biggest obstacle
One of the main problems identified in the study is data fragmentation within organizations.
56% of respondents cite data silos or the difficulty in integrating different sources of information as the primary barrier to preparing data for AI.
Additional relevant factors include:
- Lack of a clear data strategy (44%)
- Data quality issues or biases (41%)
- Regulatory restrictions on data use (34%)
In practice, this means many AI projects remain stuck in pilot or testing phases because companies cannot build robust, scalable data pipelines.
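To make the silo problem concrete, here is a minimal sketch of what "integrating different sources" means at the smallest scale: unifying records from two isolated systems on a shared key. The systems and field names (CRM, billing, customer_id) are illustrative assumptions, not taken from the report.

```python
# Hypothetical example: merging two data silos on a shared key.
# Field names (customer_id, name, balance) are illustrative only.

def merge_silos(crm_records, billing_records, key="customer_id"):
    """Combine records from two isolated systems into unified profiles."""
    merged = {}
    for rec in crm_records:
        merged.setdefault(rec[key], {}).update(rec)
    for rec in billing_records:
        merged.setdefault(rec[key], {}).update(rec)
    return list(merged.values())

crm = [{"customer_id": 1, "name": "Ana"}]
billing = [{"customer_id": 1, "balance": 250.0},
           {"customer_id": 2, "balance": 99.5}]

profiles = merge_silos(crm, billing)
# Customer 1 now has both a name and a balance in a single record.
```

Real pipelines add schema mapping, deduplication, and lineage tracking on top of this basic join, which is where most of the engineering effort described in the report goes.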
Furthermore, the report indicates that 73% of organizations believe they should prioritize data quality much more in their AI initiatives, highlighting a growing awareness of the problem.
Data governance becomes a strategic priority
As AI becomes integrated into critical business processes, data governance turns into a central element of technological strategies.
According to the study, only 23% of organizations currently have a defined data strategy for AI adoption, while 53% are in the process of developing one.
These strategies most often include:
- Sensitive data protection and privacy (59%)
- Data quality and consistency (46%)
- Governance and control of data lifecycle (41%)
Data management has thus become a crucial component to ensure AI systems are reliable, auditable, and compliant with regulatory requirements.
In sectors such as banking, healthcare, and manufacturing, where data is highly sensitive, these aspects are especially critical.
Cloud, hybrid, and edge: the new map of enterprise data
The report also provides an insightful snapshot of where data powering AI systems is stored and processed.
Currently:
- 51% of companies use the cloud as the primary data storage environment for AI
- 28% employ hybrid environments combining cloud and on-premises systems
- 11% manage their data exclusively on on-premises infrastructure
Additionally, 77% of organizations plan to increase cloud data storage over the next 12 months, confirming a trend toward more scalable and flexible infrastructure.
However, the report also highlights that many companies are adopting architectures where algorithms run where data resides instead of transferring large volumes of information between data centers or cloud platforms.
This approach responds to several technical and regulatory factors:
- Reducing latency
- Enhancing security
- Meeting data sovereignty regulations
- Lowering data transfer costs
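The "run algorithms where data resides" pattern above can be sketched in a few lines: each site computes a small local summary, and only those summaries (not the raw records) travel to a central point. The sites and values here are hypothetical, assumed for illustration.

```python
# Sketch of data-locality processing: compute where the data lives,
# ship only tiny summaries. Sites and values are made up for the example.

def local_summary(values):
    """Runs at the site holding the data; returns a small summary."""
    return {"count": len(values), "total": sum(values)}

def combine(summaries):
    """Runs centrally; merges per-site summaries into a global mean."""
    count = sum(s["count"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / count if count else 0.0

site_a = [10.0, 20.0, 30.0]  # stays in cloud region A
site_b = [40.0, 50.0]        # stays on-premises

global_mean = combine([local_summary(site_a), local_summary(site_b)])
# global_mean is 30.0: (10 + 20 + 30 + 40 + 50) / 5
```

Only two small dictionaries cross the network instead of five raw records, which is the mechanism behind the latency, cost, and sovereignty benefits the report lists.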
Agentic AI could accelerate data management
Another highly interesting aspect of the report is the rising interest in agentic artificial intelligence: systems capable of executing complex tasks autonomously within business processes.
The study indicates that:
- 65% of respondents believe many business processes will be augmented or replaced by agentic AI in the next two years
- 47% see this technology as a potential solution for improving data quality issues
In data management, agentic AI systems could automate tasks such as:
- Data cleaning
- Inconsistency detection
- Pipeline creation
- Quality supervision and data drift monitoring
This would significantly reduce manual effort in data preparation, one of the biggest costs in AI projects.
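One of the tasks listed above, data drift monitoring, can be illustrated with a simple statistical check: flag a feature whose mean in a new batch shifts by more than a chosen number of reference standard deviations. The threshold and the data are assumptions for the sketch; production systems use richer tests.

```python
import statistics

# Hedged sketch of drift monitoring: flag a feature whose new-batch mean
# moves more than `threshold` reference standard deviations. The 2.0
# threshold and the sample values are illustrative assumptions.

def detect_drift(reference, new_batch, threshold=2.0):
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    shift = abs(statistics.mean(new_batch) - ref_mean)
    return shift > threshold * ref_std

reference = [100, 102, 98, 101, 99, 100, 103, 97]  # historical feature values
stable = [101, 99, 100, 102]                        # no drift expected
drifted = [150, 148, 152, 149]                      # clear shift

detect_drift(reference, stable)   # False: mean barely moved
detect_drift(reference, drifted)  # True: mean shifted far beyond 2 std devs
```

An agentic system would wrap checks like this in a loop that watches incoming batches and triggers cleaning or alerting actions automatically, rather than waiting for a human to notice degraded model outputs.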
The real challenge of enterprise AI
The report concludes that AI is prompting a profound shift in how companies perceive their data.
For decades, data management was mainly viewed as an operational function or a technological cost. Today, it’s transforming into a key strategic asset for business competitiveness.
AI not only requires advanced algorithms but also modern data infrastructures, solid governance, and architectures capable of operating in hybrid and distributed environments.
In this context, the success of AI initiatives will depend less on the models used and more on organizations’ ability to build reliable, accessible, and well-governed data ecosystems.
Frequently Asked Questions about Business Data and Artificial Intelligence
Why is data quality critical for AI?
AI models learn from the data available. If that data contains errors, inconsistencies, or biases, the AI’s outputs will also be flawed. Therefore, data quality, governance, and traceability are essential elements in enterprise AI projects.
What does it mean for data to be “AI-ready”?
It means that data is properly structured, integrated, clean, accessible, and governed. It also involves reliable data pipelines, proper metadata, and control mechanisms that enable safe and scalable use of information in AI models.
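A small part of "AI-ready" can be made concrete with a validation sketch: before records feed a model, verify that required fields are present, non-null, and of the expected type. The schema below is a hypothetical example, not a standard.

```python
# Illustrative readiness check. The schema is an assumption for the
# example; real deployments validate against their own data contracts.

SCHEMA = {"customer_id": int, "signup_date": str, "monthly_spend": float}

def readiness_issues(records, schema=SCHEMA):
    """Return a human-readable list of problems found in the records."""
    issues = []
    for i, rec in enumerate(records):
        for field, expected in schema.items():
            value = rec.get(field)
            if value is None:
                issues.append(f"record {i}: missing {field}")
            elif not isinstance(value, expected):
                issues.append(f"record {i}: {field} has wrong type")
    return issues

rows = [{"customer_id": 1, "signup_date": "2024-01-05", "monthly_spend": 42.0},
        {"customer_id": 2, "signup_date": None, "monthly_spend": "n/a"}]

problems = readiness_issues(rows)
# The second record is flagged twice: a missing date and a non-numeric spend.
```

Checks like this are the simplest form of the "control mechanisms" the answer mentions; fuller readiness also covers integration, governance, and pipeline reliability.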
What role do hybrid architectures play in AI projects?
Hybrid architectures combine cloud infrastructure, on-premises data centers, and edge computing. This setup facilitates processing data close to where it’s generated or stored, reducing latency, improving security, and ensuring compliance with data sovereignty regulations.
How can agentic AI improve data management?
Agentic AI systems can automate complex data management tasks such as cleaning, classification, pipeline creation, and quality monitoring. This accelerates AI project development and reduces manual effort required for data preparation.