In a cloud market dominated by the major hyperscalers, Backblaze has decided to play an uncommon card: full transparency. The company has published its first Performance Stats report, covering Q3 2025: a set of synthetic performance tests comparing its storage service, Backblaze B2, with AWS S3, Cloudflare R2, and Wasabi Object Storage.
Meanwhile, it has also updated its classic Drive Stats report, revealing the actual failure rates of the hard drives powering its platform. Performance and reliability, all in the open, at a time when many companies are questioning how to balance cost, features, and risk in the cloud.
Cloud Performance Testing: A “Level Playing Field”
The Performance Stats tests were carried out using Warp, an open-source tool to measure S3-compatible storage performance.
The test scenario was a single, controlled setup:
- Ubuntu virtual machine hosted on Vultr, in the New York / New Jersey zone.
- Traffic routed to US-East type regions of each provider.
- Same network conditions and configuration for all.
- Upload and download tests with files of 256 KiB, 5 MiB, 50 MiB, and 100 MiB.
- Profiles of five minutes, in both multi-thread and single-thread modes, repeating tests for stable averages.
The stated goal is to provide a reproducible snapshot of how each service performs from a real client’s perspective, away from the internal, less transparent benchmarks that some industry members often publish.
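For readers who want to approximate this setup, the sketch below is a simplified, assumption-laden analogue of one such profile in Python with boto3: a fixed-duration, multi-threaded upload loop against an S3-compatible endpoint. The endpoint, bucket, object size, and thread count are placeholders, and the actual report used Warp rather than a script like this.

```python
# Minimal sketch of a fixed-duration, multi-threaded upload benchmark, loosely
# mirroring the five-minute profiles described above. Endpoint, bucket and
# thread count are illustrative assumptions; credentials come from the environment.
import os
import time
import threading

import boto3

ENDPOINT = "https://s3.us-east-005.backblazeb2.com"  # placeholder S3-compatible endpoint
BUCKET = "perf-test-bucket"                          # placeholder bucket
OBJ_SIZE = 5 * 1024 * 1024                           # 5 MiB, one of the tested sizes
DURATION_S = 300                                     # five-minute profile
THREADS = 16                                         # illustrative level of parallelism

s3 = boto3.client("s3", endpoint_url=ENDPOINT)       # boto3 clients are thread-safe
payload = os.urandom(OBJ_SIZE)
uploaded_bytes = 0
lock = threading.Lock()
deadline = time.monotonic() + DURATION_S


def uploader(worker_id: int) -> None:
    """Upload objects in a loop until the profile duration elapses."""
    global uploaded_bytes
    i = 0
    while time.monotonic() < deadline:
        s3.put_object(Bucket=BUCKET, Key=f"bench/{worker_id}/{i}", Body=payload)
        with lock:
            uploaded_bytes += OBJ_SIZE
        i += 1


workers = [threading.Thread(target=uploader, args=(t,)) for t in range(THREADS)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(f"Sustained upload throughput: {uploaded_bytes / (1024 * 1024) / DURATION_S:.2f} MiB/s")
```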
Uploads: Backblaze Excels with Small Files, AWS with Medium Sizes
The first set of tests measures the average upload time of a single file, in milliseconds. The lower the number, the better. No data is available for Wasabi due to platform restrictions during the first 30 days of a new account's life.
Average upload times (ms)
| Provider | 256 KiB | 2 MiB | 5 MiB |
|---|---|---|---|
| Backblaze B2 | 12.11 | 116.38 | 268.28 |
| AWS S3 | 28.73 | 76.79 | 201.40 |
| Cloudflare R2 | 16.02 | 123.25 | 311.48 |
| Wasabi | – | – | – |
Clear insights emerge:
- Backblaze B2 is fastest with small files (256 KiB), averaging just 12.11 ms.
- AWS S3 leads with 2 MiB and 5 MiB files, achieving 76.79 ms and 201.40 ms respectively.
- Cloudflare R2 consistently lags behind in this specific metric.
These numbers are particularly relevant for scenarios where individual latency is critical — e.g., interactive uploads from web applications — but they don’t tell the whole story when moving large volumes of data.
Multi-threaded Upload Throughput: Backblaze and Wasabi Lead in Continuous Loads
The report is especially valuable in its tests of sustained throughput: how many MiB per second each service can upload over five minutes of parallel uploads.
Here, the higher the figure, the better.
Multi-threaded upload throughput (MiB/s)
| Provider | 256 KiB | 5 MiB | 50 MiB | 100 MiB |
|---|---|---|---|---|
| Backblaze B2 | 163.80 | 573.30 | 1,738.90 | 2,554.50 |
| AWS S3 | 96.20 | 653.40 | 1,747.50 | 2,391.20 |
| Cloudflare R2 | 24.10 | 314.50 | 1,036.50 | 1,356.00 |
| Wasabi | 162.80 | 669.20 | 1,847.20 | 1,977.50 |
Key takeaways include:
- For very small files (256 KiB), Backblaze and Wasabi are nearly tied for the top spot, significantly outperforming AWS and especially Cloudflare.
- In medium sizes (5 MiB and 50 MiB), Wasabi stands out in throughput, with AWS and Backblaze close behind.
- At the 100 MiB level, Backblaze B2 takes the lead with 2,554.50 MiB/s, ahead of AWS and Wasabi.
- Cloudflare R2 generally stays at the bottom, especially penalized with small objects.
For real workloads — such as log ingestion, database backups, or uploading datasets for AI models — these figures are more indicative than upload time of a single file, as they reflect each service’s ability to leverage parallelization effectively.
Downloads: AWS Leads in Latency, Backblaze Excels in Sustained Downloads
Results for file download are more mixed:
- In average download times per file, AWS S3 leads across all sizes, with Backblaze B2 and Cloudflare R2 trailing behind.
- When measuring sustained multi-thread throughput, Backblaze B2 performs best for 256 KiB, 50 MiB, and 100 MiB files, while AWS dominates slightly at 5 MiB.
- In single-thread download tests, Wasabi surprises with the best result at 256 KiB, but Backblaze takes the top spot with 5 MiB, 50 MiB, and 100 MiB.
The report pays special attention to TTFB (Time to First Byte), useful for understanding initial latency. However, it’s important to remember that this metric alone doesn’t fully capture the user experience: network routes, connection reuse policies, and caches can all distort it when viewed in isolation.
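As a rough illustration of how such a figure can be sampled (not the report’s own harness), the snippet below times the arrival of the first body byte of a GET request. The URL is a placeholder, and a serious measurement would also need to control DNS resolution, TLS session reuse, and caching.

```python
# Rough TTFB probe: time from issuing the GET until the first byte of the
# response body is read. The URL is a placeholder for a public or presigned
# object URL; this is an illustration, not the report's measurement harness.
import time

import requests

URL = "https://example-bucket.s3.example-provider.com/test-object"  # placeholder

start = time.monotonic()
with requests.get(URL, stream=True, timeout=30) as resp:
    resp.raise_for_status()
    next(resp.iter_content(chunk_size=1))  # blocks until the first body byte arrives
    ttfb_ms = (time.monotonic() - start) * 1000.0

print(f"TTFB: {ttfb_ms:.1f} ms")
```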
Transparent Methodology and Recognized Limitations
Backblaze emphasizes that its tests are synthetic and reproducible, but not an exact replica of all production scenarios. Among acknowledged limitations are:
- Traffic generated always from a single region (NY/NJ to US-East regions).
- Possible caching effects on repeated reads, despite efforts to minimize them.
- Rate limiting or blacklisting policies by some providers during high traffic patterns, such as with Wasabi.
- No tests were conducted on other continents, and giants like Google Cloud and Azure were not included for now.
The added value of the report isn’t so much in providing a definitive “ranking,” but in opening the methodology and data, inviting third parties to replicate or challenge the results.
Use Cases: Where Backblaze Fits According to Its Own Numbers
Based on the benchmarks, Backblaze highlights several scenarios where its combination of reasonable latency and good throughput is particularly advantageous:
- AI and ML inference, where quick access to models and artifacts can reduce response latency.
- Feature stores and embedding searches, typical in vector database systems for recommendation engines and RAG, handling many small objects written once and read many times.
- RAG (retrieval-augmented generation) systems, storing and retrieving document snippets at high speed to feed language models.
- Log and event analytics (SIEM, IoT), where continuous writes and intensive reads coexist.
- Interactive data lakes and origin points for CDNs, where throughput consistency and no egress costs can be decisive.
The company underscores that “ideal” performance depends both on the provider and on application design: knowing that a service performs better with small or large objects allows data partitioning to maximize strengths.
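As a sketch of what that could look like in practice, small and large objects can simply be routed to different backends. The 50 MiB threshold, endpoints, and bucket names below are illustrative assumptions, not recommendations from the report.

```python
# Illustrative size-based partitioning: send small objects to one
# S3-compatible backend and large objects to another. Threshold, endpoints
# and bucket names are assumptions for the example only.
import boto3

small_client = boto3.client("s3", endpoint_url="https://s3.us-east-005.backblazeb2.com")  # placeholder
large_client = boto3.client("s3")                 # e.g. the default AWS S3 configuration
SIZE_THRESHOLD = 50 * 1024 * 1024                 # illustrative 50 MiB cut-off


def put_partitioned(key: str, data: bytes) -> None:
    """Route an object to a backend based on its size."""
    if len(data) < SIZE_THRESHOLD:
        small_client.put_object(Bucket="small-objects", Key=key, Body=data)
    else:
        large_client.put_object(Bucket="large-objects", Key=key, Body=data)
```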
Drive Stats Q3 2025: Disk Reliability in Numbers
Beyond network performance, Backblaze maintains its traditional Drive Stats report, capturing real-world behavior of the hard drives in its cloud. The Q3 2025 snapshot summarizes the fleet status as follows:
- Data drives in operation: 328,348
- Accumulated drive-days of operation in the quarter: 29,431,703
- Failures recorded in the quarter: 1,250
- Quarterly AFR (annualized failure rate): 1.55%
Looking further back, reference numbers indicate:
| Period | Drive-days | Failures | AFR |
|---|---|---|---|
| Q3 2025 | 29,431,703 | 1,250 | 1.55% |
| 2024 | 101,906,290 | 4,372 | 1.57% |
| Historical | 527,372,275 | 18,953 | 1.31% |
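These figures follow Backblaze’s usual annualization: failures divided by drive-days, scaled to a 365-day year and expressed as a percentage. A quick check in Python reproduces the table:

```python
def afr(failures: int, drive_days: int) -> float:
    """Annualized failure rate (AFR), in percent."""
    return failures / drive_days * 365 * 100

print(f"Q3 2025:    {afr(1_250, 29_431_703):.2f}%")    # 1.55%
print(f"2024:       {afr(4_372, 101_906_290):.2f}%")   # 1.57%
print(f"Historical: {afr(18_953, 527_372_275):.2f}%")  # 1.31%
```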
The historical failure rate remains remarkably steady at around 1.3%, despite fleet growth and increasing drive capacities. Notably, drives of 20 TB or more have grown steadily in number, now exceeding 67,000 units and representing nearly 21% of the fleet.
The report also mentions:
- A “zero failures” club for the quarter, with four specific models (including the Seagate ST8000NM000A 8 TB and the Toshiba MG11ACA24TE 24 TB).
- Three models with exceptionally high AFRs, over 5.88%, marked as outliers:
  - Seagate ST10000NM0086 (10 TB), with 7.97% and an average age exceeding 7 years.
  - Seagate ST14000NM0138 (14 TB), 6.86%, known for historically high failure rates.
  - Toshiba MG08ACA16TEY (16 TB), with 16.95%, partly due to firmware and maintenance work rather than mass mechanical failures.
How Backblaze Defines a “Disk Failure”
The report’s transparency includes how the company determines a disk has failed for statistical purposes, which is less straightforward than it seems.
The process operates across three tiers:
- SMART monitoring and read/write errors, using tools like smartmontools and an internal system called drive sentinel to track unrecoverable errors and other indicators (a simplified sketch of this tier follows the list).
- Podstats program, which periodically collects SMART attributes from each unit in every Storage Pod and generates centralized XML files.
- Data engineering layer, cross-referencing SMART info with replacement tickets in data centers. A disk is deemed failed if it disappears from reports and is replaced, or if it does not rejoin the pool after a reasonable period.
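As a simplified sketch of that first tier: the snippet below reads SMART attributes with smartctl and flags a few commonly watched failure indicators. The attribute IDs are chosen for illustration, not a list published by Backblaze, and the JSON layout assumes a recent smartmontools release with --json support.

```python
# Query SMART attributes with smartctl and flag commonly watched failure
# indicators. Attribute selection is an illustrative assumption.
import json
import subprocess

WATCHED_IDS = {5, 187, 197, 198}  # reallocated, reported uncorrectable, pending, offline uncorrectable


def smart_warnings(device: str) -> dict:
    """Return watched SMART attributes with a non-zero raw value."""
    out = subprocess.run(["smartctl", "--json", "-A", device],
                         capture_output=True, text=True, check=False)
    data = json.loads(out.stdout)
    table = data.get("ata_smart_attributes", {}).get("table", [])
    return {attr["name"]: attr["raw"]["value"]
            for attr in table
            if attr["id"] in WATCHED_IDS and attr["raw"]["value"] > 0}


if __name__ == "__main__":
    print(smart_warnings("/dev/sda"))
```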
This approach means some AFR spikes, like for Toshiba units undergoing firmware updates, are more “cosmetic” than reflecting actual physical degradation—a nuance explained by Backblaze to prevent misinterpretation.
Implications for Companies and Developers
Backblaze’s dual performance and reliability report sends a clear message: in the cloud, it’s not enough to look at cost per gigabyte or rely on a single latency or failure rate figure.
For data teams, infrastructure managers, or AI projects, these data points help to:
- More honestly compare Backblaze B2 with AWS, Cloudflare, or Wasabi — at least in a controlled scenario.
- Understand how each provider’s performance varies with file size and access patterns (single vs. multi-thread), critical for workloads like RAG, data lakes, or massive log ingestion.
- Assess the operational maturity of a provider through AFR figures and how it handles outliers and maintenance campaigns.
In an era where many organizations seek alternatives to hyperscalers for certain workloads, such data provides valuable context for more informed decisions.
Frequently Asked Questions
What are the practical differences between looking at average upload times and sustained throughput in the cloud?
Average upload time measures how long it takes to complete a single file, highly influenced by network latency and initial handshake. Sustained throughput indicates how many MiB per second can be transferred continuously over several minutes using parallel uploads. For backups, data ingestion, or AI training, throughput is often more representative than a single file’s latency.
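As a rough illustration using the figures above: at Backblaze’s 2,554.50 MiB/s of sustained multi-threaded throughput, moving a 100 GiB backup set (102,400 MiB) takes roughly 40 seconds of transfer time, something no single-file latency number would reveal.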
Why does Backblaze use MiB units instead of MB in its performance tests?
A MiB (mebibyte) is based on powers of two, equal to 1,048,576 bytes, whereas a MB (megabyte) is decimal, defined as 1,000,000 bytes. In storage and performance contexts, using MiB avoids ambiguities and provides more precise measurements—especially crucial when comparing speeds and file sizes in performance-sensitive environments.
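For example, the 100 MiB test file is 104,857,600 bytes (about 104.86 MB), and the 2,554.50 MiB/s upload figure above works out to roughly 2,679 MB/s in decimal units.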
How reliable is Backblaze as a storage provider according to Drive Stats Q3 2025?
With an annualized failure rate of 1.55% in Q3 2025, a 2024 AFR of 1.57%, and a historical average of 1.31%, Backblaze’s disk fleet demonstrates consistent reliability over time. The analysis of outliers and the explanations regarding temporary AFR spikes—like firmware updates—show active health monitoring and a structured replacement policy.
What workloads of AI and analytics are best suited for Backblaze B2 compared to other providers?
Performance Stats suggest Backblaze B2 excels in small object reads and writes and in sustained throughput for very large files. This makes it suitable for inference tasks, RAG systems, feature stores, log analytics, and content origins for CDNs. For medium-sized objects, providers like AWS or Wasabi might offer similar or better performance; the final choice should consider both access patterns and overall cost.
via: backblaze

