
Who MinIO AIStor is for
AI/ML teams replacing cloud object storage for training data
Teams training large models on terabytes of image, text, or multimodal data can run MinIO on on-premises GPU servers, keeping training data physically close to compute and eliminating per-GB egress fees. PyTorch and TensorFlow data loaders that already read from S3 work against MinIO's S3-compatible endpoint once the endpoint URL is changed.
Skip if:
Your training data is already in AWS S3 and your compute runs in the same AWS region. If data and compute are co-located in AWS, S3 is simpler to operate and egress costs are minimal for in-region transfers.
Platform engineers building a hybrid cloud data lakehouse
MinIO's support for Apache Iceberg catalogs and Delta Sharing makes it a storage layer for hybrid architectures where on-premises data needs to interoperate with cloud analytics services like Athena, BigQuery, or Databricks. A single MinIO namespace can span edge, on-premises data center, and cloud tiers under one S3-compatible endpoint.
Skip if:
Your entire data stack lives within a single cloud provider and you have no regulatory or cost reason to run on-premises. Managed lakehouse services like AWS Lake Formation carry less operational overhead in that scenario.
Infrastructure teams migrating Hadoop HDFS clusters
Organizations decommissioning HDFS clusters commonly migrate to MinIO because it supports the same massive dataset sizes with a modern S3 API. Spark and Hive connect directly to MinIO, so existing job code needs minimal changes. The migration eliminates the Hadoop operational burden without forcing a move to cloud-only storage.
Skip if:
Your Hadoop workloads depend on HDFS-specific features like federation or custom ACLs that have no S3 equivalent. Map the migration surface carefully before committing, as feature gaps can require significant job rewrites.
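As a rough illustration of what "minimal changes" can look like for Spark, the sketch below builds the Hadoop S3A settings that point a job at a MinIO endpoint. It assumes the `hadoop-aws` S3A connector is on the Spark classpath; the helper names, the endpoint, and the credentials are placeholders for illustration, not values from MinIO's documentation.

```python
"""Sketch: Spark settings for reading s3a:// paths from a MinIO endpoint."""


def s3a_spark_conf(endpoint, access_key, secret_key):
    """Return Spark conf entries for the Hadoop S3A connector.

    The helper itself is hypothetical; the property names are standard
    S3A connector settings.
    """
    use_tls = endpoint.startswith("https://")
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Path-style URLs avoid needing a DNS name per bucket.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_tls).lower(),
    }


def as_submit_args(conf):
    """Flatten the settings into spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

With settings like these in place, job code typically only swaps `hdfs://` paths for `s3a://` paths; anything relying on HDFS-specific behavior still needs the review described above.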
Compliance-sensitive enterprises needing on-premises object storage
Financial services, healthcare, and government teams that cannot store sensitive data in a public cloud can run MinIO entirely on-premises with encryption at rest and in transit, IAM-based access control, and immutable object locks for regulatory retention requirements like WORM compliance.
Skip if:
Your compliance requirements do not prohibit cloud storage. If your legal and security teams approve AWS GovCloud or Azure Government, managed cloud storage carries less operational overhead than a self-hosted cluster.
The problem it solves
Cloud object storage works for small workloads, but teams training large AI models or running petabyte-scale analytics hit two walls fast: cost and control. AWS S3, Azure Blob Storage, and Google Cloud Storage charge per GB stored plus egress fees, and those costs compound quickly when pipelines move terabytes daily. Control over where data physically lives is also limited to the regions and facilities the provider offers, which matters for compliance, air-gapped deployments, or organizations that cannot ship sensitive training data to a third-party cloud.
Most self-hosted storage alternatives were designed for general file storage, not the throughput demands of AI training or the S3 API compatibility that all modern ML tooling expects. Teams end up either paying the cloud tax or maintaining fragile, non-standard storage systems that break their data pipelines.
How it solves it
S3-Compatible API with S3 Express One Zone support
Implements the full Amazon S3 API, including the newer S3 Express One Zone protocol. Any AWS SDK, CLI tool, or ML framework that works with S3 works with MinIO once its endpoint URL and credentials point at the MinIO server. boto3, the AWS CLI, PyTorch data loaders, and Spark connectors need no other code modifications.
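A minimal sketch of the endpoint change with boto3 follows. The helper names are hypothetical, and the endpoint and `minioadmin` credentials are the local defaults used in the install commands on this page, assumed here only for illustration.

```python
"""Sketch: point an existing boto3 S3 client at a MinIO endpoint."""


def minio_client_kwargs(endpoint_url, access_key, secret_key, region="us-east-1"):
    """Return keyword arguments for boto3.client("s3", ...).

    Compared with a client that talks to AWS S3, the only structural
    difference is the explicit endpoint_url.
    """
    return {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,
    }


def make_client(**kwargs):
    """Create the S3 client (boto3 is imported lazily, only when called)."""
    import boto3
    from botocore.config import Config

    # Path-style addressing (http://host:9000/bucket/key) avoids needing
    # a DNS name per bucket on a self-hosted endpoint.
    config = Config(s3={"addressing_style": "path"})
    return boto3.client("s3", config=config, **kwargs)
```

Against a running server, `make_client(**minio_client_kwargs("http://localhost:9000", "minioadmin", "minioadmin")).list_buckets()` should behave the same as it would against AWS S3; the rest of the application code stays untouched.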
Benchmarked throughput above 2.2 TiB/s
Delivers over 2.2 TiB/s of read/write throughput on NVMe storage in published benchmarks. MinIO is purpose-built for AI training pipelines, where storage throughput directly determines how fast a model iterates, rather than adapted from a file server or general-purpose NAS.
Native AI framework and lakehouse integration
Ships with direct integration for PyTorch, TensorFlow, Apache Spark, and Apache Iceberg catalogs. Also supports Delta Sharing protocol, making MinIO a drop-in storage layer for hybrid lakehouses without additional middleware or data connectors.
Erasure coding with bit-rot protection
Stores data using Reed-Solomon erasure coding. At the maximum parity setting, an erasure set can lose up to half of its drives without data loss; lower parity settings tolerate fewer failures in exchange for more usable capacity. Continuous bit-rot protection verifies data integrity on disk during idle periods, catching silent corruption before it affects reads.
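The capacity and fault-tolerance trade-off can be worked through with a few lines of arithmetic. The sketch below is an illustration, not MinIO code: the function name is hypothetical, and the quorum rule (reads need as many drives as there are data shards, writes need one more when parity is exactly half the set) reflects a reading of MinIO's erasure coding documentation and is worth confirming for your version.

```python
"""Sketch: usable capacity and drive-failure tolerance for one erasure set."""


def erasure_set_profile(drives, parity):
    """Summarize an erasure set with `drives` drives and `parity` parity shards.

    Assumption: parity may not exceed half the drives in the set.
    """
    if drives < 2 or not 0 < parity <= drives // 2:
        raise ValueError("parity must be between 1 and half the drive count")

    data = drives - parity
    read_quorum = data
    # Assumed rule: at exactly 50% parity, writes need one extra drive.
    write_quorum = data + 1 if parity * 2 == drives else data

    return {
        "usable_fraction": data / drives,
        "read_tolerance": drives - read_quorum,
        "write_tolerance": drives - write_quorum,
    }
```

For a 16-drive set, `erasure_set_profile(16, 8)` gives a usable fraction of 0.5 with reads surviving 8 lost drives and writes surviving 7, while `erasure_set_profile(16, 4)` gives 0.75 usable with both reads and writes surviving 4.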
Multi-site replication and immutable object locks
Supports active-active replication across multiple data centers or cloud regions, with object versioning, immutability locks, and lifecycle policies. Immutable locks meet regulatory retention requirements in financial services, healthcare, and government environments.
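For the retention case, a hedged sketch of writing an object under an immutability lock through the S3 API is shown below. The helper name and its arguments are hypothetical; `ObjectLockMode` and `ObjectLockRetainUntilDate` are standard `put_object` parameters in boto3, and the target bucket is assumed to have been created with object locking enabled.

```python
"""Sketch: build put_object arguments that apply a retention lock."""
from datetime import datetime, timedelta, timezone


def locked_put_kwargs(bucket, key, body, retain_days, mode="COMPLIANCE", now=None):
    """Return kwargs for s3.put_object(...) with a retain-until date.

    `now` is injectable so the retention date is easy to verify.
    """
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError("mode must be COMPLIANCE or GOVERNANCE")
    if retain_days <= 0:
        raise ValueError("retain_days must be positive")

    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": mode,
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
    }
```

With an S3 client already pointed at the cluster, `s3.put_object(**locked_put_kwargs("audit-logs", "2025/01/report.json", b"{}", retain_days=2555))` would store the object so that the version cannot be deleted or overwritten before the retention date; the bucket and key names here are placeholders.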
Software-defined on commodity hardware
Runs on standard x86 servers with NVMe, SSD, or HDD drives. No proprietary appliances or vendor hardware required. Organizations can deploy on existing data center hardware or validated reference architectures, which lowers total cost of ownership compared to dedicated storage appliances.
Strengths and trade-offs
Strengths
- Drop-in S3 compatibility with no code changes: MinIO implements the S3 API completely, meaning existing AWS SDK code runs against it with only an endpoint URL change. Teams migrating from AWS S3 or building hybrid environments can use MinIO alongside cloud S3 without maintaining separate code paths for on-premises and cloud.
- Concrete throughput benchmarks for AI workloads: Unlike most self-hosted storage options, MinIO publishes benchmark results (2.2 TiB/s on NVMe). It is designed from the ground up for AI/ML throughput demands, not adapted from a general-purpose NAS or distributed file system.
- No egress fees on self-hosted deployments: Moving data out of AWS S3 to the internet costs roughly $0.09/GB at list price. On a self-hosted MinIO cluster, egress is bounded only by your network hardware. Teams with large pipelines that frequently move data between compute and storage can eliminate a recurring cloud bill that scales with data volume.
- Active open source project with commercial backing: MinIO has over 60,000 GitHub stars and is maintained by MinIO Inc., which funds the project through its enterprise AIStor tier. The commercial backing means the open source codebase receives sustained investment, unlike many community-only projects that stall when core contributors move on.
Trade-offs
- AGPL-3.0 requires releasing modifications used in services: The community edition is licensed under AGPL-3.0. If you run a modified version of MinIO as a service, you must release your modifications under the same license. Commercial or proprietary use of a modified MinIO without releasing changes requires a separate commercial license from MinIO Inc. Teams should review this carefully before customizing the codebase for internal infrastructure.
- Community edition ships as source code only: As of recent changes, the MinIO community edition no longer provides pre-compiled binary releases. Installing it requires building from Go source with `go install github.com/minio/minio@latest` (requires Go 1.24+) or building a Docker image from the provided Dockerfile. Teams that want pre-packaged binaries need AIStor Free from min.io/download.
- Self-hosting at scale requires operational ownership: Running MinIO in production means managing hardware failures, drive replacements, erasure coding configuration, cluster upgrades, and capacity planning yourself. AWS S3 and comparable cloud services handle all of this as a managed service. Enterprise support with SLA coverage is available in AIStor Enterprise, but it is not included in the community tier.
MinIO AIStor vs alternatives
MinIO AIStor vs AWS S3
MinIO AIStor and AWS S3 both provide S3-compatible object storage, but they serve fundamentally different deployment models. MinIO runs on your own hardware or cloud VMs; AWS S3 is a fully managed service that handles all infrastructure on your behalf.
| Feature | MinIO AIStor | AWS S3 |
|---|---|---|
| License | AGPL-3.0 (community edition) | Proprietary |
| Self-hosting | Yes | No |
| Egress fees | None (on-premises) | ~$0.09/GB |
| S3 API compatibility | Full, including S3 Express | Native |
| AI framework support | Native (PyTorch, TF, Spark, Iceberg) | Via SDKs |
| Throughput (on-premises NVMe) | 2.2+ TiB/s | Network-bound |
| SLA-backed support | Enterprise tier | Standard / Premium plans |
MinIO is the better choice when data must stay on-premises for compliance, when egress costs on S3 are material, or when your compute is physically co-located with storage and you need throughput beyond what cloud networking allows. AWS S3 remains the right choice when you want zero operational overhead, when compute already runs on AWS in the same region, and when the managed reliability and AWS ecosystem integrations outweigh the per-GB costs.
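To judge whether egress is "material" for a given pipeline, a back-of-the-envelope calculation is usually enough. The sketch below assumes a flat rate equal to the ~$0.09/GB figure in the table and decimal units (1 TB = 1000 GB); real cloud pricing is tiered and in-region transfers are billed differently, so treat the result as an upper-bound estimate. The function name is illustrative.

```python
"""Sketch: rough monthly egress bill at a flat per-GB rate."""


def monthly_egress_cost(tb_per_day, usd_per_gb=0.09, days=30, gb_per_tb=1000):
    """Estimate the monthly cost (USD) of moving `tb_per_day` out of cloud storage."""
    if tb_per_day < 0:
        raise ValueError("tb_per_day must be non-negative")
    return round(tb_per_day * gb_per_tb * usd_per_gb * days, 2)
```

Under these assumptions, a pipeline that moves 5 TB per day out of the cloud comes to about $13,500 per month (5 × 1000 × 0.09 × 30), a recurring cost that does not exist on a self-hosted cluster.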
Install and self-host
```shell
# Install from source (requires Go 1.24+)
go install github.com/minio/minio@latest

# Start a single-node server
minio server /data --console-address :9001

# Build and run as a Docker container (run from a checkout of the MinIO source, where the Dockerfile lives)
docker build -t myminio:minio .
docker run -p 9000:9000 -p 9001:9001 myminio:minio server /tmp/minio --console-address :9001

# Install MinIO Client (mc) and verify connectivity
go install github.com/minio/mc@latest
mc alias set local http://localhost:9000 minioadmin minioadmin
mc admin info local

# Kubernetes via Helm (assumes the "minio" chart repository has already been added with `helm repo add`)
helm install minio minio/minio
```
What it's built on
- Languages: Go
- Infrastructure: AWS, Kubernetes
FAQ
Is MinIO AIStor free to use?
The community edition source code is free under AGPL-3.0. MinIO also offers AIStor Free, a full-featured standalone edition with pre-packaged binaries at no cost. AIStor Enterprise adds distributed deployment, commercial support, and production SLAs for organizations that need them. The main cost of the community edition is infrastructure and the operational time to run it.
Is MinIO AIStor compatible with Amazon S3?
Yes. MinIO implements the full Amazon S3 API, including S3 Express One Zone. Any AWS SDK, CLI tool, or application built for S3 works against MinIO with only the endpoint URL changed. No code changes are required. This makes MinIO a common on-premises replacement or hybrid companion for AWS S3.
How do I install MinIO?
The community edition is now source-only. Install with `go install github.com/minio/minio@latest` (requires Go 1.24+) or build a Docker image using the provided Dockerfile. Kubernetes deployments can use the official MinIO Operator or the community Helm charts. AIStor Free at min.io/download provides pre-compiled binaries if you prefer a packaged install.
What license does MinIO use?
The MinIO codebase is licensed under GNU AGPL-3.0. You can use and modify it freely for personal and internal use. If you run a modified version as a networked service, AGPL-3.0 requires releasing your modifications under the same license. Commercial use of a modified MinIO without releasing changes requires a commercial license from MinIO Inc.
Can MinIO replace AWS S3 for AI workloads?
For teams with on-premises infrastructure, or strong cost or compliance reasons to avoid cloud storage, yes. MinIO benchmarks at over 2.2 TiB/s throughput on NVMe hardware and integrates natively with PyTorch, TensorFlow, and Spark via S3-compatible endpoints. It is not a drop-in replacement for teams deeply integrated into AWS-native services like SageMaker that assume data and compute are co-located within AWS infrastructure.
Similar open-source tools
Storj
Decentralized S3 storage with end-to-end client-side encryption
CLI-Anything
Empower AI agents with agent-native CLIs
RuView
Intelligent AI agents for real-world applications
Flue Framework
Build powerful, autonomous agents with TypeScript.
RuFlo
Deploy intelligent AI agents with ease.
ClickHouse
Fast open source column-oriented database for analytics

