Storage
Datasets that live next to your GPUs.
Store training data, checkpoints, and evaluation sets in S3-compatible buckets colocated with Scout compute. Version them, share them, or list them in the Scout dataset marketplace.
What you get
Object storage built for training runs.
S3-compatible object storage
Use any S3 client, SDK, or DataLoader. No proprietary protocol, no lock-in.
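As a rough illustration of what "any S3 client" could look like in practice, here is a minimal Python sketch using boto3 with a custom endpoint. The endpoint URL, the `SCOUT_*` environment variable names, and the helper functions are all hypothetical placeholders, since the page does not publish connection details. Only `boto3.client("s3", endpoint_url=...)` and `download_file` are standard boto3 calls.

```python
import os

# Hypothetical endpoint: the page does not publish a real URL.
DEFAULT_ENDPOINT = "https://storage.scout.example"


def client_kwargs(env=os.environ):
    """Build keyword arguments for an S3 client pointed at a custom endpoint.

    The SCOUT_* variable names are assumptions made for this sketch.
    """
    return {
        "endpoint_url": env.get("SCOUT_S3_ENDPOINT", DEFAULT_ENDPOINT),
        "aws_access_key_id": env.get("SCOUT_ACCESS_KEY_ID"),
        "aws_secret_access_key": env.get("SCOUT_SECRET_ACCESS_KEY"),
    }


def make_client(env=os.environ):
    """Create a boto3 S3 client. boto3 is imported lazily (pip install boto3)."""
    import boto3

    return boto3.client("s3", **client_kwargs(env))


def download_checkpoint(client, bucket, key, dest):
    """Download one object to a local path using the standard S3 API."""
    client.download_file(bucket, key, dest)
```

With credentials set, usage would be along the lines of `download_checkpoint(make_client(), "my-bucket", "ckpt/step-1000.pt", "step-1000.pt")`, where the bucket and key names are again placeholders.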
Colocated with GPU compute
Datasets sit in the same region as Scout instances, so training reads incur no per-GB egress fees.
Dataset versioning
Immutable snapshots, content-addressed manifests, and diffable revisions for every dataset.
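To make "content-addressed manifests" and "diffable revisions" concrete, here is a small Python sketch of the general technique. It is not Scout's actual manifest format, which the page does not describe. The function names, the choice of SHA-256, and the canonical-JSON encoding are assumptions made for illustration.

```python
import hashlib
import json


def build_manifest(files):
    """Map each path to the SHA-256 hex digest of its bytes."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def manifest_id(manifest):
    """Content address of a manifest: the hash of its canonical JSON form.

    Sorting keys makes the ID independent of insertion order.
    """
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_manifests(old, new):
    """Return the paths added, removed, and changed between two revisions."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return {"added": added, "removed": removed, "changed": changed}


# Two toy revisions of the same dataset (in-memory bytes stand in for objects).
v1 = build_manifest({"train/a.txt": b"hello", "train/b.txt": b"world"})
v2 = build_manifest(
    {"train/a.txt": b"hello", "train/b.txt": b"world!", "eval/c.txt": b"new"}
)
delta = diff_manifests(v1, v2)
```

Because the manifest ID is derived only from file contents and paths, two snapshots with identical data get the same ID, and any change to a file produces a new one.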
Marketplace-ready
Publish a dataset publicly, gate it behind a license, or sell access per seat or per download.
Encrypted at rest
AES-256 server-side encryption with per-bucket access policies and signed-URL downloads.
Multi-region replication
Opt-in geo-replication keeps training data close to whichever region has capacity.
Storage tiers
Pay for what you actually keep.
No egress fees between Scout storage and Scout compute. External egress billed at $0.02 / GB.
Hot
Active training sets. Sub-millisecond reads from any Scout GPU region.
Standard
Working datasets, model checkpoints, and evaluation corpora.
Archive
Cold backups, raw collection runs, and long-term provenance.
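For a sense of scale, the external egress rate quoted for these tiers ($0.02 / GB) can be turned into a quick estimate. The sketch below assumes decimal units (1 TB = 1,000 GB) and rounding to the cent; the page does not state how usage is measured or rounded, so treat both as assumptions. The function name is hypothetical.

```python
from decimal import Decimal

# Rate taken from the page: external egress, USD per GB.
EXTERNAL_EGRESS_PER_GB = Decimal("0.02")


def egress_cost(gigabytes, internal=False):
    """Estimate the egress charge in USD for a transfer of the given size.

    Transfers between Scout storage and Scout compute are free per the page.
    Rounding to two decimal places is an assumption of this sketch.
    """
    if internal:
        return Decimal("0.00")
    cost = Decimal(str(gigabytes)) * EXTERNAL_EGRESS_PER_GB
    return cost.quantize(Decimal("0.01"))


# An 820 GB external download versus the same read from Scout compute.
external = egress_cost(820)
internal = egress_cost(820, internal=True)
```

Under these assumptions, pulling 820 GB out of Scout comes to $16.40, while the same read from a Scout instance costs nothing in egress.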
Dataset marketplace
Buy datasets. Sell datasets. Train on both.
Any bucket can be published as a listing. Set a price, a license, and access controls. Buyers train directly against the dataset without ever leaving Scout.
| Dataset | Type | Size | Access |
|---|---|---|---|
| OpenRoad-2M | Driving footage | 11.4 TB | $1,400 / seat |
| MedImg-Pathology | Histology slides | 3.2 TB | $3,900 / seat |
| VoiceCommons-EN | Speech corpus | 820 GB | Free · CC-BY |
| RetailShelf-HD | Product imagery | 2.7 TB | $650 / seat |
Examples shown for illustration. Listings populate as the marketplace opens.
Bring your data. Or find new data.
Join the waitlist for storage buckets and early marketplace access.