Multi-Cloud File Storage in Go: Write Once, Deploy Anywhere

Multi-Cloud Madness

Picture this: I'm deep into a project that needs to deploy across AWS, Google Cloud, and Azure. Different clients, different preferences, same application. My first instinct? "No problem, I'll just create a Go interface and implement it three times." Classic developer thinking, right?

I was already sketching out the architecture in my head:

package storage

import "io"

type CloudStorage interface {
    Upload(file io.Reader) error
    Download(key string) (io.Reader, error)
}

type (
    S3Storage    struct{} // implement for AWS
    GCSStorage   struct{} // implement for GCP  
    AzureStorage struct{} // implement for Azure
)

Three implementations, three sets of SDKs, three different authentication flows, three ways for things to break. I was about to create my own little maintenance nightmare.

Then, during one of those late-night coding sessions, I stumbled upon gocloud.dev. Wait, what? Google built a unified cloud abstraction layer? And it's not just for Google Cloud? That discovery changed everything. Instead of maintaining three separate implementations, I could build one robust, memory-efficient storage client that works everywhere.

No more SDK juggling, no more maintaining three different codebases, and significantly fewer provider-specific implementation details to worry about.

This guide is the result of that journey – from the initial "I'll just implement it three times" moment to building a production-ready, cloud-agnostic storage solution that's been battle-tested across all three major providers.

Why Cloud Agnostic Storage?

Before we dive into code, let's talk about why this matters:

  • Vendor Independence: Switch providers without code changes
  • Multi-cloud Deployments: Use different providers for different environments
  • Cost Optimization: Choose the most cost-effective provider per use case
  • Multi-tenant Flexibility: Route different customers/tenants to different providers based on their requirements or contracts

Core Implementation

Here's how we're building this. The Go Cloud Development Kit (CDK) gives us a unified interface across cloud providers—same code, different clouds. The architecture is straightforward:

  • Unified Interface: Single API for all cloud operations
  • Provider Abstraction: Automatic provider detection via URL schemes
  • Memory Efficiency: Streaming uploads/downloads without buffering

flowchart TD
    App[Your Application] --> |blob.OpenBucket| Router[URL Router]
    Router --> |s3://| S3[AWS S3 Driver]
    Router --> |gs://| GCS[GCS Driver]
    Router --> |azblob://| Azure[Azure Driver]
    S3 --> Bucket[blob.Bucket Interface]
    GCS --> Bucket
    Azure --> Bucket
    Bucket --> |io.Copy| Stream[Streaming 32KB Buffer]
    Stream --> Cloud[Cloud Storage]

1. Setting Up Dependencies

Before we write any code, we'll need to grab the Go CDK and its driver packages.

go get -u gocloud.dev
go get -u gocloud.dev/blob/s3blob
go get -u gocloud.dev/blob/gcsblob
go get -u gocloud.dev/blob/azureblob

With the dependencies in place, the storage package imports the blob API and registers each provider's driver via blank imports:

package storage

import (
    "context"
    "fmt"
    "io"

    "gocloud.dev/blob"
    _ "gocloud.dev/blob/azureblob" // Azure driver
    _ "gocloud.dev/blob/gcsblob"   // GCP driver
    _ "gocloud.dev/blob/s3blob"    // AWS driver
)

Understanding the blob Package

The blob package abstracts away the differences between storage providers, allowing us to write portable Go CDK code. It lets us use a single API to read, write, list, and delete blobs (files/objects) regardless of where they're stored.

It supports multiple backends by using specific drivers:

  1. Cloud Providers: gcsblob (Google Cloud), s3blob (Amazon S3), and azureblob (Azure).
  2. Local/Testing: fileblob (local disk) and memblob (in-memory).
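
Those local drivers are great for tests: the same portable code runs against an in-memory bucket with no credentials and no network. Here's a minimal sketch (the object key and contents are just placeholders):

package storage

import (
    "context"
    "testing"

    "gocloud.dev/blob"
    _ "gocloud.dev/blob/memblob" // registers the mem:// scheme
)

func TestRoundTripInMemory(t *testing.T) {
    ctx := context.Background()

    // mem:// opens an in-memory bucket -- no cloud account needed.
    bucket, err := blob.OpenBucket(ctx, "mem://")
    if err != nil {
        t.Fatalf("open bucket: %v", err)
    }
    defer bucket.Close()

    // Write and read back a small object through the same portable API.
    if err := bucket.WriteAll(ctx, "greeting.txt", []byte("hello"), nil); err != nil {
        t.Fatalf("write: %v", err)
    }
    data, err := bucket.ReadAll(ctx, "greeting.txt")
    if err != nil {
        t.Fatalf("read: %v", err)
    }
    if string(data) != "hello" {
        t.Fatalf("unexpected content: %q", data)
    }
}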

2. URL-Based Provider Selection

The Go CDK uses URL schemes as a routing mechanism to determine which driver to use. When we call blob.OpenBucket(ctx, bucketURL), the CDK examines the URL scheme (the part before ://) and automatically routes our request to the appropriate driver. For example, s3:// routes to the S3 driver, gs:// to the GCS driver, and azblob:// to the Azure driver. This design allows us to switch between cloud providers by simply changing the URL string, without modifying any of our storage logic code.

package storage

const (
    awsBucketURLFormat   = "s3://%s?awssdk=v2"
    gcpBucketURLFormat   = "gs://%s"
    azureBucketURLFormat = "azblob://%s"
)

func GetBucketURL(cloud, bucketName string) (string, error) {
    switch cloud {
    case "aws":
        return fmt.Sprintf(awsBucketURLFormat, bucketName), nil
    case "gcp":
        return fmt.Sprintf(gcpBucketURLFormat, bucketName), nil
    case "azure":
        return fmt.Sprintf(azureBucketURLFormat, bucketName), nil
    }
    return "", fmt.Errorf("unsupported cloud %s", cloud)
}

Note on AWS SDK Version: The awssdk=v2 query parameter forces the use of AWS SDK v2, which is the current recommended version. Without this parameter, the CDK might default to the older v1 SDK. Using v2 ensures better performance, more features, and continued support from AWS.
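
The URL can carry other driver-specific options as query parameters too. For example, the s3blob driver accepts a region parameter, so a single URL string can pin both the SDK version and the bucket's region. A small sketch (the bucket name and region are placeholders):

package main

import (
    "context"
    "log"

    "gocloud.dev/blob"
    _ "gocloud.dev/blob/s3blob"
)

func main() {
    ctx := context.Background()

    // Everything after "?" is interpreted by the s3blob driver:
    // awssdk=v2 selects the SDK version, region pins the bucket's region.
    bucket, err := blob.OpenBucket(ctx, "s3://my-files-bucket?awssdk=v2&region=us-west-2")
    if err != nil {
        log.Fatalf("open bucket: %v", err)
    }
    defer bucket.Close()
}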

3. Memory-Efficient Upload Design

Here's where it gets interesting. Our FileUploader is surprisingly simple because Go CDK does the heavy lifting:

package storage

type FileUploader struct {
    bucket *blob.Bucket
}

type (
    UploadConfig struct {
        // ObjectKey is the path/key where the file will be stored in the bucket.
        // Examples: "documents/file.pdf", "images/2024/photo.jpg"
        ObjectKey string

        // Metadata contains optional key-value pairs to store with the file.
        // Can be nil if not needed. Available metadata varies by provider.
        Metadata map[string]string

        // ContentType specifies the MIME type of the file (e.g., "image/jpeg", "text/plain").
        ContentType string
    }

    UploadResult struct {
        // Key is the object key where the file was stored
        Key string

        // Size is the number of bytes uploaded
        Size int64
    }
)

func NewFileUploader(ctx context.Context, bucketURL string) (*FileUploader, error) {
    bucket, err := blob.OpenBucket(ctx, bucketURL)
    if err != nil {
        return nil, fmt.Errorf("failed to open bucket %s: %w", bucketURL, err)
    }
    return &FileUploader{bucket: bucket}, nil
}

func (fu *FileUploader) Upload(ctx context.Context, reader io.Reader, config *UploadConfig) (*UploadResult, error) {
    opts := &blob.WriterOptions{
        ContentType: config.ContentType,
        Metadata:    config.Metadata,
    }

    writer, err := fu.bucket.NewWriter(ctx, config.ObjectKey, opts)
    if err != nil {
        return nil, fmt.Errorf("failed to create writer: %w", err)
    }

    defer func() {
        // Ignore error here since we explicitly check Close() below
        // This ensures cleanup even if earlier operations fail
        _ = writer.Close()
    }()

    // Stream with a small fixed buffer (typically 32KB), not loading entire file into memory
    size, err := io.Copy(writer, reader)
    if err != nil {
        return nil, fmt.Errorf("upload failed: %w", err)
    }

    if err := writer.Close(); err != nil {
        return nil, fmt.Errorf("failed to close: %w", err)
    }

    return &UploadResult{Key: config.ObjectKey, Size: size}, nil
}

func (fu *FileUploader) Close() error {
    return fu.bucket.Close()
}
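
Because Upload accepts any io.Reader, it plugs straight into an HTTP handler and streams the request body to cloud storage without buffering it first. A sketch, assuming the uploader is created once at startup (the route, query parameter, and key prefix are illustrative):

package storage

import (
    "fmt"
    "net/http"
)

// handleUpload streams the incoming request body directly to the bucket.
// The object key is derived from a ?name= query parameter purely for illustration.
func handleUpload(uploader *FileUploader) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        result, err := uploader.Upload(r.Context(), r.Body, &UploadConfig{
            ObjectKey:   "uploads/" + r.URL.Query().Get("name"),
            ContentType: r.Header.Get("Content-Type"),
        })
        if err != nil {
            http.Error(w, "upload failed", http.StatusInternalServerError)
            return
        }
        w.WriteHeader(http.StatusCreated)
        fmt.Fprintf(w, "stored %s (%d bytes)\n", result.Key, result.Size)
    }
}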

4. Memory-Efficient Downloads

Downloads follow the same pattern—clean and simple:

package storage

// FileDownloader provides a reusable interface for downloading files
// from cloud storage. It maintains a connection to a specific bucket and can be used
// for multiple downloads. Always call Close() when done to release resources.
type FileDownloader struct {
    // bucket is the GoCloud blob.Bucket instance for the source storage
    bucket *blob.Bucket
}

func NewFileDownloader(ctx context.Context, bucketURL string) (*FileDownloader, error) {
    bucket, err := blob.OpenBucket(ctx, bucketURL)
    if err != nil {
        return nil, fmt.Errorf("failed to open bucket %s: %w", bucketURL, err)
    }
    return &FileDownloader{bucket: bucket}, nil
}

// DownloadConfig contains configuration for file downloads.
// All fields are optional except ObjectKey.
type DownloadConfig struct {
    // ObjectKey is the path/key of the file to download from the bucket.
    // Examples: "documents/file.pdf", "images/2024/photo.jpg"
    ObjectKey string
}

// DownloadResult contains information about the downloaded file and the reader.
type DownloadResult struct {
    // Reader provides access to the file content. Caller must close when done.
    Reader io.ReadCloser

    // Key is the object key where the file is stored
    Key string

    // ContentType is the MIME type of the file
    ContentType string

    // Size is the size of the file in bytes.
    Size int64
}

func (fd *FileDownloader) Download(ctx context.Context, config *DownloadConfig) (*DownloadResult, error) {
    reader, err := fd.bucket.NewReader(ctx, config.ObjectKey, nil)
    if err != nil {
        return nil, fmt.Errorf("failed to create reader: %w", err)
    }

    return &DownloadResult{
        Reader:      reader,
        Key:         config.ObjectKey,
        ContentType: reader.ContentType(),
        Size:        reader.Size(),
    }, nil
}

func (fd *FileDownloader) Close() error {
    return fd.bucket.Close()
}
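
The returned ReadCloser streams as well, so a handler can pipe an object straight to an HTTP response without ever holding the full file in memory. A minimal sketch (ServeObject and its wiring are illustrative, not part of the Go CDK):

package storage

import (
    "context"
    "io"
    "net/http"
)

// ServeObject copies an object from cloud storage to an HTTP response.
// io.Copy moves the data in small fixed-size chunks, so large files are fine.
func ServeObject(ctx context.Context, fd *FileDownloader, key string, w http.ResponseWriter) error {
    result, err := fd.Download(ctx, &DownloadConfig{ObjectKey: key})
    if err != nil {
        return err
    }
    defer result.Reader.Close()

    w.Header().Set("Content-Type", result.ContentType)
    _, err = io.Copy(w, result.Reader)
    return err
}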

5. Authentication Handling

If you're looking at that code thinking "wait, where's all the credential handling?"—you're not missing anything. The Go Cloud CDK handles authentication automatically using each provider's standard methods:

AWS S3

# IAM roles (recommended for production)
# Or environment variables:
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_REGION=us-east-1

Google Cloud Storage

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

Azure Blob Storage

# Managed identity (recommended for production)
# Or environment variables:
export AZURE_STORAGE_ACCOUNT=mystorageaccount
export AZURE_STORAGE_KEY=your_key

Putting It All Together

Here's a complete example showing how to upload and download files using our storage client:

package main

import (
    "context"
    "fmt"
    "io"
    "log"
    "os"
    "time"

    "yourproject/storage"  // Import your storage package
)

func main() {
    ctx := context.Background()

    // Choose your cloud provider - switch by changing this variable
    cloud := "aws"  // or "gcp" or "azure", in production this will usually be from ENV
    bucketName := "my-files-bucket" // typically from config in production

    // Get the provider-specific bucket URL
    bucketURL, err := storage.GetBucketURL(cloud, bucketName)
    if err != nil {
        log.Fatalf("Failed to get bucket URL: %v", err)
    }

    // Example 1: Upload a file
    if err := uploadExample(ctx, bucketURL); err != nil {
        log.Fatalf("Upload failed: %v", err)
    }

    // Example 2: Download the file back
    if err := downloadExample(ctx, bucketURL); err != nil {
        log.Fatalf("Download failed: %v", err)
    }
}

func uploadExample(ctx context.Context, bucketURL string) error {
    // Create uploader
    uploader, err := storage.NewFileUploader(ctx, bucketURL)
    if err != nil {
        return fmt.Errorf("failed to create uploader: %w", err)
    }
    defer uploader.Close()

    // Open file to upload
    file, err := os.Open("document.pdf")
    if err != nil {
        return fmt.Errorf("failed to open file: %w", err)
    }
    defer file.Close()

    // Upload with configuration
    result, err := uploader.Upload(ctx, file, &storage.UploadConfig{
        ObjectKey:   "uploads/2024/document.pdf",
        ContentType: "application/pdf",
        Metadata: map[string]string{
            "uploaded-by": "api-server",
            "timestamp":   time.Now().Format(time.RFC3339),
        },
    })
    if err != nil {
        return fmt.Errorf("upload failed: %w", err)
    }

    fmt.Printf("✓ Uploaded %s (%d bytes)\n", result.Key, result.Size)
    return nil
}

func downloadExample(ctx context.Context, bucketURL string) error {
    // Create downloader
    downloader, err := storage.NewFileDownloader(ctx, bucketURL)
    if err != nil {
        return fmt.Errorf("failed to create downloader: %w", err)
    }
    defer downloader.Close()

    // Download file
    result, err := downloader.Download(ctx, &storage.DownloadConfig{
        ObjectKey: "uploads/2024/document.pdf",
    })
    if err != nil {
        return fmt.Errorf("download failed: %w", err)
    }
    defer result.Reader.Close()

    // Create local file
    outFile, err := os.Create("downloaded-document.pdf")
    if err != nil {
        return fmt.Errorf("failed to create output file: %w", err)
    }
    defer outFile.Close()

    // Stream to disk (memory efficient!)
    written, err := io.Copy(outFile, result.Reader)
    if err != nil {
        return fmt.Errorf("failed to write file: %w", err)
    }

    fmt.Printf("✓ Downloaded %s (%d bytes, %s)\n",
        result.Key, written, result.ContentType)
    return nil
}

Key Takeaways from This Example:

  1. Provider Switching - Change cloud := "aws" to "gcp" or "azure" and everything else stays the same
  2. Resource Management - Always defer Close() on uploaders and downloaders
  3. Streaming - Files stream directly from source to destination without loading into memory

Why Streaming Over Alternative Methods

Remember that io.Copy call in our Upload method? That one line is doing serious work. Here's why the streaming approach matters:

The Alternatives (and Why They're Problematic)

Loading Entire File into Memory:

// ❌ Don't do this - loads entire file into RAM
data, err := io.ReadAll(file)
// Uploading a 500MB file? Your app needs 500MB+ RAM

Buffering in Memory:

// ❌ Don't do this - copies everything to a buffer first
var buf bytes.Buffer
io.Copy(&buf, file)
// Same problem: full file size in memory

Why Streaming Wins

1. Predictable Memory Usage

  • Streaming uses a fixed 32KB buffer regardless of file size (see the sketch after this list)
  • 10MB file? 32KB RAM. 10GB file? Still 32KB RAM
  • Enables running on smaller, cheaper instances

2. Handle Large Files Safely

  • Upload multi-gigabyte files without out-of-memory errors
  • No need to worry about file size limits in your application code
  • The cloud provider handles size limits, not your RAM

3. Better Concurrency

  • Handle 100 simultaneous uploads without proportional memory growth
  • Each operation uses minimal memory (32KB vs entire file size)
  • Scale horizontally without memory concerns

4. Faster Time-to-First-Byte

  • Start uploading immediately as data arrives
  • No waiting to buffer entire file before beginning transfer
  • Better user experience for large files
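
To make the fixed-buffer point concrete: io.Copy allocates a single 32KB buffer internally, and io.CopyBuffer lets you supply that buffer yourself if you ever want the size explicit or tunable. A tiny sketch (copyWithFixedBuffer is just an illustrative helper):

package storage

import "io"

// copyWithFixedBuffer streams src to dst through one reusable 32KB buffer,
// so memory stays flat regardless of file size. io.Copy does the same thing
// internally; io.CopyBuffer simply makes the buffer explicit.
func copyWithFixedBuffer(dst io.Writer, src io.Reader) (int64, error) {
    buf := make([]byte, 32*1024)
    return io.CopyBuffer(dst, src, buf)
}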

Limitations and Trade-offs

Alright, real talk. I've shown you all the wins, but Go CDK isn't perfect. Here's what we're trading for that portability:

Features We'll Miss

Resumable Uploads: Native SDKs support resumable uploads for large files. If our 5GB upload fails at 90%, we can resume from where it left off. With Go CDK, we start over from scratch.

Pre-signed URL Control: The Go CDK can generate basic signed URLs via Bucket.SignedURL, but the provider-specific knobs (custom headers, upload policies, response overrides) are far more limited than what the native SDKs expose. That control matters for browser-based file uploads where we don't want to proxy traffic through our server.

Multipart Upload Control: Native SDKs let us configure multipart uploads for large files (parallel chunks, custom part sizes). Go CDK may use multipart internally but doesn't expose controls, so we can't optimize for our specific use case.

Performance Considerations

Minor Overhead: The abstraction layer may add a small performance cost. For most applications, this is negligible. But if we're building something that processes millions of tiny files per second, the native SDK might be faster.

No Advanced Optimizations: Native SDKs let us tune things like part size for multipart uploads, connection pooling, and retry strategies. The Go CDK uses sensible defaults, but we can't fine-tune them.

When to Use Native SDKs Instead

Use provider-specific SDKs when:

  • You need provider-specific features
  • You're committed to one cloud
  • You're optimizing for the last 5% of performance
  • You need very specific error handling for provider quirks

The pragmatic approach? Start with Go CDK. If you hit a wall, drop down to the native SDK for just that operation. Your 95% use case stays portable.
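
That escape hatch exists in the Go CDK itself: blob.Bucket has an As() method that exposes the underlying driver client, so we can drop to the native SDK for a single operation without abandoning the portable code path. A sketch under the assumption that the s3blob driver (with awssdk=v2) exposes *s3.Client as its As type; double-check the driver docs for the version you're on:

package storage

import (
    "fmt"

    "github.com/aws/aws-sdk-go-v2/service/s3"
    "gocloud.dev/blob"
)

// nativeS3Client unwraps the AWS SDK v2 client from a portable *blob.Bucket
// via the CDK's As() escape hatch. The concrete type (*s3.Client here) is
// driver-specific, so this only succeeds when the bucket was opened with s3://.
func nativeS3Client(bucket *blob.Bucket) (*s3.Client, error) {
    var client *s3.Client
    if !bucket.As(&client) {
        return nil, fmt.Errorf("bucket is not backed by the AWS SDK v2 S3 client")
    }
    return client, nil
}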

Conclusion

What started as "I'll just implement it three times" turned into a single, elegant solution. The Go Cloud CDK proves that cloud-agnostic storage doesn't have to be complicated—just a few import statements, URL-based routing, and streaming I/O patterns.

The real win? Your code stays the same whether you're deploying to AWS, Google Cloud, or Azure. No vendor lock-in, no SDK juggling, and your memory footprint stays predictable regardless of file sizes. That's the kind of abstraction worth using when working with multiple clouds. If you're committed to a single provider, stick with their native SDK—abstract only when you need to.

Now go build something that works everywhere.
