Configuration Guide
This guide explains how to configure storage backends in genro-storage.
Overview
Storage backends are configured through the StorageManager.configure() method.
You can configure storage from:
Python dictionaries - for programmatic configuration
YAML files - for declarative configuration
JSON files - for declarative configuration
All configuration methods support the same format and backends.
Configuration Methods
From Python Dictionaries
The most flexible method for programmatic configuration:
from genro_storage import StorageManager
storage = StorageManager()
storage.configure([
    {
        'name': 'home',
        'type': 'local',
        'path': '/home/user'
    },
    {
        'name': 'uploads',
        'type': 's3',
        'bucket': 'my-app-uploads',
        'region': 'eu-west-1'
    }
])
From YAML Files
Best for environment-specific configuration files:
storage.yaml:
# Local development storage
- name: home
  type: local
  path: /home/user
- name: temp
  type: local
  path: /tmp/app
# Production S3 storage
- name: uploads
  type: s3
  bucket: prod-app-uploads
  region: eu-west-1
  prefix: uploads/
- name: backups
  type: s3
  bucket: prod-app-backups
  region: eu-west-1
Python code:
storage = StorageManager()
storage.configure('/etc/app/storage.yaml')
From JSON Files
An alternative to YAML that uses the same structure:
storage.json:
[
  {
    "name": "home",
    "type": "local",
    "path": "/home/user"
  },
  {
    "name": "uploads",
    "type": "s3",
    "bucket": "prod-app-uploads",
    "region": "eu-west-1"
  }
]
Python code:
storage = StorageManager()
storage.configure('./config/storage.json')
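Because all three methods share one format, a JSON file and a Python list can describe exactly the same mounts. The sketch below uses only the standard library to show that equivalence. It does not call genro-storage, and the variable names are illustrative.

```python
import json

# The two mounts from the JSON example, written as a Python list of dicts.
python_config = [
    {"name": "home", "type": "local", "path": "/home/user"},
    {"name": "uploads", "type": "s3",
     "bucket": "prod-app-uploads", "region": "eu-west-1"},
]

# The same mounts as the text content of a JSON file.
json_text = """
[
  {"name": "home", "type": "local", "path": "/home/user"},
  {"name": "uploads", "type": "s3",
   "bucket": "prod-app-uploads", "region": "eu-west-1"}
]
"""

# Parsing the JSON text yields the same list-of-dicts structure.
json_config = json.loads(json_text)
configs_match = json_config == python_config
```

Either form could then be handed to configure(): the list directly, or the path of a file holding the JSON text.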
Storage Backend Types
Local Filesystem
Access files on the local filesystem.
Required fields:
name: Mount point name
type: Must be "local"
path: Absolute path to base directory
Example:
- name: home
  type: local
  path: /home/user
- name: temp
  type: local
  path: /tmp/app
storage.configure([
    {'name': 'home', 'type': 'local', 'path': '/home/user'},
    {'name': 'temp', 'type': 'local', 'path': '/tmp/app'}
])
Usage:
node = storage.node('home:documents/report.pdf')
content = node.read()
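Node addresses in this guide take the form mount:path, for example 'home:documents/report.pdf'. As a rough illustration of that convention, the following sketch splits an address into its two parts. The helper split_address is hypothetical and is not part of genro-storage, whose actual parsing rules are not described here.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split 'mount:relative/path' into (mount, path).

    Hypothetical helper for illustration only.
    """
    # partition() splits on the first colon, so colons in the path survive.
    mount, separator, path = address.partition(":")
    if not separator or not mount:
        raise ValueError(f"expected 'mount:path', got {address!r}")
    return mount, path


mount, path = split_address("home:documents/report.pdf")
```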
Amazon S3
Access files in Amazon S3 buckets.
Required fields:
name: Mount point name
type: Must be "s3"
bucket: S3 bucket name
Optional fields:
prefix: Path prefix within bucket (default: "")
region: AWS region (default: from AWS config)
anon: Anonymous access (default: False)
key: AWS access key (default: from AWS config)
secret: AWS secret key (default: from AWS config)
endpoint_url: Custom S3 endpoint for S3-compatible services
Example:
# Standard S3
- name: uploads
  type: s3
  bucket: my-app-uploads
  region: eu-west-1
  prefix: uploads/
# With credentials
- name: backups
  type: s3
  bucket: my-app-backups
  region: us-east-1
  key: AKIAIOSFODNN7EXAMPLE
  secret: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# Anonymous public bucket
- name: public-data
  type: s3
  bucket: public-datasets
  anon: true
Usage:
# Upload file
node = storage.node('uploads:2024/report.pdf')
node.write(pdf_data, mode='wb')
# List directory
folder = storage.node('uploads:2024')
for file in folder.children():
    print(f"{file.basename}: {file.size} bytes")
Google Cloud Storage
Access files in Google Cloud Storage buckets.
Required fields:
name: Mount point name
type: Must be "gcs"
bucket: GCS bucket name
Optional fields:
prefix: Path prefix within bucket (default: "")
token: Path to service account JSON key file
project: GCP project ID
Example:
- name: backups
  type: gcs
  bucket: my-app-backups
  project: my-gcp-project
  token: /etc/secrets/gcp-service-account.json
storage.configure([{
    'name': 'backups',
    'type': 'gcs',
    'bucket': 'my-app-backups',
    'token': '/etc/secrets/gcp-service-account.json'
}])
Azure Blob Storage
Access files in Azure Blob Storage containers.
Required fields:
name: Mount point name
type: Must be "azure"
container: Container name
account_name: Storage account name
Optional fields:
account_key: Storage account key
sas_token: Shared access signature token
connection_string: Full connection string
Example:
- name: archive
  type: azure
  container: archives
  account_name: mystorageaccount
  account_key: xxxxxxxxxxxxxxxxxxxxx
HTTP Storage (Read-Only)
Access files via HTTP/HTTPS (read-only).
Required fields:
name: Mount point name
type: Must be "http"
base_url: Base URL for HTTP requests
Example:
- name: cdn
  type: http
  base_url: https://cdn.example.com
storage.configure([{
    'name': 'cdn',
    'type': 'http',
    'base_url': 'https://cdn.example.com'
}])
# Read-only access
node = storage.node('cdn:assets/logo.png')
image_data = node.read(mode='rb')
Memory Storage (Testing)
In-memory storage for testing and development.
Required fields:
name: Mount point name
type: Must be "memory"
Example:
# Perfect for unit tests
storage.configure([{'name': 'test', 'type': 'memory'}])
node = storage.node('test:temp.txt')
node.write("test data")
assert node.read() == "test data"
Permission Control
Configure access permissions for any storage backend to restrict operations.
Permission Levels
Three permission levels are available:
readonly: Only read operations (read, exists, list)
readwrite: Read and write operations, but no delete
delete: Full access including delete (default)
Example:
storage.configure([
    # Read-only access to public data
    {'name': 'public', 'type': 'http', 'base_url': 'https://cdn.example.com',
     'permissions': 'readonly'},
    # Read-write but no delete for backups
    {'name': 'backups', 'type': 'gcs', 'bucket': 'my-backups',
     'permissions': 'readwrite'},
    # Full access (default)
    {'name': 'uploads', 'type': 's3', 'bucket': 'my-uploads'}
])
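The three levels form a strict hierarchy: each level allows everything the previous one does, plus more. One way to picture this is an ordered enum with a minimum level per operation. This is a minimal sketch of the idea, not the library's implementation; Permission, REQUIRED_LEVEL and is_allowed are hypothetical names.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Hypothetical ordering of the three documented levels."""
    READONLY = 1
    READWRITE = 2
    DELETE = 3


# Minimum level assumed for each operation, following the list above.
REQUIRED_LEVEL = {
    "read": Permission.READONLY,
    "exists": Permission.READONLY,
    "list": Permission.READONLY,
    "write": Permission.READWRITE,
    "delete": Permission.DELETE,
}


def is_allowed(level: str, operation: str) -> bool:
    """Return True if a mount configured with `level` may run `operation`."""
    granted = Permission[level.upper()]
    return granted >= REQUIRED_LEVEL[operation]


backups_can_delete = is_allowed("readwrite", "delete")
```

Under this model a mount configured with 'readwrite' can read and write but not delete, matching the backups example above.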
Using Permissions
Permissions are validated at configuration time and enforced at runtime:
# Read-only mount
storage.configure([
    {'name': 'public', 'type': 's3', 'bucket': 'public-data',
     'permissions': 'readonly'}
])
node = storage.node('public:file.txt')
# Allowed operations
content = node.read()
exists = node.exists
files = node.children()
# Raises StoragePermissionError
node.write("data")
node.delete()
node.mkdir()
YAML Configuration
# Read-only CDN
- name: cdn
  type: http
  base_url: https://cdn.example.com
  permissions: readonly
# Read-write backups (no delete)
- name: backups
  type: gcs
  bucket: my-backups
  permissions: readwrite
# Full access uploads
- name: uploads
  type: s3
  bucket: my-uploads
  permissions: delete
Permission Validation
Permissions are validated against backend capabilities:
# Invalid: HTTP is read-only, cannot request readwrite
storage.configure([
    {'name': 'cdn', 'type': 'http', 'base_url': 'https://cdn.example.com',
     'permissions': 'readwrite'}  # Error!
])
# Raises: StorageConfigError: Backend is read-only
Best Practices
Principle of least privilege - Use readonly when possible
Protect backups - Use readwrite to prevent accidental deletion
Validate at startup - Permissions are checked at configuration time
Use for security - Restrict access to sensitive storage locations
Advanced Configuration
Multiple Configurations
You can call configure() multiple times. Mounts with the same name
are replaced:
# Initial setup
storage.configure([
    {'name': 'home', 'type': 'local', 'path': '/home/user'}
])
# Add more mounts later
storage.configure([
    {'name': 'uploads', 'type': 's3', 'bucket': 'my-bucket'}
])
# Replace existing mount
storage.configure([
    {'name': 'home', 'type': 'local', 'path': '/mnt/newlocation'}
])
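The effect of repeated configure() calls can be modeled as updating a mapping keyed by mount name: new names are added, existing names are overwritten. The sketch below reproduces the three calls above with a plain dictionary. The function apply_config is a hypothetical stand-in, not how StorageManager is implemented.

```python
def apply_config(mounts: dict, entries: list) -> dict:
    """Add each entry to `mounts`, replacing any mount with the same name."""
    for entry in entries:
        mounts[entry["name"]] = entry
    return mounts


mounts = {}
# Initial setup
apply_config(mounts, [{"name": "home", "type": "local", "path": "/home/user"}])
# Add more mounts later
apply_config(mounts, [{"name": "uploads", "type": "s3", "bucket": "my-bucket"}])
# Replace existing mount
apply_config(mounts, [{"name": "home", "type": "local", "path": "/mnt/newlocation"}])

mount_names = list(mounts)
home_path = mounts["home"]["path"]
```

After the third call only two mounts remain, and 'home' points at the new location.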
Environment-Specific Configuration
Use different configuration files per environment:
import os
storage = StorageManager()
# Load environment-specific config
env = os.getenv('APP_ENV', 'development')
config_file = f'/etc/app/storage-{env}.yaml'
storage.configure(config_file)
storage-development.yaml:
- name: uploads
  type: local
  path: /tmp/dev-uploads
storage-production.yaml:
- name: uploads
  type: s3
  bucket: prod-uploads
  region: eu-west-1
Configuration from Environment Variables
Build the configuration dynamically from environment variables:
import os
storage = StorageManager()
storage.configure([
    {
        'name': 'uploads',
        'type': 's3',
        'bucket': os.getenv('S3_BUCKET'),
        'region': os.getenv('AWS_REGION', 'eu-west-1'),
        'key': os.getenv('AWS_ACCESS_KEY_ID'),
        'secret': os.getenv('AWS_SECRET_ACCESS_KEY')
    }
])
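Note that os.getenv() returns None when a variable is unset, so a missing S3_BUCKET would reach configure() as None. A small guard can fail earlier with a clearer message. The following is a sketch under that assumption; require_env and build_s3_mount are hypothetical helpers, and the environ parameter exists only so the functions can be exercised without touching the real environment.

```python
import os


def require_env(name: str, environ=None) -> str:
    """Return the value of a required variable or raise a clear error."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def build_s3_mount(environ=None) -> dict:
    """Build the 'uploads' mount entry from environment variables."""
    environ = os.environ if environ is None else environ
    return {
        "name": "uploads",
        "type": "s3",
        # 'bucket' is a required field, so insist on it here.
        "bucket": require_env("S3_BUCKET", environ),
        # 'region' is optional; fall back to the default used above.
        "region": environ.get("AWS_REGION", "eu-west-1"),
    }


example_mount = build_s3_mount({"S3_BUCKET": "my-app-uploads"})
```

The resulting dictionary can be passed to configure() inside a list, as in the example above. Credentials are left out of this sketch because key and secret are optional and default to the AWS configuration.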
Checking Configured Mounts
# List all configured mounts
print(storage.get_mount_names())
# ['home', 'uploads', 'temp']
# Check if mount exists
if storage.has_mount('uploads'):
    node = storage.node('uploads:file.txt')
else:
    print("Uploads storage not configured")
Complete Example
config/storage-prod.yaml:
# Local temporary storage
- name: temp
  type: local
  path: /tmp/app
# User uploads to S3
- name: uploads
  type: s3
  bucket: prod-app-uploads
  region: eu-west-1
  prefix: uploads/
# Backups to GCS
- name: backups
  type: gcs
  bucket: prod-app-backups
  token: /etc/secrets/gcp-key.json
# Static assets from CDN
- name: cdn
  type: http
  base_url: https://cdn.example.com
Python application:
from genro_storage import StorageManager
# Initialize and configure
storage = StorageManager()
storage.configure('/etc/app/storage-prod.yaml')
# Process upload
upload = storage.node('temp:processing/image.jpg')
upload.write(uploaded_data, mode='wb')
# Save to S3
final = storage.node('uploads:2024/images/photo.jpg')
upload.copy_to(final)
# Backup to GCS
backup = storage.node('backups:daily/2024-10-26/photo.jpg')
final.copy_to(backup)
# Cleanup temp
upload.delete()
# Access CDN asset
logo = storage.node('cdn:assets/logo.png')
logo_data = logo.read(mode='rb')
Best Practices
Use YAML for declarative configs - easier to read and maintain
Separate configs per environment - development, staging, production
Store credentials securely - use secrets managers, not config files
Use meaningful mount names - uploads, backups, not s3_1
Configure once at startup - don’t reconfigure during runtime
Use prefixes in cloud storage - organize files within buckets
Test with memory backend - fast, no cleanup needed
Troubleshooting
Mount not found error:
# Error: StorageNotFoundError: Mount point 'uploads' not found
node = storage.node('uploads:file.txt')
# Check configured mounts
print(storage.get_mount_names())
# Verify mount is configured
if not storage.has_mount('uploads'):
    storage.configure([{'name': 'uploads', 'type': 's3', 'bucket': 'my-bucket'}])
Invalid configuration:
# Missing required field
storage.configure([{'name': 'uploads', 'type': 's3'}]) # Missing 'bucket'!
# Raises: StorageConfigError: Missing required field 'bucket' for S3 storage
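To catch this kind of mistake before calling configure(), for instance in a test or a deployment check, the required fields listed for each backend in this guide can be encoded in a table and compared against each entry. This is a sketch only: REQUIRED_FIELDS and missing_fields are hypothetical names, and the library performs its own validation regardless.

```python
# Required fields per backend type, as listed in this guide.
REQUIRED_FIELDS = {
    "local": {"name", "type", "path"},
    "s3": {"name", "type", "bucket"},
    "gcs": {"name", "type", "bucket"},
    "azure": {"name", "type", "container", "account_name"},
    "http": {"name", "type", "base_url"},
    "memory": {"name", "type"},
}


def missing_fields(entry: dict) -> list:
    """Return the sorted names of required fields absent from `entry`."""
    backend = entry.get("type")
    if backend not in REQUIRED_FIELDS:
        raise ValueError(f"unknown backend type: {backend!r}")
    return sorted(REQUIRED_FIELDS[backend] - set(entry))


problems = missing_fields({"name": "uploads", "type": "s3"})
```

An empty list means the entry carries every required field for its backend type.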
Path escaping base directory:
storage.configure([{'name': 'home', 'type': 'local', 'path': '/home/user'}])
node = storage.node('home:../../../etc/passwd')
# Raises: ValueError: Path escapes base directory
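The idea behind this check can be illustrated by normalizing the joined path and confirming it still lies under the base directory. The sketch below is one possible approach using posixpath from the standard library; resolve_inside is a hypothetical helper, the real check in genro-storage may differ, and this version does not account for symbolic links.

```python
import posixpath


def resolve_inside(base: str, relative: str) -> str:
    """Join `relative` onto `base`, rejecting results outside `base`."""
    base = posixpath.normpath(base)
    # normpath collapses '..' segments, exposing any attempt to climb out.
    candidate = posixpath.normpath(posixpath.join(base, relative))
    prefix = base.rstrip("/") + "/"
    if candidate != base and not candidate.startswith(prefix):
        raise ValueError("Path escapes base directory")
    return candidate


safe_path = resolve_inside("/home/user", "documents/report.pdf")
```

With the path from the example above, '../../../etc/passwd' normalizes to '/etc/passwd', which is outside '/home/user' and is therefore rejected.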