Deliverable D2.1.3: Hosting Infrastructure Specifications
The MAPS infrastructure is built from scratch on a DigitalOcean account dedicated to the GST-MAPS project. Provisioning uses Pulumi as the Infrastructure as Code (IaC) tool, with the same technology stack described in deliverable D2.1.2: DOKS for container orchestration, cert-manager for TLS certificates, external-secrets for secrets management, and CloudNativePG (CNPG) for the database.
The reference domain is maps.gransassotech.it (or equivalent assigned by GST). DNS management is handled via AWS Route53, with a hosted zone dedicated to the project.
Before starting provisioning, the following must be available:
| Resource | Type | Notes |
|---|---|---|
| DigitalOcean Account | New GST account or separate project | With active billing |
| DigitalOcean API Token | Personal Access Token with read+write scope | For Pulumi DO provider |
| AWS Account | For Route53 and AWS Parameter Store | GST account dedicated to the project |
| Route53 Hosted Zone | For the maps.gransassotech.it domain | Zone ID required for Pulumi |
| AWS IAM Credentials | Access Key + Secret Key with Route53 + SSM permissions | For external-dns and external-secrets |
| S3 Bucket | For Pulumi state | E.g. gst-maps-pulumi-state |
| AWS KMS Key | For Pulumi state encryption | Alias pulumi-maps |
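Two of the prerequisites above feed directly into Pulumi's state configuration: the S3 bucket becomes the backend URL for `pulumi login`, and the KMS alias becomes the stack's secrets provider (Pulumi's documented `awskms://alias/...` format). A minimal sketch; the AWS region is an assumption here, not taken from the table:

```python
# Sketch: derive Pulumi backend and secrets-provider strings from the
# prerequisite resources above. Helper names are illustrative.

def pulumi_backend_url(bucket: str) -> str:
    """S3 backend URL passed to `pulumi login`."""
    return f"s3://{bucket}"

def kms_secrets_provider(alias: str, region: str) -> str:
    """Secrets provider passed to `pulumi stack init --secrets-provider`."""
    return f"awskms://alias/{alias}?region={region}"

print(pulumi_backend_url("gst-maps-pulumi-state"))
# s3://gst-maps-pulumi-state
print(kms_secrets_provider("pulumi-maps", "eu-central-1"))  # region assumed
# awskms://alias/pulumi-maps?region=eu-central-1
```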
Provisioning is implemented with a maps Pulumi stack structured as follows:
```
pulumi/maps/
├── __main__.py          # Entry point
├── cluster.py           # DOKS cluster creation
├── components.py        # Helm component installation
├── security.py          # IAM, network policies, secrets
├── dns.py               # Route53 DNS records
├── services.py          # Namespaces and service configuration
├── utils.py             # Helpers
├── Pulumi.yaml          # Project definition
├── Pulumi.maps.yaml     # maps stack configuration
├── helm/
│   └── maps/            # Helm values for the maps stack
│       ├── cert-manager.yaml
│       ├── external-secrets.yaml
│       ├── cnpg.yaml
│       ├── ingress-nginx.yaml
│       ├── external-dns.yaml
│       └── metrics-server.yaml
└── manifests/
    └── maps/            # Additional Kubernetes manifests
        └── cluster-issuer.yaml
```

```yaml
# Pulumi.maps.yaml (excerpt)
config:
  env: maps
  defaultDOregion: fra1
  route53Domains:
    - maps.gransassotech.it
  cluster:
    name: maps
    version: "1.33"  # Update to the latest available stable version
    control_plane_high_availability: false
    node_pools:
      - name: general-purpose
        size: s-2vcpu-4gb
        count: 2
      - name: workloads
        size: s-4vcpu-8gb
        autoscale:
          min: 1
          max: 3
```

The general-purpose pool (2 fixed nodes, 2 vCPU/4 GB each) hosts system components: ingress-nginx, cert-manager, external-dns, external-secrets. The workloads pool (1-3 autoscaling nodes, 4 vCPU/8 GB each) hosts application services: Prefect, PostgreSQL (CNPG), OpenMetadata, CKAN. The sizing is consistent with that described in deliverable D2.1.2: under normal conditions 1 workloads node is sufficient, with ETL peaks triggering autoscaling up to 3 nodes.
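One way to sanity-check this sizing is to total the capacity the two pools provide at their autoscaling bounds. A small sketch; the pool definitions mirror the excerpt above, and the helper names are illustrative:

```python
# Sketch: aggregate cluster capacity implied by the node pools above.
# DigitalOcean basic-droplet slugs encode size as s-<vcpu>vcpu-<ram>gb.
import re

def parse_size(slug: str) -> tuple[int, int]:
    """Return (vcpu, ram_gb) parsed from a size slug like s-2vcpu-4gb."""
    m = re.fullmatch(r"s-(\d+)vcpu-(\d+)gb", slug)
    if not m:
        raise ValueError(f"unrecognized size slug: {slug}")
    return int(m.group(1)), int(m.group(2))

def capacity(pools: list[dict]) -> dict:
    """Total (vcpu, ram_gb) at the minimum and maximum node counts."""
    lo_cpu = lo_ram = hi_cpu = hi_ram = 0
    for p in pools:
        cpu, ram = parse_size(p["size"])
        if "count" in p:  # fixed-size pool
            lo, hi = p["count"], p["count"]
        else:             # autoscaling pool
            lo, hi = p["autoscale"]["min"], p["autoscale"]["max"]
        lo_cpu += lo * cpu; lo_ram += lo * ram
        hi_cpu += hi * cpu; hi_ram += hi * ram
    return {"min": (lo_cpu, lo_ram), "max": (hi_cpu, hi_ram)}

pools = [
    {"name": "general-purpose", "size": "s-2vcpu-4gb", "count": 2},
    {"name": "workloads", "size": "s-4vcpu-8gb", "autoscale": {"min": 1, "max": 3}},
]
print(capacity(pools))  # {'min': (8, 16), 'max': (16, 32)}
```

So the cluster ranges from 8 vCPU / 16 GB at idle to 16 vCPU / 32 GB during ETL peaks.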
| Component | Chart | Repo | Namespace | Notes |
|---|---|---|---|---|
| metrics-server | metrics-server | kubernetes-sigs | kube-system | HPA metrics |
| cert-manager | cert-manager | jetstack | cert-manager | CRDs enabled |
| ingress-nginx | ingress-nginx | kubernetes-nginx | ingress-nginx | DO LoadBalancer |
| external-dns | external-dns | kubernetes-sigs | kube-system | AWS Route53 provider |
| external-secrets | external-secrets | external-secrets | external-secrets | CRDs enabled |
| cnpg | cloudnative-pg | cloudnative-pg | cnpg-system | PostgreSQL Operator |
The Helm values for each component follow the standard configurations of the listed charts, adapted only for the project domain and credentials.
```yaml
# manifests/maps/cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: admin@gransassotech.org
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```

| Record | Type | Destination |
|---|---|---|
| maps.k8s.maps.gransassotech.it | A | DOKS Load Balancer IP (managed by Pulumi/external-dns) |
| prefect.maps.gransassotech.it | CNAME | maps.k8s.maps.gransassotech.it |
| metadata.maps.gransassotech.it | CNAME | maps.k8s.maps.gransassotech.it |
| ckan.maps.gransassotech.it | CNAME | maps.k8s.maps.gransassotech.it |
The Load Balancer A record is created automatically by Pulumi via setup_public_lb_dns_record(). The CNAME records for individual services are created automatically by external-dns upon reading the Kubernetes Ingress resources.
Pulumi automatically creates an IAM user external-dns-maps with a policy limited to route53:ChangeResourceRecordSets on the maps.gransassotech.it hosted zone. The credentials are injected as a Kubernetes Secret in the kube-system namespace.
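The scoped policy can be sketched as an IAM policy document. The hosted-zone ID below is a placeholder, and the read-only `List*` statement is an assumption based on what external-dns typically needs for record discovery; the text above only names the change action:

```python
# Sketch of the scoped IAM policy attached to the external-dns-maps user.
# HOSTED_ZONE_ID is a placeholder; the real ID comes from the Route53
# hosted zone listed in the prerequisites.
import json

HOSTED_ZONE_ID = "ZXXXXXXXXXXXXX"  # placeholder

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Write access, limited to the project's hosted zone (as stated above)
            "Effect": "Allow",
            "Action": ["route53:ChangeResourceRecordSets"],
            "Resource": [f"arn:aws:route53:::hostedzone/{HOSTED_ZONE_ID}"],
        },
        {
            # Read-only discovery actions (assumption: standard external-dns needs)
            "Effect": "Allow",
            "Action": ["route53:ListHostedZones", "route53:ListResourceRecordSets"],
            "Resource": ["*"],
        },
    ],
}
print(json.dumps(policy, indent=2))
```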
Application secrets are stored in AWS Parameter Store with the following path schema:
```
/maps/{service}/{parameter}
```

Examples:

```
/maps/postgres/password
/maps/prefect/secret-key
/maps/openmetadata/jwt-secret
/maps/ckan/api-key
```

For each application service, Pulumi creates:

- an IAM user external-secrets-{service}-maps with a read-access policy (ssm:GetParameter*) on the parameters at path /maps/{service}/*
- a SecretStore resource referencing the credentials
- an ExternalSecret resource mapping SSM parameters to Kubernetes Secrets

Secrets are synchronized with a 1-hour refresh interval.
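A hedged sketch of the rendered ExternalSecret, expressed as the Python dict a Pulumi program would emit. The SecretStore name, target secret name, and `v1beta1` API version are illustrative assumptions, not taken from the actual stack:

```python
# Sketch: the ExternalSecret resource rendered per service, mapping
# Parameter Store paths to keys in a Kubernetes Secret.

def external_secret(service: str, parameters: list[str]) -> dict:
    return {
        "apiVersion": "external-secrets.io/v1beta1",  # assumed API version
        "kind": "ExternalSecret",
        "metadata": {"name": f"{service}-secrets", "namespace": "maps"},
        "spec": {
            "refreshInterval": "1h",  # matches the sync interval stated above
            "secretStoreRef": {"name": f"{service}-store", "kind": "SecretStore"},
            "target": {"name": f"{service}-secrets"},
            "data": [
                {
                    "secretKey": p,
                    # remoteRef.key follows the /maps/{service}/{parameter} schema
                    "remoteRef": {"key": f"/maps/{service}/{p}"},
                }
                for p in parameters
            ],
        },
    }

es = external_secret("postgres", ["password"])
print(es["spec"]["data"][0]["remoteRef"]["key"])  # /maps/postgres/password
```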
System secrets (non-application) are stored encrypted in Pulumi.maps.yaml via AWS KMS:
- external-dns-credentials: IAM credentials for external-dns
- gitlab-runner-secret: CI/CD runner token (if configured)

The PostgreSQL database for the MAPS project is managed via the CNPG operator, already installed in the cluster. The CNPG Cluster resource for MAPS specifies:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: maps-postgres
  namespace: maps
spec:
  instances: 2
  imageName: ghcr.io/cloudnative-pg/postgis:17
  postgresql:
    parameters:
      shared_buffers: "256MB"
      max_connections: "100"
  storage:
    size: 100Gi
    storageClass: do-block-storage
  backup:
    barmanObjectStore:
      destinationPath: s3://gst-maps-backups/postgres
      s3Credentials:
        accessKeyId:
          name: maps-backup-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: maps-backup-credentials
          key: SECRET_ACCESS_KEY
      wal:
        retention: "7d"
    retentionPolicy: "30d"
```

CNPG automatically manages: synchronous replication between the 2 instances, automatic failover, continuous WAL backup to S3, and point-in-time recovery.
A dedicated S3 bucket (DigitalOcean Spaces or AWS S3) gst-maps-backups is required, with the corresponding write-access credentials for CNPG.
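For illustration, an application pod could assemble its connection string from CNPG's conventional naming: a read-write Service named `<cluster>-rw` and, by default, a database named `app`. Both conventions are assumptions here, not stated in the manifest above:

```python
# Sketch: build a PostgreSQL DSN for in-cluster access to the CNPG cluster.
# Assumes CNPG's conventional <cluster>-rw Service and default "app" database.
from urllib.parse import quote

def cnpg_dsn(user: str, password: str,
             cluster: str = "maps-postgres",
             namespace: str = "maps",
             database: str = "app") -> str:
    # In-cluster DNS name of the read-write Service created by CNPG
    host = f"{cluster}-rw.{namespace}.svc.cluster.local"
    # Percent-encode credentials so special characters survive in the URL
    return f"postgresql://{quote(user)}:{quote(password, safe='')}@{host}:5432/{database}"

print(cnpg_dsn("app", "s3cr3t"))
# postgresql://app:s3cr3t@maps-postgres-rw.maps.svc.cluster.local:5432/app
```

In practice the user and password would come from the Kubernetes Secret CNPG generates for the application user, not from literals.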
MAPS application services are deployed in the maps namespace:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: maps
  labels:
    environment: production
    project: gst-maps
```

Running `pulumi up --stack maps` from the `pulumi/maps/` directory performs the provisioning operations in order. The required tools and command sequence are:
```
# Tools required on the provisioning machine
pulumi >= 3.0
python >= 3.11
poetry
kubectl
doctl (DigitalOcean CLI)
aws CLI (configured with maps profile)
```

```shell
# 1. Configure AWS profile
export AWS_PROFILE=maps

# 2. Login to Pulumi state on S3
pulumi login s3://gst-maps-pulumi-state

# 3. Select stack
pulumi stack select maps

# 4. Preview (verify without applying)
pulumi preview

# 5. Apply
pulumi up
```

| Service | URL | Authentication |
|---|---|---|
| Prefect UI | https://prefect.maps.gransassotech.it | Prefect credentials |
| OpenMetadata | https://metadata.maps.gransassotech.it | Admin OIDC |
| CKAN | https://ckan.maps.gransassotech.it | CKAN Admin |
| PostgreSQL | Internal to cluster (port 5432) | Internal pods only |
| Item | Configuration | Monthly Cost |
|---|---|---|
| DOKS control plane | Managed | $12 |
| general-purpose pool | 2x s-2vcpu-4gb | $48 ($24/node) |
| workloads pool (min) | 1x s-4vcpu-8gb | $48 |
| workloads pool (max) | 3x s-4vcpu-8gb | $144 |
| CNPG block storage | 100 GB | $10 |
| Bronze block storage | 100 GB | $10 |
| Load Balancer | HTTPS Ingress | $12 |
| Spaces (backup) | 250 GB | $5 |
| Total (min) | | ~$145/month |
| Total (max, ETL peak) | | ~$241/month |
The difference compared to the D2.1.2 estimate ($311/month) is due to the adoption of self-hosted CNPG instead of Managed PostgreSQL ($80/month), and the autoscaling mechanism that reduces costs during idle periods.
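The totals above can be reproduced from the per-item figures; a small sketch, with the workloads node count as the only variable:

```python
# Sketch: reproduce the monthly cost totals from the table above.
costs = {
    "doks_control_plane": 12,
    "general_purpose_pool": 2 * 24,  # 2x s-2vcpu-4gb at $24/node
    "workloads_node": 48,            # per s-4vcpu-8gb node
    "cnpg_block_storage": 10,
    "bronze_block_storage": 10,
    "load_balancer": 12,
    "spaces_backup": 5,
}

def monthly_total(workload_nodes: int) -> int:
    """Total monthly cost in USD for a given workloads pool size (1-3)."""
    fixed = sum(v for k, v in costs.items() if k != "workloads_node")
    return fixed + workload_nodes * costs["workloads_node"]

print(monthly_total(1))  # 145 (idle: 1 workloads node)
print(monthly_total(3))  # 241 (ETL peak: 3 workloads nodes)
```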