Hosting Infrastructure Specifications

Deliverable D2.1.3: Hosting Infrastructure Specifications

Context and Premises

The MAPS infrastructure is built from scratch on a DigitalOcean account dedicated to the GST-MAPS project. Provisioning uses Pulumi as the IaC (Infrastructure as Code) tool, with the same technology stack described in deliverable D2.1.2: DOKS for container orchestration, cert-manager for TLS certificates, external-secrets for secrets management, CloudNative PostgreSQL (CNPG) for the database.

The reference domain is maps.gransassotech.it (or equivalent assigned by GST). DNS management is handled via AWS Route53, with a hosted zone dedicated to the project.

Prerequisite Accounts and Credentials

Before starting provisioning, the following must be available:

| Resource | Type | Notes |
|---|---|---|
| DigitalOcean Account | New GST account or separate project | With active billing |
| DigitalOcean API Token | Personal Access Token with read+write scope | For Pulumi DO provider |
| AWS Account | For Route53 and AWS Parameter Store | GST account dedicated to the project |
| Route53 Hosted Zone | For the maps.gransassotech.it domain | Zone ID required for Pulumi |
| AWS IAM Credentials | Access Key + Secret Key with Route53 + SSM permissions | For external-dns and external-secrets |
| S3 Bucket | For Pulumi state | E.g. gst-maps-pulumi-state |
| AWS KMS Key | For Pulumi state encryption | Alias pulumi-maps |

Pulumi Project Structure

Provisioning is implemented with a maps Pulumi stack structured as follows:

```
pulumi/maps/
├── __main__.py          # Entry point
├── cluster.py           # DOKS cluster creation
├── components.py        # Helm component installation
├── security.py          # IAM, network policies, secrets
├── dns.py               # Route53 DNS records
├── services.py          # Namespaces and service configuration
├── utils.py             # Helpers
├── Pulumi.yaml          # Project definition
├── Pulumi.maps.yaml     # maps stack configuration
├── helm/
│   └── maps/            # Helm values for the maps stack
│       ├── cert-manager.yaml
│       ├── external-secrets.yaml
│       ├── cnpg.yaml
│       ├── ingress-nginx.yaml
│       ├── external-dns.yaml
│       └── metrics-server.yaml
└── manifests/
    └── maps/            # Additional Kubernetes manifests
        └── cluster-issuer.yaml
```

DOKS Cluster Configuration

Cluster Parameters

```yaml
# Pulumi.maps.yaml (excerpt)
config:
  env: maps
  defaultDOregion: fra1
  route53Domains:
    - maps.gransassotech.it

  cluster:
    name: maps
    version: "1.33"          # Update to the latest available stable version
    control_plane_high_availability: false
    node_pools:
      - name: general-purpose
        size: s-2vcpu-4gb
        count: 2
      - name: workloads
        size: s-4vcpu-8gb
        autoscale:
          min: 1
          max: 3
```

Node Pool: Rationale

The general-purpose pool (2 fixed nodes, 2vCPU/4GB) hosts system components: ingress-nginx, cert-manager, external-dns, external-secrets. The workloads pool (1-3 autoscaling nodes, 4vCPU/8GB) hosts application services: Prefect, PostgreSQL (CNPG), OpenMetadata, CKAN. The sizing is consistent with that described in deliverable D2.1.2: under normal conditions 1 workloads node is sufficient, with ETL peaks triggering autoscaling up to 3 nodes.
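To make the sizing concrete, here is a small stdlib-only sketch (a hypothetical helper, not part of the Pulumi stack) that derives the cluster's aggregate capacity bounds from the node-pool configuration above:

```python
import re

# DigitalOcean basic droplet sizes encode resources as s-<vcpu>vcpu-<ram>gb.
DROPLET_RE = re.compile(r"s-(\d+)vcpu-(\d+)gb")

def capacity_bounds(pools):
    """Return ((min_vcpu, min_ram_gb), (max_vcpu, max_ram_gb)) across pools."""
    min_v = min_r = max_v = max_r = 0
    for pool in pools:
        vcpu, ram = map(int, DROPLET_RE.fullmatch(pool["size"]).groups())
        # Fixed pools use 'count'; autoscaling pools use a min/max range.
        lo = pool.get("count", pool.get("autoscale", {}).get("min", 0))
        hi = pool.get("count", pool.get("autoscale", {}).get("max", 0))
        min_v += lo * vcpu
        min_r += lo * ram
        max_v += hi * vcpu
        max_r += hi * ram
    return (min_v, min_r), (max_v, max_r)

# Mirrors the node_pools section of Pulumi.maps.yaml above.
pools = [
    {"name": "general-purpose", "size": "s-2vcpu-4gb", "count": 2},
    {"name": "workloads", "size": "s-4vcpu-8gb", "autoscale": {"min": 1, "max": 3}},
]
print(capacity_bounds(pools))  # ((8, 16), (16, 32))
```

So the cluster runs between 8 vCPU / 16 GB (idle) and 16 vCPU / 32 GB (ETL peak).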

Helm Components to Install

| Component | Chart | Repo | Namespace | Notes |
|---|---|---|---|---|
| metrics-server | metrics-server | kubernetes-sigs | kube-system | HPA metrics |
| cert-manager | cert-manager | jetstack | cert-manager | CRDs enabled |
| ingress-nginx | ingress-nginx | kubernetes-nginx | ingress-nginx | DO LoadBalancer |
| external-dns | external-dns | kubernetes-sigs | kube-system | AWS Route53 provider |
| external-secrets | external-secrets | external-secrets | external-secrets | CRDs enabled |
| cnpg | cloudnative-pg | cloudnative-pg | cnpg-system | PostgreSQL Operator |

The Helm values for each component follow the standard configuration of the listed charts, adapted only for the project's domain and credentials.
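For illustration, a minimal helm/maps/external-dns.yaml could look like the following. This is a sketch, not the project's file: key names follow recent kubernetes-sigs chart versions (they vary between chart releases), and the AWS region is an assumption.

```yaml
# helm/maps/external-dns.yaml — illustrative sketch, not the project's file
provider:
  name: aws
domainFilters:
  - maps.gransassotech.it   # only manage records in the project zone
policy: sync                # allow record creation and deletion
txtOwnerId: maps            # ownership marker for TXT registry records
env:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: external-dns-credentials
        key: ACCESS_KEY_ID
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: external-dns-credentials
        key: SECRET_ACCESS_KEY
```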

Manifest: Let's Encrypt ClusterIssuer

```yaml
# manifests/maps/cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: admin@gransassotech.org
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```
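For illustration, a service Ingress obtains its certificate by referencing the issuer through the standard cert-manager annotation. The backend service name and port below are hypothetical:

```yaml
# Illustrative Ingress for one service (backend name/port are assumptions)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prefect
  namespace: maps
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  rules:
    - host: prefect.maps.gransassotech.it
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prefect-server
                port:
                  number: 4200
  tls:
    - hosts:
        - prefect.maps.gransassotech.it
      secretName: prefect-tls   # cert-manager stores the certificate here
```

external-dns reads the same Ingress host to create the corresponding CNAME record.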

DNS Management

Required Records

| Record | Type | Destination |
|---|---|---|
| maps.k8s.maps.gransassotech.it | A | DOKS Load Balancer IP (managed by Pulumi/external-dns) |
| prefect.maps.gransassotech.it | CNAME | maps.k8s.maps.gransassotech.it |
| metadata.maps.gransassotech.it | CNAME | maps.k8s.maps.gransassotech.it |
| ckan.maps.gransassotech.it | CNAME | maps.k8s.maps.gransassotech.it |

The Load Balancer A record is created automatically by Pulumi via setup_public_lb_dns_record(). The CNAME records for individual services are created automatically by external-dns upon reading the Kubernetes Ingress resources.

IAM Credentials for external-dns

Pulumi automatically creates an IAM user external-dns-maps with a policy limited to route53:ChangeResourceRecordSets on the maps.gransassotech.it hosted zone. The credentials are injected as a Kubernetes Secret in the kube-system namespace.
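As a sketch (not the project's actual security.py), the scoped policy document could be generated as below. Note that external-dns's AWS tutorial also requires the read-only List* actions alongside ChangeResourceRecordSets; the zone ID is a placeholder.

```python
import json

def external_dns_policy(zone_id: str) -> str:
    """Build the scoped policy document for the external-dns-maps IAM user.

    Hypothetical sketch; the statement shape follows external-dns's
    AWS tutorial.
    """
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["route53:ChangeResourceRecordSets"],
                # Writes are limited to the project's hosted zone only.
                "Resource": [f"arn:aws:route53:::hostedzone/{zone_id}"],
            },
            {
                "Effect": "Allow",
                "Action": [
                    "route53:ListHostedZones",
                    "route53:ListResourceRecordSets",
                ],
                "Resource": ["*"],
            },
        ],
    })

print(external_dns_policy("Z0123456789ABC"))
```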

Secrets Management

AWS Parameter Store Structure

Application secrets are stored in AWS Parameter Store with the following path schema:

```
/maps/{service}/{parameter}
```

Examples:

```
/maps/postgres/password
/maps/prefect/secret-key
/maps/openmetadata/jwt-secret
/maps/ckan/api-key
```
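A tiny stdlib-only sketch (hypothetical helper, not project code) of how this path schema maps a service's parameter names to the `data` entries of an ExternalSecret:

```python
def external_secret_data(service: str, parameters: list[str]) -> list[dict]:
    """Map SSM parameter names to ExternalSecret data entries.

    Each entry binds the SSM path /maps/<service>/<parameter> to a key of
    the resulting Kubernetes Secret.
    """
    return [
        {"secretKey": p, "remoteRef": {"key": f"/maps/{service}/{p}"}}
        for p in parameters
    ]

print(external_secret_data("postgres", ["password"]))
# [{'secretKey': 'password', 'remoteRef': {'key': '/maps/postgres/password'}}]
```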

External Secrets Operator

For each application service, Pulumi creates:

  1. An IAM user external-secrets-{service}-maps with a read-only policy (ssm:GetParameter*) scoped to parameters under the path /maps/{service}/*
  2. The IAM credentials as a Kubernetes Secret in the service namespace
  3. A SecretStore resource referencing the credentials
  4. An ExternalSecret resource mapping SSM parameters to Kubernetes Secrets

Secrets are synchronized with a 1-hour refresh interval.
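As an illustration of steps 3 and 4 for one service (postgres), the pair of resources could look like the following. This is a sketch: the region and Secret names are assumptions, while the field names follow the External Secrets Operator v1beta1 API.

```yaml
# Illustrative SecretStore/ExternalSecret pair (names and region assumed)
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: postgres-parameter-store
  namespace: maps
spec:
  provider:
    aws:
      service: ParameterStore
      region: eu-central-1
      auth:
        secretRef:
          accessKeyIDSecretRef:
            name: external-secrets-postgres-credentials
            key: ACCESS_KEY_ID
          secretAccessKeySecretRef:
            name: external-secrets-postgres-credentials
            key: SECRET_ACCESS_KEY
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: postgres-secrets
  namespace: maps
spec:
  refreshInterval: 1h            # the 1-hour sync mentioned above
  secretStoreRef:
    name: postgres-parameter-store
    kind: SecretStore
  target:
    name: postgres-secrets       # resulting Kubernetes Secret
  data:
    - secretKey: password
      remoteRef:
        key: /maps/postgres/password
```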

Secrets Managed Directly by Pulumi

System secrets (non-application) are stored encrypted in Pulumi.maps.yaml via AWS KMS:

  • external-dns-credentials: IAM credentials for external-dns
  • gitlab-runner-secret: CI/CD runner token (if configured)

Database: CloudNative PostgreSQL (CNPG)

The PostgreSQL database for the MAPS project is managed via the CNPG operator, already installed in the cluster. The CNPG Cluster resource for MAPS specifies:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: maps-postgres
  namespace: maps
spec:
  instances: 2
  imageName: ghcr.io/cloudnative-pg/postgis:17
  postgresql:
    parameters:
      shared_buffers: "256MB"
      max_connections: "100"
  storage:
    size: 100Gi
    storageClass: do-block-storage
  backup:
    barmanObjectStore:
      destinationPath: s3://gst-maps-backups/postgres
      s3Credentials:
        accessKeyId:
          name: maps-backup-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: maps-backup-credentials
          key: SECRET_ACCESS_KEY
    retentionPolicy: "30d"   # applies to base backups and their WAL archives
```

CNPG automatically manages: streaming replication between the 2 instances (asynchronous by default; synchronous replication is opt-in via minSyncReplicas/maxSyncReplicas), automatic failover, continuous WAL archiving to S3, and point-in-time recovery.
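By CNPG convention the operator exposes read-write and read-only Services named `<cluster>-rw` / `<cluster>-ro` and, with the default initdb bootstrap, an application credentials Secret `<cluster>-app`. An application pod could consume them roughly as follows (a sketch; the environment variable names are hypothetical):

```yaml
# Illustrative container env referencing the CNPG-generated resources
env:
  - name: DATABASE_HOST
    value: maps-postgres-rw.maps.svc.cluster.local   # read-write Service
  - name: DATABASE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: maps-postgres-app   # Secret created by the initdb bootstrap
        key: password
```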

Storage for Backups

A dedicated S3 bucket (DigitalOcean Spaces or AWS S3) gst-maps-backups is required, with the corresponding write-access credentials for CNPG.

Kubernetes Namespace for MAPS

MAPS application services are deployed in the maps namespace:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: maps
  labels:
    environment: production
    project: gst-maps
```

Provisioning Procedure

Execution Order

Running pulumi up --stack maps from the pulumi/maps/ directory performs the following operations in order:

  1. DOKS cluster creation with the two node pools
  2. Network policy and IAM user creation for external-dns
  3. System Kubernetes secrets creation
  4. Helm component installation (in parallel where possible)
  5. Kubernetes manifest application (ClusterIssuer, etc.)
  6. Load Balancer DNS record configuration on Route53
  7. Namespace and application service secrets configuration

Local Prerequisites

```bash
# Tools required on the provisioning machine
pulumi >= 3.0
python >= 3.11
poetry
kubectl
doctl   # DigitalOcean CLI
aws     # AWS CLI, configured with the "maps" profile
```

Commands

```bash
# 1. Configure AWS profile
export AWS_PROFILE=maps

# 2. Login to Pulumi state on S3
pulumi login s3://gst-maps-pulumi-state

# 3. Select stack
pulumi stack select maps

# 4. Preview (verify without applying)
pulumi preview

# 5. Apply
pulumi up
```

Service Access After Deployment

| Service | URL | Authentication |
|---|---|---|
| Prefect UI | https://prefect.maps.gransassotech.it | Prefect credentials |
| OpenMetadata | https://metadata.maps.gransassotech.it | Admin OIDC |
| CKAN | https://ckan.maps.gransassotech.it | CKAN Admin |
| PostgreSQL | Internal to cluster (port 5432) | Internal pods only |

DigitalOcean Cost Estimate

| Item | Configuration | Monthly Cost |
|---|---|---|
| DOKS control plane | Managed | $12 |
| general-purpose pool | 2x s-2vcpu-4gb | $48 ($24/node) |
| workloads pool (min) | 1x s-4vcpu-8gb | $48 |
| workloads pool (max) | 3x s-4vcpu-8gb | $144 |
| CNPG block storage | 100 GB | $10 |
| Bronze block storage | 100 GB | $10 |
| Load Balancer | HTTPS Ingress | $12 |
| Spaces (backup) | 250 GB | $5 |
| Total (min) | | ~$145/month |
| Total (max, ETL peak) | | ~$241/month |

The difference compared to the D2.1.2 estimate ($311/month) is due to the adoption of self-hosted CNPG instead of Managed PostgreSQL ($80/month), and the autoscaling mechanism that reduces costs during idle periods.
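The totals can be reproduced from the per-item figures in the cost table:

```python
# Sanity check of the monthly totals (USD); figures copied from the table.
fixed = {
    "doks_control_plane": 12,
    "general_purpose_pool": 48,   # 2x s-2vcpu-4gb
    "cnpg_block_storage": 10,
    "bronze_block_storage": 10,
    "load_balancer": 12,
    "spaces_backup": 5,
}
workloads_node = 48               # one s-4vcpu-8gb node

total_min = sum(fixed.values()) + 1 * workloads_node  # idle: 1 workloads node
total_max = sum(fixed.values()) + 3 * workloads_node  # ETL peak: 3 nodes
print(total_min, total_max)  # 145 241
```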