- Design, provision, and operate our core platform: multi-env/multi-region Kubernetes clusters; networking, security, and identities; storage (object store, tables/Iceberg/“DuckLake”); and runtimes for Python jobs, notebooks, and services.
- Build code abstractions: Python CLIs/SDKs, templates, and controllers that abstract infra into simple workflows.
- Define IaC standards (Terraform, Helm) for everything: clusters, apps, policies, and data runtimes.
- Design and implement the platform control plane: end-user secrets management, cost controls, tenant/resource limits, SSO, and scheduling.
- Ship observability as a product: metrics/logs/traces, golden dashboards, SLIs/SLOs, runbooks, and incident/postmortem practice.
- You’ll work intensively with the CTO and the software engineers on the team on all of the above.