Design, implement, and maintain infrastructure as code using Terraform across AWS and GCP environments.
Manage and optimize Kubernetes clusters, using Helm for packaging and deployment of applications.
Build and maintain observability stacks using Grafana, Prometheus, Loki, and tracing tools like Grafana Tempo.
Ensure high availability, scalability, and resilience of production systems.
Improve deployment processes with CI/CD pipelines (e.g., GitHub Actions), enabling safe and fast delivery of software.
Support internal teams by providing reliable, well-documented, and secure infrastructure.
Troubleshoot production incidents, perform root cause analysis, and implement postmortem processes.
Maintain and harden Docker-based environments and cloud systems.
Champion best practices in monitoring, incident management, performance tuning, and infrastructure automation.
Collaborate with development, product, and security teams to ensure infrastructure supports business needs.