- Design, build, and maintain Go microservices that handle AI model inference, data processing pipelines, and real-time streaming workflows.
- Architect scalable APIs (gRPC/REST) that bridge AI models and production applications.
- Own the Kubernetes infrastructure (EKS), including deployments, autoscaling policies, service mesh, and cluster health monitoring.
- Implement service-to-service communication using gRPC and message queues (RabbitMQ/SQS) for asynchronous processing.
- Integrate with cloud AI services (AWS Bedrock, OpenAI, Anthropic) and manage model-serving infrastructure.
- Build multi-tenant capabilities, including authentication (JWT/JWKS), rate limiting, usage tracking, and tenant isolation.
- Partner with the Data & AI team to productionize machine learning models, wrapping them in production-ready services with proper health checks, circuit breakers, and graceful degradation.
- Build comprehensive observability: structured logging, metrics (Prometheus), distributed tracing (Jaeger/Tempo), and alerting.
- Implement CI/CD pipelines and infrastructure-as-code (Terraform) for automated deployments and disaster recovery.
- Ensure high availability through proactive monitoring, incident response, and post-mortem analysis.
- Optimize resource utilization for GPU workloads and implement cost-efficient scaling strategies.