Required Skills - advanced knowledge of:
- AWS
- Infrastructure as Code (Terraform)
- Ansible automation
- Kubernetes
- CI/CD (GitLab, Jenkins or Bamboo)
- Python, Perl, or Golang
- Best practices and IT operations for an always-up, always-available, mission-critical service
Desired Experience:
- EKS, ECS, ECR
- Working in an agile environment, focused on rapid cycles and CD
- Supporting, analyzing, and troubleshooting large-scale distributed mission-critical systems
- Building software and/or platforms where security, regulatory compliance, and high availability are critical
- Strong understanding of Information Security in various environments
Responsibilities:
- Set up, integrate, and maintain a scalable, stable set of CI/CD tools to support development, testing, and security scanning
- Be accountable for a large-scale SaaS application with a mission-critical customer base
- Manage multiple tools, infrastructure, and roles in a fast-paced environment
- Own the availability of our SaaS infrastructure and application
- Implement best-in-class AWS solutions using infrastructure as code
- Collaborate with engineering and product to continuously improve service availability and quality
- Be involved in the entire production lifecycle: code deployments, infrastructure management, and troubleshooting
- Share ownership with the Dev team for service availability and proactive issue prevention, using structured troubleshooting to mitigate issues
- Work closely with our Dev and DevOps teams to ensure that our production services are secure, scalable, performant, and resilient