About this role
Our client is seeking a skilled Software Engineer specializing in Site Reliability Engineering (SRE) to join their dynamic team. The ideal candidate will play a crucial role in ensuring the reliability, availability, and performance of our client’s digital products.
Key Responsibilities:
- Design, build, and maintain scalable and reliable systems.
- Monitor system performance and proactively identify and troubleshoot issues.
- Collaborate with development teams to enhance system architecture and improve operational efficiency.
- Implement automation tools and frameworks to streamline processes.
- Conduct post-incident reviews and implement corrective actions.
- Develop and maintain documentation for systems and processes.
Required Skills & Qualifications:
- Proficiency in programming languages such as Python, Java, or Go.
- Experience with cloud platforms like AWS or Azure.
- Strong understanding of containerization technologies (Docker, Kubernetes).
- Familiarity with monitoring tools (Prometheus, Grafana).
- Knowledge of CI/CD pipelines and DevOps practices.
- Excellent problem-solving skills and ability to work in a fast-paced environment.
Experience:
- 4–6 years of relevant experience in software engineering and site reliability engineering.
What we offer:
- A collaborative work environment with opportunities for professional growth.
- The chance to work on cutting-edge technologies in a rapidly growing industry.
- A culture that values innovation and encourages new ideas.
This role is managed by AI-First Talent on behalf of our client. Your application is reviewed directly by our talent team.