We are seeking a talented DevOps/Site Reliability Engineer (SRE) with a strong background in DevOps practices and Linux environments, and proficiency in scripting and programming languages such as Bash/Shell, Python, or Golang. You will be responsible for managing and automating the deployment, monitoring, and reliability of our services, working closely with development teams to ensure systems are scalable, resilient, and performance-optimized.
Responsibilities
- Design, build, and maintain CI/CD pipelines to support continuous integration and deployment
- Develop and implement tools to automate operational processes
- Manage and monitor Linux-based systems for performance, availability, and security
- Collaborate with cross-functional teams to optimize system architecture, ensuring high availability and reliability
- Implement and manage infrastructure as code (IaC) tools
- Conduct root cause analysis on production issues and implement corrective actions to prevent recurrence
- Monitor application and infrastructure performance and implement improvements
- Write and maintain scripts in Bash/Shell, Python, or Golang to automate tasks and support infrastructure operations
- Participate in on-call rotation to support production systems
Experience: 4+ years in DevOps, SRE, or related roles.
Technical Skills:
- Proficient in Bash/Shell scripting
- Solid programming skills in Python or Golang
- Strong understanding of Linux systems
- Familiarity with cloud platforms (AWS, GCP, Azure) and containerization tools like Docker and Kubernetes
Monitoring & Logging: Familiarity with monitoring and logging tools such as Prometheus, Grafana, Datadog, and Splunk.