
Senior Site Reliability Engineer

EPAM · salary not specified · Ho Chi Minh City, Viet Nam · company website · published June 5, 2026

Job description

EPAM Vietnam is hiring a Senior Site Reliability Engineer to support and stabilize a complex, business-critical environment. This is a hands-on, high-ownership role responsible for production incidents, releases, monitoring, alerting and operational excellence.
You will work across Linux, Windows, SQL Server, CI/CD, Kubernetes and Azure while supporting both modern cloud workloads and legacy business-critical systems.

Responsibilities

  • Own production incidents end-to-end, from triage to fix and follow-up
  • Troubleshoot Linux & Windows systems, services and databases
  • Operate and improve monitoring and alerting tools
  • Support batch workflows and schedulers
  • Work across production and disaster recovery environments
  • Improve runbooks, alert quality and operational processes

Requirements

  • Strong experience in production operations, SRE or infrastructure support
  • Proven expertise in troubleshooting Linux and Windows production systems and operational knowledge of Microsoft SQL Server diagnostics
  • Experience with CI/CD pipelines and deployments (e.g., Octopus Deploy, TeamCity and Git/Bitbucket)
  • Proficiency in monitoring and alerting tools (e.g., Prometheus and Grafana)
  • Familiarity with batch scheduling tools (e.g., Control-M and TeamCity) and messaging systems (e.g., RabbitMQ)
  • Working knowledge of Kubernetes and Azure cloud environments
  • Clear communication for incident management and stakeholder interaction
  • Strong sense of ownership, sound judgment in escalation and a proactive approach to production reliability

Skills

  • site reliability engineering
  • Linux
  • SQL
  • CI/CD
  • Kubernetes
  • Git
  • Bitbucket
  • RabbitMQ