Data Center Site Lead

Oracle · salary not specified · Thailand · company website · published May 23, 2026

Job description

About Oracle Cloud Infrastructure (OCI)
Oracle Cloud Infrastructure (OCI) is building the next generation cloud platform that operates at hyperscale across a rapidly expanding global footprint. OCI's mission is to provide customers with high-performance, highly available, and secure cloud infrastructure services. As part of our continued growth, we are seeking an experienced Data Center Site Lead to oversee day-to-day operations, infrastructure deployments, and colocation partner management within our mission-critical data center environments.
Role Overview
The Data Center Site Lead is responsible for leading operational excellence across OCI data center facilities, ensuring high availability, safety, and performance of critical infrastructure. This role will oversee rack deployment activities, infrastructure commissioning, environmental monitoring, and operational governance with colocation providers. The successful candidate will bring hands-on experience from hyperscale cloud environments and possess a strong understanding of both mechanical and electrical systems supporting modern data centers, including liquid-cooled deployments.
This position requires a highly collaborative leader capable of coordinating cross-functional teams, driving operational rigor, and ensuring adherence to service level agreements (SLAs) and operational standards.
Key Responsibilities
Data Center Operations
Lead day-to-day operations of OCI data center facilities to ensure maximum uptime, reliability, and operational efficiency.
Serve as the primary site operational lead for mission-critical infrastructure and customer-impacting events.
Drive operational readiness and continuous improvement initiatives across the site.
Infrastructure Deployment & Capacity Expansion
Oversee server rack deployments, hardware installations, and capacity expansion projects.
Coordinate with internal engineering, network, logistics, and deployment teams to ensure timely execution of infrastructure rollouts.
Support implementation and operational management of liquid-cooled data halls and associated cooling infrastructure.
Commissioning & Infrastructure Readiness
Support commissioning and acceptance testing of electrical and mechanical infrastructure, including:
UPS systems
Switchgear
Power distribution systems
Generators
Cooling systems
CRAH/CRAC units
Liquid cooling systems
Building Management Systems (BMS)
Validate operational readiness prior to production handover.
Environmental Monitoring & Compliance
Monitor and manage critical environmental parameters, including:
Temperature
Humidity
Airflow
Power utilization
Cooling performance
Water leak detection systems
Ensure compliance with OCI operational standards, safety requirements, and regulatory obligations.
Drive root cause analysis and corrective actions for environmental excursions or operational anomalies.
Colocation Provider Management
Act as the primary operational interface with colocation providers.
Conduct regular operational governance meetings and service reviews.
Monitor and enforce contractual SLA adherence and service performance metrics.
Escalate and resolve facility-related issues impacting operations.
Review maintenance activities, change management plans, and risk assessments with providers.
Incident & Change Management
Lead operational bridge calls during incidents and critical events.
Coordinate cross-functional response teams to restore services and mitigate risks.
Ensure proper execution of change management processes and operational procedures.
Drive post-incident reviews and corrective action tracking.
Team Leadership & Stakeholder Engagement
Provide leadership and guidance to site operations personnel and supporting vendors.
Collaborate with global operations, engineering, network, security, and capacity planning teams.
Develop and maintain site operating procedures, runbooks, and operational documentation.
Required Qualifications
Bachelor's degree in Engineering, Data Center Operations, Facilities Management, or a related technical discipline, or equivalent practical experience.
8+ years of experience in data center operations, facilities engineering, or critical environment management.
Prior experience working within a hyperscale cloud provider environment (e.g., Oracle Cloud, AWS, Microsoft Azure, Google Cloud, Meta, or similar).
Demonstrated experience operating and supporting liquid-cooled data center environments.
Experience managing rack deployment programs and large-scale hardware installations.
Strong understanding of critical electrical and mechanical systems supporting data centers.
Experience supporting commissioning, testing, and handover of data center infrastructure.
Experience managing relationships with colocation providers and external vendors.
Proven experience conducting operational reviews, governance meetings, and SLA performance assessments.
Strong incident management and operational escalation experience.
Excellent communication, stakeholder management, and leadership skills.
Preferred Qualifications
Experience operating large-scale AI, HPC, or GPU-intensive infrastructure environments.
Knowledge of data center monitoring systems, BMS, DCIM, and environmental management platforms.
Familiarity with ITIL-based operational processes.
Project management experience supporting capacity expansion and infrastructure programs.
Data center certifications such as:
CDCP
CDCS
DCEP
Uptime Institute Certifications
Relevant electrical or mechanical engineering certifications
Key Competencies
Operational Excellence
Critical Infrastructure Management
Hyperscale Data Center Operations
Liquid Cooling Technologies
Vendor & Colocation Management
Incident Command & Escalation Management
Commissioning & Infrastructure Readiness
Service Level Governance
Leadership & Team Development
Cross-Functional Collaboration
Why Join OCI?
At Oracle Cloud Infrastructure, you will play a critical role in building and operating one of the world's fastest-growing cloud platforms. You'll work alongside industry-leading experts, influence the design and operation of next-generation data centers, and contribute directly to OCI's global expansion and innovation initiatives.
Required Technical Skills & Expertise
Critical Electrical Infrastructure
Strong understanding of end-to-end data center power train architecture, including:
Utility power feeds
Substations and transformers
Medium- and low-voltage switchgear
Automatic Transfer Switches (ATS)
Static Transfer Switches (STS)
Uninterruptible Power Supply (UPS) systems
Power Distribution Units (PDUs)
Remote Power Panels (RPPs)
Busway systems
Generator systems and fuel infrastructure
Ability to assess power capacity, redundancy models (N, N+1, 2N), and operational risk.
Mechanical & Cooling Systems
Deep knowledge of data center mechanical systems, including:
Chillers
Cooling towers
CRAH/CRAC units
Direct-to-chip liquid cooling systems
CDU (Coolant Distribution Unit) operations
Heat rejection systems
Water treatment and leak detection systems
Building Management Systems (BMS)
Experience troubleshooting thermal performance and optimizing cooling efficiency in high-density environments.
IT Systems & Hardware Operations
Experience supporting hyperscale server deployment and lifecycle management.
Strong understanding of:
Server hardware architecture
Storage systems
RAID configurations and storage resiliency concepts
Firmware and hardware maintenance procedures
Asset lifecycle management
Network rack integration and structured cabling practices
Familiarity with hardware diagnostics, break-fix processes, and operational readiness testing.
Industrial Controls & Monitoring Systems
Experience with data center monitoring and automation platforms, including:
DCIM platforms
BMS and EPMS systems
Environmental monitoring systems
Understanding of industrial communication protocols such as:
Modbus TCP/IP
Modbus RTU
SNMP
BACnet
OPC-based monitoring architectures
Ability to interpret telemetry, alarms, trends, and infrastructure performance metrics.
Vendor & Colocation Ecosystem Management
Strong understanding of the data center vendor landscape across:
Electrical infrastructure providers
HVAC and liquid cooling manufacturers
Power systems vendors
Monitoring and controls platforms
Experience working with leading OEMs and service providers such as Schneider Electric, Vertiv, Eaton, Siemens, ABB, Cummins, Caterpillar, Trane, Johnson Controls, Stulz, Carrier, and equivalent industry vendors.
Ability to coordinate maintenance, commissioning, warranty support, and escalation management across multiple vendors and service partners.
Operational Governance & Service Management
Experience leading operational reviews with colocation providers and service partners.
Strong understanding of:
SLA and KPI management
Change management processes
Preventive and corrective maintenance programs
Incident management and root cause analysis (RCA)
Risk assessments and operational readiness reviews
Proven ability to lead technical bridge calls and coordinate cross-functional response teams during critical incidents.
Additional Preferred Qualifications
Experience supporting AI/HPC clusters and GPU-based infrastructure deployments.
Knowledge of ASHRAE thermal guidelines and modern liquid cooling standards.
Familiarity with energy efficiency metrics including PUE, WUE, and cooling optimization strategies.
Experience with commissioning methodologies and Integrated Systems Testing (IST).
Understanding of sustainability initiatives and energy management programs within hyperscale data centers.
Career Level - IC4
Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.
True innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Skills

  • Airflow
  • AWS