Director, Data Center Operations - North America

Location
United States
Salary
USD 220,000 - 330,000 (Yearly)
Date posted
October 31, 2025
Job type
Full-time
Experience level
Mid level

Job Description

Lambda, The Superintelligence Cloud, builds Gigawatt-scale AI Factories for Training and Inference. Lambda’s mission is to make compute as ubiquitous as electricity and give every person access to artificial intelligence. One person, one GPU.


If you'd like to build the world's best deep learning cloud, join us. 

Lambda, Inc. is seeking a highly skilled and experienced Director of Data Center Operations to lead and support Lambda Data Center Operations in North America.


What You'll Do:

As Director of Data Center Operations for North America, you will lead and support large-scale AI and high-performance computing (HPC) infrastructure across all of Lambda’s North American data centers. You will oversee all aspects of data center operations, including reliability, hardware break/fix, capacity planning, provider interface, team mentorship, and new data center setup, ensuring world-class uptime, customer response, and scalability to meet rapidly growing AI infrastructure demands.

Key Responsibilities:

Strategic Leadership

  • Develop and execute the North American data center operations strategy aligned with AI infrastructure goals and organizational growth.

  • Drive continuous improvement across facility operations, emphasizing sustainability, efficiency, and resilience.

  • Partner with Engineering, Capacity Planning, and Infrastructure teams to forecast and support future AI and GPU-based compute requirements, and provide operational feedback on designs and system improvements.

  • Oversee expansion projects, retrofits, and site selection in collaboration with Data Center Infrastructure Engineering and HPC Architecture teams.

Operational Excellence

  • Lead a multi-site operations team ensuring 24/7/365 reliability, availability, and SLA response across all facilities.

  • Establish standardized procedures, metrics, and best practices for preventive maintenance, incident management, and service delivery.

  • Monitor operational KPIs including uptime, PUE, safety, and compliance with corporate and regulatory standards.

  • Implement automation and AI-driven monitoring solutions to optimize system performance and support predictive maintenance.

  • Coordinate data center provider maintenance windows and communicate them to customers and impacted teams.

Team Leadership and Development

  • Build, mentor, and scale a high-performing team of operations managers, technicians, and engineers across multiple regions.

  • Routinely visit all sites to maintain standards, develop relationships, and identify areas of efficiency.

  • Foster a culture of safety, accountability, and continuous learning, encouraging the data center operations team to take on more responsibility and work up the stack.

  • Assist in the build-out of new data center whitespace and the deployment of AI infrastructure.

Financial and Vendor Management

  • Develop and manage operating budgets, capital expenditures, and cost-optimization initiatives.

  • Oversee strategic vendor partnerships with numerous data center providers for power, cooling, maintenance, and critical infrastructure components.

Risk and Compliance

  • Ensure compliance with environmental, safety, and industry regulations (e.g., NFPA, OSHA, ISO standards).

  • Lead incident response and root cause analysis to drive preventive improvements for incidents related to data center operations or infrastructure.

  • Act as the primary point of contact for compliance audits related to data center operations (e.g., SOC 2, ISO).

Qualifications:

  • 10+ years of experience in data center operations, with at least 7 years in a leadership role managing multi-site or hyperscale facilities.

  • Proven experience supporting AI, HPC, or cloud infrastructure at scale.

  • Deep understanding of power and cooling systems, networking, capacity planning, and facility automation tools (DCIM, BMS, etc.).

  • Strong track record of improving operational efficiency and managing relationships with data center providers.

  • Bachelor’s degree in Engineering, Computer Science, or a related field preferred; a Master’s degree is a bonus.

  • Exceptional communication, cross-functional collaboration, and stakeholder management skills, with the ability to build relationships, consensus, and a positive team culture.

  • Willingness to travel (up to 50%) to data center sites across North America, including sites under construction.

Preferred Skills:

  • Experience with GPU clusters, AI infrastructure networking, and large-scale storage systems.

  • Familiarity with cloud-scale operational practices (e.g., AWS, Google, Microsoft data center standards).

  • Certifications such as CDCDP, CDCP, PMP, or PE are a plus.

Salary Range Information

The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda

  • Founded in 2012, ~400 employees (2025) and growing fast

  • We offer generous cash & equity compensation

  • Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.

  • We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability

  • Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG

  • Health, dental, and vision coverage for you and your dependents

  • Wellness and Commuter stipends for select roles

  • 401(k) plan with 2% company match (USA employees)

  • Flexible Paid Time Off Plan that we all actually use

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.

Lambda AI is hiring a Director, Data Center Operations - North America. Apply through Homebase and make the next move in your career!
Company size
501-1000 employees
Founded in
2012
Headquarters
San Francisco, CA, United States
Country
United States
Industry
Computer Software
