Location
Pangyo, South Korea
Salary
Undisclosed
Category
AI Engineer
Date posted
February 9, 2026
Job type
Full-time
Experience level

Job Description

We are looking for the best

At 42dot, our AI Infrastructure Engineer manages the high-performance AI infrastructure orchestrating thousands of GPUs across multiple data centers. You will contribute to the scaling, monitoring, and operational optimization required to maintain a robust and world-class computing environment.

Responsibilities

  • Operate and maintain a large-scale GPU cluster consisting of thousands of GPUs across multiple data centers using Kubernetes and Slurm.

  • Monitor and diagnose failures across the GPU hardware and software stacks to ensure high availability and rapid recovery.

  • Develop automation tools and scripts using Python or Shell to streamline repetitive infrastructure management tasks and improve operational efficiency.

  • Manage GPU resource quotas and provide technical support to ML researchers to ensure optimal utilization of computing resources.

  • Participate in the architectural design and performance tuning of distributed training environments for large-scale autonomous driving models.

Qualifications

  • Strong proficiency in Linux operating systems, including a solid understanding of kernel operations, process management, and system security.

  • Practical experience with containerization technologies (Docker) and orchestration (Kubernetes), including building, managing, and troubleshooting containerized environments.

  • Solid understanding of networking fundamentals, including TCP/IP and HTTP(S), with the ability to perform basic network troubleshooting.

  • Ability to write clean and maintainable scripts in Python or Shell for automation and system administration.

  • Logical approach to problem-solving with the persistence to identify and resolve root causes in complex, large-scale systems.

  • Strong communication skills to effectively collaborate with cross-functional teams and external partners.

Preferred Qualifications

  • Experience in building observability stacks with Prometheus, Grafana, and Datadog for large-scale clusters.

  • Experience in building or operating infrastructure on public cloud platforms such as AWS or GCP.

  • Knowledge of the NVIDIA accelerated computing stack, including drivers, CUDA, and NCCL.

  • Familiarity with the ML model training lifecycle and deep learning frameworks such as PyTorch or TensorFlow.

  • Experience with large-scale workload managers or resource scheduling tools such as Kubernetes or Slurm.

  • Familiarity with Infrastructure as Code (IaC) tools such as Terraform to manage complex infrastructure.

Interview Process

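  • Document screening - Online coding test - Video interview (about 1 hour) - In-person interview (about 3 hours) - Final offer

  • The process may differ by role and may change depending on schedule and circumstances.

  • The schedule and results of each stage will be sent individually to the email address registered in your application.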

Additional Information

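  • When submitting your resume, please exclude information that employers are prohibited from requesting under the Fair Hiring Procedure Act, such as resident registration number, family relations, marital status, salary, photo, physical condition, and place of origin.

  • Please upload all files as PDFs of 30MB or less. (If a problem occurs while uploading your resume, please send it to recruit@42dot.ai together with the URL of the position you are applying for.)

  • After the interview process ends, a reference check may be conducted with the applicant's consent.

  • Persons of national merit and those eligible for employment protection receive preferential treatment in accordance with the relevant laws.

  • Holders of a disability registration certificate receive preferential treatment in accordance with the Act on the Employment Promotion and Vocational Rehabilitation of Persons with Disabilities.

  • 42dot does not accept resumes from search firms it has not engaged and does not pay fees for unsolicited resumes.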

※ Please be sure to review the information below before applying.

42dot is hiring an AI Infrastructure Engineer. Apply through The Homebase and make the next move in your career!
Company size
501-1000 employees
Founded in
2019
Headquarters
Seongnam-si, South Korea
Country
Korea, Republic of
Industry
Computer Software
