Location
San Francisco United States
Salary
USD 349,000 - 523,000 (Yearly)
Date posted
September 4, 2025
Job type
Full-time
Experience level
Mid level

Job Description

We're here to help the smartest minds on the planet build Superintelligence. The labs pushing the edge? They run on Lambda. Our gear trains and serves their models, our infrastructure scales with them, and we move fast to keep up. If you want to work on massive, world-changing AI deployments with people who love action and hard problems, we're the place to be.


If you'd like to build the world's best deep learning cloud, join us. 


*Note: This position requires presence in our San Francisco office location 4 days per week; Lambda’s designated work from home day is currently Tuesday.


Engineering at Lambda is responsible for building and scaling our cloud offering. Our scope includes the Lambda website, cloud APIs and systems as well as internal tooling for system deployment, management and maintenance.

What You’ll Do

  • Architect high-performance networking solutions that power cloud platforms, with a focus on ultra-low-latency and high-bandwidth connectivity.

  • Define the network topology and architectural patterns for large-scale GPU clusters, storage backends, and multi-tenant environments.

  • Evaluate, benchmark, and select next-generation network technologies (e.g., InfiniBand NDR/XDR, RoCE, 400G/800G Ethernet) to meet AI workload requirements.

  • Develop and maintain network architecture standards, reference designs, and scalability roadmaps for multi-site and hybrid environments.

  • Partner with compute and storage architects to ensure seamless end-to-end data flow and fault tolerance.

  • Guide network automation strategies and tooling to enable efficient provisioning, telemetry, and operational visibility.

  • Mentor engineers and cross-functional teams on advanced network concepts, troubleshooting, and architectural best practices.

You

  • Proven experience (7+ years) architecting high-performance data center networks, preferably for HPC, AI/ML, or large-scale cloud infrastructure.

  • Deep expertise with InfiniBand (HDR/NDR) and advanced Ethernet fabrics, including RoCE and RDMA protocols.

  • Strong understanding of data center switching architectures, congestion control, QoS, and network virtualization (VXLAN, EVPN).

  • Skilled in designing for low-latency and high-throughput data paths, including GPU-to-GPU and storage traffic optimization.

  • Proficient with routing/switching protocols (BGP, OSPF) and software-defined networking (SDN) concepts.

  • Experience building resilient, fault-tolerant network architectures with redundancy, failover, and high availability.

  • Excellent communication and leadership skills, capable of influencing technical decisions across diverse teams.

Nice to Have

  • Hands-on experience with AI workload profiling, collective communication patterns (e.g., NCCL, MPI), and network tuning for distributed training.

  • Familiarity with network automation frameworks and telemetry tools.

  • Exposure to DPU/SmartNIC technologies, including NVIDIA BlueField, or similar.

  • Knowledge of large-scale, multi-site interconnect design, including DWDM or metro/long-haul networking.

  • Experience collaborating with hyperscale or enterprise customers on highly customized network designs.

Salary Range Information

The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda

  • Founded in 2012, ~400 employees (2025) and growing fast

  • We offer generous cash & equity compensation

  • Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.

  • We are experiencing extremely high demand for our systems, with quarter over quarter, year over year profitability

  • Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG

  • Health, dental, and vision coverage for you and your dependents

  • Wellness and Commuter stipends for select roles

  • 401k Plan with 2% company match (USA employees)

  • Flexible Paid Time Off Plan that we all actually use

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.

Lambda AI is hiring a Staff HPC Network Architect. Apply through Homebase and make the next move in your career!
Company size
501-1000 employees
Founded in
2012
Headquarters
San Francisco, CA, United States
Country
United States
Industry
Software Development