Location
Melbourne, Australia
Salary
Undisclosed
Category
AI Engineer
Date posted
March 5, 2026
Job type
Full-time
Experience level
Entry Level (1-2 years)

Job Description

About the role

Maincode is training Matilda, the first large language model built and trained from scratch in Australia. Our new compute cluster is live, and we are now scaling the next version.

This role sits directly inside that training stack. You will build the pipelines, infrastructure, and tooling that determine how efficiently Matilda trains, how stable long runs are, and how fast new experiments can be executed. Training runs last days or weeks. Small changes propagate through complex systems. The work requires precision and patience.

We build AI systems from first principles: designing the architectures, running the infrastructure, shaping the training process, and operating the models ourselves. Matilda is not a research prototype. It is a production system, trained at scale and served for open public access.

Maincode operates one of the largest private AI compute environments in Australia, built for a single purpose: training our own models. This is not a role that wraps external APIs or ships user-facing features. You will be working on the systems that train a large language model from scratch.

What you would actually do

You will build and maintain the systems that support large-scale model training.

This includes:

  • Designing and maintaining distributed training pipelines for large language models

  • Building data ingestion and preprocessing systems for large training datasets

  • Developing tooling for experiment management, checkpointing, and reproducibility

  • Monitoring and debugging long-running training jobs across clusters

  • Improving reliability and observability across the training stack

  • Optimising training throughput across compute, memory, and data pipelines

  • Working closely with researchers to translate experimental ideas into training runs

  • Diagnosing failures across infrastructure, training loops, and data pipelines

You will spend time inside code, logs, dashboards, and experiment outputs. The goal is simple: make large-scale training reliable.

The kind of person who does well here

We are looking for engineers early in their careers who want to understand how large models are actually trained.

You may have one or two years of experience building production software. What matters most is curiosity and the willingness to learn how these systems behave under load.

People who tend to do well here:

  • Care about how systems behave over long runtimes

  • Enjoy debugging complex distributed systems

  • Pay attention to logs, metrics, and system behaviour

  • Prefer understanding how a system works rather than relying on abstraction

  • Are comfortable working close to infrastructure

  • Have the patience to diagnose failures that appear hours into a run

  • Want to learn how large-scale AI training actually happens

You do not need prior experience training large language models. What matters is intellectual curiosity, persistence, and the ability to learn quickly.

How you would work

You will write production code that sits directly in the training stack.

You should be comfortable:

  • Working in Python

  • Using machine learning frameworks such as PyTorch or JAX

  • Writing reliable infrastructure for large compute workloads

  • Debugging distributed systems and long-running jobs

  • Collaborating closely with researchers and infrastructure engineers

Much of the work sits between research and infrastructure. Ideas move quickly, but the systems that support them must remain stable.

What this role is not

  • It is not primarily about building user-facing applications

  • It is not about prompt engineering

  • It is not about wrapping external APIs or third party models

You will be working on the systems that train our own models from scratch.

Why Maincode

Maincode builds AI systems end to end. We prepare the data, design the training process, run the infrastructure, and operate the models ourselves.

You will work with a small team that:

  • Builds the full AI stack rather than outsourcing it

  • Treats infrastructure as part of the intelligence system itself

  • Values engineers who want to understand how things actually work

  • Is building long-term capability in training and operating large models

If you want to work directly on the systems that train large language models from scratch, this is the only role in Australia that will put you inside that work.

Note

This is a full time role based in Melbourne, working closely with our in person engineering and research team. At this time we are not able to offer visa sponsorship, so applicants must have existing and unrestricted work rights in Australia.

Apply now
Maincode is hiring an AI Software Engineer (Model Training). Apply through The Homebase and make the next move in your career!
Company size
11-50 employees
Headquarters
Melbourne, Australia
Country
Australia
Industry
Computer Software
