About Krea
At Krea, we are building next-generation AI creative tools.
We are dedicated to making AI intuitive and controllable for creatives. Our mission is to build tools that empower human creativity, not replace it.
We believe AI is a new medium that allows us to express ourselves through various formats—text, images, video, sound, and even 3D. We're building better, smarter, and more controllable tools to harness this medium.
This job
Data is one of the fundamental pieces of Krea. Huge amounts of data power our AI training pipelines, our analytics and observability, and many of the core systems that make Krea tick.
As a data engineer, you will…
… build distributed systems to process petabytes of files of all kinds (images, video, and even 3D data). You should feel comfortable solving scaling problems as you go.
… work closely with our research team to build ML pipelines and deploy models to make sense of raw data.
… play with massive amounts of compute on huge Kubernetes GPU clusters; our main GPU cluster takes up an entire datacenter at our provider.
… learn machine learning engineering (ML experience is a bonus, but you can also learn it on the job) from world-class researchers on a small, tight-knit, highly effective team.
Example projects
Find clean scenes in millions of videos by running distributed data pipelines that detect shot boundaries and save the timestamps of clips.
Solve orchestration and scaling issues with a large-scale distributed GPU job processing system on Kubernetes.
Build systems to deploy and combine different LLMs to caption massive amounts of multimedia data in a variety of ways.
Design multi-stage pipelines to turn petabytes of raw data into clean downstream datasets, with metadata, annotations, and filters.
Strong candidates may have experience with…
Python, PyArrow, DuckDB, SQL, massive relational databases, PyTorch, Pandas, NumPy…
Kubernetes
Designing and implementing large-scale ETL systems
Fundamental knowledge of containerization, operating systems, file systems, and networking
Distributed systems design
About us
We’re building AI creative tooling.
We’ve raised over $83M from the best investors in Silicon Valley.
We’re a team of 12 serving millions of active users, and we’re scaling aggressively.