Top Data Engineer Job Openings in 2025
Looking for opportunities as a Data Engineer? This curated list features the latest Data Engineer job openings from AI-native companies. Whether you're an experienced professional or just entering the field, find roles that match your expertise, from startups to global tech leaders. Updated every day.
Lead Data Engineer
AirOps
1-10 employees · United States · Full-time · Remote: No
About AirOps
Today thousands of leading brands and agencies use AirOps to win the battle for attention with content that both humans and agents love. We’re building the platform and profession that will empower a million marketers to become modern leaders — not spectators — as AI reshapes how brands reach their audiences. We’re backed by awesome investors, including Unusual Ventures, Wing VC, Founder Collective, XFund, Village Global, and Alt Capital, and we’re building a world-class team with in-person hubs in San Francisco, New York, and Montevideo, Uruguay.

About the Role
As Lead Data Engineer, you will own and scale the data platform that powers AirOps insights on AI search visibility and content performance. You will set technical direction, write production code, and build a small, high-output team that turns raw web, content, and AI agent data into trustworthy datasets. Your work will drive customer-facing analytics and product features while giving our content and growth teams a clear path from strategy to execution. You value extreme ownership, sweat the details on data quality, and love partnering across functions to ship fast without losing rigor.

Key Responsibilities
- Data platform ownership: design, build, and operate batch and streaming pipelines that ingest data from crawlers, partner APIs, product analytics, and CRM.
- Core modeling: define and maintain company-wide models for content entities, search queries, rankings, AI agent answers, engagement, and revenue attribution.
- Orchestration and CI: implement workflow management with Airflow or Prefect, dbt-based transformations, version control, and automated testing (a minimal sketch appears at the end of this listing).
- Data quality and observability: set SLAs, add tests and data contracts, monitor lineage and freshness, and lead root cause analysis.
- Warehouse and storage: run Snowflake or BigQuery and Postgres with strong performance, cost management, and partitioning strategies.
- Semantic layer and metrics: deliver clear, documented metrics datasets that power dashboards, experiments, and product activation.
- Product and customer impact: partner with Product and Customer teams to define tracking plans and measure content impact across on-site and off-site channels.
- Tooling and vendors: evaluate, select, and integrate the right tools for ingestion, enrichment, observability, and reverse ETL.
- Team leadership: hire, mentor, and level up data and analytics engineers; establish code standards, review practices, and runbooks.

Qualifications
- 5+ years in data engineering with 2+ years leading projects
- Expert SQL and Python with deep experience building production pipelines at scale
- Hands-on with dbt and a workflow manager such as Airflow or Prefect
- Strong background in dimensional and event-driven modeling and a company-wide metrics layer
- Experience with Snowflake or BigQuery, plus Postgres for transactional use cases
- Track record building data products for analytics and customer reporting
- Cloud experience on AWS or GCP and infrastructure as code such as Terraform
- Domain experience in SEO, content analytics, or growth experimentation is a plus
- Clear communicator with a bias for action, curiosity, and a high bar for quality

Our Guiding Principles
Extreme Ownership · Quality · Curiosity and Play · Make Our Customers Heroes · Respectful Candor

Benefits
- Equity in a fast-growing startup
- Competitive benefits package tailored to your location
- Flexible time off policy
- Generous parental leave
- A fun-loving and (just a bit) nerdy team that loves to move fast!
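To make the orchestration-and-CI responsibility concrete, here is a minimal sketch of the kind of Airflow DAG the posting describes, wiring dbt transformations and tests into a daily run. This is an illustration, not AirOps's actual pipeline: the DAG id, dbt project path, and ingestion script are invented, and Airflow 2.x is assumed.

```python
# Minimal Airflow DAG sketch: ingest, transform with dbt, then test.
# All names and paths are hypothetical; assumes Airflow 2.x.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="content_metrics_daily",          # invented name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="ingest_crawler_data",
        bash_command="python ingest/crawlers.py",  # placeholder ingestion step
    )
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/analytics",
    )
    test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/analytics",
    )

    # Data quality gate: tests run only after transformations succeed.
    ingest >> transform >> test
```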
Categories: Data Engineer, Data Science & Analytics · Posted September 28, 2025
Founding Data Engineer
Elicit
11-50 employees · United States · Full-time · Remote: No · USD 185,000-305,000
About Elicit
Elicit is an AI research assistant that uses language models to help professional researchers and high-stakes decision makers break down hard questions, gather evidence from scientific/academic sources, and reason through uncertainty.

What we're aiming for:
- Elicit radically increases the amount of good reasoning in the world.
- For experts, Elicit pushes the frontier forward.
- For non-experts, Elicit makes good reasoning more accessible. People who don't have the tools, expertise, time, or mental energy to make carefully-reasoned decisions on their own can do so with Elicit.
- Elicit is a scalable ML system based on human-understandable task decompositions, with supervision of process, not outcomes. This expands our collective understanding of safe AGI architectures.

Visit our Twitter to learn more about how Elicit is helping researchers and making progress on our mission.

Why we're hiring for this role
Two main reasons:
1. Currently, Elicit operates over academic papers and clinical trials. One of your key initial responsibilities will be to build a complete corpus of these documents, available as soon as they're published, combining different data sources and ingestion methods. Once that's done, there is a growing list of other document types and sources we'd love to integrate!
2. One of our main initiatives is to broaden the sorts of tasks you can complete in Elicit. We need a data engineer to figure out the best way to ingest massive amounts of heterogeneous data in such a way as to make it usable by LLMs. We need your help to integrate with our customers' custom data providers so that they can create task-specific workflows over them.

In general, we're looking for someone who can architect and implement robust, scalable solutions to handle our growing data needs while maintaining high performance and data quality.

Our tech stack
- Data pipeline: Python, Flyte, Spark
- Probably less relevant to you, but ICOI:
  - Backend: Node and Python, event sourcing
  - Frontend: Next.js, TypeScript, and Tailwind
- We like static type checking in Python and TypeScript!
- All infrastructure runs in Kubernetes across a couple of clouds
- We use GitHub for code reviews and CI
- We deploy using the gitops pattern (i.e. deploys are defined and tracked by diffs in our k8s manifests)

Am I a good fit?
Consider these questions:
- How would you optimize a Spark job that's processing a large amount of data but running slowly?
- What are the differences between RDD, DataFrame, and Dataset in Spark? When would you use each?
- How does data partitioning work in distributed systems, and why is it important?
- How would you implement a data pipeline to handle regular updates from multiple academic paper sources, ensuring efficient deduplication?

If you have a solid answer for these—without reference to documentation—then we should chat!
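As a flavor of that last question, here is a minimal PySpark sketch of one common deduplication approach: normalize a title key and keep the most recent record per key. The column names, paths, and normalization rule are illustrative assumptions, not Elicit's actual pipeline.

```python
# Hypothetical dedup sketch for paper records arriving from multiple sources.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("paper-dedup-sketch").getOrCreate()

papers = spark.read.parquet("s3://bucket/raw_papers/")  # placeholder path

# Normalize a join key: lowercase the title and strip non-alphanumerics.
keyed = papers.withColumn(
    "title_key", F.regexp_replace(F.lower(F.col("title")), r"[^a-z0-9]", "")
)

# Keep one row per (doi, title_key), preferring the newest ingestion.
w = Window.partitionBy("doi", "title_key").orderBy(F.col("ingested_at").desc())
deduped = (
    keyed.withColumn("rank", F.row_number().over(w))
    .filter("rank = 1")
    .drop("rank")
)

deduped.write.mode("overwrite").parquet("s3://bucket/papers_deduped/")
```

In practice a fuzzy-matching pass (e.g. on abstracts or author lists) usually follows this exact-key step, since titles alone miss near-duplicates.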
Location and travel
We have a lovely office in Oakland, CA; there are people there every day, but we don't all work from there all the time. It's important to us to spend time with our teammates, however, so we ask that all Elicians spend about 1 week out of every 6 with teammates. We wrote up more details on this page.

What you'll bring to the role
- 5+ years of experience as a data engineer: owning make-or-break decisions about how to ingest, manage, and use data
- Strong proficiency in Python (5+ years of experience)
- You have created and owned a data platform at rapidly-growing startups—gathering needs from colleagues, planning an architecture, deploying the infrastructure, and implementing the tooling
- Experience architecting and optimizing large data pipelines, ideally with Spark, and ideally pipelines that directly support user-facing features (rather than internal BI, for example)
- Strong SQL skills, including understanding of aggregation functions, window functions, UDFs, self-joins, partitioning, and clustering approaches
- Experience with columnar data storage formats like Parquet
- Strong opinions, weakly held, about approaches to data quality management
- Creative and user-centric problem-solving
- You should be excited to play a key role in shipping new features to users—not just building out a data platform!

Nice to have
- Experience developing deduplication processes for large datasets
- Hands-on experience with full-text extraction and processing from various document formats (PDF, HTML, XML, etc.)
- Familiarity with machine learning concepts and their application in search technologies
- Experience with distributed computing frameworks beyond Spark (e.g., Dask, Ray)
- Experience in science and academia: familiarity with academic publications, and the ability to accurately model the needs of our users
- Hands-on experience with industry-standard tools like Airflow, dbt, or Hadoop
- Hands-on experience with standard paradigms like data lake, data warehouse, or lakehouse

What you'll do
You'll own:

Building and optimizing our academic research paper pipeline
- You'll architect and implement robust, scalable systems to handle data ingestion while maintaining high performance and quality.
- You'll work on efficiently deduplicating hundreds of millions of research papers and calculating embeddings.
- Your goal will be to make Elicit the most complete and up-to-date database of scholarly sources.

Expanding the datasets Elicit works over
- Our users want Elicit to work over court documents, SEC filings, … your job will be to figure out how to ingest and index a rapidly increasing ontology of documents.
- We also want to support less structured documents, spreadsheets, presentations, all the way up to rich media like audio and video.
- Larger customers often want us to integrate private data into Elicit for their organisation to use. We'll look to you to define and build a secure, reliable, fast, and auditable approach to these data connectors.

Data for our ML systems
- You'll figure out the best way to preprocess all the data mentioned above to make it useful to models.
- We often need datasets for our model fine-tuning.
You'll work with our ML engineers and evaluation experts to find, gather, version, and apply these datasets in training runs.

Your first week: start building foundational context
- Get to know your team, our stack (including Python, Flyte, and Spark), and the product roadmap.
- Familiarize yourself with our current data pipeline architecture and identify areas for potential improvement.

Make your first contribution to Elicit
- Complete your first Linear issue related to our data pipeline or academic paper processing.
- Have a PR merged into our monorepo, demonstrating your understanding of our development workflow.
- Gain understanding of our CI/CD pipeline, monitoring, and logging tools specific to our data infrastructure.

Your first month: you'll complete your first multi-issue project
- Tackle a significant data pipeline optimization or enhancement project.
- Collaborate with the team to implement improvements in our academic paper processing workflow.

You're actively improving the team
- Contribute to regular team meetings and hack days, sharing insights from your data engineering expertise.
- Add documentation or diagrams explaining our data pipeline architecture and best practices.
- Suggest improvements to our data processing and storage methodologies.

Your first quarter: you're flying solo
- Independently implement significant enhancements to our data pipeline, improving efficiency and scalability.
- Make impactful decisions regarding our data architecture and processing strategies.

You've developed an area of expertise
- Become the go-to resource for questions related to our academic paper processing pipeline and data infrastructure.
- Lead discussions on optimizing our data storage and retrieval processes for academic literature.

You actively research and improve the product
- Propose and scope improvements to make Elicit more comprehensive and up-to-date in terms of scholarly sources.
- Identify and implement technical improvements to surpass competitors like Google Scholar in terms of coverage and data quality.

Compensation, benefits, and perks
In addition to working on important problems as part of a productive and positive team, we also offer great benefits (with some variation based on location):
- Flexible work environment: work from our office in Oakland or remotely with time zone overlap (between GMT and GMT-8), as long as you can travel for in-person retreats and coworking events
- Fully covered health, dental, vision, and life insurance for you, generous coverage for the rest of your family
- Flexible vacation policy, with a minimum recommendation of 20 days/year + company holidays
- 401(k) with a 6% employer match
- A new Mac + $1,000 budget to set up your workstation or home office in your first year, then $500 every year thereafter
- $1,000 quarterly AI Experimentation & Learning budget, so you can freely experiment with new AI tools to incorporate into your workflow, take courses, purchase educational resources, or attend AI-focused conferences and events
- A team administrative assistant who can help you with personal and work tasks

You can find more reasons to work with us in this thread!

For all roles at Elicit, we use a data-backed compensation framework to keep salaries market-competitive, equitable, and simple to understand.
For this role, we target starting ranges of:
- Senior (L4): $185-270k + equity
- Expert (L5): $215-305k + equity
- Principal (L6): >$260k + significant equity

We're optimizing for a hire who can contribute at an L4/senior level or above. We also offer above-market equity for all roles at Elicit, as well as employee-friendly equity terms (10-year exercise periods).
Categories: Data Engineer, Data Science & Analytics · Posted September 25, 2025
Analytics Data Engineer
HappyRobot
51-100 employees · United States · Full-time · Remote: No · USD 120,000-220,000
About HappyRobot
HappyRobot is a platform to build and deploy AI workers that automate communication. See a demo. Our AI workers connect to any system or data source to handle phone calls, email, messages… We target the logistics industry, which relies heavily on communication to book, check on, and pay for freight, primarily working with freight brokers, 3PLs, freight forwarders, shippers, warehouses, and other supply chain enterprises and tech startups.

We’re thrilled to share that with our $44M Series B, HappyRobot has now raised a total of $62M — backed by leading investors who believe in our mission and vision for the future.

We're looking for rockstars with a relentless drive, unstoppable energy, and a true passion for building something great—ready to embrace the challenge, push limits, and thrive in a fast-paced, high-intensity environment.

About the Role
- Build foundational data products, dashboards, and tools to enable self-serve analytics to scale across the company.
- Develop insightful and reliable dashboards to track performance of core metrics that will deliver insights to the whole company.
- Build and maintain robust data pipelines and models to ensure data quality.
- Partner with Product, Engineering, and Design teams to inform decisions.
- Translate complex data into clear, actionable insights for product teams.
- Love working with data. Love making the product better. Love finding the story behind the numbers.

Responsibilities
- Define, build, and maintain product metrics, dashboards, and pipelines.
- Write SQL and Python code to extract, transform, and analyze data.
- Design and run experiments (A/B tests) to support product development (see the sketch below).
- Proactively explore data to identify product opportunities and insights.
- Collaborate with cross-functional teams to ensure data-driven decisions.
- Ensure data quality, reliability, and documentation across analytics efforts.

Must Have
- 3+ years of experience as an Analytics Data Engineer or similar Data Science & Analytics roles, preferably partnering with GTM and Product leads to build and report on key company-wide metrics.
- Strong SQL and data engineering skills to transform data into accurate, clean data models (e.g., dbt, Airflow, data warehouses).
- Advanced analytics experience: segmentation and cohort analysis.
- Proficiency in Python for data analysis and modeling.
- Excellent communication skills: able to explain complex data insights clearly.
- Curious, collaborative, and driven to make an impact in a fast-paced environment.

Nice to Have
- Experience in B2B SaaS or AI/ML products.
- Familiarity with product analytics tools (e.g., Mixpanel, Amplitude).
- Exposure to machine learning concepts or AI-powered systems.

Why join us?
- Opportunity to work at a high-growth AI startup, backed by top investors.
- Fast Growth - Backed by a16z and YC, on track for double-digit ARR.
- Ownership & Autonomy - Take full ownership of projects and ship fast.
- Top-Tier Compensation - Competitive salary + equity in a high-growth startup.
- Comprehensive Benefits - Healthcare, dental, vision coverage.
- Work With the Best - Join a world-class team of engineers and builders.
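Since the responsibilities above call out designing and analyzing A/B tests in Python, here is a minimal sketch of one standard readout: a two-proportion z-test on conversion counts. The numbers are made up for illustration; this is a generic pattern, not HappyRobot's process.

```python
# Hypothetical A/B readout: compare conversion rates with a two-proportion z-test.
from statsmodels.stats.proportion import proportions_ztest

conversions = [412, 475]    # made-up conversions in control / variant
exposures = [10000, 10050]  # made-up users exposed to each arm

z_stat, p_value = proportions_ztest(count=conversions, nobs=exposures)

rate_a, rate_b = (c / n for c, n in zip(conversions, exposures))
print(f"control {rate_a:.2%} vs variant {rate_b:.2%}, p = {p_value:.4f}")
# A small p-value suggests the lift is unlikely to be chance; in practice
# you would also fix the metric and sample size before starting the test.
```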
Our Operating Principles

Extreme Ownership
We take full responsibility for our work, outcomes, and team success. No excuses, no blame-shifting — if something needs fixing, we own it and make it better. This means stepping up, even when it’s not “your job.” If a ball is dropped, we pick it up. If a customer is unhappy, we fix it. If a process is broken, we redesign it. We don’t wait for someone else to solve it — we lead with accountability and expect the same from those around us.

Craftsmanship
Putting care and intention into every task, striving for excellence, and taking deep ownership of the quality and outcome of your work. Craftsmanship means never settling for “just fine.” We sweat the details because details compound. Whether it’s a product feature, an internal doc, or a sales call — we treat it as a reflection of our standards. We aim to deliver jaw-dropping customer experiences by being curious, meticulous, and proud of what we build — even when nobody’s watching.

We are “majos”
Be friendly & have fun with your coworkers. Always be genuine & honest, but kind. “Majo” is our way of saying: be a good human. Be approachable, helpful, and warm. We’re building something ambitious, and it’s easier (and more fun) when we enjoy the ride together. We give feedback with kindness, challenge each other with respect, and celebrate wins together without ego.

Urgency with Focus
Create the highest impact in the shortest amount of time. Move fast, but in the right direction. We operate with speed because time is our most limited resource. But speed without focus is chaos. We prioritize ruthlessly, act decisively, and stay aligned. We aim for high leverage: the biggest results from the simplest, smartest actions. We’re running a high-speed marathon — not a sprint with no strategy.

Talent Density and Meritocracy
Hire only people who can raise the average; “exceptional performance is the passing grade.” Ability trumps seniority. We believe the best teams are built on talent density — every hire should raise the bar. We reward contribution, not titles or tenure. We give ownership to those who earn it, and we all hold each other to a high standard. A-players want to work with other A-players — that’s how we win.

First-Principles Thinking
Strip a problem to physics-level facts, ignore industry dogma, rebuild the solution from scratch. We don’t copy-paste solutions. We go back to basics, ask why things are the way they are, and rebuild from the ground up if needed. This mindset pushes us to innovate, challenge stale assumptions, and move faster than incumbents. It’s how we build what others think is impossible.

Privacy notice
The personal data provided in your application and during the selection process will be processed by Happyrobot, Inc., acting as Data Controller. By sending us your CV, you consent to the processing of your personal data for the purpose of evaluating and selecting you as a candidate for the position. Your personal data will be treated confidentially and will only be used for the recruitment process of the selected job offer.

Your personal data will be deleted after three months of inactivity, in compliance with the GDPR and legislation on the protection of personal data. If you wish to exercise your rights of access, rectification, deletion, portability, or objection in relation to your personal data, you can do so through security@happyrobot.ai, subject to the GDPR. For more information, visit https://www.happyrobot.ai/privacy-policy

By submitting your application, you confirm that you have read and understood this clause and that you agree to the processing of your personal data as described.
Categories: Data Engineer, Data Science & Analytics · Posted September 7, 2025
Big Data Architect
Databricks
5000+ employees · Germany · Remote: No
CSQ426R218

We have 5 open positions based in our Germany offices. As a Big Data Solutions Architect (Resident Solutions Architect) in our Professional Services team, you will work with clients on short- to medium-term customer engagements on their big data challenges using the Databricks Data Intelligence Platform. You will deliver data engineering, data science, and cloud technology projects which require integrating with client systems, training, and other technical tasks to help customers get the most value out of their data. RSAs are billable and know how to complete projects according to specification with excellent customer service. You will report to the regional Manager/Lead.

The impact you will have:
- Work on a variety of impactful customer technical projects, which may include designing and building reference architectures, creating how-tos, and productionizing customer use cases.
- Work with engagement managers to scope a variety of professional services work with input from the customer.
- Guide strategic customers as they implement transformational big data projects and 3rd-party migrations, including end-to-end design, build, and deployment of industry-leading big data and AI applications.
- Consult on architecture and design; bootstrap or implement customer projects, leading to customers' successful understanding, evaluation, and adoption of Databricks.
- Provide an escalated level of support for customer operational issues.
- Work with the Databricks technical team, Project Manager, Architect, and Customer team to ensure the technical components of the engagement are delivered to meet the customer's needs.
- Work with Engineering and Databricks Customer Support to provide product and implementation feedback and to guide rapid resolution for engagement-specific product and support issues.

What we look for:
- Proficiency in data engineering, data platforms, and analytics, with a strong track record of successful projects and in-depth knowledge of industry best practices
- Comfortable writing code in either Python or Scala
- Enterprise data warehousing experience (Teradata, Synapse, Snowflake, or SAP)
- Working knowledge of two or more common cloud ecosystems (AWS, Azure, GCP), with expertise in at least one
- Deep experience with distributed computing with Apache Spark™ and knowledge of Spark runtime internals
- Familiarity with CI/CD for production deployments
- Working knowledge of MLOps
- Design and deployment of performant end-to-end data architectures
- Experience with technical project delivery - managing scope and timelines
- Documentation and whiteboarding skills
- Experience working with clients and managing conflicts
- Ability to build skills in technical areas which support the deployment and integration of Databricks-based solutions to complete customer projects
- Travel of up to 10% is required, more at peak times
- Databricks Certification

About Databricks
Databricks is the data and AI company. More than 10,000 organizations worldwide — including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of the lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.
Benefits
At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks.
Our Commitment to Diversity and Inclusion
At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Compliance
If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.
Categories: Data Engineer, Data Science & Analytics, Solutions Architect, Software Engineering · Posted September 1, 2025
Finance Data Engineer
Synthesia
501-1000 employees · United Kingdom · Full-time · Remote: Yes
Welcome to the video-first world
From your everyday PowerPoint presentations to Hollywood movies, AI will transform the way we create and consume content. Today, people want to watch and listen, not read — both at home and at work. If you’re reading this and nodding, check out our brand video. Despite the clear preference for video, communication and knowledge sharing in the business environment are still dominated by text, largely because high-quality video production remains complex and challenging to scale — until now…

Meet Synthesia
We're on a mission to make video easy for everyone. Born in an AI lab, our AI video communications platform simplifies the entire video production process, making it easy for everyone, regardless of skill level, to create, collaborate, and share high-quality videos. Whether it's for delivering essential training to employees and customers or marketing products and services, Synthesia enables large organizations to communicate and share knowledge through video quickly and efficiently. We’re trusted by leading brands such as Heineken, Zoom, Xerox, McDonald’s and more. Read stories from happy customers and what 1,200+ people say on G2.

In 2023, we were one of 7 European companies to reach unicorn status. In February 2024, G2 named us as the fastest growing company in the world. We’ve raised over $150M in funding from top-tier investors, including Accel, Nvidia, Kleiner Perkins, Google, and top founders and operators from Stripe, Datadog, Miro, Webflow, and Facebook.

About the role
Synthesia is building a modern, AI-powered back office. We are hiring a full-stack engineer to design and run the data layer that powers Finance, ensuring data from finance systems lands securely in a data warehouse and is easily consumed via Copilot (LLM) and Omni dashboards. You’ll be a cornerstone in our shift to modern finance workflows.

Key Responsibilities:
- Design, develop, and maintain robust ETL/ELT pipelines to ingest, transform, and securely store data from NetSuite and other finance systems into Snowflake, ensuring data integrity, compliance, and security best practices (e.g., encryption, access controls, and auditing).
- Collaborate with finance and data teams to define data models, schemas, and governance policies that support modern finance workflows, including automated reporting, forecasting, and anomaly detection.
- Implement data retrieval mechanisms optimized for LLM-based querying via Copilot and/or similar tools, enabling natural language access to financial data while maintaining accuracy and contextual relevance (a sketch of one such guardrail follows the qualifications below).
- Build and optimize interactive dashboards in Omni for real-time visualization and analysis of key metrics, such as financial performance and operational KPIs.
- Monitor and troubleshoot data pipelines, performing root-cause analysis on issues related to data quality, latency, or availability, and implementing proactive solutions to ensure high reliability.
- Document processes, architectures, and best practices to facilitate knowledge sharing and scalability within the team.

Qualifications:
- Bachelor's or Master's degree in Computer Science, Finance, Information Systems, or a related field.
- 5+ years of experience as a data engineer or analytics engineer, with a proven track record in full-stack data development (from ingestion to visualization).
- Strong expertise in Snowflake, including data modeling, warehousing, and performance optimization.
- Hands-on experience with ETL tools (e.g., Apache Airflow, dbt, Fivetran) and integrating data from ERP systems like NetSuite.
- Proficiency in SQL, Python, and/or other scripting languages for data processing and automation.
- Familiarity with LLM integrations (e.g., for natural language querying) and dashboarding tools like Omni or similar (e.g., Tableau, Looker).
- Solid understanding of data security principles, including GDPR/CCPA compliance, role-based access, and encryption in cloud environments.
- Excellent problem-solving skills, with the ability to work cross-functionally in agile teams.
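To illustrate the kind of guardrail that natural-language access to financial data implies, here is a minimal, hypothetical sketch: before executing LLM-generated SQL against the warehouse, restrict it to single read-only statements over an allow-list of finance tables. This is a generic pattern, not Synthesia's implementation; the table names are invented.

```python
# Hypothetical guardrail for LLM-generated SQL: read-only, allow-listed tables.
import re

ALLOWED_TABLES = {"fct_revenue", "dim_account", "fct_invoices"}  # invented names
FORBIDDEN = re.compile(r"\b(insert|update|delete|drop|alter|merge|grant)\b", re.I)

def is_safe_query(sql: str) -> bool:
    """Accept only single SELECT/CTE statements that touch allow-listed tables."""
    statement = sql.strip().rstrip(";")
    if ";" in statement:                      # reject multi-statement payloads
        return False
    if not statement.lower().startswith(("select", "with")):
        return False
    if FORBIDDEN.search(statement):
        return False
    # Deliberately strict: every FROM/JOIN target must be allow-listed.
    tables = re.findall(r"\b(?:from|join)\s+([a-z_][\w.]*)", statement, re.I)
    return bool(tables) and all(t.split(".")[-1] in ALLOWED_TABLES for t in tables)

assert is_safe_query("SELECT sum(amount) FROM fct_revenue WHERE year = 2025")
assert not is_safe_query("DROP TABLE fct_revenue")
```

In a real deployment this check would sit alongside warehouse-side controls such as a read-only role and row-level access policies, rather than replace them.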
At Synthesia we expect everyone to:
- Put the Customer First
- Own It & Go Direct
- Be Fast & Experimental
- Make the Journey Fun

You can read more about this in our public Notion page.

Location: London or UK remote

UK Benefits
- 📍 A hybrid, flexible approach to work where you have access to a lovely office space in Oxford Circus with free lunches on a Wednesday and Friday
- 💸 A competitive salary + stock options
- 🏝 25 days of annual leave + public holidays
- 🏥 Private healthcare through AXA
- ❣️ Pension contribution - Synthesia contributes 3% and employees contribute 5% on qualifying earnings
- 🍼 Paid parental leave entitling primary caregivers to 16 weeks of full pay, and secondary caregivers to 5 weeks of full pay
- 👉 You can participate in a generous recruitment referral scheme if you help us to hire
- 💻 The equipment you need to be successful in your role
Categories: Data Engineer, Data Science & Analytics · Posted August 27, 2025
Data Engineer
Sierra
201-500 employees · United States · Full-time · Remote: No · USD 220,000-340,000
About us
At Sierra, we’re creating a platform to help businesses build better, more human customer experiences with AI. We are primarily an in-person company based in San Francisco, with growing offices in Atlanta, New York, and London. We are guided by a set of values that are at the core of our actions and define our culture: Trust, Customer Obsession, Craftsmanship, Intensity, and Family. These values are the foundation of our work, and we are committed to upholding them in everything we do.

Our co-founders are Bret Taylor and Clay Bavor. Bret currently serves as Board Chair of OpenAI. Previously, he was co-CEO of Salesforce (which had acquired the company he founded, Quip) and CTO of Facebook. Bret was also one of Google's earliest product managers and co-creator of Google Maps. Before founding Sierra, Clay spent 18 years at Google, where he most recently led Google Labs. Earlier, he started and led Google’s AR/VR effort, Project Starline, and Google Lens. Before that, Clay led the product and design teams for Google Workspace.

What you'll do
Sierra is in the process of building out its core data foundations, and you’ll play a pivotal role in shaping the company’s data strategy and infrastructure. Partnering with engineering, product, and GTM teams, you’ll design and operate scalable batch and real-time data systems, create trusted data models, and build the pipelines that power experimentation, analytics, and AI development (a minimal real-time sketch appears after the qualifications below). You’ll ensure that every area of the business—from customer experience to go-to-market execution—has access to high-quality, reliable data to drive insight and innovation. Beyond infrastructure, you’ll influence how data is captured, governed, and leveraged across Sierra, empowering decision-making at scale.

This is a unique opportunity to establish the foundations of Sierra’s data ecosystem, drive standards for reliability and trust, and enable the next generation of AI-powered customer interactions.

What you'll bring
- Proven Experience: Extensive experience in data engineering, with a track record of designing and operating data pipelines, systems, and models at scale.
- Curiosity & Customer Obsession: Passion for building trustworthy data systems that empower teams to better understand users and deliver impactful product experiences.
- Adaptability and Resilience: Comfort working in a fast-paced startup environment, able to adapt to evolving priorities and deliver reliable solutions amidst ambiguity.
- Technical Proficiency: Strong proficiency in SQL and Python, with expertise in distributed data processing frameworks (e.g., Spark, Flink, Kafka) and cloud-based platforms (AWS, GCP).
- Data Architecture Skills: Deep experience with data modeling, warehousing, and designing schemas optimized for analytics, experimentation, and AI/ML workloads.
- Data Quality & Governance: Strong understanding of data validation, monitoring, compliance, and best practices for ensuring data integrity across pipelines.
- Excellent Communication: Ability to translate technical infrastructure and data design trade-offs into clear recommendations for product, engineering, and business stakeholders.
- Great Collaboration: Proven ability to partner closely with product, ML, analytics, and GTM teams to deliver data foundations that unlock business and product innovation.

Even better...
- Experience with (open to equivalents): AWS Glue, Athena, Kafka, Flink/Spark, dbt, Airflow/Dagster, Terraform.
- Experience working with large language models (LLMs), conversational AI, or agent-based systems.
- Familiarity with building or improving data platforms for advanced analytics.
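As a taste of the real-time side of this stack, here is a minimal sketch of a Kafka consumer that folds events into a running metric, using the kafka-python client. The topic and field names are invented for illustration; this is a generic pattern, not Sierra's pipeline.

```python
# Hypothetical real-time aggregation: count conversation events per agent.
import json
from collections import Counter

from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "conversation-events",                     # invented topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

counts: Counter[str] = Counter()
for message in consumer:
    event = message.value
    counts[event["agent_id"]] += 1             # invented field name
    if sum(counts.values()) % 1000 == 0:       # periodic progress output
        print(counts.most_common(5))
```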
Our values
- Trust: We build trust with our customers with our accountability, empathy, quality, and responsiveness. We build trust in AI by making it more accessible, safe, and useful. We build trust with each other by showing up for each other professionally and personally, creating an environment that enables all of us to do our best work.
- Customer Obsession: We deeply understand our customers’ business goals and relentlessly focus on driving outcomes, not just technical milestones. Everyone at the company knows and spends time with our customers. When our customer is having an issue, we drop everything and fix it.
- Craftsmanship: We get the details right, from the words on the page to the system architecture. We have good taste. When we notice something isn’t right, we take the time to fix it. We are proud of the products we produce. We continuously self-reflect to continuously self-improve.
- Intensity: We know we don’t have the luxury of patience. We play to win. We care about our product being the best, and when it isn’t, we fix it. When we fail, we talk about it openly and without blame so we succeed the next time.
- Family: We know that balance and intensity are compatible, and we model it in our actions and processes. We are the best technology company for parents. We support and respect each other and celebrate each other’s personal and professional achievements.

What we offer
We want our benefits to reflect our values and offer the following to full-time employees:
- Flexible (Unlimited) Paid Time Off
- Medical, Dental, and Vision benefits for you and your family
- Life Insurance and Disability Benefits
- Retirement Plan (e.g., 401(k), pension) with Sierra match
- Parental Leave
- Fertility and family building benefits through Carrot
- Lunch, as well as delicious snacks and coffee to keep you energized
- Discretionary Benefit Stipend giving people the ability to spend where it matters most
- Free alphorn lessons

These benefits are further detailed in Sierra's policies and are subject to change at any time, consistent with the terms of any applicable compensation or benefits plans. Eligible full-time employees can participate in Sierra's equity plans subject to the terms of the applicable plans and policies.

Be you, with us
We're working to bring the transformative power of AI to every organization in the world. To do so, it is important to us that the diversity of our employees represents the diversity of our customers. We believe that our work and culture are better when we encourage, support, and respect different skills and experiences represented within our team. We encourage you to apply even if your experience doesn't precisely match the job description. We strive to evaluate all applicants consistently without regard to race, color, religion, gender, national origin, age, disability, veteran status, pregnancy, gender expression or identity, sexual orientation, citizenship, or any other legally protected class.
Categories: Data Engineer, Data Science & Analytics · Posted August 22, 2025
Data Engineer
Krea
51-100 employees · United States · Full-time · Remote: No
About Krea
At Krea, we are building next-generation AI creative tools. We are dedicated to making AI intuitive and controllable for creatives. Our mission is to build tools that empower human creativity, not replace it. We believe AI is a new medium that allows us to express ourselves through various formats—text, images, video, sound, and even 3D. We're building better, smarter, and more controllable tools to harness this medium.

This job
Data is one of the fundamental pieces of Krea. Huge amounts of data power our AI training pipelines, our analytics and observability, and many of the core systems that make Krea tick.

As a data engineer, you will…
- … build distributed systems to process gigantic (petabytes) amounts of files of all kinds (images, video, and even 3D data). You should feel comfortable solving scaling problems as you go.
- … work closely with our research team to build ML pipelines and deploy models to make sense of raw data.
- … play with massive amounts of compute on huge Kubernetes GPU clusters - our main GPU cluster takes up an entire datacenter from our provider.
- … learn machine learning engineering (ML experience is a bonus, but you can also learn it on the job) from world-class researchers on a small yet highly effective tight-knit team.

Example projects
- Find clean scenes in millions of videos, running distributed data pipelines that detect shot boundaries and save timestamps of clips (a minimal sketch appears at the end of this listing).
- Solve orchestration and scaling issues with a large-scale distributed GPU job processing system on Kubernetes.
- Build systems to deploy and combine different LLMs to caption massive amounts of multimedia data in a variety of different ways.
- Design multi-stage pipelines to turn petabytes of raw data into clean downstream datasets, with metadata, annotations, and filters.

Strong candidates may have experience with…
- Python, PyArrow, DuckDB, SQL, massive relational databases, PyTorch, Pandas, NumPy…
- Kubernetes
- Designing and implementing large-scale ETL systems
- Fundamental knowledge of containerization, operating systems, file systems, and networking
- Distributed systems design

About us
We’re building AI creative tooling. We’ve raised over $83M from the best investors in Silicon Valley. We’re a team of 12 with millions of active users, scaling aggressively.
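The first example project above mentions shot-boundary detection. Here is a minimal single-machine sketch of the classic histogram-difference approach using OpenCV, before any distributed scaling; the threshold, histogram bins, and input path are illustrative assumptions.

```python
# Hypothetical shot-boundary sketch: flag frames whose color histogram
# differs sharply from the previous frame's.
import cv2  # pip install opencv-python

def shot_boundaries(path: str, threshold: float = 0.4) -> list[float]:
    """Return timestamps (seconds) where a likely cut occurs."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    cuts, prev_hist, frame_idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None:
            similarity = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if similarity < 1.0 - threshold:   # big change => probable cut
                cuts.append(frame_idx / fps)
        prev_hist, frame_idx = hist, frame_idx + 1
    cap.release()
    return cuts

print(shot_boundaries("sample.mp4"))  # placeholder input file
```

At the scale the posting describes, the same per-video function would typically be fanned out across a cluster, with the resulting clip timestamps written to a shared store.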
Categories: Data Engineer, Data Science & Analytics, Machine Learning Engineer · Posted July 31, 2025
Staff Data Engineer
Thoughtful
101-200 employees · United States · Full-time · Remote: Yes · USD 190,000-250,000
Join Our Mission to Revolutionize Healthcare
Thoughtful is pioneering a new approach to automation for all healthcare providers! Our AI-powered Revenue Cycle Automation platform enables the healthcare industry to automate and improve its core business operations.

We're looking for Staff Data Engineers to help scale and strengthen our data platform. Our data stack today consists of Aurora RDS, AWS Glue, Apache Iceberg, S3 (Parquet), Spark, and Athena, supporting a range of use cases from operational reporting to downstream services. We’re looking to grow the team with engineers who can help improve performance, increase reliability, and expand the platform's capabilities as our data volume and complexity continue to grow. You’ll work closely with other engineers to evolve our existing pipelines, improve observability and data quality, and enable faster, more flexible access to data across the company. The platform is deployed on AWS using OpenTofu, and we’re looking for engineers who bring strong cloud infrastructure fundamentals alongside deep experience in data engineering.

Your Role:
- Build: Develop and maintain data pipelines and transformations across the stack, from ingesting transactional data into the data lakehouse to refining it up the medallion architecture (a minimal sketch appears at the end of this listing).
- Optimize: Tune performance, storage layout, and cost-efficiency across our data storage and query engines.
- Extend: Help design and implement new data ingestion patterns and improve platform observability and reliability.
- Collaborate: Partner with engineering, product, and operations teams to deliver well-structured, trustworthy data for diverse use cases.
- Contribute: Help establish and evolve best practices for our data infrastructure, from pipeline design to OpenTofu-managed resource provisioning.
- Secure: Help design and implement a data governance strategy to secure our data lakehouse.

Your Qualifications:
- 8-10+ years of experience building and maintaining data pipelines in production environments
- Strong knowledge of the data lakehouse ecosystem, with an emphasis on AWS data services - particularly Glue, S3, Athena/Trino/PrestoDB, and Aurora
- Proficiency in Python, Spark, and Athena/Trino/PrestoDB for data transformation and orchestration
- Experience managing infrastructure with OpenTofu/Terraform or other Infrastructure-as-Code tools
- Solid understanding of data modeling, partitioning strategies, schema evolution, and performance tuning
- Comfortable working with cloud-native data pipelines and batch processing (streaming experience is a plus but not required)

What Sets You Apart:
- Systems thinker - you understand the tradeoffs in data architecture and design for long-term stability and clarity
- Outcome-driven - you focus on building useful, maintainable systems that serve real business needs
- Strong collaborator - you're comfortable working across teams and surfacing data requirements early
- Practical and hands-on - able to dive into logs, schemas, and IAM policies when needed
- Thoughtful contributor - committed to improving code quality, developer experience, and documentation across the board

Why Thoughtful?
- Competitive compensation
- Equity participation: Employee Stock Options
- Health benefits: comprehensive medical, dental, and vision insurance
- Time off: generous leave policies and paid company holidays

California Salary Range: $190,000—$250,000 USD
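To make the medallion flow concrete, here is a minimal PySpark sketch of a bronze-to-silver step: read raw transactional records, standardize and deduplicate them, and write the refined layer back out. Paths and columns are invented, and plain Parquet is used for brevity; the posting's actual stack layers Apache Iceberg tables on S3.

```python
# Hypothetical bronze -> silver refinement step in a medallion lakehouse.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze-to-silver-sketch").getOrCreate()

bronze = spark.read.parquet("s3://lake/bronze/claims/")  # invented path

silver = (
    bronze
    .withColumn("claim_date", F.to_date("claim_date"))        # enforce types
    .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
    .filter(F.col("claim_id").isNotNull())                    # basic quality gate
    .dropDuplicates(["claim_id"])                             # idempotent reloads
)

(
    silver.write.mode("overwrite")
    .partitionBy("claim_date")                                # partition for Athena
    .parquet("s3://lake/silver/claims/")
)
```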
Categories: Data Engineer, Data Science & Analytics · Posted July 29, 2025
Data Center Research & Development Engineer - Stargate
OpenAI
5000+ employees · United States · Full-time · Remote: No · USD 240,000-400,000
About the Team
OpenAI, in close collaboration with our capital partners, is embarking on a journey to build the world’s most advanced AI infrastructure ecosystem. The Data Center Engineering team is at the core of this mission. This team sets the infrastructure strategy, develops cutting-edge engineering solutions, partners with research teams to define the infrastructure performance requirements, and creates reference designs to enable rapid global expansion in collaboration with our partners. As a key member of this team, you will help design and deliver next-generation power, cooling, and hardware solutions for high-density rack deployments in some of the largest data centers in the world. You will work closely with stakeholders across research, site selection, design, construction, commissioning, hardware engineering, deployment, operations, and global partners to bring OpenAI’s infrastructure vision to life.

About the Role
We’re seeking a seasoned data center R&D engineer with extensive experience in designing, performing validation testing, commissioning, and operating large-scale power, cooling, and high-performance computing systems. This role focuses on developing and validating new infrastructure and hardware, including high-voltage rectifiers, UPS systems, battery storage, transformers, DC-to-DC converters, and power supplies. The role will lead the design and buildout of a hardware validation lab, create detailed models and test procedures, and ensure hardware compatibility across edge-case data center operating conditions. Additionally, the role involves working closely with hardware vendors to assess manufacturing test protocols, throughput, and liquid-cooled GPU rack performance. A strong foundation in technical design, operational leadership, and vendor collaboration is critical, with an opportunity to lead high-impact infrastructure programs.

In this role, you will:
- Oversee electrical, mechanical, controls, and telemetry design and operations for large-scale data centers, including review of building and MEP drawings across all project phases, from concept design to permitting, construction, commissioning, and production.
- Develop, test, and implement operational procedures and workflows from design through commissioning and deployment.
- Perform validation testing of all critical equipment and hardware in partnership with equipment vendors and ODMs.
- Lead buildout of the R&D lab, including equipment selection, test infrastructure for high-density liquid-cooled racks, and staffing plans.
- Select and manage engineering tools (CAD, CFD, PLM, PDM, electrical, mechanical, power/network management).
- Collaborate with external vendors to select, procure, and manage critical infrastructure equipment (e.g., UPS, generators, transformers, DC-to-DC converters, chillers, VFDs).
- Ensure seamless integration of power, cooling, controls, networking, and construction systems into facility design.
- Provide technical direction to teams and vendors, ensuring safety, quality, and compliance with local codes, standards, and regulations.
- Manage vendor relationships and ensure adherence to safety, performance, and operational standards.

Qualifications:
- 20+ years of experience in data center design, operations, and critical systems maintenance.
- Proven leadership across design, commissioning, and operation of large-scale data center campuses.
- Deep expertise in infrastructure systems (power, cooling, controls, networking) and operational workflows.
- Hands-on experience with critical infrastructure equipment and testing protocols.
- Strong track record in lab development, equipment selection, and facility operations.
- Familiarity with engineering tools (CAD, CFD, PLM, etc.) and their integration across teams.
- Experience navigating regulatory environments and working with government agencies.
- Excellent cross-functional communication and stakeholder collaboration.
- Bachelor’s degree in engineering required; advanced degree and PE certification preferred.

Preferred Skills:
- Expertise in equipment design, agency certification, and validation testing.
- Experience in global, matrixed organizations and multi-site operations.
- Skilled in vendor negotiations and supply chain management.
- Familiarity with sustainable and energy-efficient data center design principles.

About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link. OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
Categories: Data Engineer, Data Science & Analytics · Posted July 18, 2025
Staff Data Engineer
Glean
1001-5000 employees · India · Full-time · Remote: No
About Glean
Founded in 2019, Glean is an innovative AI-powered knowledge management platform designed to help organizations quickly find, organize, and share information across their teams. By integrating seamlessly with tools like Google Drive, Slack, and Microsoft Teams, Glean ensures employees can access the right knowledge at the right time, boosting productivity and collaboration. The company’s cutting-edge AI technology simplifies knowledge discovery, making it faster and more efficient for teams to leverage their collective intelligence.

Glean was born from Founder & CEO Arvind Jain’s deep understanding of the challenges employees face in finding and understanding information at work. Seeing firsthand how fragmented knowledge and sprawling SaaS tools made it difficult to stay productive, he set out to build a better way - an AI-powered enterprise search platform that helps people quickly and intuitively access the information they need. Since then, Glean has evolved into the leading Work AI platform, combining enterprise-grade search, an AI assistant, and powerful application- and agent-building capabilities to fundamentally redefine how employees work.

About the Role
Glean is building a world-class Data Organization composed of data science, applied science, data engineering, and business intelligence groups. Our data engineering group is based in our Bangalore, India office. In this role, you will work on customer-facing and Glean employee-facing analytics initiatives.

Customer-facing analytics initiatives: customers rely on in-product dashboards and, if they have the willingness and resources, self-serve data analytics to understand how Glean is being used at their company, in order to get a better sense of Glean's ROI and partner with Glean to increase user adoption. You're expected to partner with backend and data science teams to:
- maintain and improve the data platform behind these operations
- reflect usage of new features
- reflect changes to the underlying product usage logs for existing features
- identify and close data quality issues, e.g. gaps with internal tracking, and backfill the changes
- triage issues customers report to us within appropriate SLAs
- help close customer-facing technical documentation gaps

You will:
- Help improve the availability of high-value upstream raw data by:
  - channeling inputs from data science and business intelligence to identify the biggest gaps in data foundations
  - partnering with Go-to-Market & Finance operations groups to create streamlined data management processes in enterprise apps like Salesforce, Marketo, and various accounting software
  - partnering with Product Engineering teams as they craft product logging initiatives & processes
- Architect and implement key tables that transform structured and unstructured data into usable models for the data, operations, and engineering orgs.
- Ensure and maintain the quality and availability of internally used tables within reasonable SLAs.
- Own and improve the reliability, efficiency, and scalability of ETL tooling, including but not limited to dbt, BigQuery, and Sigma. This includes identifying, implementing, and disseminating best practices as well.

About you:
- You have 9+ years of work experience in software/data engineering (the former is strongly preferred) as a bachelor's degree holder. The requirement is 7+ years for master's degree holders and 5+ for PhD holders.
- You've served as a tech lead and mentored several data engineers before.
For customer-facing analytics initiatives:
- You have experience architecting, implementing, and maintaining robust data platform solutions for external-facing data products.
- You have experience implementing and maintaining large-scale data processing tools like Beam and Spark.
- You have experience working with stakeholders and peers from different time zones and roles, e.g. ENG, PM, data science, GTM, often as the main data engineering point of contact.

For internal-facing analytics initiatives:
- You have experience in full-cycle data warehousing projects, including requirements analysis, proof-of-concepts, design, development, testing, and implementation.
- You have experience in database design, architecture, and cost-efficient scaling.
- You have experience with cloud-based data tools like BigQuery and dbt.
- You have experience with data pipelining tools like Airbyte, Apache, Stitch, Hevo Data, and Fivetran.

General qualifications:
- You have a high degree of proficiency with SQL and are able to set best practices and up-level our growing SQL user base within the organization.
- You are proficient in at least one of Python, Java, and Golang.
- You are familiar with cloud computing services like GCP and/or AWS.
- You are concise and precise in written and verbal communication. Technical documentation is your strong suit.

You are a particularly good fit if:
- You have 1+ years of tech lead management experience. Note this is distinct from having tech lead experience, and involves formally managing others.
- You have experience working with customers directly in a B2B setting.
- You have experience with Salesforce, Marketo, and Google Analytics.
- You have experience in distributed data processing & storage, e.g. HDFS.

Location: This role is hybrid (3 days a week in our Bangalore office).

We are a diverse bunch of people and we want to continue to attract and retain a diverse range of people into our organization. We're committed to an inclusive and diverse company. We do not discriminate based on gender, ethnicity, sexual orientation, religion, civil or family status, age, disability, or race.
Categories: Data Engineer, Data Science & Analytics · Posted July 14, 2025
Data Infrastructure Engineer
HeyGen
201-500 employees · United States / Canada · Full-time · Remote: No
About HeyGen
At HeyGen, our mission is to make visual storytelling accessible to all. Over the last decade, visual content has become the preferred method of information creation, consumption, and retention. But the ability to create such content, in particular videos, continues to be costly and challenging to scale. Our ambition is to build technology that equips more people with the power to reach, captivate, and inspire audiences.
Learn more at www.heygen.com. Visit our Mission and Culture doc here.

Position Summary
At HeyGen, we are at the forefront of developing applications powered by our cutting-edge AI research. As a Data Infrastructure Engineer, you will lead the development of fundamental data systems and infrastructure. These systems are essential for powering our innovative applications, including Avatar IV, Photo Avatar, Instant Avatar, Interactive Avatar, and Video Translation. Your role will be crucial in enhancing the efficiency and scalability of these systems, which are vital to HeyGen's success.

Key Responsibilities
- Design, build, and maintain the data infrastructure and systems needed to support our AI applications, including:
  - large-scale data acquisition
  - multi-modal data processing frameworks and applications
  - storage and computation efficiency
  - AI model evaluation and productionization infrastructure
- Collaborate with data scientists and machine learning engineers to understand their computational and data needs and provide efficient solutions.
- Stay up-to-date with the latest industry trends in data infrastructure technologies and advocate for best practices and continuous improvement.
- Assist in budget planning and management of cloud resources and other infrastructure expenses.

Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- Proven experience in managing infrastructure for large-scale AI or machine learning projects
- Excellent problem-solving skills and the ability to work independently or as part of a team
- Proficiency in Python
- Experience optimizing computational workflows
- Familiarity with AI and machine learning frameworks like TensorFlow or PyTorch

Preferred Qualifications
- Experience with GPU computing
- Experience with distributed data processing systems
- Experience building large-scale batch inference systems
- Prior experience in a startup or fast-paced tech environment

What HeyGen Offers
- Competitive salary and benefits package
- Dynamic and inclusive work environment
- Opportunities for professional growth and advancement
- Collaborative culture that values innovation and creativity
- Access to the latest technologies and tools

HeyGen is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Data Engineer
Data Science & Analytics
DevOps Engineer
Data Science & Analytics
Apply
July 3, 2025
Audio Data Engineer – Speech Cleaning & Pipeline Automation (TTS)
Hippocratic AI
201-500
-
United States
Full-time
Remote
false
Data Engineer
Data Science & Analytics
Apply
June 10, 2025
QA automation engineer
Writer
1001-5000
-
United States
Full-time
Remote
true
📐 About this role
With WRITER, you'll be working closely with the product and engineering teams to ship a product that tens of thousands of people rely on every day. As a QA engineer, you'll work on key parts of the QA process and define key quality KPIs and metrics for the product. You'll drive testing efforts and lead our automation strategies, and we'll look to you for leadership on maintaining the highest bar of quality possible for all releases.

🦸🏻♀️ Your responsibilities
- Designing and developing automation frameworks: creating robust, scalable, and maintainable automated test frameworks from scratch or enhancing existing ones. This often involves proficiency in languages like TypeScript and Python.
- Performance and load testing: assessing application performance under various conditions, ensuring reliability and scalability.
- Creating and maintaining test scripts: writing automated test scripts at various levels (integration, API, end-to-end) to validate functionality, performance, and security.
- Implementing CI/CD integration: embedding automated tests within Continuous Integration/Continuous Delivery (CI/CD) pipelines to enable continuous testing and rapid feedback loops.
- Test planning and strategy: developing comprehensive test plans and strategies, defining test approaches, and identifying appropriate test data.
- Defect management: identifying, documenting, prioritizing, and tracking bugs and issues, and collaborating with development teams for timely resolution.
- Collaborating with development teams: working closely with developers, product managers, and other stakeholders throughout the software development lifecycle to ensure quality is built in from the start ("shift-left" testing).
- Mentoring and guiding: advising developers on unit testing best practices and promoting a culture of quality.
- Research and innovation: staying up to date with the latest advancements in test automation technology, tools, and methodologies.
- Data quality testing for AI/ML: ensuring the quality, integrity, and representativeness of training and testing data for AI models; creating specialized test frameworks and tools for evaluating AI and machine learning models, especially emerging technologies like large language models (LLMs); assessing the quality of AI-written content, identifying errors, inconsistencies, and areas where the AI's output can be improved; and ensuring the content meets high standards and aligns with user expectations and industry guidelines.

⭐️ Is this you?
- 8+ years of QA engineering experience
- 5+ years of automated testing experience
- Experience with test automation and hands-on coding using Playwright or Selenium with TypeScript/JavaScript
- Experience with UI automation frameworks
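The posting names Playwright with TypeScript/JavaScript; purely as an illustrative sketch (shown here in Playwright's Python binding for consistency with the other examples in this list, with an invented URL and selectors), an end-to-end check of the kind described might look like this.

```python
# Hypothetical sketch: a minimal end-to-end UI check with Playwright's
# Python API. The URL and selectors are invented placeholders.
from playwright.sync_api import sync_playwright

def test_login_form_submits():
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.com/login")  # placeholder URL
        assert page.title()  # the page rendered a title
        page.fill("#email", "qa@example.com")        # placeholder selector
        page.fill("#password", "not-a-real-secret")  # placeholder selector
        page.click("button[type=submit]")
        browser.close()
```

Wired into a CI/CD pipeline, checks like this run on every merge so regressions surface before a release rather than after it.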
🍩 Benefits & perks (US full-time employees)
- Generous PTO, plus company holidays
- Medical, dental, and vision coverage for you and your family
- Paid parental leave for all parents (12 weeks)
- Fertility and family planning support
- Early-detection cancer testing through Galleri
- Flexible spending account and dependent FSA options
- Health savings account for eligible plans with company contribution
- Annual work-life stipends for: home office setup, cell phone, and internet; wellness (gym, massage/chiropractor, personal training, etc.); learning and development
- Company-wide off-sites and team off-sites
- Competitive compensation, company stock options, and 401k

Writer is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records. By submitting your application on the application page, you acknowledge and agree to Writer's Global Candidate Privacy Notice.
Data Engineer
Data Science & Analytics
Apply
June 6, 2025
Datacenter Operations Technician
X AI
5000+
-
United States
Full-time
Remote
false
Data Engineer
Data Science & Analytics
Apply
May 22, 2025
Web Scraping Specialist
Wynd Labs
1-10
-
Anywhere
Full-time
Remote
true
Data Engineer
Data Science & Analytics
Apply
May 19, 2025
Senior Data Engineer
Together AI
201-500
USD 160,000 - 240,000
United States
Full-time
Remote
false
About the Role
Together AI is looking for a Senior Data Engineer to help define, build, and operate the data infrastructure that handles millions of events every day to power Together's mission-critical systems. As a Senior Data Engineer, you will work with our Data and Commerce engineering team to scale the data processing components of Together's usage-based billing system, real-time customer-facing analytics product, and internal business intelligence tools. You will work across both cloud-native services and globally distributed data centers. If you thrive in fast-paced environments and have a passion for defining and building early-stage data platforms for a rapidly scaling, data-intensive company, this role is for you.

Requirements
- 5+ years of demonstrated experience building large-scale, fault-tolerant, distributed data platforms, stream processing pipelines, ETLs, etc.
- Expert-level skills in designing, building, and operating stream processing pipelines with services like AWS Kinesis, Apache Kafka, or Redpanda
- Expert-level knowledge of building real-time, customer-facing analytics systems using services like Amazon Timestream or ClickHouse
- Proficiency in writing and maintaining infrastructure as code (IaC) using tools like Terraform, AWS CDK, or Pulumi
- Proficiency in version control practices and integrating IaC with CI/CD pipelines
- Proficiency in implementing and managing GitOps workflows with tools such as ArgoCD, GitHub Actions, TeamCity, or similar
- Proficiency in one or more of Golang, Rust, Python, Java, or TypeScript
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
- Experience with Kubernetes or containers is a plus

Responsibilities
- Identify, design, and develop foundational data infrastructure components capable of handling millions or billions of events daily
- Analyze and improve the robustness and scalability of existing data processing infrastructure
- Partner with product teams to understand functional requirements and deliver solutions that meet business needs
- Write clear, well-tested, and maintainable infrastructure-as-code and software for both new and existing systems
- Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
- Participate in an on-call rotation to address critical incidents when necessary

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $240,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
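As an illustrative sketch of the stream-processing building block this role centers on (not Together's actual code): consuming usage events from a Kafka topic with the confluent-kafka Python client might look like the following. The broker address, topic, group id, and event fields are invented placeholders.

```python
# Hypothetical sketch: consuming usage events from Kafka for a
# usage-based billing pipeline. Broker, topic, group id, and event
# fields are invented placeholders.
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "group.id": "billing-aggregator",       # placeholder consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["usage-events"])  # placeholder topic

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue  # no message this poll, or a transient error
        event = json.loads(msg.value())
        # In a real pipeline this would aggregate usage and write to a
        # sink such as ClickHouse for customer-facing analytics.
        print(event.get("customer_id"), event.get("tokens_used"))
finally:
    consumer.close()
```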
Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
Data Engineer
Data Science & Analytics
Apply
May 16, 2025