AI Dataset Engineer

ZAITRA, founded in 2020, is a startup focused on delivering tailored flight software and cutting-edge AI solutions for space missions. We work on projects for both the European Space Agency and commercial customers.

About the role

As a Dataset Engineer, you will play a key role in maintaining and organizing our growing dataset repository, ensuring that our AI models have access to high-quality, well-curated data. Your responsibilities will include:

  • Managing and cataloging datasets, both locally and in cloud storage solutions such as S3.
  • Implementing robust solutions for local data mirrors to ensure efficient access and backups.
  • Designing and maintaining a dataset annotation system, ensuring high-quality labeled data.
  • Creating dataset formats for different tasks (e.g., classification, regression, segmentation, object detection) that are optimized for both storage size and loading speed.
  • Transforming various datasets into a unified format for ML training.
  • Developing efficient methods to search and identify data containing specific objects, events, or patterns for annotation and model training.
  • Finding and evaluating relevant public datasets to meet project requirements.
  • Collaborating with third parties to acquire commercial data.
  • Supporting the team in managing storage resources and optimizing data workflows.

Required qualifications

  • Strong organizational skills and experience managing large datasets.
  • Familiarity with dataset annotation tools and processes, including labeling and quality control.
  • Hands-on experience with cloud storage solutions like AWS S3 and version control for data.
  • Proficiency in Python programming.
  • Understanding of database systems and search tools for querying large datasets (you will be working with systems such as EarthExplorer).
  • Enthusiasm for working on space-related projects and contributing to cutting-edge AI applications.

What we offer

  • Hybrid work (preferably with at least 3 days a week in our office).
  • Flexible working hours.
  • Modern offices near the center of Brno.
  • Direct impact on the development and direction of products for space missions.
  • Involvement in the development of solutions to be deployed in Earth’s orbit and beyond.
  • 5 weeks of vacation.
  • Meal vouchers.
  • Team-building activities and company events.

What is nice to have

  • Experience with machine learning datasets and knowledge of common formats (e.g., COCO, Pascal VOC).
  • Familiarity with tools like CVAT, Label Studio or similar annotation platforms.
  • Knowledge of satellite imagery or Earth observation datasets.
  • Understanding of metadata management and schema design for efficient data querying.
  • Prior experience with versioning systems for datasets (e.g., DVC, Git LFS).

Interview process

  • Submit your CV and a short cover letter via email to jobs@zaitra.io.
  • Attend a 45-minute call with the CTO and HR.
  • Complete a short take-home task.
  • Participate in a 60-minute technical screening, including a review of your task.
  • Receive an offer.

Where: Brno

Type: Full time

Contact: jobs@zaitra.io