ZAITRA, founded in 2020, is a startup focused on delivering tailored flight software and AI solutions for space missions. We work on projects for both the European Space Agency and commercial customers.
About the role
As a Dataset Engineer, you will play a key role in maintaining and organizing our growing dataset repository, ensuring that our AI models have access to high-quality, well-curated data. Your responsibilities will include:
- Managing and cataloging datasets, both locally and in cloud storage solutions such as S3.
- Implementing robust solutions for local data mirrors to ensure efficient access and backups.
- Designing and maintaining a dataset annotation system, ensuring high-quality labeled data.
- Creating dataset formats for different tasks (e.g., classification, regression, segmentation, object detection) that are optimized for both storage size and loading speed.
- Transforming various datasets into a unified format for ML training.
- Developing efficient methods to search and identify data containing specific objects, events, or patterns for annotation and model training.
- Finding and evaluating relevant public datasets to meet project requirements.
- Collaborating with third parties to acquire commercial data.
- Supporting the team in managing storage resources and optimizing data workflows.
Required qualifications
- Strong organizational skills and experience managing large datasets.
- Familiarity with dataset annotation tools and processes, including labeling and quality control.
- Hands-on experience with cloud storage solutions like AWS S3 and version control for data.
- Proficiency in Python programming.
- Understanding of database systems and search tools for querying large datasets (you will be working with systems such as EarthExplorer).
- Enthusiasm for working on space-related projects and contributing to cutting-edge AI applications.
What we offer
- Hybrid work (preferably with at least 3 days a week in our office).
- Flexible working hours.
- Modern offices near the center of Brno.
- Direct impact on the development and direction of products for space missions.
- Involvement in the development of solutions to be deployed in Earth’s orbit and beyond.
- 5 weeks of vacation.
- Meal vouchers.
- Team-building activities and company events.
What is nice to have
- Experience with machine learning datasets and knowledge of common formats (e.g., COCO, Pascal VOC).
- Familiarity with annotation platforms such as CVAT, Label Studio, or similar.
- Knowledge of satellite imagery or Earth observation datasets.
- Understanding of metadata management and schema design for efficient data querying.
- Prior experience with versioning systems for datasets (e.g., DVC, Git LFS).
Interview process
- Submit your CV and a short cover letter via email to jobs@zaitra.io.
- Attend a 45-minute call with the CTO and HR.
- Complete a short take-home task.
- Participate in a 60-minute technical screening, including a review of your task.
- Receive an offer.