Code Data Quality Specialist
Mistral AI
Data Science, Quality Assurance
Paris, France
We’re seeking highly motivated Code Data Quality Specialists with strong analytical skills and a keen eye for detail to join our Human Data Annotation team within the Science organisation.
Key Responsibilities
Review and validate high-quality data annotations against guidelines and continuous feedback, supporting the development and evaluation of AI models
Surface systemic issues, edge cases, and gaps in guidelines back to annotation operations and technical stakeholders
Produce annotations yourself when needed, modeling the quality bar expected of the team
Build and maintain internal tools and automation that streamline annotator workflows, such as visualization dashboards, batch configuration scripts, and output management utilities
Troubleshoot environment, tooling, and CLI/git issues for annotators on their local machines, liaising with IT and engineering as needed
About you
A degree in computer science, engineering, or a related field, or 2 to 5 years of professional experience in software engineering, technical support, or tooling development
Hands-on experience using code agents (e.g. Mistral’s vibe) in your own development workflow, and a genuine interest in how they’re evolving
Proficient in at least one programming language (e.g. Python or JavaScript), with enough breadth to read and reason about code across a few core languages
Able to apply consistent judgment against a rubric and surface edge cases, ambiguities, or gaps in guidelines
Sustained focus and accuracy on detail-oriented, high-volume review work
Comfortable working in a Unix-like terminal: shell basics, package managers, environment setup, and git workflows (branching, merging, and resolving conflicts)
Able to troubleshoot local development environment issues (dependencies, virtual environments, paths, permissions) across common operating systems
Professional proficiency in English, with strong writing and comprehension skills
Nice to have
Prior experience in data annotation for AI/ML, especially LLM training (SFT, RLHF, preference data), evals/benchmarks, or agentic data
Experience building annotation teams, including interviewing and training annotators
Experience supporting technical users or troubleshooting developer environments (internal tools support, DevRel, teaching assistant for coding courses, etc.)
Fluency across multiple programming languages, or deep domain expertise in one of frontend, backend, DevOps, MLOps, or data engineering
Familiarity with rubric-based evaluation concepts, inter-annotator agreement, or quality measurement for human-labeled data
Experience developing, deploying, and managing internal tooling or automation scripts