AI Data | Skolkovo / remote | Full-time

AI Data & Evaluation Engineer

About the role

YappiX is hiring an AI Data & Evaluation Engineer to work on datasets, labeling, benchmark design, LLM evaluation, and quality control for AI-first products and new AI architectures.

We need someone who understands that models do not improve because of slogans — they improve because of better data, honest tests, and rigorous evaluation.

Responsibilities

  • collect, clean, normalize, and structure datasets
  • build evaluation pipelines and benchmark suites
  • design private test sets, adversarial tests, and quality metrics
  • analyze model failures and identify architectural weak points
  • work with synthetic data, filtering, deduplication, and quality control
  • support data workflows for research and product experiments
  • collaborate with research and engineering teams on model quality and measurable outcomes

Requirements

  • Python
  • experience with data workflows, ML datasets, and data pipelines
  • understanding of LLM evaluation, quality metrics, and benchmarking
  • attention to detail and strong data discipline
  • ability to detect system-level errors rather than only local issues
  • ability to propose metrics and validation schemes independently
  • understanding of reproducibility and data quality

Nice to have

  • experience with NLP, LLMs, prompt evaluation, or red teaming
  • experience with synthetic data generation and dataset curation
  • experience in labeling, QA, and research analytics
  • experience with SQL, DuckDB, Pandas, Arrow, or Hugging Face Datasets

You may not be a fit if

  • you treat data work as a secondary support task
  • you cannot design honest and reproducible tests
  • you do not distinguish between “the model sounds good” and “the model is correct”

What we offer

  • work on AI-first systems and new AI architectures
  • a strong role in model quality and measurable results
  • the opportunity to build benchmark and evaluation systems from scratch
  • a compact team and fast experimental cycles
  • Skolkovo / remote

How to apply

Send your CV, examples of data or evaluation work, and a short note on how you designed test sets or quality metrics to hr@yappix.ru or via https://yappix.ru/en/contact