AI Evaluation Engineer (Data Analysis & Multi-Agent Systems)

IT Jobs. Gramian Consultancy Jobs

Responsibilities

  • Design and develop multi-agent benchmark tasks focused on complex data analysis workflows
  • Create or curate realistic datasets (CSV, JSON, logs, reports, financial or operational data)
  • Build tasks requiring:
    • Cross-referencing across multiple data sources
    • Anomaly detection and contradiction identification
    • Statistical analysis and interpretation
  • Define task decomposition strategies across specialized sub-agents (e.g., financial, technical, operational analysis)
  • Develop verification logic to validate precise analytical outputs (not generic summaries)
  • Implement evaluation pipelines using Python and SQL
  • Create reproducible environments using Docker
  • Analyze task performance and refine for clarity, difficulty, and scoring accuracy

Requirements

  • 5+ years of experience in data analysis or analytics-heavy roles
  • Strong proficiency in Python (pandas, NumPy) and SQL
  • Experience working with real-world, messy datasets (CSV, JSON, logs, reports)
  • Ability to design analytical problems with clear, verifiable answers
  • Solid understanding of statistics (distributions, correlations, outliers)
  • Familiarity with AI benchmarks or evaluation environments (e.g., SWE-bench or similar)
  • Hands-on experience with Docker (Dockerfiles, image builds, debugging)
