AI Evaluation Engineer (Agentic Coding / Software Engineering) Job
IT Jobs. Gramian Consultancy Jobs
Role overview
We are looking for an AI Evaluation Engineer specialized in software engineering workflows to evaluate and improve datasets used for agentic coding models.
In this role, you will work on realistic coding tasks: reviewing model trajectories, validating outputs, and producing high-quality evaluations. This is a hands-on engineering role that requires strong debugging skills, attention to detail, and the ability to assess correctness in real code scenarios.
Commitment required: 8 hours per day, including 4 hours of overlap with PST.
Employment type: Contractor assignment (no medical/paid leave)
Duration of contract: 5+ weeks
Location: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Turkey, Vietnam
Interview: Take-home assessment (60 min)
Key Responsibilities
- Execute coding tasks within agentic coding environments, maintaining strict evaluation protocols
- Review and evaluate model-generated code trajectories for correctness and completeness
- Validate outputs by reading code, running tests, analyzing logs, and inspecting artifacts
- Perform targeted validation using scripts, tests, and manual checks
- Write clear, evidence-based rationales for evaluations and rankings
- Design realistic, multi-step coding tasks and workflows (offline work)
- Create and refine evaluation rubrics and scoring criteria
- Ensure consistency, quality, and compliance across evaluations
- Identify issues in environments, instructions, or workflows and report with clear evidence
Qualifications & Experience
- 5+ years of experience in software engineering, QA, developer tooling, or similar code-heavy roles
- Strong proficiency in at least one programming ecosystem (e.g., Python, JavaScript/TypeScript, Java, C/C++, Rust, SQL)
- Ability to read and understand unfamiliar codebases and implement/debug changes
- Experience running and interpreting tests, scripts, and CLI tools
- Strong debugging and problem-solving skills, including handling edge cases
- Comfortable working in Linux/terminal environments
- Familiarity with Git workflows and standard development tooling
- Experience with AI coding tools or agentic coding environments (e.g., Cursor, Claude Code, or similar)
- Strong attention to detail and ability to produce consistent, high-quality evaluations
How to Apply
🚨 Before You Apply for This Job. Need Help With Your CV?
This job will attract 1000+ applicants.
Many qualified professionals miss out on shortlists and interviews, not because they lack experience, but because their CV doesn’t clearly show how they fit this specific job.
🎯 Want to get an interview fast? Customize your CV specifically for this job.
Using the same CV for every application will not get you interviews.
Email your CV today to our Client Service Manager, Rose, at cvwriting@corporatestaffing.co.ke
Subject: CV Review & Upgrade.
Rose and our recruiters will review your CV and show you exactly how to improve it for the job you are targeting.
Using an AI-generated CV but not getting interviews? Have it reviewed by our recruiters today.

