Job Specifications
Duration: 5-6 months
Job Description:
We are looking for an Evaluation Scientist who can work across both hands-on experimentation and automation infrastructure. This role begins with running manual evaluations (e.g., executing and monitoring individual experiments) and progresses toward building scripts, tools, and infrastructure that streamline and automate these processes, with the long-term goal of reducing manual work as much as possible.
The ideal candidate will also bring expertise in coding agents and quality evaluation, enabling them to design robust experiments and improve workflows. While the role will receive high-level guidance, candidates should be able to independently define and implement the lower-level details of experiment setup after ramping up. For example, given a high-level requirement for a new type of evaluation, the candidate should be able to propose and execute an implementation plan with detailed steps, metrics, and automation in place.
Responsibilities:
Run and manage manual evaluation experiments across AI/ML systems.
Develop and maintain automation infrastructure (scripts, pipelines, tools) to reduce manual evaluation work.
Design and execute new types of evaluations, translating broad research questions into structured experiment setups.
Work with coding agents and applied ML workflows to define and measure quality.
Define metrics, benchmarks, and evaluation criteria to assess performance and identify gaps.
Collaborate with research leads to align evaluation design with project goals while owning implementation details.
Ensure reproducibility, consistency, and scalability of evaluation processes.
Experience (Required):
Strong coding skills in Python (or equivalent) for scripting, automation, and experiment design.
Experience running and analyzing experiments, including quality evaluation methodologies.
Knowledge of coding agents, ML models, or applied automation frameworks.
Ability to work independently: take high-level requirements and define detailed steps for execution.
2-4 years of hands-on experience in evaluation, scripting, or applied data science/ML (academic or industry).
Strong analytical skills with experience in data handling, reporting, and experiment analysis.
Experience (Desired):
Familiarity with evaluation frameworks and automation tools in AI/ML research.
Experience in building scalable infrastructure for experiments or evaluations.
Knowledge of experimental design, statistical testing, or quality benchmarking.
Education:
Bachelor's degree or equivalent practical experience.
About US Tech Solutions:
US Tech Solutions is a global staff augmentation firm providing a wide range of on-demand talent and total workforce solutions. To learn more about US Tech Solutions, please visit www.ustechsolutions.com.
US Tech Solutions is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.
Recruiter's email: dhaval@ustechsolutionsinc.com
JobDiva ID: 25-49510
About the Company
USTECH SOLUTIONS is the largest privately owned diversity workforce partner with a global footprint. For 20+ years, we have partnered with leading MSPs and some of the world's largest enterprises to deliver a flexible workforce. We serve Fortune 500 giants and growing businesses alike, reinventing the role of humans in a digital workforce.
'Reinventing Human' is about connecting you with top talent and seamlessly integrating new hires into your programs through our next-gen, AI-powered talent platforms.