Academic Researcher – Remote (AI Model Evaluation)
Crossing Hurdles · United States
Job description
About the role
Join a frontier model evaluation initiative focused on advancing next‑generation large language model systems across technical and professional domains. This is a remote, W‑2 contingent position requiring 30+ hours per week of independent research work.
Key responsibilities
- Design and validate challenging, real‑world benchmark tasks rooted in your academic or professional expertise.
- Build executable Python‑based problem sets with clear specifications and verifiable test cases for agentic workflows.
- Evaluate model outputs to identify reasoning, logic, and problem‑solving failures in complex scenarios.
- Develop structured gold‑standard solutions and rubrics to support consistent evaluation frameworks.
- Analyze system behavior to surface capability gaps and failure modes in advanced reasoning tasks.
- Collaborate with domain experts across STEM and quantitative disciplines to improve evaluation quality and rigor.
Required profile
- Current or retired professor, or PhD candidate, in a STEM or professional discipline (e.g., computer science, mathematics, physics, engineering, statistics, economics, finance, law).
- Strong academic background from a top‑tier university or equivalent research environment.
- Working proficiency in Python applied in research, coursework, or professional settings.
- Ability to design and implement executable problem‑solving tasks and computational workflows.
- Strong analytical thinking with the ability to assess logical correctness and system behavior.
- Capacity to work independently and consistently for at least 30 hours per week on weekdays.
Required skills
- Python
What we offer
- Remote work with flexible scheduling.
- Competitive compensation (approximately $80k to $110k per year).
- Opportunity to contribute to cutting‑edge AI research alongside leading academic and industry experts.
Published 1 day ago
Expires 1 month from now