The role spans technical exploration, rapid prototyping, system design, evaluation, and refinement.
* Design and build agentic systems that combine models, tools, memory, orchestration, and evaluation mechanisms into robust end-to-end solutions
* Define and implement rigorous evaluation and benchmarking approaches to measure quality, reliability, and business usefulness
* Strong experience defining and implementing evaluation, benchmarking, and testing approaches for AI systems