Fahrschule Black Forest GmbH

In this role, you'll own the post-training pipeline for our multimodal models end to end, from data strategy and reward modeling to preference optimization, distillation, and safety tuning, across image, editing, and video.

* Own the full post-training pipeline end to end, from data curation and reward modeling through fine-tuning, preference optimization, distillation, safety tuning, evaluation, and deployment
* Identify quality and alignment gaps through rigorous evaluation, then close them through targeted research and engineering
* You've owned post-training for a frontier generative model through release, including SFT, preference optimization (DPO or RLHF), distillation, and safety tuning, with measurable quality wins on human preferences or standard benchmarks