We're the team behind Latent Diffusion, Stable Diffusion, and FLUX, foundational technologies that changed how the world creates images and video. You'll shape training objectives, architectures, data strategies, and systems behind our joint image, video, and audio foundation models, with a direct line from your research to products used by millions.

* Lead large-scale pretraining experiments for our multimodal (image, video, audio) foundation models (architecture, objective functions, scaling strategies)
* You've led or co-owned pretraining for a foundation model (image, video, LLM, or multimodal) that shipped to production or a major release