* You will design, build, and operate GPU-centric AI infrastructure (especially NVIDIA) across on-prem and cloud environments, with a strong focus on performance, scalability, and efficiency.
* You are responsible for developing and operating core infrastructure components such as scheduling and resource management systems (e.g., SLURM, Ray, Run:ai), ensuring efficient utilization of shared GPU resources.
* Using modern tooling, you build and maintain automated, reproducible infrastructure (e.g., Docker, Kubernetes, Terraform, Ansible, CI/CD).
* You contribute to BMW-specific AI use cases by providing reliable and scalable infrastructure.
* Your role is rounded out by taking technical ownership of the AI infrastructure stack, defining best practices, and guiding less experienced colleagues.
* Several years of professional experience (8–10 years) in industry, building and operating AI and HPC infrastructure.