If you're obsessed with distributed systems at scale, infrastructure reliability, scalability, security, and continuous improvement, this team would be perfect for you.

* Maintain research infrastructure, ensuring its health and optimizing components to extract peak performance from the system (on both the application and infrastructure sides).
* Scale infrastructure to meet growing research demands while maintaining reliability and performance.
* Collaborate with research teams to deeply understand their infrastructure needs, and design solutions that balance performance with cost efficiency.
* Build and evolve telemetry and monitoring systems to provide deep visibility into infrastructure performance, utilization, and costs across our cloud and datacenter fleets.
* Deep knowledge of modern cloud infrastructure, including Kubernetes, Infrastructure as Code, AWS, and GCP.