* Implementing testing, evaluation, monitoring, and observability best practices for production agent systems (LangSmith / LangFuse, structured tracing, offline and online evaluation).
* Identifying, testing, and adopting state-of-the-art advancements in LLMs and autonomous agent architectures (e.g. reflection, planning, multi-agent coordination, AWS Bedrock and Bedrock AgentCore).
* Experience: 5+ years of hands-on software engineering experience, with 2-3 years building AI agents in production environments, ideally for high-stakes or regulated industries (finance, legal, healthcare, etc.).
* Python & backend: Expert-level Python skills and strong engineering fundamentals across backend systems, APIs, and data pipelines (FastAPI, Postgres / SQLAlchemy, gRPC, async IO, Databricks, testing).
* Engineering fundamentals: Solid understanding of software design principles (SOLID, DRY, KISS), Clean Code, and ...