Currently when running evals with the builtin direct-model agent (rather than using off the shelf agents like Claude Code), there is a requirement that the model is hosted at an OpenAI compliant endpoint.
Many models have their own APIs, and we should move from using the OpenAI client package to a more general one that handles that complexity for us.