OpenSRE
OpenSRE is an open-source framework for AI SRE agents that investigate production incidents from alert to remediation plan. It is licensed under Apache 2.0 and built for teams that want to run incident automation in their own infrastructure instead of sending telemetry into a managed black box. When an alert fires, OpenSRE pulls context from logs, metrics, and traces, then produces an evidence-linked report with probable root causes and next actions.
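To make "evidence-linked report" concrete, here is a minimal sketch of what such a report could look like as a data structure. It is an illustration only, not OpenSRE's actual schema: the class names `Evidence`, `Hypothesis`, and `Report`, their fields, and the `unsupported` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    # Hypothetical fields: which telemetry type the item came from,
    # a pointer back to it, and a one-line summary.
    source: str      # "logs", "metrics", or "traces"
    reference: str   # e.g. a saved query or trace ID (assumed format)
    summary: str


@dataclass
class Hypothesis:
    cause: str
    evidence: list[Evidence] = field(default_factory=list)


@dataclass
class Report:
    alert: str
    hypotheses: list[Hypothesis]
    next_actions: list[str]

    def unsupported(self) -> list[str]:
        """Return the causes that cite no evidence at all."""
        return [h.cause for h in self.hypotheses if not h.evidence]


report = Report(
    alert="checkout-api p99 latency above threshold",
    hypotheses=[
        Hypothesis(
            cause="connection pool exhaustion",
            evidence=[Evidence("logs", "query-1", "pool timeout errors")],
        ),
        Hypothesis(cause="noisy neighbour on shared node"),
    ],
    next_actions=["raise pool size", "check node utilisation"],
)
```

In this sketch, a reviewer could call `report.unsupported()` to spot any proposed root cause that is not backed by a log, metric, or trace reference before acting on it.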
The project fits engineering teams that already use observability and incident tools such as Datadog, Grafana, PagerDuty, and Slack, but need one investigation workflow across them. OpenSRE can post summaries to communication and incident channels, reducing handoffs between on-call, SRE, and application teams. It also supports multiple model providers, so teams can pick OpenAI, Anthropic, Gemini, Ollama, or other providers based on policy and cost constraints.
OpenSRE is also unusual because it includes an evaluation environment for incident-response agents. The repository ships synthetic root-cause tests and cloud-backed end-to-end scenarios that measure whether an agent finds the right cause with supporting evidence. That makes it useful for teams that want to compare workflows before production rollout, not just run demos.
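As a rough illustration of what "finds the right cause with supporting evidence" could mean as a pass/fail check, the sketch below scores one agent result against one scenario. It does not reflect OpenSRE's real test harness: `Scenario`, `AgentResult`, `score`, and the exact-match rule for causes are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    # Hypothetical ground truth for one synthetic incident.
    expected_cause: str
    required_evidence: frozenset


@dataclass(frozen=True)
class AgentResult:
    # Hypothetical agent output: the named cause and the evidence IDs it cited.
    cause: str
    cited_evidence: frozenset


def score(scenario: Scenario, result: AgentResult) -> dict:
    """Pass only if the cause matches and all required evidence is cited."""
    # Assumption: causes are compared as case-insensitive exact strings.
    cause_ok = result.cause.strip().lower() == scenario.expected_cause.strip().lower()
    missing = scenario.required_evidence - result.cited_evidence
    return {
        "cause_correct": cause_ok,
        "evidence_complete": not missing,
        "missing": sorted(missing),
        "passed": cause_ok and not missing,
    }


scenario = Scenario(
    expected_cause="Expired TLS certificate",
    required_evidence=frozenset({"log:handshake-failure", "metric:5xx-spike"}),
)
right_cause_thin_evidence = AgentResult(
    cause="expired tls certificate",
    cited_evidence=frozenset({"log:handshake-failure"}),
)
outcome = score(scenario, right_cause_thin_evidence)
```

Here the agent names the correct cause but cites only one of the two required evidence items, so the sketch marks the run as failed. Separating `cause_correct` from `evidence_complete` keeps a lucky guess distinguishable from a supported diagnosis when comparing workflows.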
Best for teams with moderate ops maturity that want self-hosted AI-assisted incident response, clear audit trails, and control over integrations, data retention, and deployment pace.

