Summary:
On September 10th, 2024, users in the OnDemand environment received errors when connecting to the OBeer application between 1:15 PM and 2:25 PM MST. The outage was caused by a DNS resolution failure on the Domain Controller. We restored access by restarting the service. To prevent similar incidents, we are implementing additional detection measures.
Timeline:
Root Cause:
The outage was caused by a failure of the Domain Controller's DNS resolution function, which prevented external connections from reaching the domain. Restarting the service resolved the issue.
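A failure of this kind can be detected with a simple name-resolution probe. The following is a minimal sketch in Python, not the check actually deployed. The hostname `obeer.example.com` and the function names are hypothetical, and the probe uses whatever DNS resolver the monitoring host is configured with rather than querying the Domain Controller directly.

```python
import socket
from typing import Callable, List


def resolve_addresses(hostname: str,
                      resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated addresses for hostname.

    Returns an empty list if resolution fails. The resolver argument
    exists only so the probe can be exercised without network access.
    """
    try:
        results = resolver(hostname, None)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when the
        # name cannot be resolved.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({entry[4][0] for entry in results})


def is_resolvable(hostname: str,
                  resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if hostname resolves to at least one address."""
    return bool(resolve_addresses(hostname, resolver))
```

A scheduled job could call `is_resolvable("obeer.example.com")` from a host outside the domain and raise an alert when it returns `False`, which would flag a resolution failure without waiting for user reports.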
Additional Remediation:
To help detect and prevent future issues, we added monitoring to the Domain Controller: