Monitoring
Check /healthy
You can query the /healthy or /healthy?verbose endpoint to see the health of the system. This endpoint checks various health signals, such as ascending block height, engine nodes signing blocks, local MPC connectivity, and enrollment of credentials for the signer or connector.
The endpoint returns 200 OK if the node is healthy, and 503 Service Unavailable otherwise. Using /healthy?verbose shows JSON-encoded details about the health checks; that output is not API-stable.
The endpoint reports the local health of the node, and not necessarily that of other nodes. For example, on an API node, no MPC-related health checks are done. If another node is down, however, any resulting local connectivity issues will be reported.
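As a rough illustration of how the two status codes could drive a probe, the sketch below maps the HTTP status of /healthy to a label. It assumes curl is installed, that API_URL holds host:port (as in the websocat command in this document), and that the API is served over plain HTTP. The function names health_from_status and check_health are hypothetical and not part of Treasury.

```shell
# Map an HTTP status code from /healthy to a label.
# 200 and 503 are the documented codes; anything else is treated as unknown.
health_from_status() {
  case "$1" in
    200) echo "healthy" ;;
    503) echo "unhealthy" ;;
    *)   echo "unknown" ;;
  esac
}

# Query /healthy and print the label.
# Assumption: API_URL is host:port and the API is reachable over plain HTTP.
check_health() {
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://${API_URL}/healthy")
  health_from_status "$code"
}
```

Calling check_health from a cron job or probe script and alerting on anything other than "healthy" is one way to use it. When the node cannot be reached at all, curl reports 000, which the sketch labels "unknown".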
Check height is ascending
The "height" refers to the count of state commitments between all nodes of a Treasury cluster.
It should go up by 1 roughly every 5 seconds. You can see the height by checking /v1/treasury:
{
  "name": "treasuries/v2pmy6b8gAriYnaZkeKwMN",
  "software": "24.4.6-pre.4 (rev cb92779e2)",
  "network": "!mainnet",
  "block": {
    "height": "5851",
    "time": "2024-11-14T23:29:02Z"
  }
}
Note that this check is always done by the /healthy endpoint.
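If you want to watch the height yourself, a minimal sketch could sample /v1/treasury twice and compare the values. It assumes curl is installed, that API_URL is host:port over plain HTTP, and that the response has the shape of the sample in this section. The function names and the 30-second default wait are hypothetical, and the sed pattern is a quick field match rather than a real JSON parser.

```shell
# Pull the block height out of a /v1/treasury response read from stdin.
# Sketch only: a sed pattern match on the "height" field, not a JSON parser.
extract_height() {
  sed -n 's/.*"height"[[:space:]]*:[[:space:]]*"\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Succeed if the second height is greater than the first.
height_ascended() {
  [ "$2" -gt "$1" ]
}

# Sample the height twice and report whether it moved.
# Assumptions: API_URL is host:port over plain HTTP; the default wait is arbitrary.
check_height_ascending() {
  before=$(curl -s "http://${API_URL}/v1/treasury" | extract_height)
  sleep "${1:-30}"
  after=$(curl -s "http://${API_URL}/v1/treasury" | extract_height)
  if height_ascended "$before" "$after"; then
    echo "ascending: $before -> $after"
  else
    echo "stalled at: $after"
    return 1
  fi
}
```

With a height that rises about once every 5 seconds, a wait of 30 seconds leaves room for several increments, so a single slow block is unlikely to trigger a false alarm.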
Check events
Treasury has a JSON event stream that you can hook into. An event is emitted for every state change. For example, you could create alerts that fire whenever an AccessRule or TransferRule is updated.
websocat ws://${API_URL}/v1/audit
These events could be stored somewhere and used as an audit log.
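One way to build such an alert is to filter the stream for the rule types of interest. The sketch below assumes each event arrives as a single line of JSON and that the line contains the resource type name (AccessRule or TransferRule). The event schema is not described here, so treat the match as a placeholder for a proper field check. The function name filter_rule_events is hypothetical.

```shell
# Read events from stdin, one per line, and print only those mentioning a rule type.
# Assumption: each event is a single line of JSON that contains the resource type name.
filter_rule_events() {
  while IFS= read -r event; do
    case "$event" in
      *AccessRule*|*TransferRule*) printf '%s\n' "$event" ;;
    esac
  done
}

# Usage (requires a running node and websocat):
#   websocat "ws://${API_URL}/v1/audit" | filter_rule_events
```

The read loop processes each line as it arrives, so matching events are printed promptly and can be piped into whatever alerting command you use.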
Supervisor Slack alerts
You can connect cord supervise run ... to Slack to post messages. It will alert when:
- The height of the node hasn't ascended in a while.
- An update is occurring, and whether it fails or succeeds.
To set up the Slack app:
- Create a Slack app from scratch.
- Optionally add a photo and description.
- Go to "OAuth & Permissions".
- Add the chat:write Bot Token scope.
- Install to workspace.
- In Slack, add your bot to a channel, e.g. "#test-alerts". Use the channel's integrations tab.
- Copy the Bot User OAuth Token (starts with "xoxb-").
Test sending an alert.
export SLACK_API_KEY=xoxb-...
cord supervise alert --slack-channel '#test-alerts' hi
You can then enable Slack alerts by setting SLACK_API_KEY and adding --slack-channel to your cord supervise run ... command.
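A small preflight check can catch a wrong or missing token before the test alert is sent. The sketch below only checks the "xoxb-" prefix mentioned in the setup steps and then calls the cord supervise alert command shown earlier. The wrapper names is_bot_token and send_test_alert are hypothetical.

```shell
# Succeed if the value looks like a Bot User OAuth Token (prefix "xoxb-").
is_bot_token() {
  case "$1" in
    xoxb-?*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Send a test alert only when SLACK_API_KEY has the expected prefix.
# Arguments: channel name, message. The wrapper itself is hypothetical.
send_test_alert() {
  if ! is_bot_token "${SLACK_API_KEY:-}"; then
    echo "SLACK_API_KEY is unset or is not a bot token" >&2
    return 1
  fi
  cord supervise alert --slack-channel "$1" "$2"
}
```

For example, send_test_alert '#test-alerts' hi behaves like the test command above, but fails early with a message when the token is missing or has a different prefix.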
Events / Audit Log
By default, Treasury stores a rotating log of JSON-formatted events in $TREASURY_BACKUP_DIR/events.
These events are API-stable and reflect all state changes in your Treasury.