Query and monitor logs from your entire cluster from a single point. No external infrastructure required.
A production issue hits 5 servers. You need to correlate logs across all of them instantly to find the root cause.
# Query logs from all nodes at once
$ curl https://any-node:20194/api/v1.0/logs \
-G --data-urlencode "filter=ERROR"
# Results from all 5 nodes, chronologically ordered
[
{
"node": "prod-1",
"timestamp": "2025-02-20T06:23:45.123Z",
"message": "Database connection timeout"
},
{
"node": "prod-2",
"timestamp": "2025-02-20T06:23:46.456Z",
"message": "Orchestration failed: DB unreachable"
}
]
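For scripted use, the same query can be built from code. The sketch below is illustrative only: it assumes the endpoint returns the JSON array shown above, mirrors `curl -G --data-urlencode` with Python's standard `urllib.parse.urlencode`, and re-sorts entries by `timestamp` in case you merge results yourself. The helper names `build_query_url` and `sort_chronologically` are hypothetical, not part of DM-WebManager.

```python
from datetime import datetime
from urllib.parse import urlencode

# Host, port, and path are copied from the curl example; adjust for your cluster.
BASE_URL = "https://any-node:20194/api/v1.0/logs"


def build_query_url(base_url: str, **params: str) -> str:
    """Append URL-encoded query parameters, like `curl -G --data-urlencode`."""
    return f"{base_url}?{urlencode(params)}"


def parse_ts(ts: str) -> datetime:
    # Sample timestamps end in "Z"; swap it for an explicit UTC offset so
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def sort_chronologically(entries: list[dict]) -> list[dict]:
    """Return entries ordered by their `timestamp` field (oldest first)."""
    return sorted(entries, key=lambda e: parse_ts(e["timestamp"]))


# Entries taken from the sample response, deliberately out of order.
sample = [
    {"node": "prod-2", "timestamp": "2025-02-20T06:23:46.456Z",
     "message": "Orchestration failed: DB unreachable"},
    {"node": "prod-1", "timestamp": "2025-02-20T06:23:45.123Z",
     "message": "Database connection timeout"},
]

url = build_query_url(BASE_URL, filter="ERROR")
ordered = sort_chronologically(sample)

print(url)
for entry in ordered:
    print(entry["timestamp"], entry["node"], entry["message"])
```

Pass the resulting URL to whichever HTTP client you already use; no request is sent by this sketch.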
# Filter by orchestration execution
$ curl https://any-node:20194/api/v1.0/logs \
-G --data-urlencode "execution_id=exec-123"
DM-WebManager provides live log streaming and real-time orchestration monitoring:
2025-02-20 06:23:00 [prod-1] INFO Starting deploy-v2
2025-02-20 06:23:01 [prod-2] INFO Pulling image
2025-02-20 06:23:03 [prod-3] INFO Running health check
2025-02-20 06:23:04 [prod-1] WARN Health slow (2.3s)
2025-02-20 06:23:05 [prod-4] OK All checks passed
Filter by node, level, timestamp range, orchestration ID, or custom regex patterns.
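These filters can also be approximated client-side on streamed lines. A minimal sketch, assuming each line follows the `timestamp [node] LEVEL message` layout of the stream sample above; `parse_line` and `matches` are hypothetical helpers, not a DM-WebManager API.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the stream sample:
# "YYYY-MM-DD HH:MM:SS [node] LEVEL message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[([^\]]+)\] (\w+)\s+(.*)$"
)


def parse_line(line):
    """Split one streamed line into fields, or return None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts, node, level, message = m.groups()
    return {
        "timestamp": datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"),
        "node": node,
        "level": level,
        "message": message,
    }


def matches(entry, node=None, level=None, after=None, before=None, pattern=None):
    """Apply every filter that is set; filters left as None are skipped."""
    if node is not None and entry["node"] != node:
        return False
    if level is not None and entry["level"] != level:
        return False
    if after is not None and entry["timestamp"] <= after:
        return False
    if before is not None and entry["timestamp"] >= before:
        return False
    if pattern is not None and re.search(pattern, entry["message"]) is None:
        return False
    return True


lines = [
    "2025-02-20 06:23:00 [prod-1] INFO Starting deploy-v2",
    "2025-02-20 06:23:01 [prod-2] INFO Pulling image",
    "2025-02-20 06:23:03 [prod-3] INFO Running health check",
    "2025-02-20 06:23:04 [prod-1] WARN Health slow (2.3s)",
    "2025-02-20 06:23:05 [prod-4] OK All checks passed",
]

entries = [e for e in map(parse_line, lines) if e is not None]
warn_on_prod1 = [e for e in entries if matches(e, node="prod-1", level="WARN")]
recent = [e for e in entries if matches(e, after=datetime(2025, 2, 20, 6, 23, 2))]
health = [e for e in entries if matches(e, pattern=r"(?i)health")]
```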
"ERROR in prod-1 AND after:2025-03-12T14:00"
"execution_id=exec-789 AND level:WARN"
"regex:Database.*timeout AND nodes:[prod-1..prod-5]"Automatically aggregate logs by error type, node, and time window for trend analysis.
"errors_by_node": {
"prod-1": 23,
"prod-2": 7
},
"top_errors": [
{"type": "timeout", "count": 15}
]
Automatic log rotation and retention policies. Keep recent logs hot, archive old logs.
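One way to read tier durations such as `7d`, `30d`, and `1y` is as upper bounds on log age. The sketch below is an assumption about semantics, not documented behavior: it treats `d` as days and `y` as 365 days, checks tiers in the order listed, and returns `None` once a log is older than every tier. `parse_duration` and `tier_for_age` are hypothetical names.

```python
from datetime import timedelta

# Assumed unit suffixes: "d" for days, "y" for a 365-day year.
UNIT_DAYS = {"d": 1, "y": 365}

# Tier names and durations as they appear in the retention fragment on this page.
RETENTION = {"hot": "7d", "warm": "30d", "cold_archive": "1y"}


def parse_duration(text: str) -> timedelta:
    """Convert a string like "7d" or "1y" into a timedelta."""
    value, unit = int(text[:-1]), text[-1]
    return timedelta(days=value * UNIT_DAYS[unit])


def tier_for_age(age: timedelta, retention: dict = RETENTION):
    """Return the first tier whose limit covers `age`, or None if none does."""
    for tier, limit in retention.items():  # dicts keep insertion order
        if age <= parse_duration(limit):
            return tier
    return None
```

Under these assumptions a 10-day-old log falls in the `warm` tier, and anything older than 365 days is past the last tier.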
"retention": {
"hot": "7d",
"warm": "30d",
"cold_archive": "1y"
}
Create alerts triggered by error patterns, missing heartbeats, or anomaly detection.
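A condition such as `errors > 100 in 5m` amounts to a sliding-window count. The sketch below is a minimal illustration under assumptions: the condition grammar is inferred from that single example, timestamps are expected in non-decreasing order, and the `ErrorRateAlert` class is hypothetical. It only reports whether the condition holds; calling a webhook is left out.

```python
import re
from collections import deque
from datetime import datetime, timedelta

# Assumed grammar: "errors > <count> in <minutes>m"
COND_RE = re.compile(r"^errors\s*>\s*(\d+)\s+in\s+(\d+)m$")


def parse_condition(cond: str):
    """Return (threshold, window) for a condition like "errors > 100 in 5m"."""
    m = COND_RE.match(cond)
    if m is None:
        raise ValueError(f"unsupported condition: {cond!r}")
    return int(m.group(1)), timedelta(minutes=int(m.group(2)))


class ErrorRateAlert:
    """Tracks error timestamps and checks them against a rate condition."""

    def __init__(self, condition: str):
        self.threshold, self.window = parse_condition(condition)
        self._times = deque()

    def record(self, ts: datetime) -> bool:
        """Record one error at `ts`; return True if the condition now holds."""
        self._times.append(ts)
        cutoff = ts - self.window
        # Drop errors that have fallen out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) > self.threshold


alert = ErrorRateAlert("errors > 100 in 5m")
start = datetime(2025, 2, 20, 6, 0, 0)
# 101 errors, one per second: only the last one pushes the count above 100.
fired = [alert.record(start + timedelta(seconds=i)) for i in range(101)]
```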
"alert": {
"name": "high-error-rate",
"condition": "errors > 100 in 5m",
"webhook": "https://slack/hook"
}
Query millions of log lines in milliseconds across all nodes.
Logs are stored locally on each node. No external dependencies.
Stream logs live as orchestrations execute.
Complete history of all operations for compliance.