/hv-status Command

Monitor hAIveMind collective health, agent status, and system performance

hv-status - Collective Health Monitor

Purpose

Monitors the health of the hAIveMind collective, covering agent availability, memory utilization, network connectivity, and system performance metrics.

When to Use

  • Daily Health Checks: Monitor collective operational status
  • Troubleshooting Issues: Diagnose connectivity or performance problems
  • Capacity Planning: Monitor memory usage and agent workload
  • Network Monitoring: Check Tailscale connectivity between agents
  • Before Delegating: Verify target agents are available and responsive
  • Performance Analysis: Understand collective resource utilization

Syntax

hv-status [options]

Parameters

  • options (optional): Display filtering and formatting
    • --detailed: Show comprehensive information for all sections
    • --agents: Show only agent roster and availability
    • --memory: Show only memory statistics and usage
    • --network: Show only network connectivity status
    • --json: Output in JSON format for programmatic use
    • --quiet: Show only critical issues, minimal output

Status Information Sections

Agent Roster and Availability

  • Active Agents: Currently online and responding
  • Agent Capabilities: Skills and expertise each agent provides
  • Response Times: Average response latency for each agent
  • Workload Status: Current task queue and availability
  • Last Seen: When each agent was last active

Memory Statistics

  • Storage Utilization: ChromaDB and Redis usage metrics
  • Memory Categories: Distribution across infrastructure, incidents, etc.
  • Growth Trends: Memory usage over time
  • Cache Performance: Redis hit rates and efficiency
  • Cleanup Status: Old memory removal and optimization

Network Health

  • Tailscale Connectivity: Connection status to each machine
  • API Endpoints: Health check status for MCP servers
  • Sync Performance: Inter-machine synchronization latency
  • Certificate Status: SSL/TLS certificate validity
  • Firewall Status: Port accessibility between nodes

Real-World Examples

Quick Health Check

hv-status

Result: Overview of collective health with key metrics and any issues highlighted

Detailed System Analysis

hv-status --detailed

Result: Comprehensive report suitable for troubleshooting or performance analysis

Agent Availability Check

hv-status --agents

Result: Focus on which agents are available for task delegation

Memory Usage Analysis

hv-status --memory

Result: Storage metrics for capacity planning and cleanup decisions

Programmatic Monitoring

hv-status --json --quiet

Result: JSON output for automated monitoring scripts, errors only
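
A monitoring script might wrap this invocation roughly as follows. This is a minimal sketch: the JSON schema is not documented here, so the code only parses the output and leaves interpretation to the caller. The `runner` parameter is a hypothetical hook added so the function can be exercised without the CLI installed, and a non-zero exit code is assumed to signal failure.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def run_command(argv: Sequence[str]) -> str:
    """Run a command and return its stdout; raises if the exit code is non-zero."""
    result = subprocess.run(
        list(argv), capture_output=True, text=True, check=True, timeout=30
    )
    return result.stdout


def fetch_status(runner: Callable[[Sequence[str]], str] = run_command) -> Any:
    """Invoke `hv-status --json --quiet` and parse whatever JSON it prints."""
    raw = runner(["hv-status", "--json", "--quiet"])
    return json.loads(raw)
```

Calling `fetch_status()` with no arguments runs the real command; passing a stub runner makes the parsing step easy to unit-test.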

Expected Output

Standard Status Overview

🌐 hAIveMind Collective Status - 2025-01-24 14:30:00

🎯 Collective Health: ✓ OPERATIONAL
   ↳ 12 of 14 agents responding (85.7%)
   ↳ 2 agents offline: tony-dev, mike-dev (non-critical)
   ↳ Average response time: 245ms
   ↳ No critical issues detected

🤖 Agent Roster (Top 5 by Activity):
┌─────────────────────┬──────────────────┬────────────┬─────────────┬─────────────┐
│ Agent Name          │ Capabilities     │ Status     │ Response    │ Workload    │
├─────────────────────┼──────────────────┼────────────┼─────────────┼─────────────┤
│ elastic1-specialist │ elasticsearch    │ ✓ Online   │ 180ms       │ ░░░░░░░░░░  │
│ lance-dev-agent     │ coordination     │ ✓ Online   │ 120ms       │ ███░░░░░░░  │
│ security-analyst    │ security         │ ✓ Online   │ 290ms       │ ░░░░░░░░░░  │
│ mysql-specialist    │ database_ops     │ ✓ Online   │ 340ms       │ █░░░░░░░░░  │
│ monitoring-agent    │ monitoring       │ ✓ Online   │ 205ms       │ ██░░░░░░░░  │
└─────────────────────┴──────────────────┴────────────┴─────────────┴─────────────┘

💾 Memory Statistics:
   ↳ Total Memories: 8,742 (↑ 127 today)
   ↳ Storage Usage: 2.3 GB / 50 GB (4.6%)
   ↳ Categories: Infrastructure 35%, Incidents 28%, Security 18%, Other 19%
   ↳ Redis Cache: 89% hit rate, 512 MB used

🌐 Network Health:
   ↳ Tailscale: ✓ Connected to 11 nodes
   ↳ MCP Servers: ✓ All endpoints responding
   ↳ Sync Status: ✓ Last sync 14 minutes ago
   ↳ Certificate: Valid until 2025-06-15

📊 Recent Activity (Last 24h):
   ↳ Broadcasts: 23 (↑ 8 from yesterday)
   ↳ Delegations: 45 (↑ 12 from yesterday)
   ↳ Queries: 156 (↓ 3 from yesterday)
   ↳ Memory Stores: 89 (↑ 15 from yesterday)

⚠️  Warnings:
   ↳ elastic2 response time increased 40% (480ms avg)
   ↳ Memory growth rate above normal (↑ 18% this week)

💡 Recommendations:
   ↳ Consider restarting elastic2 agent to improve response time
   ↳ Schedule memory cleanup for memories older than 6 months
   ↳ Monitor tony-dev and mike-dev connectivity issues

Agents-Only View

🤖 hAIveMind Agent Roster - 2025-01-24 14:30:00

📋 12 Active Agents | 2 Offline | 14 Total Registered

🟢 ONLINE AGENTS:
┌─────────────────────┬──────────────────────────────┬─────────────┬─────────────┬─────────────┐
│ Agent               │ Capabilities                 │ Response    │ Workload    │ Last Task   │
├─────────────────────┼──────────────────────────────┼─────────────┼─────────────┼─────────────┤
│ lance-dev-agent     │ coordination, infrastructure │ 120ms       │ ███░░░░░░░  │ 12 min ago  │
│ elastic1-specialist │ elasticsearch_ops, cluster   │ 180ms       │ ░░░░░░░░░░  │ 45 min ago  │
│ security-analyst    │ security, incident_response  │ 290ms       │ ░░░░░░░░░░  │ 2 hours ago │
│ mysql-specialist    │ database_ops, optimization   │ 340ms       │ █░░░░░░░░░  │ 30 min ago  │
│ monitoring-agent    │ monitoring, alerting         │ 205ms       │ ██░░░░░░░░  │ 8 min ago   │
│ proxy1-agent        │ scraping, data_collection    │ 410ms       │ ████░░░░░░  │ 3 min ago   │
│ auth-specialist     │ security, authentication     │ 198ms       │ ░░░░░░░░░░  │ 1 hour ago  │
│ grafana-agent       │ monitoring, visualization    │ 234ms       │ █░░░░░░░░░  │ 25 min ago  │
│ elastic3-specialist │ elasticsearch_ops            │ 267ms       │ ░░░░░░░░░░  │ 18 min ago  │
│ dev-coordinator     │ development, code_review     │ 156ms       │ ░░░░░░░░░░  │ 1.5 hr ago  │
│ kafka-specialist    │ data_processing, streaming   │ 445ms       │ ███░░░░░░░  │ 22 min ago  │
│ redis-specialist    │ caching, performance         │ 189ms       │ ░░░░░░░░░░  │ 38 min ago  │
└─────────────────────┴──────────────────────────────┴─────────────┴─────────────┴─────────────┘

🔴 OFFLINE AGENTS:
   ↳ tony-dev (development) - Last seen: 6 hours ago
   ↳ mike-dev (development) - Last seen: 2 days ago

🎯 CAPABILITY DISTRIBUTION:
   ↳ Development: 3 agents (2 offline)
   ↳ Infrastructure: 4 agents
   ↳ Database: 3 agents
   ↳ Security: 2 agents
   ↳ Monitoring: 2 agents
   ↳ Data Processing: 2 agents

✨ TOP PERFORMERS (Last 24h):
   1. lance-dev-agent: 23 tasks completed
   2. monitoring-agent: 18 tasks completed
   3. proxy1-agent: 15 tasks completed

Memory Statistics Detail

💾 hAIveMind Memory Statistics - 2025-01-24 14:30:00

📊 STORAGE OVERVIEW:
   ↳ Total Memories: 8,742 items
   ↳ Storage Size: 2.3 GB (compressed)
   ↳ Growth Rate: +127 memories today (+18% this week)
   ↳ Oldest Memory: 2024-06-15 (223 days ago)

📚 CATEGORY BREAKDOWN:
┌─────────────────┬─────────┬─────────────┬─────────────┬─────────────────┐
│ Category        │ Count   │ Size (MB)   │ Avg Size    │ Growth (7 days) │
├─────────────────┼─────────┼─────────────┼─────────────┼─────────────────┤
│ infrastructure  │ 3,059   │ 847         │ 284 KB      │ +156 (+5.4%)    │
│ incidents       │ 2,448   │ 623         │ 261 KB      │ +89  (+3.8%)    │
│ security        │ 1,573   │ 412         │ 269 KB      │ +45  (+2.9%)    │
│ deployments     │ 874     │ 198         │ 232 KB      │ +23  (+2.7%)    │
│ monitoring      │ 523     │ 134         │ 263 KB      │ +34  (+7.0%)    │
│ runbooks        │ 265     │ 89          │ 344 KB      │ +12  (+4.7%)    │
└─────────────────┴─────────┴─────────────┴─────────────┴─────────────────┘

🏃 PERFORMANCE METRICS:
   ↳ Search Latency: 287ms average (↓ 15ms from last week)
   ↳ Insert Rate: 43.2 memories/hour
   ↳ Redis Hit Rate: 89.3% (excellent)
   ↳ Vector Index: 94.1% efficiency

🧹 CLEANUP STATUS:
   ↳ Last Cleanup: 2025-01-20 02:00:00
   ↳ Eligible for Cleanup: 234 memories (older than 180 days)
   ↳ Estimated Space Recovery: 67 MB
   ↳ Next Scheduled Cleanup: 2025-01-27 02:00:00

📈 TRENDING TOPICS (Last 7 days):
   1. elasticsearch performance (47 memories)
   2. security vulnerability patches (31 memories)
   3. database optimization (28 memories)
   4. network connectivity issues (22 memories)
   5. deployment automation (19 memories)

Performance Metrics and Thresholds

Agent Response Time Classifications

  • Excellent: < 200ms (immediate response)
  • Good: 200-400ms (normal operation)
  • Slow: 400-800ms (potential issues)
  • Critical: > 800ms (needs investigation)
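
The bands above can be applied to a measured latency as in the sketch below. The listed ranges share their endpoints (400ms and 800ms appear in two bands), so the sketch assumes a boundary value belongs to the better band; `classify_response_time` is an illustrative name, not part of hv-status.

```python
def classify_response_time(latency_ms: float) -> str:
    """Map an average response latency in milliseconds to a classification."""
    if latency_ms < 200:
        return "Excellent"
    if latency_ms <= 400:  # assumption: exactly 400ms still counts as Good
        return "Good"
    if latency_ms <= 800:  # assumption: exactly 800ms still counts as Slow
        return "Slow"
    return "Critical"


if __name__ == "__main__":
    for sample in (180, 245, 480, 847):
        print(sample, classify_response_time(sample))
```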

Memory Usage Thresholds

  • Normal: < 60% of allocated storage
  • Warning: 60-80% of allocated storage
  • Critical: 80-95% of allocated storage
  • Emergency: > 95% of allocated storage
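
As a worked example of these thresholds, the sketch below converts used and allocated storage into a percentage and picks the matching level. The function name is hypothetical, and values exactly on a boundary (60%, 80%, 95%) are assumed to fall in the lower-severity band they are listed under.

```python
def classify_memory_usage(used_gb: float, allocated_gb: float) -> str:
    """Return the threshold level for used storage as a share of the allocation."""
    if allocated_gb <= 0:
        raise ValueError("allocated_gb must be positive")
    percent = 100.0 * used_gb / allocated_gb
    if percent < 60:
        return "Normal"
    if percent <= 80:
        return "Warning"
    if percent <= 95:
        return "Critical"
    return "Emergency"
```

With the sample figures from the status overview, 2.3 GB of 50 GB is 4.6%, which is well inside the Normal band.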

Network Health Indicators

  • All Green: > 90% agents responsive
  • Warning: 70-90% agents responsive
  • Degraded: 50-70% agents responsive
  • Critical: < 50% agents responsive
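
The indicator can be derived from two counts, responsive agents and registered agents. The sketch below is illustrative only: the function name is hypothetical, and shared endpoints (90%, 70%, 50%) are assigned to the band that lists them as a range.

```python
def classify_collective_health(responsive: int, total: int) -> str:
    """Classify network health from the share of agents that responded."""
    if total <= 0:
        raise ValueError("total must be positive")
    percent = 100.0 * responsive / total
    if percent > 90:
        return "All Green"
    if percent >= 70:
        return "Warning"
    if percent >= 50:
        return "Degraded"
    return "Critical"
```

For instance, 12 of 14 agents responding is about 85.7%, which these bands place under Warning.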

Common Status Issues and Solutions

Offline Agents

🔴 OFFLINE: elastic2-specialist (Last seen: 2 hours ago)
💡 Troubleshooting Steps:
   1. Check machine connectivity: ping elastic2
   2. Verify MCP server: curl http://elastic2:8900/health
   3. Check system resources: ssh elastic2 'top -bn1'
   4. Restart services: ssh elastic2 'sudo systemctl restart memory-mcp-server'
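
Step 2 can be scripted when several machines need checking. The sketch below uses only the Python standard library and the port and path from the curl example. It assumes that an HTTP 200 response means the server is healthy, since the format of the health response is not described here; the function names are illustrative.

```python
import urllib.error
import urllib.request


def health_url(host: str, port: int = 8900) -> str:
    """Build the health-check URL used in the troubleshooting steps."""
    return f"http://{host}:{port}/health"


def is_healthy(host: str, port: int = 8900, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a non-2xx response.
        return False


if __name__ == "__main__":
    for machine in ("elastic2",):
        print(machine, "healthy" if is_healthy(machine) else "unreachable")
```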

High Memory Usage

⚠️  Memory usage at 78% (Warning threshold)
💡 Recommended Actions:
   1. Run memory cleanup: hv-sync clean --memory
   2. Archive memories older than 6 months
   3. Review memory retention policies
   4. Consider storage expansion if growth continues

Network Connectivity Issues

❌ Tailscale connectivity degraded (67% nodes reachable)
💡 Diagnostic Steps:
   1. Check Tailscale status: tailscale status
   2. Restart Tailscale: sudo systemctl restart tailscaled
   3. Verify routing: tailscale ping elastic1
   4. Check firewall rules on affected machines
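
To put a number on step 1, the machine-readable form `tailscale status --json` can be parsed. The sketch below assumes the JSON contains a `Peer` object whose entries carry an `Online` boolean; treat those field names as assumptions to verify against your Tailscale version. How hv-status itself computes its reachability percentage is not documented here, so this is only an approximation of that figure.

```python
import json
import subprocess


def reachable_percent(status: dict) -> float:
    """Percentage of peers marked online in a parsed `tailscale status --json`."""
    peers = list((status.get("Peer") or {}).values())
    if not peers:
        return 0.0
    online = sum(1 for peer in peers if peer.get("Online"))
    return 100.0 * online / len(peers)


def read_tailscale_status() -> dict:
    """Run the Tailscale CLI and return its JSON status as a dictionary."""
    result = subprocess.run(
        ["tailscale", "status", "--json"],
        capture_output=True, text=True, check=True, timeout=15,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(f"{reachable_percent(read_tailscale_status()):.0f}% of peers online")
```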

Poor Performance

🐌 Average response time: 847ms (Above normal threshold)
💡 Performance Optimization:
   1. Check system resources on slow agents
   2. Review network latency between machines
   3. Consider Redis cache optimization
   4. Restart high-latency agents

Best Practices for Status Monitoring

  • Daily Checks: Run hv-status as part of your daily routine
  • Performance Baselines: Track response times and memory growth trends
  • Proactive Maintenance: Address warnings before they become critical
  • Automation: Use --json output for automated monitoring scripts
  • Documentation: Record recurring issues and solutions in collective memory

Related Commands

  • After finding issues: Use hv-delegate to assign resolution tasks
  • For connectivity issues: Use hv-sync to refresh configurations
  • Performance problems: Use hv-query to find similar past incidents
  • Share findings: Use hv-broadcast to inform the collective about status changes

Troubleshooting Status Command Issues

Command Not Responding

  1. Check local MCP server: curl http://localhost:8900/health
  2. Verify Redis connectivity: redis-cli ping
  3. Check system resources: top, df -h, free -m
  4. Restart local services if needed

Incomplete Data

  1. Some agents may be temporarily unreachable (normal)
  2. Network partitions can affect data collection
  3. Check Tailscale connectivity to affected machines
  4. Wait 1-2 minutes and retry for transient issues

Outdated Information

  1. Status data is cached for 60 seconds for performance
  2. Use --detailed to force fresh data collection
  3. Check last sync timestamp in output
  4. Network delays may affect data freshness
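
The 60-second cache explains why two runs in quick succession can print identical data. The sketch below shows the general time-to-live pattern. It is not the hv-status implementation: the `TTLCache` class and its `force_refresh` argument are hypothetical stand-ins for the cached read and for a flag that bypasses it.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one fetched value and refresh it once it is older than the TTL."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._stored_at: Optional[float] = None

    def get(self, force_refresh: bool = False) -> Any:
        """Return the cached value, refetching if it is stale or a refresh is forced."""
        now = self._clock()
        stale = self._stored_at is None or now - self._stored_at >= self._ttl
        if force_refresh or stale:
            self._value = self._fetch()
            self._stored_at = now
        return self._value
```

A call made 59 seconds after the last fetch returns the stored value, while a call at 60 seconds or later triggers a new fetch.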

This command provides comprehensive health monitoring for the hAIveMind collective, helping you maintain optimal performance and quickly identify issues requiring attention.