agent_tracker/agent_collector_api_documentation.md
2025-08-17 07:23:53 -05:00

# Agent Collector API Documentation
## Overview
Agent Collector is a Flask-based REST API for collecting and storing agent metadata across an organization. The application validates agent data against a predefined JSON schema and stores it in MongoDB for persistence.
## Base Configuration
- **Default Host**: `0.0.0.0`
- **Default Port**: `8475`
- **Server**: Hypercorn ASGI server
- **Database**: MongoDB
- **Content Type**: `application/json`
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MONGO_HOST` | `localhost` | MongoDB host address |
| `MONGO_PORT` | `27017` | MongoDB port |
| `MONGO_USERNAME` | `admin` | MongoDB username |
| `MONGO_PASSWORD` | `admin` | MongoDB password |
| `MONGO_DB_NAME` | `agent_collector` | MongoDB database name |
| `MONGO_COLLECTION` | `agents` | MongoDB collection name |
| `DEBUG` | `False` | Flask debug mode |
| `HOST` | `0.0.0.0` | Host to bind to |
| `PORT` | `8475` | Port to bind to |
| `HYPERCORN_BIND` | `0.0.0.0:8475` | Hypercorn bind address |
| `HYPERCORN_WORKERS` | `1` | Number of worker processes |
| `HYPERCORN_LOG_LEVEL` | `info` | Logging level |
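The table above can be read as a small settings loader. The sketch below is illustrative only: the `Settings` class, the `load_settings` function, and the way `DEBUG` is parsed into a boolean are assumptions, not the application's actual code. Only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    mongo_host: str
    mongo_port: int
    mongo_username: str
    mongo_password: str
    mongo_db_name: str
    mongo_collection: str
    debug: bool
    host: str
    port: int


def load_settings(env=None) -> Settings:
    """Build Settings from an environment mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        mongo_host=env.get("MONGO_HOST", "localhost"),
        mongo_port=int(env.get("MONGO_PORT", "27017")),
        mongo_username=env.get("MONGO_USERNAME", "admin"),
        mongo_password=env.get("MONGO_PASSWORD", "admin"),
        mongo_db_name=env.get("MONGO_DB_NAME", "agent_collector"),
        mongo_collection=env.get("MONGO_COLLECTION", "agents"),
        # Assumption: common truthy spellings enable debug mode.
        debug=env.get("DEBUG", "False").strip().lower() in ("1", "true", "yes"),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8475")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"PORT": "9000"})`.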
## API Endpoints
### Health Check
**Endpoint**: `GET /`
**Description**: Health check endpoint that verifies the application and MongoDB connection status.
**Request**: No parameters required
**Response**:
```json
{
  "status": "healthy",
  "message": "Agent collector API is running",
  "timestamp": "2024-01-01T12:00:00.000000",
  "database": {
    "status": "connected",
    "healthy": true
  }
}
```
**Status Codes**:
- `200 OK`: Service is healthy
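
Because `200 OK` is the only documented status code, a client that cares about database availability may want to inspect the response body as well. The helper below is a minimal sketch; the function name `is_service_ready` is hypothetical, and it assumes the body has the shape shown in the example response.

```python
def is_service_ready(health: dict) -> bool:
    """Return True only if both the API and its database report as healthy."""
    database = health.get("database") or {}
    return health.get("status") == "healthy" and database.get("healthy") is True
```

A monitoring script could call this with the decoded JSON body of `GET /` and alert when it returns `False`.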
### Submit Agent Data
**Endpoint**: `POST /agents`
**Description**: Collects and validates agent metadata, then stores it in MongoDB.
**Request Headers**:
- `Content-Type: application/json`
**Request Body Schema**:
**Required Fields**:
- `name` (string): Name of the agent (minimum length: 1)
- `description` (string): Detailed description of the agent (minimum length: 1)
- `purpose` (string): Primary purpose or function of the agent (minimum length: 1)
**Optional Fields**:
- `location` (string): Physical or virtual location where the agent is deployed
- `userbase` (array of strings): List of users or user groups who use this agent
- `version` (string): Current version of the agent
- `creation_date` (string, ISO 8601 datetime): Date and time when the agent was created
- `last_updated` (string, ISO 8601 datetime): Date and time when the agent was last updated
- `capabilities` (array of strings): List of agent capabilities
- `status` (string): Current operational status - must be one of: `active`, `inactive`, `deprecated`, `development`
- `department` (string): Department or team responsible for the agent
- `contact_person` (string): Person to contact for issues or inquiries about the agent
- `tags` (array of strings): Tags for categorizing the agent
- `metadata` (object): Additional arbitrary metadata about the agent
**Example Request**:
```json
{
  "name": "TestAgent",
  "description": "Test description",
  "purpose": "Testing purposes",
  "location": "Development Environment",
  "userbase": ["dev-team", "qa-team"],
  "version": "1.0.0",
  "capabilities": ["data-processing", "automated-testing"],
  "status": "development",
  "department": "Engineering",
  "contact_person": "john.doe@company.com",
  "tags": ["automation", "testing"],
  "metadata": {
    "framework": "custom",
    "language": "python"
  }
}
```
**Success Response** (`201 Created`):
```json
{
  "status": "success",
  "message": "Agent data collected successfully",
  "agent_id": "507f1f77bcf86cd799439011"
}
```
**Error Responses**:
**Unsupported Media Type** (`415`):
```json
{
  "error": "Unsupported Media Type",
  "message": "Request must be JSON"
}
```
**Validation Error** (`400 Bad Request`):
```json
{
  "error": "Invalid Data",
  "message": "'name' is a required property"
}
```
**Database Unavailable** (`503 Service Unavailable`):
```json
{
  "error": "Database Unavailable",
  "message": "MongoDB connection is not available. Please check the database setup.",
  "agent_data": {
    // Original submitted data returned for client retry
  }
}
```
**Database Error** (`500 Internal Server Error`):
```json
{
  "error": "Database Error",
  "message": "Failed to store agent data. MongoDB may be unavailable or there was an error processing the request.",
  "agent_data": {
    // Original submitted data returned for client retry
  }
}
```
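
The responses above suggest a simple client-side decision: `201` is final, `500` and `503` can be retried using the echoed `agent_data`, and `400` and `415` need a corrected request. The sketch below illustrates that mapping; the function name `plan_next_step` and the action labels are hypothetical, not part of the API.

```python
RETRYABLE_STATUSES = {500, 503}      # database-side failures
CLIENT_ERROR_STATUSES = {400, 415}   # the request itself must change


def plan_next_step(status_code: int, body: dict):
    """Map a POST /agents response to an (action, payload) pair."""
    if status_code == 201:
        return ("done", body.get("agent_id"))
    if status_code in RETRYABLE_STATUSES:
        # The error body echoes the submitted data under "agent_data".
        return ("retry", body.get("agent_data"))
    if status_code in CLIENT_ERROR_STATUSES:
        return ("fix_request", body.get("message"))
    return ("unexpected", body)
```

Keeping the decision separate from the HTTP call lets the retry policy be tested without a running server.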
## Data Processing
### Automatic Timestamp Addition
The API automatically adds timestamps to submitted data:
- If `creation_date` is not provided, it's set to the current UTC time
- If `last_updated` is not provided, it's set to the current UTC time
- Timestamps are in ISO 8601 format
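
The defaulting rule above can be sketched as follows. This is an illustration, not the service's code: the function name `add_timestamps` is hypothetical, and the offset-free format with microseconds is an assumption chosen to match the example timestamps in this document.

```python
from datetime import datetime, timezone


def add_timestamps(agent_data: dict, now: datetime = None) -> dict:
    """Return a copy of agent_data with missing timestamp fields filled in.

    `now` must be timezone-aware if supplied; it defaults to the current UTC time.
    """
    now = now or datetime.now(timezone.utc)
    # Assumption: UTC time rendered without an offset, e.g. 2024-01-01T12:00:00.000000
    stamp = (
        now.astimezone(timezone.utc)
        .replace(tzinfo=None)
        .isoformat(timespec="microseconds")
    )
    stamped = dict(agent_data)
    stamped.setdefault("creation_date", stamp)  # keep a client-supplied value
    stamped.setdefault("last_updated", stamp)
    return stamped
```

Values supplied by the client are left untouched; only absent fields receive the generated timestamp.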
### Schema Validation
All submitted data is validated against the JSON schema before storage. The validation:
- Ensures required fields are present
- Validates data types for all fields
- Enforces string length constraints
- Validates enum values for status field
- Rejects additional properties not defined in the schema
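
A client can apply the same rules before sending a request. The checker below is a hand-written, standard-library approximation of the field rules listed in this document. It is not the server's JSON schema or validator, it does not verify ISO 8601 datetimes, and its messages (other than the documented "required property" wording) are made up for illustration.

```python
REQUIRED_FIELDS = ("name", "description", "purpose")
STRING_FIELDS = {
    "name", "description", "purpose", "location", "version", "creation_date",
    "last_updated", "status", "department", "contact_person",
}
STRING_LIST_FIELDS = {"userbase", "capabilities", "tags"}
STATUS_VALUES = {"active", "inactive", "deprecated", "development"}
KNOWN_FIELDS = STRING_FIELDS | STRING_LIST_FIELDS | {"metadata"}


def find_problems(agent: dict) -> list:
    """Return a list of human-readable problems; an empty list means the payload looks valid."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in agent:
            problems.append(f"'{field}' is a required property")
        elif isinstance(agent[field], str) and len(agent[field]) < 1:
            problems.append(f"'{field}' must not be empty")
    for field, value in agent.items():
        if field not in KNOWN_FIELDS:
            problems.append(f"unexpected property '{field}'")
        elif field in STRING_FIELDS and not isinstance(value, str):
            problems.append(f"'{field}' must be a string")
        elif field in STRING_LIST_FIELDS and not (
            isinstance(value, list) and all(isinstance(item, str) for item in value)
        ):
            problems.append(f"'{field}' must be an array of strings")
        elif field == "metadata" and not isinstance(value, dict):
            problems.append("'metadata' must be an object")
    status = agent.get("status")
    if isinstance(status, str) and status not in STATUS_VALUES:
        problems.append(f"'status' must be one of {sorted(STATUS_VALUES)}")
    return problems
```

For example, `find_problems({"description": "d", "purpose": "p"})` reports the missing `name` field, mirroring the `400` response shown earlier.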
## Database Operations
### Connection Management
- Automatic connection retry with exponential backoff
- Connection health checks before operations
- Graceful error handling for connection failures
- Connection pooling via PyMongo MongoClient
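
Retry with exponential backoff can be sketched as below. The function name, attempt count, and delay values are assumptions for illustration, not the application's actual settings. The sketch catches the built-in `ConnectionError` to stay dependency-free; code built on PyMongo would more likely catch `pymongo.errors.ConnectionFailure`.

```python
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=0.5,
                         max_delay=8.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the wait between attempts
```

Injecting `sleep` keeps the function testable without real waiting; production code would leave the default in place.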
### Data Storage
- Each agent record is stored as a MongoDB document
- Unique ObjectId is generated for each record
- Full document is stored with all provided metadata
## Error Handling
The API implements comprehensive error handling:
1. **Input Validation**: JSON schema validation with detailed error messages
2. **Database Connectivity**: Connection retries and graceful degradation
3. **Data Preservation**: Failed requests return original data for client retry
4. **Logging**: Detailed error logging for troubleshooting
## Security Considerations
- No authentication mechanism implemented (authentication should be handled at proxy/gateway level)
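- Schema validation restricts field types and rejects unknown top-level properties, which reduces (but does not eliminate) the risk of injection attacks
- Database credentials are configurable via environment variables; the `admin`/`admin` defaults should be overridden in production
- Connection timeouts help limit resource exhaustion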
## Deployment Notes
### Production Deployment
Run with Hypercorn for production:
```bash
python main.py
```
### Development Mode
For development with Flask's built-in server:
```bash
python wsgi.py
```
### Docker Deployment
The application is designed to work in containerized environments with:
- Environment variable configuration
- MongoDB connection timeout handling
- Graceful shutdown handling
## Client Integration
### Example Client Code (Python)
```python
import requests

BASE_URL = "http://localhost:8475"

# Health check
response = requests.get(f"{BASE_URL}/", timeout=10)
print(f"Health: {response.json()}")

# Submit agent data
agent_data = {
    "name": "MyAgent",
    "description": "Agent for processing customer data",
    "purpose": "Customer service automation",
    "status": "active",
    "department": "Customer Success",
}

# json= serializes the payload and sets Content-Type: application/json
response = requests.post(f"{BASE_URL}/agents", json=agent_data, timeout=10)

if response.status_code == 201:
    result = response.json()
    print(f"Agent stored with ID: {result['agent_id']}")
else:
    print(f"Error: {response.json()}")
```
### Example Client Code (curl)
```bash
# Health check
curl http://localhost:8475/

# Submit agent data
curl -X POST http://localhost:8475/agents \
  -H "Content-Type: application/json" \
  -d '{
    "name": "TestAgent",
    "description": "Test description",
    "purpose": "Testing purposes"
  }'
```
## MongoDB Schema
The MongoDB collection stores documents with the following structure:
```javascript
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "AgentName",
  "description": "Agent description",
  "purpose": "Agent purpose",
  "location": "Optional location",
  "userbase": ["user1", "user2"],
  "version": "1.0.0",
  "creation_date": "2024-01-01T12:00:00.000000",
  "last_updated": "2024-01-01T12:00:00.000000",
  "capabilities": ["capability1", "capability2"],
  "status": "active",
  "department": "Engineering",
  "contact_person": "contact@company.com",
  "tags": ["tag1", "tag2"],
  "metadata": {
    "custom_field": "custom_value"
  }
}
```
## Rate Limiting and Scaling
- No built-in rate limiting (should be implemented at proxy/gateway level)
- Hypercorn workers can be scaled via `HYPERCORN_WORKERS` environment variable
- MongoDB connection pooling handles concurrent requests
- Application is stateless and horizontally scalable