assistant-extractor/README.md
DJP ee960c544f Initial commit: OpenAI Assistant Data Extractor
- Add Python script to extract assistant data via OpenAI API
- Extract names, IDs, system instructions, and vector stores
- Support for function tool schemas and response format schemas
- Export to CSV with separate schema files
- Handle pagination and error cases

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-02 16:43:01 -04:00


# OpenAI Assistant Data Extractor
A Python tool that extracts configuration data from OpenAI assistants and exports it to CSV, writing JSON schemas to separate files.
## Features
- **List all assistants** in your OpenAI organization
- **Extract key data** including:
  - Assistant name and ID
  - System instructions
  - Model information
  - Creation timestamp
  - Attached vector stores and their names
  - Function tools and their JSON schemas
  - Response format schemas (structured outputs)
- **Export to CSV** with references to separate schema files
- **Automatic pagination** handling for large numbers of assistants
- **Schema file generation** for complex JSON structures
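
The pagination handling mentioned above can be pictured with a minimal sketch. It assumes list responses shaped like the OpenAI REST API's (a `data` array plus a `has_more` flag, with the last item's ID used as the `after` cursor). `iter_assistants`, `fetch_page`, and the in-memory `fake_fetch_page` are hypothetical names used only for illustration; the actual script may rely on the SDK's own pagination instead.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_assistants(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield every assistant by following the `after` cursor until `has_more` is false."""
    after = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        # Stop when the API reports no further pages (or returns an empty page).
        if not page.get("has_more") or not page["data"]:
            break
        after = page["data"][-1]["id"]


# Hypothetical in-memory stand-in for the API: five assistants, two per page.
_FAKE = [{"id": f"asst_{i}", "name": f"Bot {i}"} for i in range(5)]


def fake_fetch_page(after: Optional[str], limit: int = 2) -> Dict:
    start = 0
    if after is not None:
        start = next(i for i, a in enumerate(_FAKE) if a["id"] == after) + 1
    chunk = _FAKE[start:start + limit]
    return {"data": chunk, "has_more": start + limit < len(_FAKE)}


all_ids = [a["id"] for a in iter_assistants(fake_fetch_page)]
print(all_ids)
```

Passing the page fetcher in as a callable keeps the cursor logic testable without network access; in real use it would wrap a call to the assistants list endpoint.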
## Installation
1. Clone this repository
2. Install dependencies:
```bash
pip install -r requirements.txt
```
## Usage
1. Set your OpenAI API key:
```bash
export OPENAI_API_KEY=your_api_key_here
```
2. Run the extractor:
```bash
python assistant_extractor.py
```
## Output
The tool generates several files:
### CSV File
- `assistants_data.csv` - Main data export with columns:
  - `assistant_id` - Unique OpenAI assistant identifier
  - `assistant_name` - Display name of the assistant
  - `system_instructions` - The assistant's system prompt
  - `vector_store_ids` - Comma-separated list of attached vector store IDs
  - `vector_store_names` - Human-readable names and IDs of vector stores
  - `function_tools` - Comma-separated list of function tool names
  - `function_schemas` - Reference to function schema file (if any)
  - `response_format_schema_file` - Reference to response format schema file (if any)
  - `model` - AI model used by the assistant
  - `created_at` - Timestamp when assistant was created
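
As a rough illustration of the column layout above, the sketch below writes one row with Python's `csv.DictWriter` and reads it back. The column order follows the list above, which may not match the script's actual order. The `model` and `created_at` values and the helper name `write_rows` are invented for the example.

```python
import csv
import io

# Column names taken from the list above; the order is an assumption.
COLUMNS = [
    "assistant_id", "assistant_name", "system_instructions",
    "vector_store_ids", "vector_store_names", "function_tools",
    "function_schemas", "response_format_schema_file", "model", "created_at",
]


def write_rows(rows, out):
    """Write a header plus one CSV line per row; missing keys become empty cells."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)


example_row = {
    "assistant_id": "asst_abc123",
    "assistant_name": "Customer Support Bot",
    # Multi-line prompts are quoted by the csv module, so they survive a round trip.
    "system_instructions": "You help customers.\nBe concise.",
    "vector_store_ids": "vs_def456",
    "vector_store_names": "Knowledge Base (vs_def456)",
    "function_tools": "get_order_status, process_refund",
    "function_schemas": "function_schemas_asst_abc123.txt",
    "response_format_schema_file": "response_format_schema_asst_abc123.json",
    "model": "gpt-4o",                    # illustrative value
    "created_at": "2025-01-01T00:00:00",  # illustrative value
}

buffer = io.StringIO(newline="")
write_rows([example_row], buffer)
parsed = list(csv.DictReader(io.StringIO(buffer.getvalue(), newline="")))
print(parsed[0]["assistant_name"])
```

When writing to a real file rather than a buffer, open it with `newline=""` so the `csv` module controls line endings.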
### Schema Files
- `function_schemas_{assistant_id}.txt` - Function tool parameter schemas
- `response_format_schema_{assistant_id}.json` - Structured output schemas
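
One way these files could be produced is sketched below. It assumes each assistant is available as a plain dict shaped like the Assistants API object (a `tools` list whose function entries carry a `function` object, and a `response_format` that is either `"auto"` or a `json_schema` object). The helper name `write_schema_files` and the JSON layout inside the `.txt` file are assumptions, not the script's confirmed behaviour.

```python
import json
import tempfile
from pathlib import Path


def write_schema_files(assistant: dict, out_dir: Path) -> dict:
    """Write schema files for one assistant and return the file names for the CSV."""
    refs = {"function_schemas": "", "response_format_schema_file": ""}
    asst_id = assistant["id"]

    # Keep only function tools; file_search and code_interpreter have no schema.
    functions = [t["function"] for t in assistant.get("tools", []) if t.get("type") == "function"]
    if functions:
        name = f"function_schemas_{asst_id}.txt"
        (out_dir / name).write_text(json.dumps(functions, indent=2), encoding="utf-8")
        refs["function_schemas"] = name

    # Structured outputs appear as {"type": "json_schema", "json_schema": {...}}.
    response_format = assistant.get("response_format")
    if isinstance(response_format, dict) and response_format.get("type") == "json_schema":
        name = f"response_format_schema_{asst_id}.json"
        (out_dir / name).write_text(json.dumps(response_format["json_schema"], indent=2), encoding="utf-8")
        refs["response_format_schema_file"] = name
    return refs


example = {
    "id": "asst_abc123",
    "tools": [
        {"type": "file_search"},
        {"type": "function", "function": {
            "name": "get_order_status",
            "parameters": {"type": "object", "properties": {"order_id": {"type": "string"}}},
        }},
    ],
    "response_format": "auto",
}

with tempfile.TemporaryDirectory() as tmp:
    refs = write_schema_files(example, Path(tmp))
    written = sorted(p.name for p in Path(tmp).iterdir())
print(refs, written)
```

Returning the file names rather than the schema contents keeps the CSV cells short, which matches the "reference to schema file" columns described above.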
## Requirements
- Python 3.7+
- OpenAI API key with access to Assistants API
- `openai` Python package (>=1.3.0)
## Error Handling
- Handles API rate limits and pagination automatically
- Creates error references in CSV if schema extraction fails
- Continues processing other assistants if individual assistant extraction fails
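
The behaviour described above could look roughly like the following sketch. `RateLimited` is a hypothetical stand-in for the SDK's rate-limit exception (`openai.RateLimitError` in the 1.x package), and the `ERROR: ...` cell format, the backoff delays, and the helper names are assumptions rather than what the script necessarily does.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the SDK's rate-limit error."""


def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on rate limits with exponential backoff, re-raising on the last attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def extract_all(assistants, extract_one):
    """Extract every assistant; a failure becomes an error reference instead of aborting the run."""
    rows = []
    for assistant in assistants:
        try:
            rows.append(extract_one(assistant))
        except Exception as exc:
            rows.append({"assistant_id": assistant["id"], "function_schemas": f"ERROR: {exc}"})
    return rows


def flaky_extract(assistant):
    if assistant["id"] == "asst_bad":
        raise ValueError("schema could not be parsed")
    return {"assistant_id": assistant["id"], "function_schemas": ""}


rows = extract_all([{"id": "asst_ok"}, {"id": "asst_bad"}, {"id": "asst_ok2"}], flaky_extract)

calls = {"n": 0}
delays = []  # records requested sleep durations instead of actually sleeping


def sometimes_limited():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return "done"


result = with_retries(sometimes_limited, sleep=delays.append)
print(rows[1], result, delays)
```

Injecting the `sleep` function makes the backoff schedule observable in tests without waiting.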
## Example Output
```
Extracting assistant data...
Found 3 assistants
Data exported to assistants_data.csv
Summary:
- Customer Support Bot (asst_abc123)
  Vector Stores: Knowledge Base (vs_def456)
  Function Tools: get_order_status, process_refund
  Function Schemas: function_schemas_asst_abc123.txt
  Response Format Schema: response_format_schema_asst_abc123.json
```
## License
MIT License