ideas-generator/docs-archive/RESPONSES_API_MIGRATION_PLAN.md
DJP b909d7e19a Clean up repository structure and archive legacy docs
2025-09-10 16:24:39 -04:00

# OpenAI Responses API Migration Plan - 2025 Transition Strategy
## Executive Summary
Following OpenAI's deprecation timeline (Assistants API sunset: mid-2026), we're migrating from the current Make.com workflow using **Assistants API** to a local backend using the new **Responses API**. This plan ensures feature parity while future-proofing the system.
## Migration Timeline & Context
**Current Status (2025):**
- ✅ Responses API released (March 2025) with full tool support
- ⚠️ Assistants API v1 deprecated (December 2024)
- ⏰ Assistants API complete sunset: Mid-2026
- 🎯 **Migration Priority: HIGH** - 18 months to complete transition
## Current Assistants API Usage Analysis
### From Make.com Workflow Blueprint:
#### 1. **Thread Management** (Modules 203, 493)
```javascript
// Current: OpenAI Threads API
POST https://api.openai.com/v1/threads
{
  "messages": [{
    "role": "user",
    "content": "Please use this tone of voice: [TOV_CONTENT]"
  }]
}
// Thread persistence via thread_id in conversations table
thread_id: "thread_xxx"
```
#### 2. **Assistant Message Processing** (Modules 519, 520)
```javascript
// Current: Assistants API messageAdvanced
{
  "assistantId": "asst_xxx",
  "threadId": "thread_xxx",
  "role": "user",
  "message": "User input"
}
// Run management with polling for completion
```
#### 3. **Assistant Configuration** (Datastore 1607)
```javascript
{
  "Assistant ID": "asst_xxx",          // OpenAI Assistant ID
  "Name": "Creative Assistant",        // Display name
  "Instructions": "System prompt...",  // Assistant personality
  "Model": "gpt-4-turbo",              // Model configuration
  "Initial Message": "Hello! I'm..."   // Welcome message
}
```
## Responses API Migration Strategy
### 1. **Conversation State Management**
**From:** Thread-based persistence
**To:** Response-based continuation with server-side memory
```javascript
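// NEW: Responses API with conversation memory
const response = await client.responses.create({
  model: "gpt-4o",
  input: userMessage,
  store: true, // Keep the response server-side so it can be continued
  previous_response_id: lastResponseId, // Continue conversation (null on the first turn)
  // The Responses API takes `instructions`, not `system`. Instructions are not
  // carried over from the previous response, so send them on every turn.
  instructions: assistantInstructions, // Assistant personality
  temperature: 0.7
});
// Store response_id for conversation continuation
conversation.last_response_id = response.id;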
```
**Key Benefits:**
- ✅ Automatic conversation memory management
- ✅ No manual thread/run management
- ✅ Simplified API calls (single endpoint)
- ✅ Built-in conversation forking capability
### 2. **Assistant Personality System**
**From:** Pre-configured Assistant IDs
**To:** Dynamic system prompts with response configuration
```javascript
// NEW: Dynamic assistant configuration
const assistants = {
  "creative_ideation": {
    name: "Creative Ideation Assistant",
    system: `You are a highly creative business ideation assistant with decades of experience helping teams generate innovative solutions. Your responses should be:
- Imaginative and forward-thinking
- Practical and implementable
- Encouraging and enthusiastic
- Rich with diverse perspectives and examples`,
    model: "gpt-4o",
    temperature: 0.8,
    initial_message: "Hello! I'm here to spark your creativity and help generate amazing business ideas!"
  },
  "analytical_advisor": {
    name: "Analytical Business Advisor",
    system: `You are a data-driven business analyst and strategic advisor. Your responses should be:
- Methodical and evidence-based
- Structured with clear frameworks
- Risk-aware and practical
- Focused on measurable outcomes`,
    model: "gpt-4o",
    temperature: 0.3,
    initial_message: "Greetings! I'm ready to provide analytical insights and strategic guidance for your business challenges."
  }
};
```
### 3. **Tone-of-Voice Integration**
**From:** Thread-level TOV injection
**To:** Dynamic system prompt modification
```javascript
// NEW: Enhanced system prompt with TOV
function buildSystemPrompt(assistantKey, tovKey) {
  const basePrompt = assistants[assistantKey].system;
  const tovPrompts = {
    "standard": "",
    "pep": "\n\nAdditionally, use an energetic, enthusiastic, and motivational tone in all your responses. Be upbeat, use exclamation points appropriately, and inspire action.",
    "professional": "\n\nMaintain a formal, professional tone throughout. Use clear, concise language appropriate for executive-level communication.",
    "casual": "\n\nUse a friendly, conversational tone. Be approachable and relatable while maintaining helpfulness."
  };
  // Unknown TOV keys fall back to the base prompt
  return basePrompt + (tovPrompts[tovKey] || "");
}
// Usage in API call
const systemPrompt = buildSystemPrompt(assistantKey, tovKey);
const response = await client.responses.create({
  model: assistants[assistantKey].model,
  input: userMessage,
  instructions: systemPrompt, // Responses API parameter name (not `system`)
  store: true,
  previous_response_id: lastResponseId
});
```
### 4. **Content Processing Pipeline**
**From:** External markdown compilation
**To:** Built-in response processing with enhanced tools
```javascript
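// NEW: Simplified response handling
const response = await client.responses.create({
  model: "gpt-4o",
  input: userMessage,
  instructions: systemPrompt, // Responses API parameter name (not `system`)
  store: true,
  previous_response_id: lastResponseId,
  // Enhanced with built-in tools
  tools: [
    { type: "web_search" }, // Built-in web search
    // Built-in file search needs an existing vector store; vectorStoreId is a placeholder
    { type: "file_search", vector_store_ids: [vectorStoreId] },
  ]
});
// The SDK exposes the concatenated text output as `output_text`
// (`choices[0].message.content` belongs to Chat Completions, not Responses)
const assistantMessage = response.output_text;
// The text is typically Markdown already; no thread/run assembly is needed to obtain it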
```
## Updated Database Schema
### Modified Tables for Responses API:
**conversations table (updated):**
```sql
CREATE TABLE conversations (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    title TEXT,
    last_response_id TEXT,                 -- NEW: Instead of thread_id
    assistant_key TEXT,
    tov_key TEXT DEFAULT 'standard',
    model TEXT DEFAULT 'gpt-4o',           -- NEW: Per-conversation model tracking
    cost DECIMAL(10,4) DEFAULT 0.0000,
    start_time DATETIME DEFAULT CURRENT_TIMESTAMP,
    end_time DATETIME DEFAULT CURRENT_TIMESTAMP
    -- Removed: thread_id, assistant_id columns
    -- Removed: assistant_id foreign key constraint
);
```
**assistants table (simplified):**
```sql
CREATE TABLE assistants (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    key TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL,
    system_prompt TEXT NOT NULL,           -- NEW: Full system prompt
    model TEXT DEFAULT 'gpt-4o',
    temperature DECIMAL(3,2) DEFAULT 0.7,  -- NEW: Model parameters
    initial_message TEXT,
    deleted BOOLEAN DEFAULT FALSE,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
    -- Removed: assistant_id column (no more OpenAI Assistant IDs)
    -- Removed: instructions column (merged into system_prompt)
);
```
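Moving from the Make.com datastore to this table is mostly a field rename. A minimal sketch of that mapping, assuming the datastore field names shown earlier for Datastore 1607; the helper name `toAssistantRow` and the slug rule for `key` are illustrative assumptions, not part of the existing system:

```javascript
// Hypothetical helper: converts one Make.com datastore record into a row
// for the simplified assistants table.
function toAssistantRow(record, defaults = { model: 'gpt-4o', temperature: 0.7 }) {
  const name = String(record['Name'] || '').trim();
  if (!name) {
    throw new Error('Datastore record has no Name');
  }
  // Assumption: the key is a slug of the display name,
  // e.g. "Creative Assistant" becomes "creative_assistant".
  const key = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
  return {
    key,
    name,
    system_prompt: record['Instructions'] || '',
    model: record['Model'] || defaults.model,
    temperature: defaults.temperature,
    initial_message: record['Initial Message'] || null,
    // "Assistant ID" is deliberately dropped: the new table stores no OpenAI Assistant IDs.
  };
}
```

Older model names such as `gpt-4-turbo` pass through unchanged here; whether to remap them to `gpt-4o` during import is a separate decision.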
**responses table (new):**
```sql
CREATE TABLE responses (
    id TEXT PRIMARY KEY,                   -- OpenAI response_id
    conversation_id TEXT NOT NULL,
    parent_response_id TEXT,               -- For conversation threading
    model TEXT NOT NULL,
    system_prompt TEXT,                    -- Snapshot of system prompt used
    input_tokens INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0,
    cost DECIMAL(10,6) DEFAULT 0.000000,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (conversation_id) REFERENCES conversations (id)
);
```
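The `parent_response_id` column makes each conversation a chain (or a tree, once forks exist). A small sketch of how the path from the first response to a given one could be rebuilt from rows of this table; `buildLineage` is a hypothetical helper and the rows are assumed to be already loaded into memory:

```javascript
// rows: array of { id, parent_response_id } objects from the responses table.
// Returns response IDs ordered from the root of the chain to leafId.
function buildLineage(rows, leafId) {
  const byId = new Map(rows.map((row) => [row.id, row]));
  const lineage = [];
  const seen = new Set();
  let currentId = leafId;
  while (currentId && byId.has(currentId)) {
    if (seen.has(currentId)) {
      throw new Error(`Cycle detected at ${currentId}`);
    }
    seen.add(currentId);
    lineage.unshift(currentId); // walk upward, so prepend
    currentId = byId.get(currentId).parent_response_id;
  }
  return lineage;
}

// Example: resp_2b is a fork that shares resp_1 with the main branch.
const exampleRows = [
  { id: 'resp_1', parent_response_id: null },
  { id: 'resp_2', parent_response_id: 'resp_1' },
  { id: 'resp_3', parent_response_id: 'resp_2' },
  { id: 'resp_2b', parent_response_id: 'resp_1' },
];
console.log(buildLineage(exampleRows, 'resp_3'));  // [ 'resp_1', 'resp_2', 'resp_3' ]
console.log(buildLineage(exampleRows, 'resp_2b')); // [ 'resp_1', 'resp_2b' ]
```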
## API Implementation Changes
### 1. **Updated Chat Endpoint** (`routes/chat.js`):
```javascript
const express = require('express');
const router = express.Router();
const { OpenAI } = require('openai');
const { v4: uuidv4 } = require('uuid');
// Assumed module paths: adjust to wherever the models and helpers actually live
const { Assistant, Conversation, Message, Response } = require('../models');
const { buildSystemPrompt, calculateCost, generateTitle } = require('../lib/chat-helpers');
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
router.post('/', async (req, res) => {
  try {
    const { user_id } = req.auth;
    // Rename the body field so it does not shadow the Message model
    const { ConversationID, AssistantKey, TOV_Key, Message: userMessage } = req.body;
    // Validate required fields
    if (!AssistantKey || !TOV_Key || !userMessage) {
      return res.status(400).json({ error: 'Missing required fields' });
    }
    // Content moderation (still separate API)
    const moderation = await openai.moderations.create({ input: userMessage });
    if (moderation.results[0].flagged) {
      return res.status(400).json({ error: 'Content flagged by moderation' });
    }
    // Get assistant configuration
    const assistant = await Assistant.findOne({
      where: { key: AssistantKey, deleted: false }
    });
    if (!assistant) {
      return res.status(400).json({ error: 'Error: Assistant Not Set' });
    }
    let conversation;
    const isNewConversation = !ConversationID;
    let previousResponseId = null;
    if (isNewConversation) {
      // Create new conversation
      conversation = await Conversation.create({
        id: uuidv4(),
        user_id,
        assistant_key: AssistantKey,
        tov_key: TOV_Key,
        model: assistant.model
      });
    } else {
      // Get existing conversation
      conversation = await Conversation.findOne({
        where: { id: ConversationID, user_id }
      });
      if (!conversation) {
        return res.status(404).json({ error: 'Conversation not found' });
      }
      previousResponseId = conversation.last_response_id;
    }
    // Build system prompt with TOV
    const systemPrompt = buildSystemPrompt(AssistantKey, TOV_Key);
    // Call Responses API
    const response = await openai.responses.create({
      model: assistant.model,
      input: userMessage,
      instructions: systemPrompt, // Responses API parameter name (not `system`)
      temperature: assistant.temperature,
      store: true, // Enable conversation memory
      previous_response_id: previousResponseId,
      // Built-in tools (if needed). file_search can be added as well,
      // but it requires vector_store_ids pointing at an existing vector store.
      tools: [
        { type: "web_search" }
      ]
    });
    // Store user message
    await Message.create({
      conversation_id: conversation.id,
      role: 'user',
      content: userMessage,
      content_plain: userMessage
    });
    // Extract assistant response (`choices[0]` is Chat Completions, not Responses)
    const assistantMessage = response.output_text;
    // Store assistant message
    await Message.create({
      conversation_id: conversation.id,
      role: 'assistant',
      content: assistantMessage,
      content_plain: assistantMessage
    });
    // Store response metadata (Responses API usage fields are input_tokens / output_tokens)
    await Response.create({
      id: response.id,
      conversation_id: conversation.id,
      parent_response_id: previousResponseId,
      model: assistant.model,
      system_prompt: systemPrompt,
      input_tokens: response.usage.input_tokens,
      output_tokens: response.usage.output_tokens,
      cost: calculateCost(response.usage, assistant.model)
    });
    // Update conversation
    await conversation.update({
      last_response_id: response.id,
      end_time: new Date()
    });
    // Generate title for new conversations
    if (isNewConversation) {
      const title = await generateTitle(userMessage);
      await conversation.update({ title });
      return res.json({
        conversation_id: conversation.id,
        conversation_title: title,
        message: assistantMessage
      });
    }
    res.json({
      conversation_id: conversation.id,
      message: assistantMessage
    });
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({ error: 'Internal server error' });
  }
});
module.exports = router;
```
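The endpoint calls `calculateCost`, which this plan does not define. One possible shape, as a sketch only: it assumes the usage object carries `input_tokens` and `output_tokens` as the Responses API reports them, and the price table below holds placeholder numbers, not real OpenAI pricing.

```javascript
// Placeholder prices in USD per 1M tokens. These values are NOT real pricing;
// replace them with the current rates for each model before relying on the output.
const PRICES_PER_MILLION = {
  'gpt-4o': { input: 1.0, output: 2.0 }, // placeholder values
};

// Hypothetical implementation of the calculateCost helper used by the chat route.
function calculateCost(usage, model, prices = PRICES_PER_MILLION) {
  const price = prices[model];
  if (!price) {
    return 0; // unknown model: record zero rather than guess
  }
  const inputTokens = usage.input_tokens || 0;
  const outputTokens = usage.output_tokens || 0;
  const cost = (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
  // Round to six decimals to match the DECIMAL(10,6) cost column
  return Number(cost.toFixed(6));
}
```

Passing the price table as a parameter keeps the function easy to unit test and lets pricing live in configuration rather than code.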
### 2. **Conversation Retrieval** (`routes/conversations.js`):
```javascript
// GET /api/conversations/:id/messages
router.get('/:id/messages', async (req, res) => {
  try {
    const { user_id } = req.auth;
    const { id } = req.params;
    // Check ownership before returning any messages
    const conversation = await Conversation.findOne({
      where: { id, user_id }
    });
    if (!conversation || !conversation.last_response_id) {
      return res.json({ conversation_id: id, messages: [] });
    }
    // Option 1: Retrieve from local database (maintains current UX)
    const messages = await Message.findAll({
      where: { conversation_id: id },
      order: [['timestamp', 'ASC']]
    });
    // Option 2 (not used here): rebuild the history from OpenAI's server-side memory.
    // Note that openai.responses.retrieve(id) returns a single response object
    // with no `messages` array, so earlier turns have to be fetched separately
    // (for example through the response's input items) and stitched together.
    res.json({
      conversation_id: id,
      messages: messages.map(msg => ({
        role: msg.role,
        content: msg.content
      }))
    });
  } catch (error) {
    console.error('Messages retrieval error:', error);
    res.status(500).json({ error: 'Failed to retrieve messages' });
  }
});
```
### 3. **Enhanced Features with Responses API**:
#### Conversation Forking:
```javascript
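// Fork conversation at any point
router.post('/:id/fork', async (req, res) => {
  try {
    const { user_id } = req.auth; // was missing: user_id is needed below
    const { response_id, new_message } = req.body;
    // Only allow forking a conversation the caller owns
    const source = await Conversation.findOne({
      where: { id: req.params.id, user_id }
    });
    if (!source) {
      return res.status(404).json({ error: 'Conversation not found' });
    }
    const forkedResponse = await openai.responses.create({
      model: "gpt-4o",
      input: new_message,
      previous_response_id: response_id, // Fork from this point
      store: true
    });
    // Create new conversation branch
    const newConversation = await Conversation.create({
      id: uuidv4(),
      user_id,
      last_response_id: forkedResponse.id,
      // ... other fields
    });
    res.json({ conversation_id: newConversation.id });
  } catch (error) {
    console.error('Fork error:', error);
    res.status(500).json({ error: 'Failed to fork conversation' });
  }
});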
```
#### Built-in Web Search:
```javascript
// Automatic web search when relevant
const response = await openai.responses.create({
  model: "gpt-4o",
  input: "What are the latest trends in AI for 2025?",
  tools: [{ type: "web_search" }], // The model decides when a web search is needed
  store: true
});
```
## Migration Benefits
### 1. **Simplified Architecture**
- **Remove:** Thread management, run polling, message creation
- **Add:** Single API call with automatic memory
- 📉 **Reduce:** ~60% fewer API calls per conversation
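How large the reduction is depends on how many calls each flow makes per user turn. A rough sketch of the arithmetic, under assumed (not measured) call counts: the Assistants flow adds a message, creates a run, polls the run some number of times, and lists messages, while the Responses flow makes one call. `sharedCalls` covers calls both flows make, such as moderation.

```javascript
// Assumed call counts per user turn (illustrative, not measured from the Make.com workflow):
//   Assistants flow: add message + create run + list messages + N status polls
//   Responses flow:  one responses.create call
function callReduction(pollsPerRun, sharedCalls = 0) {
  const assistantsCalls = sharedCalls + 3 + pollsPerRun;
  const responsesCalls = sharedCalls + 1;
  return 1 - responsesCalls / assistantsCalls;
}

// One poll per run plus one shared moderation call: 5 calls versus 2
console.log(`${Math.round(callReduction(1, 1) * 100)}%`); // 60%
```

More polls per run push the figure higher, so the actual saving should be confirmed against real traffic during parallel operation.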
### 2. **Enhanced Capabilities**
- 🌐 **Built-in Web Search:** No external integration needed
- 📁 **Built-in File Search:** Advanced RAG capabilities
- 🔧 **Enhanced Tools:** Future-proof tool ecosystem
- 🧠 **Server-side Memory:** Automatic conversation management
### 3. **Cost Optimization**
- 💰 **Reduced API calls:** Single endpoint vs multiple (threads, messages, runs)
- **Faster responses:** No run polling delays
- 📊 **Better analytics:** Built-in usage tracking
### 4. **Developer Experience**
- 🚀 **Simpler debugging:** Single API call to trace
- 🔄 **Easier testing:** Stateless requests for unit testing
- 📚 **Better documentation:** Active OpenAI support and examples
## Implementation Timeline
### Phase 1: Foundation (Week 1)
- [ ] Set up Responses API client and authentication
- [ ] Update database schema for response_id tracking
- [ ] Create assistant configuration system
- [ ] Test basic Responses API integration
### Phase 2: Core Migration (Week 2)
- [ ] Implement new chat endpoint with Responses API
- [ ] Update conversation retrieval logic
- [ ] Migrate tone-of-voice system to dynamic prompts
- [ ] Test conversation continuity and memory
### Phase 3: Enhanced Features (Week 3)
- [ ] Integrate built-in web search capabilities
- [ ] Add conversation forking functionality
- [ ] Implement advanced analytics and cost tracking
- [ ] Update frontend for new response format
### Phase 4: Production Optimization (Week 4)
- [ ] Performance testing and optimization
- [ ] Error handling and retry logic
- [ ] Monitoring and alerting setup
- [ ] Documentation and deployment guides
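For the retry item, a minimal exponential-backoff wrapper could look like the sketch below. The names `withRetry` and `isRetryable` are hypothetical, and treating HTTP 429 and 5xx as retryable is an assumption to validate. The official `openai` Node SDK also has a `maxRetries` client option, which may make a custom wrapper unnecessary for plain API calls.

```javascript
// Assumption: errors carrying an HTTP `status` of 429 or 5xx are worth retrying.
function isRetryable(error) {
  const status = error && error.status;
  return status === 429 || (status >= 500 && status < 600);
}

// Runs fn, retrying retryable failures with exponentially growing delays.
// `sleep` can be injected so tests do not have to wait for real timers.
async function withRetry(fn, { retries = 3, baseDelayMs = 500, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= retries || !isRetryable(error)) {
        throw error;
      }
      await wait(baseDelayMs * 2 ** attempt); // 500 ms, 1 s, 2 s, ...
    }
  }
}
```

In the chat route this would wrap the API call, for example `await withRetry(() => openai.responses.create(params))`.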
### Phase 5: Parallel Operation (Week 5)
- [ ] Run both systems in parallel for validation
- [ ] Data migration from Assistants to Responses format
- [ ] User acceptance testing
- [ ] Gradual cutover strategy
## Risk Mitigation
### 1. **API Compatibility**
- **Risk:** Breaking changes in Responses API
- **Mitigation:** Version pinning, fallback to Chat Completions API
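A sketch of what the Chat Completions fallback might look like. `createWithFallback` is a hypothetical helper; the client is passed in so it can be stubbed in tests, and falling back on any error is a simplifying assumption (a real implementation would probably fall back only on specific failures).

```javascript
// `client` is expected to look like an OpenAI SDK instance:
// client.responses.create(...) and client.chat.completions.create(...).
async function createWithFallback(client, { model, instructions, input }) {
  try {
    const response = await client.responses.create({ model, instructions, input });
    return { source: 'responses', text: response.output_text, responseId: response.id };
  } catch (error) {
    // Assumption: any Responses API failure triggers the fallback.
    const completion = await client.chat.completions.create({
      model,
      messages: [
        { role: 'system', content: instructions },
        { role: 'user', content: input },
      ],
    });
    return {
      source: 'chat_completions',
      text: completion.choices[0].message.content,
      responseId: null, // Chat Completions has no response ID to continue from
    };
  }
}
```

The fallback path has no server-side memory, so earlier turns would need to be replayed from the local messages table; this sketch sends only the current message.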
### 2. **Feature Gaps**
- **Risk:** Missing features from Assistants API
- **Mitigation:** Hybrid approach using Chat Completions for gaps
### 3. **Migration Timeline**
- **Risk:** Assistants API sunset before migration complete
- **Mitigation:** Aggressive timeline with parallel development
### 4. **Data Loss**
- **Risk:** Conversation history lost during migration
- **Mitigation:** Full data export and mapping strategy
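The mapping step could be sketched as follows, assuming the exported data is the list of message objects the Assistants API returns for a thread (each with `role`, `created_at`, and a `content` array of typed parts). `mapThreadMessages` is a hypothetical helper and the output fields mirror the local `Message` rows used in the chat endpoint.

```javascript
// Converts exported Assistants thread messages into local message rows.
// Non-text parts (for example image files) are skipped in this sketch.
function mapThreadMessages(threadMessages, conversationId) {
  return [...threadMessages]
    .sort((a, b) => a.created_at - b.created_at) // oldest first
    .map((message) => {
      const text = message.content
        .filter((part) => part.type === 'text')
        .map((part) => part.text.value)
        .join('\n');
      return {
        conversation_id: conversationId,
        role: message.role,
        content: text,
        content_plain: text,
      };
    });
}
```

Migrated conversations have no `last_response_id`, so their first turn after cutover starts a new response chain; prior context would have to be supplied from these local rows if continuity matters.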
## Success Metrics
### Technical Metrics:
- **Response Time:** <2s average (vs current ~5s with polling)
- **API Call Reduction:** 60% fewer calls per conversation
- **Error Rate:** <1% API errors
- **Feature Parity:** 100% current functionality maintained
### Business Metrics:
- 💰 **Cost Reduction:** 30-40% OpenAI usage costs
- 📈 **User Satisfaction:** Improved response times
- 🛠 **Developer Velocity:** Faster feature development
- 🔮 **Future-Proofing:** Ready for OpenAI's 2026+ roadmap
This migration plan ensures we transition to the Responses API while maintaining all current functionality and positioning for enhanced capabilities and cost optimization.