# OpenAI Responses API Migration Plan - 2025 Transition Strategy

## Executive Summary

Following OpenAI's deprecation timeline (Assistants API sunset: mid-2026), we're migrating from the current Make.com workflow using **Assistants API** to a local backend using the new **Responses API**. This plan ensures feature parity while future-proofing the system.

## Migration Timeline & Context

**Current Status (2025):**
- ✅ Responses API released (March 2025) with full tool support
- ⚠️ Assistants API v1 deprecated (December 2024)
- ⏰ Assistants API complete sunset: Mid-2026
- 🎯 **Migration Priority: HIGH** - 18 months to complete transition

## Current Assistants API Usage Analysis

### From Make.com Workflow Blueprint:

#### 1. **Thread Management** (Modules 203, 493)
```javascript
// Current: OpenAI Threads API
POST https://api.openai.com/v1/threads
{
  "messages": [{
    "role": "user",
    "content": "Please use this tone of voice: [TOV_CONTENT]"
  }]
}

// Thread persistence via thread_id in conversations table
thread_id: "thread_xxx"
```

#### 2. **Assistant Message Processing** (Modules 519, 520)
```javascript
// Current: Assistants API messageAdvanced
{
  "assistantId": "asst_xxx",
  "threadId": "thread_xxx",
  "role": "user",
  "message": "User input"
}

// Run management with polling for completion
```

#### 3. **Assistant Configuration** (Datastore 1607)
```javascript
{
  "Assistant ID": "asst_xxx",            // OpenAI Assistant ID
  "Name": "Creative Assistant",          // Display name
  "Instructions": "System prompt...",    // Assistant personality
  "Model": "gpt-4-turbo",                // Model configuration
  "Initial Message": "Hello! I'm..."     // Welcome message
}
```

## Responses API Migration Strategy

### 1. **Conversation State Management**

**From:** Thread-based persistence
**To:** Response-based continuation with server-side memory

```javascript
// NEW: Responses API with conversation memory
const response = await client.responses.create({
  model: "gpt-4o",
  input: userMessage,
  store: true,                           // Enable server-side memory
  previous_response_id: lastResponseId,  // Continue conversation
  instructions: assistantInstructions,   // Assistant personality; not carried over between responses, so send it on every call
  temperature: 0.7
});

// Store response_id for conversation continuation
conversation.last_response_id = response.id;
```

**Key Benefits:**
- ✅ Automatic conversation memory management
- ✅ No manual thread/run management
- ✅ Simplified API calls (single endpoint)
- ✅ Built-in conversation forking capability

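The continuation pattern above can be sketched as a small request builder. This is a minimal illustration, not the project's implementation: `buildResponseRequest` is a hypothetical helper name, and the sketch assumes the first turn of a conversation simply omits `previous_response_id` while every turn resends the instructions.

```javascript
// Hypothetical helper: assembles the parameters for responses.create().
// previous_response_id is only added when an earlier response exists.
function buildResponseRequest({ model, instructions, userMessage, lastResponseId, temperature = 0.7 }) {
  const params = {
    model,
    input: userMessage,
    instructions, // resent on every turn
    temperature,
    store: true,  // keep the response server-side so it can be continued
  };
  if (lastResponseId) {
    params.previous_response_id = lastResponseId;
  }
  return params;
}

const firstTurn = buildResponseRequest({
  model: "gpt-4o",
  instructions: "You are a creative assistant.",
  userMessage: "Give me three product ideas.",
  lastResponseId: null,
});

const followUp = buildResponseRequest({
  model: "gpt-4o",
  instructions: "You are a creative assistant.",
  userMessage: "Expand on the second idea.",
  lastResponseId: "resp_123", // placeholder ID for illustration
});

console.log(firstTurn);
console.log(followUp);
```

Keeping the builder free of network calls makes the continuation logic easy to unit test before it is wired to the real client.
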
### 2. **Assistant Personality System**

**From:** Pre-configured Assistant IDs
**To:** Dynamic system prompts with response configuration

```javascript
// NEW: Dynamic assistant configuration
const assistants = {
  "creative_ideation": {
    name: "Creative Ideation Assistant",
    system: `You are a highly creative business ideation assistant with decades of experience helping teams generate innovative solutions. Your responses should be:
- Imaginative and forward-thinking
- Practical and implementable
- Encouraging and enthusiastic
- Rich with diverse perspectives and examples`,
    model: "gpt-4o",
    temperature: 0.8,
    initial_message: "Hello! I'm here to spark your creativity and help generate amazing business ideas!"
  },

  "analytical_advisor": {
    name: "Analytical Business Advisor",
    system: `You are a data-driven business analyst and strategic advisor. Your responses should be:
- Methodical and evidence-based
- Structured with clear frameworks
- Risk-aware and practical
- Focused on measurable outcomes`,
    model: "gpt-4o",
    temperature: 0.3,
    initial_message: "Greetings! I'm ready to provide analytical insights and strategic guidance for your business challenges."
  }
};
```

### 3. **Tone-of-Voice Integration**

**From:** Thread-level TOV injection
**To:** Dynamic system prompt modification

```javascript
// NEW: Enhanced system prompt with TOV
function buildSystemPrompt(assistantKey, tovKey) {
  const basePrompt = assistants[assistantKey].system;
  const tovPrompts = {
    "standard": "",
    "pep": "\n\nAdditionally, use an energetic, enthusiastic, and motivational tone in all your responses. Be upbeat, use exclamation points appropriately, and inspire action.",
    "professional": "\n\nMaintain a formal, professional tone throughout. Use clear, concise language appropriate for executive-level communication.",
    "casual": "\n\nUse a friendly, conversational tone. Be approachable and relatable while maintaining helpfulness."
  };

  // Unknown TOV keys fall back to the base prompt
  return basePrompt + (tovPrompts[tovKey] || "");
}

// Usage in API call
const systemPrompt = buildSystemPrompt(assistantKey, tovKey);
const response = await client.responses.create({
  model: assistants[assistantKey].model,
  input: userMessage,
  instructions: systemPrompt,  // The Responses API takes the system prompt as `instructions`
  store: true,
  previous_response_id: lastResponseId
});
```

### 4. **Content Processing Pipeline**

**From:** External markdown compilation
**To:** Built-in response processing with enhanced tools

```javascript
// NEW: Simplified response handling
const response = await client.responses.create({
  model: "gpt-4o",
  input: userMessage,
  instructions: systemPrompt,
  store: true,
  previous_response_id: lastResponseId,

  // Enhanced with built-in tools
  tools: [
    { type: "web_search" },                                    // Built-in web search
    { type: "file_search", vector_store_ids: [vectorStoreId] } // Built-in file search; vectorStoreId must reference an existing vector store
  ]
});

// The Responses API has no `choices` array; the SDK exposes the
// concatenated text output as `output_text`
const assistantMessage = response.output_text;
// The text is typically markdown, so no separate compilation step is needed;
// rendering it is still the client's job
```

## Updated Database Schema

### Modified Tables for Responses API:

**conversations table (updated):**
```sql
CREATE TABLE conversations (
  id TEXT PRIMARY KEY,
  user_id TEXT NOT NULL,
  title TEXT,
  last_response_id TEXT,        -- NEW: Instead of thread_id
  assistant_key TEXT,
  tov_key TEXT DEFAULT 'standard',
  model TEXT DEFAULT 'gpt-4o',  -- NEW: Per-conversation model tracking
  cost DECIMAL(10,4) DEFAULT 0.0000,
  start_time DATETIME DEFAULT CURRENT_TIMESTAMP,
  end_time DATETIME DEFAULT CURRENT_TIMESTAMP

  -- Remove thread_id, assistant_id columns
  -- Remove assistant_id foreign key constraint
);
```

**assistants table (simplified):**
```sql
CREATE TABLE assistants (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  key TEXT UNIQUE NOT NULL,
  name TEXT NOT NULL,
  system_prompt TEXT NOT NULL,           -- NEW: Full system prompt
  model TEXT DEFAULT 'gpt-4o',
  temperature DECIMAL(3,2) DEFAULT 0.7,  -- NEW: Model parameters
  initial_message TEXT,
  deleted BOOLEAN DEFAULT FALSE,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP

  -- Remove assistant_id column (no more OpenAI Assistant IDs)
  -- Remove instructions column (merged into system_prompt)
);
```

**responses table (new):**
```sql
CREATE TABLE responses (
  id TEXT PRIMARY KEY,          -- OpenAI response_id
  conversation_id TEXT NOT NULL,
  parent_response_id TEXT,      -- For conversation threading
  model TEXT NOT NULL,
  system_prompt TEXT,           -- Snapshot of system prompt used
  input_tokens INTEGER DEFAULT 0,
  output_tokens INTEGER DEFAULT 0,
  cost DECIMAL(10,6) DEFAULT 0.000000,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (conversation_id) REFERENCES conversations (id)
);
```

## API Implementation Changes

### 1. **Updated Chat Endpoint** (`routes/chat.js`):

```javascript
const express = require('express');
const router = express.Router();
const { OpenAI } = require('openai');
const { v4: uuidv4 } = require('uuid');
// Assumed module paths: adjust to wherever the models and helpers live
const { Assistant, Conversation, Message, Response } = require('../models');
const { buildSystemPrompt, calculateCost, generateTitle } = require('../lib/chat-helpers');

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

router.post('/', async (req, res) => {
  try {
    const { user_id } = req.auth;
    // Rename Message on destructuring so it does not shadow the Message model
    const { ConversationID, AssistantKey, TOV_Key, Message: userMessage } = req.body;

    // Validate required fields
    if (!AssistantKey || !TOV_Key || !userMessage) {
      return res.status(400).json({ error: 'Missing required fields' });
    }

    // Content moderation (still separate API)
    const moderation = await openai.moderations.create({ input: userMessage });
    if (moderation.results[0].flagged) {
      return res.status(400).json({ error: 'Content flagged by moderation' });
    }

    // Get assistant configuration
    const assistant = await Assistant.findOne({
      where: { key: AssistantKey, deleted: false }
    });

    if (!assistant) {
      return res.status(400).json({ error: 'Error: Assistant Not Set' });
    }

    let conversation;
    const isNewConversation = !ConversationID;
    let previousResponseId = null;

    if (isNewConversation) {
      // Create new conversation
      conversation = await Conversation.create({
        id: uuidv4(),
        user_id,
        assistant_key: AssistantKey,
        tov_key: TOV_Key,
        model: assistant.model
      });
    } else {
      // Get existing conversation (scoped to the authenticated user)
      conversation = await Conversation.findOne({
        where: { id: ConversationID, user_id }
      });

      if (!conversation) {
        return res.status(404).json({ error: 'Conversation not found' });
      }

      previousResponseId = conversation.last_response_id;
    }

    // Build system prompt with TOV
    const systemPrompt = buildSystemPrompt(AssistantKey, TOV_Key);

    // Call Responses API
    const response = await openai.responses.create({
      model: assistant.model,
      input: userMessage,
      instructions: systemPrompt,                    // Must be resent on every turn
      temperature: Number(assistant.temperature),    // DECIMAL columns may come back as strings
      store: true,                                   // Enable conversation memory
      previous_response_id: previousResponseId,

      // Built-in tools (if needed). file_search additionally requires
      // vector_store_ids, so it is only noted here:
      // { type: "file_search", vector_store_ids: [...] }
      tools: [
        { type: "web_search" }
      ]
    });

    // Store user message
    await Message.create({
      conversation_id: conversation.id,
      role: 'user',
      content: userMessage,
      content_plain: userMessage
    });

    // Extract assistant response (the Responses API has no `choices` array)
    const assistantMessage = response.output_text;

    // Store assistant message
    await Message.create({
      conversation_id: conversation.id,
      role: 'assistant',
      content: assistantMessage,
      content_plain: assistantMessage
    });

    // Store response metadata; usage fields are input_tokens / output_tokens
    await Response.create({
      id: response.id,
      conversation_id: conversation.id,
      parent_response_id: previousResponseId,
      model: assistant.model,
      system_prompt: systemPrompt,
      input_tokens: response.usage.input_tokens,
      output_tokens: response.usage.output_tokens,
      cost: calculateCost(response.usage, assistant.model)
    });

    // Update conversation
    await conversation.update({
      last_response_id: response.id,
      end_time: new Date()
    });

    // Generate title for new conversations
    if (isNewConversation) {
      const title = await generateTitle(userMessage);
      await conversation.update({ title });

      return res.json({
        conversation_id: conversation.id,
        conversation_title: title,
        message: assistantMessage
      });
    }

    res.json({
      conversation_id: conversation.id,
      message: assistantMessage
    });

  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({ error: 'Internal server error' });
  }
});

module.exports = router;
```

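The endpoint above calls `calculateCost(response.usage, assistant.model)` without defining it. One possible shape is sketched below, assuming the usage object exposes `input_tokens` and `output_tokens` and that prices are kept in a local table. The price values are illustrative placeholders, not authoritative rates, and `PRICES_PER_MILLION` is a hypothetical name.

```javascript
// Placeholder per-million-token prices in USD. These numbers are for
// illustration only and should be replaced with current published pricing.
const PRICES_PER_MILLION = {
  "gpt-4o": { input: 2.5, output: 10 },
};

// Returns the cost of one response in USD, rounded to six decimals
// (matching the DECIMAL(10,6) cost column). Unknown models cost 0.
function calculateCost(usage, model) {
  const price = PRICES_PER_MILLION[model];
  if (!price || !usage) return 0;
  const cost =
    (usage.input_tokens * price.input + usage.output_tokens * price.output) / 1_000_000;
  return Number(cost.toFixed(6));
}

console.log(calculateCost({ input_tokens: 1200, output_tokens: 300 }, "gpt-4o")); // 0.006
```

Returning 0 for an unknown model keeps the endpoint from failing on a pricing gap; logging such cases would make the gap visible.
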
### 2. **Conversation Retrieval** (`routes/conversations.js`):

```javascript
// GET /api/conversations/:id/messages
router.get('/:id/messages', async (req, res) => {
  try {
    const { user_id } = req.auth;
    const { id } = req.params;

    // Confirm the conversation belongs to the caller before returning anything
    const conversation = await Conversation.findOne({
      where: { id, user_id }
    });

    if (!conversation) {
      return res.json({ conversation_id: id, messages: [] });
    }

    // Option 1: Retrieve from local database (maintains current UX)
    const messages = await Message.findAll({
      where: { conversation_id: id },
      order: [['timestamp', 'ASC']]
    });

    // Option 2 (alternative): read stored items back from OpenAI instead of
    // the local table. responses.retrieve() returns a single response object
    // (its output items), not a `messages` array; the items sent as input are
    // listed via openai.responses.inputItems.list(conversation.last_response_id).
    // Verify how much earlier history that call returns before relying on it.

    res.json({
      conversation_id: id,
      messages: messages.map(msg => ({
        role: msg.role,
        content: msg.content
      }))
    });

  } catch (error) {
    console.error('Messages retrieval error:', error);
    res.status(500).json({ error: 'Failed to retrieve messages' });
  }
});
```

### 3. **Enhanced Features with Responses API**:

#### Conversation Forking:
```javascript
// Fork conversation at any point
router.post('/:id/fork', async (req, res) => {
  try {
    const { user_id } = req.auth;
    const { response_id, new_message } = req.body;

    const forkedResponse = await openai.responses.create({
      model: "gpt-4o",
      input: new_message,
      previous_response_id: response_id,  // Fork from this point
      store: true
    });

    // Create new conversation branch
    const newConversation = await Conversation.create({
      id: uuidv4(),
      user_id,
      last_response_id: forkedResponse.id,
      // ... other fields
    });

    res.json({ conversation_id: newConversation.id });
  } catch (error) {
    console.error('Fork error:', error);
    res.status(500).json({ error: 'Failed to fork conversation' });
  }
});
```

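Because a fork shares its earlier turns with the original conversation, the `parent_response_id` column in the `responses` table is what lets a branch be traced back to its root. A minimal sketch of that walk, assuming the rows have already been loaded into memory (`traceBranch` is a hypothetical helper name):

```javascript
// rows: records shaped like the responses table (id, parent_response_id).
// Returns the response IDs from the root of the branch down to leafId.
function traceBranch(rows, leafId) {
  const byId = new Map(rows.map((row) => [row.id, row]));
  const chain = [];
  const seen = new Set(); // guards against accidental cycles in the data

  let current = byId.get(leafId);
  while (current && !seen.has(current.id)) {
    seen.add(current.id);
    chain.unshift(current.id);
    current = current.parent_response_id
      ? byId.get(current.parent_response_id)
      : undefined;
  }
  return chain;
}

const rows = [
  { id: "resp_1", parent_response_id: null },
  { id: "resp_2", parent_response_id: "resp_1" },
  { id: "resp_3", parent_response_id: "resp_2" },
  { id: "resp_2b", parent_response_id: "resp_1" }, // fork created from resp_1
];

console.log(traceBranch(rows, "resp_3"));  // [ 'resp_1', 'resp_2', 'resp_3' ]
console.log(traceBranch(rows, "resp_2b")); // [ 'resp_1', 'resp_2b' ]
```
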
#### Built-in Web Search:
```javascript
// Automatic web search when relevant
const response = await openai.responses.create({
  model: "gpt-4o",
  input: "What are the latest trends in AI for 2025?",
  tools: [{ type: "web_search" }],  // Automatically searches web when needed
  store: true
});
```

## Migration Benefits

### 1. **Simplified Architecture**
- ❌ **Remove:** Thread management, run polling, message creation
- ✅ **Add:** Single API call with automatic memory
- 📉 **Reduce:** ~60% fewer API calls per conversation

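The plan does not state how the ~60% figure was derived. One way to arrive at it, shown here purely as an assumption, is to count calls per user turn: the Assistants flow needs an add-message call, a create-run call, at least one status poll, and a list-messages call, while the Responses flow needs one call. Calls common to both flows (such as moderation) are counted on both sides.

```javascript
// Rough per-turn call count under the assumptions stated above.
// polls: number of run-status polls in the Assistants flow.
// sharedCalls: calls made in both flows (for example moderation).
function callsPerTurn({ polls, sharedCalls }) {
  const assistants = sharedCalls + 1 /* add message */ + 1 /* create run */ + polls + 1 /* list messages */;
  const responses = sharedCalls + 1 /* responses.create */;
  const reductionPercent = Math.round((1 - responses / assistants) * 100);
  return { assistants, responses, reductionPercent };
}

console.log(callsPerTurn({ polls: 1, sharedCalls: 1 }));
// { assistants: 5, responses: 2, reductionPercent: 60 }
```

With more polls per run the reduction grows, so under these assumptions 60% sits at the conservative end.
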
### 2. **Enhanced Capabilities**
- 🌐 **Built-in Web Search:** No external integration needed
- 📁 **Built-in File Search:** Advanced RAG capabilities
- 🔧 **Enhanced Tools:** Future-proof tool ecosystem
- 🧠 **Server-side Memory:** Automatic conversation management

### 3. **Cost Optimization**
- 💰 **Reduced API calls:** Single endpoint vs multiple (threads, messages, runs)
- ⚡ **Faster responses:** No run polling delays
- 📊 **Better analytics:** Built-in usage tracking

### 4. **Developer Experience**
- 🚀 **Simpler debugging:** Single API call to trace
- 🔄 **Easier testing:** Stateless requests for unit testing
- 📚 **Better documentation:** Active OpenAI support and examples

## Implementation Timeline

### Phase 1: Foundation (Week 1)
- [ ] Set up Responses API client and authentication
- [ ] Update database schema for response_id tracking
- [ ] Create assistant configuration system
- [ ] Test basic Responses API integration

### Phase 2: Core Migration (Week 2)
- [ ] Implement new chat endpoint with Responses API
- [ ] Update conversation retrieval logic
- [ ] Migrate tone-of-voice system to dynamic prompts
- [ ] Test conversation continuity and memory

### Phase 3: Enhanced Features (Week 3)
- [ ] Integrate built-in web search capabilities
- [ ] Add conversation forking functionality
- [ ] Implement advanced analytics and cost tracking
- [ ] Update frontend for new response format

### Phase 4: Production Optimization (Week 4)
- [ ] Performance testing and optimization
- [ ] Error handling and retry logic
- [ ] Monitoring and alerting setup
- [ ] Documentation and deployment guides

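For the retry item in Phase 4, a generic wrapper with exponential backoff is one option. The sketch below is an illustration only: `withRetry` is a hypothetical helper, and it assumes failed calls throw errors carrying an HTTP `status` property. The official Node SDK also accepts a `maxRetries` client option, which may be enough on its own.

```javascript
// Hypothetical helper: retries an async call with exponential backoff.
// By default only errors with HTTP status 429 or 5xx are retried.
async function withRetry(fn, options = {}) {
  const {
    retries = 3,
    baseDelayMs = 500,
    isRetryable = (err) => err.status === 429 || err.status >= 500,
  } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries || !isRetryable(err)) throw err;
      const delayMs = baseDelayMs * 2 ** attempt; // 500, 1000, 2000, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Demo with a stub that fails twice with a 429 before succeeding.
let attempts = 0;
const flakyCall = async () => {
  attempts += 1;
  if (attempts < 3) {
    const err = new Error("rate limited");
    err.status = 429;
    throw err;
  }
  return "ok";
};

withRetry(flakyCall, { baseDelayMs: 10 }).then((result) => {
  console.log(result, "after", attempts, "attempts");
});
```

In the chat endpoint this would wrap the API call, for example `withRetry(() => openai.responses.create(params))`.
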
### Phase 5: Parallel Operation (Week 5)
- [ ] Run both systems in parallel for validation
- [ ] Data migration from Assistants to Responses format
- [ ] User acceptance testing
- [ ] Gradual cutover strategy

## Risk Mitigation

### 1. **API Compatibility**
- **Risk:** Breaking changes in Responses API
- **Mitigation:** Version pinning, fallback to Chat Completions API

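The Chat Completions fallback could take the form of a thin wrapper around two injected functions. This is a sketch under assumptions: `createWithFallback` is a hypothetical name, and both callbacks are expected to resolve to the assistant's text. Because Chat Completions keeps no server-side state, the fallback callback would need to rebuild the message history from the local messages table.

```javascript
// Hypothetical wrapper: try the Responses API first and fall back to
// Chat Completions if the primary call throws.
async function createWithFallback(primary, fallback) {
  try {
    return { source: "responses", text: await primary() };
  } catch (err) {
    console.warn("Responses API call failed, falling back:", err.message);
    return { source: "chat_completions", text: await fallback() };
  }
}

// With the real client, the callbacks would look roughly like:
//   primary:  () => openai.responses.create({ ... }).then((r) => r.output_text)
//   fallback: () => openai.chat.completions.create({ model, messages })
//                     .then((r) => r.choices[0].message.content)

// Demo with stubs standing in for the two API calls.
const failingPrimary = async () => { throw new Error("simulated outage"); };
const stubFallback = async () => "answer from fallback";

createWithFallback(failingPrimary, stubFallback).then((result) => {
  console.log(result); // { source: 'chat_completions', text: 'answer from fallback' }
});
```
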
### 2. **Feature Gaps**
- **Risk:** Missing features from Assistants API
- **Mitigation:** Hybrid approach using Chat Completions for gaps

### 3. **Migration Timeline**
- **Risk:** Assistants API sunset before migration complete
- **Mitigation:** Aggressive timeline with parallel development

### 4. **Data Loss**
- **Risk:** Conversation history lost during migration
- **Mitigation:** Full data export and mapping strategy

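As an illustration of the mapping strategy, a legacy conversation row could be converted as sketched below. The legacy column names (`thread_id`, `assistant_id`) follow the columns this plan removes; `migrateConversationRow` and the `assistantKeyById` lookup are hypothetical. A thread ID cannot be reused as a response ID, so this sketch leaves `last_response_id` empty and returns the old thread ID separately so the thread's messages can still be exported.

```javascript
// Hypothetical mapping from a legacy (Assistants-era) conversation row to
// the new conversations schema. assistantKeyById maps OpenAI assistant IDs
// to the local assistant keys.
function migrateConversationRow(legacy, assistantKeyById) {
  const assistantKey = assistantKeyById[legacy.assistant_id];
  if (!assistantKey) {
    throw new Error(`No assistant key mapped for ${legacy.assistant_id}`);
  }
  return {
    row: {
      id: legacy.id,
      user_id: legacy.user_id,
      title: legacy.title,
      last_response_id: null, // set by the first Responses API call after migration
      assistant_key: assistantKey,
      tov_key: legacy.tov_key || "standard",
    },
    legacyThreadId: legacy.thread_id, // kept for exporting the thread's messages
  };
}

const migrated = migrateConversationRow(
  { id: "conv_1", user_id: "user_1", title: "Ideas", thread_id: "thread_xxx", assistant_id: "asst_xxx" },
  { asst_xxx: "creative_ideation" }
);
console.log(migrated);
```

Throwing on an unmapped assistant ID surfaces gaps in the lookup table during a dry run rather than after cutover.
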
## Success Metrics

### Technical Metrics:
- ✅ **Response Time:** <2s average (vs current ~5s with polling)
- ✅ **API Call Reduction:** 60% fewer calls per conversation
- ✅ **Error Rate:** <1% API errors
- ✅ **Feature Parity:** 100% current functionality maintained

### Business Metrics:
- 💰 **Cost Reduction:** 30-40% OpenAI usage costs
- 📈 **User Satisfaction:** Improved response times
- 🛠 **Developer Velocity:** Faster feature development
- 🔮 **Future-Proofing:** Ready for OpenAI's 2026+ roadmap

This migration plan ensures we transition to the Responses API while maintaining all current functionality and positioning for enhanced capabilities and cost optimization.