Worksona.js Documentation
AI Agent Management - Library & API Server
Core Library Overview
Worksona.js is a single-file JavaScript library (2,241 lines, zero runtime dependencies) for managing AI agents across multiple LLM providers. It can be used in two ways: directly as a JavaScript library, or through a REST API server.
Zero Dependencies
Single-file architecture with no external runtime dependencies
Multi-Provider
OpenAI, Anthropic, Google with automatic failover
Image Pipeline
Generate, analyze, edit, and create image variations
Observable
17 distinct events for complete visibility
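The automatic failover listed above can be pictured as trying providers in priority order until one succeeds. The sketch below is not Worksona.js's internal implementation; `chatWithFailover` and the provider objects with a `send` function are hypothetical names used only to illustrate the idea.

```javascript
// Hypothetical sketch of provider failover; not the library's actual code.
// Each provider is assumed to be { name, send(prompt) => Promise<string> }.
async function chatWithFailover(providers, prompt) {
  const failures = [];
  for (const provider of providers) {
    try {
      const text = await provider.send(prompt);
      return { provider: provider.name, text };
    } catch (err) {
      // Record the failure and fall through to the next provider.
      failures.push(`${provider.name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed (${failures.join('; ')})`);
}

// Stand-in providers for demonstration: the first one always fails.
const demoProviders = [
  { name: 'openai', send: async () => { throw new Error('timeout'); } },
  { name: 'anthropic', send: async (prompt) => `echo: ${prompt}` },
];
```

Calling `chatWithFailover(demoProviders, 'hi')` skips the failing first provider and resolves with the second provider's answer. If every provider fails, the thrown error lists each failure in order.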
Architecture Overview
[Diagram: Worksona.js connects to three providers, OpenAI (GPT-5, GPT-4o, o3), Anthropic (Claude Opus 4.5), and Google (Gemini Pro), and exposes four capabilities: Chat & Query, Image Processing, File Processing, and Metrics & History.]
Latest Frontier Models
Worksona.js provides first-class support for the latest and most powerful AI models:
- GPT-5 series (gpt-5, gpt-5-mini, gpt-5-nano) - OpenAI's latest flagship models
- Claude Opus 4.5 - Anthropic's most capable model with 200K context
- o3 & o3-mini - Advanced reasoning models for complex problem-solving
- GPT-4o - Multimodal model with vision and voice capabilities
- Full backward compatibility with GPT-4, Claude 3.5, and earlier models
Event-Driven Architecture
The library emits 17 distinct events for complete observability:
// Core events
worksona.on('agent-loaded', (data) => console.log('Agent ready'));
worksona.on('chat-start', (data) => console.log('Processing...'));
worksona.on('chat-complete', (data) => console.log('Response:', data));
// Image events
worksona.on('image-generation-start', (data) => console.log('Generating image...'));
worksona.on('image-generation-complete', (data) => console.log('Image ready:', data));
worksona.on('image-analysis-complete', (data) => console.log('Analysis:', data));
// Error handling
worksona.on('error', (error) => console.error('Error:', error));
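Because every event is delivered through the same `on` method, simple monitoring can be built by counting events. The following is a minimal sketch that uses only the event names shown above and makes no assumption about the payload each handler receives. `DemoEmitter` is a hypothetical stand-in so the example runs on its own; in real use the `worksona` object would be passed instead.

```javascript
// Minimal stand-in emitter (hypothetical); only `on` is needed from worksona.
class DemoEmitter {
  constructor() {
    this.handlers = {};
  }
  on(event, fn) {
    if (!this.handlers[event]) this.handlers[event] = [];
    this.handlers[event].push(fn);
  }
  emit(event, data) {
    (this.handlers[event] || []).forEach((fn) => fn(data));
  }
}

// Counts chat requests and errors using the event names from this section.
function trackChatActivity(emitter) {
  const stats = { started: 0, completed: 0, errors: 0 };
  emitter.on('chat-start', () => { stats.started += 1; });
  emitter.on('chat-complete', () => { stats.completed += 1; });
  emitter.on('error', () => { stats.errors += 1; });
  return stats;
}

const demo = new DemoEmitter();
const stats = trackChatActivity(demo);
demo.emit('chat-start', {});
demo.emit('chat-complete', {});
console.log(stats); // { started: 1, completed: 1, errors: 0 }
```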
Image Processing Pipeline
Complete image workflow powered by OpenAI's DALL-E 3 and GPT-4o vision:
- Generate - Create images from text descriptions
- Analyze - Extract insights from images using vision AI
- Edit - Modify existing images with text prompts
- Variations - Create variations of existing images
Real-Time Control Panel
Built-in visual control panel for development and debugging:
- View all loaded agents and their status
- Monitor API provider connectivity
- Inspect agent metrics and history
- Test chat and image operations
- Floating or embedded mode
Delegators Explained
Delegators are the orchestration layer in Worksona.js that enable multi-agent workflows. A delegator agent can break down complex tasks and delegate subtasks to specialized agents, creating powerful AI pipelines.
Multi-Agent Workflow Example
// Content creation pipeline with delegation
const pipeline = async (topic) => {
  // Step 1: Research
  const research = await worksona.chat('research-analyst',
    `Research key points about: ${topic}`);

  // Step 2: Write content
  const draft = await worksona.chat('content-writer',
    `Write article based on: ${research}`);

  // Step 3: Edit
  const edited = await worksona.chat('editor-agent',
    `Edit and improve: ${draft}`);

  // Step 4: Fact-check
  const verified = await worksona.chat('research-analyst',
    `Verify facts in: ${edited}`);

  return verified;
};

// Execute the pipeline
const article = await pipeline('Quantum Computing');
Common Delegation Patterns
- Research → Write → Edit - Content creation workflows
- Analyze → Extract → Summarize - Document processing
- Classify → Route → Respond - Customer support automation
- Generate → Review → Refine - Creative workflows
- Parse → Validate → Transform - Data processing pipelines
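Each of these patterns is a chain in which one agent's output becomes the next agent's input, so the chain can be described as data instead of hand-written `await` calls. The sketch below assumes a `chat(agentId, message)` function that behaves like `worksona.chat` in the example above and resolves to a string. `runPipeline`, the `{ agent, prompt }` step format, and `fakeChat` are hypothetical names introduced for illustration.

```javascript
// Runs a chain of agents, feeding each agent's output into the next prompt.
// `chat` is expected to behave like worksona.chat(agentId, message).
async function runPipeline(chat, steps, input) {
  let current = input;
  for (const step of steps) {
    current = await chat(step.agent, step.prompt(current));
  }
  return current;
}

// The "Research, Write, Edit" pattern expressed as a list of steps.
const contentSteps = [
  { agent: 'research-analyst', prompt: (topic) => `Research key points about: ${topic}` },
  { agent: 'content-writer', prompt: (notes) => `Write article based on: ${notes}` },
  { agent: 'editor-agent', prompt: (draft) => `Edit and improve: ${draft}` },
];

// Stand-in for worksona.chat so the sketch runs without API keys.
const fakeChat = async (agentId, message) => `[${agentId}] ${message}`;
```

With the real library, the call would look like `runPipeline((id, msg) => worksona.chat(id, msg), contentSteps, 'Quantum Computing')`. Keeping the steps as data makes it easy to reorder them or swap one agent for another.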
Use Cases for Delegation
Content Production
Research, writing, editing, and fact-checking pipelines
Document Processing
OCR, extraction, analysis, and summarization workflows
Customer Support
Intent detection, routing, response generation, and QA
Data Analysis
Collection, validation, analysis, and reporting chains
Endpoint Agents
Endpoint agents are pre-configured, specialized agents with distinct personalities, knowledge domains, and expertise. Each agent is defined by a JSON configuration file that specifies its traits, system prompt, and conversation examples.
Agent Personality System
Agents have rich personality configurations that shape their behavior:
{
  "id": "marketing-agent",
  "name": "Marketing Strategist",
  "description": "Expert in marketing strategy and brand positioning",
  "config": {
    "provider": "openai",
    "model": "gpt-4o",
    "temperature": 0.7,
    "systemPrompt": "You are a marketing strategist...",
    "traits": {
      "personality": [
        "Creative and strategic",
        "Data-driven decision maker",
        "Brand-focused"
      ],
      "knowledge": [
        "Marketing strategy",
        "Brand positioning",
        "Customer psychology",
        "Digital marketing"
      ],
      "tone": "Professional yet enthusiastic"
    },
    "examples": [
      {
        "user": "How do we improve brand awareness?",
        "assistant": "Let's develop a multi-channel strategy..."
      }
    ]
  }
}
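When agent definitions are written by hand, a quick structural check before loading can catch typos early. The function below is a sketch, not part of Worksona.js: `validateAgentConfig` is a hypothetical helper, and the choice of which fields to treat as mandatory is an assumption based only on the example configuration above.

```javascript
// Returns a list of problems found in an agent configuration object.
// Which fields are mandatory is an assumption drawn from the example above.
function validateAgentConfig(agent) {
  if (!agent || typeof agent !== 'object') return ['agent must be an object'];
  const problems = [];
  const isNonEmptyString = (value) => typeof value === 'string' && value !== '';

  if (!isNonEmptyString(agent.id)) problems.push('id must be a non-empty string');
  if (!isNonEmptyString(agent.name)) problems.push('name must be a non-empty string');

  const config = agent.config || {};
  for (const key of ['provider', 'model', 'systemPrompt']) {
    if (!isNonEmptyString(config[key])) {
      problems.push(`config.${key} must be a non-empty string`);
    }
  }
  // Temperature is optional here, but must be a non-negative number if given.
  const t = config.temperature;
  if (t !== undefined && (typeof t !== 'number' || t < 0)) {
    problems.push('config.temperature must be a non-negative number');
  }
  return problems;
}
```

An empty array means no problems were found. A caller could log the returned messages and skip loading any agent whose list is not empty.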
Available Pre-Configured Agents
Research Analyst
Expert in research, analysis, and fact-checking
Content Writer
Specialized in creating engaging written content
Legal Agent
Knowledgeable in legal analysis and compliance
Marketing Agent
Expert in marketing strategy and campaigns
PRD Editor
Specialized in product requirement documents
Interviewer
Conducts structured interviews and assessments
Creating Custom Agents
You can create custom agents programmatically or load them from JSON files:
// Programmatic creation
await worksona.loadAgent({
  id: 'support-bot',
  name: 'Customer Support Bot',
  config: {
    provider: 'anthropic',
    model: 'claude-opus-4-5-20251101',
    temperature: 0.7,
    systemPrompt: 'You are a helpful customer support agent...',
    traits: {
      personality: ['Helpful', 'Patient', 'Empathetic'],
      knowledge: ['Product documentation', 'Troubleshooting', 'FAQs'],
      tone: 'Friendly and professional'
    }
  }
});

// Load from JSON file
const response = await fetch('./agents/custom-agent.json');
if (!response.ok) {
  // Fail early instead of passing an error page to loadAgent
  throw new Error(`Could not load agent config: HTTP ${response.status}`);
}
await worksona.loadAgent(await response.json());
REST API Server
Worksona.js includes a full-featured Express-based REST API server that exposes all library functionality via HTTP endpoints. The server provides 32+ endpoints organized into logical categories.
Key API Features
File Processing
Upload and process images, PDFs, DOCX, XLSX, CSV files
OCR Capabilities
Extract text from images using Tesseract
Document Parsing
Parse PDF, DOCX, XLSX with specialized libraries
Agent-Scoped Routing
Pattern: /api/agents/:agentId/:action/:object
Batch Processing
Process up to 10 queries in parallel
Webhook Support
Integrate with external services via webhooks
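The agent-scoped routing pattern can be filled in with a small helper that escapes each path segment. This is only a sketch: `agentRoute` is a hypothetical function, and the `analyze` and `image` values are placeholder segments chosen for illustration, not confirmed endpoint names.

```javascript
// Fills in the /api/agents/:agentId/:action/:object pattern.
// Each segment is URL-encoded so unusual IDs cannot break the path.
function agentRoute(agentId, action, object) {
  const parts = [agentId, action, object].map((part) => {
    if (typeof part !== 'string' || part === '') {
      throw new TypeError('route segments must be non-empty strings');
    }
    return encodeURIComponent(part);
  });
  return `/api/agents/${parts.join('/')}`;
}

console.log(agentRoute('marketing-agent', 'analyze', 'image'));
// prints "/api/agents/marketing-agent/analyze/image"
```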
Endpoint Categories (32+ endpoints)
- Agent Management - Load, list, get, delete, chat
- Query & Chat - Generic query, agent query, batch processing
- File Processing - Upload, analyze, process with schema
- Image Operations - Analyze with vision, generate with DALL-E
- Document Processing - OCR, analyze documents
- Tool System - DALL-E, web scraper, text-to-speech
- Convenience Endpoints - Quick translate, ask shortcuts
- Slash Commands - /ocr, /summarize, /translate, /extract-data
- Webhooks - Incoming webhook handlers
Rate Limiting & Security
- 100 requests per 15 minutes on /api routes
- Optional API key authentication (X-API-Key header)
- File size limit: 10MB per upload
- Max 5 files per request
- CORS enabled for development
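The upload limits above can be checked on the client before a request is sent, which avoids spending part of the rate-limit budget on uploads the server would reject. In this sketch, `checkUploads` is a hypothetical helper, and treating 10MB as 10 * 1024 * 1024 bytes is an assumption, since the server's exact definition is not stated.

```javascript
// Limits taken from the list above. Interpreting "10MB" as 10 MiB is an
// assumption; adjust MAX_FILE_BYTES if the server counts differently.
const MAX_FILES = 5;
const MAX_FILE_BYTES = 10 * 1024 * 1024;

// `files` is an array of objects with `name` and `size` (in bytes),
// which matches the shape of browser File objects.
function checkUploads(files) {
  const errors = [];
  if (files.length > MAX_FILES) {
    errors.push(`too many files: ${files.length} > ${MAX_FILES}`);
  }
  for (const file of files) {
    if (file.size > MAX_FILE_BYTES) {
      errors.push(`${file.name} exceeds the 10MB limit`);
    }
  }
  return errors;
}
```

An empty result means the batch is within both limits. Otherwise, the messages can be shown to the user before any request is made.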
Quick Example
// Simple query via REST API
const response = await fetch('http://localhost:3000/api/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    agent: 'research-analyst',
    query: 'What is quantum computing?',
    options: { temperature: 0.7 }
  })
});
const data = await response.json();
console.log(data.data.response);
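In practice a client will usually want error handling and the optional API key around this call. The wrapper below is a sketch under stated assumptions: `queryAgent` and its `fetchImpl` parameter are hypothetical names. The request body and the `data.response` shape follow the example above, and the `X-API-Key` header follows the security list.

```javascript
// Posts a query and resolves to the response text, or throws on HTTP errors.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function queryAgent({ baseUrl, agent, query, options = {}, apiKey, fetchImpl = fetch }) {
  const headers = { 'Content-Type': 'application/json' };
  if (apiKey) headers['X-API-Key'] = apiKey; // only sent when a key is supplied

  const response = await fetchImpl(`${baseUrl}/api/query`, {
    method: 'POST',
    headers,
    body: JSON.stringify({ agent, query, options }),
  });
  if (!response.ok) {
    throw new Error(`Query failed with HTTP ${response.status}`);
  }
  const payload = await response.json();
  return payload.data.response; // shape taken from the example above
}
```

A typical call would be `await queryAgent({ baseUrl: 'http://localhost:3000', agent: 'research-analyst', query: 'What is quantum computing?' })`. Checking `response.ok` first means that rate-limit and authentication failures surface as errors rather than as a confusing `undefined` result.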
Tooling Ecosystem
Worksona.js includes an extensible tool system that adds specialized capabilities beyond basic chat. Tools can be used standalone or enhanced by agents for improved results.
Tool System Architecture
[Diagram: two access paths, Direct Tool (/api/tools/:tool/:action) and Agent-Enhanced Tool (/api/agents/:id/tools/:tool/:action), both reach the three available tools: DALL-E 3 (image generation), Web Scraper (content extraction), and Text-to-Speech (6 voices). The agent-enhanced path adds prompt enhancement for DALL-E 3 and content analysis for the Web Scraper.]
Built-In Tools
DALL-E 3 Image Generator
Actions: generate, edit, variations
Agent Enhancement: Marketing agents can transform "logo" into detailed professional prompts
Web Scraper
Actions: fetch (extract text), extract (structured data)
Agent Enhancement: Research agents can analyze and summarize scraped content
Text-to-Speech
Actions: speak, generate
Voices: alloy, echo, fable, onyx, nova, shimmer
Direct vs Agent-Enhanced Tools
// Direct tool access - basic functionality
GET /api/tools/dalle/generate?prompt=sunset
// Returns: Basic sunset image

// Agent-enhanced tool access - improved results
GET /api/agents/marketing-agent/tools/dalle/generate?prompt=logo
// Agent transforms "logo" into:
// "Create a modern, professional logo design that embodies innovation...
// using a sophisticated color palette of navy blue and silver..."
// Returns: Professional, detailed logo
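Since the two route shapes in this section differ only in their prefix, one helper can build either URL. The following is a sketch: `toolUrl` is a hypothetical function, not part of Worksona.js. It relies only on the two path patterns in this section and on the standard `URLSearchParams` class for query encoding.

```javascript
// Builds a tool URL for either route shape described in this section.
// Pass `agentId` to get the agent-enhanced variant.
function toolUrl({ tool, action, params = {}, agentId }) {
  const prefix = agentId
    ? `/api/agents/${encodeURIComponent(agentId)}/tools`
    : '/api/tools';
  const path = `${prefix}/${encodeURIComponent(tool)}/${encodeURIComponent(action)}`;
  const query = new URLSearchParams(params).toString();
  return query ? `${path}?${query}` : path;
}

console.log(toolUrl({ tool: 'dalle', action: 'generate', params: { prompt: 'sunset' } }));
// prints "/api/tools/dalle/generate?prompt=sunset"
```

Adding `agentId: 'marketing-agent'` to the same call switches the result to the agent-enhanced route without changing anything else.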
Document Processing Tools
Specialized tools for document handling:
- PDF Parser - Extract text and metadata from PDF files
- DOCX Parser - Process Microsoft Word documents
- XLSX Parser - Read Excel spreadsheets
- OCR Engine - Tesseract-powered text extraction from images
- Markdown Renderer - Convert and render markdown content