Actions
Execute and monitor units of work on your connections
Actions are discrete units of work that execute on connections. They represent the "verbs" of VirtuousAI — the things you want to do with your connected services.
Concept
An action combines:
- Template — What to do (e.g., dlt_extract, web_search)
- Connection — Where to connect (credentials)
- Config — How to do it (parameters)
Examples:
| Template | Connection | Config |
|---|---|---|
| dlt_extract | conn_shopify | { resources: ["orders"], incremental: true } |
| web_search | conn_rest_api | { query: "customer data" } |
| call_agent | — | { kind: "call_agent", agent: "data-analyst", question: "Summarize..." } |
Action Types
Actions fall into categories based on what they do:
| Type | Description | Examples |
|---|---|---|
| Data Syncs | Pull data from external services into VirtuousAI | dlt_extract, file_sync |
| Transformations | Process and transform data | duckdb_transform |
| Integrations | Push data or call external APIs | http_request |
| Research | Search and fetch web content | web_search, fetch_page |
| AI Operations | Agent delegation and AI-powered tasks | call_agent, generate_query, explain_schema |
| Output | Generate deliverables | create_flashboard |
| Control Flow | Workflow routing and orchestration | approval_gate, conditional_branch, call_automation |
ActionKind Reference
Every action has a kind that determines its behavior. Here is the complete reference:
| Kind | Execution | Description |
|---|---|---|
| dlt_extract | Async | Extract data from external sources (Shopify, Klaviyo, etc.) to bronze layer |
| file_sync | Async | Sync files from remote storage (S3, Google Drive) |
| duckdb_transform | Async | SQL-based data transformation via DuckDB |
| http_request | Sync | Make arbitrary HTTP requests to external APIs |
| web_search | Sync | Search the web via Tavily API, returns titles/URLs/snippets |
| fetch_page | Sync | Extract readable content from web page URLs |
| generate_query | Sync | Generate and execute SQL queries against connected data schemas |
| explain_schema | Sync | Explain table structures and column meanings |
| call_agent | Async | Invoke an agent by slug for delegated work |
| agent_search | Sync | Agent-scoped web search (read-only variant) |
| agent_fetch_page | Sync | Agent-scoped page fetch (read-only variant) |
| create_flashboard | Async | Generate a persistent flashboard dashboard |
| approval_gate | Sync | Pause workflow and wait for human approval |
| conditional_branch | Sync | Evaluate conditions and route execution flow |
| call_automation | Async | Invoke another automation as a sub-workflow |
Action Lifecycle
| Status | Description | Can Transition To |
|---|---|---|
| PENDING | Created, awaiting execution | RUNNING, CANCELLED, AWAITING_APPROVAL |
| AWAITING_APPROVAL | Requires human approval | RUNNING (approved), REJECTED |
| RUNNING | Actively executing | COMPLETED, FAILED, CANCELLED |
| COMPLETED | Finished successfully | — (terminal) |
| FAILED | Encountered an error | — (terminal, can retry) |
| CANCELLED | Manually cancelled | — (terminal) |
| REJECTED | Approval denied | — (terminal) |
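As a minimal sketch, these transitions can be modeled as a lookup table; the Status type and helper below are illustrative, not part of the API:

```typescript
// Allowed transitions, mirroring the lifecycle table above.
type Status =
  | "PENDING" | "AWAITING_APPROVAL" | "RUNNING"
  | "COMPLETED" | "FAILED" | "CANCELLED" | "REJECTED"

const TRANSITIONS: Record<Status, Status[]> = {
  PENDING: ["RUNNING", "CANCELLED", "AWAITING_APPROVAL"],
  AWAITING_APPROVAL: ["RUNNING", "REJECTED"],
  RUNNING: ["COMPLETED", "FAILED", "CANCELLED"],
  COMPLETED: [], // terminal
  FAILED: [],    // terminal (a retry starts a new run)
  CANCELLED: [], // terminal
  REJECTED: [],  // terminal
}

// True only for transitions the table above permits.
const canTransition = (from: Status, to: Status) => TRANSITIONS[from].includes(to)
```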
Execution Model
VirtuousAI uses two execution modes:
| Mode | When Used | Characteristics |
|---|---|---|
| SYNC | Fast operations (under 30s) | Inline execution, immediate response |
| ASYNC_QUEUE | Long-running operations | SQS + Dramatiq workers, lease-based |
Lease-Based Ownership
For ASYNC_QUEUE operations:
- Lease Duration: 90 seconds
- Heartbeat Interval: Every 30 seconds
- Watchdog Grace: 180 seconds before marking abandoned
If a worker crashes, the watchdog detects the stale lease and marks the run as failed, preventing zombie jobs.
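To illustrate the watchdog's decision, here is a sketch using the constants above (the run record's field names are assumptions):

```typescript
const WATCHDOG_GRACE_SECONDS = 180

// Hypothetical shape of a leased run; the real schema may differ.
interface LeasedRun {
  runId: string
  lastHeartbeatAt: Date // refreshed every 30s by a healthy worker
}

// A run is abandoned once no heartbeat has arrived within the grace window.
function isAbandoned(run: LeasedRun, now: Date = new Date()): boolean {
  const silentSeconds = (now.getTime() - run.lastHeartbeatAt.getTime()) / 1000
  return silentSeconds > WATCHDOG_GRACE_SECONDS
}
```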
ActionRun Structure
Each execution creates an ActionRun — an immutable record:
```json
{
  "id": "run_xyz789",
  "actionId": "action_abc123",
  "status": "COMPLETED",
  "startedAt": "2026-01-22T14:30:00Z",
  "completedAt": "2026-01-22T14:32:15Z",
  "result": {
    "recordsProcessed": 1250,
    "bytesTransferred": 2100000,
    "tables": ["orders", "order_line_items"]
  },
  "artifacts": [
    { "name": "orders.parquet", "size": 1500000 }
  ]
}
```

ActionRuns are immutable. Even if an action is deleted, historical runs are preserved for auditing.
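A polling sketch that waits for a terminal status; the GET endpoint path is an assumption modeled on the streaming endpoint documented below:

```typescript
// Statuses from the lifecycle table that end a run.
const TERMINAL = new Set(["COMPLETED", "FAILED", "CANCELLED", "REJECTED"])

// NOTE: the endpoint path is illustrative, not confirmed by this page.
async function waitForRun(runId: string, intervalMs = 5000) {
  for (;;) {
    const res = await fetch(`/api/v1/action-runs/${runId}`)
    const run = await res.json()
    if (TERMINAL.has(run.status)) return run
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }
}
```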
Creating Actions
Data extraction (dlt_extract):

```json
{
  "kind": "dlt_extract",
  "connectionRef": { "slug": "shopify" },
  "definition": {
    "source": "shopify",
    "resources": ["orders", "products"],
    "incremental": true,
    "start_date": "2026-01-01"
  }
}
```

Transformation (duckdb_transform):

```json
{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "stg_shopify_customers",
    "sql": "SELECT id as customer_id, email, ... FROM {{ bronze_customers }} QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1",
    "inputs": [
      { "bronze_dataset": "bronze.shopify.customers" }
    ],
    "output_layer": "silver",
    "mode": "full_refresh"
  }
}
```

Agent delegation (call_agent):

```json
{
  "kind": "call_agent",
  "definition": {
    "kind": "call_agent",
    "agent": "data-analyst",
    "question": "Summarize the key trends in this quarter's sales data"
  }
}
```

DuckDB Transform Actions
The duckdb_transform action type enables SQL-based data transformations using DuckDB. It's the primary way to move data through the medallion architecture.
Definition Schema
| Field | Type | Required | Description |
|---|---|---|---|
kind | "duckdb_transform" | Yes | Action type |
transform_name | string | Yes | Output table name (e.g., stg_shopify_customers) |
sql | string | Yes | DuckDB SQL with {{ variable }} placeholders |
inputs | array | Yes | List of input datasets |
output_layer | "silver" | "gold" | No | Target layer (default: silver) |
mode | "full_refresh" | "incremental_merge" | No | Write mode (default: full_refresh) |
merge_keys | array | No | Primary keys for incremental merge |
Input References
Inputs can reference bronze or silver datasets:
| Input Type | Format | Resolves To |
|---|---|---|
| Bronze | { "bronze_dataset": "bronze.shopify.customers" } | read_parquet('s3://.../*.parquet') |
| Silver | { "silver_dataset": "stg_shopify_orders" } | delta_scan('s3://...') |
The {{ variable }} in your SQL gets replaced with the appropriate DuckDB function.
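As a sketch of what that substitution amounts to (the resolver below is illustrative; the scan strings follow the table above):

```typescript
// Replace each {{ name }} placeholder with the scan its input resolves to.
function resolvePlaceholders(sql: string, scans: Record<string, string>): string {
  return sql.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, name: string) => {
    const scan = scans[name]
    if (!scan) throw new Error(`Unknown input: ${name}`)
    return scan
  })
}

// Example: a bronze input becomes a read_parquet() call.
resolvePlaceholders(
  "SELECT * FROM {{ bronze_customers }}",
  { bronze_customers: "read_parquet('s3://.../*.parquet')" },
)
// => "SELECT * FROM read_parquet('s3://.../*.parquet')"
```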
Write Modes
| Mode | Behavior | Use When |
|---|---|---|
| full_refresh | Replaces entire table | Small tables, schema changes |
| incremental_merge | MERGE on merge_keys | Large tables, append-heavy workloads |
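For contrast with the full_refresh examples below, a hedged sketch of an incremental merge (the table and key names are illustrative):

```json
{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "stg_shopify_orders",
    "sql": "SELECT id as order_id, customer_id, updated_at FROM {{ bronze_orders }}",
    "inputs": [{ "bronze_dataset": "bronze.shopify.orders" }],
    "output_layer": "silver",
    "mode": "incremental_merge",
    "merge_keys": ["order_id"]
  }
}
```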
Example: Silver Staging Table
```json
{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "stg_shopify_customers",
    "sql": "SELECT id as customer_id, email, first_name, last_name, CAST(total_spent AS DECIMAL(18,2)) as total_spent, created_at as created_at_utc FROM {{ bronze_customers }} QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1",
    "inputs": [{ "bronze_dataset": "bronze.shopify.customers" }],
    "output_layer": "silver",
    "mode": "full_refresh"
  }
}
```

Example: Gold Fact Table
```json
{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "fct_order_lines",
    "sql": "SELECT li.line_item_id, o.order_id, o.customer_id, li.quantity, li.unit_price FROM {{ stg_shopify_order_line_items }} li JOIN {{ stg_shopify_orders }} o ON li.order_id = o.order_id",
    "inputs": [
      { "silver_dataset": "stg_shopify_order_line_items" },
      { "silver_dataset": "stg_shopify_orders" }
    ],
    "output_layer": "gold",
    "mode": "full_refresh"
  }
}
```

Learn more about the bronze/silver/gold architecture in Data Pipeline Concepts.
Connection Resolution
Actions resolve connections flexibly:
| Reference Type | Example | Resolution |
|---|---|---|
| By ID | { "id": "conn_abc123" } | Exact match |
| By Slug | { "slug": "shopify" } | LLM-friendly name lookup |
| By Provider | { "provider": "shopify" } | Uses org's default for provider |
Progress Tracking
For data extraction actions, the system provides detailed progress tracking with per-resource and per-slice visibility.
Progress Schema (v2)
Long-running extractions report progress in a structured format:
```json
{
  "schema_version": 2,
  "resources_order": ["profiles", "events", "lists"],
  "completed_resources": ["profiles"],
  "failed_resources": [],
  "in_progress_resource": "events",
  "resource_cursors": {
    "events": {
      "slices_completed": 127,
      "slices_total": 384
    }
  }
}
```

| Field | Description |
|---|---|
| schema_version | Always 2 for new runs |
| resources_order | Ordered list of resources to extract |
| completed_resources | Resources that finished successfully |
| failed_resources | Resources that encountered errors |
| in_progress_resource | Currently extracting resource (or null) |
| resource_cursors | Slice-level progress for large resources |
Calculating Global Progress
To compute overall percentage:
```typescript
// progress: the schema_version 2 payload shown above.
const totalResources = progress.resources_order.length
const doneCount = progress.completed_resources.length + progress.failed_resources.length
// Fractional credit for the resource currently in flight, if any.
const cursor = progress.resource_cursors[progress.in_progress_resource ?? ""]
const activePercent = cursor?.slices_total
  ? cursor.slices_completed / cursor.slices_total
  : 0
const globalPercent = ((doneCount + activePercent) / totalResources) * 100
```

Partial Success
Extractions can complete with partial success when some resources succeed and others fail:
| Status | Condition | Action |
|---|---|---|
| COMPLETED | All resources succeeded | None needed |
| COMPLETED (partial) | Some succeeded, some failed | Review failed_resources |
| FAILED | All resources failed | Check error details, retry |
Partial success allows you to use successfully extracted data while investigating failures in specific resources.
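A small sketch of acting on that table (the progress shape is the v2 schema above; the helper is illustrative):

```typescript
// Decide what to do after an extraction finishes, per the table above.
function reviewExtraction(
  status: string,
  progress: { completed_resources: string[]; failed_resources: string[] },
): string[] {
  if (status === "FAILED") throw new Error("All resources failed; check error details and retry")
  if (progress.failed_resources.length > 0) {
    console.warn("Partial success; investigate:", progress.failed_resources)
  }
  return progress.completed_resources // safe to use downstream
}
```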
Streaming Execution
For long-running actions, subscribe to real-time progress via Server-Sent Events:
```http
GET /api/v1/action-runs/{run_id}/logs/stream
Accept: text/event-stream
```

Events include:

- extraction_started — Extraction beginning with resource list
- resource_started — Individual resource extraction starting
- resource_completed — Resource finished with row/file counts
- slice_completed — Progress update for large resources
- run_completed — Final success with summary
- run_failed — Error details
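A minimal browser-side sketch of consuming the stream, assuming each SSE event is named as above and carries a JSON data payload (authentication is omitted):

```typescript
const runId = "run_xyz789" // from the ActionRun you started

const source = new EventSource(`/api/v1/action-runs/${runId}/logs/stream`)

// Progress updates for large resources.
source.addEventListener("slice_completed", (e) => {
  console.log("progress:", JSON.parse((e as MessageEvent).data))
})

// Terminal events: close the stream either way.
source.addEventListener("run_completed", (e) => {
  console.log("summary:", JSON.parse((e as MessageEvent).data))
  source.close()
})
source.addEventListener("run_failed", (e) => {
  console.error("error:", JSON.parse((e as MessageEvent).data))
  source.close()
})
```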
Error Handling
When actions fail, the run includes detailed error information:
```json
{
  "status": "FAILED",
  "error": {
    "code": "CONNECTION_ERROR",
    "message": "Failed to connect to Shopify API",
    "details": {
      "statusCode": 401,
      "shopifyError": "Invalid API key"
    },
    "retryable": true
  }
}
```

Error Codes
| Code | Description | Retryable |
|---|---|---|
| CONNECTION_ERROR | Failed to connect to external service | Usually yes |
| AUTH_ERROR | Authentication/authorization failed | No (fix credentials) |
| RATE_LIMITED | External API rate limit hit | Yes (with backoff) |
| DATA_ERROR | Invalid or corrupt data | No (fix source data) |
| TIMEOUT | Execution exceeded time limit | Sometimes |
| INTERNAL_ERROR | Unexpected system error | Yes |
Retry Behavior
Retryable errors are automatically retried with exponential backoff. Non-retryable errors require manual intervention.
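Server-side retries are automatic; if you also retry client-side, a sketch keyed off the retryable flag might look like this (delays and cap are assumptions, not documented values):

```typescript
// Retry only errors marked retryable, with exponential backoff.
async function withRetries<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn()
    } catch (err: unknown) {
      const retryable = (err as { error?: { retryable?: boolean } })?.error?.retryable === true
      if (!retryable || attempt >= maxAttempts) throw err
      const delayMs = Math.min(1000 * 2 ** (attempt - 1), 30_000) // 1s, 2s, 4s, ... capped at 30s
      await new Promise((resolve) => setTimeout(resolve, delayMs))
    }
  }
}
```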
Rate Limiting
Data extraction actions use pre-emptive rate limiting to avoid hitting vendor API limits:
| Source | Strategy | Details |
|---|---|---|
| Amazon SP-API | 65s intervals | Reports API has strict 1/min sustained limits |
| Klaviyo | Per-endpoint buckets | 0.1s for most endpoints, 0.02s for events |
This prevents 429 errors by spacing requests to stay under vendor limits. See Data Sources Guide for details.
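As a sketch of the spacing strategy (the intervals come from the table; the limiter itself is illustrative):

```typescript
// Pre-emptive limiter: space calls at a fixed minimum interval
// so the vendor's quota is never exceeded in the first place.
class IntervalLimiter {
  private nextAllowedAt = 0
  constructor(private readonly intervalMs: number) {}

  async acquire(): Promise<void> {
    const now = Date.now()
    const waitMs = Math.max(0, this.nextAllowedAt - now)
    this.nextAllowedAt = Math.max(now, this.nextAllowedAt) + this.intervalMs
    if (waitMs > 0) await new Promise((resolve) => setTimeout(resolve, waitMs))
  }
}

// Amazon SP-API reports: at most one request every 65 seconds.
const spApiReports = new IntervalLimiter(65_000)
```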
Resource Ordering
When extracting multiple resources, VirtuousAI uses size-optimized ordering by default:
| Behavior | When |
|---|---|
| Size-optimized (XS→XL) | No resources specified, or defaults used |
| User-specified order | Resources explicitly listed in definition |
This ensures fast resources complete first, providing partial results quickly. See Data Sources Guide for details.
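A sketch of that default (the size classes are taken from the XS→XL notation above; the shape of a resource entry is assumed):

```typescript
const SIZE_ORDER = ["XS", "S", "M", "L", "XL"] as const
type SizeClass = (typeof SIZE_ORDER)[number]

// Smallest-first, so quick resources land before the heavyweights.
function orderBySize<R extends { size: SizeClass }>(resources: R[]): R[] {
  return [...resources].sort(
    (a, b) => SIZE_ORDER.indexOf(a.size) - SIZE_ORDER.indexOf(b.size),
  )
}
```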
Best Practices
- Use incremental syncs — When possible, sync only new/changed data to reduce execution time
- Set appropriate timeouts — Configure timeouts based on expected data volume
- Monitor runs — Set up alerts for failed actions, especially in automations
- Test with small datasets — Validate action configuration before running on full data
- Use streaming for long runs — Subscribe to SSE for real-time progress on lengthy operations
OpenAPI Reference
For detailed endpoint schemas, request/response formats, and authentication, refer to the OpenAPI specification.