This guide walks through creating a custom S4E playbook from scratch, covering design principles, step-by-step authoring, testing, and deployment.

Design Principles

Before writing a playbook, consider the following:

  • Modularity --- Break complex workflows into reusable actions. A playbook should orchestrate actions, not contain inline logic.
  • Idempotency --- Steps should be safe to re-execute. If a playbook is retried after a partial failure, completed steps should not cause duplicate side effects.
  • Fail-safe defaults --- When in doubt, abort rather than continue. Use explicit on_failure handling for every critical step.
  • Least privilege --- Request only the permissions each action needs. Avoid admin-scoped actions when read-write is sufficient.
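
The idempotency principle can be illustrated with a short Python sketch. It assumes a hypothetical ticket-creating action that uses the finding ID as an idempotency key, so a retried run returns the existing ticket instead of creating a duplicate. The names TicketStore and create_ticket_once are illustrative and are not part of the S4E API.

```python
from dataclasses import dataclass, field


@dataclass
class TicketStore:
    """In-memory stand-in for an external ticketing system (hypothetical)."""
    tickets: dict[str, str] = field(default_factory=dict)

    def create_ticket_once(self, finding_id: str, summary: str) -> str:
        # The finding ID doubles as the idempotency key: if a ticket already
        # exists for it, return that ticket instead of creating a new one.
        if finding_id in self.tickets:
            return self.tickets[finding_id]
        ticket_key = f"SEC-{len(self.tickets) + 1}"
        self.tickets[finding_id] = ticket_key
        return ticket_key


store = TicketStore()
first = store.create_ticket_once("f-42", "Open S3 bucket")
retry = store.create_ticket_once("f-42", "Open S3 bucket")  # simulated retry
```

Both calls return the same ticket key, and the store holds a single ticket, which is the behavior a retried playbook step should have.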

Step 1 --- Define the Trigger

Decide what event starts the playbook. Common patterns:

# Triggered by a new critical finding
trigger:
  event: finding.new
  conditions:
    severity: critical

# Triggered on a schedule (08:00 every Monday)
trigger:
  event: schedule
  cron: "0 8 * * 1"

# Triggered manually
trigger:
  event: manual

Tip

Start with manual triggers during development, then switch to event-based triggers once the playbook is tested.
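
To read the schedule trigger's cron value, it can help to split it into named fields. The Python sketch below assumes the platform uses the standard five-field cron format; the helper name describe_cron is illustrative, and the time zone the schedule runs in is not specified here.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict[str, str]:
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("0 8 * * 1")
# minute 0, hour 8, day_of_week 1: in standard cron this is 08:00 every Monday
```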

Step 2 --- Add Steps

Build the workflow by adding steps sequentially. Each step must have a unique id and a type.

Basic Linear Workflow

steps:
  - id: notify
    name: Notify Security Team
    type: action
    action_ref: act-slack-notify
    parameters:
      channel: "#security-alerts"
      message: "New critical finding: {{ finding.title }}"
    on_success: create-ticket

  - id: create-ticket
    name: Create JIRA Ticket
    type: action
    action_ref: act-jira-create
    parameters:
      project: SEC
      summary: "{{ finding.title }}"
      description: "{{ finding.description }}"
      priority: Critical
    on_success: end
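
A linear workflow like the one above is a chain of on_success links. The Python sketch below follows that chain and enforces unique step IDs. It assumes execution starts at a named step and that "end" is a terminal keyword; how the S4E engine resolves the order internally is not documented here, and execution_order is an illustrative name.

```python
def execution_order(steps: list[dict], start: str) -> list[str]:
    """Follow on_success links from `start` until "end" or a missing link."""
    by_id = {step["id"]: step for step in steps}
    if len(by_id) != len(steps):
        raise ValueError("step ids must be unique")
    order: list[str] = []
    current = start
    while current and current != "end":
        if current in order:
            raise ValueError(f"cycle detected at step {current!r}")
        order.append(current)
        current = by_id[current].get("on_success")
    return order


steps = [
    {"id": "notify", "type": "action", "on_success": "create-ticket"},
    {"id": "create-ticket", "type": "action", "on_success": "end"},
]
order = execution_order(steps, "notify")
```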

Adding Conditions

Use condition steps to branch the workflow based on runtime data:

steps:
  - id: check-environment
    name: Check If Production
    type: condition
    expression: "'production' in {{ finding.asset.tags }}"
    on_true: immediate-response
    on_false: standard-response

  - id: immediate-response
    name: Immediate P1 Response
    type: action
    action_ref: act-pagerduty-alert
    parameters:
      service: production
      urgency: high

  - id: standard-response
    name: Standard Notification
    type: action
    action_ref: act-email-notify
    parameters:
      to: security-team@example.com  # placeholder address
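
The branching behavior of a condition step can be sketched in Python. The example assumes the expression has already been evaluated to a boolean; the functions next_step and is_production are illustrative, and the real expression evaluator may behave differently.

```python
def next_step(condition_step: dict, result: bool) -> str:
    """Pick the branch target for a condition step given its evaluated result."""
    return condition_step["on_true"] if result else condition_step["on_false"]


def is_production(finding: dict) -> bool:
    # Python equivalent of: 'production' in {{ finding.asset.tags }}
    return "production" in finding.get("asset", {}).get("tags", [])


check = {
    "id": "check-environment",
    "on_true": "immediate-response",
    "on_false": "standard-response",
}
prod_finding = {"asset": {"tags": ["production", "web"]}}
dev_finding = {"asset": {"tags": ["staging"]}}

prod_target = next_step(check, is_production(prod_finding))
dev_target = next_step(check, is_production(dev_finding))
```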

Adding Parallel Steps

Execute multiple actions simultaneously to reduce total workflow time:

steps:
  - id: parallel-triage
    name: Parallel Triage Actions
    type: parallel
    branches:
      - id: scan-related
        action_ref: act-scan-related-assets
        parameters:
          asset_id: "{{ finding.asset.id }}"
      - id: enrich
        action_ref: act-threat-intel-lookup
        parameters:
          indicator: "{{ finding.source_ip }}"
      - id: notify
        action_ref: act-slack-notify
        parameters:
          channel: "#triage"
          message: "Triaging {{ finding.title }}"
    join: all
    on_success: review-results
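
The fan-out and join behavior of a parallel step can be approximated with Python's standard thread pool. This sketch assumes that join: all means the step waits for every branch and succeeds only if all of them succeed; the branch functions are stand-ins for real actions, and run_parallel is an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_parallel(branches: dict[str, Callable[[], str]]) -> dict[str, str]:
    """Run every branch concurrently and wait for all of them (join: all)."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {branch_id: pool.submit(fn) for branch_id, fn in branches.items()}
        # result() blocks until the branch finishes and re-raises its exception,
        # so the step only succeeds when every branch succeeds.
        return {branch_id: future.result() for branch_id, future in futures.items()}


results = run_parallel({
    "scan-related": lambda: "scan queued",
    "enrich": lambda: "no known threat intel",
    "notify": lambda: "message sent",
})
```

The total time is roughly that of the slowest branch rather than the sum of all three, which is the reason to use a parallel step.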

Step 3 --- Configure Error Handling

Retry Logic

steps:
  - id: apply-patch
    name: Apply Security Patch
    type: action
    action_ref: act-auto-patch
    parameters:
      finding_id: "{{ finding.id }}"
    on_failure: retry
    max_retries: 3
    retry_delay: 60s
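
The retry settings can be read as the loop sketched below in Python. It assumes max_retries counts retries after the initial attempt and that retry_delay is a fixed wait in seconds (60s becomes 60); the engine's actual semantics, such as backoff, may differ. The sleep function is injectable so the example runs without waiting.

```python
import time


def run_with_retries(action, max_retries: int = 3, retry_delay: float = 60.0,
                     sleep=time.sleep):
    """Call `action`; on failure retry up to `max_retries` more times."""
    attempt = 0
    while True:
        try:
            return action()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(retry_delay)


calls = {"count": 0}


def flaky_patch() -> str:
    # Fails twice, then succeeds, to simulate a transient error.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("patch service unavailable")
    return "patched"


delays: list[float] = []
outcome = run_with_retries(flaky_patch, max_retries=3, retry_delay=60,
                           sleep=delays.append)
```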

Fallback Steps

steps:
  - id: auto-remediate
    name: Attempt Auto Remediation
    type: action
    action_ref: act-auto-patch
    parameters:
      finding_id: "{{ finding.id }}"
    on_failure: manual-fallback

  - id: manual-fallback
    name: Manual Remediation Required
    type: action
    action_ref: act-slack-notify
    parameters:
      channel: "#incidents"
      message: "Auto-remediation failed for {{ finding.title }}. Manual action required."
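
Fallback routing amounts to choosing the next step from the outcome of the current one. The Python sketch below assumes an action signals failure by raising an exception and that a step without on_failure aborts; both are assumptions for illustration, and run_step is not an S4E function.

```python
def run_step(step: dict, actions: dict) -> str:
    """Run one step and return the id of the step to run next."""
    try:
        actions[step["action_ref"]]()
    except Exception:
        # Route to the fallback step if one is named; otherwise abort.
        return step.get("on_failure", "abort")
    return step.get("on_success", "end")


def failing_patch() -> None:
    raise RuntimeError("patch could not be applied")


actions = {"act-auto-patch": failing_patch}
auto_remediate = {
    "id": "auto-remediate",
    "action_ref": "act-auto-patch",
    "on_failure": "manual-fallback",
}
next_id = run_step(auto_remediate, actions)
```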

Global Error Handling

Set a playbook-level failure strategy and an overall timeout:

on_failure: abort
timeout_minutes: 120

Step 4 --- Use Variable Substitution

Reference trigger data and step outputs using {{ }} syntax:

parameters:
  finding_title: "{{ finding.title }}"
  asset_name: "{{ finding.asset.name }}"
  ticket_id: "{{ steps.create-ticket.output.issue_key }}"
  scan_date: "{{ scan.completed_at }}"
  current_time: "{{ now }}"

Warning

References to the outputs of steps that have not yet executed resolve to null. Reference only outputs from steps that precede the current one in the execution order.
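
The substitution rules can be sketched in Python as a dotted-path lookup over a context dictionary. This is a simplified model, not the platform's template engine: it assumes placeholders contain only dotted names and renders unresolved references as the string "null". The names lookup and render are illustrative.

```python
import re

# Matches {{ dotted.path }} where segments may contain letters, digits, _ and -
PLACEHOLDER = re.compile(r"\{\{\s*([\w.\-]+)\s*\}\}")


def lookup(context: dict, path: str):
    """Walk a dotted path through nested dicts; return None if any key is missing."""
    value = context
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def render(template: str, context: dict) -> str:
    """Replace each {{ path }} with its value, or "null" when unresolved."""
    def substitute(match: re.Match) -> str:
        value = lookup(context, match.group(1))
        return "null" if value is None else str(value)
    return PLACEHOLDER.sub(substitute, template)


context = {
    "finding": {"title": "Open S3 bucket"},
    "steps": {"create-ticket": {"output": {"issue_key": "SEC-101"}}},
}
resolved = render(
    "{{ finding.title }} -> {{ steps.create-ticket.output.issue_key }}", context
)
# A step that has not run yet has no entry in the context.
unresolved = render("{{ steps.review-results.output.status }}", context)
```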

Step 5 --- Test with Dry Run

Before deploying, test the playbook in dry-run mode:

curl -X POST "https://api.s4e.io/api/playbooks/pb-my-playbook/run" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "target_asset_id": "a-1001",
    "parameters": {},
    "dry_run": true
  }'

A dry run validates the entire workflow. It checks that:

  • All action references resolve to registered actions.
  • Parameter types match action schemas.
  • Condition expressions are syntactically valid.
  • Step flow has no unreachable or circular paths.
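
The last check, for unreachable or circular paths, is a graph traversal over the step links. The Python sketch below shows one way to run a similar check locally before calling the API. It assumes the link fields are on_success, on_failure, on_true, and on_false, and that values such as end, retry, and abort are keywords rather than step IDs; the platform's own validator may apply different rules.

```python
LINK_KEYS = ("on_success", "on_failure", "on_true", "on_false")


def find_flow_problems(steps: list[dict], start: str) -> dict[str, list[str]]:
    """Report steps unreachable from `start` and steps on any cycle found."""
    ids = {step["id"] for step in steps}
    # Only links that name another step become edges.
    graph = {
        step["id"]: [step[key] for key in LINK_KEYS if step.get(key) in ids]
        for step in steps
    }
    reachable: set[str] = set()
    on_cycle: set[str] = set()

    def visit(node: str, path: list[str]) -> None:
        if node in path:
            # Back edge: everything from the first occurrence onward is a cycle.
            on_cycle.update(path[path.index(node):])
            return
        if node in reachable:
            return
        reachable.add(node)
        for target in graph[node]:
            visit(target, path + [node])

    visit(start, [])
    return {"unreachable": sorted(ids - reachable), "circular": sorted(on_cycle)}


steps = [
    {"id": "notify", "on_success": "create-ticket"},
    {"id": "create-ticket", "on_success": "notify"},  # loops back
    {"id": "orphan", "on_success": "end"},            # nothing links here
]
problems = find_flow_problems(steps, "notify")
```

The traversal reports a cycle whenever one exists, though the circular list may not name every step on every cycle.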

Step 6 --- Create the Playbook via API

Register the playbook on the platform:

curl -X POST "https://api.s4e.io/api/playbooks" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Critical Vulnerability Response",
    "version": "1.0.0",
    "description": "Automated triage and response for critical vulnerabilities.",
    "trigger": {
      "event": "finding.new",
      "conditions": {"severity": "critical"}
    },
    "steps": [
      {
        "id": "notify",
        "name": "Notify Team",
        "type": "action",
        "action_ref": "act-slack-notify",
        "parameters": {
          "channel": "#security-alerts",
          "message": "Critical: {{ finding.title }}"
        },
        "on_success": "create-ticket"
      },
      {
        "id": "create-ticket",
        "name": "Create Ticket",
        "type": "action",
        "action_ref": "act-jira-create",
        "parameters": {
          "project": "SEC",
          "summary": "{{ finding.title }}"
        }
      }
    ],
    "on_failure": "abort",
    "timeout_minutes": 60
  }'
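
The same request can be built from Python, which avoids shell quoting issues as the definition grows. The sketch below uses only the standard library and mirrors the URL and headers of the curl call above; it builds the request without sending it, and the response format is not covered here. The helper name build_create_request is illustrative.

```python
import json
import urllib.request

API_BASE = "https://api.s4e.io/api"  # base URL taken from the curl examples


def build_create_request(playbook: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers a playbook."""
    return urllib.request.Request(
        url=f"{API_BASE}/playbooks",
        data=json.dumps(playbook).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


playbook = {
    "name": "Critical Vulnerability Response",
    "version": "1.0.0",
    "trigger": {"event": "finding.new", "conditions": {"severity": "critical"}},
    "steps": [
        {
            "id": "notify",
            "type": "action",
            "action_ref": "act-slack-notify",
            "parameters": {"channel": "#security-alerts"},
        }
    ],
    "on_failure": "abort",
    "timeout_minutes": 60,
}
request = build_create_request(playbook, "YOUR_API_KEY")
# To send it: urllib.request.urlopen(request), then inspect the response.
```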

Step 7 --- Version and Iterate

Use semantic versioning for playbook updates, and submit each new version with a PUT request to the playbook's endpoint:

curl -X PUT "https://api.s4e.io/api/playbooks/pb-critical-vuln-response" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "1.1.0",
    "steps": [ ... ]
  }'

Note

Updating a playbook version does not affect currently running executions. In-flight executions continue with the version that was active when they started.
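
A small helper can keep version bumps consistent. The Python sketch below implements plain MAJOR.MINOR.PATCH bumping; which playbook changes count as major, minor, or patch is a team convention and is assumed rather than defined by the platform. The name bump_version is illustrative.

```python
def bump_version(version: str, part: str) -> str:
    """Return `version` with the given part ("major", "minor", "patch") bumped."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


next_version = bump_version("1.0.0", "minor")
```

One possible convention: bump the patch number for parameter tweaks, the minor number for added steps, and the major number for changes to the trigger or to step IDs that other tooling references.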

Using the Playbook Editor UI

For teams that prefer a visual approach, the S4E web interface provides a drag-and-drop playbook editor:

  1. Navigate to Playbooks > Create New.
  2. Set the trigger event and conditions.
  3. Drag action, condition, parallel, and delay blocks onto the canvas.
  4. Connect steps by drawing edges between them.
  5. Configure parameters for each step in the side panel.
  6. Click Validate to check for errors.
  7. Click Save to register the playbook.

The visual editor generates the same YAML/JSON structure as the API, and you can switch between visual and code views at any time.

Best Practices

Practice                  Recommendation
Start simple              Begin with 2-3 steps and add complexity incrementally.
Test before production    Always dry-run new playbooks and test with sandbox actions.
Use meaningful IDs        Step IDs like notify-team are easier to debug than step-1.
Document your playbooks   Use the description field at both playbook and step level.
Monitor executions        Review execution logs regularly to catch silent failures.
Version control           Store playbook definitions in Git alongside your infrastructure code.

Next Steps