lambda_local_runner/docs/lambdas-md/lambda-12-step-functions.md
2026-05-11 20:13:11 -03:00

Step Functions

When Lambda alone isn't enough. Standard vs Express. Map state for fan-out. Comparison with Airflow.

When Lambda alone isn't enough

A single Lambda function works well for one discrete task. Problems start when you need to chain multiple tasks, retry selectively, wait on human approval, or fan out across thousands of items. Doing this with Lambda alone means writing orchestration logic inside your functions — tracking state, implementing retry delays, deciding what "done" means. Step Functions externalises that orchestration into a state machine where every state transition is durable, auditable, and resumable.

Reach for Step Functions when you need: sequential steps with state passing, conditional branching, parallel fan-out with join, wait states longer than 15 minutes, or retry-with-exponential-backoff built in.
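To make the built-in retry concrete, here is a minimal sketch of a two-step state machine with a `Retry` block, plus a helper that computes the waits the backoff rule implies. The state names, the `extract` and `load` function names, and the retry values are hypothetical. The ARNs are left elided.

```python
import json


def retry_delays(interval_seconds: int, backoff_rate: float, max_attempts: int) -> list[float]:
    """Seconds waited before each retry: IntervalSeconds * BackoffRate ** n."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# Hypothetical workflow: names and retry values are illustrative only.
definition = {
    "StartAt": "Extract",
    "States": {
        "Extract": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:...:function:extract",
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "BackoffRate": 2.0,
                    "MaxAttempts": 3,
                }
            ],
            "Next": "Load",  # state passing: Extract's output becomes Load's input
        },
        "Load": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:...:function:load",
            "End": True,
        },
    },
}

retry = definition["States"]["Extract"]["Retry"][0]
delays = retry_delays(retry["IntervalSeconds"], retry["BackoffRate"], retry["MaxAttempts"])
print(delays)  # [2.0, 4.0, 8.0]
print(json.dumps(definition, indent=2))
```

The retry policy lives in the definition, so neither function needs its own sleep-and-retry loop.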

Standard vs Express workflows

| | Standard | Express |
|---|---|---|
| Max duration | 1 year | 5 minutes |
| Execution semantics | Exactly-once per state | At-least-once (asynchronous); at-most-once (synchronous) |
| Execution history | Full audit trail in AWS console | CloudWatch Logs only |
| Pricing | $0.025 per 1,000 state transitions | Per request plus duration (about $1.00 per 1 million requests, plus a charge per GB-second; check current regional pricing) |
| Use for | Long-running business workflows, human approvals, compliance audit trails | High-volume, short-duration event processing (IoT, streaming) |

For most application orchestration, Standard is the right choice: exactly-once semantics matter when steps have side effects (charging a card, sending an email). Express suits high-throughput pipelines where at-least-once execution is acceptable, steps are idempotent, and paying per state transition would be too expensive.
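If a side-effecting step does run under at-least-once semantics, the usual mitigation is to make it idempotent. The sketch below shows the idea with an in-memory dictionary standing in for a durable store. The `IdempotentStep` class, the `idempotency_key` field, and `charge_card` are hypothetical names, not part of this project.

```python
from typing import Callable


class IdempotentStep:
    """Runs a side-effecting action once per idempotency key, then replays the result."""

    def __init__(self, action: Callable[[dict], str]) -> None:
        self._action = action
        # Assumption: a real deployment would use a durable store with
        # conditional writes here, not process memory.
        self._results: dict[str, str] = {}

    def handle(self, event: dict) -> str:
        key = event["idempotency_key"]
        if key not in self._results:
            self._results[key] = self._action(event)
        return self._results[key]


charges: list[int] = []  # records every real side effect


def charge_card(event: dict) -> str:
    """Hypothetical side effect: charge the card and return a receipt id."""
    charges.append(event["amount_cents"])
    return f"charged-{len(charges)}"


step = IdempotentStep(charge_card)
event = {"idempotency_key": "order-42", "amount_cents": 1999}

first = step.handle(event)
second = step.handle(event)  # simulated duplicate delivery of the same event

print(first, second, charges)  # charged-1 charged-1 [1999]
```

The duplicate delivery returns the stored receipt instead of charging again. This single-process sketch does not guard against two duplicates arriving concurrently.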

Map state for fan-out

The Map state runs the same workflow branch for every item in an array, in parallel. This is the core fan-out primitive. For this project's use case, a Step Functions version could fan out across S3 prefixes — run one Lambda per prefix, collect results in a fan-in step:

{
  "Type": "Map",
  "ItemsPath": "$.prefixes",
  "MaxConcurrency": 10,
  "Iterator": {
    "StartAt": "ScanPrefix",
    "States": {
      "ScanPrefix": {
        "Type": "Task",
        "Resource": "arn:aws:lambda:...:function:pdf-scanner",
        "End": true
      }
    }
  }
}

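MaxConcurrency: 0 (the default) places no limit on parallel iterations from the state machine's side. In practice an inline Map state runs at most around 40 iterations concurrently, and downstream throughput is still bounded by the Lambda concurrency pool. Set an explicit cap to avoid saturating the account concurrency quota.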
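The effect of the cap can be tried locally without deploying anything. The following is a rough simulation, not Step Functions itself: a thread pool plays the Map state and `scan_prefix` is a hypothetical stand-in for the pdf-scanner Lambda. The prefix names are made up.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENCY = 10  # mirrors "MaxConcurrency": 10 in the Map state above

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of iterations observed running at once


def scan_prefix(prefix: str) -> dict:
    """Hypothetical stand-in for the pdf-scanner Lambda; sleeps instead of scanning."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    try:
        time.sleep(0.01)  # pretend to do some work
        return {"prefix": prefix, "scanned": True}
    finally:
        with _lock:
            _active -= 1


def run_map(prefixes: list[str]) -> list[dict]:
    """Fan out over prefixes with a concurrency cap, then fan in the results."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENCY) as pool:
        # pool.map returns results in input order, like a Map state's output array.
        return list(pool.map(scan_prefix, prefixes))


prefixes = [f"docs/batch-{i:02d}/" for i in range(25)]
results = run_map(prefixes)

print(len(results), peak_active <= MAX_CONCURRENCY)  # 25 True
```

With 25 items and a cap of 10, the work runs in waves rather than all at once, which is the behaviour an explicit cap buys you against a shared concurrency quota.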

Other useful states

  • Wait — pause for a duration or until a timestamp. A simple way to implement delays longer than Lambda's 15-minute limit without a polling loop.
  • Choice — conditional branching on input values. Replaces if/else logic that would otherwise live inside a Lambda.
  • Parallel — run multiple independent branches simultaneously and join their results.
  • Task (SDK integrations) — Step Functions can call DynamoDB, SQS, ECS, Glue, etc. directly without a Lambda wrapper, reducing cost and latency for simple operations.
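As an illustration of how these states combine, the sketch below builds a small definition with a Choice, a Wait, and a direct DynamoDB integration, then checks that every transition points at a state that exists. The state names, the `$.pageCount` and `$.docId` fields, and the `scan-results` table are hypothetical. The checker is a local convenience, not an AWS API.

```python
import json

# Hypothetical workflow: large documents wait an hour before the result is recorded.
definition = {
    "StartAt": "CheckSize",
    "States": {
        "CheckSize": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.pageCount", "NumericGreaterThan": 100, "Next": "CoolDown"}
            ],
            "Default": "RecordResult",
        },
        "CoolDown": {"Type": "Wait", "Seconds": 3600, "Next": "RecordResult"},
        "RecordResult": {
            "Type": "Task",
            # Direct service integration: no Lambda wrapper around the write.
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {
                "TableName": "scan-results",
                "Item": {"docId": {"S.$": "$.docId"}},
            },
            "End": True,
        },
    },
}


def next_targets(state: dict) -> list[str]:
    """All state names a single state can transition to."""
    targets = [state[key] for key in ("Next", "Default") if key in state]
    targets += [rule["Next"] for rule in state.get("Choices", [])]
    return targets


def dangling_transitions(defn: dict) -> list[str]:
    """Names referenced by StartAt/Next/Default that are not defined in States."""
    states = defn["States"]
    referenced = [defn["StartAt"]]
    for state in states.values():
        referenced += next_targets(state)
    return sorted({name for name in referenced if name not in states})


print(dangling_transitions(definition))  # []
print(json.dumps(definition, indent=2))
```

A check like this catches misspelled state names before deployment, but it does not replace validation by the service itself.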

Step Functions vs Airflow

| | Step Functions | Apache Airflow (MWAA) |
|---|---|---|
| DAG definition | JSON/YAML state machine (ASL) | Python code (DAG files) |
| Scheduling | Event-driven / on-demand; cron via EventBridge | Built-in rich scheduler (cron, data-interval-aware) |
| Backfill | Manual / custom | First-class, built-in |
| Operators | AWS services + Lambda (AWS ecosystem only) | Hundreds of operators via provider packages: Spark, BigQuery, dbt, Kubernetes… |
| Infrastructure | Serverless, zero infra | Managed Airflow (MWAA) starts at ~$400/month |
| Debugging | Console execution graph; CloudWatch for logs | Airflow UI with task logs, Gantt charts, retries |

Step Functions is the right choice when your workflow is AWS-native and event-driven and you want zero infrastructure. Airflow is the right choice when you need complex scheduling, data-interval backfill, or cross-cloud operators, or when your data-engineering team already knows Python DAGs.