AWS Step Functions is an orchestration service that combines multiple AWS services to implement a complex use case. It supports condition-based branching, waiting, error handling, and parallel execution of functions.
At Clappia, we use Step Functions to power the Clappia Workflows. All Clappia Apps can have multiple complex workflows which can involve actions like sending emails, mobile notifications, SMS, WhatsApp messages, integrating with external APIs and databases, sending data to other Clappia Apps, waiting for a certain duration and IF/ELSE logic. We translate these user-defined Workflows and generate a State Machine in Step Functions. Know more about Clappia Workflows here.
One of the major downsides of AWS Step Functions is its pricing: Standard Workflows are charged at $0.025 per 1,000 state transitions. For small-scale projects, it's a good solution: easy to set up, very little implementation effort, and good integration with most other AWS services. But for Enterprise-grade orchestration that requires hundreds of steps, each with millions of executions, the cost adds up quickly, because it scales with the number of steps multiplied by the number of executions.
For example, one of our Enterprise customers has a workflow with 70+ IF Condition Blocks, which translates to 140+ Choice Steps plus 100+ other action Steps (such as sending notifications or making external API calls) in Step Functions, and it is executed 2,000 times a day. This single State Machine can cost around $200 per month, and we have hundreds of similar use cases for other Enterprise customers. So we needed an alternative solution with a similar ease of setup.
Express Workflows are a type of AWS Step Functions workflow that is ideal for high-volume, short-running, event-processing workloads. While Standard Step Functions can run for up to a year, Express Step Functions have a maximum running duration of 5 minutes. Express Step Functions are billed by the number of executions, the duration of execution, and the memory consumed. The number of state transitions within the State Machine is not factored into billing.
This was a major factor for us in moving towards using Express Step Functions.
In Clappia, users can define workflows by adding nodes such as the Wait node, the IF node, and other platform-related actions. Then we translate the user-defined Workflow to a Step Functions State Machine and decide whether it can be executed in less than 5 minutes or not.
If yes, then we set the type of Workflow to Express. These workflows can be modified at any time by the user, so if a user decides to change the flow such that the total time of workflow execution is likely to go beyond 5 minutes, we automatically convert this into a Standard Workflow. As a result, the end-user workflow will be unaffected, and we will get a significant cost reduction.
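The decision described above could be sketched roughly as follows. This is only an illustration, not Clappia's actual implementation: the `Node` class, the per-node `est_seconds` estimates, and the `safety_margin` parameter are all hypothetical. The sketch sums worst-case durations over every node, which is a conservative upper bound because branches are not mutually excluded.

```python
from dataclasses import dataclass
from typing import List

# Express Workflows have a maximum running duration of 5 minutes.
EXPRESS_MAX_SECONDS = 5 * 60


@dataclass
class Node:
    """Hypothetical representation of one user-defined workflow node."""
    kind: str           # e.g. "wait", "if", "email", "api_call"
    est_seconds: float  # assumed worst-case duration of this node


def choose_workflow_type(nodes: List[Node], safety_margin: float = 0.8) -> str:
    """Return "EXPRESS" if the estimated runtime fits comfortably within
    the 5-minute limit, otherwise "STANDARD".

    safety_margin is an assumed buffer: only a fraction of the limit is
    treated as usable, to allow for slow external calls.
    """
    total = sum(n.est_seconds for n in nodes)
    if total <= EXPRESS_MAX_SECONDS * safety_margin:
        return "EXPRESS"
    return "STANDARD"


if __name__ == "__main__":
    workflow = [Node("email", 2), Node("if", 0.1), Node("wait", 30)]
    print(choose_workflow_type(workflow))

    # The user edits the workflow and adds a 10-minute Wait node.
    workflow.append(Node("wait", 600))
    print(choose_workflow_type(workflow))
```

Re-running this check every time the user saves the workflow is what would allow the type to switch automatically when a long Wait node is added or removed.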
Let's take a sample workflow with 9 nodes.
If we run this workflow 100k times in a month, and the average duration of an execution is 10 seconds, then with Standard Workflows:
Total state transitions = State transitions per execution x executions of workflow
Total state transitions = 9 x 100,000 = 900,000
Monthly charges = 900,000 x $0.000025 = $22.50
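The same per-transition arithmetic can be written as a small helper, which makes it easy to try other workflow sizes. This is a minimal sketch: the function name is made up for illustration, and the price is the $0.025 per 1,000 state transitions quoted earlier.

```python
from decimal import Decimal

# $0.025 per 1,000 state transitions, expressed per transition.
PRICE_PER_TRANSITION = Decimal("0.000025")


def standard_monthly_cost(transitions_per_execution: int,
                          executions_per_month: int) -> Decimal:
    """Monthly cost in USD when billing is per state transition."""
    total_transitions = transitions_per_execution * executions_per_month
    return total_transitions * PRICE_PER_TRANSITION


if __name__ == "__main__":
    cost = standard_monthly_cost(9, 100_000)
    print(f"${cost:.2f}")  # prints "$22.50"
```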
With Express Workflows, the same workload is billed as follows:
i. Monthly request charges
The price is $1.00 per million requests
Monthly Request Charges = 100,000 requests x $1.00 / 1,000,000 = $0.10
ii. Monthly duration charges
100k workflows x 10 seconds of duration = 1,000,000 seconds
1,000,000 seconds x 64 MB (billed memory) / 1024 MB = 62,500 GB-s
62,500 GB-s / 60 / 60 = 17.36111 GB-hours
$0.06000 per GB-hour x 17 GB-hours (rounded down) = $1.02
Monthly duration charges = $1.02
iii. Total monthly charges:
Total Monthly Charges = Request Charges + Duration Charges
Total Monthly Charges = $0.10 + $1.02 = $1.12
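The request-plus-duration calculation above can be reproduced with a short function. This is a sketch only: the function and parameter names are made up for illustration, and the prices are the ones quoted in this example. Unlike the hand calculation, it does not round the GB-hours down to 17, so the duration charge comes out a couple of cents higher.

```python
def express_monthly_cost(executions: int,
                         avg_seconds: float,
                         billed_memory_mb: int = 64,
                         price_per_million_requests: float = 1.00,
                         price_per_gb_hour: float = 0.06):
    """Return (request_charge, duration_charge, total) in USD."""
    # Request charge: billed per million executions.
    request_charge = executions / 1_000_000 * price_per_million_requests

    # Duration charge: total seconds scaled by billed memory, in GB-hours.
    gb_seconds = executions * avg_seconds * billed_memory_mb / 1024
    gb_hours = gb_seconds / 3600
    duration_charge = gb_hours * price_per_gb_hour

    return request_charge, duration_charge, request_charge + duration_charge


if __name__ == "__main__":
    req, dur, total = express_monthly_cost(100_000, 10)
    print(f"requests: ${req:.2f}, duration: ${dur:.2f}, total: ${total:.2f}")
```

For 100,000 executions of 10 seconds each, this gives roughly $1.14 in total, in line with the hand-calculated figure once the rounding difference is accounted for.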
Standard Step Functions can handle very dynamic use cases, especially when we have to wait for long hours or need lots of retries and error handling. But they are costly. Where a workflow fits within the 5-minute run duration limit, we can replace them with Express Step Functions, which cut the cost by roughly 15-20x (in the example above, $22.50 versus $1.12). The trade-off is that the limit rules Express out for long-running workflows.