Replaced a manual review process with an automated decision engine that evaluates eligibility in seconds. Every new application is assessed instantly against configurable business rules — no spreadsheets, no delays, no human bottleneck.
Client: Enterprise Client
The challenge
The client had a process where every new internal mobility application needed to be assessed for eligibility for a relocation payment. The assessment required checking the distance between two work sites, verifying the employee hadn't already received a payment recently, and applying business rules that varied by location.
This was being done manually. Someone would look up addresses, estimate distances, check spreadsheets for previous payments, and make a decision. It took time, it was inconsistent, and as application volumes grew, the backlog became unmanageable. Some applications waited days for a decision that should take seconds.
They also needed the ability to change the rules — for example, temporarily adjusting the distance threshold for a specific site — without requiring a code change or system redeployment.
Architecture diagram: Step Functions pipeline from stream trigger to decision
What we built
We built a fully automated decision engine that triggers the moment a new application is created. Within seconds, the system determines eligibility without any human involvement.
The engine works in stages. First, it validates that the application has all the information it needs. If upstream systems haven't finished populating the record yet, it waits and retries automatically (up to 5 times over 2.5 minutes) rather than failing.
Next, it resolves the two work site locations, converts them to geographic coordinates, and calculates the driving distance between them. Both the coordinates and distances are cached, so repeated lookups for the same sites are instant.
Then it applies the business rules: is the distance above the threshold? Has the employee received a payment in the last 12 months? The threshold can be overridden on a per-site basis through a simple configuration table — no code changes needed. Overrides automatically expire after 8 months, so temporary rules don't linger forever.
The decision (eligible or ineligible) is written back to the application record along with all the supporting data: the distance calculated, the threshold used, and the previous payment check result. This gives the team full transparency into why each decision was made.
If anything goes wrong at any stage, the system catches the error, logs it with full context, and sends an alert to the team. Every step is monitored with alarms that fire if executions fail or if the system falls behind.
Step Functions execution view: successful eligibility determination
CloudWatch dashboard: execution metrics and cache hit rates
Technical detail
This section is for readers with a technical background who want to understand the architecture and implementation choices.
The engine is orchestrated by an AWS Step Functions state machine (Standard type) triggered by a DynamoDB Stream on the applications table. A filter on the event source mapping passes only INSERT events, so updates and deletes on existing applications never start an execution.
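As an illustration of the INSERT-only filter, here is a minimal Python sketch of the filter criteria such an event source mapping could carry. The function names are hypothetical and the project's actual configuration is not shown in this write-up. The pattern matches on the `eventName` field that every DynamoDB stream record includes.

```python
import json


def insert_only_filter_criteria() -> dict:
    """Build filter criteria that let only INSERT stream records through.

    The returned dict has the shape expected by the FilterCriteria
    parameter of a Lambda event source mapping.
    """
    pattern = {"eventName": ["INSERT"]}
    return {"Filters": [{"Pattern": json.dumps(pattern)}]}


def matches_insert(record: dict) -> bool:
    """Local stand-in for the filter, useful in unit tests (hypothetical helper)."""
    return record.get("eventName") == "INSERT"
```

With boto3, a dict like this would typically be passed as `FilterCriteria` when creating or updating the event source mapping, so filtered-out records never invoke the stream processor.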
Pipeline Stages:
1. Stream Processor Lambda: Receives the DynamoDB stream event, deserialises the record, and starts a Step Functions execution with a unique name (application number + sequence number) for idempotency.
2. Initial Check Lambda: Reads the latest application record and validates that required fields are populated. Returns a readiness status.
3. Data Readiness Loop: A Choice state routes based on readiness. If data isn't ready and retries < 5, it waits 30 seconds and retries. After 5 retries, it fails with a clear error.
4. Location Details Lambda: Resolves origin and destination site codes by scanning a projects reference table.
5. Geocoding Lambda: Converts site codes to lat/long coordinates using AWS Location Service (Esri-backed place index). Uses a multi-level lookup strategy with fallback. Results are cached in a DynamoDB table.
6. Distance Calculation Lambda: Calculates driving distance via the AWS Location Service route calculator. Falls back to haversine (straight-line) distance when routes exceed the calculator's range. Results are cached with the calculation method recorded.
7. Eligibility Calculator Lambda: Checks a per-site overrides table for custom thresholds (with TTL-based auto-expiry). Queries a GSI for previous payments within 365 days. Writes the final decision and all supporting data back to the application record.
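The haversine fallback in stage 6 can be sketched as follows. This is a minimal illustration, not the project's code. It assumes distances in kilometres, a mean Earth radius of 6371 km, and a hypothetical `route_km` callable that stands in for the route calculator and returns `None` when no driving route is available.

```python
import math
from typing import Callable, Optional, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed unit: kilometres)

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle (straight-line) distance between two points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_with_fallback(
    a: Coord,
    b: Coord,
    route_km: Callable[[Coord, Coord], Optional[float]],
) -> Tuple[float, str]:
    """Prefer the driving distance; fall back to haversine and record the method."""
    driving = route_km(a, b)  # hypothetical wrapper around the route calculator
    if driving is not None:
        return driving, "route"
    return haversine_km(a, b), "haversine"
```

Returning the method alongside the distance mirrors the idea of caching the calculation method, so a reviewer can tell a driving distance from a straight-line estimate.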
Caching Strategy: Two DynamoDB cache tables avoid redundant API calls. The geocoded locations cache uses the site code as the key. The distances cache uses an alphabetically-sorted pair of site codes as the key (so A→B and B→A share one entry).
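A direction-independent cache key of this kind might look like the sketch below. The `#` separator and the site codes in the usage note are assumptions for illustration; the real key format is not given here.

```python
def distance_cache_key(site_a: str, site_b: str, sep: str = "#") -> str:
    """Build one cache key per unordered pair of site codes.

    Sorting the codes first means A->B and B->A map to the same entry.
    The separator is an assumed convention.
    """
    first, second = sorted((site_a, site_b))
    return f"{first}{sep}{second}"
```

For example, `distance_cache_key("SYD01", "MEL02")` and `distance_cache_key("MEL02", "SYD01")` both produce `MEL02#SYD01`, so the second lookup hits the entry written by the first.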
Configurability: The distance threshold overrides table allows per-site rules without code changes. Each override carries a TTL attribute set 8 months after it is created, and DynamoDB automatically deletes entries once that time has passed. The table has deletion protection enabled.
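The override mechanism could be sketched roughly as below. DynamoDB TTL expects a Number attribute holding a Unix epoch timestamp in seconds, and expired items are removed by a background process rather than at the exact expiry time, so a reader of the table may want to ignore stale entries itself. Everything else here is an assumption for illustration: the attribute names `expires_at` and `threshold_km`, kilometres as the unit, and "8 months" approximated as 240 days.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "8 months" approximated as 8 * 30 days.
OVERRIDE_LIFETIME = timedelta(days=8 * 30)


def override_expiry_epoch(created_at: datetime) -> int:
    """Epoch-seconds value to store in the TTL attribute of a new override."""
    return int((created_at + OVERRIDE_LIFETIME).timestamp())


def effective_threshold(default_km: float, override: Optional[dict], now: datetime) -> float:
    """Use the per-site override if one exists and has not yet expired."""
    if override and override["expires_at"] > int(now.timestamp()):
        return float(override["threshold_km"])
    return default_km
```

Checking the expiry at read time keeps decisions correct even if an expired override is still physically present in the table.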
Error Handling: Every task state has retry configuration (3 attempts, exponential backoff from 2 seconds) and a catch block routing to a terminal failure state. The state machine logs all execution data to CloudWatch Logs (90-day retention).
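To make the retry behaviour concrete, the sketch below builds a task state in Amazon States Language form (as a Python dict) and computes the wait before each retry. The 2-second interval and 3 attempts come from the description above. The backoff rate of 2.0, the catch-all `States.ALL` matcher, and the function and state names are assumptions.

```python
from typing import List


def retry_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> List[float]:
    """Wait before each retry: the interval multiplied by the backoff rate each time."""
    return [interval_seconds * backoff_rate ** i for i in range(max_attempts)]


def task_state(resource_arn: str, next_state: str, failure_state: str) -> dict:
    """Task state with retry and catch configuration (hypothetical state names)."""
    return {
        "Type": "Task",
        "Resource": resource_arn,
        "Retry": [
            {
                "ErrorEquals": ["States.ALL"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,  # assumed; the source only says "exponential"
            }
        ],
        # After retries are exhausted, route to the terminal failure state.
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": failure_state}],
        "Next": next_state,
    }
```

Under these assumptions, `retry_delays(2, 3, 2.0)` gives waits of 2, 4, and 8 seconds, so a persistently failing task reaches the failure state after roughly 14 seconds of waiting plus execution time.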
Monitoring: CloudWatch alarms fire on Step Functions execution failures and stream processor errors. An SNS topic delivers email notifications to the team. All six Lambda functions have dedicated log groups with 30-day retention.
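An alarm on failed executions might be defined along these lines. The sketch only builds the keyword arguments for CloudWatch's `put_metric_alarm` call. The alarm name, 5-minute period, threshold of one failure, and missing-data handling are assumptions, not the project's actual settings.

```python
def failed_executions_alarm(state_machine_arn: str, sns_topic_arn: str) -> dict:
    """Keyword arguments for an alarm that fires on any failed execution."""
    return {
        "AlarmName": "eligibility-engine-executions-failed",  # hypothetical name
        "Namespace": "AWS/States",
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "Statistic": "Sum",
        "Period": 300,  # assumed 5-minute evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no executions is not a failure
        "AlarmActions": [sns_topic_arn],  # SNS topic that emails the team
    }
```

With boto3, this dict would typically be unpacked into `cloudwatch.put_metric_alarm(**kwargs)`; in an infrastructure-as-code setup the same fields map onto the alarm resource.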
The results
DynamoDB overrides table: configurable business rules with auto-expiry
Interested in something similar?
Book a free 30-minute discovery call. We'll listen to what you need, tell you what's realistic, and give you a straight answer on whether we can help.