
Incorrect handling of failed data profiling job retries

Fix In Progress
Bug identified and fix is in progress
Workaround Available
Temporary workaround available

When the data profiling job (a scheduled backend job) fails with an unhandled exception, it retries an inordinate number of times, incurring additional cost for the customer.

This issue occurs when Amorphic is deployed with single tenancy and has datasets with data profiling enabled.

Affected Versions: 1.11, 1.12, 1.13, 1.14, 2.0, 2.1

Fix Version: 2.2

Root cause(s)

  • Because of an incorrect retry configuration for failed jobs, the data profiling job retries an inordinate number of times.
  • The data profiling job (a scheduled backend job) raises unhandled exceptions.
    • When the Redshift cluster is paused, the data profiling job fails with a timeout exception.
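
To illustrate the first root cause, here is a minimal sketch of a bounded retry policy. It is not Amorphic's actual job code or configuration. The names `run_with_bounded_retries`, `RetryLimitExceeded`, `max_attempts`, and `base_delay_seconds` are hypothetical, and the built-in `TimeoutError` stands in for whatever timeout exception the profiling job raises when the cluster is paused.

```python
import time


class RetryLimitExceeded(Exception):
    """Raised when a job still fails after the configured number of attempts."""


def run_with_bounded_retries(job, max_attempts=3, base_delay_seconds=0.0,
                             retryable=(TimeoutError,)):
    """Run job() at most max_attempts times.

    Only exceptions listed in `retryable` trigger another attempt; any other
    exception propagates immediately so it is not retried at all.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except retryable as error:
            last_error = error
            if attempt < max_attempts:
                # Exponential backoff between attempts: delay, 2*delay, 4*delay, ...
                time.sleep(base_delay_seconds * 2 ** (attempt - 1))

    # Every allowed attempt failed with a retryable error: stop and report.
    raise RetryLimitExceeded(
        f"job failed after {max_attempts} attempts"
    ) from last_error
```

Capping the attempt count bounds the cost of a persistent failure such as a paused cluster, while the explicit `retryable` tuple keeps unexpected errors from being retried.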

Impact

The account accrues additional cost from unnecessary job executions.

Mitigation

Workaround

Make sure the Redshift cluster is in an active state around the scheduled run of the data profiling job (every day at 00:00 UTC), or disable the data profiling flag on the affected datasets.
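
The first workaround could be automated with a small check that runs shortly before 00:00 UTC. The sketch below is one possible approach, not part of Amorphic. It assumes `redshift_client` is a boto3 Redshift client (for example, `boto3.client("redshift")`) and that you know the cluster identifier. The function name `ensure_cluster_available` and the `resume_if_paused` flag are hypothetical.

```python
def ensure_cluster_available(redshift_client, cluster_identifier,
                             resume_if_paused=False):
    """Return True if the Redshift cluster is currently available.

    If the cluster is paused and resume_if_paused is True, a resume is
    requested. The function still returns False in that case, because the
    cluster needs time to come back before it can serve queries.
    """
    response = redshift_client.describe_clusters(
        ClusterIdentifier=cluster_identifier
    )
    status = response["Clusters"][0]["ClusterStatus"]

    if status == "paused" and resume_if_paused:
        # Assumption: the caller's IAM role is allowed to resume the cluster.
        redshift_client.resume_cluster(ClusterIdentifier=cluster_identifier)

    return status == "available"
```

Run the check early enough that a resumed cluster has time to become available before the profiling schedule. Note that resuming a paused cluster has its own cost, so disabling the data profiling flag may be the cheaper option for rarely used datasets.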

Timeline

gantt
title Timeline
dateFormat YYYY-MM-DD
tickInterval 1day
axisFormat %b-%d
todayMarker off
section Tracker
%% update the ticket number and date of bug report
CLOUD-3209 : done, 2023-04-06, 0d
section Identification
Reported : crit, des1, 2023-04-06, 1d
section Mitigation
%% Update number of days took for each step below
Root cause analysis : 1d
section Delivery
%% update the date of each step below
section Next release
%% update the date of next version and release date
  • 2023-04-06: Bug reported/identified (CLOUD-3209)
  • 2023-04-06: Bug triaged