ETL Jobs
| Action | Limit |
|---|---|
| ETL Jobs | 900* |
Why the limit?
By default, the Amorphic system and AWS occupy some of the underlying resources (default ETL jobs, IAM roles), which amount to nearly 100.
The specified maximum limit is a consolidated count of all the following Amorphic resources created in a given environment:
- All connections (S3, Ext-API, JDBC Normal) except JDBC bulk load connections
- ETL Jobs
- All Workflow nodes
- Forecast Jobs (Consumes IAM roles)
- DeepSearch Indices (Consumes IAM roles)
- Glue Endpoints with Dataset Access (Consumes IAM roles)
- ML Notebooks with Dataset Access (Consumes IAM roles)
- ETL Notebooks with Dataset Access (Consumes IAM roles)
- Kinesis stream consumers (Consumes IAM roles)
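As a rough illustration of how the consolidated count works, the sketch below sums the resource categories listed above and reports how much headroom remains. The function name, the dictionary keys, and the sample counts are hypothetical; only the limit of 900 and the categories themselves come from this page.

```python
# Minimal sketch: every name and sample count here is illustrative, not an Amorphic API.
ETL_JOB_LIMIT = 900  # consolidated limit stated in the table above


def remaining_capacity(resource_counts: dict[str, int], limit: int = ETL_JOB_LIMIT) -> int:
    """Return how many more counted resources can be created before the limit is reached."""
    used = sum(resource_counts.values())
    return max(limit - used, 0)


# Hypothetical environment: each key is one of the categories that count toward the limit.
counts = {
    "connections": 40,          # S3, Ext-API, JDBC Normal (JDBC bulk load excluded)
    "etl_jobs": 300,
    "workflow_nodes": 120,
    "forecast_jobs": 5,
    "deepsearch_indices": 3,
    "glue_endpoints": 2,        # with dataset access
    "ml_notebooks": 10,         # with dataset access
    "etl_notebooks": 10,        # with dataset access
    "kinesis_consumers": 10,
}

print(remaining_capacity(counts))  # 400
```

Because all categories share one pool, creating workflow nodes or connections reduces the number of ETL jobs that can still be created.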
For example, in a new Amorphic deployment where no other resources have been created, a user can create 900 ETL jobs. Even with 900 ETL jobs created, job executions are restricted based on the type of network configuration:
- Public - No restrictions
- App-Public - Based on the Glue Public Subnet CIDR range specified during the Amorphic deployment. If it is /24, only 254 DPUs can run at a time (for example, approximately 25 ETL Spark jobs with 10 DPUs each can execute at once)
- App-Private - Based on the Glue Private Subnet CIDR range specified during the Amorphic deployment. If it is /24, only 254 DPUs can run at a time (for example, approximately 25 ETL Spark jobs with 10 DPUs each can execute at once)
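The subnet arithmetic above can be sketched as follows. This assumes one IP address per DPU and subtracts only the network and broadcast addresses, which reproduces the figure of 254 quoted for a /24 range. AWS reserves a few additional addresses in each subnet, so the real ceiling may be slightly lower. The function name and the sample CIDR are hypothetical.

```python
import ipaddress


def max_concurrent_jobs(subnet_cidr: str, dpus_per_job: int) -> int:
    """Estimate how many jobs of a given DPU size fit in a Glue subnet at once."""
    if dpus_per_job <= 0:
        raise ValueError("dpus_per_job must be positive")
    network = ipaddress.ip_network(subnet_cidr)
    # Assumption: one address per DPU, minus the network and broadcast addresses.
    usable_addresses = max(network.num_addresses - 2, 0)
    return usable_addresses // dpus_per_job


# Hypothetical /24 subnet with 10-DPU Spark jobs, as in the example above.
print(max_concurrent_jobs("10.0.0.0/24", 10))  # 25
```

A larger CIDR range (for example /23) raises the ceiling, so the subnet size chosen at deployment time determines how many jobs can run in parallel.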
| Action | Limit |
|---|---|
| Maximum concurrent executions of different ETL jobs | 50** |
| Maximum concurrent executions of same ETL job | 1000*** |
** The maximum is calculated based on the AWS default limits of 1000 ETL jobs and 1000 IAM roles in a new AWS account. If both the AWS Glue job limit and the IAM role limit are raised, the maximum is recalculated accordingly.
*** The maximum is equivalent to the AWS default limit. It can be adjusted by requesting a service quota increase.
For more information on AWS Service Quotas, see the AWS documentation.
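The note on how the job limit is derived from the AWS quotas can be illustrated with a small calculation. The exact formula Amorphic applies is not given on this page, so the version below is an assumption: it takes the smaller of the two quotas and subtracts the roughly 100 resources reserved by the system. The function and parameter names are hypothetical.

```python
def amorphic_resource_limit(glue_job_quota: int = 1000,
                            iam_role_quota: int = 1000,
                            reserved: int = 100) -> int:
    """Assumed derivation: the tighter AWS quota minus resources held by Amorphic and AWS."""
    return max(min(glue_job_quota, iam_role_quota) - reserved, 0)


print(amorphic_resource_limit())            # 900 with the AWS defaults
print(amorphic_resource_limit(2000, 2000))  # 1900 when both quotas are raised
print(amorphic_resource_limit(2000, 1000))  # 900 when only one quota is raised
```

Under this assumption, raising only one of the two quotas has no effect, which is consistent with the requirement that both limits be increased.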
IAM role policy and shared domains
When a domain is shared with an ETL job, Amorphic prefers a single domain/* wildcard in the job's IAM role policy over listing every dataset path, to keep the policy compact.
Because Iceberg datasets require a different IAM statement shape than non-Iceberg datasets, domain/* is only used when every dataset in the domain is of the same type:
- Domain with all non-Iceberg datasets → policy uses domain/*
- Domain with all Iceberg datasets → policy uses domain/* (under the Iceberg-shaped statement)
- Domain with a mix of Iceberg and non-Iceberg datasets → policy falls back to individual domain_name/dataset_name/* entries, since the two statement shapes cannot share a single wildcard
If you want the shorter domain/* form on a domain that already contains an Iceberg dataset, every other dataset in that domain must also be Iceberg type. Mixing types forces the per-dataset expansion.
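The selection rule described in this section can be sketched as below. This is not Amorphic's implementation. The `Dataset` class and `policy_resources` function are hypothetical, and the sketch only decides which resource path form is used, not the IAM statement shape itself. Treating an empty domain as eligible for the wildcard is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    is_iceberg: bool


def policy_resources(domain: str, datasets: list[Dataset]) -> list[str]:
    """Return the resource paths a job's policy would list for a shared domain."""
    dataset_kinds = {d.is_iceberg for d in datasets}
    if len(dataset_kinds) <= 1:
        # All datasets share one type (or the domain is empty): one wildcard suffices.
        return [f"{domain}/*"]
    # Mixed Iceberg and non-Iceberg datasets: expand to one entry per dataset.
    return [f"{domain}/{d.name}/*" for d in datasets]


# Hypothetical domain and dataset names.
uniform = [Dataset("orders", False), Dataset("customers", False)]
mixed = [Dataset("orders", False), Dataset("events", True)]

print(policy_resources("sales", uniform))  # ['sales/*']
print(policy_resources("sales", mixed))    # ['sales/orders/*', 'sales/events/*']
```

The mixed case shows why policy size grows with the number of datasets once a single dataset of the other type is added to the domain.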