Jobs
Jobs in Amorphic CICD enable the execution of Python and PySpark code to work with datasets and perform ETL transformations.
Job Artifacts
- ETL Jobs require a script file, which should be specified under the Artifacts key in the resource definition.
- The path to the script file should be relative to the root resources directory.
The Python script is subject to SCA (Software Composition Analysis) and SAST (Static Application Security Testing) before deployment, ensuring compliance and security standards are met.
Below is a sample resource definition file for an ETL Job:
{
  "rPythonJob": {
    "Type": "Job",
    "Artifacts": {
      "Script": "resources/jobs/test/job_script.py"
    },
    "Properties": {
      "JobName": "cicd_test",
      "Description": "CICD test for job",
      "ETLJobType": "pythonshell",
      "NetworkConfiguration": "general-public-network",
      "JobBookmarkOption": "disable",
      "Keywords": [
        "Owner: asysuser"
      ],
      "MaxCapacity": 0.0625,
      "ParameterAccess": [
        {
          "!DependsOn": "rS3TierParam.ParameterKey"
        }
      ],
      "SharedLibraries": [],
      "DomainAccess": {
        "Owner": [
          {
            "DomainName": {
              "!DependsOn": "rCICDDomain.DomainName"
            }
          }
        ],
        "ReadOnly": []
      },
      "DatasetAccess": {
        "Owner": [],
        "ReadOnly": []
      },
      "IsDataLineageEnabled": "no",
      "IsAutoScalingEnabled": false
    }
  }
}
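A definition like the one above can be sanity-checked locally before it is committed. The following is a minimal sketch, not part of Amorphic CICD: the function name `check_job_definition` and the specific checks (a non-empty `Artifacts.Script` value that ends in `.py`) are assumptions made for illustration, and the embedded sample is a trimmed copy of the definition above.

```python
import json


def check_job_definition(definition: dict) -> list[str]:
    """Return a list of problems found in Job resources (empty if none).

    The checks are illustrative assumptions, not rules enforced by Amorphic.
    """
    problems = []
    for logical_id, resource in definition.items():
        if resource.get("Type") != "Job":
            continue  # only Job resources are inspected here
        script = resource.get("Artifacts", {}).get("Script")
        if not script:
            problems.append(f"{logical_id}: missing Artifacts.Script")
        elif not script.endswith(".py"):
            # Assumption: job scripts are Python files.
            problems.append(f"{logical_id}: Script is not a .py file: {script}")
    return problems


# Trimmed copy of the sample resource definition.
sample = json.loads("""
{
  "rPythonJob": {
    "Type": "Job",
    "Artifacts": {"Script": "resources/jobs/test/job_script.py"},
    "Properties": {"JobName": "cicd_test", "ETLJobType": "pythonshell"}
  }
}
""")

print(check_job_definition(sample))
```

An empty list means none of the illustrative checks failed; it says nothing about the SCA and SAST scans that run before deployment.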
A Job has dependencies on Dataset, Domain, Parameter, SharedLibraries, and Tag resources.
These resources should not be deleted before the Job that depends on them; attempting to do so may lead to failures or inconsistencies during the deletion process.
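The deletion rule can be illustrated by collecting the references in a set of definitions and ordering the resources so that each one comes before anything it depends on. This is a rough sketch under stated assumptions, not how Amorphic CICD computes its order: the helper names are hypothetical, the `Type` values and properties of the Domain and Parameter entries are placeholders, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def find_references(node):
    """Yield every "LogicalId.Key" string stored under a "!DependsOn" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "!DependsOn" and isinstance(value, str):
                yield value
            else:
                yield from find_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_references(item)


def deletion_order(definitions: dict) -> list:
    """Order resources so that each one precedes the resources it references."""
    graph = {
        logical_id: {ref.split(".", 1)[0] for ref in find_references(resource)}
        for logical_id, resource in definitions.items()
    }
    # static_order() lists dependencies first (a creation order); reverse it.
    creation = list(TopologicalSorter(graph).static_order())
    return creation[::-1]


# Placeholder definitions; only the !DependsOn references matter here.
definitions = {
    "rPythonJob": {
        "Type": "Job",
        "Properties": {
            "ParameterAccess": [{"!DependsOn": "rS3TierParam.ParameterKey"}],
            "DomainAccess": {
                "Owner": [{"DomainName": {"!DependsOn": "rCICDDomain.DomainName"}}]
            },
        },
    },
    "rCICDDomain": {"Type": "Domain", "Properties": {}},
    "rS3TierParam": {"Type": "Parameter", "Properties": {}},
}

order = deletion_order(definitions)
print(order)
```

The Job appears first in the result because the other two resources are referenced by it; the relative order of the domain and the parameter is not significant.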
Referencing this Resource
The following common keys can be used with the !DependsOn function to retrieve details of this resource.
Supported Keys
| Key | Description |
|---|---|
| Id | Returns the JobId value of this resource |
| JobName | Returns the JobName value of this resource |
For additional supported keys, refer to the API definition document for the respective resource type.
Example
The following example shows how to retrieve the job name from a job template and use it in a data pipeline template:
{
  "rDataPipeline2": {
    "Type": "DataPipeline",
    "Properties": {
      "Nodes": [
        {
          "ModuleType": "etl_job",
          "NodeName": "job",
          "NodeInstance": {
            "!DependsOn": "rPythonJob.JobName"
          },
          "Resource": {
            "Name": {
              "!DependsOn": "rPythonJob.JobName"
            },
            "Id": {
              "!DependsOn": "rPythonJob.Id"
            }
          }
        }
      ]
    }
  }
}
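To make the effect of these references concrete, the sketch below replaces each `{"!DependsOn": "LogicalId.Key"}` object with a value looked up from a table of resource details. It only illustrates the substitution and is not Amorphic's implementation: the function `resolve_references`, the `outputs` table, and the job Id value `job-1234` are all hypothetical.

```python
def resolve_references(node, outputs):
    """Return a copy of node with every !DependsOn object replaced by its value."""
    if isinstance(node, dict):
        if set(node) == {"!DependsOn"}:
            logical_id, key = node["!DependsOn"].split(".", 1)
            return outputs[logical_id][key]
        return {k: resolve_references(v, outputs) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_references(item, outputs) for item in node]
    return node  # plain strings, numbers, booleans are left unchanged


# Hypothetical details of the deployed job; the Id value is made up.
outputs = {"rPythonJob": {"Id": "job-1234", "JobName": "cicd_test"}}

pipeline_node = {
    "ModuleType": "etl_job",
    "NodeName": "job",
    "NodeInstance": {"!DependsOn": "rPythonJob.JobName"},
    "Resource": {
        "Name": {"!DependsOn": "rPythonJob.JobName"},
        "Id": {"!DependsOn": "rPythonJob.Id"},
    },
}

resolved = resolve_references(pipeline_node, outputs)
print(resolved)
```

A reference to a resource or key that is missing from the table raises `KeyError` in this sketch, which mirrors the idea that a reference is only valid for keys the target resource supports.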