Libraries
Libraries are an extension of external job libraries. They are mainly used to maintain a central repository of organization-approved libraries/packages to be used across multiple Jobs or Data labs.
These Libraries have the following capabilities:
- They allow users to have multiple packages attached to a job, so they can easily switch between them to perform various actions based on the job requirements.
- They provide the ability to customize job dependencies to a granular level.
- They offer the flexibility to choose among different types of packages.
Currently, depending on the type of ETL Job, Amorphic supports the "py", "egg", and "whl" extensions for Python shell applications and the "py", "zip", and "jar" extensions for PySpark applications.
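The extension rules above can be sketched as a small pre-upload check. This is only an illustration: the job-type keys (`pythonshell`, `pyspark`) and the function name are hypothetical and are not part of the Amorphic API; only the extension lists come from the text.

```python
from pathlib import PurePath

# Extensions as listed above; the dictionary keys are illustrative names only.
SUPPORTED_EXTENSIONS = {
    "pythonshell": {"py", "egg", "whl"},
    "pyspark": {"py", "zip", "jar"},
}


def is_supported_package(job_type: str, file_name: str) -> bool:
    """Return True if file_name has an extension allowed for job_type."""
    allowed = SUPPORTED_EXTENSIONS.get(job_type, set())
    extension = PurePath(file_name).suffix.lstrip(".").lower()
    return extension in allowed


print(is_supported_package("pythonshell", "amorphicutils-0.3.1-py3-none-any.whl"))  # True
print(is_supported_package("pyspark", "helpers.egg"))  # False
```

A check like this could run before a package is added to a library, so that an unsupported file is rejected early instead of failing when the job starts.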
Library
A Library is a collection of packages/modules that provides standardized solutions to everyday programming problems. Unlike the modules bundled with the OS-provided Python, these packages are explicitly designed by a user, an organization, or the open-source community. This improves the portability of Python programs by abstracting platform-specific APIs into platform-neutral APIs.
The ETL Library has the following properties:
- A Library can have multiple packages attached to it.
- A Library can be attached to multiple Jobs.
Types of Amorphic ETL Libraries:
- External Libraries: Their scope is limited to a single ETL job, and they are removed when the user deletes that ETL job.
- Shared Libraries: They have a universal scope, so multiple jobs can use the same shared library once the user is authenticated. They persist in the central repository even after an ETL job that uses them has been deleted.
Amorphic Libraries contain the following information:
Type | Description |
---|---|
Library Name | Uniquely identifies the functionality of the library |
Library Description | A brief explanation of the library, typically covering the contents/packages inside it |
Packages | A file or a list of files that can be imported into an ETL Job to perform a specific set of operations. Example: matplotlib is a plotting library used by data scientists and data analysts for visualizations |
Jobs | The list of ETL jobs to which the library is attached |
CreatedBy | User who created the library. |
LastModifiedBy | User who most recently updated the library. |
LastModifiedTime | Timestamp of the most recent update to the library. |
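As a rough illustration, the metadata in the table above could be modeled with a data class like the one below. The class name, field names, and sample values are hypothetical; they mirror the table rows and do not represent the actual Amorphic schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class LibraryRecord:
    """Hypothetical in-memory view of the library metadata listed above."""

    library_name: str
    library_description: str
    packages: List[str] = field(default_factory=list)  # files importable in a job
    jobs: List[str] = field(default_factory=list)      # ETL jobs the library is attached to
    created_by: str = ""
    last_modified_by: str = ""
    last_modified_time: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# Sample values are made up for illustration.
library = LibraryRecord(
    library_name="plotting-utils",
    library_description="Plotting helpers for reporting jobs",
    packages=["plot_helpers.py"],
    jobs=["daily-report-job"],
    created_by="jdoe",
    last_modified_by="jdoe",
)
print(library.library_name, library.packages, library.jobs)
```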
Libraries Operations
Amorphic libraries provide the following operations to manage the libraries:
- Create Library: Create a custom library by choosing the package(s) of the user's choice
- View Library: View the metadata of an existing shared ETL library
- Attach Library: Attach an existing library to an ETL Job
- Importing and using a library: Import and use a library in an ETL Job
Create Library
To create a new Library in Amorphic, go to the "Create New Library" section under "Libraries". A library can have zero or more packages and jobs attached to it. After creating the Library, the user can view, update, and delete it. These operations are available only to users who have permission to access the library.
Users cannot delete a shared library while it is attached to an existing ETL Job. When a user attempts to delete such a library, a pop-up lists the dependent ETL Jobs. The user should then detach the library from those Jobs and retry the deletion.
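The deletion rule described above can be sketched as follows. This is a minimal in-memory model, not Amorphic's implementation: the `attachments` mapping, `LibraryInUseError`, and `delete_shared_library` are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Set


class LibraryInUseError(Exception):
    """Raised when a shared library is still attached to one or more jobs."""

    def __init__(self, library_name: str, dependent_jobs: List[str]):
        super().__init__(
            f"Library '{library_name}' is attached to jobs: {', '.join(dependent_jobs)}"
        )
        self.dependent_jobs = dependent_jobs


def delete_shared_library(attachments: Dict[str, Set[str]], library_name: str) -> None:
    """Delete the library only if no job depends on it."""
    dependent_jobs = sorted(attachments.get(library_name, set()))
    if dependent_jobs:
        # Mirrors the pop-up: report every dependent job instead of deleting.
        raise LibraryInUseError(library_name, dependent_jobs)
    attachments.pop(library_name, None)


# Hypothetical state: one shared library attached to two jobs.
attachments = {"common-utils": {"job-a", "job-b"}}

try:
    delete_shared_library(attachments, "common-utils")
except LibraryInUseError as error:
    print(error.dependent_jobs)  # ['job-a', 'job-b']

# Detach the library from every job, then retry the deletion.
attachments["common-utils"].clear()
delete_shared_library(attachments, "common-utils")
print("common-utils" in attachments)  # False
```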
The below gif shows how a user can create a new library.
View Library
To view existing library information, the user must have sufficient permissions. Click the library name under the "Libraries" section inside the Shared Resources section to view the library.
Take a look at how a user can view the library information in detail.
Attach Library
Users can attach a shared library to a job from the job details page, or while creating or updating the job. Amorphic provides a list of shared libraries along with the other job parameters, and the user can select which ones to attach to the job. Once a library is attached, all the packages in it are passed as arguments to the job automatically, without any further intervention.
Follow the gif below to attach a shared ETL library to an existing ETL Job.

Importing and using a library
If a library contains a single version of the required module, or several different files, users can import the module directly and use it.
from amorphicutils.common import read_param_store
print(read_param_store("SYSTEM.S3BUCKET.DLZ", secure=False)['data'])
If a library contains multiple versions of the required module, users should explicitly insert the versioned file into the system path, and then import the module and use it. This ensures that the specified version of the library is picked up rather than an arbitrary one.
import sys
# explicitly specify the version you want to use
sys.path.insert(0, "amorphicutils-0.3.1.zip")
from amorphicutils.common import read_param_store
print(read_param_store("SYSTEM.S3BUCKET.DLZ", secure=False)['data'])
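To see why inserting the versioned file at the front of the system path selects that version, the following self-contained sketch builds two zip archives containing different versions of a throwaway package and imports the chosen one. The package name `demo_pkg` and the helper `build_package_zip` are hypothetical and exist only for this demonstration; the demo relies on Python's standard ability to import from zip archives listed in `sys.path`.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_package_zip(directory: Path, version: str) -> str:
    """Create demo_pkg-<version>.zip containing a one-module package."""
    zip_path = directory / f"demo_pkg-{version}.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", f'__version__ = "{version}"\n')
    return str(zip_path)


with tempfile.TemporaryDirectory() as tmp:
    old_zip = build_package_zip(Path(tmp), "0.3.0")
    new_zip = build_package_zip(Path(tmp), "0.3.1")

    # Both archives are on the path; without an explicit choice, whichever
    # appears first in sys.path would be imported.
    sys.path.append(old_zip)
    sys.path.append(new_zip)

    # Put the wanted version first so it is found before the other one.
    sys.path.insert(0, new_zip)
    importlib.invalidate_caches()

    import demo_pkg

    print(demo_pkg.__version__)  # 0.3.1

    # Remove the demo entries so later imports are unaffected.
    sys.path[:] = [p for p in sys.path if p not in (old_zip, new_zip)]
```

Python searches `sys.path` in order, so the entry at index 0 is consulted before any other archive that provides the same module name.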