Version: v3.2

ArcGIS Datasource

ArcGIS datasources allow you to connect to ArcGIS Online to discover, catalog, and ingest geospatial data and metadata from various ArcGIS item types.

The datasource enables you to:

  • Discover and catalog ArcGIS items from multiple supported types
  • Extract comprehensive metadata from ArcGIS items, including technical, business, and operational metadata
  • Ingest actual geospatial data from Feature Services with advanced filtering capabilities
  • Apply sophisticated filtering using groups, organizations, and other custom filters
  • Track relationships between ArcGIS items and their dependencies
  • Support both metadata-only and metadata-and-data ingestion workflows

Supported ArcGIS Item Types

The following ArcGIS item types are supported for discovery and metadata cataloging:

  • Hub Site Application - ArcGIS Hub site applications
  • Hub Page - Individual pages within ArcGIS Hub sites
  • Dashboard - ArcGIS Dashboards for data visualization
  • Web Experience - ArcGIS Experience Builder applications
  • Web Map - Interactive web maps
  • Feature Service - Geospatial data services (supports data ingestion)

Data Ingestion Support

Currently, only Feature Service items support actual data ingestion. Other item types support metadata cataloging only.

How to create an ArcGIS Datasource?

Domain whitelisting

To create an ArcGIS datasource, users need to add the below domain to the whitelisted domains list.

  • .arcgis.com

Users should whitelist additional domains or sub-domains based on their requirements.

  • Go to the Navigation Menu, click on the Data Workflows tab, and then select the Datasources button.
  • Click on the + Create Datasource button on the top right corner of the page.

ArcGIS Datasource Creation

To create an ArcGIS datasource, input the details shown in the table below, or directly upload the JSON configuration.

Metadata

| Name | Description |
| --- | --- |
| Datasource Name | Give the datasource a unique name |
| Description | Add a datasource description |
| Keywords | Add keyword tags to connect it with other Amorphic components |
| Datasource Type | Type of datasource. In this case it is ArcGIS |

Datasource Configuration

| Configuration | Description |
| --- | --- |
| Endpoint URL | The URL of the ArcGIS Online account to connect to. Must be a valid HTTPS URL without a trailing slash. |
| Authentication Method | Choose the authentication method: ApiKey or OAuth2 |
| Authentication Configuration | Configuration specific to the selected authentication method (see details below) |
| Connection Accessibility | Choose the connection accessibility: Public (yes) or Private (no). Public accessibility is recommended for ArcGIS Online. |
| Filters | Optional filters to discover specific items. See detailed filter options below. |
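The Endpoint URL constraint above (HTTPS, no trailing slash) can be sanity-checked locally with a short Python sketch; the hostname used below is hypothetical:

```python
from urllib.parse import urlparse

def is_valid_endpoint(url: str) -> bool:
    """Check the documented Endpoint URL constraints: HTTPS scheme, a host, no trailing slash."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc) and not url.endswith("/")

print(is_valid_endpoint("https://myorg.maps.arcgis.com"))   # True
print(is_valid_endpoint("https://myorg.maps.arcgis.com/"))  # False: trailing slash
print(is_valid_endpoint("http://myorg.maps.arcgis.com"))    # False: not HTTPS
```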

Private Accessibility

For private accessibility to work, users need to whitelist these two domains in the whitelisted domains list.

  • .pypi.org
  • .pythonhosted.org

Authentication Methods

API Key Authentication:

  • KeyValue: Your ArcGIS API key or token (required, non-empty string)

OAuth2 Authentication:

  • ClientId: OAuth2 client identifier (required, non-empty string)
  • ClientSecret: OAuth2 client secret (required, non-empty string)
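As an illustration, the authentication portion of a datasource configuration might look like one of the following fragments. The field values are placeholders, and the wrapping key names are assumptions based on the tables above:

```json
{
  "AuthenticationMethod": "ApiKey",
  "AuthenticationConfig": { "KeyValue": "<your-arcgis-api-key-or-token>" }
}
```

```json
{
  "AuthenticationMethod": "OAuth2",
  "AuthenticationConfig": {
    "ClientId": "<oauth2-client-id>",
    "ClientSecret": "<oauth2-client-secret>"
  }
}
```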

Secure Storage

Authentication credentials are securely stored in AWS Systems Manager Parameter Store and are not visible in the datasource configuration after creation.

Filters

These optional filters can be used to restrict which items are fetched from ArcGIS Online. All filter values must be URL-encoded.

| Filter | Description | Example |
| --- | --- | --- |
| groups | JSON array of group objects. Each group must have either id OR both title and owner. | [{"id": "8a906fdd276f4de8bc3f84409ab04cb5"}, {"title": "My Group", "owner": "username"}] |
| orgid | Organization ID to limit the search to a specific organization's items. | lQySeXwbCg63XWTi |
| bbox | Bounding box in the format xmin,ymin,xmax,ymax to limit the geographic search area. | -118,32,-116,34 |
| categories | Comma-separated list of up to 8 organization content categories. | "/Categories/Water", "/Categories/Forest" |
| category_filter | Comma-separated list of up to 3 category terms for matching. | basemap,reference,topographic |
| max_items | Maximum number of items to fetch (1-10000). | 100 |
| filter | JSON object with advanced search criteria. Supported keys: title, tags, typeKeywords, type, owner. | {"type": "\"Feature Service\" OR \"Web Map\"", "owner": "myuser", "tags": "wastewater", "typeKeywords": "ArcGIS Server", "title": "Address Data Exporter"} |
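For example, the groups and bbox filter values above can be URL-encoded with Python's standard library before being placed in the datasource configuration. This is a local sketch, not an Amorphic API call; the group values are taken from the example column:

```python
import json
import urllib.parse

# Example "groups" filter: each entry carries either an id, or a title plus owner.
groups = [
    {"id": "8a906fdd276f4de8bc3f84409ab04cb5"},
    {"title": "My Group", "owner": "username"},
]

# All filter values must be URL-encoded.
encoded_groups = urllib.parse.quote(json.dumps(groups))
encoded_bbox = urllib.parse.quote("-118,32,-116,34")

print(encoded_groups)
print(encoded_bbox)  # -118%2C32%2C-116%2C34
```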

How to create a dataflow to ingest ArcGIS items?

Go to the Dataflows tab on the datasource details page and click on the + Add Dataflow button.

ArcGIS Dataflow Creation

Fill in the details as shown in the table below and click on the "Create" button.

Step 1: Dataflow Details

| Details | Description |
| --- | --- |
| Dataflow Name | Give the dataflow a unique name (must be unique within the datasource) |
| Dataflow Description | Add a description for the dataflow (optional) |
| Ingestion Type | Choose the ingestion type: metadata-only or metadata-and-data |
| Target Location | Choose the target location for the dataflow: S3, S3 Athena, Redshift, Lakeformation (required for metadata-and-data) |
| Advanced Filters | Advanced regex-based filters to filter items by type and properties (optional, base64-encoded) |

Ingestion Types

Metadata-Only:

  • Catalogs item metadata
  • Supports all ArcGIS item types
  • Target location defaults to DynamoDB
  • Faster processing, minimal storage requirements

Metadata-and-Data:

  • Catalogs metadata AND ingests actual geospatial data
  • Only supports Feature Service items with layers
  • Requires target location selection
  • Creates datasets for each Feature Service layer

Advanced Filters:

Advanced filters provide additional refinement beyond the basic datasource filters by letting you apply regex patterns that match ArcGIS items by type and properties. The UI lets you create multiple filter configurations for different item types (Hub Site Application, Feature Service, Web Map, etc.), each with custom key-value pairs: you specify a field name and a corresponding regex pattern to control precisely which items are discovered and processed during ingestion.

Advanced filters use a multi-layered filtering approach where different item types have specific criteria applied. For example:

  • For Hub Site Applications, if "wastewater" tags are specified in the filters, then only items tagged with "wastewater" will be included in the discovery process.
  • For Hub Pages, if a description filter is specified with a regex pattern matching "health", then only items with descriptions containing the word "health" will be included in the discovery process.
  • For Feature Services, if tags and title filters are specified (e.g., "census" tags AND titles containing "age" or "technology"), then only items matching both criteria will be included in the discovery process.

The below image demonstrates the advanced filters configuration:

ArcGIS Advanced Filters

This granular filtering ensures that only the most relevant items matching your specific requirements are discovered and available for selection during the ingestion process.

For example, if a user wants to filter only the following items:

  • Feature Services that have "population" or "census" in their title
  • Hub Site Applications that are owned by specific users
  • Web Maps that have certain tags

then the advanced filters can be configured as follows:

| Item Type | Key | Value | Description |
| --- | --- | --- | --- |
| Feature Service | title | .*(?:population\|census).* | Matches titles containing "population" OR "census" |
| Hub Site Application | owner | ^(john.doe\|jane.smith)$ | Matches items owned by "john.doe" OR "jane.smith" |
| Web Map | tags | .*(?:demographics\|survey).* | Matches items tagged with "demographics" OR "survey" |

Additional Examples:

  • Filter by access level: Use field access with pattern public to match items with public access level
  • Filter by description content: Use field description with pattern .*(?:environmental|climate).* to match items with environmental or climate-related keywords in the descriptions
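As a local illustration, the regex patterns from the examples above can be exercised with Python's re module. The sample titles and owners below are made up; matching is case-insensitive per this page, and full-string matching is assumed here since the patterns carry explicit .* wildcards:

```python
import re

# Patterns copied from the advanced-filters examples above.
title_pattern = re.compile(r".*(?:population|census).*", re.IGNORECASE)
owner_pattern = re.compile(r"^(john.doe|jane.smith)$", re.IGNORECASE)
tags_pattern = re.compile(r".*(?:demographics|survey).*", re.IGNORECASE)

# Hypothetical item properties.
print(bool(title_pattern.fullmatch("US Census Blocks 2020")))  # True: contains "Census"
print(bool(owner_pattern.fullmatch("Jane.Smith")))             # True: case-insensitive
print(bool(tags_pattern.fullmatch("roads,parcels")))           # False: no matching tag
```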

Filter Processing

Advanced filters are applied AFTER the basic datasource filters, providing additional refinement of the discovered items. The filters use case-insensitive regex matching.

After filling in the details, click on the "Continue" button to proceed to the next step.

Step 2: Select ArcGIS Items

In this step, you can select the ArcGIS items to ingest. The system will query your ArcGIS online account using the datasource filters and any advanced filters to discover available items.

Selection Process:

  1. Items are organized by type and displayed with their titles and IDs
  2. Select items by clicking the checkbox next to each item
  3. For Feature Service items (metadata-and-data ingestion only):
    • Expand the Feature Service to view available layers
    • Each layer shows its name and schema information
    • Select individual layers for data ingestion
    • Layer schema is automatically detected and displayed

After selecting the items and layers, click on the "Continue" button to proceed to the next step.

Step 3: Configure Selected Items

In this step, you can configure the selected items for data ingestion.

For Metadata-Only Ingestion:

  • No additional configuration required
  • Items will be cataloged with their existing metadata

For Metadata-and-Data Ingestion (Feature Service layers only):

Each selected Feature Service layer requires dataset configuration:

| Field | Description | Required |
| --- | --- | --- |
| Dataset Name | Unique name for the dataset within the domain | Yes |
| Domain | Target domain for the dataset (must have editor access) | Yes |
| Description | Dataset description | Optional |
| Keywords | Tags for dataset discovery and organization | Optional |

Ingestion Filtering Configuration (API Only)

This is an optional configuration that can be used to filter the data ingested into the dataset.

Ingestion Config:

| Field | Description | Required |
| --- | --- | --- |
| MaxRecords | The maximum number of records to ingest | Optional |
| StartOffset | The offset to start ingesting from | Optional |
| ColumnFilters | Object containing LogicalOperator and Filters for data filtering | Optional |

ColumnFilters Object (inside IngestionConfig):

| Field | Description | Required |
| --- | --- | --- |
| LogicalOperator | The logical operator to use for combining filters (e.g., "AND", "OR") | Optional |
| Filters | Array of Filter Objects (see structure below) | Yes |

Filter Object Structure (inside ColumnFilters.Filters):

| Field | Description | Required |
| --- | --- | --- |
| ColumnName | Name of the column to filter on | Yes |
| Operator | Comparison operator: =, !=, >, <, >=, <=, LIKE, IS NULL, IS NOT NULL | Yes |
| ColumnValue | Value to compare against | Yes |

Example Payload:

A city planning department wants to ingest traffic incident data from their ArcGIS Feature Service layer "Traffic_Incidents". They want to:

  • Skip the first 5 records (StartOffset: 5)
  • Limit ingestion to maximum 50 records (MaxRecords: 50)
  • Filter to only include incidents that are currently active and have a severity level greater than 10
POST /datasources/{datasource_id}/dataflows

```json
{
  "DataflowType": "arcgis",
  "DataflowName": "Traffic_Incidents",
  "IngestionType": "metadata-and-data",
  "DataflowConfig": {
    "IngestionType": "metadata-and-data",
    "TargetLocation": "s3athena"
  },
  "ItemsConfig": {
    "ItemType": "Feature Service",
    "ItemDetails": {
      "ItemId": "Traffic_Incidents",
      "LayerId": 0,
      "DatasetConfig": {
        "DatasetName": "Traffic_Incidents",
        "Description": "Traffic Incidents",
        "Domain": "traffic",
        "Keywords": ["Traffic", "Incidents"]
      },
      "IngestionConfig": {
        "MaxRecords": 50,
        "StartOffset": 5,
        "ColumnFilters": {
          "LogicalOperator": "AND",
          "Filters": [
            { "ColumnName": "status", "Operator": "=", "ColumnValue": "active" },
            { "ColumnName": "severity", "Operator": ">", "ColumnValue": "10" }
          ]
        }
      }
    }
  }
}
```
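The ColumnFilters semantics can be sketched locally in Python. This is an illustrative approximation only: the real filtering runs server-side during ingestion, LIKE and NULL operators are omitted for brevity, and applying the offset/limit after column filtering is an assumption here:

```python
# Illustrative sketch of IngestionConfig semantics; not Amorphic's implementation.
OPS = {
    "=":  lambda a, b: str(a) == str(b),
    "!=": lambda a, b: str(a) != str(b),
    ">":  lambda a, b: float(a) > float(b),
    "<":  lambda a, b: float(a) < float(b),
    ">=": lambda a, b: float(a) >= float(b),
    "<=": lambda a, b: float(a) <= float(b),
}

def apply_ingestion_config(records, config):
    column_filters = config.get("ColumnFilters", {})
    filters = column_filters.get("Filters", [])
    # "AND" requires every filter to match; "OR" requires at least one.
    combine = all if column_filters.get("LogicalOperator", "AND") == "AND" else any
    matched = [
        r for r in records
        if not filters
        or combine(OPS[f["Operator"]](r[f["ColumnName"]], f["ColumnValue"]) for f in filters)
    ]
    # Assumption: StartOffset/MaxRecords applied after column filtering.
    start = config.get("StartOffset", 0)
    return matched[start:start + config.get("MaxRecords", len(matched))]

incidents = [
    {"status": "active", "severity": 15},
    {"status": "closed", "severity": 20},
    {"status": "active", "severity": 5},
    {"status": "active", "severity": 30},
]
config = {
    "MaxRecords": 50,
    "StartOffset": 1,
    "ColumnFilters": {
        "LogicalOperator": "AND",
        "Filters": [
            {"ColumnName": "status", "Operator": "=", "ColumnValue": "active"},
            {"ColumnName": "severity", "Operator": ">", "ColumnValue": "10"},
        ],
    },
}
print(apply_ingestion_config(incidents, config))
```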

Bulk Configure

Use the "Bulk Configure" feature to configure multiple datasets simultaneously:

  • Dataset Prefix/Suffix: Apply common naming patterns
  • Domain: Set the same domain for all datasets
  • Description: Apply common description template
  • Keywords: Add common tags to all datasets

This feature significantly speeds up configuration when working with multiple Feature Service layers.

After configuring the items, click on the "Create Dataflow" button to create the dataflow. The dataflow creation will start in the background and you can see the status of the dataflow on the dataflows tab.

How to start a dataflow to ingest ArcGIS items?

Once the dataflow is created, you can see details of the dataflow on the dataflows tab.

ArcGIS Dataflow Start

  1. Navigate to the dataflows tab on the datasource details page
  2. Click the "Start" button in the actions column for the desired dataflow
  3. The dataflow will begin execution in the background

Monitoring Dataflow Execution

Dataflow Details Page:

  • Click on the dataflow name to view detailed information
  • Shows basic dataflow configuration and status
  • Displays registered datasets in the Dataset Details section (for metadata-and-data ingestion)
  • Click dataset names to navigate to dataset details

Logs:

Monitor and troubleshoot dataflow executions by downloading detailed logs. To access logs:

  1. Navigate to the dataflow details page
  2. Click the three-dot menu icon (⋮) in the top right corner
  3. Select "View Logs" to see available log options

ArcGIS Dataflow Logs


Available Log Types:

For Metadata-Only Ingestion:

  • Output Metadata Logs: Contains metadata extraction details, item discovery information, and processing status
  • Error Metadata Logs: Captures any errors or warnings encountered during metadata cataloging

For Metadata-and-Data Ingestion:
  • Output Metadata Logs: Contains metadata extraction details and item discovery information
  • Error Metadata Logs: Captures metadata-related errors or warnings
  • Output Data Logs: Contains data ingestion details, record counts, and dataset creation information
  • Error Data Logs: Captures data ingestion errors, schema validation issues, and transformation failures

Log Analysis

Logs are useful for debugging failed executions, monitoring ingestion progress, and understanding data quality issues. Check Error Logs first when troubleshooting failed dataflows.


Executions:

This section shows the execution history and status for each dataflow run. Users can monitor progress and view the latest message for each execution. Click the eye icon to view detailed information and metrics for an execution, or to download its logs.

ArcGIS Dataflow Executions

How to schedule a dataflow to ingest ArcGIS items?

To schedule a dataflow to ingest ArcGIS items, go to the dataflow details page, click on the "Schedule" tab, and then click on the "Create Schedule" button.

ArcGIS Dataflow Schedule

Fill in the details as shown in the table below and click on the "Schedule" button.

| Details | Description |
| --- | --- |
| Schedule Name | Give the schedule a unique name |
| Description | Add a description for the schedule |
| Job Type | Choose the job type: arcgis-full-load |
| Select Dataflow | Choose the dataflow to schedule |
| Schedule Type | Choose the schedule type: Time Based or On-Demand |
| Schedule Expression | Cron expression for the schedule; required for time-based schedules |
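The cron dialect for the Schedule Expression is not specified on this page; assuming an AWS EventBridge-style six-field cron expression (a common choice for AWS-backed schedulers), a daily run at 02:00 UTC might look like:

```
cron(0 2 * * ? *)
```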

Once the schedule is created, you can see the schedule details on the schedules tab. Click on the schedule name to view the schedule details and click on the "Run Schedule" button to run the schedule immediately.

Deletion of ArcGIS Dataflows

When an ArcGIS dataflow is deleted, only the metadata (catalog assets) are removed. The actual data and datasets remain in the system and are not automatically deleted.

Note

If you need to remove the actual datasets created from an ArcGIS dataflow, you must delete them separately from the Datasets section.

How to delete an ArcGIS datasource?

To delete an ArcGIS datasource, you can go to the datasource details page and click on the "Trash" button. Click on the "Delete Datasource" button to confirm the deletion of the datasource.

ArcGIS Datasource Deletion