Parquet file processing fails in Redshift data loads when column data types are modified
Parquet file uploads fail when you change column data types in Redshift datasets.
Affected Versions: 2.7, 3.0, 3.0.3, 3.0.6
Fix Version: 3.1
Root cause(s)
When you change a column's data type in a Redshift dataset (for example, from INTEGER to VARCHAR), Redshift restructures the table internally by creating a new column, copying the data into it, and dropping the old column. This moves the modified column to the end of the table, changing the dataset's column order.
Parquet loads map values to columns by position, so after the data type change the file's column order no longer matches the dataset's restructured schema. The load then either fails with a conversion error or writes values into the wrong columns.
Other file formats like CSV, Excel, and TSV use column names instead of position, so they work correctly even after data type changes.
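The difference between the two mapping strategies can be sketched in Python. This is an illustration of the failure mode, not the Amorphic implementation; the schema and values are made up:

```python
# Original dataset schema: column order as created.
original_schema = ["id", "age", "name"]

# After changing age's type (e.g. INTEGER -> VARCHAR), Redshift
# drops and re-adds the column, pushing it to the end of the table.
restructured_schema = ["id", "name", "age"]

# A Parquet row written against the original column order.
parquet_row = [101, 34, "alice"]

def load_by_position(row, target_schema):
    """Parquet-style mapping: the i-th value lands in the i-th column."""
    return dict(zip(target_schema, row))

def load_by_name(row, source_schema, target_schema):
    """CSV/Excel/TSV-style mapping: values follow column names."""
    named = dict(zip(source_schema, row))
    return {col: named[col] for col in target_schema}

# Position-based load against the restructured table misplaces data:
print(load_by_position(parquet_row, restructured_schema))
# -> {'id': 101, 'name': 34, 'age': 'alice'}  (age/name swapped)

# Name-based load stays correct regardless of column order:
print(load_by_name(parquet_row, original_schema, restructured_schema))
# -> {'id': 101, 'name': 'alice', 'age': 34}
```

The positional load silently swaps the `name` and `age` values, which is exactly why the failure can surface either as a type-conversion error or as data landing in the wrong columns.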
Impact
- Parquet file uploads fail with data type conversion errors after column data type modifications
- Data may be loaded into incorrect columns without clear error messages
- Users are unable to upload Parquet files to datasets where column data types have been modified
- Inconsistent behavior between Parquet files and other supported file formats
Mitigation
The fix is available in Amorphic version 3.1. Upgrade to version 3.1 or later to resolve this issue.
Timeline
- 2024-07-24: Bug reported/identified (CLOUD-5965)
- 2024-07-24: Bug triaged and documented
- 2024-07-28: Root cause analysis, fix development and testing completed
- 2024-07-29: Solution merged and Version 3.1 released with the fix