'invalid byte sequence for encoding "UTF8": 0x00' while importing into PostgreSQL from S3
I am importing data from a CSV file in S3 into PostgreSQL on RDS using the aws_s3 extension, and the import fails partway through with an error.
Command
psql=> SELECT aws_s3.table_import_from_s3( 't1', '',
'(format csv)', :'s3_uri' );
Error
ERROR: invalid byte sequence for encoding "UTF8": 0x00
CONTEXT: COPY t1, line 7324484
I tried changing the column type to text, but that did not help.
Solution 1:[1]
My data contained invalid values (0x00 bytes) that needed to be cleaned before the import.
While I was exporting the data from Redshift to S3, I found that Redshift has some support for cleaning this type of data.
Here is a link to the solution: https://aws.amazon.com/premiumsupport/knowledge-center/remove-invalid-characters-redshift-data/
thanks
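If the data cannot be cleaned at the source, the same idea can be applied to the CSV file itself before it is uploaded to S3. The following is a minimal Python sketch, not the method from the linked article: it assumes the file is available locally, that the 0x00 bytes carry no meaning and can simply be dropped, and the function name `strip_nul_bytes` is hypothetical. PostgreSQL text values cannot contain the zero byte, which is why changing the column type does not help.

```python
def strip_nul_bytes(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path without 0x00 bytes.

    Returns the number of 0x00 bytes that were removed.
    Assumption: NUL bytes are junk and can be dropped safely.
    """
    removed = 0
    # Work in binary mode so no decoding happens before the cleanup.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            removed += chunk.count(b"\x00")
            dst.write(chunk.replace(b"\x00", b""))
    return removed
```

Because 0x00 is a single byte, reading in fixed-size chunks cannot split it, so the file never has to fit in memory. A non-zero return value confirms that the file really contained the bytes the error complains about.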
Solution 2:[2]
If you really have ASCII 0x00 (NUL) in your input data, you need to declare it as the NULL marker by adding NULL AS '\000' to your COPY command.
See - https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html
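As a rough illustration, here is a Python sketch that assembles such a COPY statement. Note that the linked page documents the Redshift COPY command, so this applies to loading into Redshift; the table name, S3 URI, and IAM role passed in are placeholders, and the helper `build_redshift_copy` is hypothetical.

```python
def build_redshift_copy(table, s3_uri, iam_role):
    """Return a Redshift COPY statement that loads NUL-only fields as NULL.

    Illustrative only: the arguments are interpolated without escaping,
    so pass trusted values.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        # Raw string keeps the backslash so the SQL contains '\000' literally.
        r"NULL AS '\000';"
    )
```

Calling it with placeholder values such as `build_redshift_copy("t1", "s3://example-bucket/data.csv", "arn:aws:iam::123456789012:role/ExampleRole")` yields a statement that can be run from any Redshift SQL client.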
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Prajapati Mehul |
| Solution 2 | Bill Weiner |