Upload large csv file to cloud storage using Python

Hi, I am trying to upload a large CSV file, but I am getting the error below:

HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /upload/storage/v1/b/de-bucket-my-stg/o?uploadType=resumable&upload_id=ADPycdsyu6gSlyfklixvDgL7RLpAQAg6REm9j1ICarKvmdif3tASOl9MaqjQIZ5dHWpTeWqs2HCsL4hoqfrtVQAH1WpfYrp4sFRn (Caused by SSLError(SSLWantWriteError(3, 'The operation did not complete (write) (_ssl.c:2396)')))

Can someone help me on this?

Below is my code for it:

    import os
    import pandas as pd
    import io
    import requests
    from google.cloud import storage

    try:
        # Download the source CSV from S3 and load it into a DataFrame
        url = "https://cb-test-dataset.s3.ap-south-1.amazonaws.com/analytics/analytics.csv"
        cont = requests.get(url).content
        file_to_upload = pd.read_csv(io.StringIO(cont.decode('utf-8')))
    except Exception as e:
        print('Error getting file: ' + str(e))

    try:
        # Point the client at the service-account key (xxx is replaced here)
        os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'C:/Users/haris/Desktop/de-project/xxx.json'
        storage_client = storage.Client()
        bucket_name = storage_client.get_bucket('de-bucket-my-stg')
        blob = bucket_name.blob('analytics.csv')
        blob.upload_from_string(file_to_upload.to_csv(), 'text/csv')
    except Exception as e:
        print('Error uploading file: ' + str(e))


Solution 1:[1]

As mentioned in the documentation:

My recommendation is to gzip your file before sending it. Text files have a high compression rate (up to 100 times), and you can ingest gzip files directly into BigQuery without unzipping them.
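
A minimal sketch of that gzip approach, reusing the bucket and object names and the file_to_upload DataFrame from the question (the .csv.gz object name is illustrative; adjust paths and names to your environment):

    import gzip
    from google.cloud import storage

    # Compress the CSV in memory before uploading; plain-text data typically
    # shrinks dramatically under gzip, so far fewer bytes go over the wire.
    csv_bytes = file_to_upload.to_csv().encode('utf-8')
    compressed = gzip.compress(csv_bytes)

    storage_client = storage.Client()
    bucket = storage_client.get_bucket('de-bucket-my-stg')
    blob = bucket.blob('analytics.csv.gz')
    blob.content_encoding = 'gzip'  # lets clients decompress transparently on download
    blob.upload_from_string(compressed, content_type='text/csv')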

The fastest method of uploading to Cloud Storage is to use the compose API and composite objects.
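
A rough sketch of the compose approach, assuming the same bucket and a local analytics.csv file (the 32 MiB chunk size and the parts/ prefix are illustrative values; note that a single compose call accepts at most 32 source objects):

    from google.cloud import storage

    CHUNK_SIZE = 32 * 1024 * 1024  # upload the file in 32 MiB parts

    storage_client = storage.Client()
    bucket = storage_client.get_bucket('de-bucket-my-stg')

    # Upload each slice of the local file as its own temporary object.
    parts = []
    with open('analytics.csv', 'rb') as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            part = bucket.blob(f'parts/analytics-{index:04d}')
            part.upload_from_string(data, content_type='text/csv')
            parts.append(part)
            index += 1

    # Stitch the parts back together server-side (max 32 sources per call).
    final_blob = bucket.blob('analytics.csv')
    final_blob.compose(parts)

    # Remove the intermediate part objects once composition succeeds.
    for part in parts:
        part.delete()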

For more information, you can refer to the Stack Overflow thread where the OP faced a similar error.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Divyani Yadav