Cloud Fusion Pipeline works well in Preview mode but throws an error in Deployment mode

I have the below pipeline, which ingests news data from RSS feeds. The pipeline is constructed using HTTPPoller, XMLMultiParser Transform, JavaScript and MongoDB Sink. It works well in Preview mode but throws a "bucket not found" error in Deployment mode.

RSS Ingest Pipeline

Error



Solution 1:[1]

When you create a Cloud Data Fusion (CDF) instance, CDF creates a Google Cloud Storage (GCS) bucket in your GCP project, with a name in a format similar to the one mentioned in the error message. Judging by the error message, it's possible that this GCS bucket has been deleted. Try deploying the same pipeline in a new CDF instance (with the bucket present this time); it should not raise the same exception.

This bucket is used as a Hadoop Compatible File System (HCFS), which is required to run pipelines on Dataproc.
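Before recreating the instance, it may help to confirm that the bucket is really missing. The following is a minimal sketch, assuming the `google-cloud-storage` Python client is installed and application default credentials are configured. The helper names `bucket_exists` and `make_client`, the project ID, and the bucket name are all placeholders for illustration; substitute the bucket name shown in your own error message.

```python
def bucket_exists(client, bucket_name):
    """Return True if the bucket is visible to the given client."""
    # Client.lookup_bucket returns None instead of raising when the
    # bucket does not exist, which suits a simple existence check.
    return client.lookup_bucket(bucket_name) is not None


def make_client(project_id):
    """Build a Cloud Storage client for the given project (hypothetical helper)."""
    # Imported lazily so bucket_exists can be used with any object that
    # offers a lookup_bucket method, even without the library installed.
    from google.cloud import storage

    return storage.Client(project=project_id)


# Example usage (placeholder values, not taken from the question):
# client = make_client("my-gcp-project")
# print(bucket_exists(client, "bucket-name-from-the-error-message"))
```

If this reports that the bucket does not exist, that would be consistent with the explanation above. A permissions error from the call would instead point to an access problem rather than a deleted bucket.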

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Arjan Singh Bal