'How do I obtain URL to download the csv files of Tensorflow datasets?
In TensorFlow examples, I can see URLs to download the csv format of the dataset. For example,
Iris- https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv
Titanic- https://storage.googleapis.com/tf-datasets/titanic/train.csv
However, I can't find the URL for every dataset in TensorFlow that are listed over her. (https://www.tensorflow.org/datasets/catalog/overview).
Solution 1:[1]
you don't need the URLs. Tensorflow datasets are already ready to use. check out the tutorial here tfds guide
For titanic, it is available here titanic structured dataset
Hope this would help :)
Solution 2:[2]
TensorFlow Datasets is having a collection of ready-to-use datasets. loaded from tfds - "Dataset downloaded and prepared to /root/tensorflow_datasets/iris/2.0.0. Subsequent calls will reuse this data. "- really covinient... but if you'd better take dataset from url (see here - pipelines are convinient):
# https://www.tensorflow.org/guide/data#consuming_csv_data
import tensorflow as tf
import pandas as pd
# test_file = tf.keras.utils.get_file("temperature.csv", "https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv")
titanic_file = tf.keras.utils.get_file("train.csv", "https://storage.googleapis.com/tf-datasets/titanic/train.csv")
df = pd.read_csv(titanic_file)
df.head()
# make dataset from pandas:
myDataset = tf.data.Dataset.from_tensor_slices(dict(df))
for feature_batch in myDataset.take(1):
for key, value in feature_batch.items():
print(" {!r:20s}: {}".format(key, value))
titanic_lines = tf.data.TextLineDataset(titanic_file)
for line in titanic_lines.take(10):
print(line.numpy())
here are different Datasets & Flows also
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Amal Nozieres |
Solution 2 |