This is my dataset:

    from pyspark.sql import SparkSession, functions as F
    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([('2021-02-07',)