I have a problem with unit testing in Databricks. I have not found any similar error message yet. Could someone please help me? Below you can find my code: impo
I am trying to open a file that I uploaded to a DBFS location. However, I get an error when trying to open the file, even though I can see the file when I run ls. Also
Short version: I need a faster/better way to update many column comments at once in Spark/Databricks. I have a PySpark notebook that can do this sequentially acro
I'm new to Databricks and just created a Delta Live Tables pipeline to ingest 60 million JSON files from S3. However, the input rate (the number of files that it read fro
I am currently trying to flatten the data in a Databricks table. Since some of the columns are deeply nested and are of 'String' type, I couldn't use explode f
I'm running about 20 notebooks concurrently, and they all update the same Delta table (although different rows). I get the exception below if any two notebo
I'm trying to run TPC-DS with RAPIDS on a single node on EMR using this guide: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-rapids.html but I am getting res
I can use: show columns in table_name, but this does not allow me to use the output in a query. This throws an error: SELECT * FROM show columns in table_name
I am using spring-cloud-starter-aws-secrets-manager-config 2.3.3 for a Spring Boot application, which works perfectly on my local machine pointing to the stage environment
I am attempting to read the first X rows of a Delta table into a DataFrame, and then write (overwrite) that back to the Delta table. Here is the code: # r
I am running a delete query with < (less than) and > (greater than) conditions on the timestamp field, but I am not getting the desired results. Fir
I would like to read emails from Microsoft Outlook using Python and run the script on a Databricks cluster. I'm using win32com on my local machine and am able to
We're developing a custom runtime for a Databricks cluster. We need to version and archive our clusters for a client. We made it run successfully in our own environme
I am trying to run some example Python 3 code from https://docs.databricks.com/applications/deep-learning/distributed-training/horovod-runner.html on Databricks GPU c
I would like to set only one branch for the shared folder in a Databricks workspace. I am attaching a screenshot for clarity. All of Data Factory pipeli
I'm using Databricks Auto Loader to incrementally stream from a Delta Lake table into a SQL database. If an OPTIMIZE or VACUUM statement is run against the Delt
spark-submit on a Databricks cluster is giving this error. I am using Spark 3.1.2, Scala 2.12, and Spring Boot 2.6.3. However, spark-submit runs fine in m
In Databricks, I understand that a notebook can be executed from another notebook, but the notebook will run on the current cluster by default. For example: I have not
I have two tables: one with 50K records and the other with 2.5K records. I want to update these 2.5K records into the first table. Currently, I am doing this by us
I am trying to get the workspace name inside a Python notebook. Is there any way to do this? Example: my workspace name is databricks-test. I want to capture this i