I have four questions. Suppose in Spark I have 3 worker nodes. Each worker node has 3 executors and each executor has 3 cores. Each executor has 5 GB of memory. (T
I have a Databricks notebook running every 5 minutes; part of its functionality is to connect to a file in Azure Data Lake Storage Gen2 (ADLS Gen2). I get the foll
How do I create a database with a name from a variable (in SQL, not in Spark)? I've written this: %sql SET myVar = CONCAT(getArgument('env'), 'BackOffice'); CRE
I want to do a simple shap analysis and plot a shap.force_plot. I noticed that it works without any issues locally in a .ipynb file, but fails on Databricks wit
Instead of the expected output from a display(my_dataframe), I get "Failed to fetch the result. Retry" when looking at the completed run (also marked as success).
I have been studying for the above exam using Databricks' learning platform, but I have not found any external resources such as study guides or practice exams
I read the Google API documentation pages (Drive API, pyDrive) and created a Databricks notebook to connect to Google Drive. I used the sample code in the d
Even though secrets are for masking confidential information, I need to see the value of the secret to use it outside Databricks. When I simply print the sec
I have a process using the following select statement in SQL Server: SELECT HASHBYTES('SHA1', CAST('4100119300' AS NVARCHAR(100))) AS StringConverted This give
I have a table with ~5k columns and ~1 M rows that looks like this: ID Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10 Col11 ID1 0 1 0 1 0 2 1 1 2 2 0 ID2 1
I have multiple JSON files (~10 TB) in an S3 bucket, and I need to organize these files by a date element present in every JSON document. What I think that my c
I am trying to clean up and recreate a Databricks Delta table for integration tests. I want to run the tests on a DevOps agent, so I am using JDBC (Simba driver) bu
As you can see the library I'm using is asking to make an entry but there's no box/window where I can make the entry. How do I make an entry here amongst y/n/u/
I have streaming data coming in as a JSON array and I want to flatten it out into a single row in a Spark DataFrame using Python. Here is what the JSON data looks like
I am loading data via pipelines into an ADLS Gen2 container. Now I want to create a table that records when the pipeline started running and when it completed. l
I need to find a way to delete multiple rows from a Delta table/PySpark DataFrame given a list of IDs to identify the rows. As far as I can tell there isn't a
I am using the Spark ML library for a classification problem with logistic regression. I have vectorized input features and created a training dataset and test datas
I am running Databricks 7.3 LTS and getting errors while trying to use Scala bulk copy. The error is: object sqldb is not a member of package com.microsoft. I hav
I am working with Azure Databricks Jupyter notebooks and have time-consuming jobs (complex queries, model training, loops over many items, etc.). Every time I c
I have been really struggling for months. We are trying to scan Scala code that lives in Databricks with SonarQube in Azure DevOps. We were getting around 30 errors. But