I receive binary files (~3 MB each) in batches of ~20,000 files at a time. These files are used downstream for further processing, but I wa
I have files in Azure Blob Storage that I need to load daily into the Data Lake. I am not clear on which approach I should use (Azure Batch Account, Custom Activity
I would like to read mails from Microsoft Outlook using Python and run the script on a Databricks cluster. I'm using win32com on my local machine and am able to
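win32com drives a locally installed Outlook over COM, which a Databricks (Linux) cluster does not have, so a common alternative is the Microsoft Graph REST API. A minimal sketch, assuming an Azure AD app registration with mail permissions; the tenant/client IDs, secret scope, and mailbox address are hypothetical placeholders:

```python
import requests

tenant_id = "<tenant-id>"        # hypothetical
client_id = "<app-client-id>"    # hypothetical
client_secret = dbutils.secrets.get("mail-scope", "graph-client-secret")  # hypothetical scope/key

# Client-credentials token for the Graph API.
token_resp = requests.post(
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
    data={
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://graph.microsoft.com/.default",
    },
)
token = token_resp.json()["access_token"]

# Read the 10 most recent messages from a mailbox the app has access to.
msgs = requests.get(
    "https://graph.microsoft.com/v1.0/users/user@example.com/messages?$top=10",
    headers={"Authorization": f"Bearer {token}"},
).json()
for m in msgs.get("value", []):
    print(m["subject"])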
I have two Parquet files in Azure Data Lake Gen2 and I want to insert overwrite one with the other. I was trying the same in Azure Databricks by doing the below. Reading
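A minimal sketch of the read-then-overwrite flow, assuming the two datasets live under hypothetical abfss:// paths with storage authentication already configured:

```python
# Hypothetical ADLS Gen2 paths.
src = "abfss://container@account.dfs.core.windows.net/data/source_parquet"
dst = "abfss://container@account.dfs.core.windows.net/data/target_parquet"

# Read the source dataset and overwrite the target in place.
df = spark.read.parquet(src)
df.write.mode("overwrite").parquet(dst)
```

Note that mode("overwrite") replaces the whole target directory; for partition-level overwrites, a Delta table with replaceWhere is the usual route.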
I have a really simple piece of logic that I would like to understand how to make work in PySpark: for data in df1: spark_data_row = spark.createDataFrame(data
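A PySpark DataFrame cannot be iterated directly; a sketch of the usual workaround, pulling rows to the driver with collect() (fine for small DataFrames), reusing the names from the snippet above:

```python
# Iterate the collected rows on the driver; each 'row' is a pyspark.sql.Row.
for row in df1.collect():
    # Wrap the single Row in a list so createDataFrame gets an iterable of rows.
    spark_data_row = spark.createDataFrame([row], schema=df1.schema)
    spark_data_row.show()
```

For large tables, df1.toLocalIterator() avoids materializing everything on the driver at once.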
I have tables in Azure Databricks that I interact with via SQL in a notebook. I need to select all columns from a table with 200 columns; I need to sel
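One way to avoid typing ~200 column names is to read the table and drop the unwanted columns; a sketch with hypothetical table and column names (recent Databricks SQL also accepts SELECT * EXCEPT (...) for the same effect):

```python
# Drop the handful of columns you don't want instead of listing the ~200 you do.
cols_to_skip = ["col_a", "col_b"]  # hypothetical column names
df = spark.table("my_database.wide_table").drop(*cols_to_skip)
display(df)
```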
I would like to set only one branch for a shared folder in my Databricks workspace; attaching a screenshot to give more clarity on the same. All of the Data Factory pipeli
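A repo folder in the Databricks workspace tracks exactly one branch at a time, and that branch can be pinned programmatically through the Repos API; a hedged sketch where the host, secret scope, and repo ID are hypothetical placeholders:

```python
import requests

host = "https://<workspace-host>"                              # hypothetical
token = dbutils.secrets.get("devops-scope", "databricks-pat")  # hypothetical scope/key
repo_id = "<repo-id>"                                          # listed by GET /api/2.0/repos

# Pin the checked-out folder to a single branch.
resp = requests.patch(
    f"{host}/api/2.0/repos/{repo_id}",
    headers={"Authorization": f"Bearer {token}"},
    json={"branch": "main"},
)
print(resp.status_code, resp.json())
```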
I'm using Databricks Autoloader to incrementally stream from a Delta Lake table into a SQL database. If an OPTIMIZE or VACUUM statement is run against the Delt
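For context: OPTIMIZE commits its rewritten files with dataChange=false, which Delta streaming reads skip by default, and VACUUM only affects a stream that still needs the removed files. A sketch of a tolerant read, where the option and table name are assumptions to adapt:

```python
# Stream from the Delta table while ignoring commits that only change rows
# (updates/deletes); note those changes will then NOT be propagated downstream.
stream = (
    spark.readStream
    .format("delta")
    .option("skipChangeCommits", "true")  # assumption: reprocessing changes is not required
    .table("source_delta_table")          # hypothetical source table
)
```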
I am working on an IoT solution where there are multiple sensors sending data. I have one job which listens to an Event Hub, gets the IoT sensor data, and sto
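A sketch of the listening job as a Structured Streaming read from Event Hubs, assuming the azure-eventhubs-spark connector is installed on the cluster; the secret scope/key and sink paths are hypothetical:

```python
# Encrypt the connection string, as the connector requires.
conn_str = dbutils.secrets.get("iot-scope", "eventhub-conn")
eh_conf = {
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(conn_str)
}

# Read raw events and decode the binary body into a string payload.
raw = spark.readStream.format("eventhubs").options(**eh_conf).load()
decoded = raw.selectExpr("CAST(body AS STRING) AS payload", "enqueuedTime")

# Land the events in a Delta sink with a checkpoint for exactly-once tracking.
(decoded.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/iot/checkpoints/sensors")
    .start("/mnt/iot/bronze/sensors"))
```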
I have a set of records with 10 columns. There is a column 'x' which contains an array of float values, and the length of the array can be very large (e.g., the len
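When the array gets that large, a common reshaping is one row per element via posexplode, which keeps the element position; a sketch assuming the column name from the description:

```python
from pyspark.sql import functions as F

# One output row per array element, with its index preserved in 'x_pos'.
flattened = df.select("*", F.posexplode("x").alias("x_pos", "x_value")).drop("x")
flattened.show()
```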
I am getting the below error while importing the Pycaret time-series (beta) module in Databricks (we were running it successfully earlier). Request your help in so
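Import errors in a previously working environment are usually dependency drift; a hedged first step is reinstalling a pinned build at notebook scope and restarting Python. 'pycaret-ts-alpha' was the PyPI name the time-series beta was published under, which may have changed since:

```python
# -- cell 1: notebook-scoped install of the time-series beta --
%pip install pycaret-ts-alpha

# -- cell 2: restart the Python process so the fresh install is imported --
dbutils.library.restartPython()
```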
I am trying to get the workspace name inside a Python notebook. Is there any way we can do this? E.g., my workspace name is databricks-test. I want to capture this i
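One approach, sketched below: the workspace URL is exposed as a Spark conf, and on many deployments its first DNS label is the workspace name (e.g. databricks-test); that naming convention is an assumption, not a guarantee:

```python
# e.g. 'databricks-test.cloud.databricks.com' or 'adb-123....azuredatabricks.net'
workspace_url = spark.conf.get("spark.databricks.workspaceUrl")
workspace_name = workspace_url.split(".")[0]
print(workspace_name)
```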
How to create a database with a name from a variable (in SQL, not in Spark)? I've written this: %sql SET myVar = CONCAT(getArgument('env'), 'BackOffice'); CRE
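SQL SET variables are not substituted into identifiers, which is why the CREATE fails; a sketch of the usual workaround, composing the name outside SQL and passing it through (the widget name 'env' mirrors the getArgument call above):

```python
# Build the database name in Python, then hand it to SQL.
env = dbutils.widgets.get("env")
db_name = f"{env}BackOffice"
spark.sql(f"CREATE DATABASE IF NOT EXISTS {db_name}")
```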
I want to do a simple shap analysis and plot a shap.force_plot. I noticed that it works without any issues locally in a .ipynb file, but fails on Databricks wit
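force_plot's default JavaScript rendering has nowhere to display in a Databricks notebook, which is the usual cause of this failure; a sketch of the matplotlib fallback for a single observation, assuming a fitted tree model 'model' and feature frame 'X':

```python
import shap

explainer = shap.TreeExplainer(model)   # 'model' is assumed to be a fitted tree model
shap_values = explainer.shap_values(X)  # 'X' is assumed to be a pandas DataFrame

# Render one observation statically instead of the interactive JS widget.
shap.force_plot(
    explainer.expected_value,
    shap_values[0, :],
    X.iloc[0, :],
    matplotlib=True,
)
```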
```python
dbutils.fs.mount(
    source = f"wasbs://{blob.storage_account_container}@{blob.storage_account_name}.blob.core.windows.net/",
    mount_point = "/mnt/MLRExtract/"
```
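The call above is cut off before its auth argument; a hedged completion, assuming account-key authentication (the secret scope and key names are hypothetical):

```python
dbutils.fs.mount(
    source=f"wasbs://{blob.storage_account_container}@{blob.storage_account_name}.blob.core.windows.net/",
    mount_point="/mnt/MLRExtract/",
    extra_configs={
        f"fs.azure.account.key.{blob.storage_account_name}.blob.core.windows.net":
            dbutils.secrets.get(scope="blob-scope", key="storage-account-key")
    },
)
```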
Error: AnalysisException: Recursive view management_db.v_extract detected (cycle: management_db.v_extract -> management_db.v_extract). The query outside of the v
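The cycle means the new view definition references the view itself; a sketch of one way out, snapshotting the current contents to a table and redefining the view from the snapshot (the snapshot table name is hypothetical):

```python
# Materialize the view's current contents before replacing its definition.
df = spark.table("management_db.v_extract")
df.write.mode("overwrite").saveAsTable("management_db.v_extract_snapshot")

spark.sql("""
    CREATE OR REPLACE VIEW management_db.v_extract AS
    SELECT * FROM management_db.v_extract_snapshot
""")
```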
The following example from the Azure team uses the Apache Spark connector for SQL Server to write data to a table. Question: how can we execute a stored procedure in an
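The connector itself only reads and writes tables; a common workaround is to call the procedure over plain JDBC from the driver through the JVM. A sketch where the server, database, credentials, and procedure name are hypothetical:

```python
jdbc_url = (
    "jdbc:sqlserver://myserver.database.windows.net:1433;"
    "database=mydb;encrypt=true"
)

# Borrow the JVM's DriverManager from the Spark gateway to open a raw connection.
driver_manager = spark.sparkContext._gateway.jvm.java.sql.DriverManager
conn = driver_manager.getConnection(jdbc_url, "sql_user", "sql_password")
try:
    stmt = conn.createStatement()
    stmt.execute("EXEC dbo.usp_refresh_table")  # hypothetical stored procedure
finally:
    conn.close()
```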
I read the Google API documentation pages (Drive API, PyDrive) and created a Databricks notebook to connect to Google Drive. I used the sample code in the d
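Interactive OAuth flows such as LocalWebserverAuth can't open a browser on a cluster, which is the usual stumbling block here; a hedged sketch using a service-account credential with PyDrive2 (the JSON path is hypothetical):

```python
from pydrive2.auth import GoogleAuth
from pydrive2.drive import GoogleDrive

gauth = GoogleAuth()
gauth.settings["client_config_backend"] = "service"
gauth.settings["service_config"] = {
    # Hypothetical path to an uploaded service-account key file.
    "client_json_file_path": "/dbfs/FileStore/creds/service_account.json",
}
gauth.ServiceAuth()  # non-interactive auth, no browser needed

drive = GoogleDrive(gauth)
for f in drive.ListFile({"q": "'root' in parents and trashed=false"}).GetList():
    print(f["title"], f["id"])
```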
Even though secrets are meant to mask confidential information, I need to see the value of a secret in order to use it outside Databricks. When I simply print the sec
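Databricks redacts the literal secret string in notebook output, so the usual (deliberately awkward) trick is to print it character by character; the scope and key names below are hypothetical, and the value should be treated as exposed once printed:

```python
secret = dbutils.secrets.get(scope="my-scope", key="my-key")

# The redaction matches the exact string, so splitting it defeats the mask.
print(" ".join(secret))
```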
I wonder how this query executes successfully. As we know, the 'having' clause is evaluated before the 'select' one, so how can an alias name defined in the 'select' statement be used here
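What actually happens is that engines such as Spark SQL resolve the alias in HAVING by substituting the underlying aggregate expression back in, so the logical ordering is preserved; a small sketch with hypothetical table and column names:

```python
# 'total_salary' is an alias from the SELECT list, yet HAVING can use it:
# the planner rewrites it to SUM(salary) before evaluation.
spark.sql("""
    SELECT department, SUM(salary) AS total_salary
    FROM employees
    GROUP BY department
    HAVING total_salary > 100000
""").show()
```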