I am using MLflow Webhooks, mentioned here. I am using them to queue an Azure DevOps pipeline. However, I can't seem to find a way to retrieve the paylo
Trying to process JSON data in a column on Databricks. Below is sample data from a table (it's weather device record info). JSON_Info {"sampleData":"dataD
Is there an elegant, easy, and fast way to move data out of HBase into MongoDB? I want to migrate from HBase to MongoDB. I am new to MongoDB. Could someone please hel
Issue: I'm trying to write to a Parquet file using spark.sql; however, I encounter issues when having unions or subqueries. I know there's some syntax I can't seem
I am getting the below error when updating the repo to a different branch using the Databricks REST API, as mentioned at https://docs.databricks.com/dev-tools/api/latest/
Can I iterate through the widgets in a Databricks notebook? Something like this pseudocode? # NB - not valid inputs = {widget.name: widget.value for widget in
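A minimal sketch of one workaround: as far as I know, dbutils.widgets has no documented API to enumerate widgets, so keep the names you created in a list and look each one up. The widget names below are hypothetical, and `get` stands in for dbutils.widgets.get so the logic can run outside a notebook.

```python
# Hypothetical widget names; in a real notebook these are whatever you
# passed to dbutils.widgets.text(...) etc.
WIDGET_NAMES = ["start_date", "end_date", "env"]

def collect_widget_values(names, get):
    """Build a {name: value} dict for the given widget names."""
    return {name: get(name) for name in names}

# In a Databricks notebook you would call:
#   inputs = collect_widget_values(WIDGET_NAMES, dbutils.widgets.get)

# Outside a notebook, a stub lookup shows the shape of the result:
stub_values = {"start_date": "2023-01-01", "end_date": "2023-01-31", "env": "dev"}
inputs = collect_widget_values(WIDGET_NAMES, stub_values.get)
print(inputs)
```

Passing the lookup function explicitly keeps the dict-building logic testable without a cluster.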
If a PySpark dataframe is reading some data from a table and writing it to Azure Delta Lake, can we add comments to this newly written file? E.g. Df = sql("se
Please clarify my confusion, as I keep hearing that we need to read every Parquet file created by Databricks Delta tables to get to the latest data in the case of an SCD2 table.
We have folders with year, month, and day subfolders in them. How can we get only the last leaf-level folder list using the dbutils.fs.ls utility? Exampl
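A minimal sketch of the traversal: dbutils.fs.ls is not recursive, so walk the tree yourself and keep only directories that contain no further subdirectories (the day-level leaves in a year/month/day layout). `list_dirs` is a placeholder for something like `lambda p: [f.path for f in dbutils.fs.ls(p) if f.isDir()]`, passed in so the recursion can run without a cluster; the paths below are made up.

```python
def leaf_folders(path, list_dirs):
    """Return all leaf-level folders under `path` (depth-first)."""
    subdirs = list_dirs(path)
    if not subdirs:          # no subdirectories -> this is a leaf
        return [path]
    leaves = []
    for sub in subdirs:
        leaves.extend(leaf_folders(sub, list_dirs))
    return leaves

# Stand-in tree mimicking year/month/day folders:
tree = {
    "/data": ["/data/2023"],
    "/data/2023": ["/data/2023/01", "/data/2023/02"],
    "/data/2023/01": ["/data/2023/01/05"],
    "/data/2023/01/05": [],
    "/data/2023/02": ["/data/2023/02/11"],
    "/data/2023/02/11": [],
}
print(leaf_folders("/data", tree.get))
# ['/data/2023/01/05', '/data/2023/02/11']
```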
I work on Databricks with a PySpark dataframe containing string-type columns. I use .withColumnRenamed() to rename one of them. Later in the process I use a .filt
Trying to flatten a nested JSON response using a Python Databricks dataframe. I was able to flatten the "survey" struct successfully but am getting errors when I try
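A minimal sketch of the flattening itself, in plain Python. The "survey" key and the sample payload are hypothetical stand-ins for the response described above; on Databricks the same effect is usually achieved by selecting `col("struct.*")` for structs and `explode()` for array columns.

```python
import json

def flatten(obj, prefix=""):
    """Flatten nested dicts into a single {dotted.key: value} dict."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

# Hypothetical payload with a nested "survey" struct:
payload = json.loads('{"survey": {"id": 7, "meta": {"lang": "en"}}, "score": 4}')
print(flatten(payload))
# {'survey.id': 7, 'survey.meta.lang': 'en', 'score': 4}
```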
We have a 1-day table aggregated with GROUP BY call_date, tdlinx_id, work_request_id, category_name; in another table we have 1-week-level data aggregated w
I'm getting the following error when I attempt to write to my data lake with Delta on Databricks: fulldf = spark.read.format("csv").option("header", True).option
The table below would be the input dataframe:

col1  col2      col3
1     12;34;56  Aus;SL;NZ
2     31;54;81  Ind;US;UK
3     null      Ban
4     Ned       null

Expected output dataframe: [values of c
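A minimal sketch of the pairing logic in plain Python: split each ';'-delimited cell and zip the two lists element-wise, one output row per pair. In PySpark this is typically done with `split()` on each column, `arrays_zip()`, and `explode()`; the padding rule for null/unequal lengths below is an assumption about the desired output.

```python
rows = [
    (1, "12;34;56", "Aus;SL;NZ"),
    (2, "31;54;81", "Ind;US;UK"),
    (3, None, "Ban"),
    (4, "Ned", None),
]

def explode_row(col1, col2, col3):
    """Yield one (col1, col2_item, col3_item) per ';'-separated pair."""
    parts2 = col2.split(";") if col2 is not None else []
    parts3 = col3.split(";") if col3 is not None else []
    width = max(len(parts2), len(parts3), 1)
    # Pad the shorter list with None so unmatched items still appear.
    parts2 += [None] * (width - len(parts2))
    parts3 += [None] * (width - len(parts3))
    for a, b in zip(parts2, parts3):
        yield (col1, a, b)

exploded = [pair for row in rows for pair in explode_row(*row)]
print(exploded)
```

Rows 1 and 2 each expand to three output rows; rows 3 and 4 keep their single unmatched value paired with None.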
I am new to Azure Databricks. I am trying to write a dataframe output to a Delta table which contains a TIMESTAMP column. But strangely it changes the TIMESTAMP pa
I wanted to do CI/CD of my Azure Databricks notebook using a YAML file. I have followed the below flow: pushed my code from the Databricks notebook to Azure Repos; crea
I'm running about 20 notebooks concurrently and they all update the same Delta table (however, different rows). I'm getting the below exception if any two notebo
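One common mitigation for Delta's concurrent-write conflicts (such as ConcurrentAppendException) is to retry the conflicting MERGE/UPDATE with backoff. A minimal sketch of the retry wrapper is below; `flaky_update` is a stub standing in for the real table update, and in a notebook you would catch the specific Delta exception type rather than bare Exception.

```python
import random
import time

def with_retries(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `operation`, retrying with exponential backoff + jitter on failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the conflict
            sleep(base_delay * (2 ** attempt) + random.random())

# Stub operation that conflicts twice, then succeeds:
calls = {"n": 0}
def flaky_update():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("ConcurrentAppendException (simulated)")
    return "committed"

result = with_retries(flaky_update, sleep=lambda s: None)
print(result)
# committed
```

Retrying only papers over the race; partitioning the table so each notebook's update condition touches disjoint partitions avoids the conflict in the first place.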
I have a Databricks notebook that writes a dataframe to a file in ADLS Gen2 storage. It creates a temp folder, outputs the file, and then copies that file to
I have the following code, which was written in Visual Studio Code. Now I want to run it in Azure Databricks. I have uploaded the entire folder to my Azure Blob
I receive binary files (~3 MB each) in batches of ~20,000 files at a time. These files are used downstream for further processing, but I wa