Category "azure-synapse"

Keep Sink Columns in Copy Activity when Source Has Fewer Columns than Sink

I have a copy activity in Data Factory that dynamically maps the columns between files in tables A and B. Both tables, A and B, are .parquet. Table A has 8 column

A simple left join query taking a long time to return output

In Azure Synapse I have two tables, Table A with 6 million records and Table B with 2 million; when I run a simple left join query it takes around 20 minutes

Spatial with SparkSQL/Python in Synapse Spark Pool using apache-sedona?

I would like to run spatial queries on large data sets where e.g. geopandas would be too slow. I found inspiration here: https://anant-sharma.medium.com/apache-sedon
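
A minimal PySpark sketch of that Sedona approach, assuming the Synapse Spark pool already has the apache-sedona (and geotools-wrapper) packages installed; the storage path, view name, and lon/lat columns are placeholders.

```python
from sedona.register import SedonaRegistrator

# Register Sedona's spatial SQL functions (ST_*) on the existing Spark session.
SedonaRegistrator.registerAll(spark)

# Placeholder ADLS Gen2 path; replace with the real dataset location.
points = spark.read.parquet("abfss://data@mylake.dfs.core.windows.net/points/")
points.createOrReplaceTempView("points")

# Spatial filter in SparkSQL: keep only points inside a bounding polygon.
result = spark.sql("""
    SELECT *
    FROM points
    WHERE ST_Contains(
        ST_GeomFromWKT('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))'),
        ST_Point(CAST(lon AS DOUBLE), CAST(lat AS DOUBLE))
    )
""")
result.show()
```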

How to set the Synapse pipeline parameter during deployment?

How to set the Synapse Integrate pipeline parameter during deployment? I am using the Synapse deployment task with Git to deploy the workspace to multiple envir

Running nltk.download in Azure Synapse notebook ValueError: I/O operation on closed file

I'm experimenting with NLTK in an Azure Synapse notebook. When I try to run nltk.download('stopwords') I get the following error: ValueError: I/O operation on
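
A common workaround (a sketch, not a confirmed fix) is to suppress the downloader's console output, which is what writes to the closed stream, and point it at an explicit directory; the /tmp/nltk_data path is just an example.

```python
import nltk

# Download quietly so the downloader does not write progress to the closed stream,
# and store the corpus in an explicit, writable directory.
nltk.download("stopwords", download_dir="/tmp/nltk_data", quiet=True)
nltk.data.path.append("/tmp/nltk_data")

from nltk.corpus import stopwords
print(stopwords.words("english")[:10])
```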

PySpark: read data into a DataFrame, transform in SQL, then save to a DataFrame

New to Spark and Synapse... I need to do some transformations, including adding columns, changing datatypes, etc. I am reading a CSV into a DataFrame. I'd like t
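
A minimal sketch of that read, SQL-transform, DataFrame round trip; the path, column names, and casts are placeholders.

```python
# Read the CSV into a DataFrame (placeholder path).
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("abfss://data@mylake.dfs.core.windows.net/input/sales.csv"))

# Expose the DataFrame to Spark SQL as a temporary view.
df.createOrReplaceTempView("sales_raw")

# Do the transformations in SQL: cast types, add a derived column, etc.
sales = spark.sql("""
    SELECT
        CAST(order_id AS INT)             AS order_id,
        CAST(amount AS DECIMAL(18, 2))    AS amount,
        to_date(order_date, 'yyyy-MM-dd') AS order_date,
        'csv-load'                        AS source_system
    FROM sales_raw
""")

# The result is an ordinary DataFrame again and can be written out.
sales.write.mode("overwrite").parquet("abfss://data@mylake.dfs.core.windows.net/curated/sales/")
```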

Could not create lake database from Synapse notebooks

New to Azure Synapse, trying to create a database (managed table) from a Synapse notebook. I also added Storage Blob Data Contributor for the Synapse workspace and spec
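
A minimal sketch of creating a lake database and a managed table from a notebook, assuming the workspace managed identity (and the user running the notebook) has Storage Blob Data Contributor on the default ADLS Gen2 account; the database and table names are placeholders.

```python
# Create the lake database if it does not exist yet.
spark.sql("CREATE DATABASE IF NOT EXISTS demo_lake_db")

df = spark.createDataFrame(
    [(1, "alpha"), (2, "beta")],
    ["id", "name"],
)

# Saving without an explicit path creates a managed table in the lake database.
df.write.mode("overwrite").saveAsTable("demo_lake_db.sample_table")

spark.sql("SELECT * FROM demo_lake_db.sample_table").show()
```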

Where to check who ran a pipeline in Azure Synapse

I need to find info on who started the pipeline (triggered manually). In the pipeline runs section there is no info about the user, only about the parent pipeline if applicabl

What do these different properties in SQL Server's (Azure Synapse's) Estimated Execution Plan mean?

I'm trying to work on the statistics, and as part of that, I'm trying to look at the execution plan of certain SELECT * commands with a WHERE condition on a par

Can SSMS show the actual execution plan in Azure Synapse?

I'm studying Azure Synapse. In a dedicated SQL pool database, SSMS's 'actual execution plan' option was disabled. In a serverless SQL pool database, SSMS says 'set statist

Azure Synapse Analytics

I need to connect to a Synapse Analytics serverless SQL pool database using SQL authentication. I created a serverless SQL pool database and created a SQL

Azure Synapse IDENTITY column: wrong Id values based on seed

The goal is to create a table, insert some (3) dummy rows for technical reasons, then for any valid data, start using Ids above 100. Script for creating (drop-c

M2M Client Credential Flow between NetSuite and Synapse

I am looking to create a flow somewhere in the Azure stack to allow me to get M2M authentication between Azure Synapse and NetSuite. The goal is to be able to d

Azure Synapse: insert result of EXEC into a table

Is there a way to insert the result set of an EXEC (whether a procedure call or dynamic SQL) into a temp table in Azure Synapse Analytics? I didn't

TypeError: AutoMLConfig() takes no arguments in Azure Synapse

I am receiving the below error in an Azure Synapse PySpark notebook, TypeError: AutoMLConfig() takes no arguments, while running the below code: automl_settings = { "primary

Error assigning Synapse role using Terraform

I am trying to assign a built-in role in Synapse through Terraform but I get an error. This is what I'm trying to do: resource "azurerm_synapse_role_assignment

Error on source dataset with REST Connector in Azure Synapse pipeline

I am using 'Copy and transform data from and to a REST endpoint by using Azure Data Factory' to load a file from my Box.com account to an Azure Data Lake Gen2 (AD

Synapse external table "Unauthorized"

I have created an external table in Azure Synapse from a parquet file stored in an ADLS Gen2 container. I have used the following three queries to create the da

Synapse Spark job fails as input folder does not exist

How do I do exception handling for file reading? For example, I have a daily job that runs at 8:00 am. It reads files from Azure Data Lake Storage (Gen2). The
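
A sketch of one way to guard the daily read against a missing input folder, using a plain try/except around the read; the paths are placeholders.

```python
from pyspark.sql.utils import AnalysisException

input_path = "abfss://data@mylake.dfs.core.windows.net/daily/2024-01-01/"

try:
    df = spark.read.parquet(input_path)
except AnalysisException as e:
    # Spark raises AnalysisException ("Path does not exist: ...") when the
    # folder is missing; log it and skip the run instead of failing the job.
    print(f"Input folder not found, skipping this run: {e}")
    df = None

if df is not None:
    df.write.mode("append").parquet("abfss://data@mylake.dfs.core.windows.net/curated/daily/")
```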

Data Factory/Synapse: How to merge many files?

After generating ~90 different 100 MB gzipped CSV files, I want to merge them all into a single file. Using the built-in merge option for a data copy process, it
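
If the copy activity's merge option stays too slow, a Spark notebook is one alternative; this sketch (with placeholder paths) reads all the gzipped parts and forces a single output file.

```python
# Read every gzipped CSV part with a single schema (placeholder path/pattern).
df = (spark.read
      .option("header", "true")
      .csv("abfss://data@mylake.dfs.core.windows.net/staging/*.csv.gz"))

# coalesce(1) forces one output partition, i.e. one merged file; fine for
# this volume, but it serializes the final write onto a single task.
(df.coalesce(1)
   .write
   .mode("overwrite")
   .option("header", "true")
   .option("compression", "gzip")
   .csv("abfss://data@mylake.dfs.core.windows.net/merged/"))
```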