'When Migrating from SQL Server to Snowflake how do we migrate stored procedures

I am doing POC on Snowflake and exploring this product.

I have a question: in SQL Server, we have 400-500 stored procedures and these stored procedures are getting called from SSRS reports. These stored procedures are simple in Nature like below one:

CREATE PROCEDURE [dbo].[SQL_Stored_Procedure]
    (@StartDate DATETIME, 
     @EndDate   DATETIME)
AS
BEGIN
    SELECT *
    FROM MYTable
    WHERE Cloumn_Date BETWEEN @StartDate AND @EndDate;
END

EXEC [dbo].[SQL_Stored_Procedure] @StartDate = Getdate()-1, @EndDate=Getdate()

How can I achieve same in Snowflake? Seems Snowflake procedure is like a SQL Server function: https://docs.snowflake.net/manuals/sql-reference/stored-procedures-usage.html

Curious to know how other people are doing in above scenario..?



Solution 1:[1]

For some use cases UDTFs might be a better option, but you can use the javascript stored procedures to run SQL statements. It's not super pretty, and I'm hoping that They add SQL as a language option in stored procedures soon. Here's an example with your generic query above.

CREATE OR REPLACE PROCEDURE STORED_PROCEDURE_NAME(STARTDATE VARCHAR, ENDDATE VARCHAR)
RETURNS VARIANT
LANGUAGE JAVASCRIPT
EXECUTE AS OWNER
AS 
$$
    query = `SELECT column1, column2
            FROM MYTable
            WHERE Cloumn_Date BETWEEN '` + STARTDATE + `' AND '` + ENDDATE + `';`

    resultset = snowflake.createStatement({sqlText: query}).execute() 

    //extract the results of the query into a javascript object
    output = []
    while (resultset .next())  {
        output.push(output.getColumnValue(1), output.getColumnValue(2));
    }

    //put the output of the query into a JSON object which Snowflake can translate
    // into a "variant" and can be used downstream
    json = { "result" : output };

    return json;
$$;

//I've found it easiest to work with strings because the data type conversions
//between snowflake and javascript and back again can be tricky
CALL STORED_PROCEDURE_NAME(CURRENT_DATE::STRING, DATEADD( DAY, 1, CURRENT_DATE)::STRING);
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));

I've found this to be mostly useful for procedures that execute transformations and loads, but so far I've only manually migrated a smallish set of queries and has required some javascript fiddling and customization for some of them.

This will be a little tricky to set up to automatically migrate hundreds of queries, but the javascript above can be improved and generalized even more (I'm not a javascript expert). For example, to dynamically handle any column list without manual editing: https://docs.snowflake.net/manuals/sql-reference/stored-procedures-usage.html#returning-a-result-set

Documentation for the extra RESULT_SCAN step for using the result downstream is here: https://docs.snowflake.net/manuals/sql-reference/stored-procedures-usage.html#using-result-scan-to-retrieve-the-result-from-a-stored-procedure

Solution 2:[2]

We're currently moving our data warehouse from MS SQL to Snowflake. Snowflake stored procedures can be written in SQL or JavaScript. I wasn't able to find a tool that migrates them for you, so we had to refactor ours. We didn't have as many as you do.

The stored procedures I created were relatively simple, so I created a string and executed them:

create or replace procedure STAGE_ABC.SP_LOAD_HISTORY(src varchar, target varchar, run_for_DT varchar, key_cols varchar)
  returns varchar not null
  language javascript
  as 
  $$
//get columns list
var col_sql = "select COLUMN_NAME from information_schema.columns where TABLE_SCHEMA = '" + src_schema + "' and TABLE_NAME = '" + src_table + "' order by ORDINAL_POSITION;"

try {
    var result_set = snowflake.execute( {sqlText: col_sql});    

    while (result_set.next())  {
       cols += result_set.getColumnValue(1) + ", ";
    }}
catch (err)  {
    result =  "Failed to get columns: Code: " + err.code + "\n  State: " + err.state;
    result += "\n  Message: " + err.message;
    result += "\nStack Trace:\n" + err.stackTraceTxt; 
    }

cols = cols.substring(0, cols.length - 2);

var result = "";
$$;

Solution 3:[3]

We are doing a similar migration. Snowflake procs are not easy to work with because of the following:

  • Building the SQL is error prone like execute immediate
  • Converting the results to JSON is ugly
  • Procs do not return result sets like other DB's and require a second call (maybe you can process the JSON on the application side, but I haven't tried this)
  • A lot of the processing in the proc looks like the same processing you'd do in your app if you are using straight SQL. Leaving this logic in the application makes the application more portable. If you ever need to migrate from Snowflake procs to another DB, you will have a lot of work ahead of you since little or no other DB's use the JavaScript syntax like snowflake.

We are looking at foregoing stored procs and storing the SQL in text files in the Java application. I'm going to try using YAML to separate the SQL statements and rollback the transaction to drop temp tables and variables.

Solution 4:[4]

Normally such procedure could be translated into Tabular SQL UDF:

CREATE OR REPLACE FUNCTION SQL_UDF(STARTDATE DATETIME, ENDDATE DATETIME)
RETURNS TABLE(id INT, Column_Date DATETIME)
AS
'SELECT * FROM MyTable WHERE Column_Date BETWEEN STARTDATE AND ENDDATE';
   
SELECT *
FROM TABLE(SQL_UDF((CURRENT_DATE()-1)::DATETIME,
                    CURRENT_DATE()::DATETIME));

For:

CREATE OR REPLACE TABLE MyTable(id INT, Column_Date DATETIME);

INSERT INTO MyTable SELECT 1, CURRENT_DATE();
INSERT INTO MyTable SELECT 2, CURRENT_DATE()-10;

Output:

enter image description here


If stored procedure is a requirement, it is also possible:

CREATE OR REPLACE PROCEDURE SQL_Stored_Procedure(STARTDATE DATETIME,ENDDATE DATETIME)
RETURNS TABLE(id INT, Column_Date DATETIME)
LANGUAGE SQL
AS
DECLARE
  res RESULTSET;
  query VARCHAR DEFAULT 'SELECT * FROM MyTable WHERE Column_Date BETWEEN ? AND ?';
BEGIN
  res := (EXECUTE IMMEDIATE :query USING (STARTDATE, ENDDATE));
  RETURN TABLE (res);
END;

Call:

CALL SQL_Stored_Procedure(CURRENT_DATE()-1, CURRENT_DATE());

Output:

enter image description here

More at: Working with RESULTSETs

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Scott Fair
Solution 3 splashout
Solution 4 Lukasz Szozda