Recently, I was assisting an analytics team with implementing a Snowflake cloud data warehouse. This particular team did not have an existing data warehouse; instead, they were extracting CSV files from their source systems and handling reporting in Excel. Our plan was to load these CSV files into tables within Snowflake and build reporting views that would connect to Tableau.

This team was small and had many responsibilities, so our ultimate goal was to automate each step of this process so that their data source creation and reporting would require minimal maintenance. On previous projects, I had used other cloud ELT tools to handle the loading and transformations. These tools usually have scheduling capabilities that make automating this process very simple. The team I was working with on this project, however, wanted to keep their data processing within the Snowflake instance.

Snowflake created Tasks as a tool to help users execute SQL statements on a consistent schedule. Tasks can be created with dependencies, so a task can be set to run only after a predecessor task has completed successfully. Each task runs using a user-specified warehouse and can be scheduled either at a fixed interval or with CRON notation, which lets you be very specific about the dates and times of execution.
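For example, a minimal sketch of both scheduling styles might look like this (the task and warehouse names here are placeholders):

```sql
-- Interval-based schedule: runs every 60 minutes
CREATE OR REPLACE TASK hourly_task
  WAREHOUSE = reporting_wh
  SCHEDULE = '60 MINUTE'
AS
  SELECT CURRENT_TIMESTAMP;

-- CRON-based schedule: runs at 8:00 AM Central every day
CREATE OR REPLACE TASK daily_task
  WAREHOUSE = reporting_wh
  SCHEDULE = 'USING CRON 0 8 * * * America/Chicago'
AS
  SELECT CURRENT_TIMESTAMP;
```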

Snowflake recommends creating a custom role for task ownership. This role should have the CREATE TASK and EXECUTE TASK privileges.
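A minimal sketch of those grants, assuming a role named taskadmin and placeholder database, schema, and warehouse names:

```sql
CREATE ROLE IF NOT EXISTS taskadmin;

-- EXECUTE TASK is an account-level privilege; CREATE TASK is granted on the schema that will hold the tasks
GRANT EXECUTE TASK ON ACCOUNT TO ROLE taskadmin;
GRANT CREATE TASK ON SCHEMA reporting_db.raw TO ROLE taskadmin;

-- The task owner also needs usage on the warehouse the tasks will run with
GRANT USAGE ON WAREHOUSE reporting_wh TO ROLE taskadmin;
```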

Here is an example of using Tasks to load data to tables and create a reporting view with some sample data sets.

To Load Raw CSV to Table:

Screenshot of loading CSV to table

In the Task One example, you can see that I used the CREATE TASK command to name the task, specify a warehouse, and schedule an execution time. The CRON notation here means that this task will run at 8:00 AM Central every day. The task will execute the entire SQL statement that appears after the “AS”; that statement copies the contents of the CSV file at the given file path into the customer_table.
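Here is a minimal sketch of Task One, assuming the CSV files sit in a named stage (csv_stage) and the task runs on a warehouse called reporting_wh; only the task and table names come from the example itself:

```sql
CREATE OR REPLACE TASK load_customer_data
  WAREHOUSE = reporting_wh
  SCHEDULE = 'USING CRON 0 8 * * * America/Chicago'   -- 8:00 AM Central, every day
AS
  COPY INTO customer_table
  FROM @csv_stage/customer.csv
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```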

For Task Two, I followed a very similar process. The main difference is that I specified that this task should run AFTER the first task, “load_customer_data”. This establishes the parent and dependent tasks: the second task will begin to run only after the first task completes successfully.
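A sketch of Task Two under the same assumptions; I have called it load_order_data here, since the order data is what gets loaded next:

```sql
CREATE OR REPLACE TASK load_order_data
  WAREHOUSE = reporting_wh
  AFTER load_customer_data          -- dependent task: no SCHEDULE of its own
AS
  COPY INTO order_table
  FROM @csv_stage/orders.csv
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```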

Create a Reporting View:

Screenshot of new view

For Task Three, we changed the AFTER clause so that this task runs after the order data has been loaded. Once all of our data has been loaded into tables, this SQL statement joins them together to create a sales view that can be used for reporting.
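A sketch of Task Three; the view definition and column names are only illustrative, since the real join depends on how the customer and order tables are laid out:

```sql
CREATE OR REPLACE TASK build_sales_view
  WAREHOUSE = reporting_wh
  AFTER load_order_data             -- runs once the order data has been loaded
AS
  CREATE OR REPLACE VIEW sales_reporting AS
  SELECT
      c.customer_id,
      c.customer_name,
      o.order_id,
      o.order_date,
      o.order_amount
  FROM customer_table c
  JOIN order_table o
    ON o.customer_id = c.customer_id;
```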

“Resume” Tasks:

Once these tasks have been created, you can run SHOW TASKS; to view them, and if changes are necessary, you can make them with ALTER TASK. You will notice that the state of each task is listed as “suspended”. Snowflake requires the user to resume a task before its SQL statement will run on the specified schedule.
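For example:

```sql
SHOW TASKS;   -- lists each task with its warehouse, schedule, and state ('suspended' until resumed)

-- Example change made while a task is still suspended (the new schedule is just an illustration)
ALTER TASK load_customer_data
  SET SCHEDULE = 'USING CRON 0 6 * * * America/Chicago';
```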

Resume tasks

Snowflake also provides a system function that resumes all of the dependent tasks for a specified root task, rather than requiring you to execute a statement for each of them.
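That function is SYSTEM$TASK_DEPENDENTS_ENABLE; called with the name of the root task, it enables every dependent task beneath it:

```sql
-- Recursively enables the dependent tasks tied to the specified root task
SELECT SYSTEM$TASK_DEPENDENTS_ENABLE('load_customer_data');
```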


To resume the parent task itself, you will use the ALTER TASK statement below; you can replace RESUME with SUSPEND if you would like the task to stop running.
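```sql
-- Resume (or suspend) the parent task by name
ALTER TASK load_customer_data RESUME;
-- ALTER TASK load_customer_data SUSPEND;
```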

Check out the Task History:

Screenshot of Task History

Unfortunately, at the moment there is no way to be notified when a task completes successfully or unsuccessfully. However, Snowflake stores task history within the Information Schema, so you can query the TASK_HISTORY table function to check whether and when the tasks have completed running.
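A query like this against the TASK_HISTORY table function shows the most recent runs:

```sql
SELECT name, state, scheduled_time, completed_time, error_message
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY())
ORDER BY scheduled_time DESC;
```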

Using Tasks, you can not only accomplish the full ELT process without using other tools, but also automate that process to run on your desired schedule.

 

