Category Archives: Snowflake

Secure data sharing in Snowflake

Snowflake has an interesting feature – Secure data sharing, which share the data(tables) with other members(clients). Secure data sharing enables users to share the data between account to account. The provider of data shares the data with consumer using shares.

Example: Snowflake by default share the account usage information and sample databases with all snowflake accounts.

NOTE: Database objects shared to other accounts are read only (they cannot modify or delete the objects). Also sharing is possible only if the consumer is in same region.

How secure data sharing works

With secure data sharing, no actual data is copied or transferred between accounts. All sharing is done by the services layer of the snowflake. The data sharing does not consume any storage at consumer side so consumer do not need to pay for storage he need pau only for the virtual warehouses.

What is share ?

Share is a snowflake database object which has information about the objects shared and with which accounts the objects are shared.

What is Reader Account ?

Sometimes the clients with whom we want to share the data might not have a snowflake account. To share the data with such customers we will use the reader account. The provider creates the reader account, share the database objects to reader account same as full snowflake account using shares.

But the reader account can consume objects only from the provider who created it. To create a reader account, user should have granted the “CREATE ACCOUNT” permissions.

use role accountadmin;

create managed account viswamitra_reader admin_name = viswamitra, admin_password = 'Sdfed43da!44', type = reader;

status
{"accountName":"DCA80445","loginUrl":"https://dca80445.us-east-1.snowflakecomputing.com"}

NOTES:

  • By default, the total number of reader accounts a provider can create is 20. If you reach the limit and require creating additional accounts with snowflake support.
  • The reader account utilizes the same Snowflake Edition as the provider account and is created in the same region.
How to create share ?

To share the database objects with other accounts you need accountadmin privileges, else the user should have “CREATE SHARE” global permissions which makes managing shares more flexible.

NOTE:

Granting CREATE SHARE to other roles makes managing shares more flexible, but also allows users with these roles to expose any objects they own (or on which they have the necessary privileges) to other accounts. This is particularly important to note if you are sharing data from an account that contains sensitive or proprietary data.

grant create share on account to employee_owner;
create share client_share;
How to grant access on tables to share ?

To grant access of objects to a share user should have

  • OWNERSHIP of the share, and
  • OWNERSHIP or USAGE/SELECT WITH GRANT OPTION on each of the objects to be granted/revoked
Grant usage on database employee to share client_share;
Grant usage on schema emp_schema to share client_share;
Grant select on table emp_schema.employee to share client_share;
---- to share all objects in a schema
Grant select on all tables in schema emp_schema to share client_share;
Grant access on share to client account
alter share client_share add accounts=era83997;
How to share tables of multiple databases ?

We can share one database using a share if we want to share objects from multiple databases, we need to use the secure views. The secure view can refer the objects from different databases.

To grant the permission on the secure view to the share grant the “reference_usage” permission on the databases from which the view referring the objects.

grant reference_usage on database employee to share client_share;
grant select on view secure_data to share client_share;

Before we actually share the data with other accounts we can do verification the data shared with them using “simulated_data_sharing_consumer” parameter.

alter session set simulated_data_sharing_consumer = xy12345;
How to access shared data ?

To access the shares shared within an account the user should have “import share” privileges. With this privilege he can see both inbound and outbound shares and create database from inbound share.

Create database from the share (inbound) given by provider.

CREATE DATABASE <name> FROM SHARE <provider_account>.<share_name>;
CREATE DATABASE client_data FROM SHARE xy23456.client_share;

user who created the data from share will be the owner of the database and he can grant the permissions to other users.

Snowflake Architecture

Snowflake is an analytic data warehouse provided as Software-as-a-Service (SaaS). Snowflake provides a data warehouse that is faster, easier to use, and far more flexible than traditional data warehouse offerings. Snowflake is a true SaaS offering which is available in all major cloud environments like AWS, AZURE and GCP. It completely runs on cloud.

Snowflake do not require

  1. Hardware procurement, installation and configuration
  2. Ongoing maintenance or tuning it is handled by Snowflake in background

In another way it is like create account in Snowflake, load data and use it, it support different kinds of data formats like CSV, JSON , AVRO etc.

Shared disk architecture vs Shared nothing architecture

In distributed computing there are two kinds of architectures one is shared disk architecture where the storage is common for all compute nodes or servers like Oracle RAC, shared nothing architecture is each node or server has its own storage and the data is distributed across all the nodes.

Snowflake combines both in its architecture. Storage is common for all the compute nodes which looks like shared storage architecture and each virtual warehouse has multiple nodes which process the query in parallel(MPP) which is equal to shared nothing architecture.

Below are the key components in Snowflake architecture

Database Storage (Storage resources)

When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage. users can access this data using SQL queries like other RDBMS databases. Database storage includes storage for user data, Time-travel and fail-safe.

Based on the cloud provider Snowflake save data in respective storage for example in AWS it use the S3 to store the data and the storage is distributed across three AZ’s(data centers). It charge $40/TB for on demand storage and for upfront storage it charges $23/TB.

Query processing (compute)

Query processing is done using virtual warehouses in Snowflake. Virtual warehouses are collection of machines or cluster of machines which process the queries. Warehouses are independent of each other, they don’t share any memory or compute.

What is Virtual warehouse ?

Virtual warehouses are collection of machines which are used to execute the query (DDL command don’t need warehouse to execute).

Virtual warehouses usage cost is based on the size of warehouse and usage time, warehouse size is defined like size of t-shirt , it is defined as Small, medium, large and etc. User need to pay for use (in terms of seconds, pay/second except for first one minute). Snowflake use the concept of credit/hour for warehouse usage, based on Snowflake version credits will be converted to $. For business critical edition 1 credit = $4. Below are the different warehouses with credit usage information.

Cloud services

These are collection of services to coordinate the access to Snowflake which includes different components of Snowflake from users authentication to query result dispatch to user. Below are some of major services in the list.

  • Authentication – user authentication
  • Metadata management – metadata to plan and optimize the query
  • Query parsing and optimization – query parser, plan generator and optimizer
  • Result cache – cache to store the previous query result
  • Access control – user access permissions

For cloud services snowflake use its own internal virtual warehouses and will be charged to client based on usage.