The following video is an excellent tutorial on how Snowflake can serve as both a data lake and a data warehouse.
https://www.youtube.com/watch?v=jmVnZPeClag
The following articles on Snowflake are also worth reading:
https://www.snowflake.com/workloads/data-warehouse-modernization/
https://www.snowflake.com/guides/data-lake
The following key concepts help explain how Snowflake works:
- Snowflake separates compute from storage, and each can be scaled independently.
- For storage, Snowflake leverages distributed cloud storage services (Amazon S3, Azure Blob Storage, Google Cloud Storage). This is cool since these services are already battle-tested for reliability, scalability and redundancy. Snowflake compresses the data it stores in these cloud storage buckets.
- For compute, Snowflake has a concept called a "virtual warehouse". A virtual warehouse is simply a bundle of compute (CPU) and memory (RAM) with some temporary storage. All SQL queries are executed in a virtual warehouse.
- Snowflake can be queried using plain SQL - so no specialized query language has to be learned.
- Snowflake caches query results and recently read table data, so a query that is run repeatedly can often be answered without rescanning cloud storage. This cache is the magic that enables fast ad-hoc queries to be run against the data.
- Snowflake enables a unified data architecture for the enterprise since it can be used as a data lake as well as a data warehouse. The VARIANT data type can store semi-structured data such as JSON, and that JSON can be queried directly.
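
To make the last point more concrete, here is a minimal Python sketch of the idea behind querying JSON stored in a single column. It is only an illustration of path-based access to semi-structured data, not Snowflake code: the `rows` sample data and the `get_path` and `select` helpers are hypothetical names invented for this example. In Snowflake itself, the same kind of traversal is written in SQL against a VARIANT column.

```python
import json

# Hypothetical sample data: each row keeps raw JSON text in one "payload" column,
# similar in spirit to a table with a single semi-structured column.
rows = [
    {"id": 1, "payload": '{"customer": {"name": "Ada", "city": "London"}, "items": [{"sku": "A1", "qty": 2}]}'},
    {"id": 2, "payload": '{"customer": {"name": "Linus"}, "items": []}'},
]


def get_path(document, path):
    """Walk a dotted path through nested dicts and lists.

    Returns None when any step is missing, so rows with a different
    shape do not raise an error.
    """
    current = document
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdigit() and int(key) < len(current):
            current = current[int(key)]
        else:
            return None
    return current


def select(rows, *paths):
    """Return one tuple per row with the value found at each requested path."""
    result = []
    for row in rows:
        doc = json.loads(row["payload"])
        result.append(tuple(get_path(doc, p) for p in paths))
    return result


print(select(rows, "customer.name", "customer.city", "items.0.sku"))
# → [('Ada', 'London', 'A1'), ('Linus', None, None)]
```

Note that the second row has no city and no items, yet the query still succeeds and simply returns `None` for the missing fields. That tolerance for rows of differing shape is what makes a semi-structured column useful for data-lake style workloads.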