Datascape Podcast Episode 44: Snowflake with Kent Graziano
Episode 44: Show Notes
Snowflake is the only data platform built for the cloud for all your data and all your users. Today we have Kent Graziano, Chief Technology Evangelist at Snowflake, here on the show to discuss how this game-changing platform works and what it means for IT professionals and their clients! Kent starts out by giving us a window into his impressive career as a data architect before weighing in on what blew his mind about Snowflake and led him to move from the on-prem world to cloud databases.
Snowflake’s founders wrote a completely new multi-cluster, shared data architecture database from scratch for their product. Just a few of the revolutionary things about this architecture are that it allows for independent compute nodes and that node clusters can scale up and down automatically, switching on and off as clients need them. Pricing is handled on a per-second, per-node basis too. We hear about a bunch of the old onboarding and maintenance stresses Kent had to deal with in the on-prem world, which features like these have completely erased. Snowflake also takes away many of the nuisances DBAs used to deal with. That may seem like a double-edged sword, but Kent explains how it opens up a new realm of more interesting tasks, making the new DBA more of a database architect.
In today’s show, we also hear Kent’s answers to questions about how Snowflake handles compression, data ingestion, data batching and streaming, job monitoring, and cataloging data. Wrapping up, Kent gives us an idea of the staggering size of some of Snowflake’s clients’ databases, as well as a few great resources for anybody looking to learn more about adopting or using this software. Tune in to get it all!
Key Points from this Episode:
• An overview of Kent’s impressive career as a data architect in many different verticals.
• Kent’s role as Chief Technology Evangelist at Snowflake, advocating for and consulting on the platform.
• What completely blew Kent’s mind about Snowflake’s abilities, winning him over with its tech.
• A deep dive into Snowflake’s architecture detailing the many on-prem problems it solves.
• Snowflake’s multi-cluster shared data architecture allowing for independent compute nodes.
• The time- and money-saving ‘price per second per node’ model Snowflake uses.
• Automatic capabilities for scaling the provisioning of node clusters up and down.
• What tasks are left for DBAs, seeing as Snowflake does so much of what they used to do.
• New DBA tasks that remain after Snowflake: Security hierarchies and data architect work.
• How Snowflake automatically compresses data based on rules it gathers from metadata.
• More tasks for the new DBA/data architect: Resizing, declaring new cluster keys, etc.
• How Snowflake ingests data by dropping files into blob storage and loading them through a pipeline.
• Data batching and streaming using a REST API for Snowpipe or Python routines.
• How Snowflake’s UI and database assist with job monitoring and the role of DBAs in this.
• Cataloging data using Alation, Collibra, WhereScape, or building an information schema.
• Pool of compute, customer demand, and why the bulk of Snowflake’s regions are on AWS.
• The size of the databases of Snowflake’s biggest clients: Tables with 40 trillion rows!
• How the IT professional or DBA can get started learning about Snowflake.
• Our lightning round with Kent: His favorite project, book, tool, and more!
Links Mentioned in Today’s Episode:
Kent Graziano on Twitter
Kent Graziano on SlideShare
The Data Warrior
Snowflake’s Online Community
Oracle Autonomous DB
Rocky Mountain Oracle User Group
Oracle Development Tools User Group
Oracle Data Integrator