Life gets easier for developers when they are not distracted and can really focus on building cool apps!
But frictionless access to machine identities within developer tools and processes does not always exist. That is gradually changing as Venafi joins forces with top developers from around the world to enable faster, more secure Cloud and DevOps projects. To that end, Snowflake, the popular data warehouse, is set to join the ranks of machine identity-enabled developer applications.
How is this happening? We have Starschema, a Hungary-based team of Snowflake experts, to thank for it. I spoke to Soma Osvay, lead software architect at Starschema, to learn about Snowflake’s popularity and about the Venafi-Snowflake integration the company is building as part of the Machine Identity Management Development Fund. The integration is a great step forward for organizations seeking to modernize for developer speed, agility, and efficiency.
Developers typically need four things to deploy an application: the app itself, config data, content data, and machine identities. Why should they have to use a different process to get machine identities? Now they don’t have to. With the Snowflake integration, machine identities are available to developers from the same data layers as their config and content data.
What is Snowflake and why is it so hot with developers right now?
Soma: Snowflake is essentially a cloud-based, subscription-based data warehouse. The reason it's loved by developers around the world is that it provides a highly scalable way of running your queries. What it actually does, and what no major database player has done before, is separate compute and storage completely. So, when you're storing data, for example, you're storing it in your own cloud, in your own data storage. Snowflake then gives you compute power on top of that storage and runs your queries for you very quickly. To translate that from a developer perspective: it provides a highly scalable way of accessing your data warehouse.
To dig deeper, if you have jobs that run every day for three hours and perform really heavy, intensive data ingestion or something like that, then Snowflake can help you eliminate extra congestion by only providing you with compute power for the times when you're using it. That means that if you have three-hour data loads, you can set it up so that for the remaining 21 hours of the day Snowflake is not running any compute, any virtual machines, or any resources to maintain your database. It's just flat storage.
Snowflake is highly scalable, so it supports a very heavy workload. With a traditional database, if you want to move or aggregate millions of records, or gigabytes and terabytes of data, that takes forever. Instead of a single server, Snowflake gives you a cluster, and a cluster can scale up automatically. Say you're running your daily load, with lots of queries and lots of data being moved around once a day, and you want it done in one hour. With Snowflake you can spin up lots of machines to take care of these queries, and at the end the extra machines are all shut down. The benefit is that you're only paying for the time you use. Snowflake can scale horizontally too, so you can always add more nodes to increase your compute power.
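To make the pay-for-what-you-use arithmetic concrete, here is a minimal sketch in Python. It is a simplified model, not Snowflake's actual billing formula: the function name `node_hours`, the node counts, and the assumption that three nodes finish a three-hour load in one hour (perfectly linear speed-up) are all illustrative.

```python
def node_hours(nodes: int, hours_per_day: float) -> float:
    """Compute capacity consumed per day in a simple pay-per-use model."""
    return nodes * hours_per_day


# A traditional server that is never shut down.
always_on = node_hours(nodes=1, hours_per_day=24)

# Compute that only exists during the three-hour daily load.
on_demand = node_hours(nodes=1, hours_per_day=3)

# The same load spread over three nodes, assuming linear speed-up (hypothetical).
scaled_out = node_hours(nodes=3, hours_per_day=1)

print(always_on, on_demand, scaled_out)  # prints: 24 3 3
```

Under these assumptions, finishing the load in one hour on three machines consumes the same compute as taking three hours on one machine, and both consume far less than a server that runs all day.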
How are machine identities used within Snowflake?
Soma: From a Snowflake perspective, architecturally, machine identities are built in. But here’s a good example of how developers are using machine identities in Snowflake: when they are starting an application, they need to generate the machine identity to be used to connect and communicate with other systems. So, when they boot up the application, they need an ID for that application. That machine identity comes in the form of a certificate, which can prove who that software is. And when Snowflake is communicating with other services, it will use that machine identity that it got when it was booted up. These machine identities can be stored in Snowflake.
How important do you think that having the Snowflake integration is for a typical developer?
Soma: It's very important. Developers are probably tired of saying, “Ok, if I want my config data or something, I can go here, but if I want a machine identity, I have to go over here and learn a new API, a new call, I need more libraries, more SDKs. I've got to think about something different.” A Snowflake developer or a PostgreSQL developer is, of course, most comfortable accessing certificates from the technology they are expert in, and that’s basically what this integration allows them to do. They’re now able to cut down on development and management time. They can generate certificates without incorporating any new technology into their software, and they can use all their existing infrastructure to do that.
All right. So how exactly does the integration with Venafi work?
Soma: Starschema provides another interface for users to interact with the Venafi Trust Protection Platform. Right now, of course, you can access the Venafi platform using the web interface, and there is also a CLI (a command-line interface) that's being implemented. Before our integration work, these were the technologies developers could use to interact with Venafi itself. This development effort expands the list of interfaces available to users. With the integration, a PostgreSQL developer does not have to use the CLI, Go, Python, or any other language to get these certificates. They can do it themselves directly in Snowflake.
Snowflake, like many other database systems, supports external functions that can be called from a query. Take, for example, the query you write to boot up the application, which joins a bunch of tables together. Say that for this application I need to know what user I'm impersonating, what database I'm using, and so on. Now, within Snowflake, I can also say that I want to generate a certificate, and it will be returned as part of the query itself. The way that's done is that Snowflake has something called external functions, which are functions defined as REST API calls.
The integration allows you to call this function, supplying the parameters you need the certificates to be generated with. Specifically, the function is set up to call an AWS Lambda, which in turn calls Venafi, verifies and performs the operation. In other words, we added the middleman. So, we have Snowflake, the database system. We have Venafi on the other side, generating certificates. Snowflake cannot directly interact with the Venafi system at all. The new software we added in the middle is like a set of tiny applications running in AWS. Snowflake can call these applications, which contact Venafi and do the PKI operations that would normally involve interacting with InfoSec teams. The results then go back to Snowflake. Now developers can get everything in one place!
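The middleman described above can be sketched as a small Lambda-style handler in Python. This is not Starschema's implementation, which the interview does not show. It assumes the function sits behind an API Gateway proxy integration (so the request arrives as a JSON string in `event["body"]`) and follows Snowflake's external function batch format, where each input row is `[row_number, arg1, ...]` and each output row is `[row_number, result]`. The names `make_handler`, `issue`, and `fake_issue` are hypothetical; `issue` stands in for whatever call actually requests a certificate from Venafi.

```python
import json
from typing import Callable


def make_handler(issue: Callable[[str], str]):
    """Build a Lambda-style handler that issues one certificate per input row."""

    def handler(event, context=None):
        # Snowflake sends a batch of rows: {"data": [[row_number, arg1], ...]}
        rows = json.loads(event["body"])["data"]
        results = []
        for row_number, common_name in rows:
            # Each output row must carry the same row number as its input row.
            results.append([row_number, issue(common_name)])
        return {"statusCode": 200, "body": json.dumps({"data": results})}

    return handler


def fake_issue(common_name: str) -> str:
    """Stand-in issuer for local testing; a real one would call Venafi."""
    return f"CERT-FOR-{common_name}"


handler = make_handler(fake_issue)
event = {"body": json.dumps({"data": [[0, "app1.example.com"], [1, "app2.example.com"]]})}
response = handler(event)
print(response["body"])
```

Passing the issuer in as a parameter keeps the batch handling testable without network access, and it leaves the Venafi-specific call, which is not specified here, in one replaceable place.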
One more reason developers love Snowflake so much is precisely functions like these. The value of this integration is that it bridges the machine identity gap.
Starschema’s Venafi integration for Snowflake is targeted to be complete in Q4 2021. Visit Starschema on the Venafi Marketplace for more information. And stay tuned for future interviews with Machine Identity Management Development Fund recipients.
This blog features solutions from the ever-growing Venafi Ecosystem, where industry leaders are building and collaborating to protect more machine identities across organizations like yours. Learn more about how the Venafi Technology Network is evolving above and beyond just technical integrations.