How do you turn AI coding chaos into a repeatable playbook?
The Five-Stage Framework
Vivek Raghunathan, SVP of engineering at Snowflake, joins Leaders of Code at Snowflake Summit to break down the five-stage framework his org used to go from "let chaos reign" to a repeatable, org-wide system for AI-assisted engineering.
Vivek explains how Snowflake systematically rolled out coding agents across its engineering org - starting with unrestricted experimentation, then codifying what worked into a shared vocabulary of 14 "AI design patterns," from plan-in-English to fencing off parallel agents to reducing on-call toil through continuously updated skills.
Vivek walks through the "inner loop" and "outer loop" of software development, explains Snowflake's internal Yegge scale for measuring how far engineers have progressed along that continuum, and shares how a three-person team used coding agents to deliver a 40x improvement on Snowflake's query compiler.
The discussion also:
- Breaks down Snowflake's "focus weeks," where engineers get dedicated time to either catch up on best practices or push the frontier further.
- Explores the pioneers/settlers/skeptics framework for meeting engineers where they are in adopting AI tools, and why the shift can trigger something like the stages of grief.
- Covers how Snowflake cut release validation time from 15 days to a single day, and why more automated testing hasn't come at the cost of production stability.
- Looks ahead to a four-step maturity model for on-call and incident response, where agents may eventually take primary on-call duty.
Connect with Vivek Raghunathan on LinkedIn.
Transcript
Eira May: Hi, and welcome to Leaders of Code. We are recording this episode at Snowflake Summit here in San Francisco. If this is your first time tuning in, Leaders of Code is a segment on the Stack Overflow podcast where we get senior engineering leaders in the room and ask them about the work they're doing, how they go about building great teams and the biggest challenges that they find themselves staring down right now. My name is Eira May. I'm the B2B editor at Stack Overflow, and I'm here with Vivek Raghunathan, who is the SVP of engineering at Snowflake. Welcome to the show.
Vivek Raghunathan: Thank you for having me.
Eira May: Yeah. How's the week been going for you so far?
Vivek Raghunathan: It's been incredibly exciting. This is the time of the year where we get to show everyone what we've been up to over the last year. It's just an incredible experience seeing customers interact with and experience the things we've been building. This week gives me great energy going into the rest of the year, knowing that we are on the right track. We are building things that will add incredible value to our customers and they share our vision of the future.
Eira May: Awesome. Speaking about vision of the future, one thing that we've been hearing a ton about this week is about how engineering leadership is sort of shifting. So with the cost of code generation, the cost of writing a line of code just sort of approaching zero, the bottleneck is less about writing code, right, and it starts to shift to more kinds of work around managing and orchestrating teams, defining intent, thinking about what you can do strategically to deliver that real business impact. So I wonder for you in your role at Snowflake, have you really thought differently about what engineering leadership looks like in the light of all these changes?
Vivek Raghunathan: I think the more interesting question in my mind is also how is the business of producing software changing by itself? Because at some level, roles and what people do is an outward metric to how is the business of software resulting in software being produced. Right? We have taken a fairly methodical approach to doing this over the last 12, 18, 24 months. And I'm happy to go through it in great detail.
The way I think of this is code gets worked in an engineer's head, then on their workstation or in a cloud workspace of some form, and it goes through a bunch of stuff before it makes its way into main or master, right?
Eira May: Sure.
Vivek Raghunathan: I'll call that the inner loop of software. There's the outer loop of software, which is now that code in main has to get released into production. There's what I'd call the second outer loop of software, which is bugs are found in production and they make their way back into the systems as support tickets or incidents or some kind of like anomaly detection in our systems, and how do fixes in response to those bugs make their way back into main. So I'd say that is like outer loop two, if you will. And then there's the question you asked, which is, what do the roles and responsibilities look like in this new world? How is the act of building software itself changing? And then how do you structure your organizations in response to the fact that the act of building software is changing? So I break it up into those five tasks. Happy to dive into any one of them, but we're making like very methodical progress on each of those five vectors, almost building up to the last vector, which was the question.
Eira May: Yeah. I'd love to just start at the beginning if you could unpack it a little bit for us. That would be great.
Vivek Raghunathan: Happy to do it. I think the inner loop is at some level the easiest because at its core coding agents make it easy for you to write more code and better code faster. Right?
Eira May: Sure.
Vivek Raghunathan: We started very much with an approach to... Andy Grove once said this, and it's a very popular phrase, I guess, reused and thrown around quite a lot now. He said, in any platform shift, you need to let chaos reign before you rein in the chaos. And so the approach we took initially was, we're just going to let chaos reign. Habit formation is hard. Habit breaking is hard. We encourage people to use coding agents. We're just going to measure adoption. We're not going to measure lines of code. We're not going to measure PR certain. We're not going to measure any of these metrics that are easily gameable. We're just going to measure, do you use this twice a day? 95% of you use it weekly.
Eira May: Right.
Vivek Raghunathan: And we're going to encourage you to use it to write code, review code, understand code, write design docs, do everything, right? We're going to give you every possible tool. We're not going to say no to any tool that you want. Right? And clearly that is an age of a little more letting chaos reign than reining in the chaos. So that's step one.
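Editor's note: the adoption-only measurement described above could be computed along the following lines. This is a minimal sketch, not Snowflake's actual tooling; the event format (a list of `(user, day)` pairs) and the function name `weekly_active_share` are assumptions made purely for illustration.

```python
from datetime import date, timedelta


def weekly_active_share(events, engineers, week_end):
    """Fraction of engineers with at least one agent session
    in the 7-day window ending on week_end (inclusive)."""
    week_start = week_end - timedelta(days=6)
    # Collect distinct engineers who show up at least once in the window.
    active = {
        user
        for user, day in events
        if user in engineers and week_start <= day <= week_end
    }
    return len(active) / len(engineers) if engineers else 0.0


# Hypothetical usage log: (engineer, day they used a coding agent).
engineers = {"a", "b", "c", "d"}
events = [
    ("a", date(2025, 6, 2)),
    ("b", date(2025, 6, 4)),
    ("a", date(2025, 6, 5)),   # repeat use by the same engineer counts once
    ("c", date(2025, 5, 20)),  # outside the window, so not counted
]
share = weekly_active_share(events, engineers, date(2025, 6, 6))
```

The metric counts people rather than output, which matches the stated goal of avoiding easily gamed numbers such as lines of code.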
Now you have chaos reigning. A bunch of people are using coding agents. 95% of our engineers use them on a weekly active basis. I think 97. Are all of them equally effective doing it? That's the second question to ask. Right? There's a difference between using it to save yourself 20 minutes a day and using it to do 80% of your job. And I would posit that the difference between folks who are using coding agents and mastering coding agents or using it effectively is 14 AI design patterns. I use that word very deliberately. Design patterns, for those of us who are in the software engineering industry: Gang of Four, the book with a whole bunch of like the factory pattern or the command pattern and so on and so forth, and that book created a language of almost how you think about becoming more effective writing software. Right? I think a similar thing will happen with how people are using coding agents. And so we have discovered I'm going to say about 14 patterns right now. I start them from a numbering of zero because you're in next and they represent patterns that our most fearless explorers, our, like, AI czars, if you will, the people who are at the cutting edge, have discovered as effective ways to use coding agents. So I'll give you an example of some of these patterns.
Eira May: Yeah, I'd love to. I'd love to hear one.
Vivek Raghunathan: Pattern one says plan in English. And what it means is first plan, use plan mode in your coding agent. First figure out what the plan is in markdown and then write code. Right? And I'm happy to show it to you on my laptop when we... We have an XKCD comic with these 14 so we can easily disseminate knowledge in the organization.
Pattern four I believe is fence your robots, and what it means is you can have a single agent and it's kind of slow. You can start a bunch of agents and have them all work together and then you have chaos, or you can have git worktrees that run each of them independently and fence them a bit and they work on stuff, and that can dramatically improve the productivity of most of our engineers.
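Editor's note: one way to picture the "fence your robots" idea is a separate git worktree and branch per agent task, so parallel agents never edit the same checkout. The sketch below is illustrative only and is not Snowflake's setup; the `agent/<task>` branch naming, the sibling-directory layout, and both function names are assumptions. It relies only on the standard `git worktree add -b <branch> <path> <commit-ish>` command.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, task: str, base: str = "main") -> list[str]:
    """Build the `git worktree add` command for one agent's task.

    Each task gets its own branch (agent/<task>, a hypothetical naming
    scheme) checked out in a sibling directory of the main repository.
    """
    branch = f"agent/{task}"
    path = repo.parent / f"{repo.name}-{task}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def fence_agents(repo: Path, tasks: list[str], base: str = "main") -> None:
    """Create one isolated worktree per task; raises if git reports an error."""
    for task in tasks:
        subprocess.run(worktree_command(repo, task, base), check=True)


# Example: inspect the command that would be run for one task.
cmd = worktree_command(Path("/src/app"), "fix-parser")
```

Calling `fence_agents(Path("/src/app"), ["fix-parser", "add-tests"])` would then create two independent checkouts, each of which could be handed to a different agent and merged back through the normal review process.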
Eira May: Okay. So order from chaos.
Vivek Raghunathan: Order from chaos.
Eira May: Yeah.
Vivek Raghunathan: Pattern eight is what I call the TLA pattern or the TLF agents pattern. It is your orchestrator, the agent you're talking to, the master agent you're talking to is never holding a lot of context. It is delegating work. It is using an agent team of some form to actually do the work. And so it is always, its brain is free to talk to you.
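Editor's note: the delegation idea in pattern eight can be sketched as an orchestrator that hands work to sub-agents and keeps only a short summary of each result in its own context. This is a toy illustration under stated assumptions, not a real agent framework: the `Orchestrator` class, the `Worker` type, and the truncation-based "summary" are all hypothetical stand-ins, and the workers here are plain Python functions rather than model calls.

```python
from dataclasses import dataclass, field
from typing import Callable

# A worker takes a task description and returns its (possibly long) output.
Worker = Callable[[str], str]


@dataclass
class Orchestrator:
    workers: dict[str, Worker]
    summary_limit: int = 80  # max characters kept per delegated result
    context: list[str] = field(default_factory=list)

    def delegate(self, role: str, task: str) -> str:
        """Run the task on a sub-agent and retain only a short summary."""
        full_output = self.workers[role](task)
        # Stand-in for real summarization: keep just the first few characters.
        summary = full_output[: self.summary_limit]
        self.context.append(f"{role}: {summary}")
        return summary


# A fake sub-agent that produces far more text than the orchestrator keeps.
orchestrator = Orchestrator(workers={"coder": lambda task: "x" * 1000})
summary = orchestrator.delegate("coder", "refactor parser")
```

The point of the design is that the orchestrator's context grows by a bounded amount per delegated task, however much output the sub-agent generates.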
Pattern 11 and 12 are, or 12 and 13 are new patterns. They're patterns around continued learning. They're patterns that recognize that you can mine memory and you can promote it into skills overnight and that will make the system get better as you use it. And if you do this in a multiplayer way where everybody in the team is doing it, then you get to harness tribal knowledge.
Now each of these 14 patterns are patterns that, individually, our best coding, our best AI-forward engineers are using. So now you have these 14 patterns. Now we have a language to speak in terms of how you upskill or reskill the engineering teams, if you will. And when you do that, you can then progress to what I call stage three, which is rein in the chaos. Right?
Eira May: Yeah.
Vivek Raghunathan: And there you start taking some of these paved patterns and you say, how do I get more people in the organization using them? Most interesting technique we have discovered is just a simple act of creating space and time for them. So we do these things called focus weeks. We do them pretty regularly and it's a week where everyone in the org just takes the time off to figure things out, and it serves two purposes. There's maybe 95% of the organization is what I call exploiters and the term feels like very spicy, but it actually means something very simple. It means they just want to know the paved paths and use them.
Eira May: Right.
Vivek Raghunathan: Right?
Eira May: Just give me the good stuff.
Vivek Raghunathan: Just give me the good stuff. I don't want to do all this learning. I just want to use this. Right?
Eira May: Sure.
Vivek Raghunathan: My PhD is in RL, so explore and exploit is very common. And so these are the exploiters. That's 95% of the organization, and they need the time to exploit. They don't know the patterns. They're like, "I'm too busy." And then the 5% of the organization is what I call the fearless explorers. These are the people paving the paths. These are people creating these best practices. They will go find the time. They're doing it on the weekends, they're doing it in the evenings, they're doing it... Sometimes when they're out with their friends, they're busy hooking up a mobile app to Cortex Code and doing stuff. And these users, these engineers just need the time to explore, and so we give them this week to go and... And I call it raising the floor and raising the bar. So the first guys we're raising the floor on, the second set of people we're raising the bar on. So that's just the inner loop.
Eira May: I like that.
Vivek Raghunathan: Right?
Eira May: Yeah.
Vivek Raghunathan: What happens if you do this is, roughly speaking where we are, 97% of our users, of our engineers are weekly actives on coding agents, and code is up about 1.5X in the last year over year, maybe 3X up over the last three years. Time to merge. Like things we know are characteristics of high performing teams, like they review code fast. They're like a music band really just like riffing off of each other.
Eira May: Yeah. You start to see them kind of compounding on the... Yeah.
Vivek Raghunathan: And so those kinds of patterns are all up and to the right. They're all up between one and 2X-
Eira May: Cool. Yeah.
Vivek Raghunathan: ... based on every metric. That's just the inner loop, right?
Eira May: Sure.
Vivek Raghunathan: So I said there were five stages. That's just the inner loop.
Eira May: Okay.
Vivek Raghunathan: On the outer loop, we are using AI to basically rethink every step of the outer loop. I think of three steps of the outer loop, right? Can we release code faster and better? The second is can we test harder? Can we have a lot more tests, a lot better tests and a lot higher coverage tests. And the third is, can we debug smarter? On the first step, and this is ver