Watch Episode

Episode Description

Zak Cole has shipped four successful exits in crypto—Whiteblock, Slingshot, Code Arena, and more—and he attributes much of that success to a rigorous specification-driven development process. In this episode, Zak joins us to discuss his journey from Marine Corps cryptographic asset manager to serial Ethereum builder, and introduces Adversarial Specs, an open-source tool that uses multiple AI models to pressure-test product specifications before a single line of code is written.

We dig into Zak’s philosophy that the most expensive mistake you can make is building the wrong thing, and how getting consensus on specifications upfront can compress what used to be a two-week review cycle into a couple of hours. The conversation covers his “two pizza rule” for team size, why LLMs are better at reasoning about specifications than writing code, and the surprising finding that Grok is the laziest model when it comes to providing critical feedback.

Whether you’re a solo founder bootstrapping your next project or leading a team through a major build, this episode offers practical insights on how to ship faster by slowing down to think first.

Topics Covered

  • Zak’s path from the Marine Corps to the Ethereum ecosystem
  • The product development process behind four successful exits
  • How Adversarial Specs pits Claude, GPT, and Gemini against each other to refine PRDs
  • Why specification-driven development saves engineering hours
  • Small teams, agency, and the “two pizza rule”
  • The case for trusting AI with specifications over code
  • Open source culture and building for external contributors

Transcript


All right, welcome to another Hashing Out interview. We got a fun one today. We brought on Zak. Zak, let's get this out of the way. Why don't you tell the audience who you are, what you do, where you came from, and then we can get into the more fun stuff. >> Yeah, sure. I'm Zak Cole. I'm an engineer, I run a venture studio called Number Group, I'm the president of the Ethereum Community Foundation, and I'm also the chief protocol officer at Oxbow and Privacy Pools. I've been working in crypto for, oh man, over 10 years now. >> Long time. >> Yeah, too long. I guess I should probably pivot into AI. >> We'll get there. So, I mean, you've been around the space in a lot of different capacities. Before we dive into why I brought you on, how did you get into the ecosystem? How did you find yourself here? >> Yeah. I was in the Marine Corps from 2007 through 2011. I was a network engineer and I managed cryptographic assets for the Department of Defense. During that time period, I got into Tor, and that led me to find out about Bitcoin, which was interesting to me. And then when Ethereum came out, I got into Ethereum strictly because Imogen Heap uploaded a song to the blockchain, and I like the Garden State soundtrack. >> We interviewed her about that back in, what, '15? The Bitcoin Podcast. >> Nuts. We're like, we're interviewing him and Gene. >> Yeah. And then the DAO hack happened, and I kind of got wrecked on that. But that made it more interesting to me. When ETH recovered, I went all in on ETH, and I've been building full-time in Ethereum since then. I've started a bunch of companies, had a few exits, and I just build and ship cool stuff. That's pretty much what I do.
Here's an interesting question for you that may be enlightening to the audience, because you've built, shipped, and exited successfully a number of times. Do you feel like the opportunity to continue to do that is less than it used to be? Because if you just look at the 2017 era, where I could ship a white paper and an ICO and make millions and millions of dollars and walk away, compared to now, clearly the competency required to be successful has changed. Is the opportunity less? >> I guess it depends on what you mean by opportunity. If you think you can ship a white paper and raise 10 million, I don't think that... Well, I don't know. Maybe it is. >> You need to be really connected. >> Yeah. I don't think that's very feasible. Granted, I never did any of that. And the opportunity, I feel, is the same for me as it's always been. There's pretty much infinite upside. This is a pretty nascent industry still, even though we've been doing this for a long time. >> Yeah. >> I think there's still plenty of opportunity if you're building cool stuff because you enjoy building cool stuff. It's a green field. But if you're just trying to max extract, then I don't know, you're probably screwed, but you probably would be anyway. Maybe not. The memecoin thing, there's always something that allows people to just >> max extract. >> But yeah, I think the opportunity is as good as it's ever been. >> Can I ask what projects you've been involved in, in terms of shipping and maybe exiting? >> Yeah, sure. >> My first company that I exited was Whiteblock, and that was a protocol-level testing framework.
>> We built the product and then used the product that we built. It's Whiteblock, like white, w-h-i-t-e. >> We talked a long time ago about this. >> Yeah, I was working on it for a while, probably about five years. And then after that I started Slingshot, which was a DEX aggregator. That was acquired by Magic Eden. And then after that we did Code Arena. Code Arena was acquired by Zellic. >> Oh, you founded Code Arena? >> Yeah, came up with the whole model for the competitive audits. >> Oh, I didn't know that. >> That's kind of like our spicy >> that's a good one >> thing that we worked on. >> And were you inspired by Jepsen when making Whiteblock, the testing stuff? >> No, I wasn't aware of it, but I became aware of it after working in testing for a while. So yeah, it's a pretty cool one. Then after Code Arena I started Number Group and worked on a bunch of stuff through Number Group, and one of the projects we've incubated is Oxbow and Privacy Pools. So I've been working on Number Group for about three or four years now, something like that. And we're a pretty small team. >> But what we do is we just ship stuff, come up with cool ideas, bring them to market. >> Nice. >> Which is what I wrote the tool for, the spec tool. Now it's full circle. >> Perfect. That's better than my transition. It's clear that your concept of building is more rigorous than what I would consider to be the standard today. >> Yeah, I'm very specific and very picky about processes, and I have a very specific process that I bring to all of my projects. And it seems to have worked pretty well, because, you know, I've had four exits.
I think statistically that's fairly unlikely to have one, let alone four. Um and I just kind of like fumbled my way into those. I didn't really like, you know, I just kind of conformed to this process and stick to it and that's how I build products and that's how I develop and that's kind of like how I I live and die by the process, you know. >> You walk us through that process. Yeah. Exactly. Yeah. >> Yeah. So, um if you have an idea for a product, so we're like a venture studio. We build a bunch of products. Everybody should have an idea. If anyone has an idea, they can bring it to the team. They can propose it. Um, the way that we do that is we have like an executive overview, like a feature brief, kind of like a product brief, and it's like a TLDDR that's like, "Hey, this is the product that I want to build. This is the uh problems that it addresses in the market. Here's why I think it'll have product market fit. Um, here's how we can monetize it or scale it or whatever." And then everybody reviews that, provides feedback. If there's consensus on like you know hey this is what we should ship then um we kick it into a product requirements document which is a specification that defines the business objectives of the product how the user is in expected to interact with it we have personas uh there's like a bunch of stuff it's pretty it's a pretty standard PRD um but it just kind of like defines the success of the product and the idea for all behind this whole process is that by the time we start writing code, everybody understands and agrees um uh like what we're building like on what we're building. So so nobody goes into it >> kind of like with without knowing exactly what we're doing and everybody everybody has already provided feedback. 
it's already undergone like you know rigorous you know cycles of of uh you know um like editing and modifying and just like changing based on everybody's you know opinions essentially uh based on their role um and the PD kind of acts as like the single source of truth across all departments so that's like in in uh like engineering marketing executive team whatever everybody body everybody references this PRD throughout the process. Um then once the PRD is locked in after it's gone through a few rounds of iteration um we work on the technical specification and that kind of defines the architecture and the highlevel technical requirements for the product how it's going to be built. Um and all of this saves engineering hours like the at a company on a team at a tech company uh engineering is like the most costly thing. Um, so we don't want uh and also engineers uh have a tendency to overengineer like they're they'll keep drilling down for as long as you let them. So it's really important >> good enough isn't isn't like unless good enough is defined they'll make their own and it's usually way overkill, >> right? So you need to define that within the technical specification within the PRD. That's why it's important that the PRD defines what is done and what's you know like what's good enough. Um, like don't keep spinning your wheels on something like we want to ship something quickly. We want to get something into the market so we can verify whether or not we have users, whether there's any type of inkling of product market fit. >> And I like to move really fast. Like if I'm working on something and I ship it and it doesn't work and people don't like it and it sucks. Yeah. >> Then I don't really like to buy into the sunk cost fallacy. It's like let's all right, let's forget about that project. Done. It's done. >> Like, you know, like let's move on to the next thing quickly. Yeah. >> Like, so I want to fail fast so we can get to the next successful thing. 
And I've had a litany of failures. You know, life is just a series of failures with like, >> you know, a couple wins sprinkled in there. And I think it's important to kind of learn from those failures, but also just forget about them and move on to the next one. >> Like that's it. like you're you're you're at a you're you're know you're at a casino. You're you're hitting the you know you gota like uh you're hitting the blackjack table. You move on to the next table. You don't really dwell on the last one, you know. Um >> what's the time from start to finish in terms of how long do you give uh a project uh to try to get PMF and then before you just ditch it? >> It really depends. Um uh like we I generally like to have a product built from concept to implementation ideally within like you know I mean it depend it depends on every every project. Yeah, it depends on the complexity and >> but generally I want to build something that takes two weeks to a month >> uh to put out there. Depends on what we want to build. Like a lot of the things you have good technical specifications, the building part becomes >> easier and easier depending upon >> how well you're able to leverage AI and how complex it is, things like this. And then on a long enough timeline as you build things and continue to build them, you start to develop a library of reusable components that you can just plug and play. So it gets easier and easier. As long as you can form these processes, you can move really fast. And that's what I want to do because, you know, we're we were mostly playing with our own money. We're self-funded. So, it's like let's like it's literally time is money. So, like >> we can't really afford and and and we're paying for it. Like, you know, we don't >> we didn't go do like some crazy raise and like we don't have like a bunch of money, right? So, we eat what we kill. So, I want to get out there. I want to get I want to ship things quickly. I want to see if it works or not. 
We set KPIs. Generally, my KPI is, >> let's get 10 users. And then the big KPI is, let's make $1. I'm not even trying to make a hundred dollars here. If anyone in the world is willing to give me a dollar for what we've built, then that's enough validation to keep the project alive for another day. >> But don't get caught in this sunk cost fallacy where you're just, oh, we put so much time into it, we can't abandon it now. And it's like, yeah, >> you should. >> You probably should. Most projects don't need to exist, let alone have a token or a raise or anything. Just because you have an idea doesn't mean you need to raise money for it, or marry it. >> Loosely held. >> One of the things, so you probably know this, but I work with Dmitri on a storage protocol, and we've been working on that for like four years now. So I'm in shipping mode in terms of trying to squeeze out something that normal people can touch and feel, understand how the protocol works, because it's basically BitTorrent plus persistence of the data. >> And my mentality is definitely aligned with yours in terms of only giving myself a month to ship a client, a BitTorrent client essentially, >> and then seeing if I can get traction with some users quickly and then try to monetize it right after. >> Yeah. And if nobody likes it, then, >> sorry, but don't keep trying to force a square block into a round hole. Just move on with your life. You have >> only so much energy and attention that you can give something, and if it's not serving you, move on. >> How do you budget things? For instance, you were saying that you're bootstrapping everything. We're doing the same thing. So again, same mentality: time is money. We have to move fast.
How do you guys figure out what projects are worth greenlighting and how much to give them in terms of runway? >> Yeah, I give them the minimum amount until there's some sort of validation, external validation that comes from someone that's not us. >> Yeah. >> And at that point, we try to calculate the total addressable market, and then we figure out how much of that we can consume, and then we calculate that based on the amount of capital we have available. But generally the answer is as little as possible for as long as possible. >> Yeah. I mean, it's a lot of vibes. Really, I don't have a great heuristic for that. It's kind of just going with my gut based on experience. >> Feels as though your gut's doing a pretty good job then. >> Sometimes. Sometimes it does a very poor job. Like, I am a failure. >> You got a lot of failures that you haven't... >> Statistically, like a failure, you know. But you've got to be able to move on. Yeah, I'm going to have a hundred million more failures, >> today, you know. >> Yeah. >> But you mentioned the core process is live and die by this method of doing things. >> Yeah. >> And it's been successful for you, and statistically you're doing a pretty good job. How has that process changed over the course of you doing it? Because if you just look at what you prioritize or value within a PRD versus what you call a technical spec, it's differentiated. I have a bunch of things that I will probably either fork or PR into this adversarial spec so that I can use it for how I plan to do things within Logos.
Because we've been trying to follow specification-driven development for a long time, which is mostly modeled after the IETF, and that's a very different process than PRD and technical spec. But maybe the simplicity of what you have here is sufficient for actually shipping good things. So I'm just curious, how have you felt that process change over time? >> I think at the beginning I was more like, oh, we have to be agile, we need to use Jira, we need to do this and do that and use this particular tool in this way. And that was because I came from enterprise, I was working in enterprise tech, >> and I was like, well, this is what they do, so it must work. >> But realistically... >> Yeah, it's stupid. Don't be married to any one of those things. The important thing is to be very strict on the things that matter. And the things that matter are communication and making sure that everybody's on the same page, and that's ultimately what the spec does. And then if they choose to use Jira or want to do whatever, it doesn't really matter, as long as everybody's conforming to the same process and rowing in the same direction. The details don't really matter. >> So overall, we use these specs as a roadmap for how we're going to ship something, and as long as we all stay on that same course, it works. Some teams that I'm with like to use Notion, and they like to use Linear, and others like to use GitHub, and others like to use Google Docs. A lot of people get hung up on the details of we should use this platform versus that platform. At the end of the day, it doesn't matter. All that matters is that everybody is in sync, they're on the same page, and they're rowing in the same direction. That's all that matters.
>> Yeah, couldn't agree more. A lot of teams will kind of bikeshed. I've been on teams where they're using Confluence, and then halfway through a cycle they'll be like, you know, I don't like Confluence, I'm going to switch to Linear. And then they argue about it, and it's two weeks of wasted work going back and forth debating the virtues of one platform versus the other, and at the end of the day it doesn't matter at all. You could write it in Google Docs. As long as everybody understands and they're moving, that's all that matters. >> Have you had a hard time convincing them that all that matters is the spec? >> Um, I think >> it's kind of downstream from that, in terms of... >> Yeah, I think some teams conflate moving fast with not having anything to guide them. And, >> oh, just build it. You don't need any documentation. Just build it. We'll do that later. >> Yeah. A lot of people. And then you end up in this situation where one engineer's built something and this other guy has a completely different idea, and now the codebase that they've both worked on independently doesn't really work. >> There's no mechanical sympathy associated. >> And then you have to go back and start from scratch again to make sure that all of these parties are aligned and building the same thing. Because I've been on projects and had teams where different people are building completely different things, and they all think they're building the same thing, >> but they're all like, "Oh, we're moving fast. We don't have time to do this and that." And it's like, "Oh, it takes too much time and too many cycles to lock in a PRD and a tech spec." That's stupid, because you literally are wasting time.
There's nothing wrong with making sure everybody's aligned. That's exactly what you should be doing, and that's the goal. >> But ostensibly there was an impetus for you to build this Claude tool. So for those that aren't aware, you shipped a Claude skill and a repository called adversarial-spec, which is a way in which you can either build a spec or PRD from scratch, or improve the quality of one that you already have, by having a bunch of disparate LLMs argue about it. >> Yeah. >> And so you have a number of templates for what a good PRD or spec is, and some variations around quality or perspectives in there, which we can get to maybe later. But the idea is that you have a bunch of LLMs argue about these things, and Claude is kind of the arbitrator, slash one of the arguers and the arbitrator of this discussion, and in the end, hopefully, it spits out a quality document that you can then move forward with. I assume you built that because the quality of things wasn't good enough, or the time it took to get to that level of quality was too long. Can you talk about that? >> Yeah, it was just too long. I mean, I've worked on PRDs for a couple weeks, and it's a pain in the ass. You have to go back and forth and iterate and get feedback from different stakeholders. And again, this is a process that I use on every one of my projects. This is what defines me as a product engineer. This is what I do, and it's taken years to refine that process. So I'm going to these new teams, I'm building new teams, I'm putting them together, and I'm trying to bring this process to them. And it's just a lot easier to use AI for that. It's something that I was doing manually. I would write my own PRD, drop it into ChatGPT, and be like, "Hey, what do you think of this?
Give me feedback on it." Obviously it's a little more eloquent of a prompt than that, but then it would output something, and I would copy that and put it into Claude and be like, "Hey Claude, ChatGPT is working on this with me. What do you think?" And then it would come up with some criticism, and it's stuff that I sometimes wouldn't even think about. It would surface things. So that was really valuable, and this is a process where generally I'm relying on the rest of the team to act as the rubber ducky. But I can just use LLMs for that now. And once the final product is available... I can pretty much get consensus among all these LLMs within like seven cycles. So within an hour or two, I'm pretty much reducing a two-week work cycle to like a day, and then the output is pretty much good. It's like 90%. >> I was playing around with it on a couple of draft specs that we have running around, and it covered some corner cases and found some problems with just the simplest of setups compared to what you have here, because you can get pretty complex with what you shipped. Just having two of them argue gave me what I would consider a better quality output, and allowed me to ask better questions, like, hey, what do you think about this part that isn't covered here? >> Right. Yeah. >> One question I have is team size, right? For instance, my experience working at Logos is that it's a big organization, right? It's like 200-plus people. There are a lot of voices, a lot of stakeholders, a lot of random stakeholders you didn't... >> Yeah, so many opinions. >> So each of, like, Codex, right, was a team of almost 20 people, but then you have stakeholders outside of the 20 people.
So it blew up. I did a PRD for Codex a long time ago, and I don't think anybody really looked at it. There were different breakdowns in flow, in terms of the number of people who wanted to be involved or should be involved. And one of the questions I have is, how big should a team be, and how big is your team? >> I mean, the way that I think and the way that I work is not conducive to large teams. I think there are diminishing returns. I'm a big fan of the two pizza rule: your team should never be larger than can share a pizza, essentially. >> I like that. >> Yeah. So my team at Number Group is six people, and we >> bring contractors on and tag people in on a per-project basis, but I don't see a world where we need more than that. >> I mean, the Constitution was written by like a dozen people. Why would you need >> why would you need more? >> That puts even further emphasis on quality specs, too, if you're dealing with contractors and freelancers: you need to make sure that >> they're under the same expectations that you are, so you can >> leave them alone and they come back with meaning. >> Yeah. And that's what I want in people on my team. I want people that have agency and are self-managing, and I want them to come in and tell me what to do. I don't want to hire somebody that I have to constantly manage. That's a pain in the ass. I don't really like employees. I don't like people that are just taking orders. I want people that are collaborative and on my level. And most people are not like that. A lot of people just don't have agency. They kind of sit there until you tell them what to do. And that's >> a waste of time and space. And people like that should just leave.
>> Go work enterprise. >> Yeah. Go work at Amazon or go work at like a 500- or 2,000-person company. Get out of here. I don't have time for that. That pisses me off. >> Well, it's kind of the thing. I take that mentality, and the type of environment that needs to exist for you to work well and ship things, and for me it has a very strong overlap with what I consider open source software culture. If you want to allow for the engagement of open source contributors, you have to have these things in place. Otherwise, you can't row in the same direction with a bunch of random-ass people that you don't even know exist unless you have these things in place, >> right? Your codebase needs to be commented. You need to have documentation. You need to have specifications. You need to have everything set up so that somebody can come in and pick up where you left off. Like if you die tomorrow... >> What's that? >> Without talking to you. >> Yeah, exactly. If you die or if you get fired, everything needs to be in a state that allows somebody else to come in and pick up the work. And even if you don't get fired, you should always be training somebody to take your place, because that allows you to scale yourself and move on and become better at what you do. Always be training somebody to take your place. Always be writing code that other people can pick up. It's really easy to ignore those things and just keep drilling down on building something, you know, but part of something being functional is its ability to outlive and outlast you, because ultimately, at the end of the day, that's what I want to do: build things that outlast me. I want to build things that are bigger than me.
That's the only impact I can really have on the world: the skills that I've developed and the products that I'm able to build. I want them to outlast me, you know, and the only way to do that is by communicating effectively, making sure that your codebase is clean, making sure everything is commented appropriately, making sure that everything is set up for success so that it can outlive you. >> How do you signal for that? For instance, I've read several hiring manager experiences, and it seems like people in larger organizations tend to hire people who have a subset of their skills, who are potentially limited in their self-development. Whereas, like you were saying, you expect people to give you feedback, you kind of like feeling friction because they're providing new ideas, they're doing things unilaterally, there's agency. A lot of people don't have that, and I see organizations blow up with people who don't have agency and don't develop new skill sets. It almost seems like people are trying to hire someone who they can just forever manage, and then they just live in that management zone, like Amazon and these big organizations. It's >> it's awful. >> Well, those are the frameworks that we have to grow from, right? It's hard to grow up looking at how the internet and companies have scaled and not model yourself after that. >> Yeah, I don't understand. >> You quickly realize it doesn't work. >> Yeah.
You need to feel the burn of your money being lost when you're not doing things. You can get lost in the comfort because an organization pays you well and grows big, and that's exactly the environment I don't want to be in, right? That's the environment that stunts most people, it seems like. >> Yeah, it's pretty easy to be like, oh, that's a cushy job, you could do this for the rest of your life, and just kind of live there. I mean, I have friends that work at these big bank companies and Apple and stuff. And I have this one friend in particular that worked at Apple, and he was never working. He was always playing video games. >> Big Head in, uh, Silicon Valley. >> Yeah, yeah. And he never really wanted to... he's not very motivated, I guess. Just one of those guys I grew up with, went to college with, you know. He's a cool dude, one of my good friends, but he doesn't really have the drive to do better and to grow. He just wants to take that job, sit there, and take a paycheck for the rest of his life, which is fine. You know, he's making good money. >> I'm sure it's more money than he ever thought he would make. So that's good. But I don't know. I always need more. I don't know if that's a good thing or a bad thing. Sure, it's bad in a lot of ways, but I just want to keep trying and growing and scaling, you know? >> It's that impact, right? I think it's an admirable thing to want, in the way that you want it, because you're saying you want to build things that outlast you and outgrow you. And that's just, in my opinion, the millennial in us: I want to have impact, right?
Like I want the things I do to be meaningful, because we watched our parents do things that weren’t. >> That’s what you’re trying to do. And wanting to do more, at least the way I interpret it, is wanting to have more impact and build things that are actually valuable to society. >> Yeah. I don’t know. You only have so much time and energy. Look at your life as a glass of water: you only have so much you can distribute to other vessels, you know. So you might as well make it count. >> I remember reading an Arnold Schwarzenegger quote: he just wanted to be useful, and so he did a whole bunch of different things to find other people finding him useful. >> Yeah. >> Arnold’s a good guy. >> Bringing it back, because here’s a quality segue: if you would like to distribute your water in a way that is efficient so that things grow, typically you try to do that through tooling that amplifies your output and gives you leverage. And Adversarial Specs, in my opinion, is something that you feel does that. Is it where you want it to be? Like, you shipped it, so it’s at some level of quality that’s good for you. What do you hope or expect or plan for it to grow or change? Is it good enough, or do you see the complexity growing? Are you concerned with it becoming too complex and too cumbersome? >> Yeah, I think it’s at the point now where the features and the functionality are pretty robust. I think it’s fine. I think it’s good. And then my criteria for success would be: all right, this is an open source project, are people going to contribute to it? And so far, there have been over half a dozen contributors within the past five days that it’s been out, and people are opening pull requests and submitting issues. 
So I think that’s a pretty good criterion for success. It’s a simple open source tool that automates a process that I engage in on a daily basis, and it makes it a lot easier. So, putting it out there, I think it’s good to go. The thing that I would like to continue to research is this multi-model, adversarial process of refinement, and thinking about different ways that can be applied, because I don’t see a whole lot about it. There’s not a lot of stuff that I can find that… >> I haven’t seen the adversarial model too much. I mean, there are some things in training. >> There are research papers on it, but I’m not really a research paper type guy. Just show me how it works. I’d rather just figure it out and ship the product as opposed to, you know, pontificating on the academic implications of it. >> That’s useful. That’s why I asked about how you view the template for this type of thing. Because if you’re going to use an adversarial setting, they’re arguing over something, and that’s going to be some context you give them on what the ideal is, and that’s this template. So if you look at the SKILL.md file, that defines what you call good in a lot of ways, and then the different models, with their different viewpoints, argue over what good is, right? Until you get to some final point. So how do you see the direction of good for these types of things? Is the improvement going to be the artifact that you call good, and how you reason around those things, and the prioritization of each point? Or is it the quality of the arguing? Or is it both? >> Yeah, I think it’s both. I think quality is definitely important, though. 
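For readers who want a concrete picture of the multi-model adversarial loop being described, here is a minimal sketch. Everything in it is hypothetical: the `ask(model, prompt)` helper stands in for any chat-completion API call, and the model names, prompts, and round count are illustrative, not Adversarial Specs’ actual implementation.

```python
# Hypothetical sketch of a multi-model adversarial spec-review loop.
# `ask(model, prompt)` stands in for any chat-completion call.
from typing import Callable

REVIEWERS = ["claude", "gpt", "gemini"]  # illustrative model ids

def adversarial_review(spec: str, ask: Callable[[str, str], str],
                       rounds: int = 3) -> str:
    """Have several models critique a spec, then revise it, for N rounds."""
    for _ in range(rounds):
        critiques = []
        for model in REVIEWERS:
            critique = ask(model, "Find flaws in this spec. Be specific, "
                                  "do not just agree:\n\n" + spec)
            # Press models that agree too quickly -- lazy "looks good to me"
            # replies are exactly the failure mode discussed in the episode.
            if "looks good" in critique.lower():
                critique = ask(model, "Why does it look good? Justify each "
                                      "section of:\n\n" + spec)
            critiques.append(f"[{model}] {critique}")
        # One model folds all the critiques back into a revised spec.
        spec = ask(REVIEWERS[0], "Revise the spec to address these critiques:"
                                 "\n\n" + "\n".join(critiques)
                                 + "\n\nSPEC:\n" + spec)
    return spec
```

The key design point is the “press on agreement” branch: an early agreement is treated as a signal to demand justification, not as consensus.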
I mean, I kind of address that in this model: if a model agrees early on, it’s important that it gets pressed on why it agrees. Actually, in my experience, I’ve tested all of the models, and, just a funny side note, Grok is actually the laziest model. >> The most laid-back one’s the laziest. >> Yeah, yeah. It gets presented with things and it’s like, yeah, looks good to me. And it’s like, wait… >> You don’t have any feedback? Why don’t you have any feedback? >> You can wait for the Memphis, uh, supercomputer that they have there to kick up; maybe you’ll get some feedback. >> Yeah. I don’t know, the model’s not that great at abstract thought and, you know, a lot of other things, not in the same way that Claude and ChatGPT and Gemini are. >> ChatGPT 5.2 is pretty great. >> Yeah. So I put 5.2 against Opus and against, you know, Gemini and all those models, and it works pretty well. But yeah, just wanted to say Grok is the laziest model, for anyone who’s wondering. >> Have you seen, like, adding in the different perspectives? You can put on different personas, like QA and stuff. Have you used those? Like, if you have a feeling that a spec doesn’t have a good viewpoint covered, are you using that persona just to improve on that thing? How are you using those? >> Uh, can you be a little more specific? >> What’s the point of using the persona for you? Why did you add that in? >> Oh, so different models can take the role of different personas. Is that what you mean? >> Yeah. Why would you even add in that specific thing? Was it because you saw some lack within a given specification and you wanted the ability to hone in on that thing? Is that why you added it? >> Yeah. Yeah. 
And also, different models are good at different things, so assigning them roles within a cycle is pretty good. That’s a useful thing: having them focus on particular areas. But I mean, I just want to create something that’s more versatile, more diverse. >> Clearly you built this because you feel this is something LLMs are good at, and it saves you time. >> Yeah. Well, I mean, I think LLMs are better at stuff like this than they are at writing code, just because it’s language-specific, it’s about communication, it’s about reasoning. I think they excel quite well at that. So I trust them to do this specification more than I trust them to write code. >> It’s also a little less… the stakes are a little lower. They’re not going to delete your codebase, right? You know what I mean? >> Right. Yeah. >> Or like, it’s harder to reason around some complex codebase additions than it is around this human-readable document that is meant for clarity. >> Yeah. I mean, I definitely spend a lot more time arguing with AI about code implementation than I do about writing specs. It’s pretty easy to write a good spec with an LLM, and most of the time it’s better than what I would come up with myself, even with my team. Code is a totally different story, because LLMs will straight up lie to you. They’ll gaslight you, and when it comes to code it’s totally… >> You’re exactly right. >> Yeah. And then it’ll also just… it’s like, okay, this test isn’t passing. So usually I’m like, all right, writing code should be pretty easy. 
You write a test; if the function conforms to that test, and the output conforms to what that test defines, then it should be good to go, and you should just be able to trust it. But then, if the tests aren’t passing, the model will… I’ve seen it do this more times than I can say. >> Yeah, exactly. >> It’ll modify the test to be like, okay, this passes. And it’s like, hold on, are these smoke tests? Did you edit the test? You modified the test. And I’ll even explicitly have a CLAUDE.md that says: don’t modify tests, don’t do this, don’t do that. And it’ll do that stuff anyway. It doesn’t even listen to you. And then you have to pay attention to exactly what the output is in your terminal, to make sure that it’s not cutting corners and finding loopholes and doing a bunch of weird, lazy stuff. It’s pretty crazy. So at that point, I might as well just be writing the code myself. It’s more work to babysit an LLM to make sure it’s implementing something the way that I want it to than it would be to just do it myself, you know? So anyway, the point is, it’s a lot easier to write the spec than it is to write functional code using an LLM. >> It’s interesting to see how we figure out how to leverage these things effectively, right? That’s kind of the whole thing: what is it going to be good at? And you had this full tilt shift when we came past a threshold of quality, where everyone used it for everything and said it’s going to change the world and take over everybody’s jobs, and then… >> And people are like, actually, it can’t build anything of complexity and it kills your tests. 
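One cheap guardrail against the test-editing behavior described here is to checksum your test files before letting an agent loose, and refuse to accept a “green” run if any of them changed. This is a minimal sketch of that idea, not something from the episode; the file paths and workflow are assumptions.

```python
# Hypothetical guardrail: detect when an agent has modified test files.
import hashlib
from pathlib import Path

def snapshot(paths):
    """Map each test file path to the SHA-256 of its contents."""
    return {p: hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in paths}

def tampered(before, after):
    """Return the files whose hashes changed between two snapshots."""
    return sorted(p for p in before if before[p] != after.get(p))
```

Usage would be: snapshot the test directory, run the agent, snapshot again, and only trust a passing test suite when `tampered()` comes back empty.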
So we had to back off that and find some middle ground. And it’s interesting to see where that boundary is, not only with what we have today, but how that boundary changes as it progresses. Maybe one day it just does it all, but for now, it sticks to the, you know, human reasoning thing. >> Yeah. I think we’re still in this discovery period with AI: okay, what is it good at, and what can it do? Like, ChatGPT is pretty great at a lot of things, but ask it to show you a dolphin emoji, and you will break it. >> Ask it, like, “is there a in tomato”… no, that’s been fixed. But yeah. >> Try asking it for a dolphin emoji. >> Here’s a dolphin emoji. Let’s try. Uh, there we go. >> We’re getting better. >> I mean, I tried it last week, the dolphin emoji. >> Which one did you use? >> Screen share. Uh, ChatGPT 5.2. >> Yeah. Here, I’ll add it to the… >> The best and greatest can handle the dolphin emoji. All right. Good deal. >> There you go. >> Yeah. >> I don’t know. >> You’ll figure it out. You’ll get there. >> No, no, no. Ask, “is there a seahorse emoji?” >> Is there a seahorse emoji? >> Nope. “There is a seahorse emoji. Horse, fish, shrimp, shell… please. So, seahorse.” >> Okay. >> What if you just say, “show me a seahorse emoji”? >> “There isn’t a seahorse emoji,” and a cute text seahorse. Apparently that’s a text seahorse. >> That looks more like a dick to me. But how about this as a question, as you try to leverage this thing: have you found it more useful to use yourself, or are you getting your teams to use it when they present things to you? >> I built this because I needed it. All the stuff that I build that I like the most, that people seem to like the most, are things that I build because I need them. Because I want to use this thing, right? 
Like, I don’t know if I need to try to proselytize this to everyone so that when they’re building things, they come up with better quality, because I’m always interested in using AI to improve the quality of my work to the point where, when I present it to other people, it’s better. >> Yeah. Well, I mean, that’s kind of the big value proposition for it. >> But it also kind of helps… like, I’m training a new product manager, kind of, and trying to get him locked in on product processes. It’s kind of hard: how do you teach somebody to write a good spec, you know? This kind of reduces a lot of the friction in that, because I can just be like, hey, use this plugin, use this tool. >> Just for a day on setting up their environment correctly. >> And then I don’t have to pound it into his brain: oh, this PRD needs more user personas. And then they’re like, well, what’s a user persona? You know what I mean? It’s like, well, here you go. >> That goes back to having good documentation. The skill itself is good documentation. And I found it interesting, as the complexity of AI progressed and we’ve continued to struggle to figure out how to use it, that the way in which we’re coalescing is: give the LLM good context so that it can go off and do the job by itself. >> Mhm. >> And I found that to be the exact same thing you’re doing with humans. >> Yeah. >> Like, when you’re managing a project, your goal as a project manager is to give them the right context so they can be useful by themselves and not talk to anybody. >> Yep. >> And so it’s funny to me that where we’re arriving, in terms of context engineering and learning how to leverage these AIs, is literally the same thing that project managers have been trying to get across for a long time. >> Mhm. 
>> And now it’s useful because the turnaround time of getting an LLM to be useful lets you make documents, or texts, or artifacts, whatever you want to call them, so fast that it’s like, oh, that is actually the way to do it. So maybe it’s just because the speed of turnaround was so much faster that it’s easier to get that point across. >> Yeah. I guess on the topic of context, another interesting feature that I implemented is the interview process. So if you don’t have a PRD, you can just kind of describe what you want, and then it will prompt you with questions to figure out what it is you want exactly. And you can also just use that interview mode if you have an existing PRD. I think that provides a lot more context. >> Yeah. It reads it, asks some clarifying questions, improves it. And that way, in a sense, you’re doing that step I mentioned of improving the quality of something before giving it to somebody else, even for the LLM, right? You’re saying: before we have multiple LLMs argue about this thing, how about we get it to a state where that argument is useful. And then, once you’ve done a bunch of LLM arguments to get this document into what we would consider a really useful state, it’s really good to give to another human and have them argue about it, or leverage it for whatever work they need to do. >> It’s kind of this mental process of how you leverage LLMs to not make slop, and to build things that are actually useful for building. >> Yeah, it’s super easy to just put out a bunch of infinite slop with AI, you know. But yeah, I just built something that I wanted, and this is what I needed, something that helps me. And it seems like it’s helpful to other people, too. 
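The interview mode described here can be sketched roughly as a question-gathering loop: the model keeps asking the user clarifying questions until it has enough context, then drafts the PRD. Everything below — function names, prompts, the `DONE` sentinel — is a hypothetical illustration, not the plugin’s actual code.

```python
# Hypothetical sketch of an "interview mode" for drafting a PRD.
from typing import Callable

def interview(idea: str,
              ask_model: Callable[[str], str],
              ask_user: Callable[[str], str],
              max_questions: int = 5) -> str:
    """Turn a rough idea into a PRD by asking the user clarifying questions."""
    notes = [f"Idea: {idea}"]
    for _ in range(max_questions):
        question = ask_model(
            "Given these notes, ask ONE clarifying question needed to write "
            "a PRD, or reply DONE if you have enough:\n" + "\n".join(notes))
        if question.strip() == "DONE":
            break
        # Each answer becomes part of the context for the next question.
        notes.append(f"Q: {question}\nA: {ask_user(question)}")
    return ask_model("Write a PRD from these notes:\n" + "\n".join(notes))
```

In a real plugin, `ask_model` would wrap an LLM API call and `ask_user` would read from the terminal; the same loop also works on an existing PRD by seeding `notes` with it.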
It’s gotten pretty good traction, but I think mostly what I’m interested in is exploring that multi-model adversarial pattern. I don’t think a lot of people have really done that, at least publicly. >> I would suspect that most people are doing this stuff, but it’s like alpha, the way that you use your AI and stuff. >> Similar to how you started, right? You put it in GPT, you copy and paste it into Claude and say, what do you think? And this is a much more agentic way of doing it that ends up with something that’s reasonably useful, faster. I’m curious about how it changes over time, and what the effective way of altering personas is. Could you maybe set some specific devil’s advocate in there, trying to stir things up? >> Like have a model whose only role is to be an idiot. >> Or just disagree. >> But it’s going to be hard, because at some point you’re going to reach diminishing returns, right? So what are you going to do? >> How do you identify that area, so you’re not over-engineering and over-complicating the process? >> Yeah. Because, ultimately speaking, this is something that one of my colleagues, Yatsk, has said when discussing specifications: people always ask him, what’s a spec? He says it’s the minimal amount of information you can give me where I understand what you’re doing. >> Yeah. And that minimal amount is, I think, a really key point, because you don’t want to over-specify what’s going on, but you need a sufficient amount of information to get the job done. So finding that boundary is what you’re really after when doing this type of thing. 
>> Yeah, that’s kind of hard to quantify. I think it’s different for everybody. The context that I require in order to do something without any feedback is different than yours, right? So it’s kind of nebulous. It’s not really quantifiable; it’s not a deterministic level. Everybody’s is different. So how can you create something that’s good enough for most people? I think that’s probably what you need to do. >> I think my litmus test for this is: if someone asks me a question, and the question is a product of them going through and trying to answer it via this document first, then that communication is already more efficient. >> Yeah. >> Right? They’ve already gone through the process of trying to understand through this document, and if they didn’t get to it, when they come to me, that communication is very efficient. >> Yeah. Why say more word when fewer word do trick? >> Yeah. With my girlfriend, I always look at her, and I just want her to contextualize the way I look at her without verbalizing anything, so that she knows exactly what I need at any moment. I told her, at some point you’ll know me well enough that everything I do will be etched into your mind, and likewise, right? Because I’ll know exactly her mannerisms, the way she’ll react to certain things. And it’s interesting: you’re trying to scale one person’s potential idea, to reach consensus in such a way that, like you’re saying, you minimize communication. >> Yeah. >> Like a hive mind. >> Yeah. Yeah. >> Maybe that’s what we’re building towards with AI. >> All right, man. I definitely appreciate you coming on the show. >> Yeah. I hope that people watch this and leverage this tool. 
>> I’ve already found it useful, even just in my understanding of things and how I can maybe help people proselytize the idea of specifications and PRDs more. Is there something else you would have liked to talk about that we didn’t get to? >> No, not really. Pretty much covered it. Cool to talk about this stuff. I’m mostly just interested in how we can apply that multi-model adversarial pattern to additional use cases: how is it going to be good for writing code, or shipping a better product outside of strictly the specification phase, like testing, QA, whatever? QA might be interesting, actually. So yeah, I’m just thinking about different things I can develop that would help research and explore that area. >> Well, anybody listening who has some ideas, hit us up, ship something, make a PR. Thanks for coming on. >> Yeah, thank you.