From "What the F*ck Does That Mean, Claude?" to "Aha" Moment(s)

I don't have any coding experience. Not at all. I did a short computer course when I was seven on Soviet-era PCs. But you'll need to buy me a pint to hear that story.

Fast Forward 30 Years

It's August 2025. I told Gemini: build me a Word add-in. (Not exactly good practice, I know.)

And frankly, it did something. It wrote code. It explained what I needed to do. It said: get Script Lab. Fast prototyping, easy to set up. So I did.

Then it said: here's the code, sideload it. I loaded it. The add-in looked visually basic. Time to try it out.

It was shit. Redlining was turning on but deleting big chunks of text.

Still, I thought, it could do something. I vibe coded a few other things. A time-tracking app. Basic HTML. A timer. Ability to export into CSV. That was about it.

Enter Claude

A few months later, Gemini 3 comes out. Buzz everywhere about how brilliant it is. Around the same time, Google releases Antigravity. Everyone's excited.

Now at this point, my vibe coding experience is limited to using the Gemini consumer app to generate code and copy it into Script Lab. I did experiment with Script Lab quite a lot. The results were improving but still poor.

Then Claude Opus gets updated. Even bigger buzz. I go to Claude. I said: hey Claude, how do I make this Word add-in work?

Claude said: you need to inject OOXML into Microsoft Word.

I said: what the fuck is that, Claude?

Claude said: rather than programmatically turning on track changes, just inject the XML directly. You'll get proper redlines. (Claude is always being nice.)

The code was probably twice as long as what Gemini generated. And at this point I'm still manually copying libraries, CSS, TypeScript into Script Lab. I didn't even know you could import files into Script Lab and it would just work.

So Claude tells me: go and do OOXML injection. Little did I know how difficult that was. XML is extremely fragile. One mistake and you're out. AI can't reliably generate it. You have to build it programmatically.
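For the curious, here is roughly what building a redline in code rather than by hand can look like. This is not the add-in's code (that was TypeScript). It is a minimal Python sketch using only the standard library, so tags always close and special characters get escaped. The function names, the author and the date are made up for illustration.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return the fully qualified name of a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


def _add_run(parent, text_tag, text):
    """Append a run holding `text`; xml:space keeps leading and trailing spaces."""
    run = ET.SubElement(parent, w("r"))
    node = ET.SubElement(run, w(text_tag), {XML_SPACE: "preserve"})
    node.text = text


def tracked_replacement(old, new, author, date):
    """Build a paragraph that deletes `old` and inserts `new` as tracked changes."""
    para = ET.Element(w("p"))
    meta = {w("author"): author, w("date"): date}

    # Deleted text sits inside <w:del> and uses <w:delText> instead of <w:t>.
    deletion = ET.SubElement(para, w("del"), {w("id"): "1", **meta})
    _add_run(deletion, "delText", old)

    # Inserted text sits inside <w:ins> and uses the ordinary <w:t>.
    insertion = ET.SubElement(para, w("ins"), {w("id"): "2", **meta})
    _add_run(insertion, "t", new)
    return para


# Hypothetical example values.
redline = tracked_replacement("30 days", "60 days", "Reviewer", "2025-08-01T09:00:00Z")
print(ET.tostring(redline, encoding="unicode"))
```

In Word's markup, w:del wraps the struck-through text and w:ins wraps the new text. Each w:id should be unique within the document; the sketch hardcodes 1 and 2 for brevity.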

I had Claude generate TypeScript. When I hit limits, I'd switch to Gemini and continue there. It generated decent results. You could highlight a clause, say amend it. That kind of worked.

But I still didn't understand how any of it actually worked. And that was a problem.

10,000 Lines of Code. No Idea What Any of It Means.

The codebase is approaching 10,000 lines. One file. I have no idea what any of this means. Is a big blob of 10,000 lines of code good enough?

(Spoiler alert. It's not.)

The Chromebook, the Command Line, and Antigravity

This is when I learned about Antigravity. It had generous tiers. I said: can I do this there?

About my setup. I have a work laptop and a Chromebook. Why a Chromebook? What do you actually do with your personal hardware? Browse the internet. Watch films. A Chromebook is good enough.

It's also fortunate that setting up Linux on a Chromebook is easy. Claude explained how. I did it. Then Claude explained how to install Antigravity. I did that too.

First time since I was seven that I'm staring at a command line in Linux. Intimidating black screen with letters. No UI. No visual experience whatsoever.

Antigravity was set up. I opened it. Amazing. The experience was agentic. The tiers were generous.

But I'm still working with that 10,000-line blob. And at some point Antigravity started crashing. I thought: surely my Chromebook is to blame. Turns out it was the code. I think.

Learning to Do Things Properly

I said: Claude, what do I need to do?

At this point I'm running Antigravity to generate code and paying £20 a month for a Claude subscription to tell me what to do, teach me, guide me through the process.

Claude said: you need to run a proper project. You need to do it with Yeoman.

I said: what the fuck is Yeoman, Claude?

Claude explained. It walked me through the whole thing. Created a proper tree structure. Now it could work with the relevant bits of code instead of one enormous file.

This is when I learned how to push stuff to GitHub.

I had one project in GitHub from back in August. That big blob of Script Lab code, manually copied into a repository. Private.

Claude said: push to GitHub.

I said: what does that mean?

Claude practically generated a facepalm emoji. (It did not, but I bet it "thought" about it.) Don't you know anything?

It explained how to set up an SSH connection to GitHub. How to push things into repositories. Then it said: tell Antigravity to commit.

I said: commit to what exactly?

Claude said: just commit.

I typed the command. Antigravity: committed. I thought: what the fuck just happened?

Claude gave me a nice explanation of what commit means. Then: tell Antigravity to push.

I said: push what? And where?

I typed the command. Antigravity: pushed. I went back to Claude: what just happened?

Claude said: look at your GitHub.

I went to the repository. There was the tree. There was the code. It was amazing.

Maybe reading about it first would have been more streamlined. But because I tried it first, failed, broke things, I learned a lot. And I remembered how to do stuff.

Beyond the Word Add-In

Okay. I've generated this thing I don't understand. Started as 10,000 lines. Now it's in a tree. What do you do with it?

At that time (and still to this day, though I feel the tide is changing), what I read about vibe coding was mainly negative. Security. Vulnerabilities. Unmaintainable code. No scalability. All valid concerns. But that didn't help me.

I thought: it's a hobby. I want to understand things. But I also want it to be useful. (I also secretly wanted to test how close, functionally, you could get to big legal tech.) Because I was about 70% there with the Word add-in, and it could genuinely be something worth finishing. Something bigger.

So I started experimenting. I started a project where you'd upload documents, upload a playbook, and it would produce a redlined version. That was even more complicated than the Word add-in. Because you're not doing it clause by clause. You're ingesting the entire document structure.

This is where I spent a lot of time with Claude learning how XML works. And this is where things started to change. Because Claude, like any LLM, can't exactly "come up" with solutions, right? These are big statistical machines. They predict the next most probable token. You must have heard of that.

Claude said: you need to do this programmatically. Python. So we did. It worked for certain things. Missed lots of edge cases. Formatting was shit. Styles were shit. The redlines were good. Everything else was shit.

I said: Claude, can you make it less shit?

The "Aha" Moment

And this is where it clicked.

Contracts have a particular structure. Word has a particular structure. There are paragraph IDs. I learned that through various iterations. And because I'm a lawyer, I know contracts follow a particular pattern. Definitions. Operative provisions. Boilerplate. The typical stuff.

So I said: why don't we generate a map? Each paragraph ID has context and style. We pass that to the LLM. That way, we give the LLM the structure of this contract: clauses, sub-clauses, schedules. This will surely work, right?

Claude said: good idea. (Just like fully trusting an LLM is a good idea.)

Still shit.

I said: right. Why don't we have post-processing? A styling pass. Yes, we'll spend more tokens. But let's try having AI apply styles based on the map. Either specifying what the style should be, or where the style should be cloned from. Because inserting new paragraphs into a Word document and having them look consistent is genuinely difficult.

That worked. Didn't solve every edge case. But that was the turning point.
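A rough sketch of the two ideas above, the map and the style cloning, in Python. It is an illustration, not the project's code. It assumes the paragraph IDs are Word's w14:paraId attributes and that a new paragraph borrows its paragraph properties from an existing one. The style name Clause, the IDs and the function names are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET
from typing import Optional

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W14_NS = "http://schemas.microsoft.com/office/word/2010/wordml"
PARA_ID = f"{{{W14_NS}}}paraId"


def w(tag):
    """Return the fully qualified name of a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


def paragraph_style(para) -> Optional[str]:
    """Return the paragraph's style ID, or None if it relies on the default."""
    style = para.find(f"{w('pPr')}/{w('pStyle')}")
    return style.get(w("val")) if style is not None else None


def build_map(body):
    """List every paragraph with its ID, style and text, in document order."""
    return [
        {
            "id": para.get(PARA_ID),
            "style": paragraph_style(para),
            "text": "".join(t.text or "" for t in para.iter(w("t"))),
        }
        for para in body.iter(w("p"))
    ]


def clone_style(source, target):
    """Replace the paragraph properties of `target` with a copy of `source`'s."""
    existing = target.find(w("pPr"))
    if existing is not None:
        target.remove(existing)
    source_ppr = source.find(w("pPr"))
    if source_ppr is not None:
        # <w:pPr> has to be the first child of <w:p>.
        target.insert(0, copy.deepcopy(source_ppr))


# Hypothetical two-paragraph contract body.
SAMPLE = f"""<w:body xmlns:w="{W_NS}" xmlns:w14="{W14_NS}">
  <w:p w14:paraId="0A1B2C3D"><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Definitions</w:t></w:r></w:p>
  <w:p w14:paraId="0A1B2C3E"><w:pPr><w:pStyle w:val="Clause"/></w:pPr><w:r><w:t>Affiliate means any entity.</w:t></w:r></w:p>
</w:body>"""

body = ET.fromstring(SAMPLE)
contract_map = build_map(body)  # this is what would be passed to the LLM

# A newly drafted paragraph arrives with no styling at all.
new_para = ET.SubElement(body, w("p"))
new_run = ET.SubElement(new_para, w("r"))
ET.SubElement(new_run, w("t")).text = "Business Day means a day other than a weekend."

# Clone the look of the existing definition, found by its paragraph ID.
source = next(p for p in body.iter(w("p")) if p.get(PARA_ID) == "0A1B2C3E")
clone_style(source, new_para)
print(contract_map, paragraph_style(new_para))
```

The map is only as good as the document behind it: a paragraph with no explicit style shows up as None here, which is exactly the kind of edge case a styling pass then has to deal with.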

This is how you work with AI. Vibe coding is not "do this for me". For a simple tool, sure. But for something more complex? Not a chance. This is where human and LLM genuinely work together.

And as you know, LLMs are agreeable. They change their mind. They go along with whatever you suggest. The value isn't in blind delegation. It's in the back-and-forth. Pressing Claude (and any frontier LLM for that matter) on approaches. Challenging it.

Claude was (and is) my brainstorming buddy.

Attempts at Best Practices

Okay. I did the thing. Claude helped me with XML injection. But I still had the same problem. Who would ever use this stuff?

You can't really go to enterprise with it. So I thought: open source might be the answer. But who would maintain this code? I didn't know what a pull request was. I had no idea how to deal with issues. I had no idea what was going on on GitHub. I'm still a little in the dark, although I understand much more now.

So I said: hey Claude, how do I make this work?

Claude said: you need to create ADR documentation.

Here we go again. What the fuck is an ADR?

And I thought: I don't know what I did. Claude did it. I know why we did it. But the how? That's all Claude.

So we created ADR documentation for the Word add-in. It's still there. It explains decisions in detail.

But obviously, creating documentation is only part of the story. What use is documentation if the code isn't maintainable? If you don't know whether it's secure enough? If you don't know whether there are vulnerabilities?

And the most recent endeavour...

Claude Code and the Chrome Extension

Around this time, everyone starts talking about Claude Code. I'm still sitting in Antigravity. I don't understand what the fuss is about.

So I decided to try it. This is where I had to spend more than £20, which is annoying. But the results were good.

I installed Claude Code (command line in Linux). The next project was a Google Chrome extension. The reason I wanted to try it: I had a couple of projects on GitHub and certain enthusiasts were starring them. There were some forks. But there was no way for people to actually try the tools. The idea was to move the entire backend into Google Chrome.

I said: Claude, can I do this?

Claude said: you need WebAssembly and Pyodide.

Now. Progress. Instead of saying "what the fuck is that", I said: Claude, please explain this to me.

Pyodide is basically Python compiled to WebAssembly, which means you can run Python code in the browser, and so inside a Chrome extension. I said: cool, let's do this.

Claude said: but there's a size limitation. Your extension will be big.

Claude Code's ability to fetch tools and run agents in parallel is unparalleled. I wasn't entirely sure at that point how it all worked, but the results rendered through the terminal were much better than what I could have achieved through Antigravity (though I now know that this is not quite right).

Now, with this one, unlike the earlier projects, I wanted people to actually try it. So I asked: hey Claude, how can it become maintainable? I'll put it on GitHub, but how could people actually maintain it?

Claude said: you need the structure to be in blocks of no more than 200 lines of code.

So we gave that task to Claude Code.
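A check like this can keep the 200-line rule honest. It is a small Python sketch that assumes "blocks" means source files. The function name, the file extensions and the skipped folders are assumptions for illustration, not anything Claude prescribed.

```python
from pathlib import Path


def oversized_files(root, max_lines=200,
                    suffixes=(".py", ".ts", ".js"),
                    skip_dirs=("node_modules", ".git")):
    """Return (path, line_count) for every source file longer than `max_lines`."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        # Ignore dependency and version-control folders.
        if any(part in skip_dirs for part in path.parts):
            continue
        if path.is_file() and path.suffix in suffixes:
            with path.open(encoding="utf-8", errors="replace") as handle:
                count = sum(1 for _ in handle)
            if count > max_lines:
                results.append((str(path), count))
    return results


if __name__ == "__main__":
    # Report offenders in the current project folder.
    for path, count in oversized_files("."):
        print(f"{path}: {count} lines")
```

Run from the project root, it prints every file that has grown past the limit, which makes a handy to-do list to hand back to the agent.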

Then I said: okay, documentation is good. But how do we ensure it's secure?

This is where I told Claude to imagine a Fortune 500 company was auditing this extension, as if they were onboarding it. Claude produced a self-audit document. We gave that to Claude Code to run against the codebase. All documented.

Then I accidentally discovered that there's a code simplification skill in Claude Code. So I ran that too.

The result? The extension was packaged, accepted by Google, and published.

And then I found a bug. I didn't like the output of the packaged extension. So I had to resubmit. And with Google, you have to wait a few days. So before I could even roll it out properly, I waited a couple of weeks until Google accepted the version I could actually publish.

What This Story Doesn't Tell You

The paragraphs above make it look as if I moved quickly. Gemini consumer app. Antigravity. Claude Code. Neat progression.

In reality, I spent many hours and many evenings burning tokens. I'm not proud of how many. But it was trial and error that took me from "oh, this looks cool" to "how will this technology shape the future?"

What this story doesn't tell you is that I experimented with agentic harnesses. I took Claude as an agent, put it on top of a pipeline, and created custom UI for lawyers where each instruction is a separate workspace. And through all of that, I arrived at a realisation.

Big legal tech made a judgement error. They gave us chatboxes. Then they built features around those chatboxes. (Some may say they are just wrappers and argue about whether they are "thin". But wrappers can be useful. The rest is economics.)

In a way they had no choice. Regardless, that's backwards. Lawyers don't work with chatboxes. They work with client instructions.

This process made me think tactically about the tools I'm building. The Google Chrome extension is one. The MCP form filler is another. And the latest endeavour: making Copilot actually useful for lawyers.

In the end, what I arrived at, through all of these, is a set of principles. Not exactly best practices. My best practices. Built through various iterations with Claude.

I ended up asking Claude to produce documentation. To simplify code. To break it into maintainable chunks. To run security checks. To run self-audits as if a Fortune 500 company was evaluating the tool.

Is this enough? Probably not. But progress is undeniable.

Legal Quants

Now. This story feels lonely. It isn't.

And that's where the Legal Quants community is relevant.

When I created my Word add-in, I saw that Jamie Tso had been posting tools he'd built. So I decided to post something too and tag Jamie. Jamie reached out and added me to the early iteration of what became Legal Quants. At the time, the name didn't even exist. He added me to a WhatsApp group of just a few individuals. That community has grown significantly since.

But the point is: I saw that there are like-minded people. And the quality of discussion in that community is second to none.

This post is not about Legal Quants. But instead of asking "what the fuck is that, Claude?" I can now say: hey guys, what do you think about this?

Now watch how this community makes a massive impact within the industry.

[To be continued]