Posts tagged "ai"

August 23, 2025

First Impressions of Sonic (the stealth xAI Grok coding model?)

Update (August 26, 2025): Sonic has been confirmed as the new Grok Code model and is officially available as of today. What was once a “stealth” model is now xAI’s publicly released coding assistant.

A new stealth model called Sonic has quietly appeared in places like Opencode and Cursor, and it’s rumored to be xAI’s Grok coding model. I spent a full day working with it inside Opencode, replacing my usual Claude Sonnet 4 workflow, and came away impressed. The short version? It feels like Sonnet, but turbocharged.

Getting Access

Sonic showed up in Opencode’s model selector, with no special invite required. Choosing it in the dropdown was the only setup needed. During this stealth period it’s free to use and offers a massive 256,000-token context window. I didn’t encounter any rate limits or throttling all day.

First Test

Instead of a contrived “hello world,” I dropped Sonic into my real workflow. Where I’d usually use Sonnet 4, I asked Sonic to help me build a new Training Center CRUD feature in a Ruby on Rails application.

It generated the models, migrations, controllers, and views exactly as expected and, most importantly, it did it fast. Text streamed out so quickly it was hard to keep up. Compared to Claude Code, I’d estimate it consistently produced about 3x the tokens per second.

That was my wow moment: speed without sacrificing accuracy. It nailed the Rails conventions and felt like a drop-in Sonnet replacement.

My Standard Benchmark

My go-to evaluation prompt for coding models is: “Tell me about this codebase and what the most complex parts are.” On my legacy Rails codebase, Sonic immediately identified the correct overloaded model as the main complexity hotspot, and even suggested ways to refactor it.

The result was nearly identical in quality to what I’d get from Sonnet or even Opus, but again, the response came back almost instantly. For day-to-day comprehension and reasoning, Sonic seems to match the best while dramatically cutting wait time.

Comparison to Claude Sonnet

  • Reasoning & Accuracy: Nearly identical to Sonnet. No hallucinations, no weird Rails missteps.
  • Personality: Very similar, though Sonic felt a bit more positive and agreeable to suggestions.
  • Speed: The standout. At least 3x faster than Claude Code/Sonnet, with the speed holding steady across short and long generations.
  • Where Sonnet Might Still Win: Hard to say yet. I didn’t hit any reasoning failures, but Sonnet has a longer track record of reliability under heavy workloads.

Bottom line: if you’re used to Sonnet, Sonic feels like the same experience at high speed.

Pricing and Availability

Right now Sonic is free to use in Opencode during its stealth rollout. No public pricing details are available yet, but if it undercuts Sonnet, it could be a winner.

I haven’t seen any rate limits or regional restrictions. Cursor users also report seeing Sonic available as a selectable model.

Initial Verdict

After a full day of work, I would — and did — ship production code with Sonic. It handled CRUD features, comprehension of a large legacy codebase, and various refactors without issue.

The real differentiator is speed. If Sonnet is your baseline, Sonic offers the same reasoning ability but with response times that feel instantaneous. Unless something changes in pricing or reliability, this could easily become my daily driver.

Quick Reference

  • Access: Select “Sonic” in Opencode’s model selector (no invite required during stealth)
  • Pricing: Free during stealth; official pricing TBD
  • Best for: Day-to-day coding tasks, CRUD features, codebase comprehension, Rails development
  • Avoid for: Nothing obvious yet — but keep human oversight on critical code
  • API docs: Not yet public
  • Hot take: As I posted on X, this feels like my Sonnet replacement should I ever be forced to go without Sonnet. @robzolkos


August 4, 2025

Two Days. Two Models. One Surprise: Claude Code Under Limits - The upcoming weekly usage limits announced by Anthropic for their Claude Code Max plans could put a dent in the workflows of many developers - especially those who’ve grown dependent on Opus-level output.

I’ve been using Claude Code for the last couple of months, though nowhere near the levels I’ve seen from the top 5% of users (some of whom rack up thousands of dollars in usage per day). I don’t run multiple jobs concurrently (though I’ve experimented), and I don’t run it on GitHub itself. I don’t use git worktrees (as Anthropic recommends). I just focus on one task at a time and stay available to guide and assist my AI agent throughout the day.

On Friday, I decided to spend the full day using Opus exclusively across my usual two or three work projects. Nothing unusual - a typical 8-hour day, bouncing between tasks. At the end of the day, I measured my token usage with the excellent ccusage utility, which calculated what that usage would have cost via the API.

[Image: Claude Code Opus usage]

Then today (Monday), I repeated the experiment - this time using Sonnet exclusively. Different tasks of course, but the same projects, similar complexity, and the same eight-hour block. Again I recorded the token usage.

[Image: Claude Code Sonnet usage]

Here’s what I found:

  • Token usage was comparable.
  • Sonnet’s cost was significantly lower (no surprise).
  • And the quality? Honestly, surprisingly good.

Sonnet held up well for all of my coding tasks. I even used it for some light planning work and it got the job done (not as well as Opus would have, but still very, very good).

Anthropic’s new limits suggest we’ll get 240-480 hours/week of Sonnet, and 24-40 hours/week of Opus. Considering a full-time work week is 40 hours, and there are 168 hours in a week in total, I think the following setup might actually be sustainable for most developers:

  1. Sonnet for hands-on coding tasks
  2. Sonnet + GitHub for code review and analysis
  3. Opus for high-level planning, design, or complex architectural thinking
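
To sanity-check that split, here is a rough Ruby sketch of the arithmetic. The hour ranges are the ones quoted above; the method name `work_weeks_covered` is my own, and the calculation only compares quoted hours against a 40-hour week. It assumes nothing about how Anthropic actually meters usage.

```ruby
# Number of 40-hour work weeks that a weekly model allowance would cover.
# Illustrative arithmetic only; it says nothing about how usage is metered.
def work_weeks_covered(allowance_hours, work_week_hours = 40)
  allowance_hours.to_f / work_week_hours
end

hours_in_week = 7 * 24 # 168

# Ranges quoted in the post (hours per week).
sonnet = 240..480
opus   = 24..40

puts "Hours in a calendar week: #{hours_in_week}"
puts "Sonnet, low end:  #{work_weeks_covered(sonnet.min)} work weeks"
puts "Sonnet, high end: #{work_weeks_covered(sonnet.max)} work weeks"
puts "Opus, low end:    #{work_weeks_covered(opus.min)} work weeks"
puts "Opus, high end:   #{work_weeks_covered(opus.max)} work weeks"
```

On these numbers, even the low end of the Sonnet allowance is six full work weeks, while Opus covers somewhere between 60% and 100% of one. That is why saving Opus for planning and leaning on Sonnet for everything else looks workable.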

I highly recommend you be explicit about which model you want to use for your custom slash commands and sub-agents. For slash commands there is a model attribute you can put in the command front matter. Release 1.0.63 also allows setting the model in sub-agents.

I would love to see more transparency in the Claude Code UI about where we sit in real time against our session and weekly limits. I think if developers saw this data they would adjust their usage accordingly. We shouldn’t need third-party libraries to track and report this information.

Based on this pattern, I don’t think I’ll hit the new weekly limits. But we’ll see - I’ll report back in September. And of course there is nothing stopping you from trialing other providers and models and even other agentic coding tools and really diving deep into using the best model for the job.


August 3, 2025

Claude Code's Feedback Survey - It seems this weekend a bunch of people have reported seeing Claude Code ask them for feedback on how well it is doing in their current session. It looks like this:

[Image: Claude Code feedback survey]

And everyone finds it annoying. I feel things like this are akin to advertising. For a paid product, feedback surveys like this should be opt-in. Ask me at the start of the session if I’m OK with providing feedback. Give me the parameters of the feedback and let me opt in. Don’t pester me when I’m doing work.

I went digging in the code to see if maybe there is an undocumented setting I could slam into settings.json to hide this annoyance. What I found instead is an environment variable that switches it on even more!

CLAUDE_FORCE_DISPLAY_SURVEY=1 will show that sucker lots!

These are the conditions that will show the survey:

  1. A minimum time before first feedback (600 seconds / 10 minutes)
  2. A minimum time between feedback requests (1800 seconds / 30 minutes)
  3. A minimum number of user turns before showing feedback
  4. Some probability settings
  5. Some model restrictions (only shows for certain models) - I’ve only had it come up with Opus.
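
To make those conditions concrete, here is a hypothetical Ruby sketch of how such a gate could be combined. It is not Claude Code’s source: the method name `show_survey?`, its parameters, and the placeholder values for the turn threshold, probability, and model list are all assumptions. Only the 600 and 1800 second values come from the list above.

```ruby
# Hypothetical reconstruction of the gating described above. This is NOT
# Claude Code's implementation; names and placeholder values are assumptions.
def show_survey?(session_seconds:, seconds_since_last_survey:, user_turns:, model:,
                 roll: rand,                 # random draw in [0, 1)
                 min_first_delay: 600,       # from the post: 10 minutes
                 min_gap: 1800,              # from the post: 30 minutes
                 min_user_turns: 5,          # placeholder, real value unknown
                 probability: 0.1,           # placeholder, real value unknown
                 eligible_models: ["opus"])  # placeholder, real list unknown
  # 1. Too early in the session.
  return false if session_seconds < min_first_delay
  # 2. Too soon after the previous survey (nil means none shown yet).
  return false if seconds_since_last_survey && seconds_since_last_survey < min_gap
  # 3. Not enough user turns so far.
  return false if user_turns < min_user_turns
  # 5. Model is not in the eligible list.
  return false unless eligible_models.include?(model)
  # 4. Finally, a probability check decides whether to show it this time.
  roll < probability
end

# Five minutes in: too early, regardless of the random draw.
puts show_survey?(session_seconds: 300, seconds_since_last_survey: nil,
                  user_turns: 10, model: "opus", roll: 0.0) # prints false

# Fifteen minutes in, enough turns, eligible model, lucky draw.
puts show_survey?(session_seconds: 900, seconds_since_last_survey: nil,
                  user_turns: 10, model: "opus", roll: 0.0) # prints true
```

Passing `roll` explicitly keeps the example deterministic; leaving it out falls back to a fresh `rand` on each call.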

Asking for feedback is totally ok. But don’t interrupt my work session to do it. I hope this goes away or there is a setting added to opt out completely.


July 22, 2025

AI Coding Assistants: The $200 Monthly Investment That Pays for Itself - Lately the price of AI coding assistants has been climbing. Claude Code’s Max plan runs $200 per month. Cursor’s Ultra tier is another $200. Even GitHub Copilot has crept up to $39/month. It’s easy to dismiss these as too expensive and move on.

But let’s do the math.

The Simple Economics

A mid-level developer in the US typically costs their employer around $100 per hour when you factor in salary, benefits, and overhead. At that rate, an AI coding assistant needs to save just 2 hours per month to pay for itself.

Two hours. That’s one debugging session cut short. One feature implemented faster. One refactoring completed smoothly instead of stretching into the evening.

I’ve been using these tools for the past year, and I can confidently say they save me 2 hours in a typical day, not a month.
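
The math is simple enough to put in a few lines of Ruby. This is a minimal sketch: the $200 price, $100 hourly cost, and 2 hours saved per day come from the text above, while the method names and the figure of 20 working days per month are my own assumptions.

```ruby
# Hours an assistant must save per month to cover its subscription.
def break_even_hours(monthly_price, hourly_cost)
  monthly_price.to_f / hourly_cost
end

# Monthly value of the time saved, minus the subscription price.
def monthly_net_value(hours_saved_per_day:, working_days:, hourly_cost:, monthly_price:)
  hours_saved_per_day * working_days * hourly_cost - monthly_price
end

puts break_even_hours(200, 100) # prints 2.0

# working_days: 20 is an assumption, not a figure from the post.
puts monthly_net_value(hours_saved_per_day: 2, working_days: 20,
                       hourly_cost: 100, monthly_price: 200) # prints 3800
```

Swap in your own hourly cost and a more conservative estimate of hours saved; the subscription still pays for itself as long as the monthly saving stays above the break-even figure.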

The Chicken and Egg Problem

Here’s the catch: how do you prove the value before you have the subscription? Your boss wants evidence, but you need the tool to generate that evidence.

This is a classic bootstrapping problem, but there are ways around it:

1. Start with Free Trials

Most AI coding assistants offer free trials or limited lower-cost tiers. Use them strategically:

  • Time yourself on similar tasks with and without the assistant
  • Document specific examples where AI saved time
  • Track error rates and bug fixes
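
One lightweight way to do the timing comparison is to log paired timings and average the difference. The Ruby sketch below is only an illustration: the method name `average_minutes_saved`, the hash keys, and the sample numbers are all made up and are not measurements from my own work.

```ruby
# Average minutes saved per task, given paired timings for similar tasks
# done without and with the assistant.
def average_minutes_saved(timings)
  return 0.0 if timings.empty?
  total = timings.sum { |t| t[:without_ai] - t[:with_ai] }
  total.to_f / timings.size
end

# Made-up example data, for illustration only.
sample = [
  { task: "add CRUD endpoint", without_ai: 90,  with_ai: 40 },
  { task: "fix flaky test",    without_ai: 60,  with_ai: 45 },
  { task: "refactor model",    without_ai: 120, with_ai: 70 }
]

puts average_minutes_saved(sample).round(1) # prints 38.3
```

Even a small log like this, kept for the length of a trial, gives you something concrete to show instead of a gut feeling.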

2. Run a Pilot Program

Propose a 3-month pilot with clear success metrics:

  • Reduced time to complete user stories
  • Fewer bugs making it to production
  • Increased test coverage
  • Developer satisfaction scores

3. Use Personal Accounts for Proof of Concept

Yes, it’s an investment, but spending $200 of your own money for one month to demonstrate concrete value can be worth it. Track everything meticulously and present hard data.

[read more...]


July 21, 2025

Context7: The Missing Link for AI-Powered Coding - If you’ve spent any time working with AI code assistants like Cursor or Claude, you’ve probably encountered this frustrating scenario: you ask about a specific library or framework, and the AI confidently provides outdated information or hallucinates methods that don’t exist. This happens because most LLMs are trained on data that’s months or even years old.

Enter Context7 – a clever solution to a problem that not many developers know about, but everyone experiences.

What is Context7?

Context7 is a documentation platform specifically designed for Large Language Models and AI code editors. It acts as a bridge between your AI coding assistant and up-to-date, version-specific documentation.

Instead of relying on an LLM’s training data (which might be outdated), Context7 pulls real-time documentation directly from the source. This means when you’re working with a specific version of a library, your AI assistant gets accurate, current information.

The Problem It Solves

AI coding assistants face several documentation challenges:

  1. Outdated Training Data: Most LLMs are trained on data that’s at least several months old, missing recent API changes and new features
  2. Hallucinated Examples: AIs sometimes generate plausible-sounding but incorrect code examples
  3. Version Mismatches: Generic documentation doesn’t account for the specific version you’re using
  4. Context Overload: Pasting entire documentation files into your prompt wastes tokens and confuses the model

I’ve personally wasted hours debugging code that looked correct but used deprecated methods or non-existent parameters. Context7 aims to eliminate this friction.

[read more...]


July 14, 2025

The real gift of letting ‘a murder of AI crows’ do the ‘CRUD monkey’ work isn’t efficiency - it’s liberation. Free from the repetitive, we can finally tackle the ambitious, the novel, the genuinely hard problems worth solving. The stuff that actually moves the needle.


July 4, 2025

After a coding session, I run a custom Claude Code slash command, /quiz, that finds a couple of interesting things in the work we just did and quizzes me on them. It’s a bit of fun and keeps the learning happening.

We've done some substantial work in this session and I would like you to quiz me to cement learning.

You are an expert Ruby on Rails instructor.

1. Read the code we have changed in this session.
2. Pick **2 non-obvious or interesting techniques** (e.g. `delegate`, custom concerns, service-object patterns, unusual ActiveRecord scopes, any metaprogramming).
3. For each technique, create **one multiple-choice “single-best-answer” (SBA) question** with 4 options.
4. Ask me the first question only.
5. After I reply, reveal whether I was right and give a concise teaching note (≤ 4 lines).
6. Then ask the next question, and so on.

When all questions are done, end with:
`Quiz complete – let me know where you’d like a deeper dive.`

I share my Claude Commands here


July 2, 2025

The Ground Your Agent Walks On - Every codebase is terrain. Some are smooth highways, where AI agents can move fast and confidently. Others are more like an obstacle course - still functional, but harder to navigate, even for experienced developers. For AI agents, the difference matters more than you might think.

Matt Pocock recently tweeted, “Know the ground your agent will walk on.” It’s a great metaphor. AI coding assistants aren’t just tools - they’re travelers trying to make sense of your landscape. The easier that terrain is to read, the better they perform.

The Terrain Metaphor

Think of your AI agent as a sharp, capable junior developer. Fast, tireless, and helpful - but very literal. They don’t infer intent. They follow cues.

When your codebase has a clear structure (focussed models, controllers that follow consistent patterns, logic that lives in obvious places), AI agents can hit the ground running. They know where to go and what to do. But when logic is scattered across models, helpers, and controller actions - when responsibilities blur and patterns break - it’s harder. The AI has to guess, and that’s when bugs, duplication, or missed edge cases creep in.

You’ve likely seen it: in a clean, readable codebase, the AI knows where to add password reset logic. In a tangled one, it might reinvent validation from scratch, or break something that silently relied on old behavior.

The Productivity Multiplier

Well-structured code doesn’t just help AI agents a little. It can make them drastically more useful.

Clean abstractions give the model leverage. Instead of spitting out code you need to carefully review or fix, it can offer changes that fit right into your architecture. The AI stops being just a helpful autocomplete and starts being a real multiplier.

[read more...]


June 20, 2025

How can we use AI without getting dumber? - DHH’s recent observation about AI resonated:

“As soon as I’m tempted to let it drive, I learn nothing, retain nothing.”

When he lets AI take the wheel, his learning stops. Knowing how you learn best is crucial and I respect that. But his tweet reminded me of my first week with Rails many years ago. One command - rails generate scaffold Blog - and boom. Working blog. Complete with database migrations, controllers, views, the works. I felt like a god.

For about ten minutes.

Then came the client requests. “Can we add comments?” “What about tags?” “Why is it so slow with 10,000 posts?”

I had no idea. I’d built nothing. I’d learned nothing. The only thing that had happened was some code appeared on my screen.

I know what you’re thinking: “That’s different. Rails generators create 50 lines of boilerplate. AI writes entire features, algorithms, and architectural decisions. One removes tedium, the other removes thinking.”

Fair point. The scale has changed dramatically. But here’s what’s interesting - in the early days, those 50 lines of generated code felt just as magical and just as dangerous to understanding. I remember the debates about whether generators would make us worse developers, whether we’d lose touch with the fundamentals. Some of us worried we’d forget how to write SQL or understand HTTP.

Sound familiar?

We’re watching the same pattern repeat, just amplified. With Rails generators, with Stack Overflow, with Google, with IDE autocomplete, and now with AI. Each time, the community splits: those who see a shortcut to ship faster, and those who worry we’re shortcutting our way out of understanding anything.

Both groups are right. And both are missing the point.

[read more...]


June 18, 2025

That Weird AI Workflow Might Just Work - Kieran Klaassen and teammates shared their Claude Code workflow a few days ago. They broke down their process, showed what they built, and yesterday Kieran posted about the (potential) API costs a workflow like this has (or would have if not for Anthropic’s Max plan). The response? While some were curious, the critical voices dominated - calling it too expensive and claiming ‘these AI folks’ aren’t building anything real (check out Kieran’s X feed to see how absurd that is).

Here’s what bothers me: Kieran wasn’t bragging. They were sharing data. They were excited about their productivity gains and wanted to show others what worked for them. And instead of curiosity or questions, they got dismissal.

The Real Problem

We all know AI is transformative. Nobody’s arguing that anymore. But there’s this weird gatekeeping happening around how people use these tools.

Someone posts about using Claude to write tests? “That’s not real testing.” Someone shares their Cursor workflow? “You’re just racking up API bills.” Someone shows how they built an app in a weekend with AI assistance? “But is it production quality?”

The critics are missing the point entirely. These developers aren’t saying their way is the only way. They’re experimenting. They’re pushing boundaries. They’re figuring out what works.

Why This Matters

Every breakthrough in development workflows started with someone trying something different and sharing it. Remember when people mocked developers for using Rails? “It doesn’t scale.” “It’s just a toy framework.” “Real developers use Java.”

Those early Rails developers weren’t wrong for sharing their excitement. They were pioneering new ways of building web apps. Some of their approaches failed. Others became industry standard.

Sure, not every experiment will pan out, and healthy skepticism has its place. But there’s a difference between thoughtful critique and reflexive dismissal.

Same thing is happening now with AI workflows.

[read more...]