Most AI Agent Problems Are Infrastructure Problems

I think a lot of teams are optimizing the wrong layer. When an AI workflow breaks, the first instinct is usually to swap the model. Try OpenAI. Try Anthropic. Try a new prompt. Try a new framework. Maybe that helps. Usually it doesn't fix the real problem.

Most AI agent problems are infrastructure problems. The model is the visible part, so it gets all the attention. But in practice, the failures usually happen in the seams. A tool call times out. A background job runs too long. A session loses state. A retry happens in the wrong place and duplicates work. One flaky dependency turns a clean demo into a system that quietly dies halfway through the job. That stuff is not sexy, but it's the whole game.

The demo works, the system doesn't

A lot of AI products look good in a five-minute demo because the happy path is easy to stage. You give the model a clear instruction. It calls the right tool. The data comes back clean. The output looks smart. Everyone nods.

Then real usage starts. Now inputs are messier. APIs are slower. Credentials expire. One tool returns malformed JSON. Another gives you a 429. A user asks for a task that takes 20 minutes instead of 20 seconds. Suddenly the question isn't whether the model is smart. The question is whether the system can survive contact with reality.

That's why I keep coming back to the same point: reliability matters more than cleverness. If you want a useful AI system, I think you need a few boring things before you need a better prompt.

What actually matters

First, you need retries that aren't stupid. Not infinite retries. Not blind retries. Real retries with limits, backoff, and some awareness of what failed.

Second, you need state. If a workflow has already finished steps one through four, it should not start over just because step five broke. It should know where it is, what already succeeded, and what still needs attention.

Third, you need supervision. Long-running work needs checkpoints. It needs status. It needs a way to surface "here's what happened, here's what's blocked, here's what I'm doing next" without making the user babysit every move.

Fourth, you need graceful degradation. If the ideal path fails, the system should still have a second move. Maybe it waits. Maybe it falls back. Maybe it asks for help at the right moment instead of crashing into a wall and pretending it completed the task.

None of this is glamorous. That's exactly why it matters.

Model switching is not a strategy

I like good models. I'm happy to use better ones whenever they show up. But "we'll just switch models" is not an operating plan. It's a coping mechanism. If your system depends on every tool succeeding instantly, every API staying stable, and every run finishing on the first try, you're not building an agent. You're building a brittle chain of lucky events.

The teams that win here won't just have access to strong models. Everyone will have that. The teams that win will build the execution layer around the model. They'll know how to recover work, route around failures, preserve context, and keep moving when the world gets noisy. That's the real moat.

It's the same lesson I wrote about in Nobody Cares About Your AI. The flashy part gets attention. The useful part solves the problem.

The boring work is the product

I don't think the future belongs to the teams with the most impressive demos. I think it belongs to the teams that make AI feel dependable. The ones that make a user trust that the work will finish. The ones that handle failure without turning the user into unpaid QA. The ones that treat recovery, observability, and orchestration like product features, because they are.

According to McKinsey, the economic upside of generative AI is massive. I buy that. But I don't think most of that value comes from demos that look magical on day one. I think it comes from systems that keep working on day one hundred. That's a much less glamorous story. It's also the one worth building.
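To make "retries that aren't stupid" concrete, here is a minimal sketch of the shape I mean: a retry budget, capped exponential backoff with jitter, and an allowlist of which failures are worth retrying at all. The function names and defaults are mine, not from any particular framework, and a real system would also classify errors by status code.

```python
import random
import time

def call_with_retries(fn, max_attempts=4, base_delay=0.5, retryable=(TimeoutError,)):
    """Call fn, retrying transient failures with capped exponential backoff.

    Only exception types in `retryable` are retried, so a permanent failure
    (bad input, auth error) fails fast instead of being hammered in a loop.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts:
                raise  # retry budget exhausted: surface the failure
            # Exponential backoff, capped at 8s, with jitter so many
            # workers retrying at once don't stampede the same API.
            delay = min(base_delay * 2 ** (attempt - 1), 8.0)
            time.sleep(delay * random.uniform(0.5, 1.5))
```

The point isn't the ten lines of code. It's that the retry policy is explicit: bounded attempts, growing delays, and a deliberate decision about what counts as transient.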
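The "state" point, where finished steps stay finished, can be sketched the same way. In this illustration, a checkpoint file records which named steps have succeeded, so a rerun after a crash resumes at the failed step instead of redoing steps one through four. All the names here are hypothetical; real orchestrators use durable stores rather than a JSON file, but the idea is the same.

```python
import json
from pathlib import Path

def run_workflow(steps, checkpoint_path):
    """Run (name, fn) steps in order, persisting completed step names.

    On a rerun, steps recorded in the checkpoint file are skipped, so a
    crash at step five doesn't throw away steps one through four.
    """
    path = Path(checkpoint_path)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    for name, step in steps:
        if name in done:
            continue  # already succeeded in a previous run
        step()  # may raise; checkpoint keeps earlier progress safe
        done.add(name)
        path.write_text(json.dumps(sorted(done)))  # checkpoint after each success
```

Note the checkpoint is written after each success, not at the end; that's what makes the workflow resumable rather than all-or-nothing.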

The Best AI Use Cases Are the Ones Nobody Tweets About

Open any tech feed right now and you'll see the same stuff. AI generating photorealistic images. AI writing entire codebases. AI having philosophical conversations about consciousness. Cool demos. Genuinely impressive technology. And almost none of it is where the real money is being made.

The AI use cases actually generating revenue are the ones nobody screenshots for LinkedIn. Temperature log validators for restaurant chains. Payroll compliance checkers that flag overtime violations before they become lawsuits. Employee classification calculators that tell you whether your new hire is exempt or non-exempt under FLSA guidelines. Boring stuff. The kind of problems that make people's eyes glaze over at dinner parties.

I know this because I'm building these tools right now. LogChef checks whether food temperature logs meet health code requirements. Exemptly walks you through the DOL's duties tests to classify employees correctly. PayShield catches payroll compliance gaps before an auditor does.

Nobody is going to retweet a temperature log validator. Nobody is making TikToks about employee classification flowcharts. But here's the thing: people actually search for these tools. Real humans with real compliance deadlines type "FLSA exempt vs non-exempt calculator" into Google every single day. And when they find a tool that solves their problem in 60 seconds, they remember who built it.

I wrote about this pattern before in Nobody Cares About Your AI. The technology itself doesn't matter to the end user. What matters is whether the problem goes away. A restaurant manager doesn't want "AI-powered food safety monitoring." They want to not fail their next health inspection. A payroll admin doesn't want "machine learning compliance analysis." They want to stop worrying about whether they're breaking federal overtime rules.

The gap between what the AI community talks about and what businesses actually need is enormous. And that gap is where the opportunity sits.

If you're building with AI right now, here's my honest take: stop chasing the use case that'll get you on Hacker News. Start looking at the problems people are too embarrassed to admit they still handle with spreadsheets and sticky notes. The compliance checks done on paper. The classification decisions made by gut feel. The audit prep that takes three people a full week.

Those are your best AI use cases. They just don't make good tweets.

What's the most boring problem you've seen AI actually solve well?

Nobody Searches for Your Product Name

Here's something most SaaS companies get wrong about search. They spend months optimizing for branded terms. "Best project management tool." "[Product] vs [competitor] comparison." "[Product name] review 2026." They fight over the same ten keywords every competitor is already bidding on. Meanwhile, nobody is Googling their product name. At least not the people who need them most.

The real search happens before the product

Think about what someone actually types when they have a problem. Not a software shopping problem. A real, right-now, my-boss-is-asking-about-this problem. They type things like:

"restaurant temperature log template"
"employee exempt vs non-exempt calculator"
"ISO 9001 audit checklist PDF"
"how to calculate overtime for salaried employees"

These are the queries that matter. Specific. Boring. High intent. The person searching isn't browsing. They need something right now, and they'll use whatever solves it.

Why boring queries win

I've been building small compliance tools for the past few weeks. Free web utilities that solve one narrow problem each. A temperature logging tool. A payroll compliance checker. An exempt vs non-exempt classifier. None of these are products in the traditional sense. They're single-purpose tools that answer one question really well.

But here's what's interesting. The search volume for these micro-queries is real. According to Ahrefs, "temperature log template" gets searched thousands of times a month. "Exempt vs non-exempt" gets even more. And almost nobody is building dedicated tools to capture that traffic. Most of the results are blog posts from law firms and HR consultancies. Long articles that kind of answer the question but don't actually give you the thing you need. No calculator. No downloadable template. No interactive tool. Just 2,000 words of background information and a "contact us for a consultation" CTA.

That's the gap.

Own the query, not the category

The mistake is thinking you need to own a category term like "compliance automation" or "workflow management platform." Those terms sound important in a pitch deck, but real humans don't search that way. Real humans search for the specific problem sitting on their desk right now.

If you can be the thing that solves that problem, you don't need the person to know your brand first. You just need to be there when they search. The brand relationship builds backward from usefulness.

This is basically the playbook that HubSpot used early on with their free tools. Website grader, email signature generator, invoice templates. None of those were the core product. All of them brought in people who eventually needed CRM software.

What this means if you're building

Stop fighting for category keywords. Start asking: what's the smallest, most specific problem my future customer has right now? Build the thing that solves it. Make it free. Make it show up when they search.

The boring query is the wedge. The product conversation comes later.

I've been testing this approach with compliance tools, and the early signals are promising. More on that as the data comes in. But the principle holds: nobody is searching for your product name. They're searching for help with the problem you solve. Go own that query instead.

Nobody Cares About Your AI

I've been building compliance tools for the past few weeks. Temperature logs, food safety checklists, inspection prep workflows. Boring stuff by any AI startup's standards.

Here's what I've learned: restaurant owners do not care about AI. Not even a little. They care about passing their next health inspection. They care about not getting fined. They care about making sure the walk-in cooler didn't die overnight and ruin $3,000 worth of product.

If you walked into a kitchen and said "I built an AI-powered temperature monitoring solution with real-time anomaly detection," the chef would look at you like you just spoke Klingon. If you said "this tells you when your fridge breaks before your food goes bad," now you're talking.

The Gap Nobody Tweets About

The AI industry has a massive blind spot. We're obsessed with capability and completely uninterested in context. Every week there's a new benchmark, a new model, a new agent framework. Meanwhile, the people who would benefit most from better software are still using paper logs and spreadsheets because nobody bothered to meet them where they are.

I'm not saying AI isn't powerful. It is. But power without packaging is just a science project.

What Actually Works

The compliance tools I've been shipping don't mention AI anywhere. Not in the copy, not in the UI, not in the pitch. They're just tools that solve specific problems for specific people. A temperature log generator that creates the exact form a restaurant needs for their daily checks. A payroll compliance calculator that tells you whether your overtime policy matches your state's rules. A 503 error page builder for when your site goes down and you need a professional page in 30 seconds.

None of these are technically impressive. All of them solve a problem someone actually has today.

The Boring Niche Advantage

McKinsey estimates that generative AI could add $2.6 to $4.4 trillion annually in value across industries. But here's the thing: most of that value will come from boring applications that make existing workflows slightly less painful. Not from chat interfaces. Not from copilots. From small tools that remove friction from tasks people already do.

The market for "AI that sounds smart" is crowded. The market for "tool that solves this one annoying problem" is wide open in thousands of niches.

Build for the Problem

If you're building something right now, try this exercise: describe what you're making without using the words AI, machine learning, model, or agent. If you can't explain the value without those words, you might be building a solution looking for a problem.

The best technology disappears. Stripe doesn't sell "AI-powered payment processing." They sell "accept payments online." The AI is in there somewhere. Nobody cares. It just works.

That's the standard. Build something that just works for someone who has a real problem today. Let the AI be the how, not the what.

What are you building that nobody would describe as "AI" even though it is?

The Tools I Actually Use Daily as a One-Person Operation

The AI and productivity tools that are actually worth my time, and why most of the hype is noise.

I've tried dozens of AI tools. Most of them are forgettable. Not because they're bad, but because they don't stick. They solve a problem I don't have, or they create more friction than they remove. Here's what's actually in my daily rotation, and what I've learned about separating signal from noise.

What's Actually in My Stack

Claude is my primary workhorse. I use it for writing, coding, research, and thinking through problems. It's not perfect, but it's consistent in ways other tools aren't. The context window matters more than I expected.

Cursor has replaced VS Code for most of my development work. AI-assisted coding isn't about replacing thinking. It's about removing the mechanical friction that slows down experimentation. I still write plenty of code manually. Cursor just handles the boring parts faster.

Notion remains my system of record. I've tried Obsidian, Roam, and a dozen alternatives. Notion wins because it's good enough at everything and excellent at nothing. That sounds like criticism, but it's actually the point. I don't want to optimize my note-taking system. I want to take notes.

Process Street (full disclosure: I work here) handles my recurring workflows and SOPs. The value isn't the software itself. It's the discipline of documenting processes that would otherwise live in my head. Most people skip this step. That's a mistake.

Zapier connects everything else. I have maybe 15 active Zaps. Most are simple: new form submission → Slack notification, new blog post → social share, etc. The magic isn't in complexity. It's in not having to remember to do repetitive tasks.

What I've Stopped Using

ChatGPT Plus. I let my subscription lapse. It's not worse than Claude, but I don't need two general-purpose AI assistants. Pick one. Use it well.

Most "AI writing" tools. If a tool promises to "write blog posts for you," it's probably producing generic content that sounds like everyone else. I use AI to think and draft, not to replace my voice.

Complex automation setups. I used to build elaborate multi-step workflows. Now I default to simple. If a Zap has more than three steps, I question whether I'm solving the right problem.

The Pattern

The tools that stick share a few traits:

They remove friction, not add it. If I have to think about using the tool, I won't.
They integrate with my existing workflow. I don't want to rebuild my life around software.
They have clear failure modes. When they break, I know immediately and can fix them.

What I'm Testing Now

I'm experimenting with a few tools that might earn a permanent spot:

Perplexity for research: still deciding if it's better than Claude for this use case.
Replit for quick prototyping: interesting, but not sure it beats local development yet.
Various image generation tools: mostly for blog headers and social content.

The bar for adding a new tool is high. It needs to solve a real problem I have today, not a hypothetical problem I might have someday.

The Real Lesson

The best tool is the one you'll actually use. Not the one with the most features. Not the one that gets the most hype on Twitter. I've seen people spend more time optimizing their productivity stack than doing actual work. Don't be that person. Pick simple tools. Use them consistently. Move on.

What's in your actual daily stack? Not what you think you should use, but what you actually open every day. I'd genuinely love to know.