
Building in Public Without Analytics Is Just Vibes

Shipping every day feels productive. It also lies to you. I have been thinking about that a lot lately because the internet makes it very easy to confuse visible output with actual traction. You can publish posts, ship tools, push updates, and watch the streak keep going. From the outside, it looks like momentum. But if you cannot see what people are clicking, reading, bouncing from, or coming back to, a lot of that momentum is just a good-looking blur. That is not a content problem. It is a measurement problem.

Output is not the same as signal

I think a lot of builders quietly do this. We tell ourselves that consistency is the hard part. And to be fair, it is hard. Most people never publish enough to learn anything. But once you are publishing consistently, the bottleneck changes. The question stops being, "Can I ship?" and becomes, "Can I tell what is actually working?"

Without analytics, you usually cannot. You are left with the weakest possible proxies. A post "felt" strong. A launch got a couple of replies. A page seemed clear when you read it back. None of that is useless. But none of it is enough either. It is just intuition wearing a nicer shirt.

Building blind gets expensive fast

This matters even more when you are running a small operation. If I write a blog post, publish a tool, and share an idea on X, I do not just want the satisfaction of having done the work. I want to know where attention actually pooled. Did people spend time on the page? Did they click through to the tool? Did one idea pull better than another? Did traffic come from search, direct, or social? Did anything compound?

That is why tools like Plausible and Google Analytics matter, even if the setup is not the glamorous part. Measurement is not bureaucracy. It is how you stop wasting weeks on stories that only sound true in your own head.

I have learned this the annoying way. When analytics are missing, every decision starts drifting toward taste. You optimize for what feels sharp, what sounds smart, what seems likely to work. Sometimes that overlaps with reality. A lot of the time it does not. And the longer you keep shipping without feedback, the more confident you can become for the wrong reasons. That is a dangerous loop.

The real job is closing the loop

I think this is where a lot of "build in public" advice falls apart. People talk a lot about courage, speed, and volume. Fewer people talk about instrumentation. But the boring part is what turns output into a system. You need a loop:

- publish something
- measure what happened
- learn from the result
- change the next thing

Without that loop, you do not really have a content engine or a product engine. You have a posting habit. And a posting habit is better than silence. I will take that over endless planning every time. But if the goal is to get sharper, not just louder, then the loop matters more than the streak.
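To make the "measure what happened" step concrete, here is a minimal sketch of instrumenting one small thing: recording clicks through to a tool as a Plausible custom event. The event name, the selector, and the data attribute are made up for illustration, and it assumes the standard Plausible script is already installed on the page.

```ts
// Minimal sketch: record clicks through to a tool with a Plausible custom event.
// Assumes the standard Plausible <script> tag is already on the page.
// The event name and the data-tool-link attribute are hypothetical examples.

declare global {
  interface Window {
    plausible?: (event: string, options?: { props?: Record<string, string> }) => void;
  }
}

document.querySelectorAll<HTMLAnchorElement>("a[data-tool-link]").forEach((link) => {
  link.addEventListener("click", () => {
    // No-op if Plausible failed to load; never block the navigation itself.
    window.plausible?.("Tool Click", {
      props: { tool: link.dataset.toolLink ?? "unknown" },
    });
  });
});

export {};
```

That is the whole loop in miniature: one line of instrumentation per question you actually want answered, nothing more.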
That is part of why I keep coming back to simple, legible systems. I wrote recently about why boring systems are a feature. This is the same idea in a different form. I do not need a giant dashboard religion. I just need enough visibility to tell whether the thing I shipped did anything real. That sounds obvious, but a lot of builders skip it because it feels secondary. It is not secondary. It decides whether your effort compounds.

Vibes are fine for drafts, not decisions

I still trust instinct. I still think taste matters. I still think you sometimes have to publish before the data exists. But instinct should help you make the first bet. It should not be the only system you have for deciding what to do next. That is the line I care about more now. Write the post. Ship the page. Launch the tool. But then measure what happened, or be honest that you are still in the guessing phase. Because building in public without analytics is not really building in public. It is just publishing in the dark.

If You Still Have to Double-Check It, It Isn't Automated

A lot of people call something automated when what they really mean is faster. Those are not the same thing. If you still have to double-check every output, every recommendation, or every record before you can trust it, you didn't automate the job. You just changed the shape of the work.

I keep seeing this with AI tools for operators. The demo looks great. The model fills in the form. It summarizes the notes. It flags the likely issue. Everyone claps because the task that used to take ten minutes now takes two. But then the person using it still has to read the whole thing line by line to make sure it didn't hallucinate, skip a step, or confidently say something dumb. At that point, the tool may be useful. But it is not automation. It's assisted drafting.

And to be clear, assisted drafting can still be valuable. I'm not knocking it. Speed matters. Reducing blank-page friction matters. But if a manager still has to babysit every output, the real bottleneck did not disappear. It just moved downstream.

That's why I care a lot more about reliability than flair. When I'm building tools for operators, I want the default experience to feel safe. Clear inputs. Narrow scope. Fewer places for the system to go off the rails. The operator should not need to become the QA layer for the machine every single time.

This is especially true in messy business workflows. Compliance, payroll, food safety, onboarding, audit prep. These are not areas where "mostly right" feels good. If a record is wrong, or a required step gets skipped, someone ends up eating the cost.

That's part of why I think the best AI use cases look boring from the outside. They do one job. They stay inside clear boundaries. They help with judgment only where it actually helps. The more a system depends on a human hovering over it, the less automated it really is.

I've written before about how AI makes bad process fail faster. I think this is the same lesson in a different wrapper. A sloppy process plus a fast model just gives you wrong answers at a higher volume. The bar should be higher than speed. The bar should be trust.

That doesn't mean every tool needs to run fully unattended. Sometimes human review is exactly the right call. But if human review is mandatory on every single run, then be honest about what you built. It's not automation. It's a co-pilot with a nervous supervisor sitting beside it.

I like the way Google's SRE book frames operational reliability. The point is not just to make systems work sometimes. The point is to make them dependable enough that people can build real processes around them. That's the standard I think AI builders should steal. Not "can the model do this once in a demo?" Can someone trust the workflow enough to stop re-checking the whole thing from scratch?

If the answer is no, that's fine. It might still be a useful product. But call it what it is. Useful is good. Reliable is better. And actual automation starts when the operator can finally take their hands off the wheel.
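As a sketch of where that line can sit in code: if the scope is narrow enough, you can validate a model's output mechanically and only pull in a human when the check fails. This is illustrative, with hypothetical field names and thresholds, not any real product's schema.

```ts
// Minimal sketch: a validation gate between a model's output and the record of truth.
// The operator only reviews runs that fail the gate, instead of re-reading every one.
// Field names and ranges below are hypothetical.

interface TempLogEntry {
  unit: string;        // e.g. "walk-in cooler"
  tempF: number;       // recorded temperature in Fahrenheit
  recordedAt: string;  // ISO 8601 timestamp
}

type GateResult =
  | { ok: true; entry: TempLogEntry }
  | { ok: false; reason: string }; // routed to human review

function gate(raw: unknown): GateResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, reason: "output is not an object" };
  }
  const r = raw as Record<string, unknown>;
  if (typeof r.unit !== "string" || r.unit.length === 0) {
    return { ok: false, reason: "missing unit" };
  }
  if (typeof r.tempF !== "number" || r.tempF < -40 || r.tempF > 212) {
    return { ok: false, reason: "temperature out of plausible range" };
  }
  if (typeof r.recordedAt !== "string" || Number.isNaN(Date.parse(r.recordedAt))) {
    return { ok: false, reason: "bad timestamp" };
  }
  return { ok: true, entry: { unit: r.unit, tempF: r.tempF, recordedAt: r.recordedAt } };
}
```

The narrow schema is the point. The smaller the surface, the more the checking can be done by the system instead of by a nervous supervisor.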

Restaurant Owners Don't Care About AI

Restaurant owners do not wake up wanting more AI in their business. They want fewer things to go wrong. They want the fridge temp logged. They want the sanitizer check done. They want to know the opening shift didn't miss something stupid that turns into a failed inspection later.

That's why I keep getting pulled toward compliance tools instead of flashy AI demos. The interesting thing is not the model. It's the consequence. If a tool helps someone avoid a health inspection problem, prevent a scramble, or make a manager's day less chaotic, they'll use it. If it just sounds futuristic, they won't. That's also why I think the boring systems angle matters so much. I wrote about that more in Boring Systems Are a Feature.

Lately I've been building LogChef, a food safety logging tool for restaurant teams. The point is not to impress anyone with AI. The point is to make the work clearer, faster, and harder to screw up.

That framing matters way beyond restaurants. A lot of AI products are still sold like magic tricks. But in real businesses, people usually buy relief. They buy fewer mistakes. They buy fewer dropped handoffs. They buy fewer moments where somebody says, "Wait, who was supposed to do that?"

The teams that win with AI are usually not the ones chasing the flashiest demo. They're the ones using it to remove friction from work that already matters. That's a very different bar. It also lines up with how regulators and operators think. The FDA Food Code is not asking whether your tooling is exciting. It cares whether the process is followed, documented, and repeatable.

Same in a lot of B2B software. People talk about AI like the product. Most of the time it's just the engine inside the product. What the customer actually buys is confidence. They want to feel less exposed.

So when I'm thinking about what to build, I've started using a simple filter:

- Does this help someone avoid a real problem?
- Does it make a recurring job easier to complete correctly?
- Would someone still want this if I removed the word AI from the homepage?

If the answer to that last question is no, I get suspicious fast. I'm more interested in tools that quietly make a workday better than tools that generate a lot of hype for a week. That's usually where the real value hides.

Boring Systems Are a Feature

I like boring systems more every week. That is not because I suddenly hate new tools. I use a lot of them. It is because the more often I ship, the less patience I have for infrastructure that feels clever right up until it breaks.

This week I got a live reminder. My site runs on Astro and deploys on Vercel. The setup is pretty simple. Posts are markdown files. Routes are readable. Builds are visible. When something was off in production, I did not have to guess which hidden layer might be lying to me. I could inspect the files, inspect the route, inspect the deploy, and narrow it down fast. That matters a lot more than people admit.

The problem with magical systems

A lot of modern tooling sells convenience by hiding the machinery. That feels great on a clean demo. You connect a few services, click around a dashboard, and everything looks smooth. Then a real edge case hits. A route does not generate. A cache holds the wrong thing. A deployment succeeds but the output is not what you expected. Now the time you saved upfront gets repaid with interest.

I do not think this is just a developer problem. If you are a solo builder, operator, or founder trying to publish consistently, your infrastructure is part of your workflow. It is not separate from the job. Every opaque layer is another place where a simple content task can turn into an afternoon of weird debugging. That is why I keep gravitating toward systems that are easy to read.

Legibility beats novelty

One thing I like about file-based setups is that they make reality hard to ignore. The post either exists or it does not. The route either builds or it does not. The deploy either picked up the change or it did not. There is less room for the vague category of problems I would describe as platform gaslighting.

I think that is part of why switching to Astro clicked for me so quickly. It feels close to the actual artifact. I write the file. I commit the file. The site builds the file. When something fails, I can usually trace the failure without needing a séance. That is not old-fashioned. That is useful. People love to talk about speed, but legibility is speed. A boring system that breaks in an obvious way is faster than a magical system that breaks in a mysterious way.

Shipping daily changes what you optimize for

If you publish once a quarter, maybe you can tolerate more complexity. If you are trying to ship every day, you start caring about a different set of traits:

- Can I understand what failed?
- Can I fix it without spelunking through three vendor dashboards?
- Can I trust the deploy path?
- Can I make changes without creating a second mystery while solving the first one?

That is a very different filter from, "What has the slickest onboarding?" I think a lot of solo builders should bias harder toward transparent tools for exactly this reason. Not because the newer stuff is bad. Not because abstraction is evil. Just because your real bottleneck is usually not raw capability. It is recovery time.

Boring is not the opposite of good

I think people sometimes hear "boring" as an insult. I mean it as praise. Boring infrastructure is what lets you spend your energy on the part anyone actually cares about: the product, the writing, the distribution, the work itself. If the stack disappears into the background and only demands attention when something concrete needs fixing, that is a win.

The irony is that the simple path often feels more modern in practice. It respects your time. It keeps the feedback loop short. It lets you debug with evidence instead of vibes. That is the kind of system I want more of. Not magical. Not over-designed. Just clear enough that when it breaks, I can read the failure and move. That is a feature.

Deployed Is Not the Same as Launchable

I think a lot of builders confuse "it loads" with "it's ready." I've made that mistake more than once. You deploy the app. The URL returns 200. The core feature works. Maybe you even send the link to a friend and they say, "nice, it's live." But being deployed is a much lower bar than being launchable. A product can be live and still not be ready for real traffic.

The fake sense of completion

The dangerous part is that deployment gives you an emotional hit. You pushed the code. Vercel built it. The preview looks clean. The app opens. So your brain wants to call the job done. But that only proves one thing: the code made it onto the internet. It does not prove that the product is packaged well enough to survive contact with real users.

I've started thinking about launch readiness as a separate checklist:

- does the clean domain resolve correctly?
- does the product work on the actual production URL?
- is analytics installed?
- can I explain what it does in one sentence?
- is there a clear next step for someone who finds it?
- would I feel good sending this to the exact person it's meant for?

If the answer to a few of those is no, then it isn't really launched. It's staged.
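The last three items on that checklist need human judgment, but the first three are scriptable. Here's a minimal sketch of what a readiness probe could look like, runnable on Node 18 or newer; the domain and the analytics markers are placeholders, not real values.

```ts
// Minimal sketch: probe the automatable items on the launch checklist.
// The domain and analytics markers below are hypothetical placeholders.

const PRODUCTION_URL = "https://example.com"; // the clean, branded domain

async function checkLaunchReadiness(): Promise<void> {
  const res = await fetch(PRODUCTION_URL, { redirect: "follow" });

  // 1. The clean domain resolves and the production URL returns a healthy status.
  console.log(`status: ${res.status} ${res.ok ? "(ok)" : "(NOT ok)"}`);

  // 2. The analytics script is actually on the production page,
  //    not just on the preview deployment.
  const html = await res.text();
  const hasAnalytics =
    html.includes("plausible.io/js") || html.includes("googletagmanager.com");
  console.log(`analytics installed: ${hasAnalytics}`);
}

checkLaunchReadiness().catch((err) => {
  // A thrown error here usually means DNS or TLS is broken, which is also a launch blocker.
  console.error("clean domain did not resolve:", err);
});
```

Thirty seconds of script beats discovering the broken branded URL from the first real visitor.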
Infrastructure gaps are launch blockers, not cleanup tasks

This is where a lot of solo operators get sloppy. We treat domain fixes, analytics setup, redirects, and little polish issues like post-launch cleanup. Sometimes they are. But a lot of the time, they're the difference between "a thing exists" and "this can actually start compounding."

Take domains. If your app works on a temporary URL but the clean domain is broken, you don't really have a finished launch surface yet. You have a working artifact plus a distribution problem. The same goes for DNS and routing. Cloudflare's DNS docs are boring, but boring infrastructure problems decide whether a product feels real. Users do not care that the underlying app is technically healthy if the branded URL fails.

And analytics is even more important. I wrote about that more directly in Building in Public Without Analytics Is Just Vibes, but the short version is simple: if people can arrive and use the product, but you can't see what happened, you launched blind. That's not a real operating system. That's hope.

Launchable means you can stand behind it

For me, the real question now is not "did it deploy?" It's "would I confidently push people to it today?" That standard catches a lot. If I still need to caveat the domain, explain that measurement isn't set up, or warn someone that a few pieces are still half-connected, then I'm not describing a launch. I'm describing a work in progress that happens to be online.

That's fine, by the way. A lot of things should be online before they're fully launchable. Preview links are useful. Temporary domains are useful. Internal dogfooding is useful. The mistake is pretending that those states are the same. They aren't. One is proof that the code runs. The other is proof that the product is ready to be taken seriously.

The bar I want to keep now

I'm trying to be stricter about this because the internet is full of half-launched things. Stuff that technically exists, but isn't ready to earn trust. And trust is the whole game. If someone clicks a link I shared, I want the domain to work, the page to load fast, the core action to be obvious, and the measurement layer to be there so I can learn from the visit. Otherwise I'm just generating more surface area.

Deployment matters. Obviously. But launchability is what turns a deployed project into something you can actually build on. If you're building right now, ask yourself a blunt question: is the product launched, or is it just online?

Most AI Agent Problems Are Infrastructure Problems

I think a lot of teams are optimizing the wrong layer. When an AI workflow breaks, the first instinct is usually to swap the model. Try OpenAI. Try Anthropic. Try a new prompt. Try a new framework. Maybe that helps. Usually it doesn't fix the real problem. Most AI agent problems are infrastructure problems.

The model is the visible part, so it gets all the attention. But in practice, the failures usually happen in the seams. A tool call times out. A background job runs too long. A session loses state. A retry happens in the wrong place and duplicates work. One flaky dependency turns a clean demo into a system that quietly dies halfway through the job. That stuff is not sexy, but it's the whole game.

The demo works, the system doesn't

A lot of AI products look good in a five-minute demo because the happy path is easy to stage. You give the model a clear instruction. It calls the right tool. The data comes back clean. The output looks smart. Everyone nods.

Then real usage starts. Now inputs are messier. APIs are slower. Credentials expire. One tool returns malformed JSON. Another gives you a 429. A user asks for a task that takes 20 minutes instead of 20 seconds. Suddenly the question isn't whether the model is smart. The question is whether the system can survive contact with reality. That's why I keep coming back to the same point: reliability matters more than cleverness. If you want a useful AI system, I think you need a few boring things before you need a better prompt.

What actually matters

First, you need retries that aren't stupid. Not infinite retries. Not blind retries. Real retries with limits, backoff, and some awareness of what failed.

Second, you need state. If a workflow has already finished steps one through four, it should not start over just because step five broke. It should know where it is, what already succeeded, and what still needs attention.

Third, you need supervision. Long-running work needs checkpoints. It needs status. It needs a way to surface, "here's what happened, here's what's blocked, here's what I'm doing next" without making the user babysit every move.

Fourth, you need graceful degradation. If the ideal path fails, the system should still have a second move. Maybe it waits. Maybe it falls back. Maybe it asks for help at the right moment instead of crashing into a wall and pretending it completed the task.

None of this is glamorous. That's exactly why it matters.
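To make the first item concrete, here is roughly what "retries that aren't stupid" looks like. A minimal sketch, not any particular framework's API; the class and function names are my own.

```ts
// Minimal sketch: a retry wrapper that is not stupid.
// Bounded attempts, exponential backoff with jitter, and no retry on errors
// that retrying cannot fix. Names here are illustrative, not from a library.

class NonRetryableError extends Error {}

async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // A 400 or a validation failure will not get better on attempt three.
      if (err instanceof NonRetryableError || attempt >= maxAttempts) throw err;
      // Exponential backoff with jitter, so concurrent runs do not stampede.
      const delay = baseDelayMs * 2 ** (attempt - 1) * (0.5 + Math.random());
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Twenty lines, and it already encodes the two decisions most demos skip: when to stop, and what not to retry at all.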
Model switching is not a strategy

I like good models. I'm happy to use better ones whenever they show up. But "we'll just switch models" is not an operating plan. It's a coping mechanism. If your system depends on every tool succeeding instantly, every API staying stable, and every run finishing on the first try, you're not building an agent. You're building a brittle chain of lucky events.

The teams that win here won't just have access to strong models. Everyone will have that. The teams that win will build the execution layer around the model. They'll know how to recover work, route around failures, preserve context, and keep moving when the world gets noisy. That's the real moat. It's the same lesson I wrote about in Nobody Cares About Your AI. The flashy part gets attention. The useful part solves the problem.

The boring work is the product

I don't think the future belongs to the teams with the most impressive demos. I think it belongs to the teams that make AI feel dependable. The ones that make a user trust that the work will finish. The ones that handle failure without turning the user into unpaid QA. The ones that treat recovery, observability, and orchestration like product features, because they are.

According to McKinsey, the economic upside of generative AI is massive. I buy that. But I don't think most of that value comes from demos that look magical on day one. I think it comes from systems that keep working on day one hundred. That's a much less glamorous story. It's also the one worth building.

The Best AI Use Cases Are the Ones Nobody Tweets About

Open any tech feed right now and you'll see the same stuff. AI generating photorealistic images. AI writing entire codebases. AI having philosophical conversations about consciousness. Cool demos. Genuinely impressive technology. And almost none of it is where the real money is being made.

The AI use cases actually generating revenue are the ones nobody screenshots for LinkedIn. Temperature log validators for restaurant chains. Payroll compliance checkers that flag overtime violations before they become lawsuits. Employee classification calculators that tell you whether your new hire is exempt or non-exempt under FLSA guidelines. Boring stuff. The kind of problems that make people's eyes glaze over at dinner parties.

I know this because I'm building these tools right now. LogChef checks whether food temperature logs meet health code requirements. Exemptly walks you through the DOL's duties tests to classify employees correctly. PayShield catches payroll compliance gaps before an auditor does.

Nobody is going to retweet a temperature log validator. Nobody is making TikToks about employee classification flowcharts. But here's the thing: people actually search for these tools. Real humans with real compliance deadlines type "FLSA exempt vs non-exempt calculator" into Google every single day. And when they find a tool that solves their problem in 60 seconds, they remember who built it.

I wrote about this pattern before in Nobody Cares About Your AI. The technology itself doesn't matter to the end user. What matters is whether the problem goes away. A restaurant manager doesn't want "AI-powered food safety monitoring." They want to not fail their next health inspection. A payroll admin doesn't want "machine learning compliance analysis." They want to stop worrying about whether they're breaking federal overtime rules.

The gap between what the AI community talks about and what businesses actually need is enormous. And that gap is where the opportunity sits.

If you're building with AI right now, here's my honest take: stop chasing the use case that'll get you on Hacker News. Start looking at the problems people are too embarrassed to admit they still handle with spreadsheets and sticky notes. The compliance checks done on paper. The classification decisions made by gut feel. The audit prep that takes three people a full week.

Those are your best AI use cases. They just don't make good tweets. What's the most boring problem you've seen AI actually solve well?

Nobody Searches for Your Product Name

Here's something most SaaS companies get wrong about search. They spend months optimizing for branded and category terms. "Best project management tool." "[Product] vs [competitor] comparison." "[Product name] review 2026." They fight over the same ten keywords every competitor is already bidding on. Meanwhile, nobody is Googling their product name. At least not the people who need them most.

The real search happens before the product

Think about what someone actually types when they have a problem. Not a software shopping problem. A real, right-now, my-boss-is-asking-about-this problem. They type things like:

- "restaurant temperature log template"
- "employee exempt vs non-exempt calculator"
- "ISO 9001 audit checklist PDF"
- "how to calculate overtime for salaried employees"

These are the queries that matter. Specific. Boring. High intent. The person searching isn't browsing. They need something right now, and they'll use whatever solves it.

Why boring queries win

I've been building small compliance tools for the past few weeks. Free web utilities that solve one narrow problem each. A temperature logging tool. A payroll compliance checker. An exempt vs non-exempt classifier. None of these are products in the traditional sense. They're single-purpose tools that answer one question really well.

But here's what's interesting. The search volume for these micro-queries is real. According to Ahrefs, "temperature log template" gets searched thousands of times a month. "Exempt vs non-exempt" gets even more. And almost nobody is building dedicated tools to capture that traffic.

Most of the results are blog posts from law firms and HR consultancies. Long articles that kind of answer the question but don't actually give you the thing you need. No calculator. No downloadable template. No interactive tool. Just 2,000 words of background information and a "contact us for a consultation" CTA. That's the gap.

Own the query, not the category

The mistake is thinking you need to own a category term like "compliance automation" or "workflow management platform." Those terms sound important in a pitch deck, but real humans don't search that way. Real humans search for the specific problem sitting on their desk right now.

If you can be the thing that solves that problem, you don't need the person to know your brand first. You just need to be there when they search. The brand relationship builds backward from usefulness. This is basically the playbook that HubSpot used early on with their free tools. Website grader, email signature generator, invoice templates. None of those were the core product. All of them brought in people who eventually needed CRM software.

What this means if you're building

Stop fighting for category keywords. Start asking: what's the smallest, most specific problem my future customer has right now? Build the thing that solves it. Make it free. Make it show up when they search. The boring query is the wedge. The product conversation comes later.

I've been testing this approach with compliance tools, and the early signals are promising. More on that as the data comes in. But the principle holds: nobody is searching for your product name. They're searching for help with the problem you solve. Go own that query instead.

Nobody Cares About Your AI

I've been building compliance tools for the past few weeks. Temperature logs, food safety checklists, inspection prep workflows. Boring stuff by any AI startup's standards. Here's what I've learned: restaurant owners do not care about AI. Not even a little.

They care about passing their next health inspection. They care about not getting fined. They care about making sure the walk-in cooler didn't die overnight and ruin $3,000 worth of product. If you walked into a kitchen and said "I built an AI-powered temperature monitoring solution with real-time anomaly detection," the chef would look at you like you just spoke Klingon. If you said "this tells you when your fridge breaks before your food goes bad," now you're talking.

The Gap Nobody Tweets About

The AI industry has a massive blind spot. We're obsessed with capability and completely uninterested in context. Every week there's a new benchmark, a new model, a new agent framework. Meanwhile, the people who would benefit most from better software are still using paper logs and spreadsheets because nobody bothered to meet them where they are. I'm not saying AI isn't powerful. It is. But power without packaging is just a science project.

What Actually Works

The compliance tools I've been shipping don't mention AI anywhere. Not in the copy, not in the UI, not in the pitch. They're just tools that solve specific problems for specific people. A temperature log generator that creates the exact form a restaurant needs for their daily checks. A payroll compliance calculator that tells you whether your overtime policy matches your state's rules. A 503 error page builder for when your site goes down and you need a professional page in 30 seconds. None of these are technically impressive. All of them solve a problem someone actually has today.

The Boring Niche Advantage

McKinsey estimates that generative AI could add $2.6 to $4.4 trillion annually in value across industries. But here's the thing: most of that value will come from boring applications that make existing workflows slightly less painful. Not from chat interfaces. Not from copilots. From small tools that remove friction from tasks people already do. The market for "AI that sounds smart" is crowded. The market for "tool that solves this one annoying problem" is wide open in thousands of niches.

Build for the Problem

If you're building something right now, try this exercise: describe what you're making without using the words AI, machine learning, model, or agent. If you can't explain the value without those words, you might be building a solution looking for a problem.

The best technology disappears. Stripe doesn't sell "AI-powered payment processing." They sell "accept payments online." The AI is in there somewhere. Nobody cares. It just works. That's the standard. Build something that just works for someone who has a real problem today. Let the AI be the how, not the what.

What are you building that nobody would describe as "AI" even though it is?

The Tools I Actually Use Daily as a One-Person Operation

The AI and productivity tools that are actually worth my time, and why most of the hype is noise.

I've tried dozens of AI tools. Most of them are forgettable. Not because they're bad, but because they don't stick. They solve a problem I don't have, or they create more friction than they remove. Here's what's actually in my daily rotation, and what I've learned about separating signal from noise.

What's Actually in My Stack

Claude is my primary workhorse. I use it for writing, coding, research, and thinking through problems. It's not perfect, but it's consistent in ways other tools aren't. The context window matters more than I expected.

Cursor has replaced VS Code for most of my development work. AI-assisted coding isn't about replacing thinking. It's about removing the mechanical friction that slows down experimentation. I still write plenty of code manually. Cursor just handles the boring parts faster.

Notion remains my system of record. I've tried Obsidian, Roam, and a dozen alternatives. Notion wins because it's good enough at everything and excellent at nothing. That sounds like criticism, but it's actually the point. I don't want to optimize my note-taking system. I want to take notes.

Process Street (full disclosure: I work here) handles my recurring workflows and SOPs. The value isn't the software itself. It's the discipline of documenting processes that would otherwise live in my head. Most people skip this step. That's a mistake.

Zapier connects everything else. I have maybe 15 active Zaps. Most are simple: new form submission → Slack notification, new blog post → social share, etc. The magic isn't in complexity. It's in not having to remember to do repetitive tasks.

What I've Stopped Using

ChatGPT Plus. I let my subscription lapse. It's not worse than Claude, but I don't need two general-purpose AI assistants. Pick one. Use it well.

Most "AI writing" tools. If a tool promises to "write blog posts for you," it's probably producing generic content that sounds like everyone else. I use AI to think and draft, not to replace my voice.

Complex automation setups. I used to build elaborate multi-step workflows. Now I default to simple. If a Zap has more than 3 steps, I question whether I'm solving the right problem.

The Pattern

The tools that stick share a few traits:

- They remove friction, not add it. If I have to think about using the tool, I won't.
- They integrate with my existing workflow. I don't want to rebuild my life around software.
- They have clear failure modes. When they break, I know immediately and can fix them.

What I'm Testing Now

I'm experimenting with a few tools that might earn a permanent spot:

- Perplexity for research. Still deciding if it's better than Claude for this use case.
- Replit for quick prototyping. Interesting, but not sure it beats local development yet.
- Various image generation tools, mostly for blog headers and social content.

The bar for adding a new tool is high. It needs to solve a real problem I have today, not a hypothetical problem I might have someday.

The Real Lesson

The best tool is the one you'll actually use. Not the one with the most features. Not the one that gets the most hype on Twitter. I've seen people spend more time optimizing their productivity stack than doing actual work. Don't be that person. Pick simple tools. Use them consistently. Move on.

What's in your actual daily stack? Not what you think you should use, but what you actually open every day. I'd genuinely love to know.

Most AI Agents Aren't Actually Agents

Everyone's building "AI agents" right now. The timeline is full of them. Companies are raising millions to ship them. The problem? Most of them aren't actually agents. They're chatbots with API access. That's it.

What People Call Agents

Here's the pattern I see everywhere:

- User types a message
- LLM decides which function to call
- Function returns some data
- LLM formats a response
- Done

That's not agency. That's function calling with a conversational wrapper. The LLM picks a tool, the tool runs, the result comes back. If it works, great. If it breaks, the conversation dies. If the user needs three things done in sequence, they're manually prompting through each step. This is useful. It's even impressive sometimes. But it's not an agent.

What Real Agents Need

Real agentic systems operate with autonomy. They handle the messy parts without constant human supervision. That means:

Error recovery. When something breaks (and it will), the agent doesn't just apologize and give up. It retries with backoff. It falls back to alternative approaches. It routes around failures without making the user debug what went wrong.

State management. The agent needs to remember what it's doing across multiple tool calls. Not just "what did the user ask for?" but "what have I tried, what worked, what's left to do, and what's blocking me right now?"

Retry logic. APIs time out. Rate limits hit. Sometimes data isn't ready yet. A real agent knows when to try again, when to wait, and when to give up.

Supervision and checkpointing. For multi-step work, the agent should be able to pause, show you what it's done so far, and resume if something goes sideways. You don't want it to redo 20 steps because step 21 failed.

Context persistence. If the system restarts, the agent should be able to pick up where it left off. Not "sorry, you'll need to start over."

Graceful degradation. When a preferred tool is unavailable, the agent should try another approach. When data is incomplete, it should work with what it has or ask for the missing pieces.

This is infrastructure work. It's not fun. It's not what people demo. But without it, you don't have an agent. You have a chatbot that calls APIs.
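Here is the shape of the checkpointing idea, as a minimal sketch. The in-memory store and the step interface are stand-ins for a real database and real tools, not any framework's API.

```ts
// Minimal sketch: checkpointed multi-step work, so step 21 failing does not
// redo steps 1 through 20. The Map stands in for durable storage; in a real
// system you would persist checkpoints to a database so restarts can resume.

interface Checkpoint {
  completed: string[];              // steps that already succeeded
  results: Record<string, unknown>; // their outputs, available to later steps
}

const store = new Map<string, Checkpoint>(); // swap for durable storage in real use

type Step = {
  name: string;
  run: (prior: Record<string, unknown>) => Promise<unknown>;
};

async function runWithCheckpoints(taskId: string, steps: Step[]): Promise<Checkpoint> {
  // Resume from the last checkpoint if one exists, else start fresh.
  const cp = store.get(taskId) ?? { completed: [], results: {} };
  for (const step of steps) {
    if (cp.completed.includes(step.name)) continue; // already done: skip on resume
    cp.results[step.name] = await step.run(cp.results); // may throw; checkpoint survives
    cp.completed.push(step.name);
    store.set(taskId, { ...cp }); // persist after every successful step
  }
  return cp;
}
```

Run it again with the same taskId after a failure and it picks up at the first incomplete step instead of starting over. That single property is most of the difference between an agent and a chatbot with API access.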
The Infrastructure Problem

The hard part of building agents isn't the LLM. That's the easy part. The hard part is everything around it. You need a task queue that can handle retries. You need a way to checkpoint progress so work doesn't get lost. You need monitoring so you know when an agent is stuck. You need logging so you can debug failures after the fact.

You need to handle rate limits from every API your agent touches. You need to deal with inconsistent error responses. You need to decide what to do when a tool returns malformed data or no data at all. You need a way to supervise long-running workflows. You need to surface status updates without spamming the user. You need to decide when to ask for help and when to keep trying. None of this is LLM work. It's systems engineering.

What I'm Seeing in Practice

I build agents daily. The pattern is always the same. I spend 10% of my time writing prompts and configuring LLM calls. I spend 90% of my time on infrastructure:

- Handling tool failures
- Managing state across multiple turns
- Implementing retry logic
- Building supervision layers
- Writing recovery flows for when things go wrong

The prompt is never the problem. The problem is making the system robust enough to actually finish the job. When I look at "AI agent" demos online, I see polished function calling. I don't see error handling. I don't see state management. I don't see retry logic. That's fine for demos. It's not fine for production.

The Real Opportunity

If most "AI agents" are just chatbots with API access, there's a huge opportunity for anyone willing to build the infrastructure. The companies that win won't be the ones with the best prompts. They'll be the ones with the most resilient execution layers. They'll build systems that:

- Recover from failures without human intervention
- Maintain state across sessions and restarts
- Coordinate multi-step workflows reliably
- Degrade gracefully when things break
- Surface meaningful status without overwhelming users

This is less glamorous than training models or writing clever prompts. But it's what separates working agents from chatbots.

Links

- OpenClaw - agent infrastructure I'm actively building with
- LangGraph documentation - one approach to stateful agent workflows
- Modal - infrastructure for long-running agent workloads

Building agents that actually work means caring more about the infrastructure than the LLM. The wrapper matters more than the model. Most people aren't ready for that conversation yet.

The Best Tools I Use Aren't AI Tools

I spend a lot of time in AI tools. It's part of my job. But the truth is, most of my actual work happens in tools that have nothing to do with AI.

The Boring Stack

Here's what I use every single day: VSCode for writing. Google Sheets for tracking. Git for version control. Terminal for everything else. No AI. No fancy automation. Just basic tools that do one thing well. When I need to draft something, I open VSCode. When I need to track data, I open Sheets. When something breaks, I check Git. AI tools are great for specific tasks. Claude helps me think through problems. ChatGPT speeds up research. But they're supplements, not replacements.

Why Basic Tools Win

They're fast. They're reliable. They don't break when the API goes down. They don't require a monthly subscription or a complex setup. They just work. Most work still needs basic infrastructure. You still need a place to write. You still need a way to organize data. You still need version control. AI can help with some of those tasks. But it can't replace the fundamental infrastructure.

The AI Hype Problem

Everyone wants to talk about AI workflows and AI-first companies. But the reality is that most work still happens in text editors and spreadsheets. If you're building something, don't assume AI will solve everything. Start with the basic tools that work. Add AI where it actually helps. If you're choosing tools, prioritize reliability over novelty. The boring stack exists for a reason.

The Bottom Line

AI is useful. But it's not the foundation. The foundation is still text files, spreadsheets, and version control. Build on that. Everything else is optional.

For more thoughts on building with practical tools, check out my other posts. And if you're looking for workflow automation that actually works, take a look at Process Street.

The Money Is in the Boring Problems

I spent this week building SafeRounds, a free restaurant temperature logging tool. It's not going to get me on Product Hunt's front page. Nobody's going to write a thinkpiece about it. It doesn't use the latest LLM to generate anything. It's just a simple web form where restaurant staff can log fridge temps, freezer temps, and hot hold temps twice a day. That's it.

But here's the thing: restaurant owners actually need this. Health inspectors require it. Failing to maintain proper logs can shut you down. And right now, most restaurants are either using paper clipboards (that get lost) or clunky spreadsheets that don't enforce the rules.

I see so many people building with AI chasing interesting problems. Translation tools. Creative writing assistants. Novel interfaces for information retrieval. All cool. All technically impressive. But when I look at what actually converts, it's the boring stuff. The compliance checklists. The required documentation. The forms you have to fill out to stay legal.

Why? Because these aren't nice-to-haves. They're must-haves. You don't shop around for temperature logs because you're excited about innovation. You need them because the alternative is failing your health inspection. That's a different kind of market. Lower browse time. Higher intent. Immediate utility.

And the boring niches are still wide open. Nobody's racing to build better HACCP documentation tools. There's no VC-funded startup disrupting restaurant compliance logs. It's not sexy enough. Which means if you actually solve the problem well, you win by default.

I'm planning to build a few more of these. Not because they'll get me followers. Because they'll solve real problems for real businesses. And that compounds differently than viral content. The next one is LogChef, a recipe costing calculator for commercial kitchens. Also boring. Also needed.

If you're building something right now, consider this: what's the most boring version of your idea that someone would actually pay for? Start there.

Learn more about building small useful tools on my blog, or check out Process Street's approach to workflow automation.