I’d define AI tourism this way: an implementation without a purpose, where people wander like tourists. We’ve seen this across most of the business world. The last four years have been a fun, if occasionally high-pressure, break from regular commercial activities as workers played around with AI to see what it was for.
Now, AI agents have grown skilled at building not only decks and documents but also product requirements documents (PRDs), websites, and mock applications via what Andrej Karpathy termed “vibe-coding.” In the development and product world, AI-assisted coding has really stuck: 91% of developers say it has improved or “revolutionized” their workflows, according to the developer platform Temporal.
But I think automation is about to take a front seat again, because of token-based pricing and its financial impact. Companies are not spending their tokens effectively. Foundation model providers charge them based on usage. CFOs are finally seeing the bill come due and realizing that they’ve encouraged their companies to hook their products up to a variable cost lever that LLM companies can adjust at any time.
Now, instead of mandating token spend, teams I talk to are implementing caps. Here are my predictions of how the end of AI tourism is going to rattle the industry, and ideas on a path to AI citizenship, if we can call it that, for the companies that found real uses for it.
AI economics make no sense at present
Anthropic’s recent 9x cost increase is just the start. In April, they raised token costs without an announcement, which stopped the music for a lot of companies that, just days prior, were bragging about their receipts. Suddenly, executives are wondering whether agents are more expensive than the people they replaced.
Token costs were always subsidized. Anthropic did not intend to keep losing $8-13 on every $1 people spent on subscriptions forever. This is true of all the foundation model providers, which expected delivery costs to fall faster than they have but are engaged in a land grab for users.
Rising token costs could be fine so long as the AI is delivering, and that is what leaders are now asking. I think every CIO can point to examples of AI rapidly scaling capacity in one area, but few can point to organization-wide transformation, like 50% higher project velocity. That’s because real workflows, with their layers of systems and security, are a lot to navigate. It’s also because those organizations lack proper cloud cost management: nobody even knows what the combined AI spend is, much less its impact on their margins. To make a case for continued spending, companies are going to have to get a lot more focused, keep up-to-date numbers, and have a lot more to show than a few spot automations.
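To make the "combined AI spend" question concrete, here is a minimal sketch of rolling usage records up into spend per team and spend as a share of revenue. Everything in it is an assumption for illustration: the `UsageRecord` fields, the provider names, and the per-million-token prices are hypothetical, not any vendor's actual rates or billing format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One line of token usage, as a finance team might export it (hypothetical shape)."""
    team: str
    provider: str
    input_tokens: int
    output_tokens: int


# Hypothetical prices in dollars per million tokens.
# Real rates differ by provider and model, and they change over time.
PRICES_PER_MILLION = {
    "provider_a": {"input": 3.00, "output": 15.00},
    "provider_b": {"input": 1.00, "output": 4.00},
}


def record_cost(record: UsageRecord) -> float:
    """Dollar cost of a single usage record at the assumed prices."""
    price = PRICES_PER_MILLION[record.provider]
    tokens_cost = (
        record.input_tokens * price["input"]
        + record.output_tokens * price["output"]
    )
    return tokens_cost / 1_000_000


def spend_by_team(records: list[UsageRecord]) -> dict[str, float]:
    """Total spend per team, summed across every provider."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[record.team] += record_cost(record)
    return dict(totals)


def spend_as_share_of_revenue(records: list[UsageRecord], revenue: float) -> float:
    """Combined AI spend divided by revenue for the same period."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return sum(record_cost(r) for r in records) / revenue
```

Even a roll-up this crude answers the two questions most organizations cannot: who is spending, and how much of the margin it consumes.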
AI broke Agile. What will replace it?
Whether AI-generated code creates maintainable, usable software has yet to be seen. What is clear to me is that it’s massively changed people’s habits around how software is made. I am old enough to remember quarterly software releases and how DevOps and better tooling allowed the industry to move to continuous delivery.
Now, LLMs have sped things up so much that they’ve broken Agile. AI has made PRDs ridiculously attainable, even for business users. It used to be that those users would make a request and then go quiet while you spent weeks or months building something to show them. Then they’d say, “That’s bad, that’s bad, change this,” and that was your first feedback. Now, those users can play around on their own and bring you something they’ve created. Creating it themselves also helps them think it through. So it speeds up testing and collapses that cycle.
The PRD process has changed at every company I am talking to. No more two-week cycles. Way faster momentum. Stronger collaboration because everyone’s done the prework and isn’t waiting on someone else to show them what something would look like. That speed comes with downsides, of course—it shifts work right, and creates security and quality assurance bottlenecks. But Agile is officially broken, and we need a new framework.
And perhaps new tooling. Now that the thinking part of business is happening so fast and outside of people’s heads, we have a greater need than ever to document the context.
The future demands portable context layers and consolidated cores
Models enforce the limits of their own understanding, and limited LLM context windows have taught us just how remarkable the human capacity to maintain context is. LLMs still lose context between chats, and even within chats. We’re in a messy middle period of companies figuring out how to imbue these systems with memory and make that context portable and consistent.
Let’s take go-to-market as an example. When a new lead comes in, that form fill is the initial context. At my last employer, we experimented with creating automations that saved that context as a deal “manifest” and updated it as the deal progressed through stages. We found that this architecture did a better job than people at maintaining context across handoffs, especially because salespeople are poor at leaving handoff notes. The context manifest gave customer success teams all they needed and could, theoretically, give agents what they needed as well.
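A deal manifest of this kind can be sketched as a small data structure. This is an illustration under stated assumptions, not the system described above: the class names, field names, and stage labels are all hypothetical. The one design choice worth noting is that a stage change is refused unless a handoff note comes with it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ManifestEntry:
    """One handoff note, recorded when the deal changes stage."""
    stage: str
    author: str
    note: str
    recorded_at: str


@dataclass
class DealManifest:
    """Running context for a deal, seeded from the original form fill."""
    deal_id: str
    initial_context: dict[str, str]
    stage: str = "lead"  # hypothetical starting stage
    history: list[ManifestEntry] = field(default_factory=list)

    def advance(self, new_stage: str, author: str, note: str) -> None:
        """Move the deal to a new stage; an empty handoff note is rejected."""
        if not note.strip():
            raise ValueError("a handoff note is required to change stage")
        self.history.append(
            ManifestEntry(
                stage=new_stage,
                author=author,
                note=note,
                recorded_at=datetime.now(timezone.utc).isoformat(),
            )
        )
        self.stage = new_stage

    def handoff_summary(self) -> str:
        """Plain-text context for the next owner, whether a person or an agent."""
        lines = [f"Deal {self.deal_id} (current stage: {self.stage})"]
        lines += [f"- {key}: {value}" for key, value in self.initial_context.items()]
        lines += [f"- [{e.stage}] {e.author}: {e.note}" for e in self.history]
        return "\n".join(lines)
```

Usage might look like `manifest.advance("qualified", "rep@example.com", "Budget confirmed for Q3")`, after which `manifest.handoff_summary()` carries the form fill and every note forward to the next team.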
Yet when you apply this to real work, it quickly gets complicated. You have no idea how fragmented the context in our world is without us to unite it. If you use two LLMs, that’s a division. If you use an LLM for personal use and for work, that’s a division. If you use both Slack and email, that’s a division. If you have any in-person interaction with co-workers, that’s a division. Add in authentication layers, devices, multiple calendars, nonverbal communication, and company culture, and you realize nothing’s going to have enough context to do your job anytime soon. And what if someone leaves and takes their context with them? Is your context yours, like a digital wallet? Many companies are turning Slack into that knowledge repository. But I think that all this does is make a case for aggressive core system consolidation.
Back when I was at Intercom, I merged the business applications and engineering teams for just this reason. The silo didn’t serve any useful purpose, and I wanted everyone to understand the whole ecosystem, from product to back office.
Your systems of record are now your most vital systems of context. They always were, when they were updated properly, and now that context is even more valuable.
At my last company, Notion, we mapped out all the different departments and roles and concluded that the most important roles were cross-functional, because that’s where the context loss was most acute. Take configure, price, quote (CPQ), which crosses so many systems of record.
If you want to maintain better systems, you need to reduce the number of moving parts and consolidate your core systems. I can’t think of any reason sales and finance shouldn’t be one core workflow. At Notion, we studied those cross-system information flows and had analysts apply automation to those handoffs. We found that the thing holding us back was context. Without enough of it, we could easily create 80% of the cross-system fix, but the other 20% could take 12x as long.
The final frontier: Who QAs the QA agent?
You can use AI to summarize vast quantities of information. But how do you know the summaries are any good and actually capture the salient points? Automation will never be agentic automation if humans are still in the loop. At Notion, we allowed users to “thumbs-up” the result of an automation, and this worked to a point. A single click is about the most you can ask of users and still get enough responses. But what does a thumbs-up actually mean? And does it ultimately correlate with the kind of outcomes the company wants to see? That’s very important to figure out before you let something run and loop forever.
You can, of course, have humans pull random samples of tickets. Or you can create adversarial QA agents that must reach consensus, with any disagreement triggering a human review. You can also explore using formal mathematical proofs.
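The first two options can be sketched in a few lines. This is a minimal illustration, assuming each QA agent can be reduced to a function that returns pass or fail for a summary. In practice each reviewer would wrap a model call; the function names and return labels here are hypothetical.

```python
import random
from typing import Callable, Optional, Sequence

# A reviewer takes a summary and returns True (pass) or False (fail).
# In a real system each reviewer would wrap a separate QA agent.
Reviewer = Callable[[str], bool]


def review(summary: str, reviewers: Sequence[Reviewer]) -> str:
    """Accept or reject only on unanimous verdicts; anything else goes to a human."""
    verdicts = {reviewer(summary) for reviewer in reviewers}
    if verdicts == {True}:
        return "approved"
    if verdicts == {False}:
        return "rejected"
    # Disagreement, or no reviewers configured at all.
    return "human_review"


def sample_for_audit(
    ticket_ids: Sequence[str], rate: float, seed: Optional[int] = None
) -> list[str]:
    """Pick a random subset of tickets for a human spot check."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    if not ticket_ids:
        return []
    # Always audit at least one ticket, however small the batch.
    count = max(1, round(len(ticket_ids) * rate))
    return random.Random(seed).sample(list(ticket_ids), count)
```

The sketch routes every disagreement to a person rather than taking a majority vote. That is a deliberate choice: it spends human attention on the cases where the agents are least trustworthy.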
But even if you figure out all of the above, you still have to figure out who, or what, will quality-assure the agents as well as a human with common sense would.
The CFO is about to demand a return to deterministic workflows
We’re far enough along in the AI journey to know that you don’t need an LLM for every little task. CFOs everywhere are now busily encouraging employees to realize this, and to distinguish which use cases actually add value.
I am for this correction. There’s a great deal to like about workflows that never hallucinate and always return the same predetermined result—especially in financial reporting. As AI tourism slows down, companies are going to sort themselves into those that really were just on vacation, and are now done, and those that want to become AI citizens.
For the latter, I think we’re going to see a new software development methodology emerge around the instant PRD and instant prototype. I think we’re going to see those companies pursue aggressive core system consolidation, and better secure and manage their knowledge and context. And I think we’ll have humans in the loop for a while longer, overseeing LLMs that generate deterministic code that companies can use over and over.
Tourist season has been a lot of fun, but it’s time for most of us to get back to work.




