Your AI Has No Idea What You Are Actually Working On
How Interface Design Drives Overtrust, Wrong Delegation, And Quality Loss
AI has a problem. Its strength is not the issue.
It’s the rectangle you type into.
As a marketer I often test the limits of AI, and I teamed up with Andreea from Products Users Love to explore why AI output is so fragmented and why we are working in thousands of micro-silos every day. A real UX challenge waiting to be solved.
Wilbert: I am gathering a community of marketers and entrepreneurs in Decode: AI here on Substack, to learn and grow durable AI skills. My mission is to bring marketers peace of mind through expertise. I bring 15+ years in international marketing, have worked in AI consultancy, and recently started as a marketer for the largest social-causes fundraiser in the Netherlands.
Andreea: I am an HCI researcher turned UX strategist who spent a decade running hundreds of research studies for tech startups and product teams. My background is in gesture recognition and VR interfaces, which trained me to spot something most people miss: how the design of a tool shapes the behavior of its user, often in ways neither intended. I write about how to see product problems more clearly — and how to make research decisions that actually move things forward, rather than add noise to an already complicated process.
Speed, Quality and Structure: The AI I Wish For
Wilbert: In early 2025, I started as a content marketer at an AI consultancy. Content repurposing was “hot.” Every piece of content should become ten. Every blog should evolve into at least a LinkedIn carousel, a newsletter and a video script. Delivering on a 10x promise.
So I went tool-hunting. Hard.
Which AI tool could get me closest to structuring this effectively?
That took me to a lot of places. And I found a subpar solution in Projects (back then in ChatGPT) and, more recently, in a combination of Claude Skills, Cowork, Memory and Projects.
By structuring my content repurposing, I also got a clearer view of the results:
Repurposed posts that were 80% AI-generated got less traction. These came from asking AI to shorten and auto-generate content based on guidelines, followed by light editing.
Posts where only 20-30% was AI-generated outperformed them.
When I restructured posts myself, giving them extra detail and attention instead of a rough pass, they performed significantly better. I wrote 80% of the text myself, and only asked AI for guidelines to set up strong posts and for some light editing and polishing at the end.
My conclusion: the 10x output promise is a fantasy if you care about quality.
It’s an insight into how AI tools promise quantity but don’t deliver great quality.
But it also made me think about how these tools never seem to understand the right context.
I was either working in a silo (a project) with specific context, or in the general chat window with broader memory.
What if the AI tool understood my full context?
The specific campaign
The specific subbrand
The marketing assets and platforms that go with it
The specific work context I operate in (this Substack, my freelance work and a part-time job)
And not only by guessing it, but by my manual input or correction if needed?
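To make this concrete: below is a rough, purely hypothetical sketch (in Python) of what such an explicit, user-correctable context could look like. Every field name here is invented for illustration; no current tool exposes anything like this.

```python
# Hypothetical "context manifest" a process-aware AI tool could maintain.
# Every key and value below is invented for illustration; this is not a
# real feature of ChatGPT, Claude, or any other tool.
work_context = {
    "campaign": "spring-subsidy-program",       # the specific campaign
    "subbrand": "decode-ai",                    # the specific subbrand
    "assets": {                                 # needed assets per platform
        "linkedin": ["carousel", "short post"],
        "newsletter": ["weekly edition"],
        "video": ["60-second script"],
    },
    "work_contexts": ["substack", "freelance", "part-time job"],
    "source": "user-confirmed",  # set by me, not guessed by the model
}

def correct(manifest: dict, key: str, value: object) -> dict:
    """Manual override: the user, not the model, has the final say."""
    manifest[key] = value
    return manifest

# Example: fixing a wrong guess instead of hoping the model re-infers it.
correct(work_context, "campaign", "autumn-fundraiser")
```

The point of the sketch is not the data structure itself, but that the context would be visible and correctable rather than silently inferred.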
The Human-AI Interaction Challenge
Andreea: Wilbert’s insight regarding the 80/20 use of AI and its link to output quality points to something more fundamental about how we’re using these tools, and more importantly, why.
We’ve all heard the advice “automate what doesn’t need your input,” and it might seem like an easy, reasonable solution to the marketing content challenge. But how do you actually know what that is? Do we have a shared, well-tested understanding of where AI can reliably help us? The honest answer is no, and most people find out the hard way, exactly as this experiment showed.
Over-delegating or under-delegating to AI
Research on human-AI task delegation confirms this. Studies show that people need two types of information to delegate effectively: knowledge of what the AI is actually good at, and an understanding of their own task distribution, meaning which of their tasks require creativity, judgment, or domain expertise, and which are more mechanical. Without both, people tend to either over-delegate or under-delegate, and they usually only figure out which one they did after the results come in.
So while the advice that “AI can do a lot” isn’t entirely wrong, the way AI communicates its own capabilities makes calibration nearly impossible. When was the last time an AI tool flagged that the content it just generated might not sound like you, or that this particular task might be outside its reliable range? It doesn’t happen. Research on LLM confidence calibration shows that these models express high certainty even when they are wrong or when the output quality is low. AI doesn’t promise quality explicitly, but the tone implies it. It always sounds sure of itself.
This creates a trust problem that is easy to miss. Users build their understanding of AI reliability based on the errors they notice, not the ones they don’t. If you catch a factual mistake, you learn to check facts. If you don’t catch it, you trust it. But how about more subtle aspects, like your human edge? That is even harder to catch as it’s not that easy to be aware of everything that makes you, you. This is how trust gets miscalibrated without people realising it.
Real understanding of your full context: absent in AI tools
The context problem is the third layer of this, and probably the most underappreciated one. The question “what if AI understood my full context when I work in a project?” touches on something real and genuinely difficult. AI tools today don’t hold context the way our brains do. When a language model generates a response, it is working from what is in the conversation window at that moment. It doesn’t know how you usually structure your ideas, what your audience expects from you, or the positioning choices you made three months ago. It estimates an answer based on the input it has, and it does so very confidently.
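For readers who want to see the mechanics: a chat completion call is stateless. Here is a minimal sketch using Anthropic’s Python SDK (the model name is illustrative); the model receives only what is in the `messages` list, so anything you don’t pass in simply doesn’t exist for it.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The model sees ONLY what is inside `messages`. Your brand voice, your
# audience, and the positioning choices you made three months ago are
# absent unless you explicitly include them here.
response = client.messages.create(
    model="claude-sonnet-4-5",  # illustrative model name
    max_tokens=500,
    messages=[
        {"role": "user",
         "content": "Repurpose this blog post as a LinkedIn carousel."},
    ],
)
print(response.content[0].text)
```

Tools like Projects and Memory are, in effect, ways of automatically stuffing more of your context into that request. The model itself never “remembers” anything between calls.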
Simulating our brains?
The harder truth is that giving AI full context is not simply a matter of filling in a brief or uploading a style guide. Our brains process context continuously and often subconsciously. We carry background knowledge about our audience, our goals, our past mistakes, things we would never think to write down because they feel obvious. Translating that into something a model can actually use is genuinely difficult, and most tools don’t help users figure out what they are missing. So when the 80% AI posts underperformed, it likely wasn’t one failure. It was the compounding effect of misaligned trust, incomplete context, and no signal from the tool that any of this was happening.
The Chat Window Shapes The Work
Wilbert: How I use AI most days: I open a chat, I go back and forth with questions, I copy the output and save it somewhere.
Fragmentation happens there.
The single chat window is built for quick output. Some questions, one answer. It’s a vending machine for text or action. And because that’s the interface, that’s how people work.
I think that fundamentally shapes how I use AI. And unless you use it critically, it pushes you toward shallow, generic, high-volume output.
Conversation is not the end game
Andreea: There is something philosophically interesting about this shift that I don’t think gets enough attention. We moved from full graphical interfaces, with menus and buttons and structured workflows, to a blank chat window. The logic made sense: conversation is one of the most natural forms of human interaction, so why not make computers work the same way? The promise was that the learning curve would disappear. You would not need to find the right button or understand the right menu structure. You would just say what you want in plain language.
But here is what we are discovering now that tasks have become more complex: conversation is a poor interface for a lot of what we actually need to do.
One-shots? Rare.
Nielsen Norman Group researchers studied 425 interactions with generative AI tools and found that users almost always engage in multistep iteration because the AI cannot deliver exactly what they want on the first try. Once that happens, the conversation stops being easy. Users have to do significant extra work to guide, correct, and refine the output, and the endlessly scrolling chat window provides almost no support for that process. With this complexity in mind, current tools have started to implement solutions for structuring information rather than keeping everything in a chat.
Is Projects The Solution?
Wilbert: Claude Projects is actually a step in the right direction. You can upload context, set instructions, and maintain continuity across conversations. Memory is also getting smarter; it learns preferences over time.
I believe that 90% of users don’t use Projects yet. And even when you do use Projects, or more recently Skills in Claude Cowork, the connective tissue between everything is missing. Memory does an OK job, but not a good enough one.
My big what if: What if your AI tool understands your processes in-depth?
It knows you’re working on a campaign for a subsidy program, that you’ve been developing a brand voice across multiple conversations, and that yesterday’s LinkedIn copy connects to next week’s email sequence.
And what if this were visual, not only under the hood in Claude’s memory?
AI that auto-recognizes work and labels it.
It sees the links between your projects, not because you explicitly told it, but because it observed the patterns. It notices you always rewrite AI drafts in a specific way and starts adapting. It recognizes that your visual style follows certain patterns and suggests imagery that fits.
What if it knew where to help and where to step back? It would suggest: “This section could benefit from an AI-generated structure. But this closing paragraph? That’s where your voice matters most. Write it yourself.”
Not more AI. Smarter and honest AI.
AI that understands the boundary between its strengths and yours.
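As a purely speculative sketch of what that boundary-awareness could look like in code: the categories and rules below are invented to illustrate the idea, not taken from any shipping product.

```python
# Speculative sketch: a tool that flags where AI drafting is likely safe
# and where the human voice matters most. All categories and rules here
# are invented for illustration.

HUMAN_EDGE = {"closing", "personal anecdote", "opinion", "hook"}
MECHANICAL = {"outline", "summary", "formatting", "metadata"}

def suggest_delegation(section_type: str) -> str:
    """Return a hedged suggestion instead of silently drafting everything."""
    if section_type in MECHANICAL:
        return f"'{section_type}': AI can draft a structure here."
    if section_type in HUMAN_EDGE:
        return f"'{section_type}': your voice matters most. Write it yourself."
    return f"'{section_type}': unclear territory. Draft together, review closely."

for section in ["outline", "closing", "case study"]:
    print(suggest_delegation(section))
```

The real version of this would have to learn those categories from observing your rework patterns, which is exactly the hard part.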
The chat window might be the bottleneck here.
The bridge between a powerful model and a marketer leveraging that power is yet to be built.
AI That Asks: Where Can I Help?
Wilbert: This is what I would need: a hybrid between Projects and Memory that actually compounds, where my work in one project informs and improves work in another.
AI that doesn’t just remember facts about me, but understands how I work: my process, not just my preferences.
Process-aware AI, not just prompt-responsive AI.
Because right now, AI tools are asking: “What do you want me to write or do?”
The better question would be: “What are you working on, and where can I actually help?”
Shifting the Human-AI relationship
Andreea: There is something worth sitting with in the problem Wilbert describes. The shift from “what do you want me to write?” to “what are you working on, and where can I actually help?” sounds like a small change in phrasing. It is actually a fundamentally different relationship between a person and a tool.
Reactive AI
Right now, AI is largely reactive. It waits for input, responds to prompts, and starts fresh each time. What Wilbert is describing is something more like a working relationship, where the tool understands not just your preferences but your process. How you move from a rough idea to a finished piece. Where you get stuck. Which parts you always rework and which parts you leave alone. That kind of understanding cannot be captured in a project brief or a style guide. It accumulates over time, through repeated interaction, through observation.
We are certainly moving in that direction, but to be honest, the challenges are not small. Context is fragile, trust is miscalibrated and our own workflows are hard to explain. It is the kind of knowledge that researchers call tacit, things we know how to do but cannot easily explain. Asking AI to learn that is a harder problem than it might appear.
Process-aware AI and a better framework for engagement
So I think we need two things to develop in parallel. Better tools, yes, ones that are process-aware, that compound context across work rather than resetting it, that provide signals about what they are actually reliable for. But also a better framework for how we engage with AI as users. More awareness of where the boundaries are. More intentionality about what we delegate and why. Less assumption that confidence means accuracy.
A better 10x promise
Perhaps the 10x promise should look like this: AI could help you do your best work more consistently, not by replacing your judgment, but by supporting the parts of the process where your judgment is least needed.
P.S. How do you handle fragmented work with AI right now?
If you loved reading this article, please consider subscribing:
Thank you, Andreea, for this amazing collaboration!