Let me tell you what nobody explains when you sign up for Claude Pro.

You get your shiny $20/month subscription. You start using it for everything — drafting content, building strategy, brainstorming offers, writing emails. Life is good.

Then three hours into your workday: "You've reached your usage limit."

And you're sitting there like… I paid for this? I barely got through my morning?

It was 12pm ET when I took this screenshot…

If you switched to Claude from ChatGPT (or you're running both side by side) this moment hits different. ChatGPT presents its limits as messages. "You get 160 messages every 3 hours." It's not actually that simple under the hood (limits shift by model, traffic, and time of day), but it feels predictable. You have a rough sense of how much runway you've got.

Claude's usage limits work differently — and that's where the friction starts.

Claude Pro gives you approximately 45 messages per 5-hour window, but that number is a moving target. The actual limit is based on tokens, the small units of text Claude processes behind the scenes. A token is usually a bit shorter than a word: as a rule of thumb, roughly three-quarters of an English word. And your token cost per message shifts based on what model you're using, how long your conversation has gotten, what files you've attached, and how much you're asking Claude to write back.

Which means you can send 10 messages one day and be totally fine, then send 10 messages the next day and hit the wall — and have no idea what changed.
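To make that concrete, here is a minimal sketch of how one message's cost adds up from the four factors above. The `Message` class and every number in it are hypothetical, picked only to show how far apart two "single messages" can land; this is not how Anthropic actually meters usage.

```python
from dataclasses import dataclass


@dataclass
class Message:
    history_tokens: int     # earlier turns that get re-read with this message
    prompt_tokens: int      # what you typed
    attachment_tokens: int  # files and images attached to this message
    reply_tokens: int       # what Claude writes back

    def cost(self) -> int:
        """Total tokens this one message consumes under this simple model."""
        return (self.history_tokens + self.prompt_tokens
                + self.attachment_tokens + self.reply_tokens)


# Hypothetical numbers, for illustration only.
quick_question = Message(history_tokens=0, prompt_tokens=50,
                         attachment_tokens=0, reply_tokens=300)
late_in_long_chat = Message(history_tokens=12_000, prompt_tokens=50,
                            attachment_tokens=2_000, reply_tokens=300)

print(quick_question.cost())     # 350
print(late_in_long_chat.cost())  # 14350
```

Same 50-token prompt both times. The difference is everything riding along with it.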

Once you understand how Claude token limits actually work — like, the actual mechanics — you can stretch that same $20 plan dramatically further. This is the guide I wish someone had handed me my first week, before I burned through my limit like it was a free trial.

What's Actually Eating Your Tokens

So what's burning through your Claude usage limits so fast? It's not just the words you type. It's everything:

  • Every word you type

  • Every word Claude writes back

  • Every file you upload

  • Every image you attach

  • And your entire conversation history gets re-read with every single message

That last one is the biggest difference from how most people expect AI tools to work.

The Snowball Effect (Why Message 20 Costs 10x More Than Message 1)

This is the single most important concept for understanding Claude token limits, and almost nobody talks about it.

Claude doesn't "remember" your conversation. It re-reads the entire thing from the top every time you send a new message. Message 1? Claude reads your message and responds. Message 10? Claude re-reads messages 1 through 9, plus all of its responses, plus your new message. Message 20? It's re-reading the equivalent of a short essay just to process your latest "can you tweak that second paragraph?"

Token usage doesn't grow in a straight line. It snowballs. The longer the conversation, the more expensive each new message becomes.

This means a 30-message conversation isn't 30x the cost of one message. It's dramatically more. And that's before we even talk about what else you're feeding it.
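Here's a rough sketch of the snowball in numbers. It assumes every turn (yours or Claude's) is the same size, 200 tokens, which is a made-up figure, and it ignores any caching or discounting the service may apply. The function names are mine, for illustration only.

```python
def tokens_read_for_message(n: int, tokens_per_turn: int = 200) -> int:
    """Input tokens re-read when you send message n (1-indexed).

    Before message n there are n - 1 of your messages and n - 1 replies,
    plus the new message itself: 2 * (n - 1) + 1 turns in total.
    """
    return (2 * (n - 1) + 1) * tokens_per_turn


def total_tokens_read(messages: int, tokens_per_turn: int = 200) -> int:
    """Input tokens re-read across a whole conversation."""
    return sum(tokens_read_for_message(n, tokens_per_turn)
               for n in range(1, messages + 1))


print(tokens_read_for_message(1))   # 200
print(tokens_read_for_message(20))  # 7800
print(total_tokens_read(30))        # 180000

# Thirty one-off messages in thirty fresh chats, for comparison.
print(30 * tokens_read_for_message(1))  # 6000
```

Under these assumptions, one 30-message thread re-reads 30 times as much text as the same 30 messages sent in fresh chats.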

What to do instead:

  • Start new conversations more often. If you've finished a task or you're pivoting topics, start fresh. Don't keep stacking onto the same thread.

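  • If you work in Claude Code, use the /compact command. This summarizes your conversation history and compresses it, so you can keep working without carrying the full weight of every previous message.

  • Also in Claude Code, use the /clear command when switching tasks entirely. It wipes the conversation history so your next message starts from a clean context. In the regular chat app, starting a new conversation does the same job.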
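The payoff from starting fresh or compacting can be sketched with the same kind of back-of-envelope model. It counts "turns re-read" rather than tokens, assumes every turn is the same size, and treats a compacted summary as a fixed number of turns carried along afterwards. The function names and the summary size of 2 turns are assumptions for illustration.

```python
def turns_read(messages: int) -> int:
    """Turns re-read over a conversation with this many user messages."""
    return sum(2 * (n - 1) + 1 for n in range(1, messages + 1))


def turns_read_with_reset(first: int, second: int, summary_turns: int = 0) -> int:
    """Split one conversation into two segments.

    summary_turns is the size of the summary carried into every message
    of the second segment (0 models a full reset or a brand-new chat).
    """
    return turns_read(first) + turns_read(second) + second * summary_turns


print(turns_read(30))                                   # 900  one long thread
print(3 * turns_read(10))                               # 300  three fresh threads
print(turns_read_with_reset(15, 15, summary_turns=2))   # 480  compact halfway
print(turns_read_with_reset(15, 15))                    # 450  clear halfway
```

Same 30 messages in every case. Splitting the thread is what changes the bill.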

Extended Thinking Is Eating Your Tokens (And You Probably Don't Know It's On)

Claude has a feature called extended thinking — basically a deep reasoning mode where it works through complex, multi-step problems before responding. It's brilliant for hard tasks.

The catch is that if the toggle is on, it applies to your casual requests too. So when you ask Claude something simple, like "hey, brainstorm 5 email subject lines," it might still be activating deep reasoning in the background. That thinking process uses tokens. For a quick brainstorm or a simple edit, you're paying for cognitive overhead you don't need.

What to do instead:

Toggle extended thinking off for simple tasks. Save it for the stuff that actually requires multi-step reasoning — like building a complex strategy, debugging something technical, or analyzing a dense document. For everyday drafting, editing, and brainstorming? You don't need it, and turning it off stretches your Claude token limits noticeably further.

Your Attachments Are More Expensive Than You Think

Every file you upload to Claude gets converted into tokens. But not all files are created equal, and this is where most people accidentally burn through their usage without realizing it.

Images (PNG, JPG, screenshots):

Here's the formula: (width x height) / 750 = tokens. So a 1500x1000 screenshot is roughly 2,000 tokens. A full-resolution photo from your phone? Potentially way more. If you're uploading screenshots for Claude to review, resizing them smaller before you upload directly reduces the cost.
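If you want to check a screenshot before uploading it, the formula is easy to script. This is a small sketch of the estimate quoted above; `image_tokens` and `resized` are helper names I made up, and the result is an approximation, not an exact bill.

```python
def image_tokens(width: int, height: int) -> int:
    """Estimate image tokens as (width x height) / 750."""
    return round(width * height / 750)


def resized(width: int, height: int, max_long_edge: int) -> tuple[int, int]:
    """Scale an image down so its longer side is at most max_long_edge."""
    scale = min(1.0, max_long_edge / max(width, height))
    return round(width * scale), round(height * scale)


print(image_tokens(1500, 1000))  # 2000

small = resized(1500, 1000, max_long_edge=750)
print(small)                     # (750, 500)
print(image_tokens(*small))      # 500
```

Halving both dimensions cuts the estimate to a quarter, because the cost tracks pixel area and not width alone.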

PDFs:

This one is sneaky. Claude converts each PDF page into an image internally, and each page costs 1,500 to 3,000 tokens depending on how dense the content is. A 20-page PDF could eat 30,000 to 60,000 tokens in a single upload. File size on disk is a poor guide here: what counts is page count and density, so a dense, text-heavy 5MB PDF can generate more tokens than a 20MB PDF full of images.
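The per-page range turns into a quick estimate you can run before uploading. A minimal sketch, using the 1,500 to 3,000 tokens-per-page figures above as defaults; `pdf_token_range` is a hypothetical helper name.

```python
def pdf_token_range(pages: int,
                    low_per_page: int = 1_500,
                    high_per_page: int = 3_000) -> tuple[int, int]:
    """Return a (low, high) token estimate for a PDF with this many pages."""
    return pages * low_per_page, pages * high_per_page


print(pdf_token_range(20))  # (30000, 60000)
print(pdf_token_range(5))   # (7500, 15000)
```

Notice that file size never enters the calculation. Only the page count and the per-page density do.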

Text files (MD, TXT, CSV):

These are the most efficient option. Raw text gets processed as-is — no image conversion overhead. If you have the choice between uploading a PDF or pasting the same content as plain text, the text version is dramatically cheaper on your Claude usage limits.

What to do instead:

  • Resize screenshots before uploading. You don't need 4K resolution for Claude to read your Canva mockup.

  • If you can copy-paste the text from a PDF, do that instead of uploading the file.

  • Be intentional about what you attach. Ask yourself: does Claude actually need to see this, or can I describe it in a sentence?

You're Probably Using the Wrong Model

Claude Pro gives you access to multiple models, and which one you use directly impacts how fast you hit your Claude token limits.

Here's the breakdown:

Haiku — the quick-and-light model. Best for simple tasks like reformatting text, answering quick questions, basic summaries, and anything pattern-based. Think of it as the "text message" model. Fast, cheap on tokens, gets the job done for the easy stuff.

Sonnet — the workhorse. This handles roughly 95% of what solopreneurs actually need. Drafting content, editing, research, brainstorming, strategy sessions, analyzing documents. If you're defaulting to anything, default to this.

Opus — the deep thinker. This model uses approximately 5x more resources than Sonnet. It's built for genuinely complex reasoning — multi-constraint problems, long nuanced analysis, tasks where precision across many variables matters. But for writing a LinkedIn post or reviewing your sales page copy? Overkill.

Here's a stat that might change your behavior: a solopreneur defaulting to Opus for everything burns through the equivalent of ~$450/month in token usage. The same person using Sonnet for most tasks, Haiku for the quick stuff, and Opus only when they truly need deep reasoning? Roughly $120/month equivalent. Same quality where it counts, 3-4x more mileage from your plan.
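You can sanity-check that kind of ratio with a weighted average. In the sketch below, only the Opus weight (about 5x Sonnet) comes from this article. The Haiku weight and the usage shares are placeholder assumptions, so treat the output as an illustration of the arithmetic and not as a measurement.

```python
# Relative cost per unit of work, with Sonnet as the baseline.
RELATIVE_COST = {
    "opus": 5.0,    # from the article: roughly 5x Sonnet
    "sonnet": 1.0,  # baseline
    "haiku": 0.3,   # placeholder assumption, not a published figure
}


def blended_cost(mix: dict[str, float]) -> float:
    """Average relative cost for a mix of models; shares must sum to 1."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("model shares must sum to 1")
    return sum(share * RELATIVE_COST[model] for model, share in mix.items())


all_opus = blended_cost({"opus": 1.0})
# Hypothetical usage shares for someone matching the model to the job.
matched = blended_cost({"sonnet": 0.8, "haiku": 0.1, "opus": 0.1})

print(f"{all_opus / matched:.2f}")  # 3.76
print(450 / 120)                    # 3.75, the ratio of the dollar figures above
```

With these placeholder shares the ratio lands close to the 3-4x quoted above, but it moves quickly if you change the Opus share, so plug in your own mix.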

What to do instead:

Before you start a conversation, ask yourself: does this task need deep thinking, or does it need fast execution? Match the model to the job. Sonnet for 80% of your work. Haiku for quick tasks. Opus only when the problem genuinely demands it.

The "Mega-Prompt" Method (a.k.a. Stop Drip-Feeding Context)

Most people use Claude like a text conversation. They send a short message, get a response, send a follow-up, get a response, clarify, get another response. Back and forth, ten messages deep, before they get what they actually wanted.

Every one of those messages compounds the snowball effect. And every time you add context — "oh, I forgot to mention, my audience is…" or "actually, can you also consider…" — you're spending tokens on course-correction that could've been free if you'd front-loaded it.

The mega-prompt method flips this. Instead of drip-feeding, you draft your full request in one shot:

  • What you want

  • Who it's for

  • The constraints (length, tone, format)

  • Any reference material or context

  • All your questions, batched together
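If you reuse this structure a lot, a small template helper keeps you from forgetting a piece. This is just one way to lay it out: `build_mega_prompt` and its field labels are my own illustration, not a format Claude requires.

```python
def build_mega_prompt(goal: str, audience: str, constraints: list[str],
                      context: str, questions: list[str]) -> str:
    """Assemble one front-loaded request from the five pieces listed above."""
    lines = [
        f"What I want: {goal}",
        f"Who it's for: {audience}",
        "Constraints:",
        *[f"- {c}" for c in constraints],
        "Context:",
        context,
        "Questions:",
        *[f"{i}. {q}" for i, q in enumerate(questions, start=1)],
    ]
    return "\n".join(lines)


prompt = build_mega_prompt(
    goal="A welcome email for new newsletter subscribers",
    audience="Solopreneurs who just signed up",
    constraints=["Under 200 words", "Warm, direct tone", "Plain text"],
    context="The newsletter covers practical AI workflows.",
    questions=["What subject line fits best?", "Where should the CTA go?"],
)
print(prompt)
```

Paste the result as your first message, and the back-and-forth you would have spent on clarifications never happens.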

Bonus: When Claude gives you follow-up options to click on (like suggested questions), clicking one sends a much shorter message than typing out a paragraph-long follow-up. The conversation history still gets re-read either way, so the saving is in the new text you add, but those little clickable options are a token-efficient way to guide a conversation.

Subscribe to keep reading

This content is free, but you must be subscribed to Strategies that Stack® by Victoria Boyd to continue reading.
