OpenAI Token Counter: What Tokens Are and How to Count Them Accurately
Every time you send text to an OpenAI model — whether through the API, the ChatGPT interface, or a third-party tool — the text is broken into small pieces called tokens. Tokens are the fundamental unit that large language models read, process, and generate. Understanding how tokens work is essential for managing costs, staying within context-window limits, and writing more effective prompts.
What Are Tokens?
A token is a chunk of text that the model treats as a single unit. Tokens are not words — they are sub-word pieces determined by the model's tokeniser. As a rough rule of thumb:
- 1 token ≈ 4 characters of English text
- 1 token ≈ ¾ of a word
- 100 tokens ≈ 75 words
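These rules of thumb are easy to turn into a quick estimator for when you don't have a tokeniser at hand. The sketch below averages the character-based and word-based heuristics; it is only an approximation, and exact counts should come from tiktoken (Method 2 below):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from the rules of thumb above.

    This is a heuristic only — use tiktoken for exact counts.
    """
    by_chars = len(text) / 4             # 1 token ≈ 4 characters
    by_words = len(text.split()) / 0.75  # 1 token ≈ 3/4 of a word
    return round((by_chars + by_words) / 2)

print(estimate_tokens("SubtitlesYT is awesome"))  # → 5
```

For this example sentence the heuristic happens to match the real count of 5 tokens, but on arbitrary text expect it to be off by 10-20% in either direction.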
Here is a concrete example. The sentence "SubtitlesYT is awesome" might be tokenised as:
["Sub", "titles", "YT", " is", " awesome"] → 5 tokens
Common English words like "the", "is", and "and" are usually a single token. Longer or rarer words get split into multiple tokens. Non-English text, code, and special characters often require more tokens per word than plain English prose.
OpenAI uses different tokeniser encodings for different model families:
- o200k_base — used by GPT-4o, o1, o3-mini, and newer models. It has a 200,000-token vocabulary and is more efficient with non-English text and code.
- cl100k_base — used by GPT-4, GPT-4 Turbo, and GPT-3.5-turbo. It has a 100,000-token vocabulary.
The encoding affects the exact token count for the same input text, so always use the encoding that matches your target model.
Why Token Counting Matters
There are three practical reasons to care about tokens:
1. Context-Window Limits
Every model has a maximum number of tokens it can process in a single request (the "context window"). If your prompt plus the expected response exceeds this limit, the API will return an error. Knowing your token count before sending a request prevents wasted API calls and truncated results.
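A pre-flight check along these lines is a one-liner. The function below is a minimal sketch; the 9,000-token transcript and 1,000-token summary budget are illustrative numbers, and 128,000 is GPT-4o's context window from the table later in this article:

```python
def fits_context(prompt_tokens: int, output_budget: int,
                 context_window: int) -> bool:
    """True if the prompt plus the reserved output budget fits the window."""
    return prompt_tokens + output_budget <= context_window

# A 9,000-token transcript with a 1,000-token summary budget
# against GPT-4o's 128,000-token window:
print(fits_context(9_000, 1_000, 128_000))  # → True
```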
2. API Cost
OpenAI charges per token — separately for input tokens (your prompt) and output tokens (the model's response). A long YouTube transcript can contain thousands of tokens, so estimating the cost before you process it is a smart habit. See our token cost calculator for a step-by-step walkthrough.
3. Prompt Optimisation
Shorter, more focused prompts use fewer tokens, cost less, and often produce better results. Token counting helps you identify bloated prompts and trim them down without losing meaning.
How to Count Tokens (3 Methods)
Method 1: SubtitlesYT Token Counter (Easiest)
The SubtitlesYT Token Counter is a free browser-based tool that counts tokens instantly. No account, no API key, no installation required.
- Go to subtitlesyt.com/token-counter.
- Paste your text into the input area.
- Select the encoding — o200k_base for GPT-4o and newer models, or cl100k_base for GPT-4 / GPT-3.5-turbo.
- The token count appears instantly below the input.
This is the fastest method when you want a quick count without writing any code.
Method 2: tiktoken Python Library
If you are building an application or need to count tokens programmatically,
use OpenAI's official tiktoken library:
```python
import tiktoken

# Choose the encoding for your model
enc = tiktoken.get_encoding("o200k_base")    # GPT-4o, o1, o3-mini
# enc = tiktoken.get_encoding("cl100k_base") # GPT-4, GPT-3.5-turbo

text = "SubtitlesYT is awesome"
tokens = enc.encode(text)
print(f"Token count: {len(tokens)}")
print(f"Tokens: {tokens}")
print(f"Decoded: {[enc.decode([t]) for t in tokens]}")
```
Install tiktoken with `pip install tiktoken`. The library is fast, works offline, and gives exact results.
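One subtlety when counting tokens for chat completions: each message carries a few tokens of framing overhead on top of its text. OpenAI's cookbook documents roughly 3 tokens per message plus 3 tokens priming the reply for cl100k chat models; newer models may differ slightly. The sketch below takes any per-string counter (for example `lambda s: len(enc.encode(s))` with a tiktoken encoding) so it stays independent of a particular encoding:

```python
from typing import Callable

def count_message_tokens(messages: list[dict],
                         count: Callable[[str], int]) -> int:
    """Estimate total tokens for a chat request.

    The 3-per-message and 3-token reply-priming overheads follow
    OpenAI's cookbook guidance for cl100k chat models; treat them
    as an approximation for newer models.
    """
    total = 3  # every reply is primed with a few framing tokens
    for message in messages:
        total += 3  # per-message framing overhead
        for value in message.values():
            total += count(value)
    return total

# Demo with a crude word-based counter standing in for tiktoken:
messages = [{"role": "user", "content": "Summarise this transcript"}]
print(count_message_tokens(messages, lambda s: len(s.split())))  # → 10
```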
Method 3: API Response Headers
When you make an API call to OpenAI, the response includes a `usage` object with the exact token counts:

```json
{
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 128,
    "total_tokens": 170
  }
}
```
This is useful for logging and monitoring but does not help you estimate costs before making the call. Combine it with Method 1 or 2 for pre-flight checks.
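For logging, pulling those counts out of the parsed response is straightforward. A minimal sketch, assuming the response has already been decoded into a Python dict with the `usage` shape shown above:

```python
def token_usage(response: dict) -> tuple[int, int, int]:
    """Extract (input, output, total) token counts from a parsed response."""
    usage = response["usage"]
    return (usage["prompt_tokens"],
            usage["completion_tokens"],
            usage["total_tokens"])

# The example usage object from above:
response = {"usage": {"prompt_tokens": 42, "completion_tokens": 128,
                      "total_tokens": 170}}
print(token_usage(response))  # → (42, 128, 170)
```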
Token Limits by Model
Here are the context-window and output-token limits for the most popular OpenAI models as of 2026:
| Model | Context Window | Max Output Tokens |
|---|---|---|
| GPT-4o | 128,000 | 16,384 |
| GPT-4 Turbo | 128,000 | 4,096 |
| GPT-3.5-turbo | 16,385 | 4,096 |
| o1 | 200,000 | 100,000 |
| o3-mini | 200,000 | 100,000 |
A one-hour YouTube transcript is typically around 9,000 tokens — well within the context window of any current model. But if you are processing multiple videos or adding a long system prompt, the total can add up quickly.
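If you check limits in code, the table above translates directly into a lookup. The model-name keys below are illustrative, and the figures are copied from the table; a small helper then tells you the largest prompt that still leaves room for your output budget:

```python
# Context windows and max output tokens from the table above.
MODEL_LIMITS = {
    "gpt-4o":        {"context": 128_000, "max_output": 16_384},
    "gpt-4-turbo":   {"context": 128_000, "max_output": 4_096},
    "gpt-3.5-turbo": {"context": 16_385,  "max_output": 4_096},
    "o1":            {"context": 200_000, "max_output": 100_000},
    "o3-mini":       {"context": 200_000, "max_output": 100_000},
}

def max_prompt_tokens(model: str, output_budget: int) -> int:
    """Largest prompt that still leaves room for `output_budget` tokens."""
    limits = MODEL_LIMITS[model]
    return limits["context"] - min(output_budget, limits["max_output"])

print(max_prompt_tokens("gpt-3.5-turbo", 1_000))  # → 15385
```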
Token Pricing
OpenAI charges separately for input and output tokens. Here are the approximate prices per 1 million tokens for common models (as of early 2026):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GPT-4o | $2.50 | $10.00 |
| GPT-4 Turbo | $10.00 | $30.00 |
| GPT-3.5-turbo | $0.50 | $1.50 |
| o1 | $15.00 | $60.00 |
| o3-mini | $1.10 | $4.40 |
For a detailed cost breakdown when processing YouTube transcripts specifically, see the Token Cost Calculator for YouTube Transcripts.
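The pricing table also maps cleanly onto a small estimator. The rates below are the approximate early-2026 figures from the table (check current pricing before relying on them), and the model-name keys are illustrative:

```python
# Approximate prices per 1M tokens from the table above (early 2026).
PRICING = {
    "gpt-4o":        (2.50, 10.00),
    "gpt-4-turbo":   (10.00, 30.00),
    "gpt-3.5-turbo": (0.50, 1.50),
    "o1":            (15.00, 60.00),
    "o3-mini":       (1.10, 4.40),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost estimate for one request at the listed rates."""
    input_price, output_price = PRICING[model]
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000

# A 9,000-token transcript summarised into ~500 tokens on GPT-4o:
print(f"${estimate_cost('gpt-4o', 9_000, 500):.4f}")  # → $0.0275
```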
Tips for Reducing Token Usage
If you are hitting context-window limits or want to lower your API bill, try these techniques:
- Remove filler text. YouTube auto-generated subtitles often include "um", "uh", "you know", and repeated words. Stripping these out before sending the transcript to an AI model can cut token count by 10-20%.
- Use concise prompts. Replace verbose instructions with short, direct ones. For example, instead of "I would like you to please provide a summary of the following transcript," write "Summarise this transcript."
- Chunk long transcripts. If a transcript exceeds the context window, split it into sections and process each one separately. Then ask the model to combine the partial summaries.
- Choose the right model. GPT-3.5-turbo and o3-mini are significantly cheaper than GPT-4o for simple tasks like summarisation or extraction. Reserve the more expensive models for tasks that genuinely require stronger reasoning.
- Download TXT format. When using SubtitlesYT, choose TXT format to get clean text without timestamps. SRT and VTT files include timing metadata that adds tokens without adding useful content for most AI tasks.
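The first tip — stripping filler from auto-generated subtitles — is easy to automate. This is a minimal sketch; the filler list is illustrative, and a real cleaner would want a longer list and more careful handling of punctuation:

```python
import re

# Filler words commonly found in auto-generated subtitles (illustrative list).
FILLERS = re.compile(r"\b(?:um|uh|you know)\b\s*", flags=re.IGNORECASE)

def strip_fillers(transcript: str) -> str:
    """Remove filler words and collapse any leftover double spaces."""
    cleaned = FILLERS.sub("", transcript)
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_fillers("So um this video is uh about you know tokens"))
# → "So this video is about tokens"
```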
Ready to count your tokens? Head over to the SubtitlesYT Token Counter and try it out — it takes less than five seconds. And if you are working with YouTube transcripts, start by downloading your subtitles first.