What is the cheapest text-to-speech API?

Amazon Polly Standard and Google Cloud Standard are tied as the cheapest TTS APIs at $4 per 1 million characters ($0.004/1K chars). A 10-minute narration script (~7,500 characters) costs roughly 3 cents with either provider.

How much does text-to-speech cost?

TTS API pricing ranges from $4 to $50 per million characters on pay-as-you-go plans. Amazon Polly Standard and Google Standard are the cheapest at $4/1M chars, while Cartesia Sonic is the most expensive at $50/1M chars. Subscription-based ElevenLabs starts at $6/month for 30,000 characters.

How are TTS APIs priced?

Most TTS APIs charge per character on a pay-as-you-go basis, so you pay only for what you use. ElevenLabs is the exception: it sells monthly subscriptions with character-credit allotments (e.g., $6/mo for 30K credits). Some providers, such as Google and Amazon, also offer generous free tiers.

What factors affect TTS API costs?

The main factors are voice quality tier (standard vs. neural/generative), volume of characters processed, and whether the provider uses pay-as-you-go or subscription pricing. For example, Amazon Polly Standard costs $4/1M chars but its Generative tier costs $30/1M chars, a 7.5× premium for higher quality.

Which TTS API has the best voice quality?

ElevenLabs leads quality benchmarks with a 4.5/5 score on crowd-sourced evaluations. OpenAI tts-1-hd and Amazon Polly Generative (both 4.1/5) and Cartesia Sonic (4.0/5) follow closely. Quality is subjective and depends on your use case, voice style, and language requirements.

TTS Cost Calculator

Paste your text and compare text-to-speech pricing, quality, latency, and language support across leading AI voice providers.

Pricing verified 28 June 2026 · From official API docs · 12 pricing tiers compared

We normalize pricing on a common basis so you can compare apples to apples.

Cost Comparison

Provider	Free tier	Pricing Model	Quality (1–5)	Latency (TTFA)	Languages	Cost per 1K chars
Amazon Polly Standard	Yes	Pay-as-you-go	3.0	200ms	29	$0.004
Google Standard	Yes	Pay-as-you-go	3.2	180ms	75	$0.004
Azure Neural Standard	Yes	Pay-as-you-go	3.9	150ms	140	$0.016
OpenAI tts-1	Yes	Pay-as-you-go	3.8	300ms	50	$0.015
Deepgram Aura	Yes	Pay-as-you-go	3.7	120ms	30	$0.030
Google WaveNet	Yes	Pay-as-you-go	3.6	220ms	75	$0.016
Amazon Polly Generative	No	Pay-as-you-go	4.1	250ms	29	$0.030
OpenAI tts-1-hd	Yes	Pay-as-you-go	4.1	350ms	50	$0.030
OpenAI gpt-4o-mini-tts	Yes	Pay-as-you-go	3.7	250ms	50	$0.015
Cartesia Sonic	Yes	Pay-as-you-go	4.0	65ms	40	$0.050
ElevenLabs Starter	Yes	Subscription	4.5	180ms	32	$0.200
ElevenLabs Creator	No	Subscription	4.5	180ms	32	$0.182

Top 3 Cheapest Options

1. Amazon Polly Standard: $0.004 / 1K chars
2. Google Standard: $0.004 / 1K chars
3. OpenAI tts-1 (tied with OpenAI gpt-4o-mini-tts): $0.015 / 1K chars

Compare More Than Price

Quality Scores (1–5): Based on voice naturalness and output quality.
Latency (TTFA, ms): Time to first audio in milliseconds.
Language Support: Total number of supported languages.
Free Tiers: Which providers offer a free tier.

The Definitive Guide to Text-to-Speech (TTS) API Pricing in 2026

Comparing text-to-speech (TTS) API pricing is difficult for developers, product managers, and content creators. With the rapid advancement of generative AI voices, the market has exploded with options, but pricing models differ enough between providers that they remain hard to compare directly.

Some providers, like Amazon Polly and Google Cloud, charge on a straightforward pay-as-you-go basis, billing you per million characters. Others, like ElevenLabs, have popularized the subscription quota model, where you pay a flat monthly fee for a set number of characters, creating a "use it or lose it" dynamic. Still others, like Cartesia and Deepgram, charge per character but compete heavily on latency and real-time generation speed.

This calculator was built to solve a single, frustrating problem: normalizing TTS costs across the entire industry. Paste your text, audiobook chapter, or conversational AI prompt into the tool above, and it converts every provider's pricing tier into a single, comparable dollar amount. No math required, no hidden overage fees, just the raw cost to generate the audio you need.
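As a rough illustration of that normalization, the sketch below counts the characters and words in a block of text, estimates speaking time, and prices the text at a given per-1K-character rate. The function name `estimate` is hypothetical, and the code assumes every character (including spaces and punctuation) is billable; real providers may count SSML tags or whitespace differently, so check each provider's documentation.

```python
def estimate(text: str, usd_per_1k_chars: float, words_per_minute: int = 130) -> dict:
    """Return character count, word count, speaking time, and cost for `text`.

    Assumption: every character, including spaces, is billed.
    """
    chars = len(text)
    words = len(text.split())  # whitespace-separated tokens
    return {
        "characters": chars,
        "words": words,
        "minutes": words / words_per_minute,
        "cost_usd": chars / 1000 * usd_per_1k_chars,
    }


# "Hello world. " is 13 characters and 2 words, repeated 100 times.
sample = "Hello world. " * 100
result = estimate(sample, usd_per_1k_chars=0.015)  # OpenAI tts-1 rate from the table
print(result)
```

The default of 130 words per minute matches the speaking rate used elsewhere in this guide; pass a different `words_per_minute` to model faster or slower narration.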

Pay-As-You-Go vs. Subscription Models

The biggest divide in TTS pricing is between pay-as-you-go (consumption-based) and subscription models.

Pay-As-You-Go: Providers like OpenAI (tts-1 and tts-1-hd), Google Cloud (Standard, WaveNet, Neural2, Studio), Amazon Polly, Azure, Deepgram, and Cartesia all use this model. You are billed exactly for what you use, usually tracked down to the individual character. This is ideal for unpredictable workloads, bursty traffic, or applications where voice generation is a sporadic feature.

Subscriptions: ElevenLabs is the most notable provider using subscription tiers. Their plans start at $6/month for 30,000 characters and scale up to enterprise volumes. A subscription reaches its lowest effective cost per character only when you use the entire quota; any unused characters raise the effective rate, which often means wasted spend for low-volume users.
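To make the "use it or lose it" effect concrete, here is a minimal sketch that computes the effective price per 1K characters of a flat monthly fee at different usage levels, plus the volume at which a pay-as-you-go bill would equal that fee. The helper names are hypothetical, and the inputs are the $6 for 30,000 characters Starter figure quoted in the FAQ above and the $0.030 per 1K rate from the comparison table.

```python
def effective_usd_per_1k(monthly_fee_usd: float, chars_used: int) -> float:
    """Effective price per 1K characters when a flat fee covers `chars_used`."""
    if chars_used <= 0:
        raise ValueError("chars_used must be positive")
    return monthly_fee_usd / chars_used * 1000


def break_even_chars(monthly_fee_usd: float, payg_usd_per_1k: float) -> float:
    """Monthly characters at which a pay-as-you-go bill equals the flat fee."""
    return monthly_fee_usd / payg_usd_per_1k * 1000


STARTER_FEE_USD = 6.00     # figure quoted in this guide
STARTER_QUOTA = 30_000     # characters per month, as quoted in this guide

full_use = effective_usd_per_1k(STARTER_FEE_USD, STARTER_QUOTA)   # whole quota used
third_use = effective_usd_per_1k(STARTER_FEE_USD, 10_000)         # one third used
same_spend_payg = break_even_chars(STARTER_FEE_USD, 0.030)        # at $0.030 per 1K

print(f"full quota: ${full_use:.3f}/1K, one third: ${third_use:.3f}/1K")
print(f"${STARTER_FEE_USD:.2f} buys about {same_spend_payg:,.0f} chars at $0.030/1K")
```

Using only a third of the quota triples the effective rate, which is the wasted-spend risk described above. Voice quality is not part of this calculation, so it compares cost only.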

How Much Does a 10-Minute Video Cost?

To understand the real-world impact of these pricing disparities, consider the cost of generating voiceover for a standard 10-minute video.

At a standard speaking rate of 130 words per minute, a 10-minute narration contains roughly 1,300 words, translating to approximately 7,500 characters.

The Cheapest: On legacy systems like Amazon Polly Standard or Google Cloud Standard, generating this audio costs about $0.03 (three cents).
The Middle Ground: Using OpenAI's highly popular tts-1 model, the same text costs roughly $0.11.
The Premium: With Cartesia Sonic, chosen for ultra-low latency, the same text costs $0.38. On an ElevenLabs Creator plan, it uses 7.5% of your $22 monthly quota, or about $1.65.

The spread between the cheapest option ($0.03) and the most expensive (about $1.65 of ElevenLabs quota) for the exact same text is over 50×.
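The arithmetic behind these figures can be reproduced with a short script. This is a sketch using the per-1K rates from the comparison table and the guide's own 7,500-character approximation; treating the 7.5% share of the $22 ElevenLabs Creator quota as a dollar cost is an assumption made here purely for comparison.

```python
WORDS_PER_MINUTE = 130
MINUTES = 10
CHARS = 7_500  # the guide's approximation for a 10-minute script

# Pay-as-you-go rates in USD per 1K characters, taken from the comparison table.
RATES_PER_1K = {
    "Amazon Polly Standard": 0.004,
    "Google Standard": 0.004,
    "OpenAI tts-1": 0.015,
    "Cartesia Sonic": 0.050,
}

words = WORDS_PER_MINUTE * MINUTES
costs = {name: CHARS / 1000 * rate for name, rate in RATES_PER_1K.items()}

# Assumption: value the subscription by the share of the monthly fee consumed.
CREATOR_FEE_USD = 22.00
CREATOR_QUOTA_SHARE = 0.075  # 7.5% of the monthly quota, as stated in the guide
costs["ElevenLabs Creator (quota share)"] = CREATOR_FEE_USD * CREATOR_QUOTA_SHARE

for name, cost in costs.items():
    print(f"{name}: ${cost:.4f}")

spread = max(costs.values()) / min(costs.values())
print(f"{words} words, spread between cheapest and priciest: {spread:.0f}x")
```

Rounded to the cent, the pay-as-you-go results match the figures quoted above, and the ratio between the ElevenLabs quota share and the cheapest option is what produces the "over 50×" spread.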

Evaluating Quality and Latency

Cost is only one axis of evaluation. When selecting a TTS API, developers must balance pricing against voice quality (naturalness, expressiveness, emotion) and latency (time-to-first-audio).

Quality Benchmarks: In crowd-sourced blind A/B testing (such as the Artificial Analysis Speech Arena), ElevenLabs consistently holds the top position for conversational naturalness and voice cloning accuracy. Cartesia Sonic and Azure Neural also score exceptionally high. Legacy models like Google Standard offer the lowest cost but sound noticeably robotic compared to modern generative approaches.

Latency Constraints: For asynchronous tasks like generating audiobook chapters or podcast voiceovers, latency is irrelevant; quality and cost are the primary drivers. However, for real-time conversational AI agents, latency is the most critical metric. Cartesia Sonic was engineered specifically for this use case, boasting sub-100ms time-to-first-audio (TTFA). Deepgram Aura also competes heavily in the low-latency space.
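One way to combine these three axes is to filter providers by minimum quality and maximum latency, then pick the cheapest survivor. The sketch below does that with the figures from the comparison table; the `Provider` class and `cheapest` function are hypothetical names introduced for illustration, and the thresholds are examples rather than recommendations.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    quality: float     # 1-5 score from the comparison table
    latency_ms: int    # time to first audio (TTFA)
    usd_per_1k: float  # cost per 1K characters


# Rows copied from the comparison table, in table order.
PROVIDERS = [
    Provider("Amazon Polly Standard", 3.0, 200, 0.004),
    Provider("Google Standard", 3.2, 180, 0.004),
    Provider("Azure Neural Standard", 3.9, 150, 0.016),
    Provider("OpenAI tts-1", 3.8, 300, 0.015),
    Provider("Deepgram Aura", 3.7, 120, 0.030),
    Provider("Google WaveNet", 3.6, 220, 0.016),
    Provider("Amazon Polly Generative", 4.1, 250, 0.030),
    Provider("OpenAI tts-1-hd", 4.1, 350, 0.030),
    Provider("OpenAI gpt-4o-mini-tts", 3.7, 250, 0.015),
    Provider("Cartesia Sonic", 4.0, 65, 0.050),
    Provider("ElevenLabs Starter", 4.5, 180, 0.200),
    Provider("ElevenLabs Creator", 4.5, 180, 0.182),
]


def cheapest(
    providers: Iterable[Provider],
    min_quality: float = 0.0,
    max_latency_ms: Optional[int] = None,
) -> Optional[Provider]:
    """Return the lowest-cost provider meeting both constraints, or None.

    Ties on cost are broken by list order (the first match wins).
    """
    eligible = [
        p for p in providers
        if p.quality >= min_quality
        and (max_latency_ms is None or p.latency_ms <= max_latency_ms)
    ]
    return min(eligible, key=lambda p: p.usd_per_1k, default=None)


realtime = cheapest(PROVIDERS, min_quality=3.5, max_latency_ms=150)
sub_100ms = cheapest(PROVIDERS, min_quality=3.5, max_latency_ms=100)
batch = cheapest(PROVIDERS, min_quality=4.0)  # latency ignored for offline jobs

for label, choice in [("realtime", realtime), ("sub-100ms", sub_100ms), ("batch", batch)]:
    print(label, "->", choice.name if choice else "no match")
```

A hard filter followed by a cost minimum is the simplest possible selection rule. A weighted score across quality, latency, and cost would be an alternative if no single constraint is strict.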

Free Tiers and Credits: How to Start for Zero Cost

Before committing to a paid plan, almost every major provider offers a generous free tier for development and testing:

Microsoft Azure: Offers 500,000 free characters per month on their Neural Standard voices, with the allowance resetting every month.
Google Cloud: Provides 1,000,000 characters per month free on WaveNet and Neural2 voices, and up to 4,000,000 characters free on legacy Standard voices.
Amazon Web Services (AWS): The Polly service includes 5,000,000 Standard characters or 1,000,000 Neural characters per month free, but only for your first 12 months.
OpenAI: Provides $5 in free API credits for new accounts, usable across all TTS models.
Deepgram: Provides $200 in free signup credits, which can be used across their speech-to-text and text-to-speech (Aura) APIs.
Cartesia: Offers $5 in free credits for new accounts to test Sonic TTS.
ElevenLabs: The free tier is limited to 10,000 characters per month (roughly 13 minutes of audio at 130 words per minute) and requires attribution.
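The effect of these allowances on a monthly bill can be estimated with a small calculation. The sketch below assumes the free allotment is applied first and that every character beyond it is billed at the provider's listed per-1K rate; it also converts a signup credit into the number of characters it would cover. Both function names are hypothetical, and actual billing rules (for example, credits shared across several APIs) may differ.

```python
def billable_cost(chars_this_month: int, free_chars: int, usd_per_1k: float) -> float:
    """Cost after subtracting a monthly free allotment (assumed to apply first)."""
    billable = max(0, chars_this_month - free_chars)
    return billable / 1000 * usd_per_1k


def chars_covered_by_credit(credit_usd: float, usd_per_1k: float) -> int:
    """Whole characters a dollar credit pays for at a given per-1K rate."""
    return int(credit_usd / usd_per_1k * 1000)


# Google Standard: 4,000,000 free characters per month, $0.004 per 1K after that.
google_bill = billable_cost(5_000_000, 4_000_000, 0.004)

# Azure Neural Standard: 500,000 free characters per month, $0.016 per 1K after that.
azure_bill = billable_cost(2_000_000, 500_000, 0.016)

# OpenAI: $5 in credits spent entirely on tts-1 at $0.015 per 1K.
openai_chars = chars_covered_by_credit(5.00, 0.015)

print(f"Google Standard, 5M chars: ${google_bill:.2f}")
print(f"Azure Neural Standard, 2M chars: ${azure_bill:.2f}")
print(f"OpenAI $5 credit covers about {openai_chars:,} tts-1 characters")
```

Usage that stays under the allotment yields a cost of zero, which is why the free tiers above can cover development and testing entirely.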