Connecting AI Agents to Text-to-Speech with MCP
What Is MCP?
The Model Context Protocol (MCP) is an open standard for connecting AI models to external tools. Instead of building custom integrations, your AI agent discovers and calls tools through a standardized interface.
Think of it as USB for AI — plug in a tool server, and your agent can use it immediately.
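To make "discovers and calls" concrete, here is a minimal sketch of the two JSON-RPC 2.0 requests an MCP client sends: `tools/list` to discover tools and `tools/call` to invoke one. The method names come from the MCP specification; the helper names are illustrative, and in practice an MCP client library builds these messages for you.

```typescript
// Minimal shape of a JSON-RPC 2.0 request, the wire format MCP uses.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

// Discovery: ask the server which tools it offers.
function listToolsRequest(): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/list" };
}

// Invocation: call one tool by name with a JSON arguments object.
function callToolRequest(name: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const discover = listToolsRequest();
const invoke = callToolRequest("generate_speech", {
  text: "Hello from MCP",
  voice_id: "google:en-US-Chirp3HD-Kore",
  output_format: "mp3",
});

console.log(JSON.stringify(discover));
console.log(JSON.stringify(invoke));
```

The agent never hard-codes an integration: it reads the tool list, picks a tool by name, and sends arguments that match the schema the server advertised.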
Why TTS + MCP?
AI agents are great at generating text. But sometimes you need audio output — voice messages, narrated summaries, spoken notifications. MCP lets your agent call TTS tools directly:
- Generate speech from any text
- Choose voices and formats programmatically
- Manage audio files (list, organize, share)
- Check usage and costs
Available Tools
AI TTS Microservice exposes 34 MCP tools covering the full platform. The table below lists a selection by category:
| Category | Tools |
|---|---|
| Generation | generate_speech, estimate_cost |
| Voices | search_voices, get_voice_details |
| Library | list_jobs, get_job_status, get_audio_link |
| Sharing | create_share, list_shares, revoke_share |
| Organization | manage_collection, list_tags, list_bookmarks |
| Account | get_usage, get_storage |
Quick Start
Install the MCP package:
npm install -g @theproductivepixel/aittsm
Configure your MCP client to connect:
{
  "mcpServers": {
    "aittsm": {
      "command": "aittsm",
      "env": {
        "AITTSM_API_KEY": "your_api_key"
      }
    }
  }
}
Your agent can now call generate_speech with text, voice_id, and output_format parameters and get back a link to playable audio.
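Before handing arguments to generate_speech, an agent harness may want a quick local sanity check. The sketch below uses the three parameter names from this article; the checks themselves, including the "provider:name" shape inferred from the sample voice ID, are assumptions and not the server's actual validation rules.

```typescript
// Arguments for generate_speech, using the parameter names from this article.
interface SpeechArgs {
  text: string;
  voice_id: string;
  output_format: string;
}

// Returns a list of problems; an empty list means the arguments look usable.
// These checks are illustrative assumptions, not rules enforced by the server.
function checkSpeechArgs(args: SpeechArgs): string[] {
  const problems: string[] = [];
  if (args.text.trim().length === 0) {
    problems.push("text is empty");
  }
  // Assumption: voice IDs look like "provider:name", as in the sample ID.
  const parts = args.voice_id.split(":");
  if (parts.length !== 2 || parts[0] === "" || parts[1] === "") {
    problems.push("voice_id should look like provider:name");
  }
  if (args.output_format.trim().length === 0) {
    problems.push("output_format is empty");
  }
  return problems;
}

const goodArgs: SpeechArgs = {
  text: "Today we discussed the Q3 roadmap...",
  voice_id: "google:en-US-Chirp3HD-Kore",
  output_format: "mp3",
};
const badArgs: SpeechArgs = { text: " ", voice_id: "Kore", output_format: "" };

console.log(checkSpeechArgs(goodArgs));
console.log(checkSpeechArgs(badArgs));
```

The tool schema returned by the server remains the source of truth; a check like this only catches obvious mistakes before a request is sent.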
Example: Agent Generates a Voice Note
Agent: I'll create an audio summary of today's meeting notes.
→ calls generate_speech({ text: "Today we discussed the Q3 roadmap...", voice_id: "google:en-US-Chirp3HD-Kore", output_format: "mp3" })
← returns { job_id: "abc123", status: "completed", audio_endpoint: "..." }
Agent: Done. Here's the audio summary: [link]
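An agent also needs to decide what to do with the result. The sketch below assumes the response fields shown in the example (job_id, status, audio_endpoint) and assumes that any status other than "completed" means the job is still in progress, in which case the agent falls back to the get_job_status tool. The function and type names are hypothetical.

```typescript
// Result fields as shown in the example; audio_endpoint may be absent
// while a job is still running (an assumption, not documented behavior).
interface SpeechResult {
  job_id: string;
  status: string;
  audio_endpoint?: string;
}

type NextStep =
  | { kind: "share_link"; url: string }
  | { kind: "check_status"; tool: "get_job_status"; job_id: string };

// Decide the agent's next action from a generate_speech result.
function nextStepFor(result: SpeechResult): NextStep {
  if (result.status === "completed" && result.audio_endpoint) {
    return { kind: "share_link", url: result.audio_endpoint };
  }
  // Anything else is treated as "not ready yet": look the job up again later.
  return { kind: "check_status", tool: "get_job_status", job_id: result.job_id };
}

const finished = nextStepFor({
  job_id: "abc123",
  status: "completed",
  audio_endpoint: "https://example.invalid/audio/abc123", // placeholder URL
});
const pending = nextStepFor({ job_id: "abc123", status: "processing" });

console.log(finished.kind);
console.log(pending.kind);
```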
Remote Server (No Install)
You can also connect directly to the hosted MCP server — no local package needed. Point your MCP client to the remote endpoint with your API key for authentication.
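As a rough sketch of what a remote connection involves, the helper below wraps a JSON-RPC payload in an HTTP request description. The endpoint URL is a placeholder, and sending the API key as a Bearer token is an assumption; check the service documentation for the real endpoint and authentication scheme. The Accept header follows MCP's Streamable HTTP transport, where a server may answer with plain JSON or an event stream.

```typescript
// Plain description of an HTTP request; pass it to whatever HTTP client you use.
interface RemoteRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: wraps a JSON-RPC payload for a remote MCP endpoint.
// Assumption: the API key travels as a Bearer token.
function buildRemoteCall(endpoint: string, apiKey: string, payload: object): RemoteRequest {
  return {
    url: endpoint,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Accept": "application/json, text/event-stream",
      "Authorization": `Bearer ${apiKey}`,
    },
    body: JSON.stringify(payload),
  };
}

const remoteCall = buildRemoteCall(
  "https://mcp.example.invalid/", // placeholder, not the real endpoint
  "your_api_key",
  { jsonrpc: "2.0", id: 1, method: "tools/list" }
);

console.log(remoteCall.url);
console.log(remoteCall.body);
```

Most MCP clients handle this transport for you once they are given an endpoint and credentials, so a helper like this is mainly useful for debugging or custom integrations.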