MiniMax

MiniMax TTS — Try Speech-02 Online via Fish Speech

MiniMax Speech-02 is a state-of-the-art Chinese and multilingual TTS model from MiniMax AI. It delivers highly expressive, emotionally nuanced speech with industry-leading Chinese quality — available on Fish Speech alongside other top models.

Try MiniMax TTS free on Kitta AI

No credit card required. 2,000 free credits per month.

Key features

  • Industry-leading Chinese TTS quality
  • Emotion control (happy, sad, angry, fearful, disgusted, surprised)
  • Speech-02 Turbo and HD variants
  • Multi-speaker conversation support
  • Low-latency streaming API
  • Voice cloning support

Use cases

  • Chinese content creators
  • Emotional storytelling
  • Customer service bots
  • E-learning in Chinese

Supported languages

30+

Chinese, English, Japanese, Korean, Spanish & more

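MiniMax TTS compared with other options

Platform              | Quality | Speed | Languages | Voice cloning   | Price
Fish Speech (MiniMax) | ★★★★★   | Fast  | 30+       | ✓ 10s sample    | Free tier + from $9/mo
ElevenLabs            | ★★★★★   | Fast  | 32        | ✓ Paid only     | From $5/mo (limited)
Azure TTS             | ★★★★    | Fast  | 140+      | ✓ Custom Neural | Pay-per-use
Google TTS            | ★★★★    | Fast  | 50+       |                 | Pay-per-use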

FAQ

What is MiniMax TTS?

MiniMax TTS (Speech-02) is an AI text-to-speech model developed by MiniMax AI. It is widely regarded as one of the best TTS models for the Chinese language, with strong multilingual support and emotion control.

How does MiniMax Speech-02 compare to other TTS models?

MiniMax Speech-02 excels in Chinese TTS quality and emotional expressiveness. It offers two variants: Speech-02 Turbo (fast, cost-efficient) and Speech-02 HD (highest quality).

Can I try MiniMax TTS for free?

Yes. Fish Speech provides free access to MiniMax TTS. Sign up for a free account and select MiniMax as your model in the workspace.

Does MiniMax TTS support emotion control?

Yes. MiniMax Speech-02 supports emotion tags including happy, sad, angry, fearful, disgusted, and surprised — giving you fine-grained control over speech expressiveness.
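
As a rough illustration of how the six emotion tags might be checked before a synthesis request is sent, here is a minimal Python sketch. The function name `build_tts_request`, the payload field names, and the model identifier `speech-02-turbo` are all hypothetical and do not describe the actual MiniMax or Fish Speech API; only the list of emotion tags comes from the answer above.

```python
# Emotion tags as listed in the FAQ answer above.
SUPPORTED_EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised"}


def build_tts_request(text: str, emotion: str, model: str = "speech-02-turbo") -> dict:
    """Build a hypothetical request payload (field names are illustrative only)."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # Normalize the tag so " Happy " and "happy" are treated the same.
    tag = emotion.strip().lower()
    if tag not in SUPPORTED_EMOTIONS:
        raise ValueError(
            f"unsupported emotion {emotion!r}; expected one of {sorted(SUPPORTED_EMOTIONS)}"
        )
    return {"model": model, "text": text, "emotion": tag}


if __name__ == "__main__":
    print(build_tts_request("你好，欢迎回来！", "happy"))
```

Validating the tag locally keeps a typo such as "suprised" from reaching the service, whatever the real request format turns out to be.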

What is the difference between MiniMax Speech-02 Turbo and HD?

Speech-02 Turbo is optimized for speed and cost efficiency, ideal for real-time applications. Speech-02 HD delivers the highest audio quality, best for final production output.
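
The trade-off described above can be expressed as a small selection helper. This is only a sketch: the identifiers `speech-02-turbo` and `speech-02-hd` and the function `pick_variant` are assumed names for illustration, not confirmed API values.

```python
def pick_variant(realtime: bool) -> str:
    """Return a hypothetical model identifier for the given use case.

    Turbo is described as optimized for speed and cost (real-time use);
    HD is described as the highest-quality option (final production output).
    """
    return "speech-02-turbo" if realtime else "speech-02-hd"


if __name__ == "__main__":
    print(pick_variant(realtime=True))   # live chatbot or streaming reply
    print(pick_variant(realtime=False))  # final audiobook or video render
```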

Continue exploring Kitta AI