MARS5 TTS

MARS5 TTS - Open-source, insanely prosodic text-to-speech model

Introduction

MARS5 is an open-source TTS model that replicates performances (from 2-3 seconds of reference audio) in 140+ languages, even in extremely tough prosodic scenarios such as sports commentary, movies, anime, and more. Join our Discord https://discord.gg/4GVdQ28cZC today!

Updated:

June 14, 2024

Monthly visitors:

437.9M

Affiliate program:

📝🔉 Text To Speech

MARS5 TTS Overview

MARS5 is a novel English text-to-speech (TTS) model developed by CAMB.AI. It follows a two-stage pipeline, an autoregressive (AR) stage followed by a non-autoregressive (NAR) stage, with a unique NAR component. With just 5 seconds of audio and a snippet of text, MARS5 can generate speech for diverse scenarios such as sports commentary and anime. Punctuation and capitalization in the input text steer the prosody of the generated output, and an audio reference file specifies the speaker identity. MARS5 supports both shallow and deep cloning; deep cloning additionally requires the transcript of the reference audio (the prompt transcript). The model is loaded through `torch.hub`, and inference takes the reference audio and its transcript as input. The default settings give good results, but the inference settings can be tuned for specific use cases. The MARS5 checkpoints and the hardware requirements are provided in the GitHub repo. Contributions to improve the model are welcome.
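
The loading and cloning flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's reference code: `CloneRequest`, `validate_request`, and `synthesize` are hypothetical helpers introduced here, and the `torch.hub` repo id (`Camb-ai/mars5-tts`), the entry point name (`mars5_english`), the `deep_clone` config field, and the `tts(...)` call are assumptions based on the GitHub README that should be checked against the current repo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CloneRequest:
    """Hypothetical container for one synthesis request (not part of MARS5)."""
    text: str                                  # text to speak; punctuation/capitalization steer prosody
    reference_wav_path: str                    # a few seconds of audio from the target speaker
    reference_transcript: Optional[str] = None
    deep_clone: bool = False                   # False = shallow clone


def validate_request(req: CloneRequest) -> CloneRequest:
    """Enforce the rule from the overview: deep cloning needs the prompt transcript."""
    if not req.text.strip():
        raise ValueError("text to synthesize must not be empty")
    if req.deep_clone and not (req.reference_transcript or "").strip():
        raise ValueError("deep cloning requires the transcript of the reference audio")
    return req


def synthesize(req: CloneRequest):
    """Run inference. Heavy imports are local so validation works without torch."""
    import torch
    import librosa

    validate_request(req)
    # ASSUMPTION: repo id, entry point, and (model, config_class) return value
    # follow the GitHub README.
    mars5, config_class = torch.hub.load(
        "Camb-ai/mars5-tts", "mars5_english", trust_repo=True
    )
    # Resample the reference audio to the model's sample rate (assumed attribute `sr`).
    wav, _ = librosa.load(req.reference_wav_path, sr=mars5.sr, mono=True)
    # ASSUMPTION: the config accepts `deep_clone`; other settings stay at defaults.
    cfg = config_class(deep_clone=req.deep_clone)
    # ASSUMPTION: tts(text, reference_audio, reference_transcript, cfg=...) signature.
    _, output_audio = mars5.tts(
        req.text, torch.from_numpy(wav), req.reference_transcript or "", cfg=cfg
    )
    return output_audio, mars5.sr
```

Calling `validate_request(CloneRequest("GOAL! What a finish!", "ref.wav"))` succeeds as a shallow clone, while setting `deep_clone=True` without a transcript raises `ValueError` before any model weights are downloaded.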

MARS5 TTS Features

Two-stage AR-NAR pipeline
Guided prosody using punctuation and capitalization
Speaker identity specification
Shallow and deep cloning
Easy model loading using `torch.hub`
Inference using reference audio and transcript
Open-source with alternative licensing options

MARS5 TTS Pricing

MARS5 is open-source and available under the GNU AGPL 3.0 license. For alternative licensing options, please contact [email protected]

MARS5 TTS Analytics

Website overview

Key performance metrics for github.com

Bounce rate

38.34%

Pages per visit

6.50

Total visitors

437,914,238

Time on site

7m 18s

Global rank

#78

Country rank

#111

Top countries

Traffic distribution by country

1. United States: 15.94%
2. China: 15.11%
3. India: 9.28%
4. Japan: 3.94%

Total visitors

Monthly visitor statistics for the past three months

Trending down by 5.3% this month

April - June 2026

Traffic sources

Distribution of traffic sources

Social:

6.7%

Paid Referrals:

0.0%

Mail:

0.9%

Referrals:

11.0%

Search:

30.1%

Direct:

51.3%

Dominant source: Direct (51.3% of all traffic)

MARS5 TTS Q&A

What is MARS5?

What scenarios can MARS5 generate speech for?

How can the prosody of the generated output be guided?

How can speaker identity be specified?

What is the difference between shallow and deep cloning?

How can MARS5 be loaded and used for inference?

What are the hardware requirements for running MARS5?

Can MARS5 be used via an API?

What are the future improvements planned for MARS5?

How can I contribute to improving MARS5?

MARS5 TTS Alternatives