Spirit-LM

Fine-tuned

Developer

Meta

License

Llama 2 Community License

Release date

2024/2/8

Supported languages

en

Base model

meta-llama/Llama-2-7b

official, multilingual

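A multimodal language model developed by Meta. It can process text and speech freely interleaved in a single sequence. Built on Llama 2 and further trained with added speech tokens, it converts text to speech and speech to text seamlessly, and it can generate expressive speech.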
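As an illustration of what "freely interleaved" text and speech can mean at the token level, the following minimal sketch flattens text and speech segments into one stream and marks each switch of modality. The marker strings `[TEXT]` and `[SPEECH]`, the `[Hu…]` unit labels, and the `Segment` and `interleave` names are assumptions made for this example, not the model's confirmed vocabulary or API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical marker strings; the real tokenizer's special tokens may differ.
TEXT_MARKER = "[TEXT]"
SPEECH_MARKER = "[SPEECH]"


@dataclass
class Segment:
    modality: str      # "text" or "speech"
    tokens: List[str]  # word pieces for text, unit labels for speech


def interleave(segments: List[Segment]) -> List[str]:
    """Flatten segments into one stream, emitting a marker at each modality switch."""
    stream: List[str] = []
    current: Optional[str] = None
    for seg in segments:
        if seg.modality not in ("text", "speech"):
            raise ValueError(f"unknown modality: {seg.modality}")
        if seg.modality != current:
            stream.append(TEXT_MARKER if seg.modality == "text" else SPEECH_MARKER)
            current = seg.modality
        stream.extend(seg.tokens)
    return stream


# Illustrative input: the unit labels are made up for this example.
example = interleave([
    Segment("text", ["the", "cat"]),
    Segment("speech", ["[Hu12]", "[Hu87]", "[Hu5]"]),
    Segment("text", ["sat"]),
])
print(example)
```

Because both modalities end up in one flat sequence, a single decoder-only language model can in principle continue a prompt in either modality, which is the behavior the description above refers to.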

Technical specifications

Architecture

Transformer (Multimodal)

Parameter variants

Spirit-LM Base 7B (7B)

A multimodal model that converts between text and speech in both directions. It uses semantic speech tokens.

VRAM: 4 GB

No GGUF files are registered.

Spirit-LM Expressive 7B (7B)

A variant specialized for expressive speech generation. It uses pitch and style tokens.

VRAM: 4 GB

No GGUF files are registered.
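The difference between the two variants can be pictured as a difference in which token streams describe a speech segment. The sketch below merges a semantic stream with optional pitch and style streams by start time. It is an illustration under stated assumptions: the `[Hu…]`, `[Pi…]`, and `[St…]` labels, the timestamps, the tie-break order, and the `merge_streams` helper are all hypothetical and are not taken from the model's actual tokenizer.

```python
from typing import List, Tuple

Token = Tuple[float, str]  # (start time in seconds, token label)

# Tie-break order for tokens sharing a timestamp (an assumption of this sketch):
# style first, then pitch, then the semantic unit.
PRIORITY = {"St": 0, "Pi": 1, "Hu": 2}


def kind(label: str) -> str:
    """Return the two-letter token family, e.g. "[Hu12]" -> "Hu"."""
    return label[1:3]


def merge_streams(*streams: List[Token]) -> List[str]:
    """Merge any number of timed token streams into one label sequence."""
    merged = [tok for stream in streams for tok in stream]
    merged.sort(key=lambda t: (t[0], PRIORITY[kind(t[1])]))
    return [label for _, label in merged]


# Made-up tokens for one short speech segment.
semantic = [(0.00, "[Hu12]"), (0.04, "[Hu87]"), (0.08, "[Hu5]")]
pitch = [(0.00, "[Pi3]"), (0.08, "[Pi7]")]
style = [(0.00, "[St1]")]

base_like = merge_streams(semantic)                      # semantic tokens only
expressive_like = merge_streams(semantic, pitch, style)  # plus pitch and style

print(base_like)
print(expressive_like)
```

In this picture the Base-style sequence carries only what was said, while the Expressive-style sequence also carries how it was said, at the cost of a longer token sequence for the same audio.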

Related models

LLaMA 1

4 variants

Llama 2

6 variants

Code Llama

4 variants

Llama Guard 1

1 variant

Swallow (Llama 2)

3 variants

Llama 3

4 variants

Llama Guard 2

1 variant

Swallow (Llama 3)

2 variants

ELYZA Japanese

1 variant

Llama 3.1

6 variants

Llama Guard 3

3 variants

Swallow (Llama 3.1)

2 variants

DeepSeek-R1-Distill-Llama

2 variants

Llama 3.2

8 variants

Llama 3.3

1 variant

Swallow (Llama 3.3)

1 variant

Llama 4

3 variants

Llama Guard 4

1 variant

Family tree

Current model: Spirit-LM
