diff --git a/voicevox-remotion-template/AGENTS.MD b/voicevox-remotion-template/AGENTS.MD index 312987e..0179de4 100644 --- a/voicevox-remotion-template/AGENTS.MD +++ b/voicevox-remotion-template/AGENTS.MD @@ -9,13 +9,14 @@ - コミットは、機能・生成物・ドキュメントなど意味のある単位で分ける。 ## 実用上の制約 -- `src/data/script.ts` が時系列脚本の唯一の編集元。`src/data/script.json` は互換用であり、現在は参照しない。 +- `src/data/{コンポジション名}/script.ts` が時系列脚本の唯一の編集元。`src/data/script.json` は互換用であり、現在は参照しない。 - `npm run voice:generate` は、VOICEVOX エンジンが `VOICEVOX_URL` で起動していることを前提にする。 - VOICEVOX の話者名・スタイル名は利用環境の `/speakers` に依存する。合わない場合は `characters.*.voicevox` を調整する。 -- `say(...)` を追加・変更した場合は、音声と `src/data/voicevox-manifest.json` を再生成する。 -- `public/audio/lines/*.wav` は動画再現に必要な音声素材として扱う。`out/` などレンダリング済み動画はコミットしない。 +- `say(...)` を追加・変更した場合は、音声と `src/data/{コンポジション名}/voicevox-manifest.json` を再生成する。 +- `public/audio/{コンポジション名}/lines/*.wav` は動画再現に必要な音声素材として扱う。`out/` などレンダリング済み動画はコミットしない。 - 立ち絵画像は `public` 配下に置き、`avatar.imagePath` で参照する。現在の小夜は CSS 仮立ち絵。 - 実装後は可能な範囲で `./node_modules/.bin/tsc --noEmit` と `npm run lint` を実行する。 +- 共通コンポーネントとして実装したコードや関数については、それぞれの先頭行の直上に「用途」「使用方法」「オプションや引数詳細」をコメントとして日本語で記載する。 ## 新規動画追加方針 - 原則として、同一 Remotion プロジェクト内に別 Composition を追加し、動画別データディレクトリで脚本・音声・manifest・素材を分離する。 diff --git a/voicevox-remotion-template/README.md b/voicevox-remotion-template/README.md index 38ffa27..7763080 100644 --- a/voicevox-remotion-template/README.md +++ b/voicevox-remotion-template/README.md @@ -17,7 +17,7 @@ https://github.com/VOICEVOX/voicevox_engine ### 3. 
脚本を編集 -`src/data/script.ts` の `characters` と `timeline` を編集します。 +`src/data/yukkuri-composition/script.ts` の `characters` と `timeline` を編集します。 ```ts show("sayo", {caption: "ネコミミ代表として、小夜が登場!"}); @@ -36,10 +36,17 @@ npm run voice:generate ``` -`src/data/script.ts` の `say(...)` から `public/audio/lines/*.wav` を生成し、 -`src/data/voicevox-manifest.json` に長さ・話者・スタイル情報を記録します。 +`src/data/yukkuri-composition/script.ts` の `say(...)` から +`public/audio/yukkuri-composition/lines/*.wav` を生成し、 +`src/data/yukkuri-composition/voicevox-manifest.json` に長さ・話者・スタイル情報を記録します。 音声が未生成の行は、プレビュー時にテキスト長から尺を推定します。 +YukkuriComposition の音声だけを明示して生成する場合は、次も使えます。 + +```bash +npm run voice:generate:yukkuri-composition +``` + ピザ窯サンプルの音声を生成する場合は、次を実行します。 ```bash @@ -66,18 +73,19 @@ 単体音声だけ再生成する場合は、次のように音声ファイルを指定できます。 ```bash -npm run lipsync:generate -- public/audio/lines/zunda-001.wav +npm run lipsync:generate -- public/audio/yukkuri-composition/lines/zunda-001.wav ``` 時系列シナリオ単位で再生成する場合は、対応する VOICEVOX manifest を指定します。 ```bash -npm run lipsync:generate -- --source-manifest src/data/pizza-oven-project-01/voicevox-manifest.json +npm run lipsync:generate -- --source-manifest src/data/yukkuri-composition/voicevox-manifest.json ``` -PizzaOvenProject01 には専用コマンドもあります。 +YukkuriComposition と PizzaOvenProject01 には専用コマンドもあります。 ```bash +npm run lipsync:generate:yukkuri-composition npm run lipsync:generate:pizza-oven-project-01 ``` @@ -124,23 +132,23 @@ ``` ## 編集ポイント -- 時系列脚本: `src/data/script.ts` -- 音声タイミング: `src/data/voicevox-manifest.json` (自動生成) +- 時系列脚本: `src/data/yukkuri-composition/script.ts` +- 音声タイミング: `src/data/yukkuri-composition/voicevox-manifest.json` (自動生成) - 口パクタイミング: `src/generated/lipsync/manifest.json` (自動生成) -- 映像の構成: `src/yukkuri-composition.tsx` +- 映像の構成: `src/yukkuri-composition/index.tsx` - 立ち絵セット: `src/standee-sets.ts` ## 字幕方針 本テンプレートは、短編 VOICEVOX ドラマ動画や、実写映像を背景にした解説動画での利用を主用途としています。 -そのため、字幕は `@remotion/captions` の `Caption` 型 JSON による単語単位・時刻単位の正式な字幕データとしては扱わず、`src/data/script.ts` の `say(...)` 
/ `show(...)` に紐づく発話単位・シーン単位の表示テキストとして扱います。 +そのため、字幕は `@remotion/captions` の `Caption` 型 JSON による単語単位・時刻単位の正式な字幕データとしては扱わず、`src/data/yukkuri-composition/script.ts` の `say(...)` / `show(...)` に紐づく発話単位・シーン単位の表示テキストとして扱います。 `say(...)` の字幕は VOICEVOX で生成した音声尺、または未生成時の推定尺に同期して表示します。SRT/VTT 互換、単語単位ハイライト、自動文字起こし字幕が必要になった場合は、その時点で `@remotion/captions` の導入を検討します。 ## VOICEVOX設定 - `VOICEVOX_URL` (既定: `http://host.docker.internal:50021`) -- 話者とスタイルは `src/data/script.ts` の `characters.*.voicevox` で指定します。 +- 話者とスタイルは各コンポジションの `src/data/{コンポジション名}/script.ts` の `characters.*.voicevox` で指定します。 ## 立ち絵セットの追加・差し替え 立ち絵本体、口パク画像、通常表示時の基本レイアウトは `src/standee-sets.ts` の `standeeSets` にまとめています。 @@ -191,7 +199,7 @@ - `imageLayout.flipX`: 左右反転したい場合に `true` にします。 ### 3. キャラクター定義で使うセットを選ぶ -通常コンポジションは `src/data/script.ts`、ピザ窯コンポジションは +YukkuriComposition は `src/data/yukkuri-composition/script.ts`、ピザ窯コンポジションは `src/data/pizza-kiln/script.ts` の `characters.*.avatar` で、使いたい立ち絵セットを展開します。 対象ファイルで `getStandeeSet` を import して使います。 @@ -216,7 +224,7 @@ 通常の全身表示は `src/standee-sets.ts` の `imageLayout` で調整します。 コンポジションごとに特別な配置がある場合だけ、描画側を調整します。 -- 通常コンポジション: `src/yukkuri-composition.tsx` の `Stage` / `CharacterAvatar` +- YukkuriComposition: `src/yukkuri-composition/index.tsx` - ピザ窯コンポジション: `src/pizza-kiln-composition.tsx` の `SayoStandee` ピザ窯コンポジションは、通常背景用の `STAGE_STANDEE_*` と、実写動画右下用の diff --git a/voicevox-remotion-template/package.json b/voicevox-remotion-template/package.json index 5ce86d9..b4b96d0 100644 --- a/voicevox-remotion-template/package.json +++ b/voicevox-remotion-template/package.json @@ -8,9 +8,11 @@ "render": "remotion render", "lint": "eslint .", "lipsync:generate": "node scripts/generate-lipsync.js", + "lipsync:generate:yukkuri-composition": "node scripts/generate-lipsync.js --source-manifest src/data/yukkuri-composition/voicevox-manifest.json", "lipsync:generate:pizza-oven-project-01": "node scripts/generate-lipsync.js --source-manifest src/data/pizza-oven-project-01/voicevox-manifest.json", 
"test:lipsync": "node --test scripts/lipsync-utils.test.js", "voice:generate": "node scripts/voicevox-generate.js", + "voice:generate:yukkuri-composition": "node scripts/voicevox-generate.js --script src/data/yukkuri-composition/script.ts --output public/audio/yukkuri-composition/lines --manifest src/data/yukkuri-composition/voicevox-manifest.json", "voice:generate:pizza-kiln": "node scripts/voicevox-generate.js --script src/data/pizza-kiln/script.ts --output public/audio/pizza-kiln/lines --manifest src/data/pizza-kiln/voicevox-manifest.json", "voice:generate:pizza-oven-project-01": "node scripts/voicevox-generate.js --script src/data/pizza-oven-project-01/script.ts --output public/audio/pizza-oven-project-01/lines --manifest src/data/pizza-oven-project-01/voicevox-manifest.json" }, diff --git a/voicevox-remotion-template/scripts/generate-lipsync.js b/voicevox-remotion-template/scripts/generate-lipsync.js index 61ea044..4ef5748 100644 --- a/voicevox-remotion-template/scripts/generate-lipsync.js +++ b/voicevox-remotion-template/scripts/generate-lipsync.js @@ -12,7 +12,7 @@ const rawDir = path.join(publicDir, "lipsync/raw"); const DEFAULT_SOURCE_MANIFESTS = [ - "src/data/voicevox-manifest.json", + "src/data/yukkuri-composition/voicevox-manifest.json", "src/data/pizza-kiln/voicevox-manifest.json", "src/data/pizza-oven-project-01/voicevox-manifest.json", ]; diff --git a/voicevox-remotion-template/scripts/voicevox-generate.js b/voicevox-remotion-template/scripts/voicevox-generate.js index de101d7..19eec45 100644 --- a/voicevox-remotion-template/scripts/voicevox-generate.js +++ b/voicevox-remotion-template/scripts/voicevox-generate.js @@ -18,9 +18,9 @@ const parseArgs = () => { const values = { - script: "src/data/script.ts", - output: "public/audio/lines", - manifest: "src/data/voicevox-manifest.json", + script: "src/data/yukkuri-composition/script.ts", + output: "public/audio/yukkuri-composition/lines", + manifest: "src/data/yukkuri-composition/voicevox-manifest.json", 
  };

  const args = process.argv.slice(2);
diff --git a/voicevox-remotion-template/src/data/script.json b/voicevox-remotion-template/src/data/script.json
index 3c49ed9..ade9e18 100644
--- a/voicevox-remotion-template/src/data/script.json
+++ b/voicevox-remotion-template/src/data/script.json
@@ -1,4 +1,4 @@
 {
   "deprecated": true,
-  "message": "The editable sequence now lives in src/data/script.ts."
+  "message": "The editable sequence now lives in src/data/yukkuri-composition/script.ts."
 }
diff --git a/voicevox-remotion-template/src/data/script.ts b/voicevox-remotion-template/src/data/script.ts
deleted file mode 100644
index 041438b..0000000
--- a/voicevox-remotion-template/src/data/script.ts
+++ /dev/null
@@ -1,131 +0,0 @@
-import {getStandeeSet, type AvatarDefinition} from "../standee-sets";
-
-export type VoicevoxVoice = Readonly<{
-  speakerName: string;
-  styleName: string;
-}>;
-
-export type CharacterDefinition = Readonly<{
-  displayName: string;
-  voicevox: VoicevoxVoice;
-  avatar: AvatarDefinition;
-}>;
-
-export const characters = {
-  zundamon: {
-    displayName: "ずんだもん",
-    voicevox: {
-      speakerName: "ずんだもん",
-      styleName: "ノーマル",
-    },
-    avatar: {
-      ...getStandeeSet("zundamon_ohnegus_ai"),
-      accentColor: "#79d36f",
-      nameplatePosition: "none",
-      idleAnimationType: "none",
-      speakingAnimationType: "rhubarbLipSync",
-    },
-  },
-  sayo: {
-    displayName: "小夜",
-    voicevox: {
-      speakerName: "小夜/SAYO",
-      styleName: "ノーマル",
-    },
-    avatar: {
-      ...getStandeeSet("sayo_ohnegus_ai"),
-      accentColor: "#6b5f83",
-      nameplatePosition: "none",
-      idleAnimationType: "none",
-      speakingAnimationType: "rhubarbLipSync",
-    },
-  },
-} as const satisfies Record<string, CharacterDefinition>;
-
-export type CharacterId = keyof typeof characters;
-
-export type SpeechOptions = Readonly<{
-  subtitle?: string;
-  voicevox?: Partial<VoicevoxVoice>;
-}>;
-
-export type ShowOptions = Readonly<{
-  caption?: string;
-  durationSeconds?: number;
-}>;
-
-export type SpeechEvent = Readonly<{
-  type: "say";
-  id: string;
-  character: CharacterId;
-  text: string;
-  subtitle?: string;
-  voicevox?: Partial<VoicevoxVoice>;
-}>;
-
-export type ShowEvent = Readonly<{
-  type: "show";
-  character: CharacterId;
-  caption?: string;
-  durationSeconds?: number;
-}>;
-
-export type TimelineEvent = SpeechEvent | ShowEvent;
-
-export const say = (
-  id: string,
-  character: CharacterId,
-  text: string,
-  options: SpeechOptions = {}
-): SpeechEvent => ({
-  type: "say",
-  id,
-  character,
-  text,
-  ...options,
-});
-
-export const show = (
-  character: CharacterId,
-  options: ShowOptions = {}
-): ShowEvent => ({
-  type: "show",
-  character,
-  ...options,
-});
-
-export const initialVisibleCharacters: CharacterId[] = ["zundamon"];
-
-export const timeline: TimelineEvent[] = [
-  say("zunda-001", "zundamon", "みなさんこんにちは、ずんだもんなのだ!"),
-  say(
-    "zunda-002",
-    "zundamon",
-    "今日のテーマは「ネコミミはなぜかわいいのか?」なのだ。"
-  ),
-  say(
-    "zunda-003",
-    "zundamon",
-    "まず大きな理由は、丸みのあるシルエットと動きなのだ。"
-  ),
-  say(
-    "zunda-004",
-    "zundamon",
-    "そして感情が伝わりやすくて、親近感が増すのだ!"
-  ),
-  show("sayo", {
-    caption: "ネコミミ代表として、小夜が登場!",
-  }),
-  say(
-    "sayo-001",
-    "sayo",
-    "小夜です。ネコミミ代表として、猫耳のかわいさを証明しに来ました。"
-  ),
-  say("zunda-005", "zundamon", "それじゃあ、また次回なのだ!"),
-];
-
-export const isSpeechEvent = (
-  event: TimelineEvent
-): event is SpeechEvent => event.type === "say";
-
-export const script = timeline.filter(isSpeechEvent);
diff --git a/voicevox-remotion-template/src/data/timing.ts b/voicevox-remotion-template/src/data/timing.ts
deleted file mode 100644
index 6780ef3..0000000
--- a/voicevox-remotion-template/src/data/timing.ts
+++ /dev/null
@@ -1,55 +0,0 @@
-import {SpeechEvent, timeline, TimelineEvent} from "./script";
-import voicevoxManifest from "./voicevox-manifest.json";
-
-type ManifestEntry = {
-  id: string;
-  character?: string;
-  speakerName?: string;
-  styleName?: string;
-  speakerId?: number;
-  file: string;
-  durationSeconds: number;
-};
-
-const manifestEntries = voicevoxManifest as ManifestEntry[];
-const manifestById = new Map(
-  manifestEntries.map((entry) => [entry.id,
entry]) -); - -export const FPS = 30; -export const GAP_FRAMES = 6; -export const DEFAULT_SHOW_SECONDS = 1.5; - -export const hasAudioForSpeech = (speech: SpeechEvent) => - manifestById.has(speech.id); - -export const audioFileForSpeech = (speech: SpeechEvent) => - manifestById.get(speech.id)?.file ?? `audio/lines/${speech.id}.wav`; - -export const durationForSpeech = (speech: SpeechEvent, fps = FPS) => { - const entry = manifestById.get(speech.id); - if (entry && Number.isFinite(entry.durationSeconds)) { - return Math.max(1, Math.ceil(entry.durationSeconds * fps)); - } - - const estimatedSeconds = Math.max(1.2, speech.text.length * 0.11); - return Math.ceil(estimatedSeconds * fps); -}; - -export const durationForTimelineEvent = ( - event: TimelineEvent, - fps = FPS -) => { - if (event.type === "say") { - return durationForSpeech(event, fps); - } - - const durationSeconds = event.durationSeconds ?? DEFAULT_SHOW_SECONDS; - return Math.max(1, Math.ceil(durationSeconds * fps)); -}; - -export const totalDurationInFrames = (fps = FPS) => - timeline.reduce((sum, event, index) => { - const gap = index < timeline.length - 1 ? 
GAP_FRAMES : 0; - return sum + durationForTimelineEvent(event, fps) + gap; - }, 0); diff --git a/voicevox-remotion-template/src/data/voicevox-manifest.json b/voicevox-remotion-template/src/data/voicevox-manifest.json deleted file mode 100644 index a34bc69..0000000 --- a/voicevox-remotion-template/src/data/voicevox-manifest.json +++ /dev/null @@ -1,56 +0,0 @@ -[ - { - "id": "zunda-001", - "character": "zundamon", - "speakerName": "ずんだもん", - "styleName": "ノーマル", - "speakerId": 3, - "file": "audio/lines/zunda-001.wav", - "durationSeconds": 3.0613333333333332 - }, - { - "id": "zunda-002", - "character": "zundamon", - "speakerName": "ずんだもん", - "styleName": "ノーマル", - "speakerId": 3, - "file": "audio/lines/zunda-002.wav", - "durationSeconds": 4.48 - }, - { - "id": "zunda-003", - "character": "zundamon", - "speakerName": "ずんだもん", - "styleName": "ノーマル", - "speakerId": 3, - "file": "audio/lines/zunda-003.wav", - "durationSeconds": 4.394666666666667 - }, - { - "id": "zunda-004", - "character": "zundamon", - "speakerName": "ずんだもん", - "styleName": "ノーマル", - "speakerId": 3, - "file": "audio/lines/zunda-004.wav", - "durationSeconds": 4.32 - }, - { - "id": "sayo-001", - "character": "sayo", - "speakerName": "小夜/SAYO", - "styleName": "ノーマル", - "speakerId": 46, - "file": "audio/lines/sayo-001.wav", - "durationSeconds": 5.834666666666666 - }, - { - "id": "zunda-005", - "character": "zundamon", - "speakerName": "ずんだもん", - "styleName": "ノーマル", - "speakerId": 3, - "file": "audio/lines/zunda-005.wav", - "durationSeconds": 2.474666666666667 - } -] diff --git a/voicevox-remotion-template/src/data/yukkuri-composition/script.ts b/voicevox-remotion-template/src/data/yukkuri-composition/script.ts new file mode 100644 index 0000000..b0380f6 --- /dev/null +++ b/voicevox-remotion-template/src/data/yukkuri-composition/script.ts @@ -0,0 +1,131 @@ +import {getStandeeSet, type AvatarDefinition} from "../../standee-sets"; + +export type VoicevoxVoice = Readonly<{ + speakerName: string; + styleName: 
string;
+}>;
+
+export type CharacterDefinition = Readonly<{
+  displayName: string;
+  voicevox: VoicevoxVoice;
+  avatar: AvatarDefinition;
+}>;
+
+export const characters = {
+  zundamon: {
+    displayName: "ずんだもん",
+    voicevox: {
+      speakerName: "ずんだもん",
+      styleName: "ノーマル",
+    },
+    avatar: {
+      ...getStandeeSet("zundamon_ohnegus_ai"),
+      accentColor: "#79d36f",
+      nameplatePosition: "none",
+      idleAnimationType: "none",
+      speakingAnimationType: "rhubarbLipSync",
+    },
+  },
+  sayo: {
+    displayName: "小夜",
+    voicevox: {
+      speakerName: "小夜/SAYO",
+      styleName: "ノーマル",
+    },
+    avatar: {
+      ...getStandeeSet("sayo_ohnegus_ai"),
+      accentColor: "#6b5f83",
+      nameplatePosition: "none",
+      idleAnimationType: "none",
+      speakingAnimationType: "rhubarbLipSync",
+    },
+  },
+} as const satisfies Record<string, CharacterDefinition>;
+
+export type CharacterId = keyof typeof characters;
+
+export type SpeechOptions = Readonly<{
+  subtitle?: string;
+  voicevox?: Partial<VoicevoxVoice>;
+}>;
+
+export type ShowOptions = Readonly<{
+  caption?: string;
+  durationSeconds?: number;
+}>;
+
+export type SpeechEvent = Readonly<{
+  type: "say";
+  id: string;
+  character: CharacterId;
+  text: string;
+  subtitle?: string;
+  voicevox?: Partial<VoicevoxVoice>;
+}>;
+
+export type ShowEvent = Readonly<{
+  type: "show";
+  character: CharacterId;
+  caption?: string;
+  durationSeconds?: number;
+}>;
+
+export type TimelineEvent = SpeechEvent | ShowEvent;
+
+export const say = (
+  id: string,
+  character: CharacterId,
+  text: string,
+  options: SpeechOptions = {}
+): SpeechEvent => ({
+  type: "say",
+  id,
+  character,
+  text,
+  ...options,
+});
+
+export const show = (
+  character: CharacterId,
+  options: ShowOptions = {}
+): ShowEvent => ({
+  type: "show",
+  character,
+  ...options,
+});
+
+export const initialVisibleCharacters: CharacterId[] = ["zundamon"];
+
+export const timeline: TimelineEvent[] = [
+  say("zunda-001", "zundamon", "みなさんこんにちは、ずんだもんなのだ!"),
+  say(
+    "zunda-002",
+    "zundamon",
+    "今日のテーマは「ネコミミはなぜかわいいのか?」なのだ。"
+  ),
+  say(
+    "zunda-003",
+    "zundamon",
"まず大きな理由は、丸みのあるシルエットと動きなのだ。" + ), + say( + "zunda-004", + "zundamon", + "そして感情が伝わりやすくて、親近感が増すのだ!" + ), + show("sayo", { + caption: "ネコミミ代表として、小夜が登場!", + }), + say( + "sayo-001", + "sayo", + "小夜です。ネコミミ代表として、猫耳のかわいさを証明しに来ました。" + ), + say("zunda-005", "zundamon", "それじゃあ、また次回なのだ!"), +]; + +export const isSpeechEvent = ( + event: TimelineEvent +): event is SpeechEvent => event.type === "say"; + +export const script = timeline.filter(isSpeechEvent); diff --git a/voicevox-remotion-template/src/data/yukkuri-composition/timing.ts b/voicevox-remotion-template/src/data/yukkuri-composition/timing.ts new file mode 100644 index 0000000..d7daf29 --- /dev/null +++ b/voicevox-remotion-template/src/data/yukkuri-composition/timing.ts @@ -0,0 +1,56 @@ +import {SpeechEvent, timeline, TimelineEvent} from "./script"; +import voicevoxManifest from "./voicevox-manifest.json"; + +type ManifestEntry = { + id: string; + character?: string; + speakerName?: string; + styleName?: string; + speakerId?: number; + file: string; + durationSeconds: number; +}; + +const manifestEntries = voicevoxManifest as ManifestEntry[]; +const manifestById = new Map( + manifestEntries.map((entry) => [entry.id, entry]) +); + +export const FPS = 30; +export const GAP_FRAMES = 6; +export const DEFAULT_SHOW_SECONDS = 1.5; + +export const hasAudioForSpeech = (speech: SpeechEvent) => + manifestById.has(speech.id); + +export const audioFileForSpeech = (speech: SpeechEvent) => + manifestById.get(speech.id)?.file ?? 
+ `audio/yukkuri-composition/lines/${speech.id}.wav`; + +export const durationForSpeech = (speech: SpeechEvent, fps = FPS) => { + const entry = manifestById.get(speech.id); + if (entry && Number.isFinite(entry.durationSeconds)) { + return Math.max(1, Math.ceil(entry.durationSeconds * fps)); + } + + const estimatedSeconds = Math.max(1.2, speech.text.length * 0.11); + return Math.ceil(estimatedSeconds * fps); +}; + +export const durationForTimelineEvent = ( + event: TimelineEvent, + fps = FPS +) => { + if (event.type === "say") { + return durationForSpeech(event, fps); + } + + const durationSeconds = event.durationSeconds ?? DEFAULT_SHOW_SECONDS; + return Math.max(1, Math.ceil(durationSeconds * fps)); +}; + +export const totalDurationInFrames = (fps = FPS) => + timeline.reduce((sum, event, index) => { + const gap = index < timeline.length - 1 ? GAP_FRAMES : 0; + return sum + durationForTimelineEvent(event, fps) + gap; + }, 0); diff --git a/voicevox-remotion-template/src/data/yukkuri-composition/voicevox-manifest.json b/voicevox-remotion-template/src/data/yukkuri-composition/voicevox-manifest.json new file mode 100644 index 0000000..781ea9c --- /dev/null +++ b/voicevox-remotion-template/src/data/yukkuri-composition/voicevox-manifest.json @@ -0,0 +1,56 @@ +[ + { + "id": "zunda-001", + "character": "zundamon", + "speakerName": "ずんだもん", + "styleName": "ノーマル", + "speakerId": 3, + "file": "audio/yukkuri-composition/lines/zunda-001.wav", + "durationSeconds": 3.0613333333333332 + }, + { + "id": "zunda-002", + "character": "zundamon", + "speakerName": "ずんだもん", + "styleName": "ノーマル", + "speakerId": 3, + "file": "audio/yukkuri-composition/lines/zunda-002.wav", + "durationSeconds": 4.48 + }, + { + "id": "zunda-003", + "character": "zundamon", + "speakerName": "ずんだもん", + "styleName": "ノーマル", + "speakerId": 3, + "file": "audio/yukkuri-composition/lines/zunda-003.wav", + "durationSeconds": 4.394666666666667 + }, + { + "id": "zunda-004", + "character": "zundamon", + "speakerName": 
"ずんだもん", + "styleName": "ノーマル", + "speakerId": 3, + "file": "audio/yukkuri-composition/lines/zunda-004.wav", + "durationSeconds": 4.32 + }, + { + "id": "sayo-001", + "character": "sayo", + "speakerName": "小夜/SAYO", + "styleName": "ノーマル", + "speakerId": 46, + "file": "audio/yukkuri-composition/lines/sayo-001.wav", + "durationSeconds": 5.834666666666666 + }, + { + "id": "zunda-005", + "character": "zundamon", + "speakerName": "ずんだもん", + "styleName": "ノーマル", + "speakerId": 3, + "file": "audio/yukkuri-composition/lines/zunda-005.wav", + "durationSeconds": 2.474666666666667 + } +] diff --git a/voicevox-remotion-template/src/root.tsx b/voicevox-remotion-template/src/root.tsx index a96a91a..90b507d 100644 --- a/voicevox-remotion-template/src/root.tsx +++ b/voicevox-remotion-template/src/root.tsx @@ -1,7 +1,10 @@ import React from "react"; import {Composition} from "remotion"; import {YukkuriComposition} from "./yukkuri-composition"; -import {FPS as YUKKURI_ZUNDAMON_FPS, totalDurationInFrames as totalYukkuriZundamonDurationInFrames} from "./data/timing"; +import { + FPS as YUKKURI_ZUNDAMON_FPS, + totalDurationInFrames as totalYukkuriZundamonDurationInFrames, +} from "./data/yukkuri-composition/timing"; import {PizzaKilnSayoComposition} from "./pizza-kiln-composition"; import { PIZZA_KILN_FPS, diff --git a/voicevox-remotion-template/src/yukkuri-composition.tsx b/voicevox-remotion-template/src/yukkuri-composition.tsx deleted file mode 100644 index 312557b..0000000 --- a/voicevox-remotion-template/src/yukkuri-composition.tsx +++ /dev/null @@ -1,215 +0,0 @@ -import React from "react"; -import { - AbsoluteFill, - interpolate, - Sequence, - spring, - useCurrentFrame, - useVideoConfig, -} from "remotion"; -import { - characters, - initialVisibleCharacters, - timeline, - type CharacterId, - type TimelineEvent, -} from "./data/script"; -import { - audioFileForSpeech, - GAP_FRAMES, - durationForTimelineEvent, - hasAudioForSpeech, -} from "./data/timing"; -import {roundedFontFamily} 
from "./fonts";
-import {
-  VQCaptionOverlay,
-  VQCharacterStage,
-  VQSpeechOverlay,
-  VQWarmGradientBackground,
-  type VQMouthResolver,
-} from "./lib/VQRemotionLib";
-import {getMouthForSpeechFrame} from "./lipsync/manifest";
-
-type ScheduledTimelineEvent = Readonly<{
-  event: TimelineEvent;
-  from: number;
-  durationInFrames: number;
-  visibleCharacters: CharacterId[];
-  focusedCharacter: CharacterId;
-}>;
-
-const scheduleTimeline = (fps: number): ScheduledTimelineEvent[] => {
-  let cursor = 0;
-  const visibleCharacters = new Set<CharacterId>(initialVisibleCharacters);
-
-  return timeline.map((event, index) => {
-    visibleCharacters.add(event.character);
-
-    const durationInFrames = durationForTimelineEvent(event, fps);
-    const scheduledEvent = {
-      event,
-      from: cursor,
-      durationInFrames,
-      visibleCharacters: Array.from(visibleCharacters),
-      focusedCharacter: event.character,
-    };
-
-    cursor += durationInFrames;
-    if (index < timeline.length - 1) {
-      cursor += GAP_FRAMES;
-    }
-
-    return scheduledEvent;
-  });
-};
-
-const activeSegmentForFrame = (
-  scheduledEvents: ScheduledTimelineEvent[],
-  frame: number
-) => {
-  let activeSegment = scheduledEvents[0];
-
-  for (const scheduledEvent of scheduledEvents) {
-    if (frame >= scheduledEvent.from) {
-      activeSegment = scheduledEvent;
-    } else {
-      break;
-    }
-  }
-
-  return activeSegment;
-};
-
-const clampInterpolation = {
-  extrapolateLeft: "clamp",
-  extrapolateRight: "clamp",
-} as const;
-
-const yukkuriSubtitleOptions = {
-  fontFamily: roundedFontFamily,
-  fontSize: 36,
-  lineHeight: 1.4,
-  backgroundColor: "rgba(255, 255, 255, 0.88)",
-} as const;
-
-const resolveMouth: VQMouthResolver<CharacterId> = ({
-  speechId,
-  speakingLocalFrame,
-  fps,
-}) => getMouthForSpeechFrame(speechId, speakingLocalFrame, fps);
-
-const Title: React.FC<Readonly<{progress: number}>> = ({progress}) => {
-  const opacity = interpolate(progress, [0, 1], [0, 1], clampInterpolation);
-  const translateY = interpolate(progress, [0, 1], [-30, 0], clampInterpolation);
-
-  return (
-    <div
-      style={{
-        fontFamily: roundedFontFamily,
-        fontSize: 54,
-        fontWeight: 700,
-        color: "#1f2a44",
-        letterSpacing: 1,
-        textAlign: "center",
-        marginTop: 40,
-        opacity,
-        transform: `translateY(${translateY}px)`,
-        textShadow: "0 6px 18px rgba(31, 42, 68, 0.2)",
-      }}
-    >
-      ネコミミはなぜかわいい?
-    </div>
-  );
-};
-
-const TimelineOverlay: React.FC<Readonly<{event: TimelineEvent}>> = ({event}) => {
-  if (event.type === "say") {
-    const character = characters[event.character];
-
-    return (
-      <VQSpeechOverlay
-        speech={event}
-        speakerName={character.displayName}
-        accentColor={character.avatar.accentColor}
-        hasAudio={hasAudioForSpeech}
-        getAudioPath={audioFileForSpeech}
-        subtitleOptions={yukkuriSubtitleOptions}
-      />
-    );
-  }
-
-  return (
-    <VQCaptionOverlay
-      text={event.caption}
-      subtitleOptions={yukkuriSubtitleOptions}
-    />
-  );
-};
-
-const keyForEvent = (event: TimelineEvent, index: number) => {
-  if (event.type === "say") {
-    return event.id;
-  }
-
-  return `show-${event.character}-${index}`;
-};
-
-export const YukkuriComposition: React.FC = () => {
-  const frame = useCurrentFrame();
-  const {fps} = useVideoConfig();
-  const scheduledEvents = scheduleTimeline(fps);
-  const activeSegment = activeSegmentForFrame(scheduledEvents, frame);
-  const isInsideActiveSegment =
-    frame < activeSegment.from + activeSegment.durationInFrames;
-
-  const titleProgress = spring({
-    frame,
-    fps,
-    config: {damping: 18, mass: 0.6},
-  });
-  const activeSpeech =
-    isInsideActiveSegment && activeSegment.event.type === "say"
-      ? activeSegment.event
-      : undefined;
-  const speakingCharacter = activeSpeech?.character;
-  const speakingLocalFrame = activeSpeech ? frame - activeSegment.from : 0;
-
-  const sequences = scheduledEvents.map((scheduledEvent, index) => (
-    <Sequence
-      key={keyForEvent(scheduledEvent.event, index)}
-      from={scheduledEvent.from}
-      durationInFrames={scheduledEvent.durationInFrames}
-      premountFor={Math.min(fps, scheduledEvent.from)}
-    >
-      <TimelineOverlay event={scheduledEvent.event} />
-    </Sequence>
-  ));
-
-  return (
-    <AbsoluteFill
-      style={{
-        display: "flex",
-        flexDirection: "column",
-        alignItems: "center",
-      }}
-    >
-      <VQWarmGradientBackground />
-      <Title progress={titleProgress} />
-      <VQCharacterStage
-        characters={characters}
-        visibleCharacters={activeSegment.visibleCharacters}
-        focusedCharacter={
-          isInsideActiveSegment ? activeSegment.focusedCharacter : undefined
-        }
-        speakingCharacter={speakingCharacter}
-        speakingSpeechId={activeSpeech?.id}
-        speakingLocalFrame={speakingLocalFrame}
-        frame={frame}
-        fps={fps}
-        resolveMouth={resolveMouth}
-        fontFamily={roundedFontFamily}
-      />
-      {sequences}
-    </AbsoluteFill>
-  );
-};
diff --git a/voicevox-remotion-template/src/yukkuri-composition/index.tsx b/voicevox-remotion-template/src/yukkuri-composition/index.tsx
new file mode 100644
index 0000000..4465d37
--- /dev/null
+++ b/voicevox-remotion-template/src/yukkuri-composition/index.tsx
@@ -0,0 +1,215 @@
+import React from "react";
+import {
+  AbsoluteFill,
+  interpolate,
+  Sequence,
+  spring,
+  useCurrentFrame,
+  useVideoConfig,
+} from "remotion";
+import {
+  characters,
+  initialVisibleCharacters,
+  timeline,
+  type CharacterId,
+  type TimelineEvent,
+} from "../data/yukkuri-composition/script";
+import {
+  audioFileForSpeech,
+  GAP_FRAMES,
+  durationForTimelineEvent,
+  hasAudioForSpeech,
+} from "../data/yukkuri-composition/timing";
+import {roundedFontFamily} from "../fonts";
+import {
+  VQCaptionOverlay,
+  VQCharacterStage,
+  VQSpeechOverlay,
+  VQWarmGradientBackground,
+  type VQMouthResolver,
+} from "../lib/VQRemotionLib";
+import {getMouthForSpeechFrame} from "../lipsync/manifest";
+
+type ScheduledTimelineEvent = Readonly<{
+  event: TimelineEvent;
+  from: number;
+  durationInFrames: number;
+  visibleCharacters: CharacterId[];
+  focusedCharacter: CharacterId;
+}>;
+
+const scheduleTimeline = (fps: number): ScheduledTimelineEvent[] => {
+  let cursor = 0;
+  const visibleCharacters = new Set<CharacterId>(initialVisibleCharacters);
+
+  return timeline.map((event, index) => {
+    visibleCharacters.add(event.character);
+
+    const durationInFrames = durationForTimelineEvent(event, fps);
+    const scheduledEvent = {
+      event,
+      from: cursor,
+      durationInFrames,
+      visibleCharacters: Array.from(visibleCharacters),
+      focusedCharacter: event.character,
+    };
+
+    cursor += durationInFrames;
+    if (index
< timeline.length - 1) { + cursor += GAP_FRAMES; + } + + return scheduledEvent; + }); +}; + +const activeSegmentForFrame = ( + scheduledEvents: ScheduledTimelineEvent[], + frame: number +) => { + let activeSegment = scheduledEvents[0]; + + for (const scheduledEvent of scheduledEvents) { + if (frame >= scheduledEvent.from) { + activeSegment = scheduledEvent; + } else { + break; + } + } + + return activeSegment; +}; + +const clampInterpolation = { + extrapolateLeft: "clamp", + extrapolateRight: "clamp", +} as const; + +const yukkuriSubtitleOptions = { + fontFamily: roundedFontFamily, + fontSize: 36, + lineHeight: 1.4, + backgroundColor: "rgba(255, 255, 255, 0.88)", +} as const; + +const resolveMouth: VQMouthResolver<CharacterId> = ({ + speechId, + speakingLocalFrame, + fps, +}) => getMouthForSpeechFrame(speechId, speakingLocalFrame, fps); + +const Title: React.FC<Readonly<{progress: number}>> = ({progress}) => { + const opacity = interpolate(progress, [0, 1], [0, 1], clampInterpolation); + const translateY = interpolate(progress, [0, 1], [-30, 0], clampInterpolation); + + return ( + <div + style={{ + fontFamily: roundedFontFamily, + fontSize: 54, + fontWeight: 700, + color: "#1f2a44", + letterSpacing: 1, + textAlign: "center", + marginTop: 40, + opacity, + transform: `translateY(${translateY}px)`, + textShadow: "0 6px 18px rgba(31, 42, 68, 0.2)", + }} + > + ネコミミはなぜかわいい? 
+ </div> + ); +}; + +const TimelineOverlay: React.FC<Readonly<{event: TimelineEvent}>> = ({event}) => { + if (event.type === "say") { + const character = characters[event.character]; + + return ( + <VQSpeechOverlay + speech={event} + speakerName={character.displayName} + accentColor={character.avatar.accentColor} + hasAudio={hasAudioForSpeech} + getAudioPath={audioFileForSpeech} + subtitleOptions={yukkuriSubtitleOptions} + /> + ); + } + + return ( + <VQCaptionOverlay + text={event.caption} + subtitleOptions={yukkuriSubtitleOptions} + /> + ); +}; + +const keyForEvent = (event: TimelineEvent, index: number) => { + if (event.type === "say") { + return event.id; + } + + return `show-${event.character}-${index}`; +}; + +export const YukkuriComposition: React.FC = () => { + const frame = useCurrentFrame(); + const {fps} = useVideoConfig(); + const scheduledEvents = scheduleTimeline(fps); + const activeSegment = activeSegmentForFrame(scheduledEvents, frame); + const isInsideActiveSegment = + frame < activeSegment.from + activeSegment.durationInFrames; + + const titleProgress = spring({ + frame, + fps, + config: {damping: 18, mass: 0.6}, + }); + const activeSpeech = + isInsideActiveSegment && activeSegment.event.type === "say" + ? activeSegment.event + : undefined; + const speakingCharacter = activeSpeech?.character; + const speakingLocalFrame = activeSpeech ? 
frame - activeSegment.from : 0; + + const sequences = scheduledEvents.map((scheduledEvent, index) => ( + <Sequence + key={keyForEvent(scheduledEvent.event, index)} + from={scheduledEvent.from} + durationInFrames={scheduledEvent.durationInFrames} + premountFor={Math.min(fps, scheduledEvent.from)} + > + <TimelineOverlay event={scheduledEvent.event} /> + </Sequence> + )); + + return ( + <AbsoluteFill + style={{ + display: "flex", + flexDirection: "column", + alignItems: "center", + }} + > + <VQWarmGradientBackground /> + <Title progress={titleProgress} /> + <VQCharacterStage + characters={characters} + visibleCharacters={activeSegment.visibleCharacters} + focusedCharacter={ + isInsideActiveSegment ? activeSegment.focusedCharacter : undefined + } + speakingCharacter={speakingCharacter} + speakingSpeechId={activeSpeech?.id} + speakingLocalFrame={speakingLocalFrame} + frame={frame} + fps={fps} + resolveMouth={resolveMouth} + fontFamily={roundedFontFamily} + /> + {sequences} + </AbsoluteFill> + ); +};
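
As a review aid for the timing logic that moved into `src/data/yukkuri-composition/timing.ts`, here is a minimal self-contained sketch of its frame math. The names `Entry`, `durationInFrames`, and `totalFrames` are illustrative, not part of the patch: manifest durations win when present; otherwise the duration is estimated from text length at roughly 0.11 s per character with a 1.2 s floor, and `GAP_FRAMES` is inserted between consecutive timeline events.

```typescript
// Sketch of the duration rules in timing.ts (illustrative names, same math).
type Entry = {durationSeconds: number};

const FPS = 30;
const GAP_FRAMES = 6;

const durationInFrames = (
  entry: Entry | undefined,
  text: string,
  fps = FPS
): number => {
  // A manifest entry with a finite duration takes precedence.
  if (entry && Number.isFinite(entry.durationSeconds)) {
    return Math.max(1, Math.ceil(entry.durationSeconds * fps));
  }
  // Fallback: estimate ~0.11 s per character, with a 1.2 s floor.
  const estimatedSeconds = Math.max(1.2, text.length * 0.11);
  return Math.ceil(estimatedSeconds * fps);
};

// Total = sum of event durations plus a gap after every event but the last.
const totalFrames = (frames: number[]): number =>
  frames.reduce(
    (sum, f, i) => sum + f + (i < frames.length - 1 ? GAP_FRAMES : 0),
    0
  );

// zunda-001 in the manifest: 3.0613… s → ceil(91.84) = 92 frames.
console.log(durationInFrames({durationSeconds: 3.0613333333333332}, "")); // → 92
// Three events of 92, 30, 45 frames with two 6-frame gaps in between.
console.log(totalFrames([92, 30, 45])); // → 179
```

This also makes the preview behavior in the README concrete: lines without generated audio still get a deterministic, text-length-based duration.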