name: display-captions description: Displaying captions in Remotion with TikTok-style pages and word highlighting metadata:
This guide explains how to display captions in Remotion, assuming you already have captions in the Caption format.
Read Transcribing audio for how to generate captions.
First, the @remotion/captions package needs to be installed. If it is not installed, use the following command:
npx remotion add @remotion/captions
First, fetch your captions JSON file. Use useDelayRender() to hold the render until the captions are loaded:
import { useState, useEffect, useCallback } from "react";
import { AbsoluteFill, staticFile, useDelayRender } from "remotion";
import type { Caption } from "@remotion/captions";
export const MyComponent: React.FC = () => {
const [captions, setCaptions] = useState<Caption[] | null>(null);
const { delayRender, continueRender, cancelRender } = useDelayRender();
const [handle] = useState(() => delayRender());
const fetchCaptions = useCallback(async () => {
try {
// Assuming captions.json is in the public/ folder.
const response = await fetch(staticFile("captions123.json"));
const data = await response.json();
setCaptions(data);
continueRender(handle);
} catch (e) {
cancelRender(e);
}
}, [continueRender, cancelRender, handle]);
useEffect(() => {
fetchCaptions();
}, [fetchCaptions]);
if (!captions) {
return null;
}
return <AbsoluteFill>{/* Render captions here */}</AbsoluteFill>;
};
Use createTikTokStyleCaptions() to group captions into pages. The combineTokensWithinMilliseconds option controls how many words appear at once:
import { useMemo } from "react";
import { createTikTokStyleCaptions } from "@remotion/captions";
import type { Caption } from "@remotion/captions";
// How often captions should switch (in milliseconds)
// Higher values = more words per page
// Lower values = fewer words (more word-by-word)
const SWITCH_CAPTIONS_EVERY_MS = 1200;
const { pages } = useMemo(() => {
return createTikTokStyleCaptions({
captions,
combineTokensWithinMilliseconds: SWITCH_CAPTIONS_EVERY_MS,
});
}, [captions]);
Map over the pages and render each one in a <Sequence>. Calculate the start frame and duration from the page timing:
import { Sequence, useVideoConfig, AbsoluteFill } from "remotion";
import type { TikTokPage } from "@remotion/captions";
const CaptionedContent: React.FC = () => {
const { fps } = useVideoConfig();
return (
<AbsoluteFill>
{pages.map((page, index) => {
const nextPage = pages[index + 1] ?? null;
const startFrame = (page.startMs / 1000) * fps;
const endFrame = Math.min(
nextPage ? (nextPage.startMs / 1000) * fps : Infinity,
startFrame + (SWITCH_CAPTIONS_EVERY_MS / 1000) * fps,
);
const durationInFrames = endFrame - startFrame;
if (durationInFrames <= 0) {
return null;
}
return (
<Sequence
key={index}
from={startFrame}
durationInFrames={durationInFrames}
>
<CaptionPage page={page} />
</Sequence>
);
})}
</AbsoluteFill>
);
};
The captions are whitespace sensitive. You should include spaces in the text field before each word. Use whiteSpace: "pre" to preserve the whitespace in the captions.
Put captioning logic in a separate component.
Make a new file for it.
A caption page contains tokens which you can use to highlight the currently spoken word:
import { AbsoluteFill, useCurrentFrame, useVideoConfig } from "remotion";
import type { TikTokPage } from "@remotion/captions";
const HIGHLIGHT_COLOR = "#39E508";
const CaptionPage: React.FC<{ page: TikTokPage }> = ({ page }) => {
const frame = useCurrentFrame();
const { fps } = useVideoConfig();
// Current time relative to the start of the sequence
const currentTimeMs = (frame / fps) * 1000;
// Convert to absolute time by adding the page start
const absoluteTimeMs = page.startMs + currentTimeMs;
return (
<AbsoluteFill style={{ justifyContent: "center", alignItems: "center" }}>
<div style={{ fontSize: 80, fontWeight: "bold", whiteSpace: "pre" }}>
{page.tokens.map((token) => {
const isActive =
token.fromMs <= absoluteTimeMs && token.toMs > absoluteTimeMs;
return (
<span
key={token.fromMs}
style={{ color: isActive ? HIGHLIGHT_COLOR : "white" }}
>
{token.text}
</span>
);
})}
</div>
</AbsoluteFill>
);
};
By default, put the captions alongside the video content, so the captions are in sync.
For each video, make a new captions JSON file.
<AbsoluteFill>
<Video src={staticFile("video.mp4")} />
<CaptionPage page={page} />
</AbsoluteFill>