Overview
Generate audio that matches the given video content and prompt using ThinkSound.
- Output Type: video
- Estimated Cost: 0.5 * duration credits
- Handler: replicate
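The cost formula above can be sketched as a small helper. This is only an illustration: the function name `estimate_credits` is hypothetical, and it assumes that "duration" is the video length in seconds, which this page does not state.

```python
def estimate_credits(duration: float, rate: float = 0.5) -> float:
    """Estimate the credit cost as rate * duration.

    Assumption: `duration` is the video length in seconds; the page
    only says "0.5 * duration credits" without naming the unit.
    """
    if duration < 0:
        raise ValueError("duration must be non-negative")
    return rate * duration


if __name__ == "__main__":
    # A 10-unit-long video at the listed rate of 0.5 credits per unit.
    print(estimate_credits(10))
```

Actual billing may round or apply minimums, so treat the result as an estimate only.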
Parameters
Required Parameters
URL of the base video
Optional Parameters
Brief description of the video content
A short caption describing what’s happening in the video to help the model understand the context.
- Label: Caption
Detailed description of the sound generation process
A detailed description that begins with the sound process and includes texture, atmosphere, and timing details. This helps the model understand exactly what audio to generate and how it should match the video.
- Label: Chain of thought
Classifier-free guidance scale
Controls how closely the model follows the prompt. Higher values mean stricter adherence.
- Label: CFG Scale
- Minimum: 1
- Maximum: 20
Number of inference steps
More steps generally produce higher-quality audio but take longer to generate.
- Label: Inference steps
- Minimum: 10
- Maximum: 50
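The parameter limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's actual API: the function `build_inputs` and every dictionary key (`video`, `caption`, `cot`, `cfg_scale`, `inference_steps`) are hypothetical names chosen for illustration, since this page gives only the labels. It also assumes that the minimum and maximum values are inclusive and that inference steps must be an integer.

```python
from typing import Any, Dict, Optional

# Ranges taken from the parameter list; assumed to be inclusive.
CFG_SCALE_RANGE = (1, 20)
INFERENCE_STEPS_RANGE = (10, 50)


def build_inputs(
    video_url: str,
    caption: Optional[str] = None,
    chain_of_thought: Optional[str] = None,
    cfg_scale: Optional[float] = None,
    inference_steps: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate the parameters and return an input dictionary.

    All key names below are hypothetical placeholders.
    """
    if not video_url:
        raise ValueError("video_url is required")
    inputs: Dict[str, Any] = {"video": video_url}

    # Optional text fields are included only when provided.
    if caption is not None:
        inputs["caption"] = caption
    if chain_of_thought is not None:
        inputs["cot"] = chain_of_thought

    if cfg_scale is not None:
        low, high = CFG_SCALE_RANGE
        if not low <= cfg_scale <= high:
            raise ValueError(f"cfg_scale must be between {low} and {high}")
        inputs["cfg_scale"] = cfg_scale

    if inference_steps is not None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(inference_steps, bool) or not isinstance(inference_steps, int):
            raise TypeError("inference_steps must be an integer")
        low, high = INFERENCE_STEPS_RANGE
        if not low <= inference_steps <= high:
            raise ValueError(f"inference_steps must be between {low} and {high}")
        inputs["inference_steps"] = inference_steps

    return inputs
```

Omitted optional parameters are left out of the dictionary so that the service can apply its own defaults, which this page does not list.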