
Powered By RTX 4090
Folix
Updated 2025-01-10
Workflow Introduction
MMAudio generates synchronized audio given video and/or text inputs. Our key innovation is multimodal joint training, which allows training on a wide range of audio-visual and audio-text datasets. Moreover, a synchronization module aligns the generated audio with the video frames.
Node Information
7 nodes
MMAudioFeatureUtilsLoader
MMAudioModelLoader
MMAudioSampler
PreviewAudio
VHS_LoadVideo
VHS_VideoCombine
VHS_VideoInfo
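
The seven nodes above can be read as a small dependency graph. The sketch below is only an illustration of how such a graph might be ordered for execution: the node names are taken from the list, but the edges between them are an assumption (video loading feeding the sampler, the sampler feeding the audio preview and the final video), not something read from the actual workflow file.

```python
from graphlib import TopologicalSorter

# Hypothetical wiring: each node maps to the set of nodes it is ASSUMED to
# depend on. Node names come from the list above; the edges are an
# illustrative guess and are not taken from the real workflow file.
ASSUMED_DEPENDENCIES = {
    "VHS_LoadVideo": set(),
    "MMAudioModelLoader": set(),
    "MMAudioFeatureUtilsLoader": set(),
    "VHS_VideoInfo": {"VHS_LoadVideo"},
    "MMAudioSampler": {
        "VHS_LoadVideo",
        "VHS_VideoInfo",
        "MMAudioModelLoader",
        "MMAudioFeatureUtilsLoader",
    },
    "PreviewAudio": {"MMAudioSampler"},
    "VHS_VideoCombine": {"VHS_LoadVideo", "MMAudioSampler"},
}


def execution_order(dependencies):
    """Return one valid order in which the nodes could run."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, node in enumerate(execution_order(ASSUMED_DEPENDENCIES), start=1):
        print(f"{step}. {node}")
```

Any order the function returns places the loaders before MMAudioSampler and the sampler before PreviewAudio and VHS_VideoCombine; the exact position of independent nodes may vary.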