Abstract: In traditional audio captioning methods, a model is usually trained in a fully supervised manner using a human-annotated dataset containing audio-text pairs and then evaluated on the test ...
Abstract: This research focuses on generating image captions using Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) models. As deep learning advances, the availability of large ...
We are creating multimedia contents everyday and everywhere. While automatic content generation has played a fundamental challenge to multimedia community for decades, recent advances of deep learning ...
Google has removed dozens of AI-generated videos that depicted Disney-owned characters after receiving a cease and desist letter from the studio on Wednesday. Disney flagged the YouTube links to the ...
Background: Short-video platforms have become major channels for public access to health information in the digital era. However, the low barriers to content creation and the increasing use of ...