Generating single subject activity videos as a sequence of actions using 3D convolutional generative adversarial networks

Ahmad Arinaldi, Mohamad Ivan Fanany

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Humans have a remarkable capacity for imagination: the mind runs virtual simulations of scenarios, whether visual, auditory, or involving any other sense. These imaginings draw on experience gained while interacting with the real world, where the senses help the mind understand its surroundings. Current algorithms have not yet achieved this level of imagination, but a recent class of deep learning architectures, Generative Adversarial Networks (GANs), has proven capable of generating new and interesting images or videos from training data. In this sense, GANs can be used to mimic human imagination, since the visuals they generate are based on the data seen during training. In this paper, we combine Long Short-Term Memory (LSTM) networks and 3D GANs to generate videos. We use a 3D convolutional GAN to generate new human action videos from the training data. These short action clips are then combined into longer videos, in which a sequence of short actions forms a longer and more complex activity. To obtain the required sequence of actions, we use an LSTM network to translate a simple input text description into that sequence. The generated clips are then concatenated using a motion interpolation scheme to form a single video consisting of many generated actions. The result is a visualization of the input text description: a video of a subject performing the described activity.
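
The three stages described in the abstract (text to action sequence, per-action clip generation, concatenation with interpolated transitions) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `text_to_actions` is a lookup table standing in for the LSTM, `generate_clip` returns constant-intensity frames in place of the 3D convolutional GAN generator, and `interpolate` uses plain linear blending, which may differ from the motion interpolation scheme the paper actually uses. All names and values are hypothetical.

```python
from typing import Dict, List

Frame = List[float]  # toy stand-in: one flattened grayscale frame
Clip = List[Frame]   # a clip is an ordered list of frames

# Hypothetical lookup table; in the paper an LSTM performs this translation.
ACTION_TABLE: Dict[str, List[str]] = {
    "walk then wave": ["walk", "wave"],
    "walk, jump, then wave": ["walk", "jump", "wave"],
}

# Hypothetical per-action pixel intensity so that clips are distinguishable.
ACTION_INTENSITY: Dict[str, float] = {"walk": 0.0, "jump": 3.0, "wave": 6.0}


def text_to_actions(description: str) -> List[str]:
    """Stand-in for the LSTM: map a description to an action sequence."""
    return ACTION_TABLE[description]


def generate_clip(action: str, num_frames: int = 4, num_pixels: int = 3) -> Clip:
    """Stand-in for the GAN generator: a constant-intensity clip per action."""
    value = ACTION_INTENSITY[action]
    return [[value] * num_pixels for _ in range(num_frames)]


def interpolate(a: Frame, b: Frame, steps: int) -> Clip:
    """Linearly blend from frame a to frame b, excluding both endpoints."""
    frames: Clip = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)
        frames.append([(1 - t) * x + t * y for x, y in zip(a, b)])
    return frames


def build_video(description: str, transition_frames: int = 2) -> Clip:
    """Concatenate generated clips, inserting blended frames between them."""
    video: Clip = []
    for action in text_to_actions(description):
        clip = generate_clip(action)
        if video:
            # Bridge the last frame so far and the first frame of the new clip.
            video.extend(interpolate(video[-1], clip[0], transition_frames))
        video.extend(clip)
    return video
```

With the defaults above, `build_video("walk then wave")` yields 10 frames: four for each action plus two blended transition frames. In a real system the stand-ins would be replaced by trained models, and the transition step would operate on image tensors, not flat lists.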

Original language: English
Title of host publication: Artificial General Intelligence - 10th International Conference, AGI 2017, Proceedings
Editors: Tom Everitt, Ben Goertzel, Alexey Potapov
Publisher: Springer Verlag
Pages: 133-142
Number of pages: 10
ISBN (Print): 9783319637020
Publication status: Published - 2017
Event: 10th International Conference on Artificial General Intelligence, AGI 2017 - Melbourne, Australia
Duration: 15 Aug 2017 to 18 Aug 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10414 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th International Conference on Artificial General Intelligence, AGI 2017
Country/Territory: Australia
City: Melbourne
Period: 15/08/17 to 18/08/17

Keywords

  • 3D GAN
  • Activity video generation
  • LSTM
