Temporally Consistent Semantic Video Editing

Generative adversarial networks (GANs) can produce photorealistic images. A GAN takes a latent code as input and generates an image as the output. Conversely, GAN inversion techniques project a real image onto the latent space of a pretrained GAN and recover its corresponding latent code.

An example of semantic image segmentation. Image credit: B. Palac via Wikimedia, CC-BY-SA-4.0

Modifying the latent code enables high-level editing tasks. Changing the latent code so that the image changes in a semantically meaningful way is known as semantic editing.
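To make the idea concrete, here is a minimal sketch of a latent-code edit, assuming the edit can be expressed as a shift along a single direction in latent space. That linear-direction assumption is common in GAN editing work but is not stated in this article; the function name `edit_latent`, the 512-dimensional code, and the random direction are all hypothetical placeholders.

```python
import numpy as np


def edit_latent(w, direction, strength):
    """Shift a latent code along a unit-normalized editing direction.

    Assumption: the semantic edit is a linear move in latent space.
    `direction` is a hypothetical attribute direction, not a real one.
    """
    unit = direction / np.linalg.norm(direction)
    return w + strength * unit


# Illustrative usage with random placeholders instead of a real inverted code.
rng = np.random.default_rng(0)
w = rng.normal(size=512)           # stand-in for a latent code from GAN inversion
direction = rng.normal(size=512)   # stand-in for a learned semantic direction
w_edited = edit_latent(w, direction, strength=2.0)
```

Feeding `w_edited` to the generator would then yield the edited image; a larger `strength` gives a stronger edit.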

A recent paper on arXiv.org proposes a method for temporally consistent semantic video editing. A simple and effective flow-based approach is proposed to mitigate the temporal inconsistency of a video that has been edited directly, frame by frame. A two-phase optimization strategy updates both the latent code and the generator to preserve the video details. The method is model-agnostic and can be applied to different GAN inversion and manipulation techniques.

Generative adversarial networks (GANs) have demonstrated impressive image generation quality and semantic editing capability of real images, e.g., changing object classes, modifying attributes, or transferring styles. However, applying these GAN-based editing to a video independently for each frame inevitably results in temporal flickering artifacts. We present a simple yet effective method to facilitate temporally coherent video editing. Our core idea is to minimize the temporal photometric inconsistency by optimizing both the latent code and the pre-trained generator. We evaluate the quality of our editing on different domains and GAN inversion techniques and show favorable results against the baselines.
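One way to picture temporal photometric inconsistency is as the average pixel difference between a frame and its neighbor after the neighbor has been aligned using optical flow. The sketch below is an illustration under simplifying assumptions, not the paper's actual loss: the flow is given rather than estimated, warping uses nearest-neighbor sampling, and occlusions are ignored. The helper names `warp_nearest` and `photometric_inconsistency` are hypothetical.

```python
import numpy as np


def warp_nearest(frame, flow):
    """Backward-warp `frame` (H, W, C) using `flow` (H, W, 2) of (dx, dy).

    Output pixel (y, x) samples frame[y + dy, x + dx], rounded to the
    nearest pixel. Also returns a mask of samples that fell inside the frame.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    src_x = np.clip(src_x, 0, w - 1)
    src_y = np.clip(src_y, 0, h - 1)
    return frame[src_y, src_x], valid


def photometric_inconsistency(frame_t, frame_next, flow):
    """Mean absolute difference between frame_t and the flow-aligned next frame."""
    warped, valid = warp_nearest(frame_next, flow)
    if not valid.any():
        return 0.0
    diff = np.abs(frame_t.astype(float) - warped.astype(float))
    return float(diff[valid].mean())


# Toy example: the next frame is the current one shifted right by one pixel.
rng = np.random.default_rng(0)
frame_t = rng.random((4, 5, 3))
frame_next = np.roll(frame_t, 1, axis=1)
flow = np.zeros((4, 5, 2))
flow[..., 0] = 1.0  # every pixel moves one pixel to the right
aligned_error = photometric_inconsistency(frame_t, frame_next, flow)
unaligned_error = photometric_inconsistency(frame_t, frame_next, np.zeros((4, 5, 2)))
```

In this toy case the correct flow fully explains the motion, so the aligned error vanishes while the unaligned one does not. In an edited video, any error remaining after alignment corresponds to flicker, which is the quantity an optimization over the latent code and generator would try to reduce.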

Research article: Xu, Y., AlBahar, B., and Huang, J.-B., “Temporally Consistent Semantic Video Editing”, 2022. Link: https://arxiv.org/abs/2206.10590