Publications

UPR-Net: A Unified Pyramid Recurrent Network for Video Frame Interpolation

Published

International Journal of Computer Vision (IJCV)

Date

2024.07.17

Abstract

Flow-guided synthesis provides a popular framework for video frame interpolation, where optical flow is first estimated to warp the input frames, and the intermediate frame is then synthesized from the warped representations. Within this framework, optical flow is typically estimated coarse-to-fine by a pyramid network, but the intermediate frame is commonly synthesized in a single pass, missing the opportunity to refine possibly imperfect synthesis in high-resolution and large-motion cases. While cascading several synthesis networks is a natural idea, it is nontrivial to unify iterative estimation of both optical flow and the intermediate frame into a compact, flexible, and general framework. In this paper, we present UPR-Net, a novel Unified Pyramid Recurrent Network for frame interpolation. Cast in a flexible pyramid framework, UPR-Net exploits lightweight recurrent modules for both bi-directional flow estimation and intermediate frame synthesis. At each pyramid level, it leverages the estimated bi-directional flow to generate forward-warped representations for frame synthesis; across pyramid levels, it enables iterative refinement of both the optical flow and the intermediate frame. We show that our iterative synthesis significantly improves interpolation robustness in large-motion cases, and that the recurrent module design enables flexible resolution-aware adaptation at test time. When trained on low-resolution data, UPR-Net achieves excellent performance on both low- and high-resolution benchmarks. Despite being extremely lightweight (1.7M parameters), the base version of UPR-Net competes favorably with many methods that rely on much heavier architectures. Code and trained models are publicly available at: https://github.com/srcn-ivl/UPR-Net.
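
The coarse-to-fine recurrence described in the abstract can be sketched roughly as below. This is a minimal illustrative sketch, not the actual UPR-Net code (which is available at the linked repository): the callables flow_net, synth_net, and warp_fn are hypothetical placeholders for the bi-directional flow estimator, the frame synthesis module, and the forward-warping operator.

```python
import torch
import torch.nn.functional as F


def interpolate_pyramid(frame0, frame1, flow_net, synth_net, warp_fn, num_levels=3):
    """Coarse-to-fine recurrent interpolation loop (illustrative sketch).

    flow_net, synth_net, and warp_fn are hypothetical callables; the same
    flow_net / synth_net weights are reused at every pyramid level.
    """
    # Build image pyramids, ordered from coarsest to finest level.
    pyr0 = [F.avg_pool2d(frame0, 2 ** k) for k in reversed(range(num_levels))]
    pyr1 = [F.avg_pool2d(frame1, 2 ** k) for k in reversed(range(num_levels))]

    flow, mid = None, None
    for img0, img1 in zip(pyr0, pyr1):
        if flow is not None:
            # Upsample the previous estimates to the current (finer) level;
            # flow vectors are scaled by 2 to match the doubled resolution.
            flow = 2.0 * F.interpolate(flow, size=img0.shape[-2:],
                                       mode="bilinear", align_corners=False)
            mid = F.interpolate(mid, size=img0.shape[-2:],
                                mode="bilinear", align_corners=False)
        # Recurrent bi-directional flow estimation, refining the upsampled flow.
        flow = flow_net(img0, img1, flow)   # e.g. [B, 4, H, W]: flows toward time t
        # Forward-warp both inputs toward the intermediate time step.
        warped0 = warp_fn(img0, flow[:, :2])
        warped1 = warp_fn(img1, flow[:, 2:])
        # Recurrent synthesis, refining the previous intermediate-frame estimate.
        mid = synth_net(warped0, warped1, mid)
    return mid
```

Because the same lightweight modules are applied at every level, the number of pyramid levels can simply be increased at test time for high-resolution or large-motion inputs, which is the resolution-aware adaptation mentioned in the abstract.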