UPR-Net: A Unified Pyramid Recurrent Network for Video Frame Interpolation
Published in the International Journal of Computer Vision (IJCV)
Abstract
Flow-guided synthesis provides a popular framework for video frame interpolation, where optical flow is first estimated to
warp the input frames, and the intermediate frame is then synthesized from the warped representations. Within this framework, optical
flow is typically estimated coarse-to-fine by a pyramid network, but the intermediate frame is commonly synthesized
in a single pass, missing the opportunity to refine possibly imperfect synthesis in high-resolution and large-motion cases.
While cascading several synthesis networks is a natural idea, it is nontrivial to unify iterative estimation of both optical flow
and intermediate frame into a compact, flexible, and general framework. In this paper, we present UPR-Net, a novel Unified
Pyramid Recurrent Network for frame interpolation. Cast in a flexible pyramid framework, UPR-Net exploits lightweight
recurrent modules for both bi-directional flow estimation and intermediate frame synthesis. At each pyramid level, it leverages
estimated bi-directional flow to generate forward-warped representations for frame synthesis; across pyramid levels, it enables
iterative refinement for both optical flow and intermediate frame. We show that our iterative synthesis significantly improves the
interpolation robustness on large motion cases, and the recurrent module design enables flexible resolution-aware adaptation
in testing. When trained on low-resolution data, UPR-Net can achieve excellent performance on both low- and high-resolution
benchmarks. Despite being extremely lightweight (1.7M parameters), the base version of UPR-Net competes favorably with
many methods that rely on much heavier architectures. Code and trained models are publicly available at: https://github.com/srcn-ivl/UPR-Net.
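The coarse-to-fine recurrence described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`estimate_residual_flow`, `synthesize`, `forward_warp`) are hypothetical stand-ins for the paper's learned recurrent modules, and the warp and blend are trivial placeholders so the sketch stays self-contained.

```python
# Hedged sketch of a unified pyramid recurrence: the SAME flow and
# synthesis modules are reused at every level, and both the flow and
# the intermediate frame are upsampled and refined level by level.
import numpy as np

def estimate_residual_flow(f0, f1, flow_up):
    # Stand-in for the recurrent bi-directional flow module; returns a
    # zero residual so the sketch runs without learned weights.
    return np.zeros_like(flow_up)

def forward_warp(frame, flow):
    # Placeholder warp (identity); the real model forward-warps the
    # input frames with the estimated bi-directional flow.
    return frame

def synthesize(w0, w1, mid_up):
    # Stand-in for the recurrent synthesis module: a simple blend of
    # the two warped representations.
    return 0.5 * (w0 + w1)

def upr_sketch(frame0, frame1, num_levels=3):
    """Iterate from the coarsest pyramid level to the finest, refining
    both the bi-directional flow and the intermediate frame."""
    flow, mid = None, None
    for level in reversed(range(num_levels)):
        scale = 2 ** level
        f0 = frame0[::scale, ::scale]  # crude image pyramid by striding
        f1 = frame1[::scale, ::scale]
        if flow is None:
            # Coarsest level: start from zero flow and an average frame.
            flow_up = np.zeros(f0.shape[:2] + (4,))  # two 2-D flows
            mid_up = 0.5 * (f0 + f1)
        else:
            # Upsample previous estimates (flow magnitudes double).
            flow_up = np.kron(flow, np.ones((2, 2, 1))) * 2.0
            mid_up = np.kron(mid, np.ones((2, 2, 1)))
        flow = flow_up + estimate_residual_flow(f0, f1, flow_up)
        w0 = forward_warp(f0, flow[..., :2])
        w1 = forward_warp(f1, flow[..., 2:])
        mid = synthesize(w0, w1, mid_up)
    return mid

frame0 = np.random.rand(8, 8, 3)
frame1 = np.random.rand(8, 8, 3)
out = upr_sketch(frame0, frame1)
```

Because the recurrent modules are shared across levels, the number of pyramid levels can be chosen at test time to match the input resolution and motion magnitude, which is the resolution-aware adaptation the abstract refers to.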