High Efficient Energy Compaction Network for Image Transform
Published
SPIE Optical Engineering + Applications (SPIE)
Abstract
For decades, the Discrete Cosine Transform (DCT) has played a crucial role in video and image compression, ever since Chen and Pratt proposed a DCT-based image compression application. The energy compaction property of the DCT is highly efficient for compression when combined with an entropy coder and a specific scan order. By exploiting this property, the DCT has been widely used in video and image compression over the past 20 years, from JPEG, the well-known image compression format, to High Efficiency Video Coding (HEVC), the latest video compression standard. Since the DCT was adopted for image compression, several transforms have been proposed to achieve better compression performance. The best known of these is the Karhunen–Loève transform (KLT), which has the best energy compaction performance. However, the KLT must transmit its transform basis as extra information, which the DCT does not require; as a result, its compression performance is worse and its complexity higher than those of the DCT. To achieve the energy compaction performance of the KLT without extra information, we propose TransNet, a machine learning network for image/video transform. TransNet is trained to achieve better energy compaction than the DCT while maintaining image quality. To find the optimal balance between reconstructed image quality and energy compaction, we propose a new loss function based on the orthogonal transform property and a regularization term. To evaluate the compression performance of the proposed network, we compared the DCT and TransNet using a JPEG encoder. In terms of BD-rate based on Peak Signal-to-Noise Ratio (PSNR), the proposed network shows a gain of about 11% over the DCT.
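The energy compaction comparison between the DCT and the KLT described above can be illustrated with a small numerical sketch. This is not TransNet or the paper's experiment: the helper names (`dct_matrix`, `klt_matrix`, `energy_in_top_k`, `ar1_samples`) are hypothetical, and the data are synthetic first-order autoregressive vectors with an assumed correlation of 0.95, used here as a rough stand-in for rows of neighbouring pixels. The sketch measures the fraction of signal energy captured by the two strongest coefficients under each transform, and shows why the KLT basis has to be derived from the data itself while the DCT basis is fixed.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one fixed basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)  # rescale the DC row so the matrix is orthonormal
    return m

def klt_matrix(samples):
    """KLT basis: eigenvectors of the sample second-moment matrix.

    The basis depends on the data, which is why a codec using the KLT
    would have to signal it to the decoder.
    """
    r = samples.T @ samples / samples.shape[0]
    eigvals, eigvecs = np.linalg.eigh(r)
    order = np.argsort(eigvals)[::-1]  # strongest component first
    return eigvecs[:, order].T

def energy_in_top_k(basis, samples, k):
    """Fraction of average energy held by the k strongest coefficients."""
    coeffs = samples @ basis.T
    energy = np.mean(coeffs ** 2, axis=0)
    return np.sort(energy)[::-1][:k].sum() / energy.sum()

def ar1_samples(num, n, rho, seed=0):
    """Synthetic AR(1) vectors (assumed model, unit variance per element)."""
    rng = np.random.default_rng(seed)
    x = np.empty((num, n))
    x[:, 0] = rng.standard_normal(num)
    for j in range(1, n):
        noise = rng.standard_normal(num)
        x[:, j] = rho * x[:, j - 1] + np.sqrt(1.0 - rho ** 2) * noise
    return x

samples = ar1_samples(num=5000, n=8, rho=0.95)
dct = dct_matrix(8)
klt = klt_matrix(samples)

pixel_ratio = energy_in_top_k(np.eye(8), samples, k=2)  # no transform
dct_ratio = energy_in_top_k(dct, samples, k=2)
klt_ratio = energy_in_top_k(klt, samples, k=2)

print(f"no transform: {pixel_ratio:.3f}")
print(f"DCT:          {dct_ratio:.3f}")
print(f"KLT:          {klt_ratio:.3f}")
```

On data like this, the KLT ratio is never lower than the DCT ratio, because the KLT basis is fitted to the same samples it is evaluated on. A learned transform with fixed weights, as proposed here, aims at KLT-like compaction without sending a basis per image.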