Blog
As state-of-the-art AI models have grown ever larger, model compression has been attracting increasing attention as a way to deploy models on edge devices without relying on cloud servers.
Publications
Progressive Mixed-Precision Decoding for Efficient LLM Inference
Authors: Hao (Mark) Chen, Fuwen Tan, Alexandros Kouris, Royson Lee, Stylianos Venieris
Published in: International Conference on Learning Representations (ICLR)
Date: 2025-04-25
QBB: Quantization with Binary Bases for LLMs
Authors: Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos
Published in: Neural Information Processing Systems (NeurIPS)
Date: 2024-12-11
Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
Authors: Junhan Kim, Chungman Lee, Eulrang Cho, Kyungphil Park, Ho-young Kim, Joonyoung Kim, Yongkweon Jeon
News
Research in Automatic Speech Recognition (ASR) continues to show that larger models yield better results. But as state-of-the-art networks grow to billions of parameters, deploying these models on device becomes increasingly difficult.
At the British Machine Vision Conference (BMVC) 2021, we presented work on mobile inverse tone mapping. In this work, we tackled converting high-resolution images to high dynamic range (HDR) images in real time with a mobile-focused model that uses low-bit quantization of its parameters to accelerate inference.
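To make the idea concrete, here is a minimal sketch of the kind of low-bit weight quantization that work builds on. The function names and the simple symmetric per-tensor 4-bit scheme below are illustrative assumptions for this sketch, not the method from the BMVC paper.

```python
import numpy as np

def quantize_symmetric(weights: np.ndarray, bits: int = 4):
    """Symmetric uniform quantization of a weight tensor to `bits` bits.

    Illustrative sketch only: one per-tensor scale, signed integer codes.
    """
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for signed 4-bit
    scale = np.abs(weights).max() / qmax        # map largest magnitude to qmax
    codes = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return codes, scale

def dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integer codes."""
    return codes.astype(np.float32) * scale

# Quantize a random weight matrix and check the reconstruction error.
w = np.random.randn(256, 256).astype(np.float32)
codes, scale = quantize_symmetric(w, bits=4)
w_hat = dequantize(codes, scale)
print("mean abs error:", np.abs(w - w_hat).mean())
```

Storing the integer codes instead of float32 weights shrinks the model roughly 8x at 4 bits, and low-bit integer arithmetic is what mobile hardware can execute quickly, which is the source of the inference speedup.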