Blog(0)
Research Areas(0)
Publications(6)
LittleBit: Ultra-Low Bit Quantization via Latent Factorization
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
MoDeGPT: Modular Decomposition for Large Language Model Compression
News(4)
Others(0)