Abstract: In this paper, we propose LoopLynx, a scalable dataflow architecture for efficient LLM inference that optimizes FPGA usage through a hybrid spatial-temporal design. The design of LoopLynx ...