TurboAttention: Efficient Attention Approximation For High Throughputs LLMs
Hao Kang, Srikant Bharadwaj, James Hensman, Tushar Krishna, Victor Ruehle, Saravan Rajmohan
arXiv preprint arXiv:2412.08585 | December 2024
Rya Sanovar, Srikant Bharadwaj, Renee St. Amant, Victor Ruehle, Saravan Rajmohan
May 2024
Srikant Bharadwaj, Shomit Das, K. Mazumdar, Bradford M. Beckmann, Stephen Kosonocky
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems | March 2024
Srikant Bharadwaj, Guilherme Cox, Tushar Krishna, Abhishek Bhattacharjee
2018 International Symposium on Microarchitecture | October 2018