Repositories list
33 repositories
ZhiLight (Public): A highly optimized LLM inference acceleration engine for Llama and its variants.
TLLM_QMM (Public): TLLM_QMM extracts the quantized kernel implementations from NVIDIA's TensorRT-LLM, removes the NVInfer dependency, and exposes an easy-to-use PyTorch module. We modified the dequantization and weight preprocessing to align with popular quantization algorithms such as AWQ and GPTQ, and combined them with new FP8 quantization.
norm (Public)
griffith (Public): A React-based web video player
- 🎆 A well-designed local image and video selector for Android
redis-shard (Public)
zetta-client-go (Public)
zetta-proto (Public)
- An efficient and effective learning to rank algorithm by mining information across ranking candidates. This repository contains the TensorFlow implementation of SERank model. The code is developed based on TF-Ranking.
chaika (Public)
- Graphite On VictoriaMetrics
cuBERT (Public): Fast implementation of BERT inference directly on NVIDIA (CUDA, cuBLAS) and Intel MKL
kids (Public)
zetta-client-java (Public)
presto-connectors (Public)
tache (Public)
SugarAdapter (Public)
RxLifecycle (Public)
hive (Public)
protobuf (Public)
phabricator (Public)
libphutil (Public)
puppet-cdh (Public)