AI Frameworks Engineer @intel · SH (UTC +08:00)
- vllm-fork (Public)
  Forked from vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs

- llm-compressor-fork (Public)
  Forked from vllm-project/llm-compressor: Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

- vllm-gaudi (Public)
  Forked from vllm-project/vllm-gaudi: Community-maintained hardware plugin for vLLM on Intel Gaudi
  Python · Updated Nov 25, 2025

- gpucodes (Public)
  Forked from jalexine/gpucodes: Codes documenting my GPU learning journey
  Cuda · Updated Nov 19, 2025

- auto-round-fork (Public)
  Forked from intel/auto-round: SOTA weight-only quantization algorithm for LLMs
  Python · Apache License 2.0 · Updated Nov 4, 2025

- oneAPI-samples-fork (Public)
  Forked from oneapi-src/oneAPI-samples: Samples for Intel® oneAPI Toolkits
  C++ · MIT License · Updated Nov 1, 2025

- sglang-fork (Public)
  Forked from sgl-project/sglang: SGLang is a fast serving framework for large language models and vision language models.
  Python · Apache License 2.0 · Updated Oct 24, 2025

- native-sparse-attention-fork (Public)
  Forked from fla-org/native-sparse-attention: 🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
  Python · MIT License · Updated Oct 2, 2025

- compressed-tensors-fork (Public)
  Forked from vllm-project/compressed-tensors: A safetensors extension to efficiently store sparse quantized tensors on disk

- triton-fork (Public)
  Forked from triton-lang/triton: Development repository for the Triton language and compiler
  MLIR · MIT License · Updated Sep 19, 2025

- lm-eval-fork (Public)
  Forked from EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models.
  Python · MIT License · Updated Sep 4, 2025

- LongCat-Flash-Chat (Public)
  Forked from meituan-longcat/LongCat-Flash-Chat
  MIT License · Updated Sep 1, 2025

- torchao-fork (Public)
  Forked from pytorch/ao: The torchao repository contains APIs and workflows for quantizing and pruning GPU models.
  Python · Other · Updated Aug 22, 2025

- transformers (Public)
  Forked from huggingface/transformers: 🤗 Transformers: State-of-the-art machine learning for PyTorch, TensorFlow, and JAX.
  Python · Apache License 2.0 · Updated Aug 6, 2025

- SageAttention-Fork (Public)
  Forked from thu-ml/SageAttention: Quantized attention achieves speedups of 2-5x and 3-11x over FlashAttention and xformers, without losing end-to-end metrics across language, image, and video models.
  Cuda · Apache License 2.0 · Updated Jul 16, 2025

- vllm-hpu-extension-fork (Public)
  Forked from HabanaAI/vllm-hpu-extension
  Python · Apache License 2.0 · Updated Jul 1, 2025

- microxcaling-fork (Public)
  Forked from microsoft/microxcaling: PyTorch emulation library for Microscaling (MX)-compatible data formats
  Python · MIT License · Updated Jun 18, 2025

- flashinfer-fork (Public)
  Forked from flashinfer-ai/flashinfer: FlashInfer: Kernel library for LLM serving
  Cuda · Apache License 2.0 · Updated Mar 11, 2025

- optimum-habana (Public)
  Forked from huggingface/optimum-habana: Easy and lightning-fast training of 🤗 Transformers on the Habana Gaudi processor (HPU)
  Python · Apache License 2.0 · Updated Dec 24, 2024

- torch-xpu-ops-fork (Public)
  Forked from intel/torch-xpu-ops
  C++ · Apache License 2.0 · Updated Dec 17, 2024

- pytorch-fork (Public)
  Forked from pytorch/pytorch: Tensors and dynamic neural networks in Python with strong GPU acceleration
  Python · Other · Updated Dec 4, 2024

- intel-extension-for-pytorch (Public)
  Forked from intel/intel-extension-for-pytorch: A Python package extending official PyTorch to easily obtain performance gains on Intel platforms
  Python · Apache License 2.0 · Updated Dec 3, 2024