zhisbug's submissions

1.		Can LLMs play real-time games like Super Mario (other than Pokemon Red)? (twitter.com/haoailab)
		3 points by zhisbug 11 months ago \| past \| 1 comment
2.		Sliding Tile Attention: A New Method That Speeds Up HunyuanVideo's Outputs by 3x (reddit.com)
		2 points by zhisbug 12 months ago \| past \| 1 comment
3.		Fast Video Generation with Sliding Tile Attention (hao-ai-lab.github.io)
		12 points by zhisbug 12 months ago \| past \| 2 comments
4.		More Efficient Chain-of-Thought Reasoning Through Certainty Probing (huggingface.co)
		6 points by zhisbug 12 months ago \| past \| 2 comments
5.		AI Space Escape: Playing Games While Evaluating LLM Reasoning (lmgame.org)
		13 points by zhisbug on Feb 11, 2025 \| past \| 2 comments
6.		Efficient LLM Scheduling by Learning to Rank (hao-ai-lab.github.io)
		2 points by zhisbug on Jan 14, 2025 \| past \| 1 comment
7.		FastVideo: a lightweight framework for accelerating large video diffusion models (github.com/hao-ai-lab)
		110 points by zhisbug on Dec 17, 2024 \| past \| 24 comments
8.		MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving (hao-ai-lab.github.io)
		2 points by zhisbug on June 24, 2024 \| past \| 1 comment
9.		Consistency LLM: converting LLMs to parallel decoders accelerates inference 3.5x (hao-ai-lab.github.io)
		461 points by zhisbug on May 8, 2024 \| past \| 98 comments
10.		Throughput Is Not All You Need: Maxing Goodput in LLM Serving via Disaggregation (hao-ai-lab.github.io)
		5 points by zhisbug on March 18, 2024 \| past \| 1 comment
11.		Break the Sequential Dependency of LLM Inference Using Lookahead Decoding (lmsys.org)
		17 points by zhisbug on Nov 21, 2023 \| past \| 2 comments
12.		Important and MUST-KNOW techniques for a 2023 LLM serving system (twitter.com/haozhangml)
		1 point by zhisbug on Sept 13, 2023 \| past
13.		Fastchat-T5: 4x smaller but more powerful than Dolly-v2, commercial use ready (twitter.com/lmsysorg)
		7 points by zhisbug on April 28, 2023 \| past \| 1 comment
14.		Alpa: Auto-parallelizing large model training and inference (by UC Berkeley) (github.com/alpa-projects)
		7 points by zhisbug on June 23, 2022 \| past \| 1 comment