How to Quantize an LLM with GGUF or AWQ

Trelis Research

3,810 Subscribers

4,263 views since Nov 26, 2023

GGUF and AWQ Quantization Scripts
- Includes pushing model files to repo
Purchase here: https://buy.stripe.com/5kA6paaO9dmbcV...

ADVANCED Fine-tuning Repository Access
1. Quantization Scripts
2. Unsupervised + Supervised Fine-tuning Notebooks
3. Q&A Dataset Preparation + Cleaning Scripts
4. Scripts to create and use Embeddings
Learn More: https://trelis.com/advanced-fine-tuni...

Resources:
- Presentation Slides: https://tinyurl.com/2s58xnam
- Llama.cpp: https://github.com/ggerganov/llama.cpp
- AutoAWQ: https://github.com/casper-hansen/Auto...
- Runpod Affiliate Link (supports Trelis): https://tinyurl.com/yjxbdc9w
- AWQ paper: https://arxiv.org/pdf/2306.00978.pdf
- GPTQ paper: https://arxiv.org/pdf/2210.17323.pdf
- Ready-Quantized Models: https://huggingface.co/TheBloke/

Referenced Videos:
- Supervised Fine-tuning (with bitsandbytes): Embeddings vs Fine Tuning - Part 2, ...
- Tiny Llama (run a GGUF model on your laptop): Tiny Llama
- AWQ API setup and explanation: Double Inference Speed with AWQ Quant...

0:00 How to quantize a large language model
0:38 Why quantize a language model
1:30 What is quantization
2:23 Which quantization to use?
3:29 GGUF vs BNB vs AWQ vs GPTQ
10:01 How to quantize with AWQ
18:48 How to quantize with GGUF (GGML)
25:29 Recap
