Model merging lessons in The Waifu Research Department

Interconnects - A podcast by Nathan Lambert

Note: some of the audio in the second half is a little wonky, but the general voice was upgraded so hopefully it's a little less "poppy" until then! I'm trying to fix little pronunciation problems on a weekly basis. Thanks to my early fans! It'll keep improving. E.g. some of the months were wonky.

When what seems like pure LLM black magic is actually supported by the literature.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/model-merging

00:00 Model merging lessons in The Waifu Research Department
02:21 How and why does model merging work?
07:13 Aside: merging vs. ensembles vs. mixture of experts
08:21 Why are people doing this?
11:22 Tools & Links
11:51 Brief (visual) literature review
12:07 Full model merging and recent methods
15:55 Weight averaging during pretraining
17:18 LoRA merging
17:53 More background

Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_005.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_016.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_042.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_051.png
Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_055.png
Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_058.png
Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_060.png
Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_062.png
Figure 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_065.png
Figure 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_075.png
Figure 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_077.png
Figure 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_084.png

Get full access to Interconnects at www.interconnects.ai/subscribe
