Papers
arxiv:2304.14402

LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions

Published on Apr 27, 2023
Authors:

Abstract

Leveraging distilled knowledge from large instruction-tuned LLMs, LaMini-LM achieves competitive performance on NLP benchmarks using a fraction of the resources.

AI-generated summary

Large language models (LLMs) with instruction finetuning demonstrate superior generative capabilities. However, these models are resource intensive. To alleviate this issue, we explore distilling knowledge from instruction-tuned LLMs into much smaller ones. To this end, we carefully develop a large set of 2.58M instructions based on both existing and newly-generated instructions. In addition to being sizeable, we design our instructions to cover a broad set of topics to ensure diversity. A thorough investigation of our instruction data demonstrates their diversity, and we generate responses for these instructions using gpt-3.5-turbo. We then exploit the instructions to tune a host of models, dubbed LaMini-LM, of varying sizes, from both the encoder-decoder and the decoder-only families. We evaluate our models both automatically (on 15 different NLP benchmarks) and manually. Results show that our proposed LaMini-LM models are on par with competitive baselines while being nearly 10 times smaller in size.
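The distillation recipe the abstract describes (pair each instruction with a gpt-3.5-turbo response, then fine-tune small students on those pairs) can be sketched as a data-preparation step. This is a minimal illustrative sketch, not the paper's code: the function name and prompt template are assumptions, and the two output shapes reflect the two student families mentioned (decoder-only vs. encoder-decoder).

```python
def format_for_student(instruction: str, response: str, decoder_only: bool = True) -> dict:
    """Turn one distilled (instruction, response) pair into a training example.

    Hypothetical helper: the prompt template below is an assumption, not the
    exact format used by LaMini-LM.
    """
    if decoder_only:
        # Decoder-only students train on a single concatenated sequence.
        return {"text": f"Instruction: {instruction}\nResponse: {response}"}
    # Encoder-decoder students take the instruction as input and the
    # teacher's response as the target sequence.
    return {"input": f"Instruction: {instruction}", "target": response}


# Example pair (instruction from the dataset, response from the teacher model):
example = format_for_student("Name three primary colors.", "Red, blue, and yellow.")
print(example["text"])
```

Each of the 2.58M instruction-response pairs would be formatted this way before standard supervised fine-tuning of the student model.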


Get this paper in your agent:

hf papers read 2304.14402
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 24
Datasets citing this paper: 3
Spaces citing this paper: 178
Collections including this paper: 0