So like mp3, gzip and zstd? Why would you use an LLM for compression??
The research specifically looked at lossless algorithms, so things like gzip rather than mp3.
“For example, the 70-billion parameter Chinchilla model impressively compressed data to 8.3% of its original size, significantly outperforming gzip and LZMA2, which managed 32.3% and 23% respectively.”
However, they do say it's not especially practical at the moment, since gzip is a tiny executable compared to the many gigabytes of the LLM's dataset.
Do you need the dataset to do the compression? Is the trained model not effective on its own?
Well, from the article a dataset is required, but it doesn't always have to be the heavier one.
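To make the mechanics a bit more concrete, here's a rough sketch of the idea (my own toy code, not anything from the paper): the model supplies a probability for each next symbol, and an arithmetic coder would spend roughly -log2 p bits on it, so whoever decompresses needs the exact same model to replay those probabilities. A character bigram stands in for the LLM here, and the actual entropy coder is skipped in favour of just reporting the ideal code length.

```python
# Toy sketch (mine, not the paper's code): the model gives p(next char | context),
# and an ideal coder would spend about -log2 p bits per character.
import math
from collections import Counter, defaultdict

def train_bigram(text):
    """Count character bigrams; this toy model plays the LLM's role."""
    counts = defaultdict(Counter)
    for prev, cur in zip(text, text[1:]):
        counts[prev][cur] += 1
    return counts

def ideal_bits(model, text, alphabet_size=256):
    """Sum of -log2 p(cur | prev) with add-one smoothing over a byte alphabet."""
    bits = 8.0  # first character sent verbatim
    for prev, cur in zip(text, text[1:]):
        ctx = model.get(prev, Counter())
        p = (ctx[cur] + 1) / (sum(ctx.values()) + alphabet_size)
        bits += -math.log2(p)
    return bits

text = "the quick brown fox jumps over the lazy dog. " * 200
model = train_bigram(text)  # the paper uses a pre-trained model; fitting on the input is a shortcut here
print(f"raw bits:   {8 * len(text)}")
print(f"model bits: {ideal_bits(model, text):.0f}  (ideal; coder overhead and model size ignored)")
```

On repetitive text the model's ideal bit count comes out far below the raw 8 bits per character, and that gap is the whole trick. The catch from the article is that the gigabytes behind the real model have to be available on both ends.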
Though it doesn't solve the speed issue: the LLM takes far more time to do the compression.
gzip can compress 1 GB of text in less than a minute on a CPU, while an LLM with 3.2 million parameters requires an hour to compress the same data.
I wonder how consistent the decompression is and how much information gets lost in the process.
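Since the scheme is lossless, nothing should be lost as long as both ends run the identical model: the decoder just replays the same deterministic predictions. Here's a toy round trip (again my own sketch, with a simple rank transform standing in for a real arithmetic coder, and a bigram standing in for the LLM) showing exact recovery:

```python
# Toy round trip: encoder and decoder share one deterministic model,
# so the input is recovered exactly. A real scheme would entropy-code the ranks.
from collections import Counter, defaultdict

def ranks_for_context(model, prev, alphabet):
    """Characters ordered from most to least likely after `prev` (ties broken alphabetically)."""
    ctx = model.get(prev, Counter())
    return sorted(alphabet, key=lambda c: (-ctx[c], c))

def encode(model, text, alphabet):
    codes = [alphabet.index(text[0])]
    for prev, cur in zip(text, text[1:]):
        codes.append(ranks_for_context(model, prev, alphabet).index(cur))
    return codes

def decode(model, codes, alphabet):
    chars = [alphabet[codes[0]]]
    for code in codes[1:]:
        chars.append(ranks_for_context(model, chars[-1], alphabet)[code])
    return "".join(chars)

text = "hello hello hello world"
alphabet = sorted(set(text))
model = defaultdict(Counter)
for prev, cur in zip(text, text[1:]):
    model[prev][cur] += 1

codes = encode(model, text, alphabet)
assert decode(model, codes, alphabet) == text  # exact, lossless round trip
print(codes)  # a good model maps most symbols to small ranks, which compress well
```

The assert passes because decompression is just the model run in reverse order of the same predictions; the only way to lose information is for the two ends to use different models.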