Hacker News

No, as you can see from your very definition. But here's a good example:

If you take a book and turn it into a movie, that's a derivative work. Anyone can see the direct resemblance -- the transformation or adaptation.

But if you take a book, convert each letter to a number, add up the numbers that make each sentence, and then sell that as a list of "random" numbers, that's not a derivative work. The end result is sufficiently transformed that copyright no longer applies. Ownership of the original work has no relevance.
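To make the example concrete, here's a toy sketch (in Python, my own illustration, not the commenter's) of that transformation: map a=1 through z=26, split naively on periods, and sum the letters in each sentence.

```python
# Toy illustration of the letter-sum transformation described above.
# Assumes a=1 ... z=26 and naive sentence splitting on periods.
def sentence_sums(text: str) -> list[int]:
    sums = []
    for sentence in text.split("."):
        total = sum(ord(c) - ord("a") + 1
                    for c in sentence.lower() if c.isalpha())
        if total:  # skip empty trailing splits
            sums.append(total)
    return sums

print(sentence_sums("The cat sat. The dog ran."))  # -> [97, 92]
```

The original text is unrecoverable from the output, which is the point of the example: the "transformation" destroys the expressive content entirely.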

And AI weights are like that. They're a complete transformation, not a derivative work. The only thing you have to make sure of is that the model hasn't been overtrained to the point that it can regurgitate whole chapters of the texts it was trained on, for example. But that's not something current models are able to do, and copyright law will obviously force companies to keep it that way. (Not to mention that companies would do it anyway, given the economic motivation to reduce model sizes and cut costs.)



>convert each letter to a number, add up the numbers that make each sentence...The end result is sufficiently transformed that copyright no longer applies

the problem with this as an example is that copyright would not apply to this transformed work at all: neither the original author's copyright nor any new authorship of yours, because the result contains no creative human expression (unless the original book was designed to add up to some fortune cookie, of course, in which case you haven't actually transformed it)

A nuttier, chewier example would be retelling a litigious story like Moana ("consider the copyright, across all these leaves... make way!"), from the pig's perspective or something, and seeing what would fly and what wouldn't.


Weights are simply a lossy compression of the training data set.

Now, I understand the argument that perhaps the specific work has been homeopathically diluted down to nothingness in the weights, and so has only been used to contextualise the compression of other works. But if the weights can reasonably be used to generate copyright-infringing text (condensations and abridgements and transformations are explicitly listed in the law; verbatim copying is not necessary), or even to answer substantial questions about it, then that shows the weights included that data.

If I take a sound file and compress it down so it's poor quality but I can still make out the tune, that doesn't mean that I've avoided copyright law.


> Weights are simply a lossy compression of the training data set.

No they're not -- they're more like the dictionary generated to produce a lossless compressed data set. But then we throw out the compressed data itself, and keep only the dictionary.
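For what it's worth, the dictionary analogy can be made concrete with zlib's preset-dictionary feature (a rough sketch of the analogy only, not a claim about how training actually works): a shared dictionary makes compressing related texts cheaper, and you can keep the dictionary while discarding the compressed streams themselves.

```python
import zlib

# A preset dictionary distilled from "training" text (toy example).
zdict = b"the quick brown fox jumps over the lazy dog"

# Compress a new text using that dictionary...
co = zlib.compressobj(zdict=zdict)
compressed = co.compress(b"the lazy dog naps") + co.flush()

# ...then, per the analogy, keep only zdict (the "weights") and
# discard `compressed`. The dictionary alone cannot reproduce the
# text; it only made compressing related texts cheaper.
do = zlib.decompressobj(zdict=zdict)
assert do.decompress(compressed) == b"the lazy dog naps"
```

The disputed question in the thread maps onto whether the "dictionary" merely contextualises compression or effectively retains the works themselves.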

> but if the weights can be reasonably used to generate copyright infringing text (and condensations and abridgements and transformations are explicitly listed in the law, verbatim copying is not necessary)

First of all, they haven't been shown to substantially generate infringing text beyond the kinds of short snippets covered by fair use. And my previous comment already explained why longer reproductions aren't going to happen, for both legal and economic reasons.

But secondly, you're wrong about "condensations and abridgements and transformations". You can absolutely sell a page-long summary of a book without getting permission, for instance. What do you think things like CliffsNotes are all about? Or all those two-page "executive summaries" of popular business books?

You can't abridge a 1,000-page book to 500 pages and sell that, but you can summarize its ideas in a page and sell that. Which is roughly the level of understanding that LLMs seem to absorb.



