> one could wonder if... memorization of some bits.
Certainly. That's why it's mostly (~99%+) accurate on most (~98%) "common" queries, i.e., queries with 1,000,000+ Google results. But it's truly lossy, like JPEG. And like JPEG at reasonable compression ratios, for 99.99% of uses, users don't need pixel accuracy (i.e., TIFF or RAW). (Sidenote: nor, by the way, do users generally need mathematical accuracy past the 2nd decimal.) So you compress "the memorized internet" dataset (~800TB?) into a neural net that fits on a laptop (~2TB?). I'm guessing at those sizes, but I think I'm within an order of magnitude on both figures. It's remarkable compression any way you look at it, and that's not even giving credit to the embedded "contextual understanding" and functionality of an LLM.
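Just to put rough numbers on that (both figures are my own guesses from above, not measured values), the back-of-the-envelope looks something like this:

```python
# Back-of-the-envelope compression ratio using the guessed figures above.
# Both sizes are rough assumptions, not measured values.
corpus_tb = 800   # ~size of "the memorized internet" training data (guess)
model_tb = 2      # ~size of a large model's weights on a laptop disk (guess)

ratio = corpus_tb / model_tb
print(f"Effective compression ratio: ~{ratio:.0f}:1")  # ~400:1

# For comparison, typical JPEG compression at reasonable quality is on the
# order of 10:1 to 20:1, so even if both guesses are off by an order of
# magnitude, it's still an aggressive (and very lossy) compression scheme.
```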
That's why I mentioned fractal compression, which I think is the most accurate "memory" analogy. What GPT does is look at an oak tree, then look at 10,000 oak trees, and somehow back-derive the DNA of the "seeds" that created those trees, which is an insane form of compression. This model was recently validated by OTOY releasing a new 3D model standard (as opposed to polygons and NURBS) called the "neural object model". It takes a 3D object and "digests" it via a neural net into a seed; it can then hyper-efficiently "re-grow"/generate the model from that seed, much like LLMs grow/generate responses.
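As a loose illustration of that "digest to a seed, then re-grow" idea, here is a generic autoencoder-style sketch; to be clear, this is not OTOY's actual format or API, and all names and dimensions are hypothetical:

```python
# Minimal "seed" compression sketch: encode an object into a small latent
# vector (the "seed"), then regenerate a lossy approximation from it.
# Generic autoencoder idea only -- not OTOY's neural object model.
import torch
import torch.nn as nn

class SeedCodec(nn.Module):
    def __init__(self, n_points=1024, seed_dim=32):
        super().__init__()
        # "Digest": flatten a point cloud (n_points x 3) down to a tiny seed.
        self.encoder = nn.Sequential(
            nn.Linear(n_points * 3, 256), nn.ReLU(),
            nn.Linear(256, seed_dim),
        )
        # "Re-grow": expand the seed back into a full (lossy) point cloud.
        self.decoder = nn.Sequential(
            nn.Linear(seed_dim, 256), nn.ReLU(),
            nn.Linear(256, n_points * 3),
        )

    def forward(self, points):                  # points: (batch, n_points, 3)
        seed = self.encoder(points.flatten(1))  # compact representation
        regrown = self.decoder(seed).view_as(points)
        return seed, regrown

# The 32-float seed stands in for a 1024-point object (~96x smaller), and the
# reconstruction is lossy -- like JPEG, "good enough" beats bit-exact.
model = SeedCodec()
cloud = torch.rand(1, 1024, 3)
seed, approx = model(cloud)
print(seed.shape, approx.shape)  # torch.Size([1, 32]) torch.Size([1, 1024, 3])
```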
Thank you for your service to the community, Alberto! Keep it up!