I agree with a lot of what you're saying, but I wonder about the claim that we think "ramp, ramp, box" while watching this video. I think discussions about what qualifies as artificial intelligence benefit from more careful consideration of what we take our own intelligence to be. We do think "ramp, ramp, box," but we can only think that after taking in a ton of visual data and trying to make sense of it through our previous experience. We have a heuristic for "ramp" developed through a similar feedback loop, the way you gently correct your kid when they call a raccoon "kitty." What if an extension of LLMs were developed that made use of long-term memory and the ability to generate new categories? Why do you think they'll never be able to work that way? And moreover, what is the model trying and failing to replicate, in your view? An individual human speaker, some sort of ideal speaker-listener, or something else?