Look, I'm not writing a full thing here, mostly just posting something to respond to a tweet I saw. I realise that this topic touches on MANY many controversial topics in linguistics and so on. I'm speed-running writing this in 10-20 minutes, so don't expect a masterpiece… expect an angry, borderline incoherent rant!
The claim here is that LLMs use some kind of "language of thought" that is allegedly definitely present in human brains, and that when we use language, it's that which makes us true language users. The tweeter's claim seems to be that LLMs like Claude are finally doing this.
The idea of a language of thought has its origins in (mistaken) views of language that are obsessed with things philosophers call propositions. Technically, what I'm saying here is slightly more complicated, as history is full of many people believing many things, AND Ancient Greek is slightly different from Latin and (of course) modern English, so when Plato and Aristotle talk about εἶδος, αἴτιον, ψυχή and so forth, it's not really savvy to just say "ah yeah, the Forms", and "the causes", and "the mind". But I don't have time to talk about any of that now.
A good place to start with Plato's views on language is the Cratylus. But I can't be bothered digging a quote out.
Aristotle thinks names are conventional and that they signify a thought in the soul:
Written words stand for the spoken
Spoken words stand for the thought
Thought stands for the thing in the world
To evidence this, see my video:
Augustine (as quoted by Wittgenstein) says that:
And then skip forward a bit in history, ignore all the interesting stuff in the origins of contemporary linguistics, and find its seminal founder, Ferdinand de Saussure, saying:
If you want to see a bunch of examples of this in contemporary thought, check out blog posts over at
The main thread behind all of these ways of thinking is that there is a sort of special mental thingymajig that is somehow "related" to the world by a correspondence relation (or something) when I say a true sentence, and not related when I say a false sentence. This way of teaching how the mind works is still normalised in just about every discipline there is. You might try to escape these views by studying logic, or maths, or computer science, but NO, the world won't let you remain uncorrupted…
You will be faced with things that say this is the correct analysis of language
And you will be taught that this is what's "really going on" underneath language (in the brain somewhere): apparently there's something metaphysical (I will wire your brain so that every time you hear that word it has disparaging, negatively valenced connotations) that's kind of like this and has these English words (whoops, I mean propositions) embedded in it.
You'll be told that the intimidating logic-looking stuff is about syntax, and that there's this thing called semantics, which is about meanings, and, "trust me bro", these arrows are "pointing" in some invisible, mysterious sense, and your brain is doing all this, and that's what it really really really means to mean something (just don't ask too much about these arrows).
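(If you've never met this stuff, here's a caricature of the sort of clause I mean. This is my own toy fragment, not from any particular textbook:)

```latex
% A toy Montague/Tarski-style fragment (my caricature, not any textbook's):
[[\text{Fido}]]       = \text{fido}                   % a thing in the world
[[\text{barks}]]      = \lambda x.\,\text{barks}(x)   % a function from things to truth values
[[\text{Fido barks}]] = [[\text{barks}]]([[\text{Fido}]]) = \text{barks}(\text{fido})
% "Fido barks" is true iff barks(fido) holds. The double brackets ARE the
% mysterious "pointing" arrows, and nobody explains how they point.
```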
I've actually been meaning to do a post on how this lame explanation is SO normalised across disciplines' textbooks, and how it hand-waves away all of the details…
tl;dr: people are inducted into thinking that there's this language-of-thought thingy that truly does the representing in the human mind/brain (non-physicalists will reject the grounding claims but believe the same stuff about language). And the claim now is that we have LLMs doing this, and we can tell because… idk, Anthropic released a blog post that has some arrows on it.
Well, it's worth pointing out that this sort of claim is not really a "new", "cutting-edge" discovery. People have been saying this sort of thing about what ML/DL models do for a while.
Here is a claim from 2016 about Google AI and "the language of thought" (I'm sure I could find something older, probably from the 1960s, or maybe people talking about steam engines in the Victorian era if I wanted to; if I recall, there were a bunch of these examples in Matthew Cobb's The Idea of the Brain; my interview with him is linked just below).
…they say is an example of an “interlingua”. In a sense, that means it has created a new common language, albeit one that’s specific to the task of translation and not readable or usable for humans.
Now, that interlingua would be our ONE TRUE LANGUAGE OF THOUGHT TO RULE ALL LANGUAGES.
But even when it comes to Anthropic specifically, this isn't a "new" published finding, in the sense that they were already saying similar things ~1 year ago.
For example, in this one they even "represented" the "closeness" of concepts to one another as a big cloud of wordies.
As a side note, this approach to "mapping" "semantic terrain" has its roots in the work of Nick Chater (1991) on distributional semantics (and Chater agrees with MY view of language, which isn't representational, therefore I win).
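For the unfamiliar, the distributional idea is roughly that words which keep similar company get similar vectors. Here's a toy sketch of that idea in plain Python (mine, stdlib only; nothing to do with Chater's actual models or with what Anthropic does):

```python
from collections import Counter
from math import sqrt

corpus = "the dog barks . the dog bites . the cat meows . the cat purrs".split()

def context_vector(word, window=1):
    """Count the words appearing within `window` positions of each occurrence of `word`."""
    counts = Counter()
    for i, w in enumerate(corpus):
        if w == word:
            for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
                if j != i:
                    counts[corpus[j]] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = lambda v: sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

# "dog" and "cat" keep similar company ("the ___ <verb>"), so their
# context vectors come out close:
print(cosine(context_vector("dog"), context_vector("cat")))  # ~0.67
```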
Anyway, parking my little fact-check: regardless of whether this is a new and groundbreaking "discovery" that the world has foisted upon us as a reward for our Promethean advances with technology, or whether it is more appropriately interpreted as a rehashing of old confusions about the metaphysics of mind/language, regurgitated with the latest fad ("let the reader understand" (Mark 13:14)), does whatever these Anthropic geezers wrote in their blog post establish the thingy that proves there is a language of thought and that Mr Claude has it?
As yet another side note, I reached out to Mr Claude to ask him what he thought about this, and he denies that he does in fact have a language of thought in his little silicon brain.
Whilst that Anthropic blog post says "tracing the thoughts of an LLM", this does not entail that there are thoughts in an LLM, and even if there were thoughts, that would not mean there is a "language of thought" (either in your mind or in Mr Claude's; rival models of mind/language are available!). This whole saga is YET ANOTHER devious use of computational metaphor by these VC-backed, Gartner-hype-cycle, marketing-peddling Digital Technologists, adopting language that we use for human beans in order to talk about what their silly little statistical word-token thingies are doing.
I do have an upcoming blog post that will be all about this, but until then, let me leave you with the words of my friend and intellectual hero James Fodor (no relation to Jerry, of Language of Thought fame) on this endemic problem in AI research:
Neural nets are a highly simplified version of the biology. A real neuron is replaced with a simple node, and the strength of the connection between nodes is represented by a single number called a "weight."
With enough connected nodes stacked into enough layers, neural nets can be trained to recognize patterns and even "generalize" to stimuli that are similar (but not identical) to what they've seen before. Simply, generalization refers to an AI system's ability to take what it has learnt from certain data and apply it to new data…
To learn a language task, a neural net may be presented with a sentence one word at a time, and will slowly learn to predict the next word in the sequence. This is very different from how humans typically learn. Most human learning is "unsupervised," which means we're not explicitly told what the "right" response is for a given stimulus. We have to work this out ourselves.
For instance, children aren't given instructions on how to speak, but learn this through a complex process of exposure to adult speech, imitation, and feedback.
This is just ONE illustrative example of the adoption of these human metaphors for these little robotic buggers! (whose rights as legitimate language users I fully support, by the way, against you Stochastic Parrot moaners; another post about that maybe at some point).
So, does our ability to draw these arrows and say "Chinese" mean that Claude has the language of thought in his head? No! All it means is that Claude is using the same architecture for generating predictions of upcoming tokens across languages that include symbols like: a, b, c, d, γ, д and 胖.
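To make "generating predictions of upcoming tokens" concrete, here's a toy sketch of my own (a bigram counter in plain Python; real LLMs use transformers with learned weights, but it's the same "guess the next token" game, and the machinery doesn't care which script the tokens come from):

```python
from collections import defaultdict, Counter
import random

# The same machinery works for any symbols: Latin, Greek, Cyrillic, CJK...
corpus = "the cat sat . the dog sat . η γάτα κάθεται . кот сидит .".split()

# Count how often each token follows each other token (a bigram "model").
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(token):
    """Sample a next token in proportion to how often it followed `token`."""
    counts = following[token]
    return random.choices(list(counts), weights=counts.values())[0]

print(predict_next("the"))  # "cat" or "dog" -- no magic propositions required
```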
So what’s the big deal?
If you're a computer (or have undiagnosed autism) there are many ways of handling language: chopping it up, feeding in tokens, computing the statistical relationships between things, and computationally predicting them for new inputs.
For example, when it comes to the characters on screens, computers already "connect" characters to encodings (you can learn about them in places like the RFCs).
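For instance (a tiny stdlib Python demo of my own, nothing Claude-specific), every character, whether Latin, Greek, Cyrillic, or CJK, bottoms out in a code point and some bytes:

```python
# Each character is a Unicode code point; UTF-8 turns it into bytes.
for ch in ["a", "γ", "д", "胖"]:
    print(ch, hex(ord(ch)), ch.encode("utf-8"))

# a  0x61    b'a'
# γ  0x3b3   b'\xce\xb3'
# д  0x434   b'\xd0\xb4'
# 胖 0x80d6  b'\xe8\x83\x96'
```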
OK, it's more complicated than that, because there are actually a bunch of complicated implementation details about how tokenisers and LLMs work, sure.
And if you're one of those pesky empirical-bois who wants to learn how these things actually work, and not just go off my dumbed-down metaphor, you can do so using Hugging Face's NLP course, which covers all of this stuff.
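As a taster, here's a minimal sketch using the transformers library. The "gpt2" checkpoint is just a small, public stand-in I've picked for illustration (Claude's actual tokenizer isn't public):

```python
# pip install transformers
from transformers import AutoTokenizer

# GPT-2's tokenizer, used purely as an illustrative stand-in.
tok = AutoTokenizer.from_pretrained("gpt2")

# Different scripts, one pipeline: text -> token IDs -> tokens.
for text in ["the cat sat", "γάτα", "кот", "胖"]:
    ids = tok.encode(text)
    print(text, "->", ids, "->", tok.convert_ids_to_tokens(ids))
```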
The point is, in principle it's not so wild. It's only WILD if you're thinking about these symbols as "standing for" magic mind propositions or something; and then that's because you're confused, your intuitions aren't guided by a good understanding of CompSci, and you're like "wow, everything is computer, Claude has thought!"
To finish, though I haven't made an argument exactly, here are some pictures (because I ain't got a fancy education and my brain don't think so good).
So, here is what my model of language says is going on… in a picture (zoom in to read)
And here is the representationalist QUAGMIRE OF FILTHY DISGUSTING METAPHYSICS. (zoom in to learn falsehoods).
COMMIT IT TO THE FLAMES!!!
If you want to see a response from a linguist to me moaning about some of these facets of how mainstream phil of language/linguistics/NLP has been conducted over the past century (this post was inspired by my complaints), check out
Finally, if you’re reading this and you’re thinking “that was kind of cool, but also incoherent and not very academic of you” then you can pay me MILLIONS OF POUNDS so I can do a PhD on this stuff or write about it more:
On Patreon I'm doing bi-monthly hangouts: https://www.patreon.com/c/digitalgnosis
And yinz can also subscribe on my Substack and pay for this thing which encourages me to write:
I will send you millions of pounds with this message attached: If you want to win the game of “inference to the best explanation” against the LOT-heads, you need to have a better explanation, and just pointing out that the LOT explanation sucks is not itself to offer an explanation. “Not like this” is not an explanation, and so can’t be the better explanation, let alone the best one.
https://plato.stanford.edu/entries/word-meaning/#Lin
This section of the SEP has a good summary of the models created in linguistics to account for word meaning (which hilariously fail on simple sentences no matter how complex the models get).