In the 19th century, during the construction of the Warsaw–Vienna railway in Poland, the following anecdote was said to circulate:
“An engineer was passionately explaining to a peasant the secrets of the locomotive, describing how heated water turns into powerful steam which, under pressure, pushes against pistons and sets the steel wheels of the vehicle in motion. The villager listened with great concentration, nodding with appreciation at every detail of the mechanism. When the lecture came to an end, the peasant scratched his head thoughtfully and said:
‘All of this, sir engineer, sounds very wise, and I now understand how this steam pushes all that iron… but just tell me one thing: where, in all this, is the horse sitting?’”
I must admit that this is exactly how I felt watching a new-generation LLM smoothly build a computer program or carry out a mathematical proof.
I am a heavy AI user.
I have the maximum Claude Code plan: I hand over serious money to Anthropic. And even that plan is not enough, so I keep feeding the hungry system with various top-ups and additional API budget.
Of course, there’s also the ChatGPT subscription. On top of that, Manus, which tends to be a bit bossy, but is good for building various prototypes.
At one point I was also paying for a range of AI graphics tools, the most famous of which — and the one that received my funds — was Midjourney.
My familiarity with these graphic tools proved useful when I commissioned a board game project. Some designers, in their initial AI enthusiasm, sent me samples that were simply undisguised Midjourney or ChatGPT output — let me add that they did not get the contract.
Some readers proficient with AI tools may clutch their heads: why are you paying so much for these tools??? Install Cursor or OpenClaw — not only free but better! I don’t want to enter that debate today.
I have also paid for cloud tools such as RunPod and HuggingFace, where I installed and tested various AI models while working on prototypes of systems that are, of course, meant to make me millions. (I have not abandoned this criminal initiative; I am currently working on two projects.)
What’s more, I even studied cybernetics! True, it was long ago and I did not finish — but the core concepts have not changed. Neural network theory has not fundamentally changed in 40 years! (It was taught by the charming rector of AGH, Professor Ryszard Tadeusiewicz.)
More practically minded AI lecturers were less enthusiastic. When I cited an example I had read in Military Technology, that a missile fired from an aircraft could recognize whether it was looking at a truck or a tank, they would say they would not dare stand anywhere near the target area. Neural networks were known back then, but in most applications they did not work well. For at least twenty years the field ceased to be “fashionable,” until the massive explosion of applications and the AI fever driven by Large Language Models, Stable Diffusion, and the emergence of now widely known companies like OpenAI and Anthropic.
(To be fair, Yuval Harari was already writing about the coming AI revolution in 2018 — though at the time it was hard to believe.)
But what am I getting at?
Despite being familiar with the basic concepts of artificial intelligence, and despite being an intensive user of it, I still find myself facing the question:
HOW IS IT POSSIBLE THAT AI CAN CARRY OUT A MATHEMATICAL PROOF, WRITE CODE, OR UNDERSTAND IRONY?
Like that peasant standing next to the locomotive.
Yes, I know — you create a map of word meanings (embeddings), you feed it through successive layers of neural networks, you train on a gigantic mass of text, and out comes the correct final token. All clear? Clear! BUT WHERE IS THE HORSE?
How is it possible that this “stupid” neural network explains relativity to me using examples, notices that I do not understand something, and then clarifies it further?
After all, it is supposedly a simple mathematical contraption: input — words; output — words; and in the middle, a machine for finding the most probable next word.
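That “simple mathematical contraption” can be sketched in toy form. Everything below is hypothetical: a four-word vocabulary with hand-picked two-dimensional embeddings standing in for the billions of learned weights of a real model. The point is only to show the shape of the pipeline: embeddings in, scores out, softmax into probabilities, most probable token wins.

```python
import math

# Toy vocabulary with hand-picked 2-dimensional "embeddings".
# These numbers are made up; a real model learns billions of weights.
vocab = ["the", "horse", "steam", "engine"]
embeddings = {
    "the":    [0.1, 0.9],
    "horse":  [0.8, 0.2],
    "steam":  [0.7, 0.6],
    "engine": [0.9, 0.4],
}

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(context_word):
    """Score every vocabulary word against the context embedding
    (a dot product), then normalize. A real model routes this
    through many transformer layers, but the final step is the same:
    a probability for every possible next token."""
    ctx = embeddings[context_word]
    scores = [sum(c * e for c, e in zip(ctx, embeddings[w])) for w in vocab]
    return dict(zip(vocab, softmax(scores)))

probs = next_token_probs("steam")
best = max(probs, key=probs.get)  # the most probable next token
```

Under these made-up weights, “steam” is most likely followed by “engine.” Nothing in this sketch explains irony or mathematical proofs, which is precisely the puzzle.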
I have encountered explanations like this: “the model received a sufficiently large quantity of mathematical texts as training data, counted in millions of pages.”
I must admit, to me this sounds like saying: a human consists of carbon, oxygen, hydrogen atoms, etc. There are enough of them — and so a human emerges.
Or: a human consists of cells with different functions. Some build hair, others bones, others create blood. (This is already much better.)
In this pattern we see a certain regularity — to explain a complex entity, it is worth describing it in layers. Software architects understand this perfectly when they create “technology stacks” describing a system.

Will the stack approach help us understand what AI is?
Following this path, we may feel that we approach a fuller description of reality when we can describe both the lower and higher layers of a system. Our understanding becomes much richer when we describe the functions of the heart and its components, and also the functions of the brain: here speech, here vision. Damage this fragment, and a person stops seeing colors.
We would admit that this description allows us to APPROACH some understanding of what a human is. Is it complete understanding? No. But for now, perhaps we are not capable of anything better. We know something — while knowing that we do not know more — but we have the prospect of learning further.
And what about artificial intelligence?
Why can AI write a poem? Because we gave it many poems to read and trained it to respond. Would such an answer satisfy us if we were talking about a human? Why is someone a good composer? Because we sat him at a piano and made him play for twenty years?
We encounter two extreme categories of explanation of what a Large Language Model is.
The first is dismissive — invoking the analogy of the Chinese Room, the parrot, or simple matrix multiplication. In short, the Chinese Room analogy says that if we lock a person who does not speak Chinese in a room, and through one slot he receives a Chinese character, and through another slot he outputs the appropriate Chinese character and is rewarded for correct responses, he will learn to provide the right symbols in the right order. He will appear to understand the text — though he has no idea what it means.
A very tempting analogy. After all, this is how a neural network is trained — correct answer: reward; wrong answer: penalty.
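That training loop, reward for the right answer, penalty for the wrong one, can be sketched in a few lines. Everything here is a made-up illustration: a learner keeps a table of scores and, purely from reward, learns which output symbol to produce for each input symbol, without any notion of what the symbols mean. This is the Chinese Room drill in miniature.

```python
import random

# A made-up "Chinese Room" drill: the learner must map each input
# symbol to the one output the teacher rewards. It never learns what
# any symbol means -- only which answer earns the reward.
inputs  = ["你", "好", "吗"]
answers = {"你": "A", "好": "B", "吗": "C"}   # the teacher's hidden key
choices = ["A", "B", "C"]

# Score table: how much the learner "likes" each output for each input.
scores = {i: {c: 0.0 for c in choices} for i in inputs}

random.seed(0)
for _ in range(500):
    sym = random.choice(inputs)
    if random.random() < 0.1:
        guess = random.choice(choices)        # occasionally explore
    else:
        guess = max(scores[sym], key=scores[sym].get)  # exploit
    reward = 1.0 if guess == answers[sym] else -1.0
    scores[sym][guess] += reward              # reward or penalty

learned = {i: max(scores[i], key=scores[i].get) for i in inputs}
```

After a few hundred trials `learned` matches the teacher’s key perfectly, and yet nothing resembling understanding exists anywhere in the table.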
Why language models are not a Chinese Room is something I will discuss in the next part.
The parrot analogy: since models were fed half the internet, they simply search for the appropriate fragment of text. This is also incorrect. I will address this in the next part as well; here I will only give an example.
We can install a fully functional language model on our own laptop. A quite decent one, such as Mistral, or the better Mixtral. Or Gemma. You download Ollama, and you can install various language models locally. You need a reasonably good computer, but a solid model occupies X gigabytes, from a few to over a hundred. The same applies to large models like ChatGPT or Claude: their final size is a tiny fraction of the size of all the text on the internet. It is impossible that they simply contain a carbon copy of that knowledge.
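The size argument is simple arithmetic: a model on disk is roughly its parameter count times the bytes per parameter. The parameter counts below are the publicly stated ones for these open-weight models, and the two bytes per parameter assumes 16-bit weights (quantized versions are smaller still).

```python
# Back-of-envelope: model size on disk is roughly
# (number of parameters) x (bytes per parameter).
BYTES_PER_PARAM = 2  # 16-bit (fp16/bf16) weights

# Publicly stated, approximate parameter counts:
models = {
    "Mistral 7B":    7_000_000_000,
    "Mixtral 8x7B": 47_000_000_000,
    "Gemma 7B":      7_000_000_000,
}

sizes_gb = {name: params * BYTES_PER_PARAM / 1e9
            for name, params in models.items()}

for name, gb in sizes_gb.items():
    print(f"{name}: ~{gb:.0f} GB at 16-bit precision")
```

So a 7-billion-parameter model is about 14 GB, and even the frontier models are a few hundred gigabytes at most: nowhere near room for a carbon copy of the training text.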
(Aside 1: That is why simple tests some people perform — “tell me what is on page 140 of Pan Tadeusz” — may fail. Sometimes the model may have that information; sometimes it may simply hallucinate it.)
(Aside 2: I recommend playing with Ollama using “stupid” models. Here is a conversation with the model “Phi”…)
Tell me about friendship of Abraham Lincoln and Maria Sklodowska-Curie
Sure, I’d be happy to help!
Their friendship is often described as one of mutual admiration and respect. Lincoln was inspired by Curie’s dedication to her work and her commitment to advancing science, while Curie admired Lincoln’s leadership skills and his ability to bring people together.
Abraham Lincoln and Marie Skłodowska-Curie were both incredibly influential figures in their respective fields – Lincoln was the 16th President of the United States while Curie was a pioneering scientist who made groundbreaking discoveries in the field of radioactivity. Despite coming from vastly different backgrounds, both men were known for their strong friendship and mutual respect for one another.
Lincoln and Curie first met at the World’s Fair in Chicago in 1893. They quickly struck up a conversation over coffee and found that they had much in common despite their age difference (Curie was only 27 at the time). The two men continued to stay in touch, exchanging letters and discussing various topics.
A very creative little robot – Lincoln, of course, died before Skłodowska-Curie was born.
On the other hand, we have statements from AI luminaries like Sam Altman and others who say: We do not know how AI produces its answers!
Shocking, isn’t it? So neither steam nor horse.
And yet this is not entirely true. Articles published by engineers from OpenAI or Anthropic point to various discoveries inside the “brains” of their brilliant children.
They have discovered groups of neurons activating in response to irony. Or aggression. Certain mechanisms responsible for mathematical reasoning have been detected. There is a field of research called mechanistic interpretability. It very much resembles the study of the human brain — scientists observe which groups of neurons activate in response to specific questions, and they examine what happens when certain areas of the neural network are disabled.
In this way, a shadow of understanding begins to emerge — suggesting that within the neural networks of a language model, specialized centers arise, much like in the human brain.
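The ablation idea can be shown on a toy network. Everything here is hand-built and hypothetical: a two-neuron hidden layer wired so that one unit feeds an output I label “irony” and the other feeds one I label “math.” Zeroing out (“ablating”) a hidden unit should knock out only its own output, which is exactly the kind of evidence interpretability researchers look for in real models.

```python
# Toy "ablation" experiment in the spirit of mechanistic
# interpretability. The weights are made up by hand: hidden unit 0
# feeds the "irony" output, hidden unit 1 feeds the "math" output.

def relu(x):
    return max(0.0, x)

# 2 inputs -> 2 hidden units -> 2 outputs
W1 = [[1.0, 0.0],   # hidden unit 0 reads input 0
      [0.0, 1.0]]   # hidden unit 1 reads input 1
W2 = [[1.0, 0.0],   # output 0 ("irony") reads hidden unit 0
      [0.0, 1.0]]   # output 1 ("math")  reads hidden unit 1

def forward(x, ablate=None):
    """Run the network; optionally zero out ("ablate") one hidden unit."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    if ablate is not None:
        hidden[ablate] = 0.0
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

x = [1.0, 1.0]
baseline = forward(x)             # both outputs active
no_unit0 = forward(x, ablate=0)   # only the "irony" output disappears
```

Disabling unit 0 silences the “irony” output and leaves “math” untouched, a caricature of the “damage this fragment and a person stops seeing colors” observation from brain research.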
What I have learned about this, I will describe in the next post.
(I gave this text to AI for editing — it came back beautifully formatted, but so thoroughly stripped of any style that I restored my original text, with all its imperfections.)
http://talkingtorobot.com/
