FIRST PART: UNCERTAINTY, CAUSALITY, AND SCHRÖDINGER'S CAT
You are very poor and your only hope is to sell your cat. You put it in a box
and get ready to go to the market. But you start thinking. What if the cat dies
in the box? I can't sell a dead cat. I will starve. I may die of starvation. But
if it is still alive when I get there, what if I don't get enough money to
survive? Your brain is effectively considering two different universes. You
can't decide between them so you consider both. From your perspective, you are
living in two different universes at the same time.
We say that your brain is the system, and the system has only two states: cat
dead, or cat alive. We call this situation an uncertainty, because you are not
certain which way it is going to be. So you need an answer to your uncertainty.
You choose a behavior that is independent of the uncertainty. We say that your
behavior is invariant under a transformation from one state to the
other. You take the box to the market and open it. The cat is alive.
As a result of your invariant behavior, you have now acquired new information:
the cat has survived. Your two universes collapse into one. But you start
thinking again. If my cat looked better I would get more money. Again you have
two universes: either it looks well or it doesn't. You seek an answer. You need
an invariant behavior. You take the cat out of the box and put it in the light.
Now you acquire more information: the cat doesn't look well. Again, you are left
with only one universe. And so on.
In Physics, we describe a system by means of a set of variables, and we say that
each possible combination of values of the variables is a state of the system.
The variables can be Boolean, or integer-valued, or anything appropriate. Then
we specify an initial state and a dynamics for the system. A dynamics is a rule
or set of rules that specifies how the system transitions from one state to
another. And the rules account for the uncertainty. Say the system is in state A
and it can transition to state B, or C, or D. We don't know which one it will
be. And if the system transitions to B, then it can transition from there to X,
or to Y, or to Z; again we don't know which. The dynamics is not causal,
because the current state and the rules together do not determine a unique
transition. Yet the response behavior, an algorithm, should be causal so that it
can be executed. Where does the algorithm come from?
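The non-causal dynamics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name DYNAMICS and the two helper functions are hypothetical, and states C, D, X, Y, and Z are treated as final only because the text does not say what follows them.

```python
import random

# Hypothetical encoding of the example dynamics: each state maps to the
# states it may transition to. The rule lists the options but picks none.
# Assumption: states without an entry (C, D, X, Y, Z) have no successors.
DYNAMICS = {
    "A": ["B", "C", "D"],
    "B": ["X", "Y", "Z"],
}


def possible_next(state):
    """Return every state the system may move to from `state`."""
    return DYNAMICS.get(state, [])


def run(initial, rng):
    """Follow the dynamics from `initial`, resolving each choice by chance."""
    path = [initial]
    while possible_next(path[-1]):
        path.append(rng.choice(possible_next(path[-1])))
    return path


if __name__ == "__main__":
    print(possible_next("A"))         # the rule offers options, not an answer
    print(run("A", random.Random()))  # a different path may appear each run
```

The state and the rule yield a list of candidates, so something outside the dynamics (here, a random number generator) has to make each choice. That is the sense in which the dynamics alone is not executable as an algorithm.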
The algorithm comes from what we know, from the information we have. So here is
what we do know. We know that states B, C, or D cannot exist unless state A has
existed before. So we say that A precedes B, C, and D. We also know that X, Y,
or Z can exist only if B has existed. And this is the precise point where causal
sets come in. We formalize our knowledge by writing:
A≺B, A≺C, A≺D, B≺X, B≺Y, B≺Z
(read the '≺' sign as "precedes"). These relations, together with the set of
states, {A, B, C, D, X, Y, Z} in this case, are known as a causal set. We can
write a computer program for this. It would look as follows:
if(A) then
{
    if(B) then
    {
        if(X) then ...
        else if(Y) then ...
        else if(Z) then ...
    }
    else if(C) then ...
    else if(D) then ...
}
which means that if A has existed then either B, C, or D can exist, and in the
case where B has existed then either X, Y or Z can exist, and so on until the
program stops because there are no more state transitions left. But what have we
achieved by writing the program? Nothing. The uncertainty is still there, only
now it has been transferred to the data. In order to run the program we have to
specify all the uninitialized variables as data, for example we could specify
A, B and X, or A and C, etc. In other words we have to specify the exact
sequence of state transitions. Guess, rather than specify.
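To make the point concrete, here is a minimal Python sketch of the example causal set. The names PRECEDES, successors, all_sequences, and respects_causal_set are hypothetical, chosen only for this illustration. The sketch lists every sequence of transitions the precedence relations allow, and checks whether a sequence supplied as data respects them.

```python
# The precedence relations from the text, written as (earlier, later) pairs.
PRECEDES = [("A", "B"), ("A", "C"), ("A", "D"),
            ("B", "X"), ("B", "Y"), ("B", "Z")]


def successors(state):
    """States that may follow `state` according to the causal set."""
    return [later for earlier, later in PRECEDES if earlier == state]


def all_sequences(state):
    """Enumerate every complete sequence of transitions starting at `state`."""
    nexts = successors(state)
    if not nexts:
        return [[state]]
    return [[state] + rest for nxt in nexts for rest in all_sequences(nxt)]


def respects_causal_set(sequence):
    """Check that each consecutive step is allowed by a precedence pair."""
    return all((a, b) in PRECEDES for a, b in zip(sequence, sequence[1:]))


if __name__ == "__main__":
    for seq in all_sequences("A"):
        print(" -> ".join(seq))
    print(respects_causal_set(["A", "B", "X"]))  # an allowed sequence
    print(respects_causal_set(["A", "X"]))       # skips B, so not allowed
```

The enumeration yields five sequences for this causal set. The program can confirm that any one of them is permitted, but nothing in the causal set says which one to run.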
There are many possible sequences of execution that satisfy the constraints in
the causal set, and no apparent reason to prefer one over the other. But our
brains make a unique selection, every time, when in possession of certain
information. For example, if I want to travel from Houston to Dallas I can fly
first to San Francisco and from there to Dallas, or I can fly directly from
Houston to Dallas. And brains are very consistent. Every person would choose the
second alternative, unless they have some other reason to go to San Francisco
first, in which case they would have used additional information. So how do our
brains make that unique selection? Obviously, we are missing something here. How
does the brain do it?
SECOND PART: ENTROPY, SELF-ORGANIZATION, AND PATTERNS
The brain doesn't do "it". It does something else, and "it" follows as a result.
The brain must satisfy its never-ending hunger for energy. Information carries
energy. Yes, information itself. In March 2012, researchers actually measured
the amount of heat generated by erasing one bit of information, thus confirming
Landauer's 50-year-old prediction; see Berut (2012).
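For a sense of scale, Landauer's bound says that erasing one bit dissipates at least k_B T ln 2 of heat, where k_B is the Boltzmann constant and T the absolute temperature. The short Python sketch below evaluates that bound. The function name is hypothetical, and the temperature of 300 K is an assumed room-temperature value, not a figure taken from the experiment.

```python
import math

# Boltzmann constant in joules per kelvin (exact in the SI since 2019).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin):
    """Minimum heat dissipated by erasing one bit: k_B * T * ln(2)."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # Assumed temperature of 300 K; the result is roughly 2.87e-21 joules.
    print(landauer_limit_joules(300.0))
```

The amount is tiny, on the order of a few zeptojoules per bit, which is part of why the measurement took so long to achieve.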
When the brain learns something, that is, when it receives information, it
supplies energy to its memory so it can store that information. As it stores, it
immediately recovers any energy it can from the stored information, and uses
this energy to store more information.
And here is some Physics. When energy is extracted from information, then
entropy is also extracted. This is the Second Law of Thermodynamics. But entropy
is the measure of uncertainty in the information. When energy is removed from
the system, the number of state transitions available to the system is also
reduced. This is because the system has less energy and higher-energy states are
no longer accessible to it. Fewer states mean less uncertainty, and less
entropy, which is exactly what the Second Law predicts.
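The link between accessible states and uncertainty can be illustrated with a toy calculation. This Python sketch rests on two assumptions that are not in the text: the energy levels in ENERGIES are invented for the example, and every accessible state is taken to be equally likely, so that the entropy in bits is simply the base-2 logarithm of the number of accessible states.

```python
import math

# Hypothetical energy levels, in arbitrary units, for eight states.
ENERGIES = {"s0": 0.0, "s1": 1.0, "s2": 2.0, "s3": 3.0,
            "s4": 4.0, "s5": 5.0, "s6": 6.0, "s7": 7.0}


def accessible_states(energies, budget):
    """States whose energy does not exceed the system's energy budget."""
    return [state for state, energy in energies.items() if energy <= budget]


def entropy_bits(n_states):
    """Entropy in bits of a uniform choice among `n_states` states."""
    return math.log2(n_states)


if __name__ == "__main__":
    # Lowering the budget removes high-energy states and lowers the entropy.
    for budget in (7.0, 3.0, 0.0):
        states = accessible_states(ENERGIES, budget)
        print(budget, len(states), entropy_bits(len(states)))
```

With these invented numbers, lowering the budget from 7 to 3 to 0 leaves 8, 4, and 1 accessible states, and the entropy drops from 3 bits to 2 bits to 0 bits.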
In addition, when the removal of energy blocks the system from accessing
high-energy states, the state space shrinks, and the dynamics of the system is
compressed into that small space. This compressed space is known as an
attractor, and we say that the system has converged to an attractor. As the
state space is now so small, the system becomes more stable, and the attractors
become observable and easily identifiable. They are the patterns or regularities
in information that our brains create all the time. And, as if all that were not
enough, the transformation induced in the information by the removal of entropy
is behavior-preserving. It is known in Computer Science as refactoring. The
result is an algorithm, and the algorithm is causal. In the brain, by the simple
act of conserving energy, the entropy and uncertainty are removed from the
information, and the result is a unique, invariant behavior.
In Computer Science and Artificial Intelligence, the energy consumption of
computers has been the focus of attention for decades. But few seem to have
noticed that the brain goes beyond that point, and also reduces the energy
consumption of the information itself, not just the machine. Removing energy
from information also removes entropy, and causes it to self-organize into
invariants. That's what the brain does, all the time: it creates invariants, or
invariant representations of the information it has acquired. This is the answer
to Hofstadter's challenge, the 100,000,000 dots of light on your retina that
become one single word, "mother."
We use the invariant representations for everything: to think, to communicate,
to create more invariant representations. Every word I write here is an
invariant representation in my mind. A language is an invariant representation.
There can be no intelligence without invariant representations. It seems to me,
though I can't promise, that little will be left to explain about intelligence
once the invariant representations are understood. There is a countably infinite
number of causal sets, there is a countably infinite number of invariants, and
there is a bijective correspondence between causal sets and invariant
representations. What is left? Of course, understanding the implementation
details of the brain is another matter.
In Computer programming and Artificial Intelligence, information is processed
while leaving all the entropy in it. Alternatively, the information may be
passed to humans so as to use the human brain to remove the entropy and create
the invariant representations, which are then fed back to computers. This
practice perpetuates the familiar man-machine inter-dependency in both fields.
We all remember the quest for the perpetual motion machine, which stopped only
when the energy-entropy interplay was understood in thermodynamic machines. Are
we not pursuing a "perpetual certainty" information machine?