A Free Lunch in a Mousetrap
By Mark Perakh
Posted on February 27, 2002. Updated on January 5, 2003.
Contents
Introduction
Dembski defends Behe's mousetrap example
Dembski salvages the irreducible complexity
Dembski's explanatory filter revisited
The third node of the explanatory filter
Is complexity equivalent to low probability?
Dembski suggests a fourth law of thermodynamics
Can functions add information?
Are NFL theorems relevant for Dembski's thesis?
Acknowledgments
Appendix
References
Discussion
"Christ is never an addendum to a scientific theory but
always a completion."
(W. Dembski, Intelligent Design, page 207).
"As Christians we know naturalism is false."
(W. Dembski, in coll. "Mere Creation," page 14).
"...scientific creationism has prior religious
commitments whereas intelligent design has not." (W. Dembski. Intelligent
Design, page 247).
1. Introduction
Yes, I know that the title of this paper sounds like gobbledygook. This is
because the new book by William A. Dembski [1] titled No Free Lunch – Why
Specified Complexity Cannot Be Purchased Without Intelligence (DNFL), some
parts of which I am going to discuss in this paper, makes no more sense than the
title of this article. I tried to find in that book something which would
enable me to say a few words favorable to at least a fraction of Dembski's new
publication. I could not find it. In my review [2] of Dembski's earlier
publications including his book The Design Inference (TDI) [3], I
criticized the latter in quite unambiguous terms. In my view, Dembski's
new book (which he claims is a sequel to TDI) is even worse than TDI. DNFL has
been highly acclaimed by Dembski's colleagues and supporters. In fact, confusing
statements, contradictory definitions, and even elementary errors, as well as
unnecessary mathematical exercises, abound in this book. A substantial portion
of DNFL simply reiterates, often verbatim, Dembski's earlier publications,
although many critics have demonstrated the multitude of weaknesses in Dembski's
position. On the other hand, this book does contain some new elements
compared with Dembski's earlier books and papers. Unfortunately, these new
elements are mostly characterized by the same penchant for self-coined
terms, pretentious claims of important insights or discoveries without proper
substantiation, and an all too obvious subordination of the discourse to preconceived
beliefs. What these beliefs are can be seen from the first two quotations
from Dembski, placed right under the title of this article. Note that the
third quotation plainly negates the first two. Whereas in the third
quotation Dembski claims that his intelligent design theory is purely scientific
and not tied to any religious doctrine, so it can legitimately be discussed
within the framework of a scientific dispute, the first two quotations clearly
show the real religious motivation and therefore the goal of his allegedly
scientific discourse.
I have no intention of providing a
comprehensive review of Dembski's new book. I don't believe it deserves such,
although it surely will be highly praised and used by intelligent design
(ID) adherents as an allegedly very sophisticated and mathematically rigorous
substantiation of the ID theory in their ongoing war against genuine
science. I will discuss only a few selected points in Dembski's new
book. However, the absence of discussion of some other parts of that book
in no way signifies that I agree with them or find any merit in them. Though I
feel that Dembski's new book should not be completely ignored, lest ID
adherents claim that their opponents have nothing to say in response, a
comprehensive review would be a waste of time and effort.
Among the new elements found in
DNFL we see Dembski arguing with some of his critics. Some of that material is
a repetition of arguments in his earlier papers (for example his dispute with
Robert Pennock). Other parts, however, seem to appear for the first time (like
his replies to Wesley Elsberry, Gert Korthof, Howard Van Till, and John
McDonald). On the other hand, Dembski seems to still ignore some other
critics of his theories. For example, among the names of people with whom
Dembski has had "direct contacts," listed on page xxiv, we find the names of Eli
Chiprout and Richard Wein. However, Dembski does not say a single word
about the strong critique of his work by Chiprout or Wein. Dembski also refers
to a book by Del Ratzsch and thanks the latter for providing the example of the Oklo
uranium mine (page 26). However, Dembski fails to even mention that the same
quoted book by Ratzsch contains serious critical remarks that address a number
of items in Dembski's book The Design Inference. Other critics whom
Dembski seems to ignore are Professor Massimo Pigliucci and myself.
Whereas Dembski is certainly not under obligation to reply to his critics, the
absence of a response is usually viewed as an admission of the lack of good
counterarguments. Hence, Pigliucci, Wein, Chiprout, and some other
critics seem to be entitled to interpret the absence of replies from Dembski as
his tacit acknowledgment of having nothing to say in response. Of course, there
also is another possible explanation, that Dembski is so enchanted by his own
achievements that he disdainfully ignores our critical remarks viewing them as
well below the level of sophistication befitting his brilliant work. I leave to
the readers the choice between these two explanations.
2. Dembski defends Behe's mousetrap example
A substantial fraction of Dembski's new book is devoted to a fierce defense
of Behe's concept of irreducible complexity [4]. This is easy to
understand. Dembski and Co. insist that their activity is not motivated by
religious predisposition but is genuinely scientific and based on legitimate
evidence of design. Of course, among the opuses of design proponents there are
also theological or nearly theological works wherein they explicitly reveal the
religious (usually Christian) foundation of their approach, but in his latest
book Dembski denies that his theory is anything but an unbiased
analysis, supported by mathematical arguments, which proves the concept of
intelligent design regardless of any religious connotations. However,
perusing the literary production of "design theorists" reveals a paucity of
evidence in favor of their position but an abundance of casuistry. Behe's
concept of irreducible complexity seems to be almost the sole case which may be
capable of being presented as a scientific argument based on biochemical
evidence. Therefore intelligent design creationists like Dembski desperately
need to prove Behe's thesis. If Behe's thesis is refuted (as, in my view, it
should be) then the design creationists are left with nothing which would
constitute even a semblance of a scientific discourse.
In my article [5] I reviewed Behe's argument in detail and came to the
conclusion that his concept is contrary to logic and to facts. I will not
repeat this argument here but will address Dembski's defense of Behe.
In my article [5] I did not devote any substantial space to the critique of
Behe's favorite example – that of a mousetrap which he presented as an example
of irreducible complexity and therefore as a model of a biological cell from the
viewpoint of that concept. I did not do it because it seemed to be a secondary
point and proving the inadequacy of a particular example did not seem as
important as unearthing the principal flaws in Behe's concept. I wrote a
little more about Behe's mousetrap model in another paper [6] where I
discussed the question of models in science in general and pointed to the faults
of Behe's model.
Behe's model – his mousetrap – was debunked by John H. McDonald in a posting
[7]. McDonald demonstrated how Behe's five-part mousetrap can be gradually
reduced to a four-part, three-part, two-part and finally one-part contraption,
each preserving the ability to catch mice, albeit not as well as the five-part
construction. Therefore, the mousetrap does not seem to be irreducibly complex,
contrary to Behe's assertion.
I readily concede that demonstrating the weakness of Behe's particular example
does not in itself disprove his thesis. It only shows that he happened to
suggest a bad model. It may, though,
presumably cast a shadow on Behe's overall image and hence potentially
undermine Behe's overall idea in the eyes of those readers who have not yet made
up their minds in regard to the ID theory. That is something ID
creationists are not willing to accept. Therefore Dembski devoted many pages of his
new book not only to defending Behe's overall concept, but also to an attempt to overturn McDonald's particular debunking of Behe's mousetrap model.
When McDonald demonstrated the inadequacy of Behe's mousetrap example, the
proper behavior for Behe the scientist would have been either to offer a
reasonable rejoinder or to concede that his example was not very well thought
out. In the latter case he might still insist that the inadequacy of his example
does not translate into the inadequacy of his entire theory and such a position,
which is legitimate, could be discussed on its own terms. Behe and Dembski chose
a different approach: to defend Behe's example by using casuistry. (For a partial discussion of Behe's response to McDonald, see the appendix to the article at Irreducible Contradiction.) This is one more illustration of the typical behavior of ID proponents. They
seem to be interested not so much in establishing the truth as in winning the
dispute at any cost, by whatever means available.
In order to see the lack of substantiation in Dembski's defense of the mousetrap
model, let us quote from Behe. In his book Darwin's Black Box Behe writes: "If
any of the components of the mousetrap (the base, hammer, spring, catch, or
holding bar) is removed, then the trap does not function. In other words, the
simple little mousetrap has no ability to trap a mouse until several separate
parts are all assembled. Because the mousetrap is necessarily composed of
several parts, it is irreducibly complex."
Note that Behe did not suggest any additional conditions the trap must meet to
be characterized as irreducibly complex. The only feature determining, according
to Behe, the irreducible complexity of the trap was that the removal of any of
its parts must make the remaining set of parts incapable of catching mice, and
that the trap cannot be functional until all five parts are in place.
Confronted with McDonald's spectacular counterexample, Dembski resorts to a
transparently contrived argument purported to show
that McDonald's example does not prove the inadequacy of Behe's model.
Dembski's main argument concentrates on the particular shapes of the mousetrap
parts which are slightly different in each of McDonald's simplified
versions. When McDonald removes this or that part of the mousetrap, he
slightly modifies the shape of the remaining parts thus preserving the trap's
functionality albeit diminishing the trap's quality. It is obvious that the
process can be reversed. One can start with the simplest one-part trap,
then gradually add more parts, improving its ability to catch mice at every step
of the procedure. In Dembski's view, the modification of the parts' shape at
each stage of McDonald's procedure makes his example irrelevant to Behe's
thesis. Look again at the above quotation from Behe. He did not mention
the additional condition introduced by Dembski – that the removal of the trap's
parts must not be accompanied by any modification of the shape of the remaining
parts. Perhaps Behe indeed had in mind the notion that in an
irreducibly complex system the parts remaining after the removal of some other
part must retain their initial shape. However, he did not spell out such a
condition so McDonald was not constrained by it at the time he suggested his
counterexample. Furthermore, and it is more important, why shouldn't the shape of parts in the "reduced" versions of the mousetrap change? McDonald's example satisfied
all Behe's originally formulated conditions – it showed that, contrary to Behe's
position, parts of the trap can be removed and the remaining contraption can be
made to preserve functionality (albeit at a lower level of fitness). This
fully debunked Behe's assertion as he originally formulated it.
Is the requirement added by Dembski (that the parts remaining after the
removal of a certain other part retain their shape) indeed relevant to the
discrimination between irreducible and reducible complexities? Remember that
Behe's example of the alleged irreducible complexity of a mousetrap was
suggested as an illustration of his concept of irreducible complexity as applied
to biological systems. The Darwinian theory of evolution includes as an inseparable
part the concept of gradual changes resulting in the slow
accumulation of features advantageous for organisms. If we stick to Behe's
example as a model of a biological system, the evolution of a mousetrap in
McDonald's scheme from a one-part to a five-part contraption comprises steps
which are not really small. Actually each step of McDonald's scheme may be
viewed as the sum of many smaller steps wherein the parts gradually change their
shape and at a certain stage of the evolution another part is added. At a
certain step of that evolutionary process, some of the already existing parts of
the system may again change their shape, in particular become simpler, if such
a simplification of shape is not detrimental to the organism's fitness. The
mousetrap's parts, if the trap is at all viewed as a model of a biological system, can
modify their shape at every step of the evolution, preserving the trap's ability
to work at a higher level of fitness. Though McDonald's scheme shows only
four steps in the path from a one-part to the five-part contraption, actually
that path may have comprised many more intermediate steps not shown. Therefore
the modification of the parts' shape at every step from a one-part to the five-part
contraption in no way diminishes the power of McDonald's example, which
decisively debunks Behe's statement quoted above.
Moreover, if the requirement of unchangeable shape of all parts of the mousetrap
were included in Behe's original formulation, McDonald, as well as anybody else
with some engineering experience, could have shown a one-part, a two-part,
then a three-part, a four-part and finally a five-part mousetrap wherein the
shapes of the original parts would not change at all in any of the steps
building up to the final five-part contraption. This would result in a five-part
trap whose parts would have shapes different from those shown in Behe's picture,
but such a five-part trap would be as good as Behe's, even if its parts
did not have the simplest shape possible. In living organisms, organs with
unnecessarily complex shapes are common, testifying to their evolutionary
history.
The above considerations have recently found confirmation, as McDonald has updated his scheme of a gradually developing mousetrap. This new version of McDonald's example is not discussed in Dembski's book, apparently because it appeared after the book was submitted for publication. In his new (animated) scheme, McDonald shows a series of mousetraps starting with a simple bent wire serving as a primitive mousetrap. Step by small step, by adding one more part at each step and gradually modifying the functions of the trap's parts, so that parts which were originally optional become necessary, McDonald illustrates how a process of the trap's development may proceed. While in McDonald's example each modification of the device was made by the designer (McDonald), in the process of evolution a similar progress from a primitive to a more complex device could have been governed by a combination of mutations and natural selection. McDonald's updated scheme shows the lack of substantiation in Dembski's attempt to defend Behe's model.
McDonald states that his example does not represent the actual process of
biological evolution. Dembski grasps at that statement using it to insist that
McDonald's example does not prove that Behe's position is wrong or that Behe's
mousetrap model is bad. He says: "the problem is that his progression of
mousetraps has little connection to biological reality." You can say that
again, Dr. Dembski. What you pretend not to notice is that the reason the
progression of mousetraps does not represent biological reality is simply that
Behe's example has very little to do with biological reality. Obviously, when
debunking Behe's model which has little to do with biological reality, McDonald
had no need and no way to provide an example which would be more relevant to
biological reality than Behe's original model was. Neither Behe's model
nor McDonald's counterexample has much to do with biological reality. Dembski
is prepared to happily forgive this from Behe but not from McDonald. This
is typical of Dembski's selective logic.
The difference between McDonald
and Behe is that the former realizes and freely admits that his scheme does not
adequately represent biological evolution, whereas Behe tries to unduly use his
model as an illustration of biological reality. What McDonald's scheme does very
well is show the lack of substantiation in Behe's statement about the
irreducible complexity of a mousetrap. Given Dembski's education and his
"formidable intelligence" (in the words of one of Dembski's admirers found in a
blurb in the DNFL book) it is hard to believe that he himself does not realize
the fallacy of his attack on McDonald and of his defense of Behe. A much
more plausible assumption seems to be that he is not interested in an unbiased
evaluation of arguments but only in winning the dispute at any cost, by whatever
means he can contrive.
3. Dembski salvages the irreducible complexity
Dembski's rendition of the dispute between Behe and his detractors is replete
with inaccuracies. I don't think, though, that a detailed analysis of
these inaccuracies is important or interesting. It would be proper if
Dembski's work were a scientific monograph summarizing research reported in
peer-reviewed scientific journals and discussed at scientific conferences.
Since, however, Dembski chose to address his book to a general audience and thus
eschewed the peer-review process, it seems better to analyze just the most
salient points of Dembski's defense of Behe's concept of irreducible
complexity.
On page xvii Dembski says: "I am
not a fan of notation-heavy prose and avoid it whenever possible."
However, just leafing through his new book reveals that the quoted statement is
contrary to the facts. Like his preceding book, the new one is chock-full of
mathematical symbols, more often than not adding nothing of substance to his
discourse.
A good example is found on pages 271-279, in a section titled "The Logic of
Invariants." Here Dembski resorts to his favorite method of discussion –
presenting a convoluted chain of arguments in a heavily symbolic form,
incomprehensible to a general reader, who must be impressed by the mathematical
sophistication of Dembski's discourse, and thus conclude that an expert of such
intellectual caliber, with his PhD degree in mathematics, surely must know what
he is talking about. Actually, all this mathematical discourse is largely
irrelevant to the concept of irreducible complexity.
Dembski explains the concept of
invariant in a heavily symbolic form wherein his style is unnecessarily
complicated so that it requires a considerable effort even for a reader trained
in mathematics to comprehend what Dembski actually means to say. I suggest that
an average reader try to decipher the following passage on page 274: "...define a
function Invar on Ω (for definiteness assume Invar is real-valued,
i.e., Invar takes values in the real numbers R). Let A = { r ∈ R | there
exists some natural number n and some X in Init such that Invar
(φ^{n})(X) = r } and B = { r ∈ R | there exists some Y in Term
such that Invar (Y) = r }..." etc. The quoted passage is just a fraction of
a much longer exercise in heavily symbolic discussion. If it is an example of
Dembski's attempts to avoid "notation-heavy prose," I wonder what he
considers to be really notation-heavy prose. The sole purpose of that convoluted discussion, heavily loaded with mathematical symbols, seems to be to provide a definition of the concept of an invariant.
However, those readers who have
enough experience with mathematics to have comprehended Dembski's
notation-heavy paragraphs certainly know what an invariant is and need no
explanation. Those readers who don't know what an invariant is presumably
are also not prepared to digest the mathematical exercise on page 274, which
therefore does not seem to serve any useful purpose.
In any case, the definition of an
invariant could be done in one simple sentence. For example, for the purpose of
Dembski's discourse it would be sufficient to say that an invariant of a certain
procedure is a quantity whose value does not change in that procedure.
(For example, entropy is an invariant of a reversible adiabatic process.)
After having devoted considerable
space and effort to introduce his mathematically loaded definition of an
invariant, Dembski makes no use of that definition anywhere afterward. Instead,
he tries to apply the concept of an invariant as a tool for what he calls
proscriptive generalization, one more of his self-coined terms.
Proscriptive generalization simply means that certain processes or events are
claimed to be impossible based on some general consideration rather than on a
detailed analysis of the factors preventing the occurrence of these processes or
events. What Dembski asserts essentially can be spelled out as the
following simple statement: if it is found that in a certain process a quantity
which is an invariant of that process would actually change, then such a process
must be considered as impossible.
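The statement above can be illustrated by a minimal sketch in Python. Everything in it is an assumption introduced purely for illustration and is not taken from Dembski's book: the toy process (a counter that may only change by +2 or -2), the function names, and the brute-force search used as a cross-check. Parity is an invariant of this toy process, so any target whose parity differs from that of the starting value can be declared unreachable without tracing a single path.

```python
from collections import deque


def step(x):
    """Hypothetical toy process: the state may only change by +2 or -2."""
    return [x + 2, x - 2]


def invar(x):
    """Candidate invariant: parity of the state (0 for even, 1 for odd)."""
    return x % 2


def ruled_out_by_invariant(start, target):
    """Proscriptive check: a target with a different invariant value is unreachable."""
    return invar(start) != invar(target)


def reachable_by_search(start, target, max_steps):
    """Breadth-first search over at most max_steps steps, used only as a cross-check."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        x, depth = frontier.popleft()
        if x == target:
            return True
        if depth == max_steps:
            continue  # do not expand states beyond the step budget
        for y in step(x):
            if y not in seen:
                seen.add(y)
                frontier.append((y, depth + 1))
    return False


if __name__ == "__main__":
    print(ruled_out_by_invariant(0, 7))    # parity differs, so 7 is excluded
    print(reachable_by_search(0, 7, 10))   # the search agrees within 10 steps
    print(reachable_by_search(0, 8, 10))   # same parity, and a path exists
```

Note that the shortcut works in one direction only: a differing invariant excludes the target, whereas an equal invariant does not by itself show that the target can be reached. The whole argument is also only as good as the proof that the chosen quantity really is invariant under every admissible step.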
Whereas the gist of the above
statement itself meets no objection, it is actually of very little informative
value, and the lengthy and convoluted discourse by Dembski aimed at arriving at
that assertion is plainly redundant and seems only to serve as a scientific-looking
embellishment.
Indeed, when Dembski turns to his defense of Behe's concept of irreducible
complexity, all he says in regard to his preceding lengthy mathematical exercise
is that irreducible complexity is "an invariant for the Darwinian process of
random variation and natural selection." To make such a statement there
was no need for all of the preceding mathematical-looking exercise. More
important, though, is that the above statement, which purports to reflect Behe's
position, cannot be taken for granted. It requires proof and none has been
provided by either Behe or Dembski.
Apparently aware of the strong objections to Behe's concept from many
professional biologists, Dembski uses a clever device – he admits that Behe's
concept is not faultless. He writes: "Behe's idea of irreducible
complexity is neither exactly correct nor wrong.... Instead it is salvageable."
(page 280). Contrary to his statement, Dembski actually tries to prove that
Behe's idea is indeed correct, and asserts that this must become clear if only
Behe's original definition of irreducible complexity is slightly fixed. Hence,
in order to salvage the concept of irreducible complexity, which is
needed by the ID advocates to substantiate their otherwise arbitrary
conceptual system, Dembski, who has never admitted a single error in his own output, is even prepared to sacrifice to a certain extent the
sterling reputation of his cohort Behe.
Having quoted Behe's definition, Dembski then
proceeds to repair it in five consecutive steps. Here is Behe's original
definition quoted by Dembski:
Definition IC_{init}: "A system is irreducibly complex if it is 'composed of several
well-matched, interacting parts that contribute to the basic function, wherein
the removal of any one of the parts causes the system to effectively cease
functioning'."
And here is the final (salvaged)
definition of irreducible complexity suggested by Dembski as the result of his
five-step salvaging effort:
Definition IC_{final}: "A system performing a given basic function is irreducibly
complex if it includes a set of well-matched, mutually interacting parts
such that each part in the set is indispensable to maintaining the system's
basic, and therefore original, function. The set of these indispensable parts is
known as the irreducible core of the system."
Looking at the "salvaged definition" of Irreducible Complexity according to Behe-Dembski immediately reveals that it is not a proper definition of anything. It can be argued that the fallacies of the above quasi-definition can be forgiven in the case of Behe, who is a qualified biochemist but not a professional logician (although, as a scientist, he is expected to offer a definition that is at least reasonably logical). However, Dembski, with his collection of advanced degrees including a PhD in philosophy, should have known better than to suggest a definition which is contrary to the elementary requirements of logic. Of course there is not much surprising in that, since from his previous book The Design Inference we already know that formulating definitions is by no means Dembski's forte.
The above definition of Irreducible Complexity purports to be a deductive definition, whose standard form necessarily is that of a triad containing the following elements: 1) A general concept, which is supposed to be known to all participants of the discourse and interpreted by all of them in the same way, 2) Qualifiers, which point to those features of the concept to be defined that distinguish it from all other concepts also encompassed by the general concept spelled out in point 1, and 3) The target of the definition, i.e., the concept to be defined. One of the requirements for a proper deductive definition is that whatever belongs in item 2 cannot simultaneously belong in either item 1 or item 3. Otherwise the definition would turn out to be circular and therefore void of informative value.
Another requirement for a proper definition is that the general concept in item 1 and the qualifiers in item 2 must be precisely predefined according to a consensus among all the participants of the discourse.
It is easy to see that the Behe-Dembski "definition" of IC above fails on both counts.
In that definition the target of definition, i.e., the concept to be defined, is IC, the famous Irreducible Complexity. Rather than defining IC directly, the above definition defines a system which is irreducibly complex. Such a substitution is legitimate because it is exactly what we are interested in: a definition of an irreducibly complex system, from which, if desired, the definition of Irreducible Complexity per se can be inferred.
In the Behe-Dembski definition above, the general concept (item 1 of the triad) is "a system performing a given basic function." Already this item does not meet the requirement of being a general concept known to all and interpreted in exactly the same way. Unfortunately, this concept has no definite meaning and allows for various interpretations. What is a "basic function"? How can it be distinguished from a "non-basic" one? Moreover, what is a "system"? How should the boundaries of a system be defined, separating it from whatever is beyond the system? Behe and Dembski offer no definitions of these constituents of item 1. There is no general consensus regarding the precise meaning of these terms. In many situations the boundaries of a system can be chosen in many different ways.
Item 1 in the Behe-Dembski definition would be legitimate and meaningful if the concepts of a "system" and of a "basic function" were defined beforehand. It is a common logical lapse for the general concept (item 1 of the triad) to have no commonly accepted definition, which renders the definition meaningless.
The target of the above definition is an "Irreducibly Complex System." The qualifiers are defined there as the requirement that a system "includes a set of well-matched, mutually interacting parts such that each part in the set is indispensable to maintaining the system's basic, and therefore original, function."
Again, the subdefinition of the qualifiers does not meet the requirement of being known and identically interpreted by all. When are the parts "well-matched" and when are they not "well-matched"? The answer is uncertain. What is well-matched for John may be poorly matched for Mary. However, even if the concept of "well-matched parts" were unambiguously defined and known to all, this would not save the Behe-Dembski definition. The reason for that is the egregious mix-up of items 1 and 2. The subdefinition of the qualifiers refers to the same "basic function" which is a part of item 1. This makes the definition circular and therefore void of informative value. Indeed, to verify that the parts are indispensable, we are told to test whether or not they serve the "basic function." On the other hand, the "basic function" is a part of the general concept, and therefore cannot serve also as a qualifier. To be functional, the qualifiers must be independent of the general concept.
The conclusion: both Behe's original and Dembski's final (salvaged) formal definitions of Irreducible Complexity are logically deficient and have little, if any, meaning. Of course, Behe and Dembski may limit themselves to more informal, partially descriptive definitions, which is a legitimate manner of discussion, but their attempt to offer a strict formal definition can hardly be viewed as successful.
Now let us discuss the above definitions of IC from an informal viewpoint.
The comparison of Behe's original
("initial") definition with the final, "salvaged" definition by Dembski boils
down to, first, the introduction of the concept of "irreducible core" of the
system and, second, to the assertion that a system of reduced complexity must
retain the "basic (and therefore original)" function in order for the original
system to be considered not irreducibly complex.
Behe's definition did not contain
an indication that in an irreducibly complex system not all of its parts may be
necessary for its proper functioning. Dembski's "salvaged" definition
allows for the existence of some parts of a system which can be removed without
eliminating its functionality. However, if a system includes a set
of parts all of which are necessary for maintaining its basic (i.e. original)
functionality, which set Dembski calls its "irreducible core," then the system
is irreducibly complex.
The "salvaged" definition does not
seem to add anything of substance to Behe's original concept, which, according
to Dembski, is "neither exactly correct nor wrong." The question of
whether all parts of the system are necessary for its original functionality or
only those included in an irreducible core, is of no significance.
The boundaries of a system are often set arbitrarily. The irreducible core may
just as well be viewed as itself the system under consideration. The only
purpose of Dembski's modification is to enable him to reject an argument
pointing to a system which retains its functionality after the removal of some
of its parts by simply asserting that either the removed part did not belong to
the "irreducible core," or that the functionality of the reduced system is not
equivalent to the "basic (i.e. original)" one.
Dembski's salvaging argument does
not really save Behe's concept because it has nothing to do with the actual
critique of that concept.
First, Behe has never proved that
even a single biochemical system he described was indeed irreducibly complex
according to his definition. Dembski's argument that the reduced system
must perform exactly the same "basic" function is an arbitrary
requirement. Biologists tell us that in the course of evolution many
systems changed their functionality. A biochemical system which in modern
organisms clots blood, in some of its preceding, simpler form could very well
have performed a different function or even no function at all, acquiring its
ability to do this or that job only at a certain stage of evolution.
Dembski also offers a number of
refutations to the critique of Behe's concept by Kenneth Miller, Niall Shanks,
Karl Joplin and Russell Doolittle. Except for Shanks, these critics,
unlike Dembski, are professional biologists who criticize Behe's ideas from the
biological standpoint. Since I am not a biologist, I leave the discussion of
Dembski's attack on the above listed experts to those better versed in biology
than myself. However, there is one point in Dembski's attack on Doolittle which
has little to do with biology and everything to do with the ethics of a
scientific discussion. On page 281 Dembski discusses the dispute between Behe
and Doolittle wherein he essentially reiterates the argument used by Behe
himself in a paper published in the collection [8]. In that paper Behe
claimed that after he replied to Doolittle's critique [9] of Behe's work, Doolittle
conceded being wrong in his interpretation of the experiment by Bugge et al. [10].
I have contacted Professor Doolittle and asked him to confirm that he indeed
acknowledged his error in the interpretation of the experiment in question.
Professor Doolittle unequivocally and energetically denied having provided a
reason for such an assertion by Behe. In his book, Dembski essentially repeats
Behe's claim saying that "Doolittle's counterexample failed." This assertion is
not based on evidence. When I see the methods of discussion employed by Behe and
Dembski in this case, I feel that every statement by these writers must be
carefully examined since there seems to be a good chance they did not verify
their sources diligently enough.
4. Dembski's explanatory filter revisited
In his new book [1] Dembski devotes a considerable part of his discussion
to his "Explanatory Filter" (referred to below as EF). To my knowledge, this is at
least the sixth time Dembski has published a description of the device which
allegedly enables one to reliably distinguish among the three causal antecedents
of an event – necessity (also referred to as regularity or law), chance and
design. In Dembski's scheme, these three causal antecedents of an event
are supposed to cover all possibilities and are mutually exclusive.
In my earlier discussion [2] of Dembski's "Design Theory" I reviewed many parts
of Dembski's previous discourse, including a rather detailed analysis of his EF,
and offered a number of arguments against the validity and reliability of that
allegedly powerful analytical tool. In this review of DNFL I will not
repeat the arguments detailed in [2]. I will, though, provide some additional
examples showing the inadequacy of EF, and will also discuss some rather
telltale alterations of Dembski's presentation of his theory in [1] as compared
to his five earlier renditions of EF.
EF comprises three so-called "nodes," which are three steps of analysis aimed at
determining whether an event is due to law (regularity, necessity), chance, or
design.
At the first node, according to Dembski, one estimates the probability of the
event in question and if this probability turns out to be "large," the event is
attributed to law (regularity, necessity). The term "large" was not
quantitatively defined by Dembski. If the probability is "not large," whatever
this means quantitatively, it passes to the second node. Here the decision is to
be made whether the probability of the event is "intermediate" or "small."
Again, these terms have not been defined quantitatively. If the probability of
the event is determined to be "intermediate," the event is attributed to chance.
If the probability is "small" the event passes to the third node, where the
final judgment is to be made whether the event must be attributed to chance or
to design.
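The three nodes just described can be written down as a short decision procedure. The following is only a sketch of the scheme as summarized above, not code from DNFL: the function name and the two numeric cutoffs are hypothetical, chosen for illustration because Dembski gives no quantitative definitions of "large," "intermediate," or "small."

```python
# Hypothetical cutoffs: the text notes that "large", "intermediate" and
# "small" are never defined quantitatively, so these values are assumptions.
LARGE = 0.9
SMALL = 1e-6


def explanatory_filter(probability: float, is_specified: bool,
                       large: float = LARGE, small: float = SMALL) -> str:
    """Return 'law', 'chance' or 'design' by walking the three nodes."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # Node 1: a "large" probability is attributed to law (regularity).
    if probability >= large:
        return "law"
    # Node 2: an "intermediate" probability is attributed to chance.
    if probability > small:
        return "chance"
    # Node 3: a "small" probability is attributed to design only if
    # the event is also specified; otherwise it stays with chance.
    return "design" if is_specified else "chance"


print(explanatory_filter(0.95, False))
print(explanatory_filter(0.30, True))
print(explanatory_filter(1e-9, True))
```

Note that the function takes the probability as a ready-made input. The sketch says nothing about where that number is supposed to come from.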
I will now discuss separately the first and the second nodes of EF on the one
hand, and the third node on the other hand. The reason for such a
separation is that Dembski's analytical procedure is essentially the same for
the first and the second nodes but is rather different for the third node. At
the first and the second nodes the only criterion according to Dembski's scheme
is the event's probability, whereas at the third node the criterion is twofold,
comprising low probability and specification.
In my previous analysis of EF I
argued that the procedure suggested by Dembski for the first and the second
nodes of EF is unrealistic, thus making his entire triad-like scheme
meaningless. As I argued there, the actual procedure is necessarily
opposite to Dembski's scheme. For example, at the first node of EF,
according to Dembski, we conclude that an event resulted from regularity (law,
necessity) if we find that its probability is large. To see that Dembski's
prescription is meaningless, consider the relationship between two concepts –
the large probability of an event and a law (regularity, necessity) which
determines the occurrence of that event. It does not take a "formidable
intelligence" (which, according to a blurb in DNFL, characterizes Dembski) to
see the simple fact – of these two concepts, law (regularity,
necessity) is the cause and high probability is the consequence of the law
(regularity, necessity). Therefore it is contrary to elementary logic to
suggest that the attribution of an event to law (regularity, necessity) can
result from a prior estimate of the event's probability. If one does not
know about the existence of a law (regularity, necessity) one has no way to
conclude that the event's probability is high.
Dembski's scenario, according to
which we in some mysterious way conclude that the event's probability is high,
is unrealistic because we cannot estimate the probability of an event unless we
know its causal history. This history is a necessary part of the
background knowledge required for estimating the event's
probability. In particular, to conclude that the event's probability is
"high" we have to first ascertain that the event was caused by law (regularity,
necessity), not the other way around, as Dembski suggested. Likewise, at the
second node of EF, we cannot assert that the event's probability is either
"intermediate" or "small," unless we know the event's causal history, which is
again contrary to Dembski's unrealistic scheme.
Let me suggest an example
illustrating the above argument. Imagine that a small detachment of
American soldiers in Afghanistan moves in a rugged mountainous terrain.
They enter a narrow gorge between two steep rocky slopes. When they are in
the middle of the gorge, a large stone rolls down from the slope and hits the
path close to the soldiers.
Now let us try to apply EF to this
event. According to Dembski, there are three and only three distinctive
possibilities – that the rock fall was due either to regularity, chance,
or design. If such rock falls happen in this gorge regularly, say once every few
minutes, the event in question has to be attributed to regularity (law,
necessity) and therefore its probability is estimated as high. If such rock
falls occur not very often, say once in a couple of days, the knowledge of the
frequency of such events is what enables one to conclude that the probability of
that event was "intermediate." Finally, if such falls of stones occur in
this gorge extremely rarely, say once in ten to fifteen years, the event,
according to Dembski's EF, is to be analyzed within the framework of the third
node, wherein the discrimination between chance and design has to be made by
looking for a possible pattern, i.e. specification. Design in this case may mean
that some Taliban fighters deliberately pushed the rock from the crest of the
rim of the ridge in order to hit the soldiers.
Note, though, that in each case
the probability is estimated based on the knowledge of the causal history of the
event. Dembski's scheme prescribes the opposite procedure – first the
probability is somehow "read off the event," and based on the value estimated,
the event is attributed to one of the three causes. This procedure is
unrealistic.
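The dependence of the estimate on the known history can be illustrated numerically. The sketch below rests on assumptions that are mine and not part of the example above: rock falls are modeled as a Poisson process, the soldiers are assumed to spend 15 minutes in the gorge, and one representative rate is picked from each of the three verbal descriptions.

```python
import math


def prob_at_least_one_fall(falls_per_hour: float, hours_in_gorge: float) -> float:
    """P(at least one rock fall during the crossing), Poisson model (assumed)."""
    return 1.0 - math.exp(-falls_per_hour * hours_in_gorge)


HOURS_IN_GORGE = 0.25  # assumed 15-minute crossing

# Representative rates (falls per hour) picked from the verbal descriptions.
known_histories = {
    "once every few minutes (one per 3 min)": 20.0,
    "once in a couple of days (one per 48 h)": 1.0 / 48.0,
    "once in ten to fifteen years (one per 12.5 y)": 1.0 / (12.5 * 365 * 24),
}

for label, rate in known_histories.items():
    p = prob_at_least_one_fall(rate, HOURS_IN_GORGE)
    print(f"{label}: P = {p:.2e}")
```

In every row the probability is computed from the rate, that is, from the known history of the gorge. Without such a rate the function has nothing to work with.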
At the first node of EF, according
to Dembski, the assumption of regularity at work is to be tested.
Obviously the occurrence of the
event itself does not provide any clue as to whether its probability is high,
intermediate or small. If the soldiers continue their trip, they will
never find out whether the probability of the described event was large,
intermediate, or small. Using Dembski's own terminology, the probability
of an event cannot be "read off the event."
Assume, though, that the soldiers
were warned that in that gorge rocks fall regularly and hit the path. Now they
possess the necessary background knowledge – they know in advance that the rocks
fall regularly. They have a prior knowledge of a regularity (law,
necessity) and therefore estimate the probability of the actual event
(the falling stone) as high. Its probability was high because it was due
to regularity. In accordance with Dembski's unrealistic scheme, though,
the soldiers should first have somehow estimated the probability of a rock
falling from that slope and, having found it to be large (how?), decided that a
regularity was at work. In fact, to ascribe large probability to an event, the
knowledge about its being due to law (regularity, necessity) must precede the
estimation of probability.
Now imagine the soldiers walk
through another gorge and again a large stone rolls down at the moment they are
in the gorge's middle section. Again, there is no way they can follow
Dembski's scheme and estimate the event's probability without
knowing the history of rock falls in that gorge. Assume
that this time they were warned that in this gorge rocks fall from time to time,
but not too often, say once in so many hours. Now the probability of the
event can be estimated as being neither very high nor very small. This
"intermediate" probability was estimated based on the known history of similar
events, whereas in Dembski's scheme the probability estimate is supposed to be
made without any such knowledge. In Dembski's scheme the event is attributed to
chance because its probability was found to be "intermediate." How
this probability could have been "read off the event," i.e., estimated on its
own, without utilizing the "background knowledge" which in this case must
necessarily include the history of rock falls, remains Dembski's secret. The
actual procedure is opposite to that suggested by Dembski: the probability is
estimated as not large enough to attribute the event to regularity based on the
background knowledge available, not the other way around as Dembski's scheme
requires. If no such background knowledge is available, no useful estimate of
probability is possible, thus rendering the first and the second nodes of
Dembski's EF meaningless.
Since the first and the second
nodes of EF in Dembski's scheme are contrary to elementary logic, the triad-like
structure of his EF collapses. The remaining part of EF, its third node,
is, however, a different story requiring a little more detailed discussion.
5. The third node of the explanatory filter
Let us look at Dembski’s treatment of the third node in DNFL.
In all five previous renditions of his EF, Dembski suggested that the discrimination between chance and design is made at the third node of EF by two criteria. One is the event’s probability (this time, unlike at the two preceding nodes, estimated upon the definite assumption that the event was due to chance) and the other is "specification."
As I argued in my previous review of Dembski’s theory, the concept of specification, which was presented by Dembski in a heavily symbolic form, actually could be rather simply defined as "a subjectively recognizable pattern." This simple definition follows from all those examples Dembski provided in his previous publications. Dembski, however, chose to cloak this concept in a convoluted mathematical mantle.
To infer design according to Dembski, an event, besides being improbable on a chance hypothesis (the latter not being specified) must also display "specified complexity," or, for short, specification. (The concept of "specified complexity," which is assigned a great importance in Dembski’s theory, is discussed in detail in another section of this review.)
Specification, according to Dembski’s previous renditions of his theory, in turn comprises two necessary components, one called detachability and the other delimitation. Detachability, in turn, according to Dembski’s previous renditions, necessarily comprises two subcomponents, one named conditional independence of the background knowledge and the other tractability.
This multicomponent scheme has been criticized from various viewpoints by various reviewers of Dembski’s publications. In particular, I argued against the excessively convoluted way these concepts were presented by Dembski, pointing out that the concept of specification in itself does not require all that mathematical symbolism and that all these components of specification in Dembski’s rendition actually play no useful role. Apparently the critique has had some effect after all, since in [1] Dembski offers a discussion of specification different in certain respects from his previous opuses. Of course Dembski never admits that anything was wrong in his previous publications or that he was influenced by criticism. However, when discussing specification in [1], Dembski introduces two alterations of his earlier discourse.
As mentioned above, in Dembski’s earlier rendition of specification, the concept encompassed three components, conditional independence of background knowledge (denoted CINDE), tractability (denoted TRACT), and delimitation (denoted DELIM).
In the new rendition found in DNFL, TRACT is no longer a constituent of detachability. On page 66 in [1], Dembski says: "…I have retained the conditional independence but removed the tractability condition." In The Design Inference Dembski spent considerable effort to justify the inclusion of TRACT into his concept of detachability, using both plain words and mathematical symbolism plus examples illustrating the importance of that alleged insight into the design inference. In my previous discussion [2] of Dembski’s The Design Inference, I argued that all this convoluted structure of the design inference, and of the concept of specification in particular, was excessive, while the concepts of tractability, delimitation, etc., played no useful role and served only to embellish the discourse with a complex mathematical façade. Although in [1] Dembski does not explicitly admit any faults of his earlier argument, he actually does it implicitly, by removing from the concept of detachability its TRACT component which he previously viewed as necessary and important. Instead of frankly admitting that he goofed, Dembski says now that tractability is not really necessary within detachability but rather should be moved to his Generic Chance Elimination Argument (GCEA).
The question of whether tractability is indeed a useful part of the GCEA or is as useless there as it is in detachability is a separate issue. The fact is that Dembski implicitly admits his previous error but is reluctant to say this directly.
Furthermore, there is one more alteration of Dembski’s earlier discussion of specification. A fate worse than that of TRACT befell another component of Dembski’s earlier scheme, the one he denoted DELIM. Whereas in [1] Dembski at least points out the deletion of TRACT from detachability, he does not mention at all the elimination of DELIM, which in his earlier treatment was deemed a necessary and important component of specification.
In his new book, DELIM has disappeared. Dembski does not explain what happened to that feature which he previously claimed to be a necessary part of specification.
Obviously, if that term is not mentioned any longer when specification is being discussed, it was not really a necessary component of specification, which negates Dembski’s convoluted discussion of DELIM in his previous book. Normally in a scientific publication such alterations of the author’s earlier position are explained and when appropriate, the errors or inadequacy of the earlier argument are admitted. Of course DNFL is not really a scientific book, although Dembski wants it to be accepted as such.
If we remove from EF the first and the second nodes, will the remaining third node provide a reliable tool to attribute an event either to design or to chance? As I argued in my previous discussion [2] of Dembski’s work, the third node is not a reliable tool. Dembski himself admits that EF can yield false negatives, i.e., attribute an event to chance when it was actually designed. He insisted, though, that when EF attributes an event to design this is reliable, i.e., that EF does not yield false positives. Since Dembski made this claim, many examples of false positives produced by EF have been demonstrated. Dembski does not address these examples in DNFL.
As I argued previously, one of the main faults of Dembski’s scheme is his attributing to specification the status of a kind of magic. There is, though, nothing magical in that concept. In the examples of false negatives specification was not discerned but the event was obviously designed. In the examples of false positives the specification seemed to be present but the event was due to chance.
Specification, as follows from Dembski’s own examples, is nothing more than a subjectively recognized pattern. It can be illusory or real, but it has no exclusive status among many factors pointing either to design or to chance. Recognition of a pattern (which necessarily is subjective) affects the estimate of the event’s probability, but so do many other factors.
I discussed this thesis in detail in my previous discussion [2] of Dembski’s work.
In conclusion of this section, let us look again at the example discussed at the beginning, that of a stone falling into a gorge, and review it within the scope of the third node of EF.
At the third node, the event of low probability (estimated on a chance hypothesis) has to be attributed to design if specification is discovered. In one of Dembski’s own examples, if in an archery competition the arrow hits a small target painted on a wall, the event is specified. In the case of the stone hitting the path in a gorge, if the stone falls exactly at the spot where the soldiers are at that moment, the event is likewise specified. In the case of archery competition, specification is in that the target has a specific shape and location, unique among all other locations on the wall. In our example, the soldiers happen to be at a specific spot, unique among all other parts of the gorge, so this example is exactly like Dembski’s. Hence, if the likelihood of a chance occurrence of the event is estimated as very small (based on the known history of the gorge) and the stone falls exactly on the spot where the soldiers happen to be at that moment, we have to attribute the event to design, i.e. to assume that some enemy fighters deliberately pushed the stone down the slope to hit the soldiers. Yes, this is a reasonable assumption. However, this assumption still may happen to be wrong.
Indeed, once in a while stones do fall in that gorge. It happens very rarely, but it is not an impossible event. Could such a coincidence happen that a stone falls exactly at the moment the soldiers are where it lands? Of course, it could. Why, then, do we attribute the event to design? Because the probability of such a coincidence is very small and only because of that. Why is the probability of the event in question estimated as being very small? Since stones fall in that gorge from time to time, albeit very rarely, the probability that at some moment a stone would fall upon some spot within the gorge is 100%. What makes the probability of the actual event so small? Specification – the choice of a specific spot and time.
My point is that all that specification does is decrease the probability of the event in question. Yes, specification (any specification, i.e., any choice of a specific event out of the multitude of all possible events, and not necessarily the particular kind of specification meeting Dembski’s criteria) always decreases the estimate of probability of an event. But so do many other factors which may have nothing to do with specification. For example, if the soldiers know that a reconnaissance detachment has investigated the slopes flanking the gorge and found that all the stones above the path are rather strongly embedded in the ground, this would greatly decrease the likelihood of a chance fall of rocks, and this decrease of likelihood would have nothing to do with specification, which in this case is in the unique location and timing of the event. On the other hand, if the soldiers knew that right behind that steep rocky slope which flanks the gorge there is a village whose inhabitants are Taliban sympathizers who boasted that they would harm American soldiers at every opportunity, and that there is an easy access path from that village to the crest of the ridge hovering over the gorge, this knowledge would not affect the probability of a chance fall of rocks, but would greatly increase the estimated likelihood of design. Hence, the ratio of likelihoods of chance and design would drastically change in favor of design. This increase of the estimated likelihood of design would also have nothing to do with specification, which was in the unique location and timing of the event. Likewise the knowledge that the winner of an archery contest happened to be a novice who has never before succeeded in winning the competition would greatly decrease the likelihood of design and shift the ratio of likelihoods of chance and design in favor of chance.
On the other hand, if the winner was a world champion with a record of hitting the target 99% of the time, this would greatly increase the likelihood that his success was due to design as compared with the likelihood of chance, and this conclusion would have nothing to do with specification (which in this case, according to Dembski, is in the unique location and small size of the target).
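One way to make this comparison of chance and design concrete is the odds form of Bayes' rule. This framing is an assumption of the sketch, not something taken from DNFL, and every number below is hypothetical and chosen only to show the direction of the effect. Knowledge about the embedded stones lowers the probability of the event under chance, while knowledge about the village raises the prior odds of design; neither change involves specification.

```python
def posterior_odds(prior_odds: float, p_event_given_design: float,
                   p_event_given_chance: float) -> float:
    """Odds of design versus chance after the event (Bayes' rule, odds form)."""
    if p_event_given_chance <= 0.0:
        raise ValueError("p_event_given_chance must be positive")
    return prior_odds * (p_event_given_design / p_event_given_chance)


# All values are hypothetical illustrations, not estimates of anything real.
baseline = posterior_odds(prior_odds=0.01,
                          p_event_given_design=0.5,
                          p_event_given_chance=1e-4)

# Stones known to be firmly embedded: a chance fall becomes less probable.
embedded_stones = posterior_odds(prior_odds=0.01,
                                 p_event_given_design=0.5,
                                 p_event_given_chance=1e-6)

# Hostile village behind the ridge: design becomes more plausible a priori,
# while the probability of a chance fall is unchanged.
hostile_village = posterior_odds(prior_odds=1.0,
                                 p_event_given_design=0.5,
                                 p_event_given_chance=1e-4)

print(f"baseline odds of design:      {baseline:.0f}")
print(f"with embedded stones:         {embedded_stones:.0f}")
print(f"with hostile village nearby:  {hostile_village:.0f}")
```

Both pieces of background knowledge move the odds toward design by the same kind of arithmetic as any other factor that changes one of the three inputs.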
At this point it seems reasonable to go back to the first node of Dembski’s filter and see how the knowledge of the village behind the ridge or of the archer’s record affects the procedure.
In the case of the village inhabited by Taliban sympathizers, the answer is obvious. The knowledge of the existence of the village in question does not imply a regularity is at work. Indeed, the villagers do not regularly climb up the slope and drop rocks into the gorge. The event therefore reaches the third node where the likelihood of design exceeds that of chance because of the knowledge about the Taliban village.
In the case of an archer the situation is different. The knowledge of the champion archer’s record points to regularity, since the champion archer regularly succeeds in hitting the target. Actually, though, this has little to do with Dembski’s scheme. In his scheme, we first estimate the probability of an event and if it is found to be large we attribute it to regularity. In fact, the procedure is opposite to Dembski’s scheme. We know that the archer in question regularly hits the target and therefore we estimate the probability of his success as large. In other words, in that case the event is attributed to regularity even before it enters the filter, so that design is inferred without using the filter at all. In this case specification (i.e., the unique location of the recognizable target) plays no role whatsoever in the inference to design.
The role of specification is not that of an independent factor besides the small probability as Dembski’s alleged "crucial insight" asserts. Specification, regardless of whether or not it meets Dembski’s criteria, always diminishes the probability estimated on a chance hypothesis, but in that it is not any different from many other factors affecting the estimate of probability.
Design inference may be very plausible, but still remains probabilistic and its plausibility is due only to the low estimated likelihood of chance being the causal antecedent of an event. As I argued in my previous review of Dembski’s work, he correctly states that a small probability in itself is not a sufficient reason to infer design. However, adding specification does not remedy the situation because the latter adds nothing qualitatively different from other factors affecting the estimate of probability. Therefore design inference is necessarily probabilistic, with or without specification, although it may be in certain cases extremely plausible.
6. Is complexity equivalent to low probability?
I would like to review one more
example which not only illustrates the deficiencies of Dembski's EF scheme, but,
moreover, shows the fallacy of one of his most fundamental assumptions – that
complexity of an event translates into its low probability.
I am indebted to Jeffrey Shallit
for pointing (in a private communication) to a website where references are
given to books [11, 12, 13]. In these books a rarely observed phenomenon is
described: sometimes freezing water forms unusual flat triangular crystals of
snow. This phenomenon is observed quite rarely and the mechanism for the
formation of such crystals is unknown.
Of course, even though the
detailed mechanism is unknown, it is obvious that the formation of the mentioned
crystals of an unusually simple shape is predetermined by certain weather
conditions under which the thermodynamic potential of water/snow that is
appropriate for these conditions has a minimum, i.e., it can be said that the
formation of such crystals is due to a law of physics.
Dembski maintains that since the
formation of crystals is predetermined by law, his EF will not mistakenly
attribute the formation of any crystal to design. For example, on page 12 in
[1] Dembski wrote: "Another concern is that filter will assign to design
regular geometric objects like the star-shaped ice crystals that form on a cold
window. This criticism fails because such shapes form as a matter of physical
necessity simply in virtue of the properties of water (the filter therefore
assigns crystals to necessity and not to design)."
Let us notice, first, that the
above statement actually contradicts Dembski's scheme, according to which the
attribution of an event to law (necessity) occurs in the first node of his
filter if the probability of that event is found to be high. Indeed, when
stating that his filter reliably attributes the formation of crystals to law, he
actually bases his conclusion on his prior knowledge of the existence of a law
rather than on the estimation of the event's probability which his scheme
prescribes. He does not seem to notice that the procedure he himself employs is
opposite to that implied in his EF.
Turning again to the flat
triangular crystals, note that although their formation is indeed predetermined
by the laws of physics, it depends on the occurrence of certain weather
conditions. Such conditions occur very rarely. We do not possess the knowledge
which would enable us to predict when and why such conditions may occur.
The occurrence of the weather conditions necessary for the creation of flat
triangular crystals is a typical chance event. Therefore, contrary to
Dembski's scheme, the occurrence of flat triangular crystals is also a chance
event.
Look now at another aspect of the
described situation, pointed out by Shallit. Since the occurrence of flat
triangular crystals is a very rare event, its probability is very small.
On the other hand, since these crystals have a precisely defined shape, the
event is specified. The combination of low probability with specification,
according to Dembski's uncompromising theory, is a reliable marker of
design. In fact, though, the occurrence of these crystals is due not to
design but to a combination of chance and law (such combination is not at all
envisioned in Dembski's theory where only individual actions of either law,
chance, or design are recognized). As Shallit has correctly pointed out in his
private communication, this is an example of a false positive which, as Dembski
vigorously asserts, his filter never produces.
Now let me point out the most
serious refutation of Dembski's thesis illustrated by the above example.
According to Dembski's theory, complexity is just another face of low
probability. Statements to this effect are scattered all over his books and
papers, including DNFL. However, in the above example the least probable
form of a crystal – the flat triangular one – is also the simplest of all
observed shapes of such crystals. This example illustrates the fallacy of
Dembski's position, according to which complexity necessarily translates into
low probability. A simpler event may very well turn out to be less
probable than a more complex one, which makes Dembski's theory unsubstantiated.
7. Dembski suggests a fourth law of thermodynamics
In his very popular book [14] first published some fifty years ago, the
well-known writer Martin Gardner offered five features typical of the literary
production of what he called "cranks." One such feature is "a tendency to
write in a complex jargon, in many cases making use of terms and phrases he
himself has coined." Gardner wrote further that a crank does not have to
be a dunderhead. In some cases an obvious crank may nevertheless be quite
"capable of developing incredibly complex theories. He will be able to defend
them in books of vast erudition, with profound observations, and often liberal
portions of sound science. His rhetoric may be enormously persuasive. All the
parts of his world usually fit together beautifully, like a jigsaw puzzle. It
is impossible to get the best of him in any type of argument. He has anticipated
all your objections. He counters them with unexpected answers of great
ingenuity."
Complex jargon with many self-coined terms is found in abundance in
Dembski's publications, including his newest book [1]. As to his ability to
develop complex theories which are sprinkled here and there with portions of
sound science, and which display erudition, Dembski seems to possess such
abilities as well.
A crank's use of complex jargon
replete with self-coined terms often finds its most salient expression in
suggesting allegedly fundamental laws hitherto unknown in science, and Dembski
has an obvious propensity to do so. Perhaps the most vivid example of Dembski's
extraordinary claims is his announcement of a discovery of an additional law of
thermodynamics. On page 169 Dembski writes: "The traditional three laws of
thermodynamics are each proscriptive generalizations, that is they each make an
assertion about what cannot happen to a physical system." Leaving aside
the gist of that statement (which certainly can be disputed) we cannot fail to
notice that Dembski seems to have
forgotten certain simple facts from the introductory course of thermodynamics:
there are not three but four "traditional" laws of thermodynamics.
By a peculiar historical twist they were named the zeroth, the first, the
second, and the third laws, so, although there are four of them, none is named
the Fourth Law. Of course, this flop on Dembski's part does not seem to be
very important, but it makes a reader pause and consider whether or not
Dembski's statements, at least in that part where the new law of thermodynamics
is suggested, should be taken with caution.
It is a very rare situation when a scientist is lucky enough to discover a new
law which is then accepted by the scientific community and becomes a part of the
arsenal of science. It seems to be a more common situation when a new law
is suggested but dies out after the scientific community reviews it and finds it
unsubstantiated. It is a much more common situation when some allegedly
important law is claimed within the framework of pseudoscience. That is
why the claim of a discovery of a new law of science more often than not evokes
skepticism and suspicion that this is just a case of pseudoscience. I believe
the alleged Fourth Law of thermodynamics claimed by Dembski, as well as his
underlying Law of Conservation of Information (LCI), are examples of the latter
situation. In this article I intend to substantiate that conclusion.
The laws of science differ in importance and in the extent of the generalization
of the observed phenomena. For example, scientist S claims that in the course of
her research she found that at a pressure of x Pascal, certain metal A
melts at the temperature of T Kelvin. Other researchers try to reproduce
her results and obtain data which, within the margin of a reasonably small
error, confirm the claim of S. Then the law establishing that the melting point
of metal A at x Pascal is about T Kelvin is postulated and is
accepted by the scientific community as a reasonable approximation of reality.
This is a legitimate law of science, which, however, will hardly earn S a Nobel
prize. There are, though, other types of laws, those which constitute very
far-reaching generalizations of a wide variety of phenomena. The four laws of
thermodynamics are of the latter type. The four laws of thermodynamics are among
the most general statements about nature known in science. If a scientist
managed to indeed introduce a Fourth Law of thermodynamics, thus making a total
of five laws in that science, this would constitute a great achievement.
Why are there four laws of thermodynamics but not three or, say, two? The
reason that the four laws of thermodynamics cannot be reduced to three, or two,
or one is that these laws are not derivable from each other. The second
law is not a consequence of the zeroth law or of the first law, and the first
law does not entail the second or the third law, etc. If a new law of
thermodynamics is to be discovered, it necessarily must be independent of the
four accepted laws. If a newly suggested law simply reformulates a concept
which has already been covered by one of the four existing laws, then it is not
a new law of thermodynamics.
Of course, another requirement for a supposedly new law of thermodynamics is
that it must not contradict any of the four laws of thermodynamics already
accepted in that science.
I intend to show that the Fourth Law of thermodynamics suggested by Dembski
fails on both accounts. First, it covers phenomena which have already been
covered by the second law of thermodynamics and therefore, even if it were
correct in itself, it would not constitute a new law but would at best be just
another way to state essentially the same postulate already adopted in science. However, the situation with Dembski's allegedly possible new
law of thermodynamics is worse because it actually contradicts the Second Law of
thermodynamics.
Let me start with the first point. The Fourth Law of thermodynamics suggested by
Dembski is a generalization of what he calls the Law of Conservation of
Information (LCI for short).
In my previous detailed discussion [2] of Dembski's earlier publications I
pointed to the flaws which, in my view, are present in Dembski's discourse
related to information. I will repeat here briefly some of the points of that
critique.
Dembski's treatment of information was also subjected to critique by several
other authors, including Victor Stenger [15], Matt Young [16] and others.
On page 140 of [1], Dembski offers the following definition of information
I associated with an individual event A:
$I(A) = -\log_{2} P(A)$    (1)
where P(A) is the probability of event A.
Formula (1)
as such was not given in Shannon's classical work [17] which was the real
foundation of information theory. However, this formula can be formally derived
from Shannon's formula which defines information as a change of entropy if we
apply the latter to an individual event. Formula (1) is not peculiar to
Dembski's discourse and can be found in textbooks and even in encyclopedias. For
example, on page 55 in the textbook [18] we find exactly the same formula (1)
for information. Likewise, an article on Information Theory by Professor George
R. Cooper in Van Nostrand Scientific Encyclopedia (1976 edition) also contains the same expression as a definition of information of an individual
event. Information theory has undergone substantial development after Shannon's
classical contribution. While formula (1) does not plainly contradict Shannon's
fundamental concepts, in the modern information theory it is viewed as
simplistic, while information is defined in various more sophisticated
ways. The quantity expressed by formula (1) is sometimes referred to as
"self-information" (see, for example, Entropy and Information Theory).
Since Dembski has been acclaimed as not just an information theorist but an
"Isaac Newton of information theory," his treatment of information should be
expected to be on a level well above a simplistic, amateurish approach.
For the sake of discussion, let us accept Dembski's formula (1) as a working
definition of information which can be adequate if we do not require that discussion to be at the frontier of the modern
development of information theory.
A completely different question
is, though, whether or not Dembski uses formula (1) properly, and I agree with
critics (see, for example, [15, 16]) that his treatment of information, including
his use of formula (1), has a number of faults. An example will be discussed a few lines down (the example with the
word METHINKS).
In his previous book The Design Inference [3] Dembski did not discuss his
theory in detail from the viewpoint of information. In his other book
Intelligent Design [19], which was of a more popular type, there is a
chapter on information (wherein he first suggested his Law of Conservation of
Information). In his latest book [1], Dembski offers a rather detailed
discussion of information and, unlike in the earlier presentation of his views,
discusses the concept of Shannon's entropy. The mathematical expression
used by Dembski for entropy (page 131) is
$H(a_{1}, \ldots, a_{n}) =_{\mathrm{def}} -\sum_{i} p_{i} \log_{2} p_{i}$    (2)
Various versions of essentially the same expression, all stemming from Shannon's
original work [17] are commonly used and meet no objection. Dembski defines
entropy as "the average information per character in a string." Again, I
have no objection to that definition which is in agreement with Shannon's
definition. I believe, though, this definition dooms to failure Dembski's
attempt to introduce a Fourth Law of Thermodynamics based on his LCI. Like
information I defined by formula (1), entropy in information theory is
also measured in bits (or, more often, in bits per symbol, when specific entropy
is used). If natural logarithms are used instead of logarithms to the base 2,
the units for entropy and information are called nats.
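As a minimal sketch of how formulas (1) and (2) relate, the following Python snippet computes the information of individual events and their probability-weighted average, once in bits and once in nats. The function names and the three-symbol distribution are assumptions made purely for illustration; they are not taken from Dembski's book.

```python
import math

def self_information(p, base=2.0):
    """Formula (1): I = -log(p). Base 2 gives bits, base e gives nats."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p) / math.log(base)

def entropy(probs, base=2.0):
    """Formula (2): H = -sum(p_i * log(p_i)), i.e. the average of formula (1)."""
    return sum(p * self_information(p, base) for p in probs if p > 0.0)

# Hypothetical three-symbol source, chosen only for illustration.
probs = [0.5, 0.25, 0.25]

h_bits = entropy(probs)           # entropy in bits per symbol
h_nats = entropy(probs, math.e)   # the same quantity in nats per symbol

print(h_bits)
print(h_nats / math.log(2))       # converting nats back to bits
```

The two printed values should agree, since the choice between bits and nats is only a change of units.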
In many books on information
theory one can find statements according to which entropy is one of the measures
of information and Dembski's quotation of equation (2) is in agreement with that
approach. Sometimes these two terms – entropy and
information – are used interchangeably (Shannon himself was not very strict
in his usage of these terms). Some
writers prefer to use for H the term uncertainty instead of entropy (or
Shannon's uncertainty) and the term surprisal instead of information for the
quantity I. Regardless of the preferred usage of terms and whatever
the nuances in the interpretation of entropy are, the essence of that concept is
the same.
Before discussing
entropy, let us look at some of Dembski's examples wherein he estimates the
information carried by a certain string of characters. On page 166, Dembski estimates the complexity of the
word METHINKS. The formula used by Dembski for what he in this case calls
complexity is exactly the same as he previously introduced for
information, namely –log_{2} P (formula 1).
Note that the term
complexity is used by Dembski in this example in a different sense
than his own definition of complexity found in his previous book [3]. In that
book, Dembski defined complexity as "the best available estimate" of how
difficult it is to solve a problem at hand. Now, discussing the
complexity of the word METHINKS, he uses for complexity formula (1),
which he introduced a few pages earlier for information I and whose connection to the difficulty of solving a problem seems rather far-fetched.
Here is how Dembski calculates
the complexity of the word METHINKS. Since this word has 8 characters drawn
from the English alphabet which comprises 26 letters plus a character for space,
a total of 27 characters, Dembski estimates the probability P of that word's
occurrence as (1/27)^{8}. The negative logarithm to the base 2 of this number is about
38, so Dembski concludes that the complexity of that word is 38 bits.
(Actually the expression used by Dembski is that "the complexity of the word
METHINKS is bounded by log_{2}1/27^{8}"; italics are mine).
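The arithmetic behind the 38-bit figure can be checked with a short Python sketch. It assumes, as the estimate does, that each of the 8 characters is drawn independently and uniformly from 27 symbols; the variable names are illustrative only.

```python
import math

ALPHABET_SIZE = 27   # 26 letters plus the space character
WORD = "METHINKS"

# Probability of the word when each character is drawn independently
# and uniformly, with replacement, from the 27-symbol alphabet.
p_word = (1 / ALPHABET_SIZE) ** len(WORD)

# Formula (1): information in bits.
bits = -math.log2(p_word)

print(f"P = {p_word:.3e}, -log2(P) = {bits:.2f} bits")
```

The result is slightly above 38 bits, which is the number quoted as the "bound."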
Hence in his calculation Dembski
treats what he calls complexity by applying a formula introduced a few
pages earlier for information. His estimate assumes the
uniform distribution of 27 characters in what he calls a "reference class of possibilities" so that each
of the 8 characters in that word is assumed to have the same probability of
appearing in the string. Such an assumption is justified only for a string
of 8 characters randomly drawn from a stock which has an unlimited supply of all
27 possible characters. (The randomized texts obtained in such a way are
sometimes referred to as "monkey" texts, because of the famous example of
monkeys randomly hitting the keys on a typewriter). In the case of an urn
technique, formula (1) is applicable if each letter, after having been drawn
from the urn, is returned to the urn.
However, if the word METHINKS is a
part of a message received through a communication channel, then it has to be
expected to be a part of a natural language's vocabulary. On page 164, i.e.,
just two pages before calculating the complexity of the word METHINKS,
Dembski wrote about transmission of information "from one link to
another," about "the textual transmission of ancient manuscripts," "transmission
of texts" (page 165) and the like. He never indicates that on page 166 he
discusses the occurrence of the word METHINKS in a different way, as a result of
a random selection of letters from an unlimited stock of all 27 symbols, except
by using the term "bounded."
The actual distribution of
characters in English texts is not uniform. (This is always true for meaningful texts,
and often also for gibberish. For example, if the letters in a
meaningful text are randomly permuted, the permuted text most often will become
meaningless, but the probability distribution of letter
frequencies in the permuted text will remain the same as it was in the original
meaningful text. The behavior of such randomized texts as compared with
meaningful texts, was analyzed in detail in a posting [28].)
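The point about permuted texts is easy to verify. The sketch below, which uses the Hamlet phrase discussed later in this article as sample input and an arbitrary fixed seed, shuffles the characters of a string and confirms that the character counts are unchanged.

```python
import random
from collections import Counter

text = "METHINKS IT IS LIKE A WEASEL"

# Randomly permute the characters of the meaningful string.
rng = random.Random(0)   # fixed seed (arbitrary) so the run is repeatable
chars = list(text)
rng.shuffle(chars)
permuted = "".join(chars)

# The character counts, and hence the frequency distribution, are unchanged.
same_distribution = Counter(text) == Counter(permuted)

print(permuted)
print(same_distribution)
```

Whatever order the shuffle produces, the comparison holds, because a permutation only rearranges the symbols and never changes how often each one occurs.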
If the letters of the word
METHINKS occur within a message in a natural language, the letter E has the maximum probability
(about 12%) of appearing in any location of the word, the letter T has a
slightly smaller probability, etc. In this case the use of formula (1), as done by Dembski while claiming to calculate complexity, is wrong. Formula (1), which is legitimate for the "self-information" associated with an
individual event, can be formally used for a series of events only if their
probability distribution is uniform. In the latter case, however, information
associated with each individual event (defined by formula 1) would coincide
with entropy defined by formula (2), which equals the average
information. Then what Dembski actually calculated under the label of
complexity, turns out to formally be the entropy of the word in question
assuming a uniform distribution of letters. This mess of terms is rather
typical of Dembski's discourse. Since Dembski makes no statement to
the contrary, the actual distribution of characters has to be assumed to be
nonuniform, so that a proper calculation of entropy should be done in
this case using formula (2).
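A rough Python sketch illustrates both statements: for a uniform distribution the entropy of formula (2) coincides with the per-symbol information of formula (1), and for a nonuniform distribution it is smaller. The skewed distribution below is a hypothetical one (a single symbol at 12%, the rest spread evenly); it is not a table of real English letter frequencies.

```python
import math

def entropy_bits(probs):
    """Formula (2) in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

n = 27  # 26 letters plus space

# Uniform case: every symbol has probability 1/27.
uniform = [1 / n] * n

# Hypothetical nonuniform case: one symbol at 12%, the remaining
# 88% spread evenly over the other 26 symbols (illustration only).
skewed = [0.12] + [0.88 / (n - 1)] * (n - 1)

h_uniform = entropy_bits(uniform)
h_skewed = entropy_bits(skewed)
info_per_symbol = -math.log2(1 / n)   # formula (1) for one uniform symbol

print(h_uniform, info_per_symbol)
print(h_skewed)
```

Any departure from uniformity lowers the entropy, so a per-character figure computed under the uniform assumption is an upper limit rather than an estimate for real text.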
However, even if Dembski used formula (2), thus accounting for the
nonuniform distribution of symbols, it still would not be sufficient for a
correct estimate of the word's entropy (which he calls in this case
complexity). Not only is the probability distribution nonuniform, the
word in question is also a part of a meaningful English vocabulary, so the
probability distributions of digrams, trigrams, etc., are also nonuniform.
Furthermore, the natural languages possess redundancy which substantially
decreases entropy of meaningful texts. As was already shown by Shannon, the
first order entropy of a meaningful English text is about one
bit/character, hence the first order entropy of the word METHINKS is
about eight bits, rather than the thirty-eight bits of Dembski's estimate of the complexity's bound (not to mention that Dembski's estimate ignores the entropies of orders higher than 1, the existence of multiple symbols other than just the 26 letters of the alphabet, etc.).
Now consider a different situation in which the word METHINKS can occur either as a part of a message arriving through a communication channel or through the urn technique wherein, though, the elements of the phase space are not individual letters but whole words. For example, imagine that the urn holds every one of the entries from the unabridged Webster's dictionary. It means the urn holds about 315,000 words, each as an indivisible unit. Since each word occurs only once, every word has the same probability of being randomly pulled out of the urn. The probability that the word METHINKS happens to be the one randomly chosen is then P=1/(3.15×10^{5}). Then, using formula (1) suggested by Dembski for information, we find that the information (which he also calls complexity) obtained when the word in question has been pulled out is –log_{2}P = 18.3 bits instead of the 38 bits of Dembski's estimate. This number, though, obviously has nothing to do with the complexity associated with the word in question.
If the urn holds unequal numbers of words, say, the distribution of words in the urn is determined by the frequencies of their occurrence in English texts, the probability of the word METHINKS will be different from that estimated for the uniform distribution, and the information calculated by formula (1) will not only be different from either 18.3 bits, 38 bits, or 8 bits, but will also have nothing to do with complexity.
It is possible to define a procedure wherein a random occurrence of the word METHINKS results in information far exceeding the 38 bits of Dembski's estimate and also has nothing to do with complexity. For example, assume a word is randomly chosen out of all the words found in all the books in the Library of Congress. What is, in this case, the probability that the randomly chosen word turns out to be METHINKS? If the total number of words in all the books in the library is N, and the word METHINKS occurs among them X times, the probability in question is P=X/N. Obviously, N is a very large number while X is a relatively small one since the word METHINKS is a rare one. In such a procedure the information, if defined by formula (1), will be much larger than the 38 bits of Dembski's suggested "bound."
This shows that the estimate of information associated with a certain word may be very different depending on the probability distribution, so defining the information bound requires first defining which probability distribution is considered. Dembski's calculation of the "information bound" is valid only for a specific (uniform) probability distribution in a specific procedure (random choice of individual letters). Moreover, information associated with a certain word, if defined by formula (1), cannot be simply translated into complexity. In his discourse, Dembski does not clearly define the discussed situation, uses different terms in a haphazard way and thus creates a mess of concepts and definitions.
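The dependence on the chosen procedure can be made concrete with a small Python sketch that applies formula (1) to the same word under different sampling schemes. The first two cases use the figures quoted above; the counts x and n in the third case are arbitrary placeholders, not estimates for any real corpus.

```python
import math

def information_bits(p):
    """Formula (1): -log2 of the event's probability."""
    return -math.log2(p)

# Procedure 1: eight letters drawn uniformly from 27 symbols.
bits_letters = information_bits((1 / 27) ** 8)

# Procedure 2: one word drawn uniformly from an urn of about 315,000 words
# (the dictionary size quoted in the text).
bits_word_urn = information_bits(1 / 315_000)

# Procedure 3: a word drawn from a collection where it occurs x times
# among n words. Both counts are hypothetical placeholders.
x, n = 5, 10_000_000
bits_corpus = information_bits(x / n)

print(f"letters: {bits_letters:.1f} bits")
print(f"word urn: {bits_word_urn:.1f} bits")
print(f"corpus (placeholder counts): {bits_corpus:.1f} bits")
```

Changing x and n moves the third figure up or down at will, which is the point: the number of bits is a property of the assumed probability distribution, not of the word itself.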
Now back to the discussion of the meaning of entropy.
The main founder of information theory Claude
Shannon introduced the term entropy for the quantity H (formula 2) because this
quantity seemed to behave like its namesake in thermodynamics. After
Shannon's classical paper [17] appeared in 1948, the legitimacy of his term and
its relation to the thermodynamic entropy were discussed at length and
ultimately a consensus was reached accepting the term as appropriate, and
informational entropy was accepted as not just a namesake of the thermodynamic
entropy but as essentially the same quantity albeit traditionally measured in
different units. (Thermodynamic entropy is measured in Joule/Kelvin, whereas
informational entropy in bits or bits/symbol, or also in
nats or nats/symbol. In theoretical physics entropy is often
considered a dimensionless quantity [20].)
In order to clarify that thermodynamic and informational entropies are
essentially the same, let us review the concept of entropy as it is interpreted
in thermodynamics.
Every professor of physics who has ever taught thermodynamics knows that
students usually rather easily accept the concept of energy but often have a
hard time comprehending the concept of entropy. In fact, however, the
concept of energy is one of the most mysterious concepts in science whereas
entropy is a quite simple one. Perhaps the different attitude to these two
concepts on students' parts is due to the fact that the term energy is commonly
used in the nonscientific speech and students are simply accustomed to it while
entropy is a purely scientific term with no use in the everyday vernacular. I
believe that nobody really understands what energy is. This concept has no
definition. The most fundamental law of physics, the law of energy conservation
in its general form, is a rather mysterious postulate which says that there is
some quantity we call energy, whose total amount in the universe is constant,
although we don't know what this amount is and it can very well equal zero.
There are many aspects of that law which are obscure (whose discussion is beyond
the scope of this paper). Nevertheless, it seems to be accepted by almost
everybody without much pain.
The First Law of thermodynamics is
the particular form of the law of energy conservation for macroscopic
systems.
The Second Law of thermodynamics
deals with entropy. Although entropy has a rather simple and transparent
definition, this concept is often absorbed by students only with considerable
difficulty.
I see no need to delve here into
the detailed history of the development of the entropy concept. The real meaning
of that quantity was not immediately properly interpreted when it was first
introduced in thermodynamics by Clausius. It took a considerable effort by
a number of outstanding scientists in the last quarter of the 19th
century to clarify that concept, which was successfully done within the
framework of statistical physics. When this was done (and one of the crucial
insights was provided by Ludwig Boltzmann) it transpired that entropy can in
fact be interpreted as a simple concept of an extremely versatile character.
Without discussing many nuances of
that concept, its most concise and universal definition is as follows: entropy
is a measure of the degree of randomness (disorder) in a system which comprises
many constituent elements. The more disordered is the conglomerate of
whatever elements the system encompasses, the larger is its entropy.
It is easy to see the extreme
versatility of that concept. It does not matter what the physical nature
of the system's constituent elements is. Initially introduced in thermodynamics,
entropy was meant to characterize thermodynamic systems. For example, it can
characterize the degree of disorder in a gas occupying a certain volume.
The gas consists of a large number of identical molecules and its entropy is a
quantity which is a measure of disorder (randomness) in the distribution of
those molecules over the volume. However, entropy is not a property of
those molecules. Their physical nature has nothing to do with the degree of
randomness characterizing their distribution in space. The choice of units for
entropy is not restricted by the properties of that quantity but can be
different depending on convenience. Since entropy is not a physical
property of a body or of any other conglomerate of constituent elements, it has
no "natural" units which therefore can be assigned at will. Historically,
the choice of Joule/Kelvin (or, say, cal/°C) as units for
thermodynamic entropy was made not because these units are somehow intrinsic for
entropy but because the initial definition of entropy S by Clausius was in the
form of dS=dQ/T where Q stands for heat and T for temperature. As it was
realized later, Clausius's entropy is just a particular choice for that function
out of an infinite number of possible functions all of which could serve as
entropy as well. The sole requirement for a function to be capable of
serving as thermodynamic entropy is that it stays constant in a reversible
adiabatic process. When Boltzmann discovered the statistical meaning of
entropy, thus tying it to probabilities, he wished to preserve the quantitative
values of entropy matching Clausius's entropy, so he introduced a coefficient
(named the Boltzmann constant) which has a certain value expressed in
Joule/Kelvin (or cal/°C), thus converting the dimensionless logarithm
of (thermodynamic) probability into Clausius's entropy. However, the
multiplication of the logarithm of probability by the Boltzmann constant, while
being a clever and very useful device, was in fact an arbitrary choice as far as
the meaning of entropy itself is concerned. Entropy can just as well be used as
a dimensionless measure of randomness (and indeed is often used in such a way in
theoretical physics [20]) or measured in any arbitrary units if this facilitates
its use. The meaning of entropy does not depend on the units chosen for it, as
the essence of that quantity is in no way related to the physical properties of
a system. It can be equally applied to estimate the degree of disorder in
a gas occupying a certain volume, to a long string of characters, to the DNA
strand, to a large gathering of people, and to an infinite number of other
systems each comprising many constituent elements. Regardless of what
those constituent elements are, the behavior of entropy is determined by the
same laws, of which the 2nd law of thermodynamics is the most widely
known.
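The arbitrariness of the units can be shown in a few lines of Python. The sketch assumes the simplest case of W equally likely states, for which Boltzmann's relation reads S = k ln W, and expresses the same quantity in nats, in bits and in Joule/Kelvin. The function name and the sample value of W are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact value in the 2019 SI)

def entropy_units(num_states):
    """Entropy of a system with num_states equally likely states."""
    nats = math.log(num_states)      # dimensionless, natural-log form
    bits = nats / math.log(2)        # the same quantity in bits
    joule_per_kelvin = K_B * nats    # Boltzmann's S = k ln W
    return nats, bits, joule_per_kelvin

# Hypothetical example: 1024 equally likely configurations.
nats, bits, s = entropy_units(1024)

print(nats, bits, s)
```

The three numbers differ only by constant factors, so none of them carries more physical meaning than the others.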
As quoted above, Dembski
acknowledges that entropy is just the average information. Therefore
whatever new law of thermodynamics he may suggest, as long as it deals with
information, it necessarily deals with entropy. If that is the case, the new law
not only must not contradict the 2nd Law but may become a legitimate
new law only if it adds something not already covered by the 2nd Law.
Given the universal character of the 2nd Law of thermodynamics, whose
validity extends well beyond its original home – thermodynamics – it is very
hard to discover a new law relating to entropy which would add anything
not covered by the 2nd law of thermodynamics. Regardless of
whether Dembski's Fourth Law is correct or not, it deals with information
and hence it deals with entropy whose fundamental behavior is covered by the
2nd law. Even if the new law about information (i.e., about entropy)
turns out to be correct, the chance that it will shed light on some hitherto
unknown features of entropy's behavior is very slim. Most probably, the
supposed Fourth Law, even if true, can only be a consequence of the
2nd law or some corollary to it.
As I will argue now, however, the possible Fourth Law of thermodynamics suggested by Dembski is in fact wrong because it is
based on what he calls the Law of Conservation of Information. In my view,
LCI is neither a law of conservation nor a law about information.
Dembski's LCI is an unsubstantiated and contradictory statement which belongs in
pseudoscience.
Let us first discuss whether LCI
can indeed be named a conservation law. This seems to be a secondary point, but
I will discuss it because Dembski obviously attaches a considerable significance
to the name of his supposed new law, devoting many words to the justification of
the name he gave to it.
It seems a platitude to say that a
conservation law must necessarily be about something that is
conserved. Conservation laws are important in physics. The
most fundamental law of physics is the law of energy conservation. This
law is an example of that rare type of conservation laws which are
unconditional. The total amount of energy in the universe, according to that
law, is constant, so it is conserved regardless of any changes of anything else
in any processes. It encompasses energy in all of its multiple forms,
including the energy of the rest mass. A much more common type of a
conservation law is that which is conditional. For example, the law of momentum
conservation in Newtonian mechanics asserts that the momentum of a system is
conserved if no external forces act upon that system. The quantity which
is conserved, according to that law, is momentum, which is quite rigorously
defined within the framework of Newtonian mechanics. This conservation law
is conditional because it asserts that the momentum of a system is conserved
only under the condition that no external forces act upon the system.
However, this law, first, quite clearly states that something is indeed
conserved, second, clearly defines what is conserved and, third, clearly defines
under which conditions it is conserved.
It can also be said that each
conservation law is about the conditions ensuring that a certain quantity is an
invariant of a certain process. Momentum of a macroscopic mechanical
system is an invariant of such process in which no external forces act upon the
system in question.
There also are many laws in
science which are not conservation laws. In particular, out of the four laws of
thermodynamics only the First Law is a conservation law. It is a particular form
of the energy conservation law applicable to macroscopic systems. The
three other laws of thermodynamics are not conservation laws. This
includes the 2nd Law. The 2nd Law of thermodynamics
has a number of formulations, but essentially it deals with entropy. This law does
not state that entropy is conserved (although it is conserved in the particular
case of the so-called reversible adiabatic processes; the concept of a
reversible process is an abstraction since no real process is
reversible). The 2nd Law of thermodynamics states that the
entropy of a closed ("isolated") system cannot decrease. In any spontaneous
process the entropy of a closed system can either increase or remain
constant. Therefore, the 2nd law of thermodynamics is not
considered and is not named a conservation law.
Dembski's alleged LCI – Law of
Conservation of Information, is not about something that is conserved. It
states that a quantity he calls Complex Specified Information (CSI) can either
decrease or remain constant in what he calls "a closed system
of natural causes." Dembski provides no definition of the "closed system
of natural causes." I discussed this point in my previous review [2] of
Dembski's work. However, at this time I am only discussing whether or not his
LCI can indeed be interpreted as a conservation law, regardless of whether LCI
is correct or not. In [1] Dembski spends many words to justify the inclusion of
the word conservation into his LCI. All these words are pure
casuistry because they cannot change the simple fact – his LCI is not about
conservation of anything. CSI, according to LCI, can decrease, hence there
is no reason to name LCI a law of conservation which it obviously is not.
Now let us discuss whether or not
LCI is about information.
In several of his earlier
publications Dembski discussed an example (based on the famous example
originally suggested by Richard Dawkins) wherein he pointed out the difference
between two strings of letters of equal length. One of these strings
spells a phrase from Hamlet, "METHINKS IT IS LIKE A WEASEL," and the
other is a string of gibberish of the same length (28 characters if space is
counted among the characters).
Note that the 1st order
entropy of the above meaningful quotation from Hamlet, according to
classical information theory, is about 28 bits, whereas the 1st order
entropy of a random string of 28 characters taken from the English alphabet
which comprises 26 characters plus one more for space, is almost five times
larger, i.e., about 135 bits. If, though, we include in the set of available
symbols also numerals, commas, colons, periods, semicolons, exclamation and
question marks, mathematical symbols etc., the entropy of string will be larger.
Moreover, if we wish to account for the total entropy rather than for only the
1st order entropy (which will be discussed later in this article) the
numbers will be larger by about 30 to 40%. Note also that the entropy of a
random string and, hence, also the amount of information brought by that string
to a receiver, is larger than it is for a meaningful string of the same
length.
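The comparison of the two strings can be reproduced with a brief Python sketch. It assumes a uniform 27-symbol alphabet for the random string and takes the figure of about one bit per character for meaningful English text as given in this article. The computed value for the random string is roughly 133 bits, in line with the rounded figure of about 135 bits quoted above.

```python
import math

phrase = "METHINKS IT IS LIKE A WEASEL"
length = len(phrase)                 # 28 characters, spaces included

# Random string: each character uniform over 27 symbols.
random_bits = length * math.log2(27)

# Meaningful English text: assumes the article's figure of about
# one bit per character.
BITS_PER_CHAR_ENGLISH = 1.0
meaningful_bits = length * BITS_PER_CHAR_ENGLISH

ratio = random_bits / meaningful_bits

print(f"random: {random_bits:.1f} bits, meaningful: {meaningful_bits:.0f} bits")
print(f"ratio: {ratio:.2f}")
```

The ratio is simply log2(27), a little under five, which is where the "almost five times larger" comes from.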
According to Dembski, the
quotation from Hamlet must be attributed to design (of course, nobody
would argue against this obvious conclusion) whereas the string of gibberish of
the same length is attributed rather to chance. What is the reason,
according to Dembski's criteria of design, for the above conclusions? In
order to be attributed to design, asserts Dembski, the event (in this case the
occurrence of the string of characters) must have low probability (which in this
example is one in so many billions) and be specified (in this case to be a
recognizable meaningful English phrase). Low probability, according to
Dembski, is equivalent to large complexity (statements to this effect are found
in many places in Dembski's books and papers). Hence, in the example in
question, Dembski obviously finds CSI – his Complex Specified Information, in
the quotation from Hamlet but not in the string of gibberish of the same
length. What is actually the difference between the two strings? It is in
that one is a meaningful English phrase, i.e., it displays a recognizable pattern
to those who possess at least some minimal knowledge of English whereas the
other string is meaningless, i.e., displays no recognizable pattern (although,
unbeknownst to us, it may happen to be a meaningful text in some other language
we are not familiar with). If we rely on the above example, it seems that
what Dembski means by CSI is equivalent to the recognizable meaningfulness of
the string. If we accept this interpretation of CSI, obviously this
concept is not what is called information in information theory. The concept of
information in that theory has nothing to do with the semantic contents of a
message. The definition of information adopted by Dembski himself (formula
1 above) also has nothing to do with the meaningfulness of a string.
Hence, the interpretation of CSI, as it follows from Dembski's own example,
shows that CSI, despite its name, is not information even in Dembski's own
interpretation of the latter term. If we were to stick to Dembski's concept of
CSI as rendered in his previous publications, we would have to conclude that
CSI, despite the inclusion in it of the term information, is not information at
all, and therefore the LCI in its earlier form was about neither information
nor conservation.
In his new book [1], however,
Dembski discusses the CSI and LCI in terms sometimes rather different from those
seemingly following from the example of two strings of characters.
Actually there seem to be several
differing interpretations of CSI within the same new book by Dembski.
One of these interpretations is
based on Dembski's discrimination between what he calls conceptual and
physical information. On page 137 we find the following statement:
"In practice, there are two sources of information – intelligent agency and
physical processes." Immediately after that sentence Dembski tries to
protect his flanks by diluting the strict meaning of the above statement. He
writes: "This is not to say that these sources of information are mutually
exclusive – human beings, for instance, are both intelligent agents and physical
systems. Nor is this to say that these sources of information exhaust all
logically possible sources of information – it is conceivable that there could
be nonphysical random processes that generate information."
The last quoted sentence leaves a
reader confused about what Dembski's position actually is. The possibility of
the existence of what he calls nonphysical random sources of information seems
to negate his initial assertion about the two sources of information.
Dembski proceeds, though, to distinguish between what he calls conceptual and
physical information. On page 139 he offers definitions of these two kinds
of information:
Conceptual Information:
Intelligent agent S identifies a pattern and thereby conceptually reduces the
reference class of possibilities.
Physical Information:
Event E occurs and thereby reduces the reference class of
possibilities.
These definitions allow for
various interpretations, but it seems that conceptual information is used by
Dembski as a more general concept than just the semantic contents of a string of
characters. The meaning of the first definition seems to be that any
recognizable pattern, if identifiable by an intelligent agent, would point to
design. Obviously, though, if a string is a meaningful text in a language
familiar to the intelligent agent, it will, according to Dembski's definition,
carry conceptual information, although conceptual information is not limited to
strings of characters. Therefore, if we deal with texts, the term conceptual
information seems to coincide with the meaningfulness of that text. The term
text can have a very wide interpretation. A multitude of systems
can be encoded by a string of zeros and ones and hence be represented by a
text. If that is so, then conceptual information is not what is named
information in information theory.
Therefore Dembski's term
conceptual information, despite the inclusion of the word
information, is not information in the sense of information
theory.
Apparently being aware that his
term conceptual information is actually often synonymous with the
semantically meaningful contents of a message, Dembski tries to salvage that
term as allegedly denoting a different concept by devoting a separate section
(pages 145–146) to the discussion of what he calls semantic information. He
asserts in that section that semantic information is not a part of CSI.
However, the concepts of semantic information and conceptual information,
although not fully synonymous, partly overlap. [Note that Dembski's discussion
of semantic information seems to indicate that he is not familiar with the
recent developments in the algorithmic theory of probability, which have gone
beyond the boundaries of information theory and have shown some promise in
distinguishing between noise and useful information (see for example [21]).
Also, Dembski's assertion that what he calls semantic information does not
submit to "mathematical and logical analysis" (page 147) is incorrect. It would
be true if it related only to information theory. But information theory is just a
part of science. Contrary to Dembski's assertion, the semantic contents of texts
predetermine statistically discernible patterns. In particular, a method of
statistical analysis of texts named Letter Serial Correlation, which enables one
to distinguish between semantically meaningful texts and gibberish, was developed
by Brendan McKay and the author of this review [22].]
Dembski uses the term
conceptual information in his definition of Complex Specified Information
(CSI), also referred to as Specified Complexity (SC). This definition is given
on page 141:
Complex Specified
Information: The coincidence of conceptual and physical information
where the conceptual information is both identifiable independently of the
physical information and also complex.
This definition of CSI
comprises several points, each of which requires some deciphering. There is no
need, though, for such deciphering at this time, since now we are only
interested in an evaluation of whether CSI is defined in a way fitting its use
in Dembski's Law of Conservation of Information.
The point of interest in
this respect is that Dembski's concept of CSI as defined above incorporates, as
an inseparable part, conceptual information, which is not information in
the sense of information theory. Therefore CSI is not information either.
Hence, the acclaimed Law of
Conservation of Information suggested by Dembski is neither about conservation
nor about information.
With his typical inconsistency, in
other sections of his book Dembski offers a different interpretation of LCI, which
introduces a new element absent from his earlier discussions, although this new
interpretation of LCI has its roots in his previous publications where he
suggested the so-called Universal Probability Bound of 10^{-150}.
According to the new
interpretation of CSI given in [1], CSI is indeed information after all, but to
qualify as CSI the amount of information must be not less than 500 bits.
The new formulation of LCI given
in [1] seems to become a statement asserting that both stochastic processes and
algorithms are capable of generating specified information up to 500 bits but no
more than that. The threshold value of 500 bits is based on Dembski's "universal
probability bound" of 10^{-150}, whose negative base-2 logarithm is about 498.
To provide a feeling for what the
seemingly minuscule probability bound of 10^{-150} means, let us note
that a random string of characters drawn from the English alphabet (not
including numerals, punctuation marks and spaces) carries over 500 bits of
information when its length exceeds only about 105 letters. A semantically
meaningful English text carries over 500 bits if its length exceeds about 500
characters, which is about 1/6 of an average single-spaced typewritten
page. (Actually these estimates are good only for the first-order
entropy, which, however, constitutes the main portion of the total entropy of a
text. The total entropy includes L-1 terms, where L is the text's length
expressed in the number of characters. For example, the second-order
entropy can be expressed by the same formula as formula (2), where, though,
p_{i} denotes the probability of a digram, that is, of a combination of
any two characters, rather than of an individual character, and where the sum
also has to be divided by 2. If the total entropy is considered, the length of a
randomized string of characters drawn from the English alphabet and carrying 500
bits is only about 60 letters. For a meaningful English text that number
is close to 300 characters.)
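The arithmetic behind these estimates can be checked with a short sketch. It is only an illustration under stated assumptions: the random string is taken to use 26 equiprobable letters, the entropy function is the standard Shannon formula (presumably what formula (2) denotes), digrams are counted with overlap, and all function names are hypothetical.

```python
import math
from collections import Counter

# Dembski's universal probability bound, one chance in 10 to the power 150.
UNIVERSAL_PROBABILITY_BOUND = 1e-150


def bits(probability: float) -> float:
    """Information, in bits, of an event with the given probability."""
    return -math.log2(probability)


def letters_for(bits_needed: float, alphabet_size: int = 26) -> int:
    """Smallest length of a uniformly random string carrying more than bits_needed bits."""
    per_letter = math.log2(alphabet_size)
    return math.floor(bits_needed / per_letter) + 1


def ngram_entropy_per_char(text: str, n: int = 1) -> float:
    """Shannon entropy of overlapping n-grams, divided by n (bits per character)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / n


if __name__ == "__main__":
    print(round(bits(UNIVERSAL_PROBABILITY_BOUND), 1))  # bits behind the bound
    print(letters_for(500))                             # letters needed to pass 500 bits
    sample = "to be or not to be that is the question"
    print(ngram_entropy_per_char(sample, 1), ngram_entropy_per_char(sample, 2))
```

Under these assumptions the bound corresponds to roughly 498 bits, and a uniformly random 26-letter string passes 500 bits at 107 letters, close to the figure of about 105 quoted above. The sample sentence is far too short to give a reliable entropy estimate for English; it only shows how the first-order and second-order quantities are computed.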
Dembski offers his new definition
of the Law of Conservation of Information on page 160. This definition
provides a good example of Dembski's style which is characterized by convoluted
renditions of rather simple concepts. Here is the quotation from page
160:
"Law of Conservation of Information. Given an item of CSI, call it
B = (T_{2}, E_{2}), for which E_{2} arose by natural
causes, any event E_{1} causally upstream from E_{2} that under
the operation of natural causes is sufficient to produce E_{2} belongs
to an item of CSI, call it A = (T_{1}, E_{1}), such that
(LCI_{csi})    I(A&B) = I(A) mod UCB    (3)
where by definition the quantity of information in an item of
specified information is the quantity of information in the conceptual component
(i.e., I(A) =_{def} I(T_{1}) and I(A&B) =_{def} I(T_{1} & T_{2}))."
If an average reader is puzzled by
the above definition, with its collection of constituent concepts piled upon
each other, such a reader can be consoled that he is not alone. First note that
the abbreviation UCB stands for "Universal Complexity Bound," which, Dembski
explains, "throughout this book we take to be 500 bits of information." He
also explains that the abbreviation "mod" stands for "modulo," which "refers to
the wiggle room within which I(A) can differ from
I(A&B)." Dembski elaborates by saying: "To say that these two
quantities are equal modulo UCB is to say that they are essentially the same
except for a difference no greater than UCB." (Note that this use of the term
modulo differs from its standard use in mathematics.) Regarding the notation
T, Dembski explains it on pages 141–142 as follows: "The event E ... is an
outcome that occurred via some physical process. The target T ... is a pattern
identified by the intelligent agent S without recourse to the event E. ... both T
and E denote events. The ordered pair (T,E) now constitutes specified
information provided that the event E is included in the event T and provided
that T can be identified independently of E (i.e., is detachable from E)."
Whereas the quoted passages
illustrate Dembski's propensity for unnecessary quasi-scientific esoteric
language, if an average reader is still in the dark regarding what exactly
Dembski's definition of LCI means, for such a reader Dembski also provides a few
more definitions in plain words. On pages 159–160 we read: "If a natural cause
produces some event E_{2} that exhibits specified complexity, then for
every antecedent event E_{1} that is causally upstream from E_{2}
and that under the operation of natural causes is sufficient to produce
E_{2}, E_{1} likewise exhibits specified complexity." Now,
this is a bit simpler than the above quoted notation-laden definition. It
lacks, though, that part of the full definition which is about the wiggle room.
Actually Dembski permits the natural causes which produce a consequent event
E_{2} to add information to that already contained in an antecedent
event E_{1}, but only if the additional information does not
exceed 500 bits. On page 161 he says: "Because small amounts of specified
information can be produced by chance, this 500-bit tolerance factor needs to be
included in the Law of Conservation of Information." This seems to be a
small step in the right direction on Dembski's part, because his earlier formulation
did not allow for any wiggle room, as he asserted that "natural cause cannot
generate CSI" without exception.
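The difference between Dembski's "modulo" and the standard mathematical one can be made concrete with a small sketch. The function names and the sample numbers of bits are hypothetical; the first function encodes only the reading quoted above (a difference no greater than UCB), the second the usual notion of congruence.

```python
UCB_BITS = 500  # "Universal Complexity Bound" as quoted from Dembski


def equal_mod_ucb(info_a: float, info_a_and_b: float, ucb: float = UCB_BITS) -> bool:
    """Dembski's reading: the two quantities differ by no more than ucb bits."""
    return abs(info_a_and_b - info_a) <= ucb


def congruent_mod(a: int, b: int, modulus: int) -> bool:
    """Standard mathematical reading: a - b is an exact multiple of modulus."""
    return (a - b) % modulus == 0


if __name__ == "__main__":
    # 700 and 1100 bits differ by 400: "equal" in Dembski's sense,
    # but not congruent modulo 500 in the usual sense.
    print(equal_mod_ucb(700, 1100), congruent_mod(700, 1100, 500))
    # 700 and 1700 bits differ by 1000: the two verdicts are reversed.
    print(equal_mod_ucb(700, 1700), congruent_mod(700, 1700, 500))
```

The two notions can give opposite answers for the same pair of numbers, which is why the borrowed term is misleading.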
8. Can functions add information?
It is instructive to look at
some passages in DNFL which precede the definition of the LCI. On pages 151–154
we find a lengthy discussion of whether or not functions can add information.
Perhaps it would be better to use the term "algorithm" instead of function, but
function is the term chosen by Dembski. On page 152 we read: "Functional
relationships at best preserve what information is already there, or else
degrade it – they never add to it." Now jump over two pages. On page 154 we
read: "I have just argued that when a function acts to yield information, what
the function acts upon has at least as much information as what the function
yields. This argument, however, treats functions as mere conduits of
information, and does not take seriously the possibility that function might add
information." Dembski proceeds with an example of a function which adds
information and concludes the passage as follows: "Here we have a function that
is adding information. Moreover, it is adding information because the
information is embedded in the function itself."
Here is a quintessential Dembski,
who makes two important-sounding statements within two pages, of which the second
statement completely negates the first one. So, which statement is correct:
the one asserting that functions "never add" information or the one asserting
that there are functions adding information embedded in the functions themselves?
Since Dembski has a goal, namely to
prove something he takes as true before even considering arguments in favor of or
against his belief (that CSI can only be created by an intelligent agent),
whereas his own example with functions seems to contradict his thesis, he
resorts to mathematical-looking acrobatics wherein the simple facts are
obscured by esoteric notations. At the end of page 154 and the beginning
of page 155 he offers what can only be viewed as a quasi-mathematical trick
aimed at allegedly reconciling his two irreconcilable statements.
To perform his trick Dembski
introduces a new operator U which comprises both the initial information
i (source information) and the initial function f (which can add
information embedded in it to the initial information i). He
insists that unlike f, U does not add information. How the
inclusion of f in a composite function U makes f lose its
ability to add information is not explained. If f can add
information embedded in it, no mathematical trick like making it a part of a
composite function can eliminate its ability to add information, regardless of
whether it does so as a standalone function or as a constituent of the composite
function U.
One of his statements on page 155
is: "...distinction between functions and information is not hard and fast."
The intrinsic meaning of that statement seems to be Dembski's secret. For
anybody who is not Dembski's admirer, the concepts of information and of
function are quite distinct. Having concluded that functions do not
after all add information (a conclusion which is necessary to support his
preconceived thesis) Dembski says: "Formula (*) confirms this as well." It
is rather odd to hear from a mathematician that a formula confirms
something. Formulas in themselves neither confirm nor negate
anything. Any formula is just a statement made in mathematically
symbolic form. Formulas, as any statements, are either postulated or derived. If
a formula is postulated it obviously does not confirm anything. If a
formula has been derived, it means it was obtained via a certain logical
procedure, in a mathematically compressed form, starting from a certain premise.
Therefore a formula is only as good as is the premise. A formula in itself
cannot confirm anything beyond whatever was already assumed in the
premise, although it may shed additional light on the premise. In particular, formula (*) (page 152) is as follows:
I(A&B) = I(A) + I(B|A)
This formula is a consequence of,
first, the formula for the probability of two events A and B both actually
occurring (when A and B are not independent events) and, second, of the
definition of information as a negative logarithm of probability. It does not in
any way confirm or negate Dembski's thesis, according to which CSI can only be
created by intelligent agents, or even his narrower thesis that functions do not
add information (the latter is actually rejected elsewhere by Dembski
himself).
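That this formula is nothing more than the product rule for probabilities rewritten in logarithmic form can be seen from a minimal numerical sketch. The probabilities below are hypothetical values chosen only for illustration, and the names are not taken from Dembski's book.

```python
import math


def info(p: float) -> float:
    """Information in bits: the negative base-2 logarithm of a probability."""
    return -math.log2(p)


# Hypothetical probabilities chosen only for illustration.
p_a = 0.25           # P(A)
p_b_given_a = 0.5    # P(B|A), the conditional probability of B given A
p_a_and_b = p_a * p_b_given_a  # product rule: P(A&B) = P(A) * P(B|A)

lhs = info(p_a_and_b)                # I(A&B)
rhs = info(p_a) + info(p_b_given_a)  # I(A) + I(B|A)

if __name__ == "__main__":
    print(lhs, rhs)
```

Both sides agree for any choice of the two input probabilities, because the logarithm turns the product into a sum. The identity holds whatever produced the events, so it carries no message about intelligent agents or about functions.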
There is one more rather
convincing indication of the fallacy of Dembski's LCI. I intended to
provide this additional consideration but I was too late. Richard Wein has
already given it and shared it with me in a private communication. Wein's critique of DNFL has now been made public [25]. Whereas
Wein and I have very different backgrounds, education and experience, so that we
approach Dembski's work from different vantage points, his argument against LCI
in this specific case turned out to be quite close to what I had in mind but was
too slow to develop. At my request, Richard kindly permitted me to use
his argument in this paper.
It goes as follows. Wein
first quotes the already quoted equation (*):
I(A&B) = I(A) + I(B|A)    (*)
In this
equation, A and B denote events of which A is the cause of B and B the
consequence of A. According to Dembski, since A entails B, the
combination of events A and B carries no more information than was already
carried by A alone, i.e.,
I(B|A) = 0, or
I(A&B) = I(A)
As mentioned before, equation (*) is a consequence of going from
probabilities to information via a logarithmic transformation.
(I'd like
to add to Wein's argument that to reconcile equation (*) with Dembski's
definition of LCI, it would be necessary to assume not that
I(B|A) = 0,
but rather that (using Dembski's own notation mod)
I(B|A) = 0 mod UCB.
This is
just one more example of Dembski's inconsistency.)
As Wein points out, I(B|A) in
equation (*) "is not Dembski's specified information (SI)! The problem is
that I(B|A) is just P(B|A) transformed, and P(B|A) is the
true conditional probability of the event, which in this case is 1. SI,
on the other hand, is based on the assumption of a uniform probability
distribution, regardless of the true probability of the event."
I believe the combination of Wein's argument with
my preceding argument effectively lays to rest any claims of legitimacy of
Dembski's LCI.
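Wein's distinction can be illustrated numerically. The sketch below is an illustration under stated assumptions only: the space of 2^600 possible outcomes and the function names are hypothetical, and the uniform-distribution measure is a simplified stand-in for SI as described in the quoted passage, not Dembski's full definition.

```python
import math


def conditional_info(p_b_given_a: float) -> float:
    """The conditional term of equation (*): log2 of 1 over the true P(B|A)."""
    return math.log2(1.0 / p_b_given_a)


def uniform_info(num_outcomes: int) -> float:
    """Bits assigned to one outcome when all num_outcomes possibilities are
    treated as equally likely, whatever their true probabilities are."""
    return math.log2(num_outcomes)


# A deterministic cause: A entails B, so the true P(B|A) is 1.
true_bits = conditional_info(1.0)

# The same event B measured against a uniform distribution over a
# hypothetical space of 2**600 possible outcomes.
uniform_bits = uniform_info(2 ** 600)

if __name__ == "__main__":
    print(true_bits, uniform_bits)
```

The first quantity is zero, while the second exceeds the 500-bit bound, although both refer to the same event. The two measures therefore cannot be substituted for one another in equation (*).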
Pages 152–154 are full of
"notation-heavy prose" (using Dembski's own expression) which he allegedly tries
to avoid (page xvii). This segment of his book is saturated with such
terms as "homomorphism of Boolean algebras" and the like. All these piles
of mathematical notations are irrelevant to his thesis. They serve no useful
role except for impressing readers with the alleged sophistication of Dembski's
discourse.
Note that in all of the reviewed
discussion Dembski always refers to information rather than to Complex Specified
Information. As discussed before, CSI is actually not information in the
sense of information theory. Since Dembski's Law of Conservation of
Information, despite its name, actually asserts something about CSI rather than
about information, the whole discussion on pages 151–154 does not seem to be
related to his subsequent discussion of LCI.
Dembski constantly switches
between CSI and information, without making any comments in regard to this
switching. This makes all of his discourse in regard to information and LCI
inconsistent and rather confusing. To understand what exactly he means by this
or that statement the reader must constantly be on alert and it seems impossible
to discuss Dembski's thesis in a consistent way. Overall, his conclusion,
repeated many times all over his book, that CSI cannot be created other than by
an intelligent agent, remains utterly arbitrary.
In view of the above it can be
asserted that, whichever of several mutually contradictory interpretations of
Dembski's LCI is chosen, this alleged law makes no sense. Dembski's
suggestion that his LCI can be generalized as the Fourth Law of Thermodynamics
has no basis in fact. His supposed Fourth Law would actually contradict the Second Law of Thermodynamics. The Second Law asserts that entropy in a closed system cannot decrease, while Dembski's LCI states that information in a closed system cannot increase. Since average information and entropy are tied together according to Dembski's own definition, his alleged new law cannot be taken seriously, although we can expect that from now on his cohorts will trumpet the alleged fundamental discovery of a new law of nature by the great mathematician and philosopher Dembski.
There are many more parts in Dembski's new book dealing with a variety of
topics. For example, on pages 49–51 Dembski purports to correct alleged
weaknesses in Fisher's statistical theory of hypothesis testing by
generalizing it (without any reasonable substantiation for such a claim).
Reviewing all of them would require much more time and effort than they
deserve.
In his new book Dembski continues to adhere to an obviously incorrect idea that
complexity is inextricably tied to low probability. This idea contradicts a variety of facts as well as the much more reasonable definitions of Kolmogorov complexity [23] and computational complexity. I have
previously offered examples illustrating that it is simplicity rather than
complexity which points to low probability [as in the case of irregularly shaped
(i.e., complex) pebbles vs. a perfectly spherical (i.e., quite simple in shape)
piece of stone (see [2] and [4])]. Another example showing that high
complexity does not necessarily mean low probability was discussed above in the
section on EF, where the case of flat triangular ice crystals was reviewed.
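A rough numerical illustration of why probability and complexity are different notions is sketched below. It rests on assumptions that should be kept in mind: Kolmogorov complexity is not computable, so the length of the zlib-compressed string is used only as a crude upper-bound proxy for it, and the two byte strings, their length and the function name are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude proxy for Kolmogorov
    complexity (which itself is not computable)."""
    return len(zlib.compress(data, 9))


LENGTH = 1000
rng = random.Random(0)  # fixed seed so the run is repeatable

regular = b"AB" * (LENGTH // 2)                               # highly patterned
irregular = bytes(rng.randrange(256) for _ in range(LENGTH))  # patternless

# Under a uniform model over byte strings, each of these two particular
# strings has exactly the same probability, 256 ** -LENGTH.
sizes = (compressed_size(regular), compressed_size(irregular))

if __name__ == "__main__":
    print(sizes)
```

The two strings are equally improbable as outcomes of uniform random sampling, yet the patterned one compresses to a small fraction of its length while the patternless one hardly compresses at all. Equal probability thus goes together with very different complexity, so one cannot be used as a synonym for the other.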
9. Are NFL theorems relevant for Dembski's thesis?
It is time to say a few words about the title of Dembski's new book. It has been
borrowed from the name of a set of mathematical theorems proven a few years ago
by David Wolpert and William Macready (NFL theorems).
As far as the purely mathematical essence of the NFL theorems is concerned,
these theorems are
proven beyond doubt. However, every mathematical theorem, however logically
impeccable, is true only to the extent that the premise accepted for its
derivation is fulfilled. The NFL theorems are applicable only if and when
certain conditions are fulfilled which constitute a part of the premise on which
the proof of these theorems was based. For example, the NFL theorems are
only applicable to the so-called "black box" algorithms. There are certain
other conditions which limit the area of applicability of these fine
mathematical results. There are certain situations wherein the NFL
theorems are either inapplicable or at least require an investigation of their
applicability. One such case is that of biological evolutionary
algorithms. Before trying to apply the NFL theorems to his theory of the
solely intelligent origin of CSI, Dembski should have performed a detailed
analysis to find out whether or not the NFL theorems can be legitimately applied
to his case. He did not do that, simply assuming that the NFL
theorems work for biological evolutionary algorithms. Dembski applied these theorems to a case where their use is plainly unjustified.
A critique of Dembski's use of the NFL theorems has been offered by several authors (for example in [24, 25]). Recently David Wolpert, one of the coauthors of the NFL theorems, wrote a brief review [27] of Dembski's NFL book where he dismissed Dembski's discourse as mathematically vague (in Wolpert's terms, "written in jello").
I offer a detailed analysis of Dembski's misuse of the NFL theorems in [32]. Here I present a brief exposition of the main point of my critique.
The NFL theorems assert that any two search algorithms "perform" equally well if their "performance" is averaged over all possible "fitness functions." From that Dembski concludes that no algorithm can outperform a random sampling (or "blind search"). Since a random sampling, in order to produce complex organisms from a much simpler progenitor, needs an enormous number of trials and therefore an enormously long period of time, then, if we accept Dembski's conclusion from the NFL theorems, no evolutionary algorithm can succeed in producing complex organisms within the period of time available for evolution. Hence, concludes Dembski triumphantly, the NFL theorems prove the impossibility of Darwinian evolution.
Without arguing about the reliability of Dembski's final triumphant conclusion about Darwinian evolution, I can categorically assert that such a conclusion does not at all follow from the NFL theorems.
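What the averaging in the NFL theorems does and does not say can be checked by brute force on a toy problem. The sketch below is an illustration under assumptions of my own choosing: a search space of three points, fitness values 0 or 1, two hypothetical search procedures that each evaluate two distinct points, and "performance" measured as the best value found.

```python
from itertools import product

POINTS = (0, 1, 2)   # a tiny search space
VALUES = (0, 1)      # possible fitness values


def fixed_order(f):
    """Non-adaptive search: always evaluates point 0 and then point 1."""
    return max(f[0], f[1])


def adaptive(f):
    """Adaptive search: evaluates point 0, then chooses the second point
    according to the value it has just seen."""
    second = 1 if f[0] == 1 else 2
    return max(f[0], f[second])


# Every possible fitness function on POINTS, encoded as a tuple of values.
all_functions = list(product(VALUES, repeat=len(POINTS)))

total_fixed = sum(fixed_order(f) for f in all_functions)
total_adaptive = sum(adaptive(f) for f in all_functions)

# One particular fitness function on which the two searches disagree.
single = (0, 0, 1)

if __name__ == "__main__":
    print(total_fixed, total_adaptive)
    print(fixed_order(single), adaptive(single))
```

Summed over all eight fitness functions the two procedures score the same, as the theorems require. On the single function shown, one of them finds the maximum and the other does not, and nothing in the theorems forbids that.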
As mentioned above, the NFL theorems only relate to algorithms' performance averaged over all possible fitness functions. These theorems say nothing about algorithms' relative performance on specific classes of fitness functions. In fact, various algorithms perform very differently on specific fitness landscapes, and the NFL theorems in no way prohibit this. In Dembski's book there are many examples of such situations, which he strangely seems not to perceive as contrary to his thesis. For example, Dawkins's algorithm generating a phrase from Shakespeare reaches its target in only about 40 iterations. A random sampling would, as Dembski himself points out, take about 10^{40} iterations. The algorithm thus outperforms random sampling by an enormous margin. The same is observed in other examples Dembski refers to: the search for an optimal antenna shape, a checker-playing algorithm, etc.
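The contrast between cumulative selection and blind sampling on one specific fitness function can be reproduced with a short program. The sketch below is not Dawkins's original program, whose details are not given here; the brood size, the mutation rate, the rule that the parent competes with its offspring, and all names are assumptions made for illustration.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "


def score(candidate: str) -> int:
    """Number of positions at which the candidate matches the target."""
    return sum(c == t for c, t in zip(candidate, TARGET))


def mutate(parent: str, rate: float, rng: random.Random) -> str:
    """Copy the parent, redrawing each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def cumulative_selection(seed=0, offspring=100, rate=0.05, max_generations=10_000):
    """Return (generations used, final string). The parent competes with its
    own offspring, so the best score never decreases."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for generation in range(max_generations + 1):
        if parent == TARGET:
            return generation, parent
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        parent = max(brood + [parent], key=score)
    return max_generations, parent


if __name__ == "__main__":
    generations, final = cumulative_selection()
    print(generations, final)
    # Size of the space a blind search would have to sample from.
    print(f"{len(ALPHABET)} ** {len(TARGET)} = {len(ALPHABET) ** len(TARGET):.2e}")
```

The number of generations depends on the assumed parameters and the seed, so it need not match the figure of about 40 cited above, but it stays many orders of magnitude below the roughly 10^40 candidates among which a blind search would have to sample.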
Therefore Dembski's attempt to utilize the fine mathematical result of Wolpert and Macready, their NFL theorems, is unsubstantiated.
Overall, Dembski's new book is a hodgepodge of unsubstantiated but quite pretentious claims and unnecessary quasi-mathematical exercises serving no useful purpose, and it displays many features of pseudoscience so eloquently described by Gardner [14].
Finally, I would like to say again that the discussion of Dembski's work in this article, as well as in [2], addresses a reader who has no special training in information theory, mathematical statistics and related disciplines. A more detailed discussion, which can be comprehended only by readers with some mathematical background, is given in several publications, among which I recommend articles by Richard Wein [25, 28], John Wilkins and Wesley Elsberry [30], and Wesley Elsberry and Jeffrey Shallit [31]. Except for the difference in the targeted audience and thus in the level of mathematical sophistication of the discourse, I share the views of these authors in their critique of Dembski's literary production.
10. Acknowledgments
I am indebted to Brendan McKay, Matt Young, and Richard Wein for useful comments and to Wein and Jeffrey Shallit for the permission to use some of their material before it was made public.
11. Appendix
(In May 2002, Dembski posted a lengthy response [26] to Wein's paper [25]. It was replete with irrelevant ad hominem remarks. Dembski often used the word rubbish to characterize Wein's arguments, but otherwise it was mostly a repetition of the material Dembski offered in his earlier publications. It failed to refute Wein's critique. Wein, however, decided to respond [28] to Dembski's failed refutation. Wein's excellent response speaks for itself, showing the complete absence of substance in Dembski's piece [26]. In June 2002, Dembski published one more response [29] to Wein's article [28]. Dembski's new rebuttal [29] is a remarkable document. It displays Dembski's enormous ego and arrogance. Again, it is replete with insulting personal remarks, references to Nobel laureates who all love Dembski, and scoffing advice to Wein as to what the latter's behavior should be. Leo Tolstoy wrote that the actual value of a human being is a fraction wherein the numerator is that person's talents and the denominator is that person's opinion of himself. If the denominator is very large, the fraction approaches zero. In Dembski's case, the numerator may be reasonably large, but the denominator is enormous.) What a waste!
12. References
1. William A. Dembski, No Free Lunch. Why Specified Complexity Cannot Be Purchased without Intelligence, Rowman & Littlefield Publishers, 2002.
2. M. Perakh, A Consistent Inconsistency.
3. W. Dembski, The Design Inference, Cambridge University Press, 1998.
4. Michael J. Behe, Darwin's Black Box, The Biochemical Challenge to Evolution, Simon and Schuster, 1996.
5. M. Perakh, Irreducible Contradiction.
6. M. Perakh, Science In the Eyes Of a Scientist.
7. John H. McDonald, A reducibly complex mousetrap.
8. W. Dembski, in coll. Science and Evidence for Design in the Universe, Ignatius Press, 2000.
9. Russell Doolittle, in Boston Review, February-March 1997.
10. T.H. Bugge et al., Cell, 87, 709-719, 1996.
11. W. Tape, Atmospheric Halos, Antarctic Research Series, v. 69, American Geophysical Union, 1994.
12. W.A. Bentley and W.J. Humphreys, Snow Crystals, Dover, 1962.
13. U. Nakaya, Snow Crystals: Natural and Artificial, Harvard University Press, 1954.
14. Martin Gardner, Fads and Fallacies in the Name of Science, Dover Publications, 1957 (originally the book was published by G.P. Putnam's Sons in 1952 under the title In the Name of Science).
15. Victor J. Stenger, Messages from Heaven.
16. Matt Young, How to Evolve Specified Complexity by Natural Means.
17. Claude E. Shannon, A Mathematical Theory of Communication, Bell System Tech. J., July 1948 and October 1948.
18. Richard E. Blahut, Principles and Practice of Information Theory, Addison-Wesley Publishing Co., 1990.
19. W. Dembski, Intelligent Design, The Bridge Between Science & Theology, InterVarsity Press, 1999.
20. L. Landau and E. Lifshitz, Statistical Physics (Moscow: Gosfizmatizdat, 1971). In Russian (an English translation is available).
21. Paul Vitanyi, Meaningful Information, in Front for the Mathematics ArXiv.
22. Mark Perakh and Brendan McKay, Study of Certain Statistical Properties of Meaningful Texts as Compared to Randomized Conglomerates of Letters.
23. A. N. Kolmogorov, Three Approaches to the Quantitative Definition of Information, in Problemy Peredachi Informatsii (in Russian); English translation under the same title in Problems of Information Transmission, 1(1), 1965.
24. Jeffrey Shallit (University of Waterloo, Ontario, Canada). Private communication.
25. Richard Wein, Not a Free Lunch But a Box of Chocolates, see also on this site.
26. William A. Dembski, Obsessively Criticized but Scarcely Refuted: A Response to Richard Wein.
27. David Wolpert (Santa Fe Institute), William Dembski's treatment of the No Free Lunch theorems is written in jello.
28. Richard Wein, Response? What Response?
29. William A. Dembski, ARN Discussion Forum: Dembski Responds to Wein's Response.
30. John S. Wilkins and Wesley R. Elsberry, The Advantages of Theft over Toil: The Design Inference and Arguing from Ignorance, Biology and Philosophy, 16 (2001): 711.
31. Wesley Elsberry and Jeffrey Shallit, Information Theory, Evolutionary Computation, and Dembski's "Complex Specified Information." A preprint made available through a private communication.
32. Mark Perakh, There Is a Free Lunch After All. (A chapter in the collection to be published, editors Matt Young and Taner Edis).