User:Aryeh M. Friedman/State of Fear Debate

This Wikipedia article is meant to present as fair and balanced a look as possible at the science and the political debate that Michael Crichton's novel State of Fear has created. It is not a place to debate the actual merits of either side of the debate (please do that on the discussion page).

DISCLAIMER: THIS ARTICLE PURPOSEFULLY DRAWS NO CONCLUSIONS. ITS ONLY AIM IS TO PRESENT THE COMPLEXITY OF THE DEBATE BEHIND THE EVIDENCE (OR LACK THEREOF) FOR GLOBAL WARMING. THEREFORE, UNLESS IT IS KNOWN TO BE BASED ON FACTS THAT YOU KNEW BEFORE READING THIS ARTICLE OR "STATE OF FEAR", TAKE ANYTHING SAID HERE OR IN ITS SOURCES WITH A LARGE DOSE OF SKEPTICISM.

Please note, regarding the disclaimer, that a careful reading of the end material in State of Fear shows this is, after all, what Crichton wanted the reader to do in the first place: namely, to get interested enough in the topics explored to go out and confirm the science behind them, starting, but not stopping, with the references he gives.

Accuracy and Validity of Citations
The primary science and political debate surrounding State of Fear centers on how much of the "fiction" in the book is indeed based on an accurate assessment of the current "state of the art" in climate research. Thus, if the debate over the actual science has any bearing on the title, it is critical that we ask:


 * Are the citations Crichton gives based on "solid" and credible journal articles and/or general academically oriented articles in reputable publications meant for the general public and the scientific community? While a complete list is not possible here, all parties agree that Science, Nature and Scientific American are the leading publications in this field.


 * If there does exist a sound scientific consensus (a term that Crichton claims is by definition not science but only speculation) that disagrees with Crichton, then even if he has not misused his source material he is at least guilty of an interpretation that is as unscientific as the one he claims to be exposing in the title.


 * Does Crichton misuse and/or otherwise mislead the reader as to the nature and meaning of his cited work?


 * If all of Crichton's citations stand up to the above parameters, has their mention in the title (making the general public aware of them) allowed politicians and other non-scientists to misuse and/or mislead with Crichton's references?

Are the citations valid scientific works?
Several science, technical and general journalists and other independent observers have reviewed the citations supplied in an appendix of the book for whether or not they represent valid scientific works. Since the journalists are themselves divided on the matter, the following breaks down the most notable and/or verifiable such reviews, both for and against the citations' authenticity.

Note: No critical comments about the authenticity of the citations appeared in the first 5 pages of results returned when searching Google for " 'state of fear' citations"

Authentications (all relevant entries from the Google search described above):


 * The Boston Globe, while critical of Crichton's use of the sources, does confirm they are authentic


 * The Heartland Foundation, well known for its critical view of claims of GW, has also verified the sources as authentic


 * The Institute for Skeptical Inquiry, while extremely critical of Crichton's use of his citations, does not dispute that the sources themselves are genuine. The article directly checked only one citation and leaves the rest unchecked.


 * David Deming of the College of Geosciences, University of Oklahoma, in a pre-publication version of an article for the Journal of Scientific Exploration, strongly rebukes Crichton's critics (the implicit assumption by the author is that the citations are all real; since this is a peer-reviewed journal, it is highly unlikely the article would have passed review unless the author did in fact check them himself)


 * An anonymous author offers an assignment she gives her Chemistry classes (which school is unknown) in which she asks the students to look critically at the science behind State of Fear. She does not directly state that Crichton's citations are "real", but the fact that the assignment is given implicitly says she believes them to be.


 * In a reply to a highly critical blog entry, Tim Yakin's blog on the book confirms the authenticity of the citations.


 * The Washington Times quotes William H. Schlesinger, dean of the Nicholas School of the Environment and Earth Sciences at Duke University, as saying that while Crichton has selectively chosen and misused his sources, they are all real scientific publications.

Does Scientific Consensus Exist on the Issue of Climate Change?
Before discussing whether or not there is scientific consensus on either side of Climate Change (or any other speculative field), we must clearly separate our levels of confidence in what such a "consensus" says. In other words, we must clearly separate fact from extremely-likely-to-be-fact-but-unproven theories and from pure speculation:


 * Known facts: A scientific fact exists only when there is unanimous agreement among everyone knowledgeable in the field that the only possible interpretation of the data is the one currently "known". For example, take Newton's Laws of Motion (for non-quantum and/or non-relativistic special cases): no physicist (including amateur physicists who have had only a completely non-technical introduction in high school) disagrees with the following two formulas: $$f=ma$$ or $$f_g=G\cfrac{m_1 m_2}{r^2}$$


 * Highly likely, but currently unproven theory: Items that extremely few (<0.0000...%) researchers in the field would disagree with but that have yet to be proven beyond all possible doubt. Another example from physics: there is almost universal agreement that some form of "trivial" connection exists between Newtonian physics, relativity and quantum mechanics, yet no experimental proof of this has been found. This is rather common in science: the first actual physical proof of relativity was not found until years after Albert Einstein published it, by which time relativity was already considered to be in this category.


 * Pure speculation: Ideas/theories that, while dressed in the language of science, are nothing more than speculation and/or provide a framework to be filled in once data becomes available. Even the most ardent believers in the idea/theory, if they are intellectually honest, will not state it as anything more than pure speculation (unless put under extreme pressure, as Crichton's primary thesis argues; and even then not without every possible disclaimer they can "get away with"). A typical example: as of today (8/22/08) there exists a small group of astrophysicists who think, under a very unorthodox interpretation of a yet-to-be-experimentally-tested non-mainstream theory, that the value of the Gravitational Constant ("G" in the second formula above) has not been constant throughout the history of the universe; but there is no agreement even within this group as to what changes, or even what trends, G has undergone.
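As an aside, the two "known fact" formulas quoted in the first bullet can be evaluated directly; a minimal Python sketch (the numeric inputs are hypothetical values chosen only for illustration):

```python
# Illustrative only: evaluating the two Newtonian formulas quoted above.

G = 6.674e-11  # gravitational constant, N*m^2/kg^2

def force(m, a):
    """Newton's second law: f = m * a."""
    return m * a

def gravity(m1, m2, r):
    """Newton's law of universal gravitation: f_g = G * m1 * m2 / r^2."""
    return G * m1 * m2 / r**2

# A 1000 kg car accelerating at 3 m/s^2 experiences a 3000 N net force.
print(force(1000, 3))          # 3000

# The force between two 1 kg masses 1 m apart is numerically just G.
print(gravity(1.0, 1.0, 1.0))  # 6.674e-11
```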

As the above examples show, it is possible to have "known fact" status for the big picture while there is still dispute over the details. This means that when we discuss anything with less than "known fact" status, we must ask which portions of the discussion have what degree of confidence placed in them. For example, in the Global Warming debate there is no dispute that $$CO_2$$ is released by the burning of fossil fuels, but the extent, if any, of the "damaging" effects of long-term Climate Change is nothing more than pure speculation. Estimates range from a boom for agriculture to a world that resembles the one in the movie "Waterworld", with almost no agreement even among the most ardent proponents of human activity being the only source of Climate Change.

While good science is not done merely by a skeptical review of the literature, such reviews are an important component in detecting the existence, or lack, of a given confidence level in the overall discussion and/or its components. Once stripped of its fictional part, State of Fear is nothing more than a "critical" literature review of the available (as of 2005) "state of confidence" in the theory of Global Warming being caused by human activity. Thus any serious look at the "evidence" Crichton offers requires us to look at areas that are not directly covered in the book but are nonetheless required to gain a complete picture of that state of confidence. This also requires that we look at fields not directly related to the climate, such as math and computer science.

Bottom line: we cannot and should not talk about a single "scientific consensus" for or against Climate Change; rather we need to look at the existence, or lack thereof, of consensus in the many seemingly unrelated fields that are required to form the theory behind Climate Change and to verify its predictions. The balance of this section will attempt to summarize the level of confidence we can place on the most important aspects of the theory of human activity's effect on the Climate. The reasons why we have this level of confidence will be examined in detail in later sections.

Separating Objective from Subjective Sources of Data for the "State of Confidence"
Even though the current Scientific Method has come under an increasing number of criticisms for its tendency towards "group think" as well as its inefficiency in reporting new findings, it is the only almost universally accepted method for gathering objective data about the natural world and resolving differences of opinion when little or no hard data exists. Thus this article will use the following methods of gathering data on the current level of confidence in an idea/theory, methods that are generally recognized as valid under the Scientific Method:


 * Peer reviewed literature reviews
 * Peer reviewed "unbiased" surveys of researchers in the field

The following sources, sometimes cited in non-scientific circles, will not be considered objective:


 * Letters to the editor in any periodical and/or blog, including peer reviewed journals, general field survey magazines (such as CACM and IEEE Spectrum in computer science), general "quality" science magazines (such as Science, Nature and Scientific American), or general magazines/newspapers/the electronic media


 * Editorials in any of the above periodical types
 * Introductions to and/or summaries of special issues of the above periodical types
 * Books that are not peer reviewed to the same degree as the above "legit" sources

The Pros and Cons of Different Fact-Gathering Methods
There are two ways to gather data about the existence and/or non-existence of consensus on any given topic. (Some critics, such as Crichton, claim the very term "scientific consensus" is nothing more than a fancy way of saying "scientific speculation".) The first is to do a literature search of the available peer reviewed work in the field. The second is to conduct anonymous and non-anonymous surveys of people active in the field in question. There are important limitations to each form of data gathering:

Peer reviewed papers:


 * Due to the need for approval by at least 3 other researchers (depending on publication), unless one's results are beyond question it is often risky to go against whatever pre-existing biases may be present in the community of like-minded researchers


 * Most areas of research are so specialized today that there is a very high likelihood you are familiar, at least through their published work, with the writing style of every other "serious" researcher in the field. Thus even though peer review comments are returned to the author anonymously, one can often guess to within one or two people who wrote the review. Since most people in the community of "serious" researchers know this is likely to occur, both the peer reviewers and the authors will pull their punches to some extent


 * Due to the specialized community, the jargon used in such publications is often not understandable even to those familiar with the broader field (including researchers). For example, if you ask most computer scientists how a quantum computer works in theory (there exist no practical examples of such machines), you will likely be given a general computer science version of the layman's description, i.e. the layman's description without the need to define terms that all computer scientists know. But the field is so specialized, and so far outside classical computer science, that unless you have been active in quantum computing research it is highly unlikely your knowledge will be any better than "Joe Six Pack's", just more educated in the general field


 * Due to the jargon issue above, a complete outsider to the established community will find it very hard to publish, even if their results are completely correct. If it is difficult for academics to publish outside their field, it is nearly impossible for people who have gained the majority of their knowledge by self-education, because self-educated people tend to use technical slang instead of "official" jargon


 * For personality reasons (natural introversion and/or a mental disability such as Asperger's) many researchers are afraid to publish, especially if their view is not the "orthodox" view. For example, Sir Isaac Newton delayed publishing his results in physics and calculus for decades, in the second case allowing Leibniz to make a good enough claim to co-inventing calculus that most texts credit both as co-inventors of the subject. In extreme cases there may be the indirect threat of physical harm, such as the very extreme case of Galileo Galilei being placed under house arrest for offering evidence that the Copernican model of the solar system must be correct

Surveys:


 * If too many open ended questions are used, gathering reliable quantifiable statistics is very difficult at best


 * If too few open ended questions are asked, the survey is bound to have subtle or subconscious biasing shortcomings


 * The wording of questions often has either a subtle or a subconscious biasing effect on both the questioner and the survey respondent


 * Non-anonymous surveys have the same flaw as the second point made above about peer reviewed publications


 * Unless the same survey and collection methods are used, it is nearly impossible to do a fair comparison of the results of any two surveys and/or to track general thinking in the field over time

The Big Picture Issues
Is there a current short-term global warming trend? Known fact

Is this trend a part of a longer term trend? Likely, but unproven

Do we have the necessary data to determine the above questions? Slightly speculative

Do humans have a measurable effect on the Climate? Speculative

Can any action or inaction by humans affect the Climate? Highly speculative

Does the current short-term global warming trend accurately predict "serious" changes in the climate? Highly speculative

Is it possible to understand the climate sufficiently well to answer any of the above questions? Speculative

Please note that almost all the items labeled as not being a "known fact" are so labeled not due to "high level" speculation (or the lack thereof) but rather due to the less-than-"known fact" status of the underlying fields needed to answer the question.

Data Accuracy
The first three "Big Picture Issues" are essentially questions about the accuracy and quality of the raw data available; thus we need only look at the following items to answer them:

Have uniform methods been used over the course of the last century or so for direct measurement of weather conditions and seasonal averages? Known to be untrue

Can we correct for the non-uniformity of collection methods? Very likely, but unproven

Is the coverage, both spatially and across time, of direct measurements sufficient to get a good picture of the total climate? Slightly speculative

If all of the above is corrected for, are there known non-"Climate Change Theory"-based errors (such as changes in relative heat reflection due to land use)? Known fact

Can we correct for such non-Climate Change based errors? Speculative

Do we have verifiable objective data of long enough duration (no less than 400 or 500 years) as to the general climate in the distant past? Likely, but unproven

Is the raw data accurate enough for input into further stages of climate modeling? Highly speculative (see computer modeling section)

Climate Modeling
The balance of the "Big Picture Issues" deals with our ability to accurately model the climate, so we need to ask the following more directed questions:

Do we understand the underlying forces that have shaped the Climate in the past? Speculative

Has Human activity fundamentally altered any of these forces? Pure speculation

Do we understand the climate well enough to create accurate computer models? Pure speculation

Once computerized, do computer science and mathematical limitations on the type of system being modeled have measurable effects on the results of running the model? Known fact

Can we correct for such limitations? Known to be theoretically impossible (i.e. it has been shown beyond all doubt that it is not possible to construct a theory that would allow for their correction)

Are such errors significant enough to cause false positives and/or false negatives when predicting major trends/events? Known fact (in fact, for all non-trivial models it is guaranteed to happen)
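The "guaranteed error" claims above come from chaos theory and numerical analysis. The following toy Python sketch uses the logistic map, a standard chaotic system (not a real climate model; the choice of system and numbers is purely illustrative), to show how a tiny measurement error in the input data grows until it dominates the model's output:

```python
# Toy illustration only: the logistic map with r = 4 is a textbook
# chaotic system; real climate models are far more complex, but the
# error-growth phenomenon sketched here is the same in kind.

def logistic(x, steps, r=4.0):
    """Iterate x -> r*x*(1-x) the given number of steps."""
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

x_true = 0.400000   # the "true" initial condition
x_meas = 0.400001   # the same reading with a one-in-a-million error

for steps in (5, 20, 50):
    diff = abs(logistic(x_true, steps) - logistic(x_meas, steps))
    print(f"after {steps:2d} steps the outputs differ by {diff:.2e}")
```

After a handful of steps the two runs still agree closely; after enough steps the original one-in-a-million discrepancy has grown to the same order as the quantity being modeled, which is exactly the false-positive/false-negative risk the question above refers to.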

Does Crichton misuse any of his cited work?
Before we can say whether Crichton has misused his sources we need to establish whether the sources are misusable, and then, and only then, can we ask whether a certain person did misuse them. Additionally, a person cannot be accused of misuse if they themselves admit that they "misused" the source. So Crichton needs to fail the following criteria before we can say he misused the original science:


 * Unless it is a "known fact" (as defined above) the burden of proof is on the originator of that "fact", not the person questioning it


 * When dealing with things that are less than known facts it is OK to "misuse", "use out of context", "cherry pick", etc. as long as you freely admit that is what you are doing

Because State of Fear is a work of fiction, some artistic license must be allowed. Specifically, any statements needed to maintain intellectual honesty have to be made in a way that does not disrupt the story line; preferably one of the characters should state any intellectual issues when they start to present the "evidence" for their point of view. Thus we need to add one more limiting criterion to show whether Crichton has misused his non-fiction sources:


 * Taken as a whole, when any character makes a claim in the course of the plot that may be intellectually questionable, does that character or some other storytelling device clearly state that the claim may be intellectually questionable, and if it is questionable, be honest about why? For example, when Jennifer Haynes presents certain climate data to Evans that is, in real life, disputed, she clearly states that it is disputed. Thus while Crichton is perhaps using the data in a way favorable to his central thesis, he *IS NOT* misusing the data (because he admits he is doing this).

Misuse of known facts
Before asking whether person X uses or misuses a given fact (it is by definition impossible to use or misuse speculation), we need to ask in an objective way what exactly correct vs. incorrect use of the fact is. In the case of known facts, the fact stands unless you can offer very convincing evidence that the current understanding of that "fact" could not possibly be correct. For example, before relativity became widely accepted and eventually proven, it was hypothesized that all waves required some form of medium (i.e. substance) to propagate. Light was "known" to be a wave (later shown to be not completely true: it is in fact both a wave and a particle at the same time, the central paradox of quantum mechanics). Thus, by this reasoning, the universe must contain some undetectable medium ("ether") through which light travels. Furthermore, Newtonian Mechanics requires a stationary "background" to measure against; neither formula in the previous sub-section makes sense unless measured against a fixed (i.e. absolute) reference point. Therefore, if you are on a moving vehicle, it should be possible to detect slightly different speeds of light depending on how you look through a cross section of the ether (which, by definition, does not move). These differences should show up as interference patterns similar to those seen when an object passes in front of a light source. But when Michelson and Morley [find date] attempted to find just this interference they failed (and no finer-tuned repetition of the experiment has ever found such a pattern). Thus the "known facts" of Newtonian Mechanics were not in fact "known facts". This and other mounting evidence called the Newtonian universe into question in extreme cases (very fast speeds and/or very small objects). The point here is that Newton was accepted without question before Michelson and Morley because all the available data fit.
In other words, when Newtonian Mechanics is compared to other theories, including the hand of god, it is the simplest explanation and thus the most likely to be correct; please note that Occam's Razor does not rule out other theories, it just says that in the absence of contrary evidence (like Michelson and Morley's) the simplest possible explanation is usually the "correct" one. Thus, unless you can convincingly prove, typically by showing directly observable, independently verifiable physical evidence, that there is a flaw in some known fact or the theory behind it, it is pure fiction to say that it is not fact. Please also note that a fact proven to be incomplete at the extremes is still correct for non-extreme cases, and any new theory needs to account for that. This is why it was not until the publication of General Relativity in 1915 that there existed a likely-to-be-true, but unproven, explanation of what was happening at the extremes; as stated above, this was not physically confirmed until years later.

The above paragraphs are only meant to show how high the burden of proof is if you question a known fact, and why it is so hard for a theory to achieve that status (relativity as a whole has not quite achieved this status for some of its more "out there" predictions, but for normal everyday physics it is a known fact). The reason relativity is almost universally accepted (including the 5% or so not yet proven via direct observation) is that every time new physical evidence is discovered that could support or refute it, the evidence has agreed with relativity. Ironically, some of the predictions it made were so bizarre that not even Einstein was willing to accept them. For example, as originally published, a slightly unorthodox set of solutions to the equations implied an expanding universe, which among other things raised some very "troubling" metaphysical questions. Einstein refused to believe this since there was, at the time, no physical evidence of an expanding universe (later, through redshift measurements, it was shown that the only reasonable explanation for certain observations was that the universe was expanding, using the very predictions relativity had already made and proven). Einstein therefore published a corrected version of relativity that introduced the "Cosmological Constant" so as to make an expanding universe "impossible". Once the expanding universe was confirmed, the original theory was "correct" and Einstein was "wrong" (in the last 5 to 10 years certain other observations only make sense if there is a Cosmological Constant, though nowhere near as large as the one Einstein used). Bottom line: the same "known fact" has been revised 4 times, not counting the issues that quantum mechanics raises, to make its definition more precise (and thus limiting).

Misuse of pure speculation
It is impossible to misuse pure speculation. Since it is nothing more than a complete guess, I am free to make an alternate guess, and because there is no evidence for or against either guess we can both be wrong (or both right, if the guesses are not mutually exclusive). For example, if I say that I will roll "box cars" on my next throw of the dice in a craps game and you say I will roll a "7", then until I actually make the roll neither one of us is either right or wrong; thus it is impossible for me to misuse what you said and impossible for you to misuse what I said. Once I make the roll and it is box cars, if you still insist it is a "7" then you are not only a liar but are misusing what you yourself said. Bottom line: until something has some physical evidence supporting or refuting it, I cannot misuse it.
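The dice example can be made concrete: before the roll, each guess is only a probability, not a fact. A short Monte Carlo simulation (illustrative only; the trial count and seed are arbitrary choices) recovers those probabilities:

```python
# Before the roll, "box cars" and "7" are both speculation; all we can
# state objectively is how likely each outcome is.

import random

random.seed(0)          # fixed seed for a reproducible run
trials = 100_000
box_cars = sevens = 0

for _ in range(trials):
    roll = random.randint(1, 6) + random.randint(1, 6)
    if roll == 12:
        box_cars += 1
    elif roll == 7:
        sevens += 1

print("P(box cars) ~", box_cars / trials)   # theory: 1/36, about 0.028
print("P(seven)    ~", sevens / trials)     # theory: 6/36, about 0.167
```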

Misuse of things that are neither "pure speculation" nor "known facts"
When there exists some good evidence (physical evidence, or grounding in a theory that explains all current physical evidence) supporting a theory, but the evidence is not enough to promote the theory from "unproven theory" to "known fact", we enter the realm where otherwise intellectually honest people can overtly or covertly misuse the portions of the theory that have not been shown via physical evidence to be beyond doubt. When Crichton's critics engage in an intellectually honest debate over his use or misuse of the data, this is the gray area everyone focuses on. In other words, if you present a data set and then draw certain conclusions based on it, I am free to draw different conclusions as long as neither of our conclusions violates some "known fact".

Has Crichton Used His Sources in a Misleading Manner?
Note that in many cases the people making the claim of misuse are the authors of the very sources that Crichton cites. Additionally, the "Crichton's reaction" entries below use only the data, and the statements about why and how that data is presented, in the narrative portion of State of Fear (not the appendices and/or any speeches/testimony/etc. he has made before or after the book was published).

A few storyline/character points need to be made to understand the following:


 * Kenner is a stand-in for Crichton himself
 * Evans is a stand-in for the reader (who Crichton presumes accepts most of the tenets of climate change without any detailed study of it, but not unquestioningly)
 * Drake is a stand-in for the environmental movement's "professional" activists
 * Ted/Ann are stand-ins for people who accept the "environmentalist" party line without question
 * Sara/Jennifer are stand-ins for the academic community (intelligent enough to follow the debate but undecided until the evidence is presented), with Sara being the portion of the community likely to believe in GW on less than undeniable evidence (they will accept "likely, but unproven" as sufficient "proof") and Jennifer representing the part of the community that accepts undeniable evidence only
 * Jennifer is also a stand-in for conflicts of interest created by how the research is funded (i.e. her "official" story changes based on who is paying the bills, but in private she makes her true feelings known)
 * Sanjong is a stand-in for the local university library, reputable Internet source verification, etc.

Claim: Crichton misuses the/my research

Crichton's reaction: No direct reaction, because the claim is too broad to make a specific claim or counter-claim about

Claim: Crichton has used the/my research out of context

Crichton's Reaction: Kenner states that "he" (Crichton speaking through Kenner) is suggesting different possible conclusions (he never actually offers his own conclusions). Kenner goes on to explain that he is not questioning the data, only the speculative parts of the source. Additionally he offers Evans (the reader) the "references" so that Evans (and thus the reader) can verify that the data is not being used out of context, only the speculative part.

Claim: Crichton has cherry picked and/or in other ways selectively used the data to support his personal theory

Crichton's Reaction: The majority of the data this is claimed for is presented as Jennifer "interrogates" Evans under the guise of pre-jury-selection data gathering. Evans continually asks for alternative data in an attempt to find a hole in the "hand picked" data that Jennifer is showing him. The following sample exchange makes it clear that when Jennifer (Crichton) presents "cherry picked" data, he makes it clear that is what he is doing (get page ref):


 * "... but aren't you just cherry picking the data that is useful to your case? ", said Evans
 * "Yes we are and you can expect the defense to do the same, so we need to try every trick they will try to see if the data still stands up", said Jennifer

Claim: Crichton does not give due time to opposing views

Crichton's Response: Through a combination of Drake, Ted and Ann, Crichton does present the "orthodox" view and does not deny their factual findings (Kenner does not once deny anything that is a known fact)

Claim: Crichton misrepresents the data

Crichton's Response: Every "factual" statement is fully referenced. In the one place where Crichton uses pure speculation to the advantage of the "good guys", he clearly states it is pure speculation, when Sanjong says "Most of the work is classified" (i.e. "I am speculating here and have admitted it by offering no citations")

Claim: Crichton attempts to confuse the reader by dumping too much detail on them, thus obscuring the big picture (i.e. that the theory of global warming is "proven")

Crichton's Response: If you draw a faulty/questionable conclusion, that conclusion presumably follows correctly from your understanding of the details; thus at least one of the details must itself be faulty/questionable. For example, if I am told "Hillary Clinton is a man" I would like to know how you came to that conclusion. So you give the following argument, which is nothing more than "if a and b are true then c must also be true": a) "All politicians are men", b) "Hillary Clinton is a politician", therefore c) "Hillary Clinton is a man". It is now clear that you mistakenly claimed that all politicians are men (as Hillary Clinton proves) and thus falsely concluded she is a man.
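The faulty syllogism above can be sketched mechanically. In the following Python toy (the two-entry data set is a hypothetical stand-in, not real records), the conclusion is only licensed when every premise checks out, and premise (a) fails:

```python
# Toy check of the syllogism: c follows from a and b only if BOTH hold.
# The data below is illustrative only.

politicians = {"Hillary Clinton": "woman", "Bill Clinton": "man"}

# Premise (a): "all politicians are men"
premise_a = all(sex == "man" for sex in politicians.values())

# Premise (b): "Hillary Clinton is a politician"
premise_b = "Hillary Clinton" in politicians

# Premise (a) is False, so the argument gives no grounds for the
# conclusion, regardless of premise (b).
print(premise_a, premise_b)  # False True
```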

In the case of the theory of Global Warming we are essentially asked "if a is true and b is true and c is true then d must be true", with a, b, c, and d being:

d (conclusion). Humans are ruining the climate by causing an unnatural temperature rise that threatens the whole planet. (the slang/rhetorical version)

a. The climate is warming unnaturally fast
b. Humans are causing it
c. We know this trend will continue into the future

We need to translate this into more precise definitions, because the above does not really argue anything; it is just a request for us to take more claims on faith:

d (conclusion). Average global temperatures are rising at a faster pace than can be accounted for without human activity. We know this based on historical data and the results of feeding that historical data into various computer climate models.

a. The average global temperature is increasing faster than can be accounted for by natural causes alone
b. Human activity is the only viable explanation for a
c. We know the above because our computer models predict this when fed all relevant data

For the reasons stated in the "Scientific Consensus" section, a, b and c are all speculation of some stripe or another, and thus it is completely misleading to conclude d. But if Crichton had not presented the evidence showing that a, b and c were in question, the reader would have no factual basis (or lack thereof) for saying whether d is true or not.

Have politicians and other non-scientists misused Crichton's work (or the work he cites)?
Politicians and the media of all stripes have such bad track records for presenting, using, and understanding science that it is virtually guaranteed that both sides of the debate over global warming (and over State of Fear by implication) have misused the science, either without admitting it honestly or simply without caring about "known scientific facts." A few examples:


 * The Indiana House of Representatives in 1897 passed a bill that would have effectively redefined $$\pi$$ (the "Indiana Pi Bill"); it died in the state Senate before becoming law


 * The Reagan Administration in 1981 proposed classifying ketchup as a vegetable for the purposes of school lunches (i.e., a burger with ketchup on it would have met the vegetable requirement for the meal); the proposal was withdrawn after public outcry


 * Textbook publishers regularly make typesetting errors and do not correct them even when the author or some other expert points out the error (for example, one geometry book has for its last three editions incorrectly stated the Pythagorean Theorem as $$a=b+c$$ instead of $$a^2+b^2=c^2$$)


 * Every major network called Florida first for Gore and then for Bush before the polls on the west coast had even closed in the 2000 election


 * Environmental groups (*AND* "industry") routinely publish unsourced and thus unverifiable statistics

Given this track record, both sides of the global warming debate have misused Crichton's citations (see external links for examples from both sides of the debate).

Raw Data Reliability
Since global averages and the like are built from individual data points (i.e., weather stations), this section focuses only on the reliability of data collected at the stations and not on any subsequent processing of that data (even something as simple as computing the global average is a form of processing).

Accuracy of Collection Methods
There are several sub-questions to ask in deciding whether the raw data collection methods we use now, and have used in the past, are trustworthy enough that the resulting data can be trusted. The main ones are:

Has the physical collection method changed over the life of station?
There appears to be no reliable documentation, at least on the Internet, that records on a station-by-station basis the physical procedures used to collect the raw measurements. In some cases there are progress reports and/or upgrade requests that mention the status or benefits of installing new instrumentation at a station.

Before 1950 or so it is safe to assume there was no computer-aided data collection, so all data was by definition collected by humans; and since there is no online documentation of what standards such human reporting was required to follow, it is safe to say that the data is incomplete at best and factually incorrect at worst. Specifically, to trust the historical data we would want to know at a minimum:


 * Frequency of collection (and some sense of how often data was missing for techinical or human reasons)


 * Homogeneity of collection (was the instrumentation read without being moved, was it read at the same time of day, etc.)


 * Were the measurements taken in the same temperature scale (Fahrenheit, Celsius, or Kelvin) as they are reported in? If not, were the conversions done by the collection personnel or by a third party? If computerized, did the station do the conversion, or was it done by hand or by a third party's computer? (Slightly different interpretations of measurement/computation standards could cause disagreements; for example, if done by computer, does it handle round-off errors correctly, or does the station use one floating-point word-size standard while the central collection point uses another? See the Theory of Computation section for the implications of this.)


 * How fine-grained (i.e., how many decimal places) is the data, and to how many decimal places were the conversion formulas carried?


 * When new instruments were installed, were they calibrated and/or computationally adjusted to report the same thing the "legacy" instrumentation did?


 * For older and/or third-world stations, is there sufficient proof that the human collectors were properly trained in recording measurements and doing any needed conversions?

Due to the lack of documentation we cannot definitively answer any of the above. The bottom line is that we have to take raw station data on "good faith" that the best available methods were used in its collection.
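One of the conversion concerns above can be made concrete. The sketch below uses invented readings, not any station's actual data: a record kept to a tenth of a degree Fahrenheit loses information when rounded to a tenth of a degree Celsius, because 0.1 °F is a finer step than 0.1 °C, so distinct raw readings can collide.

```python
# Toy illustration: two distinct Fahrenheit readings, 0.1 °F apart,
# become indistinguishable once converted and rounded to 0.1 °C.

def f_to_c(f):
    """Exact Fahrenheit-to-Celsius conversion."""
    return (f - 32.0) * 5.0 / 9.0

a = round(f_to_c(70.0), 1)  # 21.1
b = round(f_to_c(70.1), 1)  # 21.2
c = round(f_to_c(70.2), 1)  # also 21.2 -- the 70.1/70.2 distinction is gone
print(a, b, c)
```

Any pipeline that archives only the rounded Celsius values has irreversibly thrown away part of the original resolution.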

In the process of standardizing data do we distort it?
Assuming the same collection methods and standards have been used, we still need to correct for known errors in the physical collection process. For example, how do we handle situations where no data is available due to technical or human factors? In the NASA GISTEMP data set (which both Crichton and his critics use as their primary raw data source) a missing value is denoted by the sentinel 999.9 degrees Celsius. Due to the vast volume of data that needs to be processed, it is only possible to standardize it with a computer (humans are too slow and error-prone). A properly designed and tested program will easily identify and "ignore" such sentinel values. The problem is that most of the software written to do such work was written by the climate scientists interested in the data rather than by professional programmers, so many subtle bugs and design flaws go untested for; see the article on Software Engineering for more details. As a result the standardized data needs to be taken with a grain of salt, and that is when looking at a data set that has already had some conversions done on it (i.e., some errors may already have crept in from how real-number arithmetic is implemented on most computers).
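The sentinel-handling step described above can be sketched as follows. The 999.9 flag comes from the article's description of GISTEMP; the function name and the sample readings are hypothetical.

```python
# Sketch: drop sentinel "missing data" values before averaging,
# rather than letting them poison the mean.

MISSING = 999.9  # sentinel marking "no reading for this period"

def clean_mean(readings):
    """Average the valid readings, ignoring sentinel entries."""
    valid = [r for r in readings if r != MISSING]
    if not valid:
        return None  # no usable data at all
    return sum(valid) / len(valid)

month = [21.4, 21.9, 999.9, 22.1, 999.9, 21.7]
print(clean_mean(month))  # averages only the four real readings
```

A naive `sum(month)/len(month)` here would report a monthly mean of over 300 °C; the whole point of the filter is that such a bug is trivial to introduce and easy to miss in untested code.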

The resolution of the standardized dataset compared to the raw dataset also needs to be examined. Specifically, GISTEMP data is reported to tenths of a degree Celsius, which is actually a loss of precision when converted from measurements recorded to a tenth of a degree Fahrenheit. Beyond that, the rounding itself can introduce error. Under the IEEE floating-point standard used on virtually all modern computers (except certain IBM mainframes, which also support a different format), a decimal value like 0.1 cannot be represented exactly in binary, so a stored number may sit just above or just below the value it is meant to represent; a program that blindly rounds such a number can push it the wrong way, and in the extreme case a cascade of nines changes even the integer part of the result (e.g., 8.9999... being rounded up to 9.0000, which is clearly not the measured value). A naive programmer might instead simply "truncate" (round toward zero no matter what the dropped digit is); the same 8.9999... would then be reported as 8 if carried to no decimal places. A final rounding issue is how to handle a trailing "5": some methods/standards say always round down, others always up, and still others round to the nearest even digit (under most US state codes this last rule is the official way of handling round-offs in land surveys).
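The competing rounding rules just listed can be compared directly with Python's `decimal` module. This is a generic illustration, not the actual processing code of any climate dataset.

```python
# The same value rounded under three different rules gives two
# different answers, and binary floating point adds its own surprises.
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN, ROUND_DOWN

x = Decimal("21.25")  # a trailing "5" -- the contested case
print(x.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))    # 21.3
print(x.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN))  # 21.2
print(x.quantize(Decimal("0.1"), rounding=ROUND_DOWN))       # 21.2 (truncation)

# Binary floating point cannot store 0.1 exactly, so "obvious"
# equalities fail and naive rounding can tip the wrong way:
print(0.1 + 0.2 == 0.3)  # False
```

If the station software and the central collection software disagree on which of these rules they apply, the archived tenths-of-a-degree values will systematically differ even when the raw readings were identical.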

We are actually measuring slightly different things depending on whether the station is human-run or automated. Human-run stations take instantaneous measurements, whereas most automated ones report running averages over the measurement period. Additionally, most human-run stations are sampled less frequently than the "tick points" of automated stations. A further issue with automated stations is that electronics by definition produce heat; the manufacturer of a given instrument has likely calibrated the device to factor out its own heat (in pre-solid-state electronics the components gave slightly different results depending on how hot they were, which is one of the reasons most professional guitarists prefer tube to solid-state amplifiers). It is unlikely, though, unless the station comes as a pre-built unit from a single manufacturer (only possible in the last 20 years or so, due to component sizes), that any of the instruments are calibrated to factor out the heat of the other instruments in the station. Such calibration requires a very high degree of skill, which is often not available locally to the station. Due to the sensitivity of the instrumentation and its at least partial exposure to the elements, frequent recalibration is often needed (the older the equipment, the more frequently it needs recalibrating). Lastly in the area of calibration, there exists no documented standard for how to correct for temperature differences that are not caused by the actual temperature of the atmosphere, for example sunny vs. overcast or windy vs. calm days. And in older semi-automated stations, has the human observer been trained well enough to take the heat of the automated components into account?

Does summarizing the per-station data distort it?
Beyond the issues already present in standardizing the data, we are most likely forced to make judgment calls when producing composite and summarized data from the same station and/or between stations. For the same reason humans cannot do the standardizing, a computer is required to do the summarizing (in theory climate models often use the non-summarized, but, as described below, normalized data). The process requires the computer to make these judgment calls, and computers are notorious for making the wrong call in many such situations (most life- or society-critical applications will flag such guesses and let a human sort through them by hand, but in the case of climate data this is still too much data for a human to process by hand).

In grade school we all learned that the average (aka mean) of a series of numbers is, summarized in formal math, $$\frac{\sum x_i}{n}$$, i.e., the sum of the list divided by the number of items in the list. In formal statistics this is only partially correct; different methods exist for finding means in the following cases:


 * Some value(s) are so far outside the range of the rest that they are discarded (outliers are disregarded)


 * Some values have more importance than others, so we apply a weight to each value to "normalize" it (see the next section for how climate data is normalized)


 * If there is too much data to make coherent meaning of, a graph/diagram is often drawn. In the simplest possible case we divide the data into a series of subranges and use a weighted average (the midpoint of each range times the number of items in the range) to get the overall mean.

Obviously each of these methods will give slightly different results, so without knowing how a particular dataset was summarized we cannot fully interpret it (and except by special arrangement, most datasets are summaries and not the actual raw data [per-station GISTEMP data, for example]).

Dealing with outliers deserves some special thought. We need to know the standard deviation (on average, how far from the mean a certain percentage of the data points fall), then decide how many standard deviations we consider reasonable (typically 2 or 3); all values outside that band are disregarded. The only problem is that we now have to repeat the process, since we are dealing with a slightly different mean. We continue repeating until the current iteration eliminates no further outliers.
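The iterative trimming loop just described can be sketched as follows, using toy readings and the population standard deviation; the cutoff k and the "repeat until nothing changes" rule are exactly the ones described above.

```python
# Sketch: discard points more than k standard deviations from the
# mean, recompute, and repeat until an iteration discards nothing.
import statistics

def trim_outliers(data, k=2):
    data = list(data)
    while len(data) > 1:
        mu = statistics.mean(data)
        sd = statistics.pstdev(data)
        kept = [x for x in data if abs(x - mu) <= k * sd]
        if len(kept) == len(data):  # fixed point: no outliers left
            return kept
        data = kept  # the mean has shifted; go around again
    return data

readings = [21.3, 21.5, 21.4, 21.6, 21.5, 21.4, 21.6, 21.3, 21.5, 99.9]
print(trim_outliers(readings))  # the obviously bad 99.9 is discarded
```

Note the subtlety the text warns about: the first pass's mean and standard deviation are both inflated by the outlier itself, which is why the loop must re-test after every removal.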

Dealing with Random Distortions
Due to the law of large numbers, and due to more and more of the software used to process the raw data being written by professional programmers, many of the above distortions either cancel out or contribute no measurable distortion when looking at the climate as a whole. Informally, the law of large numbers says that as the number of independent trials grows, the observed average gets arbitrarily close to the theoretical expected value: flip a fair coin ten times and you may well see seven heads, but flip it a million times and the fraction of heads will be very close to one half. (It does not say rare events never happen; it says their effect on the average becomes negligible.) For this reason one often sees in math and other theoretical work phrases like "for all sufficiently large/small values": once we run enough trials, the difference between the predicted outcomes and the actual ones is so small that we can treat the data as agreeing with theory, without needing further trials. The crucial caveat is that this only applies to errors that are random; with enough data the random errors largely cancel, but systematic ones do not.
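A quick simulation makes the point (illustrative only, with a fixed seed for reproducibility):

```python
# Sketch: the observed fraction of heads drifts toward the expected
# 0.5 as the number of coin flips grows.
import random

random.seed(42)  # reproducible run

for n in (10, 1_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)  # the gap from 0.5 shrinks as n grows
```

The same mechanism is what lets thousands of small, independent station errors largely wash out of a global average; a shared bias (say, every thermometer in a warming city) does not wash out no matter how many readings are taken.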

In terms of climate data this means that most if not all of the distortions above will most likely cancel each other out, get flattened by the law of large numbers, or in some other way not be serious enough to call into question any of the conclusions based on the data. A big disclaimer: if the same data is fed into a computer climate model then, due to the type of system we are modeling, it is quite possible (for a chaotic system, virtually guaranteed) that a tiny error could have huge effects. For more details see the computer modeling section.

Dealing with Non-Random Distortions
Some known sources of distortion are not random. As an example Crichton offers the "urban heat bubble effect" (UHBE), but this is only a special case of land use (which, as discussed in the next section, is itself a special case of other non-random distortions). Crichton claims (not yet verified by any editors of this article) that the only non-random distortion dealt with in a systematic way is the UHBE, and even then not sufficiently to really remove its distortions.

Let's consider the larger case of land use in general (and, in some very restricted cases, open ocean use). GISTEMP and other data sets often use relatively "coarse" correction methods, due to some of the problems already noted by this article and by Crichton. Many of the reasons for using these simplistic models come down to reducing the amount of data that must be sorted through. This mattered when climate modeling started in any serious manner in the mid-1970s, when the most powerful computers in the world were about as powerful as a high-end PC today. Thanks to advances in networking and other areas, the most powerful "computers" today (sometimes the components of the same "machine" are not even on the same continent) are as much a leap forward again as going from an abacus to a high-end PC. Also, since those earlier models were created, Geographic Information Systems (maps + databases) have matured into a proven technology and almost all of the relevant historical data has been digitized, so we could use techniques vastly improved over the ones used to handle the UHBE. The counterargument is the desire for a common baseline in the data everyone is using (i.e., a paper written in 1980 uses the same error-correction model as one in 2008, allowing apples to be compared with apples, which is incorrectly thought to be impossible if we change how we correct for errors).

We have almost 40 years of good, accurate weather station data, surface observations, and satellite imagery that, in combination with a GIS technique called rasterization, allow us to get a very good idea of what effect different land uses have on the climate. For example, if we want to know the effect of trees, it is perhaps a 10-to-15-minute job (on a mid-range PC, and even here 80% of the time is human time because it is a "one-shot" analysis; if automated, the effect could be calculated in near real time on the same machine) to get a historical global average of the effect of forests: we compare the amount of climate change and/or the instantaneous average across all the forests in the world, subtract out the effects of the local climates, and are left with a very good idea of the effect a forest (and by implication smaller groupings of trees) has on the climate.
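The raster comparison described above can be sketched in miniature. The grids below are invented toy numbers, not real observations; a real analysis would use millions of cells and would also have to control for latitude, elevation, and the local climate, as the text notes.

```python
# Sketch: each cell of a raster holds a temperature anomaly and a
# land-use flag.  Averaging forested cells against non-forested ones
# and differencing gives a (very rough) "forest effect".

anomaly = [  # temperature anomaly per grid cell, in degrees C
    [0.9, 1.1, 0.4],
    [1.0, 0.5, 0.6],
]
forest = [   # land-use raster: True where the cell is forested
    [False, False, True],
    [False, True,  True],
]

def mean_where(values, mask, want):
    """Mean of the cells whose mask flag equals `want`."""
    cells = [v for row_v, row_m in zip(values, mask)
               for v, m in zip(row_v, row_m) if m == want]
    return sum(cells) / len(cells)

effect = mean_where(anomaly, forest, True) - mean_where(anomaly, forest, False)
print(round(effect, 2))  # negative: forested cells run cooler in this toy grid
```

The point of the sketch is how cheap the computation is once the rasters exist; the expensive part, as the text says, is the one-time human work of assembling and aligning the land-use layers.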

The bottom line is that measuring and correcting for the UHBE alone, and even then not completely accurately, is pure laziness on the part of the climate science community. Even when doing this we do not account for urban growth and the like (nearly every county in the industrialized world has by now put its map archives and other spatial data sources into some GIS system or another, so a lack of data is not the excuse).

What does the raw data say about actual changes (if any) in the climate?
Like almost every other topic in this article, we cannot give an answer until we look at what the answer means. Specifically, a very fine line has to be drawn between what the data really says and how it is turned into something useful (for the public at large). Scientists, like almost anyone, find it harder to stare at column after column of numbers than at a chart (aka graph, for non-mathematicians) that summarizes them in a hopefully clear and understandable way. Sadly, unless one is very careful about one's biases in the preparation and interpretation of such graphs, it is very easy to delude yourself and/or purposely mislead others who look only at the graph and not at the underlying numbers it is based on. The how and what of this is far beyond the scope of this article; please refer to the classic book "How to Lie with Statistics". Almost every professional consumer of graphical data knows this, but the same graphs are presented to people who lack the background to know that a picture does not always equal a thousand words of truth, whether through unintentional pictorial distortions or purposeful attempts to deceive (the latter almost never used by scientists, for whom, if discovered and serious enough, it is a career-ending trick, but frequently used, without honesty about it, by advocacy groups and the media). The bottom line: what the data tells a professional is often very different from what it appears to tell the average person. This is partly because scientists are not trained specifically to explain complex concepts in everyday language; many fields actually informally discourage it as not being "formal" enough.

Given all that, it is clear that there is a general warming trend throughout the period for which we have good direct measurements of temperature (indirect methods are discussed later), i.e., from roughly 1810 (and in many cases without good enough coverage for a clear global picture until 1880 or so). Thus *ALL* of our data is post-Industrial Revolution, and if the theory of Global Warming states that the primary source of greenhouse gases (GHGs) is industrial and transportation-related activity, then by definition we *DO NOT* know what the climate looked like before these became factors (though arguably they were too small to matter until the mid-20th century). As Crichton points out, there are local variations, but when we apply the law of large numbers (assuming we can handle the non-random distortions listed above) the trend is unmistakable. While it is possible that when we finally get around to making all the corrections for non-random sources the trend will not be as pronounced, it is still there. Crichton even admits this backhandedly when he makes an arbitrarily precise prediction of a global temperature rise over the next 100 years of 0.84234234 degrees Fahrenheit.

Do we have enough historically accurate raw data to say whether there is long-term climate change or not?
As mentioned indirectly above, we have nothing but rather crude evidence for what the climate was like before the Industrial Revolution. For most of the Christian era we have various manuscripts written by everyone from some of the most notable people in the history of science to what can best be described as semi-literate people who had absolutely no clue how to record any kind of scientific data. The wording may also be truly bizarre; as a fictional example, the chronicle of an 11th-century monk in England might say, "Today was so cold it was colder than the great freeze in which all the ponds froze over, which was two full season cycles before the old king abdicated." The coverage of such manuscripts is also spotty at best, in both the amount of the planet covered and the frequency of recording (in extreme cases years could pass between recordings at the same location [and given the bizarre record-keeping above, who knows how many years that was]). In other words, it is very hard if not impossible to piece together anything before the Industrial Revolution, and even then we might cover Western Europe only.

Given the above and the need to look much farther back into the past, we need to use indirect measurements. The problem with such measurements is that we are making certain assumptions about what we are looking for before we find it. Namely, the only way we can estimate past temperatures from ice cores is to look at the trapped atmosphere in both the ice and its air pockets. In theory this would require knowing the state of the climate completely now, which, for reasons discussed elsewhere in this article, is impossible to the level of accuracy needed to make sense of the ice core data. Additionally, since we are measuring the $$CO_2$$ content of the past atmosphere, we have created a circularity, in that it is not completely proven that $$CO_2$$ (see below) is the primary GHG. We also assume that the climate was governed by essentially the same underlying physics it is today. There are a few other indirect measurement techniques, which are often even more speculative than ice cores.

Do we understand the primary forces that drive the climate?
We have some very good clues as to what the primary forces in the climate may be. But all of these are based on indirect observation and extrapolation, small-scale controlled experiments (most of which studied each force independently of all others, so we know very little about their interactions), and/or theory (which in most cases is in the speculative realm rather than likely-but-unproven). This section examines just one of these forces (the most critical one in most people's minds).

We have known for a while (in some cases over a century) that some or all of the GHGs are capable of causing temperature increases by trapping heat rather than letting it escape. For example, it was discovered in the 1890s that if $$CO_2$$ is held in a clear enclosed container and exposed to a light source, the container becomes hotter than the surrounding environment (though something else could possibly explain this if we are willing to consider unorthodox readings: the enclosure had no outlet, warm air rises and had nowhere to go, and all light interacts with air to produce heat at some level). We also know, though not conclusively, that the technique used to analyze trapped gases (mass spectrography) in some rare cases [which we are not 100% sure of but think very unlikely to be an issue with any terrestrial body] requires minor adjustments to the measurements for effects of quantum mechanics and/or relativity, and on general principle any time raw data needs to be "corrected" one should get nervous. We also "know" (see the above section for why the scare quotes) that Venus, Earth, and Mars have roughly the same proportion of $$CO_2$$ when the atmosphere, seas (Earth only), and rocks are counted together. Given this, Venus has a runaway greenhouse effect (something never demonstrated in any experiment at the scale of a planetary atmosphere) while Mars is in permafrost; if Venus's temperature was caused by its high level of atmospheric $$CO_2$$ while Mars's $$CO_2$$ stayed trapped in the rocks and dirt (or even deeper), then it is especially significant that we have found water ice on Mars.

Do we have sufficient raw data to support or reject the hypothesis that human activity is driving measurable climate change?
While there has been a marked rise in GHGs since the Industrial Revolution, for the reasons discussed above we do not know the accuracy of our temperature record well enough to make a causal link, let alone obtain definitive proof of one. And while we know that $$CO_2$$ levels have been rising, we are not 100% sure this is caused by humans, so we cannot directly link it to human activity.

Computer Modeling
Humans are incredibly slow and error-prone when making very many calculations as fast as possible. In the time it takes us to evaluate a single complex derivative, a scientific calculator could have done it 10,000 times, a PC about 1,000 times faster than that, and a supercomputer yet another 1,000 times faster, for a total speedup of roughly 10 billion (these numbers are only rough, since the techniques used differ slightly). Most climate models have on the order of a billion individual data points, on each of which up to 10,000 such operations must be done. Thus a supercomputer can run such a climate model forward "a year" in minutes, where it would take a human centuries. If we are going to attempt climate modeling at all we *MUST* use computers to do it. Now that we are stuck with the machine doing all the work, we need to ask what the machine can and cannot do. Obviously if we put the wrong data in (which seems almost certain) then the results will be wrong. And due to limitations of computer science and math (discussed in "Theory of Computation") and the nature of complex systems, we cannot in general prove or disprove whether the procedure we use to process the data is right or wrong.

Can we quantify our knowledge of the climate sufficiently to create realistic mathematical models?
Note: This section assumes there are no limitations on the computational capabilities of computers and/or other "known" limits on the ability to model complex systems.

We know a lot about the lower atmosphere but relatively little about the upper atmosphere (we have been collecting continuous data on the first for almost 200 years now, but on the second only since the 1970s or so). For example, we do not know whether the Ozone Hole always existed at some size, or whether its current growth is part of a natural cycle, because we did not even know there was ozone at that height until the early 1960s, and there was no permanent monitoring of it until the mid-1980s (which, coincidentally, is the first time the "problem" was reported; it is very hard to believe we can say something is growing or shrinking after only a few months of observation). There is also starting to be some evidence that volcanoes and other geologic activity affect ozone levels in ways we did not know about until five years ago. We still have not found a suitable way to make the corrections we know are needed in the raw data. The bottom line is that it is highly unlikely the data is accurate (see the section on input sensitivity for why even the smallest error can have huge effects).

Theory of Computation Limitations
Around the late 1800s a group of mathematicians became rather worried about a major problem in math: the harder they looked, the less able they were to find a set of theorems and definitions that could do even really simple things like counting (1, 2, 3, ...) without either contradicting itself or defining something in terms of itself. For example, to formalize counting you need every number to have a successor, and arguments about all the numbers rest on induction: a) prove a base case (like the property holding for 1), then b) prove that if it holds for an arbitrarily picked case (call it n) it must also hold for n+1. But how do we justify that there is always an n+1 for any n? Any attempt to prove it seems to smuggle in the very notion of the infinitely many numbers it is supposed to ground. The bottom line is that what had until then been considered a science of pure logic appeared to rest on foundations that could not be justified from within the mathematical system itself. For many years this remained an open question, until two different results settled it negatively: one showed that a consistent formal system powerful enough to express arithmetic can never be complete (Gödel's incompleteness theorem), and another showed that it is impossible for one machine to verify, in general, whether another machine possesses a given property. For the latter, Turing created the Universal Turing Machine (to which every modern computer is equivalent, though none uses the actual design because it is very inefficient), consisting of a finite number of states (is the light red or green, is there less than $1000 in your checking account, etc.)
and an infinitely long tape divided into cells, in each of which the machine can read or write a single symbol (based on what its state says to do) and then move one cell right or left. A Turing machine may have a special state called "halt"; as soon as it is entered, the machine stops and does nothing more. In a very long and hard-to-follow paper, Turing showed that there is no procedure carried out in mathematics that could not be programmed on such a machine. He then asked what should have been one of the easiest questions: can a machine A detect whether a machine B will eventually enter the halt state (i.e., ever complete its operation)? To everyone's complete amazement, Turing proved that no such A can exist: feed A a suitably modified copy of itself and you obtain a machine that halts if and only if it does not halt (see halting problem).
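Turing's diagonal argument can be sketched in modern notation. The `halts` function below is a hypothetical stand-in; the whole point of the argument is that no real implementation of it can exist.

```python
# Sketch of the halting-problem diagonalization.  Suppose someone
# claims a perfect oracle halts(f, x) returning True iff f(x)
# eventually halts.  (The stub body below is a placeholder; no
# correct body can be written.)

def halts(f, x):
    ...  # hypothetical perfect halting oracle

def diagonal(f):
    """Do the opposite of whatever the oracle predicts about f(f)."""
    if halts(f, f):
        while True:      # oracle said "halts" -> loop forever
            pass
    return "done"        # oracle said "loops" -> halt immediately

# Now ask: does diagonal(diagonal) halt?
#  - If halts(diagonal, diagonal) returns True, diagonal loops forever.
#  - If it returns False, diagonal halts.
# Either answer contradicts the oracle, so no correct halts() exists.
```

This is the same self-reference that powers the Barber Paradox below: the machine is asked a question about itself whose every answer is wrong.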

Basic Theory of Computational Terminology
Decidability: Can a given mathematical property be proven or disproven algorithmically? If not, can we prove that no algorithm can decide it? If the answer to the second question is yes, the problem is "undecidable".

Deterministic vs. non-deterministic processes: A deterministic process never has to try multiple possible branches before selecting a solution; at each step only one move is possible. A non-deterministic process may need to explore two or more possible branches before selecting a "correct" one (if a "correct" one exists).

Universal Turing Machine: The simplest possible machine that can perform every task a general-purpose computer can (efficiency is not a concern here)

Space/Time Complexity: How much memory and/or time a decidable process takes to complete. Typically a problem is classed as "deterministic polynomial" [P] (solvable in time bounded by a polynomial in the input size) or "non-deterministic polynomial" [NP] (a proposed solution can be checked in polynomial time; equivalently, a machine allowed to explore all branches simultaneously could solve it in polynomial time, but as far as anyone knows a real machine with finitely many processors and finite memory may require exponential time).

Barber Paradox: The logical paradox to which undecidability arguments "reduce" in spirit. In simplest terms: in a certain town everyone must shave every day, and everyone who does not shave himself is shaved by the town's only barber. Who shaves the barber? If the barber shaves himself, he violates the rule that he shaves only those who do not shave themselves; if he does not shave himself, he violates the rule that everyone must shave every day. No matter how you answer, you create an unsolvable paradox.

Rice's Theorem: A theorem stating that all non-trivial semantic properties of programs (properties of what a program computes, rather than of its text) are undecidable.

NP-Completeness: The property of a decidable problem that is in NP and to which every other NP problem can be reduced in polynomial time; if any NP-complete problem could be solved in polynomial time, all of NP could be.

Does the Theory of Computation limit how accurate a climate model is?
Note: This section assumes that the model itself is completely correct; we only want to know whether running it on a computational machine distorts its output because of the nature of the Theory of Computation.

If we assume we are using accurate data with an accurate model, the question becomes: is there some aspect of the model that is undecidable and central enough to its predictions that, because of this undecidability, we cannot verify them? One candidate jumps out immediately, which for lack of a better term can be called the "no present paradox". In the real world, every event that happens at some instant X really does happen at that instant: if two cars collide, then at the instant of impact both cars are in contact with each other. This is not necessarily true inside a computer model.

Conway's Game of Life gives a simple example (we will assume from here on that the reader has looked it up). Suppose cell c1 dies and cell c2 is born if and only if c1 is alive at the moment we compute the status of c2. If everything were updated at exactly the same time, this would by definition not happen. But in every well-written implementation of Life it does happen, because the immediate future is always computed from a stored snapshot of the immediate past (i.e., we still treat c1 as alive even though it no longer is, since there is no way to represent a true "present"). Note that a non-deterministic process can always, in principle, be converted to an equivalent deterministic one, but the conversion may require exponentially more time; whether it can always be done efficiently is precisely the open P vs. NP question.
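The snapshot-of-the-past point can be made concrete with a minimal Life implementation (a sketch; names invented): the next generation is computed entirely from a frozen copy of the current one, because updating cells in place would let already-updated neighbours leak into the same time step.

```python
# Minimal Game of Life step using "double buffering": generation t+1 is
# derived only from the frozen generation t, never from partially
# updated cells -- the model has no true "present".
def neighbours(grid, r, c):
    return sum(grid[r + dr][c + dc]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)
               and 0 <= r + dr < len(grid) and 0 <= c + dc < len(grid[0]))

def step(grid):
    """Build a brand-new grid; `grid` itself is never modified."""
    return [[1 if (n := neighbours(grid, r, c)) == 3
             or (grid[r][c] and n == 2) else 0
             for c in range(len(grid[0]))]
            for r in range(len(grid))]

# A horizontal "blinker" oscillates to a vertical one and back.
blinker = [[0, 0, 0],
           [1, 1, 1],
           [0, 0, 0]]
print(step(blinker))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```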

In a more mundane example, any implementation of a climate model contains many loops (according to the IPCC, the climate is the single most mathematically complex system we have ever attempted to model, so such loops are unavoidable by definition) whose starting and ending conditions cannot easily be determined or tested for; in other words, questions about their behaviour reduce to Rice's Theorem or the Halting Problem.
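The classic diagonal argument behind the Halting Problem can be sketched directly (all names here are invented for illustration): given any claimed loop-detector, we can build a program that it must misjudge.

```python
# Sketch: suppose halts(f) claimed to predict whether f() terminates.
# The "paradox" program consults the oracle about itself and then does
# the opposite, so no oracle can ever be right about it.
def make_paradox(halts):
    def paradox():
        if halts(paradox):   # oracle says "paradox halts" ...
            while True:      # ... so loop forever, proving it wrong
                pass
        # oracle says "paradox loops" ... so return at once, also wrong
    return paradox

# Any concrete guesser fails on its own paradox program, e.g. an oracle
# that claims every program halts:
always_yes = lambda f: True
p = make_paradox(always_yes)
print(always_yes(p))  # True -- yet p, trusting that answer, never halts
```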

Does Gödel's Incompleteness Theorem limit the accuracy of our models?
It has been shown that any system of logic that contains undecidable propositions is by definition incomplete. Since the previous section argued that climate models contain undecidable questions, it follows that any possible climate model is, in this sense, incomplete.

Does input sensitivity and the need to approximate infinitely long/irrational numbers have a measurable impact on our climate models?
Note: Do we even need to use such numbers in our climate models?
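A minimal sketch of the input-sensitivity question (the logistic map stands in here for a real climate model; it is not taken from any climate code): a chaotic system amplifies a perturbation at the tenth decimal place, such as the rounding of an irrational constant, until two runs are completely unrelated.

```python
# Sketch: the logistic map x' = r*x*(1 - x) at r = 4 is chaotic, so two
# trajectories whose starting points differ by one part in 10^10 diverge
# after a few dozen iterations. Any irrational constant (pi, e) stored
# in finite precision introduces exactly this kind of perturbation.
def trajectory(x, steps, r=4.0):
    out = []
    for _ in range(steps):
        x = r * x * (1.0 - x)
        out.append(x)
    return out

a = trajectory(0.2, 60)
b = trajectory(0.2 + 1e-10, 60)
print(abs(a[0] - b[0]))                                   # still tiny
print(max(abs(x - y) for x, y in zip(a[-10:], b[-10:])))  # large
```

Whether real climate models are dominated by this effect, or average it out over ensembles, is exactly the question this section poses.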