The code of Nature: making authors part with their programs

However, as it is now often practiced, one can make a good case
that computing is the last refuge of the scientific scoundrel.

—Randall LeVeque

Nature - show me your code if you want to

Some backstory first

A very interesting editorial appeared recently in Nature magazine. What is striking is that the editorial picks up the same strands of argument considered on this blog: data availability in climate science and genomics. Arising from this post at Bishop Hill, cryptic climate blogger Eli Rabett and encyclopedia activist WM Connolley claimed that the Nature magazine of yore (c. 1990) required only crystallography and nucleic acid sequence data to be submitted as a condition for publication (which implied that all other kinds of data were exempt).

We showed this to be wrong (here and here). Nature, in those days, placed no conditions on publication, but instead expected scientists to adhere to a gentleman’s code of scientific conduct. Post-1996 it decided, like most other scientific journals, to make full data availability a formal requirement for publication.

The present data policy at Nature reads:

… a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols promptly available to readers without undue qualifications in material transfer agreements.

Did the above mean that everything was to be painfully worked out just to be gifted away, to be audited and dissected? Eli Rabett pursued his own inquiries at Nature. Writing to editor Philip Campbell, the blogger wondered: when Nature says ‘make protocols promptly available’, does it mean ‘hand over everything’, as in the case of software code?

I am also interested in whether Nature considers algorithmic descriptions of protocols sufficient, or, as in the case of software, a complete delivery.

Interestingly, Campbell’s answer addressed something else:

As for software, the principle we adopt is that the code need only be supplied when a new program is the kernel of the paper’s advance, and otherwise we require the algorithm to be made available.

This caused Eli Rabett to become distracted and forget his original question altogether. “A-ha! See, you don’t have to give code” (something he’d assumed was to be given).

At least something doesn’t have to be given.

A question of code

The Nature editorial carried the same idea about authors of scientific papers making their code available:

Nature does not require authors to make code available, but we do expect a description detailed enough to allow others to write their own code to do a similar analysis.

The above example serves to illustrate how partisan advocacy positions can cause long-term damage in science. In some quarters, work proceeds tirelessly to obscure and befuddle simple issues. The editorial raises a number of unsettling questions that such efforts seek to bury. Journals try to frame policy to accommodate requirements and developments in science, but apologists and obscurantists hide behind journal policy to avoid providing data.

So, is the Nature position sustainable as its journal policy?

Alchemy - making things without knowing

A popularly held notion mistakes publication for science; in other words, it is science in alchemy mode. ‘I am a scientist and I synthesized A from B. I don’t need to describe how, in detail. If you can see that A could have been synthesized from B without needing explanations, that would prove you are a scientist. If you are not a scientist, why would you need to see my methods anyway?’

It is easy to see why such parochialism and closed-mindedness was jettisoned. Good science does not waste time describing every known step, nor does it indulge in pedantry. Poor science tries to hide its flaws in stunted description, masquerading as the terseness of scholarly parlance. Curiously, it is often the more spectacular results that are accompanied by this technique. As a result, rationalizations for not providing data or methods take on the same form: ‘my descriptions may be sketchy, but you cannot replicate my experiment because you are just not good enough to understand the science, or follow the same trail’.

If we revisit the case of Duke University genomics researcher Anil Potti, this effect was clearly visible (a brief introduction is here). Biostatisticians Baggerly and Coombes could not replicate Potti et al’s findings from microarray experiments reported in their Nature Medicine paper. Potti et al’s response, predictably, contained the defense: ‘You did not do what we did’.

Unfortunately, they have not followed our methods in several crucial contexts and have made unjustified conclusions in others, and as a result their interpretation of our process is flawed.

Because Coombes et al did not follow these methods precisely and excluded cell lines and experiments with truncated -log concentrations, they have made assumptions inconsistent with our procedures.

Behind the scenes, web pages changed, data files changed versions and errors were acknowledged. Eventually, the Nature Medicine paper was retracted.

The same scenario repeated itself with greater vehemence over another paper. Dressman et al published microarray research on cancer in the Journal of Clinical Oncology; Anil Potti and Joseph Nevins were co-authors. The paper claimed to have developed a method of predicting which cancer patients would not respond to certain drugs. Baggerly et al reported that Dressman et al’s results arose from ‘run batch effects’, i.e., variation caused solely by parts of the experiment being done on different occasions.
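The ‘run batch effect’ Baggerly et al describe is easy to simulate. The sketch below is purely illustrative (the shift size, sample sizes and ‘true’ value are all invented), but it shows how a measurement offset between processing runs can masquerade as a biological difference when group membership is confounded with batch:

```python
import random

random.seed(0)

def measure(true_value, batch):
    # Hypothetical instrument drift: batch 2 reads 1.5 units high
    shift = {1: 0.0, 2: 1.5}[batch]
    return true_value + shift + random.gauss(0, 0.1)

# Suppose responders were assayed in batch 1 and non-responders in
# batch 2, but the underlying biology is identical (true value 5.0).
responders = [measure(5.0, 1) for _ in range(50)]
non_responders = [measure(5.0, 2) for _ in range(50)]

diff = sum(non_responders) / 50 - sum(responders) / 50
# diff comes out near 1.5: an apparent "treatment effect" that is
# entirely an artifact of when the samples were processed
```

Because batch is perfectly confounded with group here, no amount of downstream statistics can separate the two; the only remedies are randomizing samples across runs or recording and modelling the batch explicitly.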

This time the response was severe. Dressman, with Potti and Nevins wrote in their reply in the Journal of Clinical Oncology:

To “reproduce” means to repeat, following the methods outlined in an original report. In their correspondence, Baggerly et al conclude that they are unable to reproduce the results reported in our study […]. This is an erroneous claim since in fact they did not repeat our methods.

Beyond the specific issues addressed above, we believe it is incumbent on those who question the accuracy and reproducibility of scientific studies, and thus the value of these studies, to base their accusations with the same level of rigor that they claim to address.

To reproduce means to repeat, using the same methods of analysis as reported. It does not mean to attempt to achieve the same goal of the study but with different methods. …

Despite the source code for our method of analysis being made publicly available, Baggerly et al did not repeat our methods and thus cannot comment on the reproducibility of our work.

Is this a correct understanding of scientific experiment? If a method claims to have uncovered a fundamental facet of reality, should it not be robust enough to be revealed by other methods as well, ones which follow the same principle but differ slightly? Obviously, Potti and colleagues are wandering off into the deep end here. Their points are unprecedented and go well beyond the specifics of their particular case: not only do the authors say ‘you did not do what we did and therefore you are wrong’, they go on to say ‘you have to do exactly what we did, to be right’. In addition, they attempt to shift the burden of proof from a paper’s authors to those who critique it.

Victoria Stodden

The Dressman et al authors were roundly criticized for this approach by statisticians Vincent Carey and Victoria Stodden. They note that a significant portion of Dressman et al’s results were nonreconstructible, i.e., could not be replicated even with the original data and methods, because of flaws in the data. This was exposed only when attempts were made to repeat their experiments, which undercuts the authors’ remarks about the rigor of their critics’ accusations. Carey and Stodden take issue with the claim that only the precise original methods can produce true results:

The rhetoric – that an investigation of reproducibility just employ “the precise methods used in the study being criticized” – is strong and introduces important obligations for primary authors. Specifically, if checks on reproducibility are to be scientifically feasible, authors must make it possible for independent scientists to somehow execute “the precise methods used” to generate the primary conclusions.

Arising from their own analysis, they agree firmly with Baggerly et al’s observations of ‘batch effects’ confounding the results. They conclude, making crucial distinctions between experiment reconstruction and reproduction:

The distinction between nonreconstructible and nonreproducible findings is worth making. Reconstructibility of an analysis is a condition that can be checked computationally, concerning data resources and availability of algorithms, tuning parameter settings, random number generator states, and suitable computing environments. Reproducibility of an analysis is a more complex and scientifically more compelling condition that is only met when scientific assertions derived from the analysis are found to be at least approximately correct when checked under independently established conditions.
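Carey and Stodden’s checklist for reconstructibility (data resources, algorithms, tuning parameters, random-number-generator states, computing environments) can be made concrete with a toy computation. In this sketch the ‘analysis’ itself is invented; the point is only that recording the RNG state is the difference between a checkable, reconstructible number and one that honest re-runs cannot recover:

```python
import random

def analysis(seed):
    # A toy "analysis" whose output depends on simulated draws
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in range(100)]
    return sum(sample) / len(sample)

# With the RNG state recorded, anyone can reconstruct the exact value:
a = analysis(seed=42)
b = analysis(seed=42)
assert a == b

# Without it, two honest re-runs disagree, and a critic cannot tell a
# genuine coding error from ordinary sampling noise:
c = analysis(seed=1)
assert a != c
```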

Seen in this light, it is clear that an issue of ‘we cannot do what you say you did’ will morph rapidly into ‘do your own methods do what you say they do?’ Intractable disputes arise even when both author and critic are experts, and with much of the data openly available. Full availability of data, algorithms and computer code is perhaps the only way to address both questions.

Therefore Nature magazine’s approach to not ask for software code as a matter of routine, but to obtain everything else, becomes difficult to reconcile.

Software dependence

Results of experiments can hinge on software alone, just as they can on the other components of scientific research. The editorial recounts an interesting instance of bioinformatics findings that depended on the version number of the commercial software employed by the authors.

The most bizarre example of software-dependence of results, however, comes from Hothorn and Leisch’s recent paper ‘Case studies in reproducibility’ in the journal Briefings in Bioinformatics. The authors recount the example of Pollet and Nettle (2009) reaching the mind-boggling conclusion that wealthy men give women more orgasms. Their results remained fully reproducible, in the usual sense:

Pollet and Nettle very carefully describe the data and the methods applied and their analysis meets the state-of-the-art for statistical analyzes of such a survey. Since the data are publicly[sic] available, it should be easy to fit the model and derive the same conclusions on your own computer. It is, in fact, possible to do so using the same software that was used by the authors. So, in this sense, this article is fully reproducible.

What then was the problem? It turned out that the results were software-specific.

However, one fails performing the same analysis in R Core Development Team. It turns out that Pollet and Nettle were tricked by a rather unfortunate and subtle default option when computing AICs for their proportional odds model in SPSS.
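The general hazard here, two packages reporting differently defined quantities under the same name, is easy to illustrate. The sketch below does not reproduce SPSS’s actual behaviour; it simply shows two plausible ‘AIC’ conventions (the textbook one, and a hypothetical variant that drops an additive constant from the log-likelihood) yielding very different numbers for the very same fitted model:

```python
import math

def aic(loglik, k):
    # Textbook definition: -2*logLik + 2*(number of parameters)
    return -2.0 * loglik + 2.0 * k

def aic_dropping_constant(loglik, k, n):
    # Hypothetical variant that omits an additive normalizing constant
    # of the likelihood, as some packages do internally
    constant = -0.5 * n * math.log(2.0 * math.pi)
    return -2.0 * (loglik - constant) + 2.0 * k

# The same fitted model (log-likelihood -120.0, 4 parameters, 80 cases):
full = aic(-120.0, 4)                          # 248.0
partial = aic_dropping_constant(-120.0, 4, 80)
# The two "AICs" differ substantially. Comparing a value from one
# package against a value from another, without knowing which
# convention each uses, silently corrupts the model comparison.
```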

Certainly this type of problem is not confined to one branch of science. Many a time, the description of a method conveys one thing but the underlying code does something else (of which even the authors are unaware); the results, in turn, seem to substantiate emerging, untested hypotheses, and the blind spot goes unchecked. Turning to climate science and the touchstone of code-related issues in scientific reproducibility, the McIntyre and McKitrick papers, Hothorn and Leisch draw the obvious conclusion:

While a scientific debate on the relationship of men’s wealth and women’s orgasm frequency might be interesting only for a smaller group of specialists there is no doubt that the scientific evidence of global warming has enormous political, social and economic implications. In both cases, there would have been no hope for other, independent, researchers of detecting (potential) problems in the statistical analyzes and, therefore, conclusions, without access to the data.

Acknowledging the many subtle choices that have to be made and that never appear in a ‘Methods’ section in papers, McIntyre and McKitrick go as far as printing the main steps of their analysis in the paper (as R code).

Certainly, when science becomes data- and computing-intensive, the question of how to reproduce an experiment’s results is inextricably linked with its repeatability, or reconstructibility. Papers may fall into any combination of repeatability and reproducibility, with varying degrees of both, and yet be wrong. As Hothorn and Leisch write:

So, in principle, the same issues as discussed above arise here: (i) Data need to be publically[sic] available for reinspection and (ii) the complete source code of the analysis is the only valid reference when it comes to replication of a specific analysis

Why the reluctance?

(C) Jan Hein van Dierendonck

What reasons can there be for scientists being unwilling to share their software code? As always, the answers turn out to be far less exotic. In 2009, Nature magazine devoted an entire issue to the question of data sharing. Post-Climategate, it briefly addressed issues of code. Computer engineer Nick Barnes opined in a Nature column on the software angle and why scientists are generally reluctant. He sympathized with scientists: they feel that their code is very “raw” and “awkward”, and therefore hold “misplaced concerns about quality”. Other, more routine excuses for not releasing code, we are informed, are that it is ‘not common practice’, will ‘result in requests for technical support’, is ‘intellectual property’, and that ‘it is too much work’.

In another piece, journalist Zeeya Merali took a less patronizing look at the problem. Professional computer programmers were less sanguine about what was revealed in the Climategate code.

As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software, say computer scientists. At best, poorly written programs cause researchers such as Harry to waste valuable time and energy. But the coding problems can sometimes cause substantial harm, and have forced some scientists to retract papers.

While Climategate and HARRY_READ_ME focused attention on the problem, it was by no means unknown before. Merali reported results from an online survey conducted in 2008 by computer scientist Greg Wilson. Wilson noted that most scientists taught themselves to code and had no idea ‘how bad’ their own work was.

As a result, codes may be riddled with tiny errors that do not cause the program to break down, but may drastically change the scientific results that it spits out. One such error tripped up a structural-biology group led by Geoffrey Chang of the Scripps Research Institute in La Jolla, California. In 2006, the team realized that a computer program supplied by another lab had flipped a minus sign, which in turn reversed two columns of input data, causing protein crystal structures that the group had derived to be inverted.
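The Chang incident turns on a surprisingly small amount of arithmetic: negating one coordinate column of a three-dimensional structure produces its mirror image, which for a chiral molecule is a different structure altogether. A minimal sketch (with invented coordinates, not Chang’s actual data) of how a single stray sign flips the handedness of an arrangement:

```python
def handedness(p, q, r):
    # Sign of the scalar triple product p . (q x r); +1 and -1
    # correspond to mirror-image (inverted) arrangements.
    cross = (q[1] * r[2] - q[2] * r[1],
             q[2] * r[0] - q[0] * r[2],
             q[0] * r[1] - q[1] * r[0])
    s = p[0] * cross[0] + p[1] * cross[1] + p[2] * cross[2]
    return 1 if s > 0 else -1

# Three reference atoms of a hypothetical structure:
atoms = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# A stray minus sign applied to one column of the input data:
flipped = [(x, y, -z) for (x, y, z) in atoms]

# handedness(*atoms) == 1 while handedness(*flipped) == -1: every
# structure derived downstream of the error is the mirror image of
# the truth, yet nothing in the pipeline crashes or warns.
```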

Geoffrey Chang’s story was widely reported in 2006. By the time the code error was detected, his Science paper on a protein structure had accumulated 300+ citations, impacted grant applications, caused contrary papers to be rejected, and spurred drug development work. Chang, Science magazine reported scientist Douglas Rees as saying, was a hard-working scientist with good data, but the “faulty software threw everything off”. Chang’s group retracted five papers in prominent science journals.

Interestingly enough, Victoria Stodden reports in her blog, that she and Mark Gerstein wrote a letter to Nature, responding to the Nick Barnes and Zeeya Merali articles voicing some disagreements and suggestions. They felt that journals could help tighten the slack:

However, we disagree with an implicit assertion, that the computer codes are a component separate from the actual publication of scientific findings, often neglected in preference to the manuscript text in the race to publish. More and more, the key research results in papers are not fully contained within the small amount of manuscript text allotted to them. That is, the crucial aspects of many Nature papers are often sophisticated computer codes, and these cannot be separated from the prose narrative communicating the results of computational science. If the computer code associated with a manuscript were laid out according to accepted software standards, made openly available, and looked over as thoroughly by the journal as the text in the figure legends, many of the issues alluded to in the two pieces would simply disappear overnight.

We propose that high-quality journals such as Nature not only have editors and reviewers that focus on the prose of a manuscript but also “computational editors” that look over computer codes and verify results.

Nature decided not to publish it. It is now easy to see why.

Code battleground

Small sparks over scientific code can set off major rows. In a recent example, the Antarctic researcher Eric Steig wrote, in a comment to Nick Barnes, that he had faced problems with the code of Ryan O’Donnell and colleagues’ Journal of Climate paper. Irked, O’Donnell wrote back that he was surprised Steig had not taken the time to run their R code as a reviewer of their paper, a fact which had remained unknown up to that point. The ensuing conflagration is now well-known.

In the end, software code is undoubtedly an area where errors, inadvertent or systemic, can lurk and significantly impact results, as even the few examples above show, again and again. In his 2006 paper on reproducible research in the Proceedings of the International Congress of Mathematicians, Randall LeVeque wrote:

Within the world of science, computation is now rightly seen as a third vertex of a triangle complementing experiment and theory. However, as it is now often practiced, one can make a good case that computing is the last refuge of the scientific scoundrel. Of course not all computational scientists are scoundrels, any more than all patriots are, but those inclined to be sloppy in their work currently find themselves too much at home in the computational sciences.

However, LeVeque was perhaps a bit naïve in expecting only disciplines with significant computing to attempt getting away with poor description:

Where else in science can one get away with publishing observations that are claimed to prove a theory or illustrate the success of a technique without having to give a careful description of the methods used, in sufficient detail that others can attempt to repeat the experiment? In other branches of science it is not only expected that publications contain such details, it is also standard practice for other labs to attempt to repeat important experiments soon after they are published.

In an ideal world, authors would make their methods, including software code, available along with their data. But that doesn’t happen in the real world. ‘Sharing data and code’ for the benefit of ‘scientific progress’ may be driving data repository efforts (such as DataONE), but hypothesis-driven research generates data and code specific to the question being asked; only the primary researchers possess such data to begin with. As the “Rome” meeting of researchers, journal editors and attorneys wrote in the Nature article laying out their recommendations (Post-publication sharing of data and tools):

A strong message from Rome was that funding organizations, journals and researchers need to develop coordinated policies and actions on sharing issues.

When it comes to compliance, journals and funding agencies have the most important role in enforcement and should clearly state their distribution and data-deposition policies, the consequences of non-compliance, and consistently enforce their policy.

Modern-day science is mired in rules and investigative committees. What’s to be done naturally in science – showing others what you did – becomes a chore under a regime. However, rightly or otherwise, scientific journals have been drafted into making authors comply. Consequently it is inevitable that journal policies become battlegrounds where such issues are fought out.

N.B. This story appears as a guest post at WUWT.



  1. Bruce

    Excellent information. We need a volunteer effort like Anthony Watts’ Surface Stations to audit journals, determine the availability of data, and publish results. I would be willing to subscribe to one journal and request information on papers, and I’m sure many others would. What we need is someone to lead this effort. Any volunteers?

  2. Pingback: Socrates Paradox » Audit of Journals for Data, Methods & Code
  3. j melcher

    If the code runs with logical errors that invalidate the interpreted results, how does simply re-running it help? Review, line by line, against the intended algorithm, is needed – and very tedious work.

  4. Steve Reynolds

    “Within the world of science, computation is now rightly seen as a third vertex of a triangle complementing experiment and theory.”

    How is ‘computation’ anything other than either showing what predictions a theory makes, or analyzing experimental data to put it into a more comprehensible form?

  5. Shub Niggurath

    Computing is thought to be an actual *threat* to science reproducibility.

    A good article that puts forth this point is here

    Keeping computers from ending science’s reproducibility

    My initial experience (image analysis work) has corresponded to what you say. The code, or the sequential steps in a macro, is nothing but the very method itself.

    Recently I started using Java-based tools, and I was stumped when running the analysis on my laptop gave me different results from my desktop. I am not software-savvy enough to explain why. (I was thinking: it is Java, therefore platform-independent.)

  6. Pingback: Shub Niggurath on Archiving Code « Climate Audit
  7. Shallow Climate

    SN: I appreciate this post–thank you for your efforts here. Among other points you make I appreciate your differentiation between “reconstruction” and “reproduction”. Remember the “polywater” flap of many years ago? The experiments (reportedly the synthesis of a polymerized form of water) could not be reproduced, and it turned out that they could not, after a lot of to do, because HUMAN PERSPIRATION on the walls of glass capillaries was creating the “polywater” effect; “polywater” could not be created in air-conditioned chemistry labs. In that case human perspiration was like the unknown, unreported problem in the code. If not ALL of the code is made available, the code becomes a Black Box, and then we might as well be back in the era of alchemy and mumbo-jumbo.

  8. Dalcio

    Important discussion. I fully agree that data and source code used in scientific articles should be made unconditionally available. This should not be difficult, as most journals offer Web space for hosting supplementary information about published articles. However, I disagree with the suggestion that journal referees should be required to examine and test source code, even if a team of “computational editors” is assigned the job. Testing and verifying computational code can take anywhere from weeks to months of full-time work, depending on the complexity of the code. This would make the time it takes to publish an article much longer than it already is. Let those who have critically read a particular article and are suspicious of the results and conclusions do the examining and testing, as well as the other required analysis.

    I think this goes to the core of the distorted meaning of “peer-reviewed” article that is being presented in discussions about AGW in the press and other media. The publication of an article in a refereed scientific journal is not a guarantee of the absolute absence of errors, whether mathematical, computational or methodological. It should be viewed more as a quality assurance process: errors that can be ascertained by a knowledgeable reader within a reasonably short period of time are not present, the scientific contents are connected to a larger body of knowledge, relevant references have been cited, and the procedures used and results obtained are novel and original.

    That is why scientists should read articles with a critical mind and an attitude of sympathetic skepticism: sympathetic in the sense that the reader should be able to recognize the value of the research presented in the article in question, and skeptical about whether the authors really made their case, so that the reader will be motivated to examine the article in detail.

  9. Chip

    Hi Shub,

    Very interesting, especially with regard to things I’ve been reading lately about the decline effect and its impact on the reproducibility of almost any scientific result over time. Software seems to me to be different, but maybe something odd and interesting is at work here.

    Speaking of odd and interesting, I have to ask if you keep up with Cthulhu or Nyarlathotep. Hopefully they will not be returning anytime soon 🙂

  10. j ferguson

    Interesting article in New Yorker, thanks for the reference. It’s good to be reminded how tentative my grasp of things is.

  11. Shub Niggurath

    Thanks for the link Chip. Being able to read New Yorker without subscription..imagine that. I think Nature has a similar piece called ‘hide the decline effect’, something or other.

    Edit: Found the Nature article. It is here – ‘Unpublished results hide the decline effect’. I always thought all the things that can cause a ‘decline effect’ were already subsumed under the concept of Galton’s regression to the mean.

    I think calling something the ‘decline effect’ is just obfuscation, and arises directly from taking published work too much at its word.

    ‘Oh no! the effect we published in our seminal paper in Nature/Science/Cell is not as strong anymore’.

    No. It is just as strong. It is just not as strong as you believed it to be, or wanted us to believe it to be. 🙂

    No keeping up with Cthulhu or Nyarlathotep,…they are busy sleeping. 🙂

  12. omnologos

    Now I know what’s wrong with me. I became a computer scientist before I became a scientist. This makes any computer code highly suspicious to me, and therefore makes it next to impossible for me to trust “computer models” of any kind.

    Nowadays, more than 18 years later, I find it amazing to learn that there are intelligent people out there unaware of the infinite range of results that can come out of a relatively simple group of programs. A computer, as the saying goes, is an ass, i.e. it will only do what it is asked and will only do it literally. Anybody programming it without the necessary level of expertise is bound to be led around by the ass.

    In other words, the vast majority of scientific papers based on computer models are very likely to be elegantly-thought garbage.

  13. anon

    “Therefore Nature magazine’s approach to not ask for software code as a matter of routine, but to obtain everything else, becomes difficult to reconcile.”

    If the underlying program is Excel or some other proprietary program, it may be it cannot be given away. But the program should be documented and made as free as possible.

    Open source code should be the scientific standard at this time.

    In addition to code, all other code artifacts should be made available too, including:

    Descriptions of the hardware and software environment

    Scripts to run the software

    Test scripts and test data, verification and validation.

    It so happens that after 30 years in programming, the best programmers I’ve come across are the physicists, not the software engineers.

    But let’s get serious.

    If they can afford a $50,000–$500,000 grant, work in a $1e6 lab, take trips costing $10K and up, pay for n salaries, and attend m conferences, then scientists can learn how to make replicable code.

    We don’t need, god forbid, CMMI Level 5, but CMMI Level 1 or 2 would be very appropriate.

  14. Chip

    Hi Shub,

    Good news about the big C, perhaps I can continue my studies at Miskatonic University in the fall after all.

    In the meantime, the place where the decline effect (intrigued by the regression-to-the-mean stuff – can you elaborate?) really bothers me is in medicine. I think many of the studies are oversold to the point of having conclusions that do not match the results. Your comment seems in line with that. Of course, most people just skip straight to the conclusion section.

    How many medicines are being sold that do not have the effect attributed to them, or that have an effect that is at best marginal and that at least appears to decline over time? Statins come to mind. Very troubling, I think.

  15. Pingback: Shub Niggurath on Archiving Code | Another Newyork Times
  16. Shub Niggurath

    Regression to the mean is observed in clinical studies all the time. Every ‘clinical breakthrough’ is taken with buckets of salt by practicing clinicians, super-flashy ads on TV notwithstanding.

    If a mild form of a new effect is discovered initially, it won’t be reported in the literature. When the dramatic, high-efficacy, high-impact form of the same effect is stumbled upon, it makes great news.

    From there on, it is obviously downhill! People getting large doses of the same drug (for example) with no effect, or with much-reduced efficacy, will then be reported in the literature. These will be seen as ‘skeletons tumbling out of the closet’. Then the meta-analysis guys will walk in and clean up. By this time, a few prominent side effects would have made themselves known as well. So the honeymoon is over. Everyone knows this – the drug marketeers, pharmacists, pharmacologists, clinicians, and of course, the patients.

    That is why I did not get the point of the Nature piece. Should read through it again. 🙂

  17. Nick Barnes

    Sorry if you found my opinion piece in Nature patronising; the tone was largely set by the forum (specifically: the editor wanted something outspoken and applicable to all scientists). The first draft was quite different.

    Just about all science uses software. But almost all science appears to be sound: the exceptions are memorable, but very rare when one considers the total volume of science output. I infer (essentially on the unscientific bases of this quantitative intuition and my personal experience) that almost all science software is essentially sound: like the ICCER code, or the GISTEMP code when I first worked with it, it contains bugs, but those bugs do not greatly affect the results. The essentially nihilistic views of some, such as omnologos above, are completely mistaken (and are seldom applied – as they should be for consistency – to fields other than climate science: I wonder whether omnologos ever takes medication, flies in an aircraft, or uses a computer).

  18. Shub Niggurath

    Maybe patronizing was too harsh a word. I felt the tone of the article was somewhere between patronizing and forgiving.

    In any case, I think your basic point of scientists simply releasing their code is right, and probably the best thing to do.

    It is notable that the ICCER code was released, and the Muir Russell panel commented that they were able to put together their own code easily; they therefore wondered loudly why the sceptics made such a big deal about the significance of coding errors.

    At the same time though, did it not occur to them as to why the CRU, the author and custodian of the very same supposed simple, block of code, made such a mess of the product ‘Harry’ was working on?

  19. omnologos

    Nick – you are making an incorrect analogy. Software used for aircraft, and the vast majority of software used in computers, is tested at length. Why? Because what the programmers wanted to write is one thing; what the programmers actually wrote is a wholly different thing. Cue debugging (an activity that can take much longer than the original development).

    The end result is called “production code”, to differentiate it from the tentative stuff that is “development code”. Analogously for all kinds of medication. Who would be foolish or desperate enough to take any drug simply because a breakthrough, computer-model-based paper in Nature or anywhere else indicates it might be beneficial?

    And no, I am not in the habit of flying in an experimental aircraft right after it has flown flawlessly only in a computer model.

    My point is actually bordering on the obvious. “Production”-quality code cannot usually be expected from science-related activity (if it’s “production” how can it be innovative enough to warrant a new finding and a scientific paper about it?). That doesn’t mean computer-based science is flawed in principle: it means that scientists that use computer code without understanding its limitations, without grasping concepts such as “production” and “development”, are bound to use their code wrongly in practice.
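The gap between intended and actual behaviour that this comment describes can be illustrated with a toy example (all names and the routine here are illustrative, not anyone's actual code): a centred moving average written with a classic off-by-one slicing bug, which only an explicit test with a known answer catches. That test is, in miniature, what separates "production" from "development" code.

```python
# Toy illustration: what the programmer *meant* to write vs what they wrote.
# Intended behaviour: a centred moving average over `window` points.

def moving_average_buggy(values, window=3):
    """Intended: centred moving average. Actual: window clipped by one."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half)  # bug: should be i + half + 1
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def moving_average(values, window=3):
    """The version the programmer meant to write."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)  # inclusive of the right neighbour
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

# A test against a known answer exposes the difference:
data = [1.0, 2.0, 3.0, 4.0, 5.0]
assert moving_average(data)[2] == 3.0        # centred mean of 2, 3, 4
assert moving_average_buggy(data)[2] == 2.5  # bug: mean of 2, 3 only
```

Both versions run without error and produce plausible-looking output, which is exactly why untested "development" code can sail through into published results.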

    Given how apparently widespread such a mindset is (even you end up making that mistake), science that is based on computer models should be considered weak/badly argued/wrong – until the code underlying it gets published and mostly survives other people’s use of it.

    Nature’s code unavailability rule can therefore only undermine all the papers that follow it. If one constantly hides one’s nose, one should not be surprised if the rest of the world will take for granted that there is something quite wrong with that nose.

  20. Chip

    So things don’t get worse over time, they were just never that great to begin with. This explains a great deal.

  21. Nick Barnes

    science that is based on computer models should be considered weak/badly argued/wrong

    All science, to a first approximation, is based on computer models.

  22. omnologos

    even if that were true, still…all science that is based on undisclosed materials and methods should be considered weak/badly argued/wrong. Especially concerning those materials and methods whose use and usability are widely misunderstood.

    Corollary : no matter how well-researched, any scientific paper will remain as amateurish as its most unprofessional part.

  23. Nick Barnes

    Well, you won’t find me arguing for less disclosure, but telling scientists that all their work is weak or amateurish is not a good way to persuade them to strengthen or professionalise it. You catch more flies with honey: the science is good, let us work to make it better.

  24. Pingback: So You Believe In Computer Models… « The Unbearable Nakedness of CLIMATE CHANGE
  25. Shub Niggurath

    Nick, a couple of points here,…I don’t know if you disagree.

    In cases where there are problems, asking scientists to become more professional begins with identifying, and then telling the scientists, that their present work is not professional.

    This is the nasty part – try telling any set of people with big egos that something is wrong with their work (which is their very source of pride). Without this, the honeyed advice won’t be taken.

    Secondly, using the ‘science is good, so the code must be OK or the bugs do not matter’ argument is circular. There are two things wrong with this:

    1) The code should not derive its validation from the very scientific phenomenon it seeks to compute; it should be validated independently. I know this may be difficult to achieve, but strictly speaking, that’s how it should be. For example, you cannot say ‘my hockeystick algorithm is OK because other studies have recreated the hockeystick’.

    2) The second point arises as a corollary to (1). Scientific findings rest their validity on the independence of the algorithms that produce them. This is at the heart of the scientific method – especially in the science of prediction, be it clinical trials or climate modelling or whatever. There is no ‘science’ left for us to point to and say ‘the science is OK’, if the code is broken.

    If the science comes out OK even if the code is broken or the underlying assumptions wrong (which is a different thing), then it is just a coincidence.
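Point (1) above can be made concrete with a small sketch: validating an analysis routine against synthetic data with a known, chosen answer, rather than against agreement with other published results. Everything here is illustrative (a plain least-squares slope standing in for any analysis code), not any particular group's method.

```python
# Independent validation sketch: generate data whose true answer we chose
# ourselves, then check that the code recovers it. The routine's validity
# never rests on agreement with other studies' results.
import random

def fit_trend(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic data with a known slope, plus noise:
random.seed(0)
true_slope = 0.7
xs = list(range(100))
ys = [true_slope * x + random.gauss(0, 1.0) for x in xs]

# The test: does the code recover the answer we planted?
assert abs(fit_trend(xs, ys) - true_slope) < 0.05
```

The same pattern scales up: however elaborate the algorithm, one can feed it inputs with a planted, known answer and check the output, without ever appealing to whether the scientific result it produced "looks right".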

  26. anon

    “All science, to a first approximation, is based on computer models.”

    Corollary: To a first approximation, science did not begin until 1942.

  27. anon

    ““All science, to a first approximation, is based on computer models.””

    I’m sorry, but regardless of how you want to parse the word “is” you’re still wrong.

    At the least, you’re going to need a citation for your claim.

    [‘Anon’: You have to provide a valid email address]

  28. Nick Barnes

    Consider any issue of Nature. Every science paper in it uses computation to produce its results. Some of the papers are ‘pure modelling’ papers, i.e. describe a computational model and its results. Pretty much all of the other papers use a combination of several models to convert raw observations into results.

    For instance, scientists are in the process of discovering hundreds of ‘exoplanets’: planets orbiting other stars. Most of these have been discovered by the Kepler space telescope, designed and built for the purpose. The main instrument on-board Kepler is a huge multi-CCD camera, 95 megapixels read out every six seconds.

    Most of the raw data from this instrument is CCD voltage levels – approximately 16 million analog values per second. They are measured and digitized on-board Kepler. The device which does that is built and calibrated based on a model of the CCD behaviour. The digital values are then summed over a 30-minute interval and selected data (the pixels of interest) from each 30-minute interval is requantized and stored for transmission to the ground once a month. The details of the summation and re-quantization depend on a data model.
    Once the data is on the ground, it is calibrated according to another model, before being analyzed for light curves (the brightness of a star varying when a planet passes in front of it). This search for light curves depends on another model (which “knows about” stellar sizes and brightnesses and possible transit geometries). Discovered light curves are then analyzed according to further models, constantly being refined, which (for instance) try to account for the effect on planetary periods and transits of the multi-body dynamics of a planetary system. The outcome is a series of published papers announcing the discovery of large numbers of exoplanets, including orbital parameters, sizes, masses, and sometimes some information about planetary atmospheres (more modelling there).

    The above is a simplification of the Kepler data processing pathway: the way in which raw data turn into scientific results. Each of these models is computer software developed over some years by numbers of different scientists, based on theoretical models also developed over long periods, often in parallel with the software, often by the same scientists. The science absolutely depends on the models. The raw voltages out of the CCD are of no scientific value at all without the models.
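The onboard co-adding and requantization step in that pathway can be sketched in miniature. This is a toy sketch only: the cadence counts, pixel values and quantization step below are made up for illustration and are not the actual Kepler flight parameters (real missions tune the quantization step to the noise level).

```python
# Toy sketch of an onboard "sum then requantize" data-reduction step,
# with illustrative (not real Kepler) parameters.

def coadd(exposures):
    """Sum a list of per-exposure pixel readouts into one long cadence."""
    return [sum(pix) for pix in zip(*exposures)]

def requantize(values, step):
    """Coarser quantization to cut downlink volume: round each summed
    value to the nearest multiple of `step`."""
    return [step * round(v / step) for v in values]

# E.g. 300 short exposures summed into one long cadence, for 4 pixels:
exposures = [[100, 101, 99, 100] for _ in range(300)]
cadence = coadd(exposures)           # [30000, 30300, 29700, 30000]
compact = requantize(cadence, 64)    # coarser values, fewer bits to store
```

The point of the sketch is the dependence the comment describes: even this trivial reduction step embeds modelling choices (what to sum, how coarsely to requantize), and the downlinked numbers are meaningless without knowing them.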

    Oddly enough, nobody challenges the exoplanet research on the basis that it is “based on computer models”, but it most assuredly is, and it is not atypical of science today.

    All scientific data passes through computers between the instrument and the page. Pretty much all theoretical models (i.e. scientific hypotheses, “theories”) are turned into software models used to process the data. The results of the data processing are used to refine the software and to inform the development of the theories.

    Part of that development, across science, involves writing software purely representing the theory – i.e. not processing observational data – observing the results produced, comparing them with processed observational data, and refining both the pure model and the data processing pathway accordingly. Looking at the model often yields new hypotheses about the behaviour of the real system, which can then be tested either by developing new experiments or observations or often by inventing new techniques for processing existing observational data.

    A good example of this process might be the recent development of new models for ice sheet meltwater drainage and its effects on seasonal glacial movements. On this subject there are pure model papers and observational papers, and each sort develops and moves forward based on the other.

    If you don’t like this process, that is your prerogative. But you should be aware that all science is currently done this way.

  29. omnologos

    nice and long answer Nick but astronomers do spend time calibrating their instruments and models against the real world of stars and CCDs; have been at the forefront of raw data sharing, after an appropriate time for publication of course; don’t go around accusing of denialism those that don’t believe every candidate Kepler planet is a planet. Etc etc (the rules for disseminating Kepler data are particularly well thought out and make eg UEA look like a bunch of hominins). Furthermore they don’t go around the world telling people to be frugal and stop travelling.

    So yes, I’d cut the Kepler people some slack if they were to obtusely hide their source code, or at least I’d do so for a little while.

    Still the more time would pass with lots of efforts put into obfuscation and rationalising the more the Kepler results would appear weak/badly argued/wrong. So we’re back to square one…

  30. Nick Barnes

    And climate scientists do of course calibrate their instruments and models against the real world. It’s a great deal of the work of climate science.

    Have you read ‘A Vast Machine’ by Paul Edwards?

    In my opinion Kepler scientists might well use the word ‘denialist’ to describe anyone who suggests, for instance, that all astronomers are frauds and liars because planets don’t really orbit stars. Many of the attacks on climate science are at approximately that intellectual level (e.g. that climate scientists are all frauds and liars because the greenhouse effect doesn’t really exist).

  31. Shub Niggurath

    Oddly enough, nobody challenges the exoplanet research on the basis that it is “based on computer models”, …

    But that was not the original argument made, was it?

    Secondly, there are always periods in science when scholars look inward, and stop collecting real data, for various reasons.

    Whatever these trends be, measurements and data are the only true measure of a science’s worth.

  32. Chip

    I just finished my doctoral work recently and I know that one problem for many researchers is that a great deal of what you must do lies outside your expertise. Many researchers hire statisticians, for example. I am curious if the same is true of programmers. I suspect that it is. How many scientists actually write their own code? And is there danger in the interface, since programmers are not necessarily scientists? Finally, the vast majority of scientists in the past have been wrong, even Newton and Einstein. Why should I expect the current crop to do any better?
    I no longer delegate responsibility for thinking to someone else. They need to be able to demonstrate their truth. My opinion is that the current crop of climate models are descriptive rather than predictive and are of small practical value in my life, except to the extent that they can be used by politicians to limit my choices and freedoms.

  33. Perry

    I posted at WUWT:

    Andrew Bolt this morning on the radio in Melbourne:

    “We chat to Jill Duggan, from the directorate-general for climate action at the European Commission, who says the opposition here to a carbon dioxide tax is ”slightly bizarre” when Europe has no problem with its own price on carbon dioxide. Really, I ask, with European unemployment at 10 per cent and growth at just 1.6 per cent? So I ask this salesman of the EU emissions trading scheme the two basic questions everyone should ask of anyone selling anything: how much does it cost, and what will it do? How many billions will Europe spend on this scheme to cut its emissions by 20 per cent by 2020, and by how much will that cut the world’s temperatures by 2100? The interview suddenly goes very pear-shaped for one of us – and is a stunning indictment of the EU’s foolishness. The question about job losses caused by Europe’s green schemes goes no better. ”

    Please listen to this show. It will inform and greatly amuse. The link to the recording is under the picture of Jill Duggan. Make it viral.

  34. Brad Keyes

    In my opinion Kepler scientists might well use the word ‘denialist’ to describe anyone who suggests, for instance, that all astronomers are frauds and liars because planets don’t really orbit stars. Many of the attacks on climate science are at approximately that intellectual level (e.g. that climate scientists are all frauds and liars because the greenhouse effect doesn’t really exist).

    Wow. Is this simply a rhetorical posture on your part Nick, or are you genuinely, honest to god, laboring under the illusion that CAGW skepticism has something to do with denying the existence of the greenhouse effect?

    May I ask—without meaning to be impolite—whether you’ve ever read anything a CAGW skeptic has written on the topic of climate? I mean: read and parsed the sentences, as opposed to just scanning for keywords?

  35. Brad Keyes

    There’s another fact of which I can’t in good conscience allow you to remain unaware.

    The following activities are not “attacks on climate science” (a phrase which, in any case, has no more meaning than “attacks on mathematics”):

    Disagreeing with conclusions drawn by scientists; asking to see the data they used, the data they rejected, and their reasons for doing so; asking for sufficient detail to enable replication; nitpicking their methods and the software that ostensibly implements them; reviewing their work, with or without patents of nobility establishing your own status as a peer; asking them to make all of this available to you when your objective is to try and find something wrong with it, in Phil Jones’ deathless words.

    These activities are climate science.

  36. Pete H

    Strawman argument!

    Look, it’s so simple! Forget the thin skin and publish all procedures and code so science can do its work!

    I seem to think falsification is the term used! You want trust? No amount of bitching about attacks on “Climate Science” will suffice! The groupthink will never be accepted, nor the “pal review”!

    No one wants to rubbish people trying to produce science from a chaotic system, and no serious sceptic denies climate change! They simply want to know when politics is entering the field! Models do not cut it until they match real-world scenarios! So far, as climate goes, they do not make the grade in any form!

  37. Brad Keyes

    Hi Pete, don’t get worked up about Nick—I don’t think he gets how science works.

    Keep up the great blogging Shub!

  38. Nick Barnes

    Surely you are being disingenuous. You can’t deny the existence of the large subset of “sceptics” who do, at great and hopelessly ignorant length, deny the existence of the greenhouse effect.

    There’s a long, and far from exhaustive, list here:

    “Slaying the Sky Dragon” is largely based on idiocy like this.

    Sorry, I’m too busy with the Google Summer of Code to keep up with this discussion any more.

  39. Shub Niggurath

    Hi Nick

    Nice to hear about the Google Summer of Code thing.

    If we trace back to your point of disagreement with Maurizio above, it goes to where you call his view ‘nihilistic’.

    The said view was that all computer programs only do what they do, and not necessarily what scientists say they do.

    Not necessarily a very controversial thing to say, is it?

    But yet you disagreed, I presume, because to simply doubt every program written by scientists until they are checked would be counterproductive and corrosive – please alert me if I am putting words in your mouth.

    I don’t think you are disagreeing with Maurizio then. He and you are essentially making two separate points.

    What a computer program does cannot be considered valid solely because its outputs are consistent with results from other experiments.

  40. Nick Barnes

    The target of my ‘nihilistic’ comment was this remark of Omnologos (Maurizio?):
    “the vast majority of scientific papers based on computer models are very likely to be elegantly-thought garbage”.

    The core of my argument is that essentially all scientific papers are based on computer models, so Omnologos’ statement amounts to a dismissal of the vast majority of current science. I find that nihilistic.

    I think that science code should be better tested, and should be routinely published. I am spending most of my time at the moment (and thus forgoing a large proportion of my income) working to encourage the changes in this direction which we are seeing in science. But the idea that science software is completely valueless is just wrong, as is the idea that climate science software practices are, in any important respect, different from those in other sciences.

  41. omnologos

    I thought we had moved on (I don’t mind agreeing to disagree – my experience as a scientist and as a programmer, in both development and support, leads me in one direction, yours in another).

    I haven’t seen a reply to these thoughts of mine though:

    “all science that is based on undisclosed materials and methods should be considered weak/badly argued/wrong. Especially concerning those materials and methods whose use and usability are widely misunderstood.

    Corollary : no matter how well-researched, any scientific paper will remain as amateurish as its most unprofessional part.”

  42. Pingback: datanalytics » Limpieza de cartera y miscelánea de artículos
  43. Pingback: So You Believe In Computer Models… | Omnologos