Fixing Formalin: Perhaps Not a Game-Changer, But a Good Start!

Sarah Hykin and Jim McGuire

In our recently published paper in PLoS One, we provided ‘proof-of-concept’ that it is possible to obtain genome-scale data from formalin-fixed specimens. This study was proposed by one of us (SMH) about four years ago while contemplating how she might be able to undertake a phylogeographic study of a rare lizard species for which there was little hope of resampling the entire range.

Of course, people have been contemplating the challenge of obtaining DNA sequence data from museum specimens for decades, often with limited success. The typical approach involved targeting mitochondrial genes, developing sets of nested primers that amplify short fragments (often only 50-100 bp in length), and then brute-force amplification and sequencing in an effort to score a few hundred usable base pairs.

However, our discussion was informed by the recent development of short-read Next-Generation Sequencing on the Illumina platform, which produces genomic-scale data 50-100 bp at a time. Surely, we thought, if any method could efficiently pull DNA sequence data from formalin-damaged DNA, this was it. Our timing was impeccable because our campus had just obtained a major foundation grant to support, among other things, the development of risky technology that could enhance the utility of historical museum specimens. We obtained a small subaward and Sarah went to work studying the literature on historical DNA sequencing, and figuring out how to perform NGS. Illumina sequencing and bioinformatic processing have become pretty routine now, but this was a very challenging undertaking at the time, and Sarah had to pursue this while focusing her energies on her unrelated dissertation research.

The first decision that we had to make was which species and specimens to select for sequencing. This turned out to be a ‘no-brainer’ because the only squamate genome available was Anolis carolinensis and we needed a genome to which we could map reads. Being a conservative museum curator, Jim suggested using no-data or limited-data specimens so that when the project inevitably failed we would not have cut up particularly important specimens. In retrospect, this was a mistake. For example, we used a limited-data specimen of Anolis carolinensis that was in the database as having been obtained in 1985. Further investigation of this specimen would have shown that this date was dubious, and it now appears the specimen was actually obtained and prepared between late 1986 and 1988 (and accessioned in 1990). We also have reason to believe the specimen was fixed in buffered formalin for one week prior to rinsing and immersion in ethanol, though this is not known for sure since the specimen does not have associated field notes. If we had this to do over again, we would make sure that we knew as much as possible about the source specimens rather than taking a ‘let’s limit the damage if this experiment fails’ approach to specimen selection.

It works (sometimes)!

As indicated in our paper, it is indeed possible to obtain genome-scale data from formalin-fixed specimens housed for decades in a museum collection. However, our method is far from foolproof and we strongly suspect that idiosyncratic features of individual specimens will determine success or failure in many instances. The age of the specimen is likely to be one of the most important variables, as suggested by our published study – sequencing of our 100-year-old sample failed, whereas sequencing of our ~25-year-old sample was successful. However, other parameters are likely to prove important. For example, others have shown via direct experiments that DNA is better protected by buffered (versus unbuffered) formalin. We also suspect that other features of the specimen’s preparation, such as the time spent soaking in formalin prior to immersion in ethanol, the concentration of the formalin used, the quantity of formalin injected into the specimen, and the time that passed between the death of the specimen and its preparation, could all make a difference. These conditions are rarely recorded at the time of preparation, which means that for the vast majority of specimens that might be targeted for NGS, the researcher cannot know ahead of time whether the specimen is likely to be a good or a poor candidate for sequencing.

Our paper received some attention on social media, with some calling our study a “game changer” and others arguing that such a statement is overblown. From our perspective, this is semantics. Have we completely solved the issue of obtaining genomic sequence data from formalin-fixed samples? Certainly not. Have we identified, in a controlled way, the precise conditions underpinning success or failure of NGS from formalin-fixed samples? Again, not by a long shot.

However, we have shown some things that are likely to be important in moving this technology forward (which could be interpreted as changing the game). Most importantly, we have shown definitively that it is possible to obtain genomic data from old museum specimens. This had not been shown previously and we believe that this will encourage many more people to give it a try than would be the case if our paper had appeared in the Journal of Negative Results. Further, we were able to shed some light on methodological issues that are actually quite important. First, we obtained a sufficient quantity and quality of DNA for sequencing from liver tissue and not from either leg muscle or, most importantly, bone. Many of you will be aware of a terrific study published by Maureen Kearney and Bryan Stuart in 2004, in which they provided a phylogeny for amphisbaenians that was based in large part on sequences obtained from old museum specimens. In their groundbreaking study, they obtained mitochondrial and nuclear sequence data using laborious traditional Sanger sequencing of short DNA fragments, along with non-traditional extraction of DNA from bone tissues using methods developed for human forensic DNA analysis. Their successful extractions required sampling bone from pickled specimens, which can be quite destructive. In contrast, pulling liver from a museum specimen is minimally invasive – especially when you consider that most newly collected specimens will have had their livers removed prior to preparation. Thus, our finding that liver is an optimal DNA source for NGS is important.

Second, we found that a modified phenol-chloroform extraction protocol outperformed Qiagen extraction for NGS purposes. Indeed, we now suspect that our failed attempt to perform NGS on our older sample could very well be the result of our effort to systematically compare extraction protocols. For both of our Anolis specimens, we subsampled the entire liver, which was divided into a small piece and a much larger piece. For the younger sample, the larger piece was extracted using phenol-chloroform, whereas for the older sample, the larger piece was extracted using Qiagen. Importantly, from that older sample, we obtained more DNA from a piece ~20 times smaller in mass using phenol-chloroform than we did using Qiagen extraction. If we had dedicated the larger subsample from the 100-year-old specimen to phenol-chloroform, we believe this might have resulted in successful NGS.

Figure: Sample placement in the phylogeny. As expected, our sequencing effort generated low coverage (~0.5X) of the Anolis carolinensis genome. However, we did obtain ~60X coverage of the mitochondrial genome, providing a means of evaluating the quality of our sequence data after processing. We aligned the ND2 sequence from our sample with GenBank sequences representing A. carolinensis from Louisiana, the Anolis carolinensis complete genome, a variety of additional Anolis species, and a more distant outgroup. The phylogram shown here suggests that our mitochondrial sequence data obtained from a formalin-fixed specimen are reliable. If sequencing errors were evident, we would minimally expect the branch representing our formalin-fixed specimen to be relatively long compared with other Louisiana A. carolinensis, or perhaps even misplaced on the tree.
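For readers less familiar with coverage figures such as ~0.5X and ~60X, the arithmetic behind them can be sketched in a few lines of Python. This is only an illustration of the standard mean-depth calculation (total mapped bases divided by the length of the target), not the pipeline used in our paper. The function names are hypothetical, and the read counts and target sizes in the usage note are round, made-up numbers rather than values from our data.

```python
import math


def mean_coverage(n_mapped_reads: int, read_length: int, target_size: int) -> float:
    """Mean depth of coverage: total mapped bases divided by target length.

    Assumes every mapped read contributes its full length to the target,
    which is a simplification (real reads are trimmed, clipped, or duplicated).
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return n_mapped_reads * read_length / target_size


def reads_for_coverage(coverage: float, read_length: int, target_size: int) -> int:
    """Number of mapped reads needed to reach a given mean depth (rounded up)."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(coverage * target_size / read_length)


if __name__ == "__main__":
    # Hypothetical round numbers chosen for illustration only:
    # a 2 Gb nuclear genome and 100 bp reads.
    print(mean_coverage(10_000_000, 100, 2_000_000_000))
    # A hypothetical 17 kb mitochondrial genome at 60X depth.
    print(reads_for_coverage(60, 100, 17_000))
```

The same number of reads gives very different depths on targets of very different size, and mitochondrial DNA is also present in many copies per cell. Together these explain how a run that barely samples the nuclear genome can still cover the mitochondrial genome many times over.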

Where to from here?

One critique that we received in review was that we had failed to perform rigorous controlled experiments testing the various conditions that could impact success with NGS from formalin-fixed samples. Though we would have loved to perform such experiments ourselves, even limited NGS is still sufficiently expensive (thousands of dollars, rather than hundreds) that we were not in a position to pursue this. However, we would love to see someone – perhaps reviewer number 2 – grab the bull by the horns and perform this experiment! Such a follow-up study, should it identify via controlled experimental procedures the key parameters for successful NGS of formalin-fixed samples, would come closer to meeting the criteria of being a ‘game-changer.’