Remember that song, “John Jacob Jingleheimer Schmidt, his name is my name too”? Yes, histones have a problem similar to that of the guy in the song, which I first learned from a Shmoo cartoon in the early ’80s. Specifically, variant histones (those non-allelic, alternate versions of the classic DNA-spooling protein) get all kinds of wacky names from the scientists who first identify them. And sometimes those names are nearly the same for two different types of histone variants in totally different organisms.
These are key molecules with important roles in controlling gene expression, and for anyone studying epigenetics, the current naming scheme is confusing and unnecessarily complex—and downright unseemly. Fred Hutchinson Cancer Center researchers Paul Talbert and Steve Henikoff—along with about 40 co-authors and an enthusiastic group of histone researchers—aim to change all that.
I spoke to Talbert and Henikoff to find out a little more about their quest for reasonable names, as well as a couple of details about their new system, which they and their co-authors published in a recent issue of the open-access Epigenetics & Chromatin as “A Unified Phylogeny-Based Nomenclature for Histone Variants.” (pdf here.)
[Oh, and trying to think of a ridiculous title for this post, I find Talbert and Henikoff have bested me with a review about histone variants in Nature Reviews Molecular Cell Biology with 2010's "Histone Variants—Ancient Wrap Artists of the Epigenome." Recognize.]
So, a few details. Histone variants often replace their close relatives, the standard (or “core”) histones, creating special types of nucleosomes that do something different when they wrap up DNA. They perform some functions we understand and some we don’t yet. Nucleosomes featuring certain variants can keep particular areas of DNA available for active transcription, for example.
Paul Talbert says the histone variant H2A.Z goes by at least a half dozen names, including Htz1, Hv1, H2Av, and even D2 in Drosophila. Also, there’s an H2Bv in trypanosomes and an H2Bv in malaria parasites, “but they’re not closely related, and they’re not orthologs,” he says.
And in recently described sperm histone variants, H2A.L and its close relatives go by very different names. “They’re all kind of the same general group, but it’s not very clear how they’re related to each other, so we’ve tried to make that a bit more clear, and more clarity will come with more research,” Talbert adds.
People who work on a specific variant know its many aliases, but that’s no help to the uninitiated, and it complicates electronic searches for related research.
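To make the alias problem concrete, here is a minimal Python sketch of the kind of lookup a search tool would need under the old names. The table holds only the aliases mentioned above. The names `ALIASES`, `AMBIGUOUS`, and `canonical_name` are made up for illustration and come from neither the paper nor any database.

```python
# Hypothetical alias table, built only from the names quoted in this post.
ALIASES = {
    "H2A.Z": ["Htz1", "Hv1", "H2Av", "D2"],
}

# Labels used for unrelated variants in different organisms
# (H2Bv in trypanosomes vs. H2Bv in malaria parasites).
AMBIGUOUS = {"H2Bv"}

# Invert the table once so each alias points at its family name.
_LOOKUP = {alias: family for family, names in ALIASES.items() for alias in names}


def canonical_name(name):
    """Return the family name for a known alias, or None if it cannot be resolved."""
    if name in AMBIGUOUS:
        return None  # same label, different proteins, so no single answer
    if name in ALIASES:
        return name  # already a family name
    return _LOOKUP.get(name)
```

With this table, `canonical_name("Htz1")` returns `"H2A.Z"`, while `canonical_name("H2Bv")` returns `None`, because the label alone cannot say which of the two unrelated proteins is meant.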
It’s hard for specialists in any area to abandon an established naming convention—things must have gotten bad in histone variant naming. How bad did they get?
Talbert: I was quite surprised when Steve first asked me to do this. My first thought was “whipping boy.” I thought I’d be bludgeoned and hated—but it actually turned out quite the reverse.
All the feedback was positive. We did have people offering alternative points of view about some things, but really, it seems like everyone was anxious for somebody to do this, and was glad that it wasn’t them.
One of the big goals for your naming system was to make it machine searchable—it’s interesting to see the Internet age have an effect on how these things are done.
Henikoff: Oh, yeah. That’s been a major impetus for it, and when we contacted [U.S. National Center for Biotechnology Information's] David Landsman, that was his first point. He said, “Don’t use all the punctuation marks that people have been using in variant names!”
There was one case of an H1 that included the “degree” sign. When we got the [article] proofs, it was missing! People use all kinds of special characters, and they’re really not machine-readable, for the most part. They’re mutually exclusive when you do a search, and you can’t pick ’em up. For example, you wouldn’t be able to use an asterisk to run a wild-card search, because that would give you lots of false positives.
So it had gotten pretty much out of hand, and it was gradually getting worse and worse, particularly with large-scale groups trying to give protein names for annotation purposes.
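Henikoff’s point about punctuation can be sketched in a few lines of Python. The check below is an illustration, not the rule set from the paper: it assumes, purely for the example, that a search-friendly name uses only ASCII letters, digits, and periods. The function name `is_search_friendly` is hypothetical.

```python
import re

# Assumption for this sketch only: ASCII letters, digits, and periods are allowed.
_PLAIN_NAME = re.compile(r"[A-Za-z0-9.]+")


def is_search_friendly(name):
    """True if the whole name consists of plain ASCII letters, digits, and periods."""
    return _PLAIN_NAME.fullmatch(name) is not None


print(is_search_friendly("H2A.Z"))     # True
print(is_search_friendly("H1\u00b0"))  # False: the name contains a degree sign
```

A name that fails a check like this is the kind that can lose a character in the proofs or drop out of a plain-text search.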
Can you explain the role phylogeny plays in the new convention?
Talbert: We started looking at how they’re named now, because you don’t want to disrupt more than necessary. They’re already grouped into families—H2A.Z, H2A.X, blah, blah—so we took cues from the way the nomenclature already works, and tried to make that systematic and make as much sense as possible, without making it completely illogical.
In a couple of places, we kept things that are illogical from a phylogenetic point of view, just because historical precedent seemed more important. But in general, we tried to get them to go together.
Henikoff: I think what you’ll find is that all protein families, when they get renamed, it’s all based on phylogenies—take kinesins, myosins, etc.—they were all renamed, but based on subfamily classifications. So the tradition’s there.
With histones, that had gotten started in a very crude way in 1975, but just by dividing them up. Before 1975, there were several names for what we now agree is H2A, and you don’t hear any of those anymore. So that first classification worked, and then with DNA sequencing, it caught on that you’re going to have these protein names for renaming, rather than anything goes, as is the case for gene names.
What will this change for epigeneticists?
Henikoff: The main thing is [electronic] searches. But also, when it comes to guiding people who’re doing annotations for new genomes. I think this’ll be very helpful because right now they don’t have guidelines, so it can be very confusing to get into this.
Specifically, in the case of the histone H1, we were surprised by how easy it was, when we realized that when you look at them phylogenetically, they seem completely interchangeable, in a sense. They’re probably functionally equivalent, with few exceptions. So they actually turned out to be quite easy.
Prior to that, people thought there were all these H1 subfamilies. There’s less [evolutionary] constraint on histone H1 than there is on core histones, so they diverge more rapidly. At first, I think people were misled into thinking that because they’re [structurally] different, they’re also functionally different. But when they do the experiment, they find they’re pretty much interchangeable—they may be expressed at different times, but they’re more or less the same.
And that’s what came out of our phylogenetic approach, and we realized that by renaming them such that we just give them numbers—and not try to make artificial distinctions among them—the phylogeny actually informs us about the function, in that they’re all pretty much the same.
Is there any down side to establishing a new system? What happens with changes in biological classifications?
Talbert: We didn’t put organisms into the names, partly because of that issue. We invented these descriptors—well we didn’t invent them—but we said this is when you should use them and what you should use them for. People have been using them forever.
But sometimes people would add the organism name to the protein, and we just thought that created complications—renaming is one of them, and recognition by searches is another.
So that’s one reason why we said none of that stuff is part of the name. You can describe histone variants in any way you want that’s informative, but that shouldn’t be made part of the name.
Are there any downsides? Well, having made most of this system, I’m wondering what I didn’t anticipate that will turn up as we sequence more genomes that may not quite fit into the scheme. But I think it should be good for a while. I don’t think there should be too many obvious problems with it right away, hopefully.
So, it’s not like you can just crack the whip and make everyone follow this standard—what do you do to encourage its adoption?
Henikoff: Well, it’s not a matter of cracking the whip. We want to provide guidelines so that people who want a unified, systematic phylogenetic-based approach can have something to use. I think a lot of people will ignore it, but if they write an abstract that doesn’t have a histone variant name that can be picked up by searching, it’s to their own detriment.
So I think it’s in everybody’s interest to go along with at least getting it right in their title and abstract. But whatever they want to call it in the context of their paper, that’s fine.
Is there anything you’re doing to popularize the new naming scheme?
Talbert: I’m going to two conferences in a couple weeks where I’ll have a poster where people can learn about it. Also, it’s on the web, and the Epigenetics & Chromatin editors put out a press release for us. I think it’ll catch on.
Henikoff: Putting it in an open-access community journal, one that we can make updates to, lets the paper itself serve as a website. So there’s a certain convenience there—we don’t have to figure it out again in five years and sort things out. We can make important updates directly, and maybe add links as well. With one-stop shopping, I think that’ll help get it out there.
How did you get involved in this effort to set up a naming convention for histone variants?
Henikoff: I was at an [EMBO workshop] in Strasbourg on histone variants in October of last year, and one of the organizers, [co-author and Temasek Lifesciences Laboratory researcher] Fred Berger, said we should consider a histone variant nomenclature; it’s kind of a mess. Everybody agreed it was a mess.
As a result, I contacted David Landsman, who’s at the NCBI and is basically in charge of issues like this, and he’s also a histone guy. And he said, “Oh yeah, why don’t you do something, that’d be great!” I reported back to the meeting, and they agreed, and we left it at that.
I talked to Paul [Talbert] about it because we’d published a comprehensive review on the phylogeny of histone variants, and it seemed like a phylogenetic approach would be a logical way to go about doing the nomenclature. That’s what got it started, and then we contacted the other people in the community and they were all for it.
[The picture of a roadside-attraction sign at Confusion Hill, California is by Flickr user DominusVobiscum, and it's used here under a Creative Commons license.]
Talbert PB, Ahmad K, Almouzni G, Ausió J, Berger F, Bhalla PL, Bonner WM, Cande WZ, Chadwick BP, Chan SW, Cross GA, Cui L, Dimitrov SI, Doenecke D, Eirin-López JM, Gorovsky MA, Hake SB, Hamkalo BA, Holec S, Jacobsen SE, Kamieniarz K, Khochbin S, Ladurner AG, Landsman D, Latham JA, Loppin B, Malik HS, Marzluff WF, Pehrson JR, Postberg J, Schneider R, Singh MB, Smith MM, Thompson E, Torres-Padilla ME, Tremethick DJ, Turner BM, Waterborg JH, Wollmann H, Yelagandula R, Zhu B, & Henikoff S (2012). A unified phylogeny-based nomenclature for histone variants. Epigenetics & Chromatin, 5(1). PMID: 22650316