Tag Archives: former poster: mjg59

Head and shoulders photo of Margaret Dayhoff

Wednesday Geek Woman: Margaret Dayhoff, quantum chemist and bioinformaticist

This post appeared on my blog for Ada Lovelace Day 2011.

It’s become kind of a cliché for me to claim that the reason I’m happy working on ACPI and UEFI and similarly arcane pieces of convoluted functionality is that no matter how bad things are there’s at least some form of documentation and there’s a well-understood language at the heart of them. My PhD was in biology, working on fruitflies. They’re a poorly documented set of layering violations which only work because of side-effects at the quantum level, and they tend to die at inconvenient times. They’re made up of 165 million bases of a byte code language that’s almost impossible to bootstrap[1] and which passes through an intermediate representation before it does anything useful[2]. It’s an awful field to try to do rigorous work in because your attempts to impose any kind of meaningful order on what you’re looking at are pretty much guaranteed to be sufficiently naive that your results bear a resemblance to reality more by accident than design.

The field of bioinformatics is a fairly young one, and because of that it’s very easy to be ignorant of its history. Crick and Watson (and those other people) determined the structure of DNA. Sanger worked out how to sequence proteins and nucleic acids. Some other people made all of these things faster and better and now we have huge sequence databases that mean we can get hold of an intractable quantity of data faster than we could ever plausibly need to, and what else is there to know?

Margaret Dayhoff graduated with a PhD in quantum chemistry from Columbia, where she’d performed computational analysis of various molecules to calculate their resonance energies[3]. The next few years involved plenty of worthwhile research that isn’t relevant to the story, so we’ll (entirely unfairly) skip forward to the early 60s and the problem of turning a set of sequence fragments into a single sequence. Dayhoff worked on a suite of applications called “Comprotein”. The original paper can be downloaded here, and it’s a charming look back at a rigorous analysis of a problem that anyone in the field would take for granted these days. Modern fragment assembly involves taking millions of DNA sequence reads and assembling them into an entire genome. In 1960, we were still at the point where it was only just getting impractical to do everything by hand.
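
To give a feel for the fragment problem, here’s a minimal sketch of greedy overlap assembly in Python. This is not Comprotein’s actual method (I’m not reproducing the paper here), just the textbook idea of repeatedly gluing together the two pieces that overlap the most. The function names and the example peptide are made up for illustration, and greedy merging can get the wrong answer on repetitive sequences.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments: list[str]) -> str:
    """Greedily merge the best-overlapping pair of fragments until one remains."""
    # Remove exact duplicates (keeping order), then drop fragments that sit
    # wholly inside another fragment, since they add no new information.
    unique = list(dict.fromkeys(fragments))
    pieces = [f for f in unique if not any(f != g and f in g for g in unique)]
    if not pieces:
        return ""
    while len(pieces) > 1:
        best_size, best_i, best_j = -1, 0, 1
        # Try every ordered pair and remember the largest suffix/prefix overlap.
        for i, left in enumerate(pieces):
            for j, right in enumerate(pieces):
                if i == j:
                    continue
                size = overlap(left, right)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        # Join the winning pair, writing the shared region only once.
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces[0]


# Hypothetical peptide fragments, written in single-letter amino acid codes.
print(assemble(["IAKQR", "MKTAY", "TAYIAK"]))
```

With three fragments this is easy enough to do by eye. The pairwise comparison grows with the square of the number of fragments, which hints at why doing it by hand stopped being practical.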

This single piece of software was arguably the birth of modern bioinformatics, the creation of a computational method for taking sequence data and turning it into something more useful. But Dayhoff didn’t stop there. The 60s brought a growing realisation that small sequence differences between the same protein in related species could give insight into their evolutionary past. In 1965 Dayhoff released the first edition of the Atlas of Protein Sequence and Structure, containing all 65 protein sequences that had been determined by then. Around the same time she developed computational methods for analysing the evolutionary relationship of these sequences, helping produce the first computationally generated phylogenetic tree. Her single-letter representation of amino acids was born of necessity[4] but remains the standard for protein sequences. And the atlas of 65 protein sequences developed into the Protein Information Resource, a dial-up database that allowed researchers to download the sequences they were interested in. It’s now part of UniProt, the world’s largest protein database.

Her contributions to the field were immense. Every aspect of her work on bioinformatics is present in the modern day — larger, faster and more capable, but still very much tied to the techniques and concepts she pioneered. And so it still puzzles me that I only heard of her for the first time when I went back to write the introduction to my thesis. She’s remembered today in the form of the Margaret Oakley Dayhoff award for women showing high promise in biophysics, having died of a heart attack at only 57.

I don’t work on fruitflies any more, and to be honest I’m not terribly upset by that. But it’s still somewhat disconcerting that I spent almost 10 years working in a field so defined by one person that I knew so little about. So my contribution to Ada Lovelace Day is to highlight a pivotal woman in science who heavily influenced my life without me even knowing.

[1] You think it’s difficult bringing up a compiler on a new architecture? Try bringing up a fruitfly from scratch.
[2] Except for the cases where the low-level language itself is functionally significant, and the cases where the intermediate representation is functionally significant.
[3] Something that seems to have involved a lot of putting punch cards through a set of machines, getting new cards out, and repeating. I’m glad I live in the future.
[4] The three-letter representation took up too much space on punch cards.

This post is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Want to highlight a geek woman? Submissions are currently open for Wednesday Geek Woman posts.

Looking to the past

It’s an oft-voiced suggestion that rather than looking at the bad things that happen in our communities, we should focus on the good things. There are a number of highly successful geek women already – should we not be concentrating on encouraging more of them, rather than scaring people away with tales of thoughtlessness, discrimination and outright abuse?

Let’s draw an analogy. One day, a $20 charge appears on your credit card. You didn’t make it. You report it to your credit card company, who assure you that they take fraud seriously and then do nothing. A few days later, another $20 charge. Your credit card company tells you that such events are rare and unrepresentative of the general credit card experience, and continues to do nothing. A week afterwards, another charge. This time your credit card company describes how they’re planning on implementing a brand new anti-fraud system, but says that this is unrelated to any events that may currently be occurring and gives no details as to when it’s going to be rolled out. And then they proceed to ignore any further reports you make about fraudulent transactions.

Would you stay with this company? Or would you take your business somewhere else?

The problem with the “Let’s look to the future rather than spending too much time getting stuck in the present” argument is that it assures people that things will get better without providing a roadmap for getting there. It does nothing to validate their concerns or make them feel wanted within a community. It assumes either that people will stick with a community that doesn’t respond to their complaints, or that it’s possible to construct a community that’s welcoming to an assortment of genders, ethnicities and lifestyles without any of those people being represented in the first place.

Ignoring people’s concerns is an excellent way to drive them away from your community. Doing so because of a potential future that’s probably conditional on you having those people in your community is short-sighted and self-defeating. Ignoring the present doesn’t benefit the future. It benefits the status quo.