Showing posts with label genetics. Show all posts

29 December 2008

A combinatorial problem from Crick

I recently read What Mad Pursuit: A Personal View of Scientific Discovery, which is Francis Crick's account of the "classical period" of molecular biology, from the discovery of the double helix structure of DNA to the eventual figuring out of the genetic code. It differs from the better-known book by James Watson, The Double Helix: A Personal Account of the Discovery of the Structure of DNA, which focuses more on the characters involved and less on the science.

Crick was trained as a physicist, and learned some mathematics as well, and every so often this pokes through. For example, back when the nature of the genetic code wasn't known, combinatorial problems arose in trying to prove that a genetic code of a certain type was or was not possible. One idea, due to Gamow and Ycas, was that since there are twenty combinations of four bases taken three at a time where order doesn't matter, perhaps each one of those corresponded to a different amino acid. This turned out to be false. Another, more interesting problem comes from asking how the cell knows where to begin reading the code. What is the largest size of a collection of triplets of four bases such that if UVW and XYZ are both in the collection, then neither VWX nor WXY is? The reason for this constraint is so that the "phase" of a genetic sequence is unambiguous; if we see the sequence UVWXYZ, we know to start reading at the U, not the V or the W. Thus the collection can't contain any triplet in which all three elements are the same (XXX followed by XXX reads as XXX in every phase), and it can contain at most one of {XYZ, YZX, ZXY} for any bases X, Y, Z, not necessarily distinct (XYZ followed by XYZ contains both YZX and ZXY out of phase). There are sixty triplets where not all three elements are the same, and they fall into twenty such groups of three, so at most twenty amino acids can be encoded in such a code. There are solutions that achieve twenty; see the paper of Crick, Griffith, and Orgel.
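The constraint is small enough to check by brute force. Here is a minimal Python sketch of that check; it is my own illustration, not code or notation from the Crick, Griffith, and Orgel paper. The base letters "ACGU", the function names, and the rule "first letter below the middle letter, middle letter at least the last" are all choices made for this example. That rule is just one candidate that happens to pass the check under alphabetical ordering of the letters; the code confirms it rather than assuming it.

```python
from itertools import product

BASES = "ACGU"  # illustrative labels; any four distinct symbols would do


def is_comma_free(code):
    """True if no out-of-phase triplet of any concatenation UVWXYZ is in the code."""
    code = set(code)
    for first, second in product(code, repeat=2):
        six = first + second
        # six[1:4] is VWX and six[2:5] is WXY in the notation of the post
        if six[1:4] in code or six[2:5] in code:
            return False
    return True


def cyclic_classes(bases):
    """One representative per group {XYZ, YZX, ZXY}, skipping XXX triplets."""
    reps = set()
    for letters in product(bases, repeat=3):
        w = "".join(letters)
        if len(set(w)) == 1:
            continue  # XXX can never be in a comma-free code
        reps.add(min(w, w[1:] + w[0], w[2:] + w[:2]))
    return reps


def candidate_code(bases):
    """Triplets xyz with x < y >= z (an assumed construction, verified below)."""
    return {x + y + z for x, y, z in product(bases, repeat=3) if x < y >= z}


if __name__ == "__main__":
    code = candidate_code(BASES)
    print(len(cyclic_classes(BASES)), "cyclic classes (upper bound)")
    print(len(code), "triplets in the candidate code")
    print("comma-free:", is_comma_free(code))
```

The checker compares every ordered pair of codewords, including a codeword with itself, which is what rules out both XXX and two rotations of the same triplet.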

Note that the "20" in the two types of code here arises in different ways. If we assume a triplet code with n bases, then the first type of code can encode as many as n(n+1)(n+2)/6 amino acids, the second (n^3 - n)/3.
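As a sanity check on these two formulas, here is a short Python sketch that counts both quantities directly for small n. The function names are made up for this example. For the second type of code it counts the groups of three rotations, which is the upper bound from the counting argument; it does not construct a code.

```python
from itertools import combinations_with_replacement, product


def unordered_triplets(n):
    """Number of triplets of n bases where order doesn't matter (repeats allowed)."""
    return sum(1 for _ in combinations_with_replacement(range(n), 3))


def rotation_classes(n):
    """Number of groups {XYZ, YZX, ZXY} among triplets that are not XXX."""
    reps = set()
    for w in product(range(n), repeat=3):
        if w[0] == w[1] == w[2]:
            continue
        reps.add(min(w, w[1:] + w[:1], w[2:] + w[:2]))
    return len(reps)


if __name__ == "__main__":
    for n in range(1, 7):
        first = unordered_triplets(n)
        second = rotation_classes(n)
        # Compare the brute-force counts with the closed forms from the text.
        assert first == n * (n + 1) * (n + 2) // 6
        assert second == (n**3 - n) // 3
        print(n, first, second)
```

The two counts agree at n = 4 and differ for other values of n greater than 1, which is the sense in which the two "20"s arise in different ways.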

Crick says that the more general problem of enumerating the number of codes which imply their own "reading frame" was considered by Golomb and Welch, and separately Freudenthal. Based on the title and the date, I think the first of these is the paper I point to below -- but our library doesn't have that journal in electronic form, and the physical library is closed this week!

References
F. H. C. Crick, J. S. Griffith, L. E. Orgel. Codes Without Commas. Proceedings of the National Academy of Sciences of the United States of America, Vol. 43, No. 5 (May 15, 1957), pp. 416-421.

George Gamow, Martynas Ycas. Statistical Correlation of Protein and Ribonucleic Acid Composition. Proceedings of the National Academy of Sciences of the United States of America, Vol. 41, No. 12 (Dec. 15, 1955), pp. 1011-1019.

Golomb, S.W., Gordon, B., and Welch, L.R., "Comma-Free Codes", The Canadian Journal of Mathematics, Vol. 10, 1958. (Citation from this list of Golomb's publications; I haven't read it.)

21 July 2007

why is there mental illness?

Mark Dominus at the Universe of Discourse makes an argument that homosexuality could be hereditary and yet still not ruled out by natural selection. Basically, the argument is that human sexuality is very complicated and isn't shaped by a single gene (which is patently obvious). We make the assumption that people having more than some number of "gay genes" turn out homosexual and people having less than that number turn out heterosexual. Then in a family where there are lots of people with lots of "gay genes", occasionally one of the kids turns out to be gay and doesn't reproduce, but then this person takes care of their nephews and nieces.

I'm not sure if I believe this, mainly because I know of no evidence that gay people actually pay more attention to their relatives who are not their children than straight people do. (Of course, just because I haven't heard it doesn't mean it's not there.)

Furthermore, on average people share half their genes with their children and one-quarter of them with a niece or nephew. So in order for this to work out in some sort of "expected value" framework, a gay person would have to be able to enhance the survival probability (or, more accurately, the expected number of children, or grandchildren) of their nephews and nieces by twice as much as they'd help their children, if they had them.

However, this could have an effect in times when "expected value" isn't what really matters -- when a family (and therefore a set of people with similar genes) is just barely clinging to life, very close to dying out. The logic then is that a straight person and their gay siblings can put all their eggs in one basket -- and then watch that basket very carefully.

Although I've never heard this sort of argument applied to homosexuality, I have heard it applied to various mental illnesses. There are people who believe that although, say, schizophrenia is obviously very harmful to the people who suffer from it, certain good qualities (say, high intelligence -- I don't remember if this is actually one of them!) tend to occur in the near relatives of schizophrenics. (Let me just say that I in no way am attempting to compare homosexuality and schizophrenia.)

Let's say, hypothetically, that there are ten genes which cause schizophrenia, each of which occurs in two variants called "red" and "blue". Let's say that each of these is "red" with probability p and "blue" with probability q = 1-p, independently of the others. Furthermore, a person who has zero or one of the "red" genes is "normal"; one who has two is of high intelligence, or "smart"; one who has three or more is schizophrenic. Assuming people mate at random, we are in a state of Hardy-Weinberg equilibrium. We compute:

  • The probability of a person having zero or one "red" genes is P1 = q^10 + 10 q^9 p.
  • The probability of a person having two "red" genes is P2 = 45 q^8 p^2.
  • The probability of a person having three or more "red" genes is P3 = 1 - (P1 + P2). (It can be written out as the sum of the probabilities of having 3, 4, ..., 10 red genes, but it's easier to compute this way.)

Now, consider various proportions of "red" genes in the gene pool; what are the probabilities of a randomly selected person being smart? Schizophrenic?
p    .02     .05     .10     .15     .20     .25     .30     .35
P2   .01531  .07463  .19371  .27590  .30199  .28157  .23347  .17565
P3   .00086  .01550  .07019  .17980  .32220  .47441  .61722  .73839
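The numbers in this table can be recomputed from the formulas for P1, P2 and P3 given earlier. Here is a minimal Python sketch, assuming (as the model does) ten genes that are each "red" independently with probability p, so that the number of red genes is binomial. The function names are just for this example.

```python
from math import comb

GENES = 10  # number of genes in the hypothetical model


def prob_red(k, p, genes=GENES):
    """Probability of exactly k "red" genes out of `genes`, each red with probability p."""
    return comb(genes, k) * p**k * (1 - p) ** (genes - k)


def class_probabilities(p):
    """Return (P1, P2, P3): normal (0 or 1 red), smart (2 red), schizophrenic (3+ red)."""
    p1 = prob_red(0, p) + prob_red(1, p)
    p2 = prob_red(2, p)
    p3 = 1 - (p1 + p2)  # complement, as in the text
    return p1, p2, p3


if __name__ == "__main__":
    print("p     P2       P3")
    for p in (0.02, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35):
        _, p2, p3 = class_probabilities(p)
        print(f"{p:.2f}  {p2:.5f}  {p3:.5f}")
```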

What we see here is clear. When the frequency of red genes is low, most people are normal. When the frequency is moderate, we see a large minority of smart people and a small minority of schizophrenics. When the frequency is high, the schizophrenics begin to outnumber everybody else. Presumably, then, evolution would want (and here I commit the common sin of anthropomorphizing evolution) a moderate frequency of the "red" genes. As for how that is created, I think that assuming that high intelligence has survival value will do it; when the red genes are rare, "normal" people are the most common but the smart people will out-reproduce them, increasing the frequency of red genes; and when the red genes are common, schizophrenics are the most common but the smart people will out-reproduce them, decreasing the frequency of red genes. But even at the equilibrium point, not everyone will have exactly two red genes, which is what you need to be smart in this model. So there will still be variation.

Something similar actually goes on with the inheritance of sickle-cell anemia; having two copies of a certain allele gives people sickle-cell anemia, but having one copy of that allele confers resistance to malaria.

08 July 2007

what does "losing 35 IQ points" mean?

The Gregarious Brain, in today's New York Times magazine. The article is about people who suffer from Williams syndrome, and focuses on the fact that people with Williams have trouble with abstract thought and have "exuberant gregariousness and near-normal language skills". These individuals don't have the best social skills, though, which makes one feel sorry for them -- they want to connect, but they can't.

This comes about because certain genes don't act during the formation of the brain: the part of the brain which deals with abstract thought (the dorsal part) is underdeveloped, while the part dealing with language (the ventral part) develops normally.

What grabbed me, mathematically, was this:

These deficits generally erase about 35 points from whatever I.Q. the person would have inherited without the deletion. Since the average I.Q. is 100, this leaves most people with Williams with I.Q.’s in the 60s. Though some can hold simple jobs, they require assistance managing their lives.

Does this mean anything? Obviously "I.Q. points" are not something which just sits there in our brain. If my IQ is, say, 130, there aren't 130 little blobs sitting there in my brain which do my thinking for me -- or even 1.3 times as many such little blobs as the average person has. (I know, you might be thinking that neurons are such "little blobs", but brain size isn't correlated very well with intelligence.) Furthermore, IQ scores are set up to have a Gaussian distribution, which I suspect is not the right thing to do. Perhaps the intelligence of individuals whose brains have developed "normally" is normally distributed, but the fact that there are a large number of disorders which "take away" IQ points makes me think there'd be a bump around, say, 60 or 70 IQ points -- a couple standard deviations below the median.
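To make the "bump" idea concrete, here is a rough Python sketch of a two-part mixture: most people drawn from a normal distribution with mean 100, and a small group whose scores are shifted down by 35 points. Everything numerical here beyond the 100 and the 35 is an assumption for illustration. The standard deviation of 15 is the usual IQ scoring convention, the shifted group is assumed to keep the same spread, and the 2% affected fraction is purely hypothetical, not an estimate of anything.

```python
from statistics import NormalDist

TYPICAL = NormalDist(mu=100, sigma=15)       # usual IQ scaling (assumed)
SHIFTED = NormalDist(mu=100 - 35, sigma=15)  # same spread, 35 points lower (assumed)
AFFECTED_FRACTION = 0.02                     # hypothetical share of the population


def mixture_pdf(iq, fraction=AFFECTED_FRACTION):
    """Density of the mixed population at a given IQ score."""
    return (1 - fraction) * TYPICAL.pdf(iq) + fraction * SHIFTED.pdf(iq)


if __name__ == "__main__":
    # Ratio of the mixture density to the plain Gaussian density at a few scores.
    for iq in (55, 65, 75, 100, 130):
        ratio = mixture_pdf(iq) / TYPICAL.pdf(iq)
        print(iq, round(ratio, 2))
```

Whether the result is a visible second bump or just a fatter left tail depends on how large the affected fraction is; the sketch only shows that scores in the 60s become more common than a single Gaussian would predict.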

And that's only if intelligence is normally distributed to begin with. You'd expect that if intelligence were the sum of a bunch of independent effects, but I suspect there's some sort of synergy going on where "the whole is greater than the sum of its parts" -- a moderately above-average ability in, say, spatial reasoning and computational ability might make a better mathematician than someone who's really good at one of those and only average at the other. In general there are lots of complex skills which are made up of simpler skills in this way.

I suspect the central part of the distribution is approximately normal, though; the weirdness probably goes on with the very smart or the very stupid.