God Plays Dice: calendars

19 January 2009

Back-to-back days off for some

Ken Jennings points out that today and tomorrow are both federal holidays (tomorrow only in the Washington, DC area), meaning that employees of the US federal government who work in the DC area will get both days off. Have there ever been two consecutive federal holidays before?

For those who don't know the schedule of holidays: the third Monday in January (that's today) is Martin Luther King, Jr. Day. January 20, in years with number one more than a multiple of four (that's tomorrow), is Inauguration Day. The reason that federal employees in the DC area get it off, if I understand correctly, is to keep the traffic down. (Not that it'll help tomorrow; from what I understand Washington will still be a mess.) As Ken Jennings points out, these days are consecutive if January 20 falls on a Sunday or a Tuesday.

Now, a presidential term is a whole number of weeks (208, to be exact) and five days long. (How do I know this? A common year is one day longer than a whole number of weeks; a leap year is two days longer than a whole number of weeks; thus a presidential term, consisting of three common years and a leap year, has five "extra" days.) So we can work backwards. The 2009 inauguration is on a Tuesday; the 2005 inauguration was on a Thursday. The 2001 inauguration was on a Saturday (which I could have told you anyway; I got in a car accident that day and remember the circumstances pretty well). 1997 was a Monday, 1993 was a Wednesday, 1989 was a Friday, and 1985 was a Sunday.

But Martin Luther King Day wasn't observed for the first time until 1986. So the answer is that, before this year, Martin Luther King Day and the inauguration had never fallen on consecutive days. The pattern of when they do is kind of complicated, because leap years are periodic with period 400. But the 2013 inauguration falls on a Sunday; 2037 and 2041 are a Tuesday and Sunday, respectively; and most of the time these come in pairs: a Tuesday inauguration is followed by a Sunday inauguration. (The end of a century could break this pattern.)
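The weekdays worked out by hand above can be checked mechanically. This is only a sketch using Python's standard `datetime` module; the helper name `inauguration_weekday` is made up for illustration.

```python
import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def inauguration_weekday(year):
    """Day of the week of January 20 in the given year."""
    return DAYS[datetime.date(year, 1, 20).weekday()]

# A presidential term is three common years plus one leap year.
term_days = 3 * 365 + 366
print(term_days, divmod(term_days, 7))  # 1461 (208, 5)

# January 20 sits next to a Monday exactly when it is a Sunday or a Tuesday.
for year in range(1985, 2042, 4):
    day = inauguration_weekday(year)
    note = "  <- next to a Monday" if day in ("Sunday", "Tuesday") else ""
    print(year, day + note)
```

The loop stops at 2041 on purpose; it makes no attempt to explore what happens around 2100, where the missing leap year shifts the pattern.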

However, the answer to Ken Jennings' actual question is yes, because of an obscure piece of trivia I just remembered: January 2, 2007 was a federal holiday, an official day of mourning for President Ford. (For some reason I remember checking my mail and being surprised there was none. This is strange, because there are lots of days where mail is delivered and I don't get any.) January 1, 2007, was of course New Year's Day. And December 31, 2006 was a Sunday, so there was actually no mail for three days.

10 April 2008

e day

I have been informed that tomorrow is "e day", February 71st. (In American date notation, that's 2/71; compare "π day" which is March 14th.)

In non-leap years, "e day" is still February 71st, but that's equivalent to April 12th.
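For anyone who wants to check the "February 71st" arithmetic, here is a tiny sketch; `february_nth` is a hypothetical helper that simply pretends February never ends.

```python
import datetime

def february_nth(year, n):
    """The real date of 'February n', counting on past the end of February."""
    return datetime.date(year, 2, 1) + datetime.timedelta(days=n - 1)

print(february_nth(2008, 71))  # 2008-04-11 (leap year)
print(february_nth(2007, 71))  # 2007-04-12 (common year)
```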

It's too bad Euler wasn't born a few days earlier; he was born on April 15, 1707.

21 March 2008

Easter's early this year. Deal with it.

Family holidays ruined by earliest Easter in 90 years (from the Daily Mail).

About halfway down, a formula is given:

It may look daunting to non-mathematicians but the fiendishly complex formula used to work out when Easter actually falls is:

((19*t+u-w-(u-(u+8)\25)+1)\3)+15)mod30)+(32+2*x+2*y-(19*t+u-w- (u-(u+8)\25)+1)\3)+15)mod30)-z)mod7)-7*(t+11*(19*t+u-w(u- (u+8)\25)+1)\3)+15)mod30)+22*(32+2*x+2*y-(19*t+u-w-(u- (u+8)\25)+1)\3)+15)mod30)-g)mod7)+114)\31

Um, do you understand that formula? I think I know why some of the numbers are there -- the 31 at the end probably has something to do with the length of months, the 7 with the length of weeks, and the 19 with the Metonic cycle. Also, any sane mathematician wouldn't write the formula like that. First, there are repeated subexpressions like that ((u + 8) \ 25 + 1); I'd just call that by some other name and be done with it. Second, the formula just sits there in the middle of the article; this gives people the idea that mathematicians are freaks of nature who think in formulas. What do the variables mean?

If you're curious, there is an algorithm at the Calendar FAQ. Easter is the first Sunday after the first (computed) full moon on or after the vernal equinox (calculated, and assumed to be March 21). The algorithm reflects this. First, assume that the Metonic cycle, which says that lunar phases repeat every 19 solar years, is exactly correct in the Julian calendar. (The algorithm was invented back when the Julian calendar was used.) Then make two corrections, one for the fact that the Julian calendar includes leap years that the Gregorian doesn't (years divisible by 100 but not 400) and one for the fact that the Metonic cycle's a bit off. (The expression "(u+8)\25" in the formula above comes from the second correction.) This gives the date of the full moon. Presumably if you've gotten this far you already know what the days of the week are.

Anyway, the cycle of Easter dates repeats itself every 5,700,000 years. The cycle of epacts (which encode the date of the full moon) in the Julian calendar repeats every nineteen years. There are two corrections made to the epact, each of which depends only on the century; one repeats (modulo 30, which is what matters) every 120 centuries, the other every 375 centuries, so the pair of them repeats every 300,000 years. The days of the week are on a 400-year cycle, which doesn't matter because that's a factor of 300,000. So the Easter cycle has length the least common multiple of 19 and 300,000, which is 5,700,000.
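The cycle-length arithmetic is just a couple of least common multiples. A quick sketch, taking the periods (19 years, 120 centuries, 375 centuries, 400 years) from the paragraph above on trust:

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

# The two century-based epact corrections repeat every 120 and 375 centuries.
correction_period_years = 100 * lcm(120, 375)
print(correction_period_years)           # 300000

# The 400-year weekday cycle already divides that period.
print(correction_period_years % 400)     # 0

# Combine with the 19-year cycle of epacts.
print(lcm(19, correction_period_years))  # 5700000
```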

This whole computation is known as the computus (Latin for "computation"; I guess it was just that important at the time). Not surprisingly, Gauss had an algorithm which is much easier. Let Y be the current year. Then take:

a = Y mod 19

b = Y mod 4

c = Y mod 7

d = (19a + M) mod 30

e = (2b + 4c + 6d + N) mod 7

where M and N are constants depending on the century that don't look that hard to calculate, and which I assume are the corrections I alluded to above; the Wikipedia article gives them in a table. Then Easter falls on the (d+e+22)th of March or the (d+e-9)th of April, with certain exceptions which move it up a week when this algorithm gives a very late date for Easter. Basically, d finds the date of the full moon (so M is something like the epact) and e finds the day of the week. In the case of this year you get a = 13, b = 0, c = 6; a table gives M = 24, N = 5 for this century, so d = 1, e = 0, and Easter is on the 23rd of March.
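Here is a minimal sketch of the steps above in code. The function name `gauss_easter` is made up, M = 24 and N = 5 are the values quoted for this century, and the late-date exceptions are deliberately left out, so treat the output for other centuries (or for years that hit an exception) with suspicion.

```python
def gauss_easter(year, M=24, N=5):
    """Return (month, day) of Easter by the basic Gauss recipe.

    M and N are century-dependent constants; the defaults are the values
    quoted above for this century. The exceptions that pull a very late
    Easter back by a week are NOT implemented here.
    """
    a = year % 19
    b = year % 4
    c = year % 7
    d = (19 * a + M) % 30
    e = (2 * b + 4 * c + 6 * d + N) % 7
    march_day = d + e + 22
    if march_day <= 31:
        return (3, march_day)
    return (4, d + e - 9)

print(gauss_easter(2008))  # (3, 23)
print(gauss_easter(2007))  # (4, 8)
```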

As for when Easter usually falls, well, go back to the original description: Easter is the date of the first Sunday after the first full moon on or after March 21. To me this seems like adding two random variables -- the number of days between March 21 and the first full moon, which is roughly uniformly distributed over [0, 29], and the number of days between that moon and the next Sunday, which is uniformly distributed over [1, 7]. There are 210 ordered pairs in ([0, 29] × [1, 7]). One of them sums to 1, giving an Easter date of March 22 in about one year out of 210. Two sum to 2, giving an Easter date of March 23 in two years out of 210. Three sum to 3 (March 24), ..., six sum to 6 (March 27). Seven sum to each of 7 through 30, giving Easter dates of each of March 28 through April 20 in seven years out of 210. Six sum to 31, giving April 21 in six years out of 210, ..., one sums to 36, giving April 26 in one year out of 210.
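The counting argument above is easy to reproduce by brute force. This sketch enumerates all 210 pairs under the same idealized assumption (both quantities independent and uniform); the year 2008 is used only to turn offsets from March 21 into printable dates.

```python
from collections import Counter
import datetime

# Days from March 21 to the full moon (0..29) plus days from that moon
# to the following Sunday (1..7), all pairs equally likely.
counts = Counter(moon + sunday
                 for moon in range(0, 30)
                 for sunday in range(1, 8))

total = sum(counts.values())  # 210 ordered pairs
for offset in sorted(counts):
    date = datetime.date(2008, 3, 21) + datetime.timedelta(days=offset)
    print(f"{date.month}/{date.day}: {counts[offset]} out of {total}")
```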

Indeed, this is basically what computations show, except that for some reason, when the methods given above call for Easter to be on April 26 it gets moved up to April 19. But basically the distribution of Easter dates is just a convolution of two uniform distributions! The Wikipedia article on the computus has a nice graph.

And I have no sympathy for the people quoted in that article. They've known this was coming since 1752, when the UK changed over to the Gregorian calendar. (It perhaps says something about me that I have more sympathy for the bakeries with lots of Irish patrons that are unhappy because Easter was only six days after St. Patrick's day this year.)

01 March 2008

A leap year scheme based on binary expansions

Yesterday I wrote about leap day, and how a different scheme of determining which years are leap years could make calendrical calculation easier. In particular, the number of days in a year is very nearly 365+31/128; how can we pick 31 years out of every 128 to be leap years? (As was pointed out in the comments, 128 is a power of two, which is what makes this whole post work.)

The answer is obvious -- take every fourth year, except don't take years divisible by 128.

But then I asked -- what if we needed to take 33 years out of every 128? We clearly should take every fourth year... and then one more out of every 128. But which one?

I'm implicitly using the fact 31/128 = 1/4 - 1/128 and 33/128 = 1/4 + 1/128. But we can also write:

33/128 = 1/2 - 1/4 + 1/128.

Why would I do this? Because it gives a very good scheme for assigning 33 leap years out of every 128. Include in the set of leap years all years which are even, but not those that are divisible by 4, but do include those which are divisible by 128. So in every 128-year period we include the year 0, and the years 2, 6, 10, ..., 126.

But there's a nicer way to express that. Look at the binary expansion of such a year. Either it ends in exactly one 0 (it's 2 more than a multiple of 4) or it ends in at least seven 0s (it's a multiple of 128). It turns out that for any fraction of the form m/2ⁿ, where 0 ≤ m < 2ⁿ, we can write m/2ⁿ as an alternating sum of powers of 1/2. For example, consider

59/128 = 1/4 + 1/8 + 1/16 + 1/64 + 1/128.

where that's just the ordinary binary expansion. We can group the consecutive powers of 2 in the binary expansion together to get

59/128 = (1/4 + 1/8 + 1/16) + (1/64 + 1/128)

and then each sum of consecutive powers can be written as a difference, giving

59/128 = 1/2 - 1/16 + 1/32 - 1/128.

So let's say we want 59 leap years out of every 128. We include all the even years, but we don't include those that are divisible by 16, but we do include those that are divisible by 32, but then we don't include those that are divisible by 128.

It sounds complicated -- but there's a better way to say it. If you think about it, the rule I just gave says that the binary expansion of a leap year must end in 1, 2, 3, 5, or 6 zeroes. Write 59/128 = .0111011₂. Now, there are 1s in exactly the 2nd, 3rd, 4th, 6th, and 7th places after the binary point. That's not a coincidence. A proportion 1/2ⁿ⁺¹ of all integers have binary expansions ending in exactly n zeroes. In general, if we want m/2ⁿ of our years to be leap years, then we can determine if any given year k is a leap year via a scheme like this, as follows:
- let p be the number of zeroes terminating the binary expansion of k.
- if the (p+1)st bit of m/2ⁿ after the binary point is 1, then k is a leap year, otherwise it's a common year.

The years for which we examine the jth bit are exactly 1/2^j of all years, so this works.
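The two-step rule above translates almost directly into code. A sketch, with made-up function names, assuming years are numbered by positive integers and that a year with n or more trailing zeroes (where every remaining bit of m/2ⁿ is 0) is a common year:

```python
def trailing_zeros(k):
    """Number of zeroes ending the binary expansion of a positive integer k."""
    return (k & -k).bit_length() - 1

def is_leap(k, m, n):
    """Leap-year test for a scheme with m leap years in every 2**n years."""
    if k <= 0:
        raise ValueError("this sketch only handles positive year numbers")
    j = trailing_zeros(k) + 1          # which bit of m / 2**n to inspect
    if j > n:
        return False                   # bits beyond the n-th are all 0
    return bool((m >> (n - j)) & 1)    # j-th bit after the binary point

for m in (31, 33, 59):
    leap = [k for k in range(1, 129) if is_leap(k, m, 7)]
    zero_runs = sorted({trailing_zeros(k) for k in leap})
    print(m, len(leap), zero_runs)
```

For each m the second number printed is the count of leap years in one 128-year cycle, and the list shows how many trailing zeroes those years have.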

For 31/128 = .0011111, this says that a year should be a leap year if its binary expansion ends in exactly 2, 3, 4, 5, or 6 zeroes -- exactly the rule I suggested in the first place. For 33/128 = .0100001, a year is a leap year if its binary expansion ends in exactly 1 or 6 zeroes. That's one flaw with this scheme -- the set of leap years changes radically as m/2ⁿ passes through some small power of (1/2). But that wasn't my aim here; my aim was to be able to read off if a year is a leap year directly from the binary expansion, just as one can almost do with the decimal expansion in the current scheme. The 8/33 scheme I talked about yesterday doesn't have this property in any small base, although I made the argument that since 33 = (100-1)/3 there are worse situations to be in.

(Exercise for the reader: can you come up with a scheme like this in decimal? Calling this an "exercise" isn't quite fair, because I don't know if it's possible.)

29 February 2008

Leap days

Mark Dominus has written an interesting post about leap days, of which today is one. I welcome this turn of events, because it means I don't have to! But I have a few things to add.

The reason that today is February 29 -- and not March 1 like it would be in a normal year -- is because the year is almost a quarter of a day longer than 365 days. So we add an extra day every four years. But not quite -- it's actually more like 365.24 days -- so we leave out one of these extra days every century. But no, it's really more like 365.2425 days -- so we add back in one of the days we got rid of every four centuries. (We did that in 2000, as you may remember. 2000 was a leap year.)

The real number of days per year is something like 365.24219; thus what one wants to do is to find a rational approximation p/q to .24219 and add p leap years every q years. The approximation we use, 97/400, actually isn't that good for a number with such a large denominator; 8/33, a convergent of the continued fraction of .24219, is better. Dominus proposes having leap years in those years congruent to 4, 8, 12, ... , 32 mod 33, a scheme that was also suggested in 1996 by Simon Cassidy on Usenet, complete with the same choice 4, ..., 32 of leap years in each 33-year cycle. The reason for this coincidence is that 1984, 1988, ..., 2012 are leap years in both systems.
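To see where 8/33 comes from, one can expand .24219 as a continued fraction and read off the convergents. This is a sketch with exact rational arithmetic, treating .24219 as exactly 24219/100000 (which it isn't, so only the first few convergents mean anything); `convergents` is a hypothetical helper.

```python
from fractions import Fraction

def convergents(x, count):
    """First `count` continued-fraction convergents of a non-negative Fraction."""
    result = []
    h_prev, h = 0, 1   # numerators   h[-2], h[-1]
    k_prev, k = 1, 0   # denominators k[-2], k[-1]
    while len(result) < count:
        a = x.numerator // x.denominator        # next partial quotient
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append(Fraction(h, k))
        remainder = x - a
        if remainder == 0:
            break
        x = 1 / remainder
    return result

excess = Fraction(24219, 100000)
for c in convergents(excess, 5):
    print(c, float(abs(c - excess)))

# The Gregorian 97/400 for comparison:
print(Fraction(97, 400), float(abs(Fraction(97, 400) - excess)))
```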

And although dividing by 33 is hard, as Dominus points out, it's not that hard to reduce a number modulo 33 in your head. The year abcd (where a, b, c, and d are digits) reduces modulo 99 to ab + cd; for example, 2008 reduces to 28 modulo 99. Then just subtract 33 or 66 as necessary. I'd rather have to do lots of reduction modulo 33 in my head than, say, modulo 37. (And with 33 there's probably some slick Chinese-remainder-theorem method that allows you to reduce a number mod 33 by reducing it mod 3 and 11, both of which are easy.) In any case, there's no reason we couldn't have a calendar that requires more complicated arithmetic than the current one; most people are not in the business of calculating calendars. (And calendars are calculated by computers now; Dominus points out his new rule actually requires less computation than the old one. The 97/400 rule is an artifact of the fact that we work in decimal.)
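The mental trick above (add the two halves of the year, then knock off 33s) can be spelled out and checked against ordinary remainders. A sketch; the function names are invented, and `is_leap_33` encodes the 4, 8, ..., 32 rule described earlier.

```python
import calendar

def mod33(year):
    """Reduce a year modulo 33 the 'in your head' way."""
    ab, cd = divmod(year, 100)
    r = ab + cd           # same residue mod 99 as the year, since 100 = 99 + 1
    while r >= 33:        # 33 divides 99, so peel off 33s
        r -= 33
    return r

def is_leap_33(year):
    """Leap years are those congruent to 4, 8, ..., 32 modulo 33."""
    return mod33(year) in range(4, 33, 4)

print(mod33(2008))  # 28
print([y for y in range(1980, 2020)
       if is_leap_33(y) and calendar.isleap(y)])
```

The list printed at the end is the run of years that are leap years under both rules.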

As is pointed out both by Cassidy and by this page, Jesus is traditionally said to have lived for 33 years; for some people that might lend additional appeal to a 33-year cycle.

Edit (8:23 am): There's also something to be said for a system that includes 31 leap years every 128 years -- have leap years in years divisible by 4, except those divisible by 128 -- this was apparently suggested by Johann Heinrich von Mädler, and would only be off by something like a quarter-second a year. But striving for such accuracy is silly, because the length of the year isn't actually constant.

26 February 2008

"One Giant Leap For Babykind"

One Giant Leap for Babykind, from the (Feb. 25?) New York Post.

A mother born February 29, 1980 is due to have a baby on February 29, 2008. (Her doctors have said they'll induce labor that day if it doesn't happen naturally.)

Now, one out of every 1461 days is a leap day; assuming that the events of a mother and a baby being born on that day are independent (which seems reasonable), this should be true for about one in every (1461)² = 2,134,521 mother-baby pairs. So about 140 people in the U. S. should be born on February 29 and also have a mother born on that day.

I was actually reading the article expecting them to make some sort of mathematical mistake -- this feels like the sort of place where an "expert" is consulted and gives some ridiculous figure -- but they managed to restrain themselves.

01 January 2008

Writers' strike day -307

Here in the United States, the Writers Guild of America -- who write a lot of the television shows -- are on strike, and have been for two months. As a result, a lot of TV shows are in reruns, and the networks are starting to move into "reality" television which doesn't need writers.

Anyway, zap2it.com, a web page with TV listings, currently has a banner ad reading: "WRITERS' STRIKE DAY -307: Find out how the Hollywood writers' strike will affect you." The strike actually started on November 5, 2007; this is day 58.

Apparently whoever wrote the code which automatically generates these banners didn't consider the possibility that the strike might run into 2008. And 308 days from today (the putative "Day 1" if the count increments by one each day) is November 4, not November 5... 2008 is a leap year! If I had to guess, I'd say that the code incorporates the fact that the strike started on the 309th day of the year... which is November 5 in an ordinary year, but November 4 in a leap year.
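To make that guess concrete, here is a sketch of what the banner code might have been doing, next to a version that counts actual elapsed days. The buggy formula is pure speculation on my part; I have not seen the real code.

```python
import datetime

STRIKE_START = datetime.date(2007, 11, 5)

def banner_day_guess(today):
    """Hypothetical buggy counter: day-of-year minus the start's day-of-year."""
    start_day_of_year = 309   # November 5 in a common year
    return today.timetuple().tm_yday - start_day_of_year + 1

def strike_day(today):
    """Day number of the strike, counting the first day as day 1."""
    return (today - STRIKE_START).days + 1

new_years_day = datetime.date(2008, 1, 1)
print(banner_day_guess(new_years_day))  # -307
print(strike_day(new_years_day))        # 58

# The day on which the guessed counter would read "Day 1" again:
print(new_years_day + datetime.timedelta(days=308))  # 2008-11-04
```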

And do people remember how occasionally, around eight years ago, you'd see web sites referring to "19100" for 2000, "19101" for 2001, and so on, since the code which was automatically generating the dates hadn't been fixed for Y2K? This error reminds me of that, although there's no mathematical similarity. But they're both "stupid calendar tricks".

(While I'm on the topic of calendars, check out Claus Tondering's Calendar FAQ.)

25 November 2007

compressed calendars and the Doomsday algorithm

Infodesign challenge -- how to fit a calendar for a year onto a business card.

There are many solutions; basically it seems that in order to do this well one has to exploit the fact that our calendar has some sort of structure. In short, all months look the same if you just rename the days of the week.

#2, for example, takes this into account: in a non-leap year January and October "look the same", as do February, March and November; September and December; April and July.
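The groupings are easy to recover by sorting months according to the weekday of their first day. A small sketch (month lengths are ignored, so "look the same" here only means "start on the same weekday"):

```python
import datetime
from collections import defaultdict

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def months_by_first_weekday(year):
    """Group month names by the weekday on which the month begins."""
    groups = defaultdict(list)
    for number, name in enumerate(MONTHS, start=1):
        groups[datetime.date(year, number, 1).weekday()].append(name)
    return list(groups.values())

for group in months_by_first_weekday(2007):   # 2007 is a non-leap year
    print(", ".join(group))
```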

Incidentally, this is not a problem I particularly worry about, because I know the Doomsday algorithm. In short:

It is relatively easy to determine what day of the week the last day of February falls on in a given year. (Because of leap years, a lot of calendar algorithms focus on this day. There are some cases where for the purposes of formulas it is convenient to think of January and February as the 13th and 14th months of the previous year.) This day of the week is called "Doomsday". Doomsdays run in a 400-year cycle, as does the entire Gregorian calendar. Doomsday in 1900 was a Wednesday, and in 2000 it was a Tuesday. For every twelve years after a multiple of 100 Doomsday moves forward one day; this is because twelve years include three leap years, so Doomsday actually moves forward 15 days. (So Doomsday in 2012 is a Wednesday, in 2024 a Thursday, and so on.) So take the last two digits of the year and divide by 12; call the quotient q and the remainder r. Move Doomsday forward from its day in the last century year by q+r days (each dozen years moves Doomsday forward fifteen (= one) days, then each year after that moves it forward one more day) plus r/4 days rounded down (for leap years). So, for example, Doomsday in 2000 was Tuesday; 7 divided by 12 is 0, with remainder 7. So Doomsday in 2007 is Tuesday, plus zero days, plus seven days, plus one day (7/4 rounded down is 1); that's Wednesday. Indeed, February 28 was a Wednesday this year.

Certain days will always fall on the same day of the week as that day; the easiest set to remember is 4/4, 6/6, 8/8, 10/10, 12/12, 5/9, 9/5, 7/11, 11/7 [note that for the set named so far, it doesn't matter whether you put the month first or the day first!], 3/7, 2/(28 or 29), 1/(3 or 4). (A few others that stick in my head are 2/14 (Valentine's Day) in ordinary years, 7/4 (Independence Day), 10/31 (Halloween) and 12/26 (the day after Christmas).) All of these days were Wednesdays this year as well.

Once you have a set that includes at least one day in each month, it's not hard to work within each month, probably because we have a decent amount of practice doing so. November 7 was a Wednesday; thus November 21 (fourteen days later) was also a Wednesday; thus November 25 is a Sunday. And my television agrees with me; it is showing that execrable show that only survives because of its cushy post-Simpsons time slot, King of the Hill.
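The whole recipe fits in a few lines. This sketch only knows the two century anchors mentioned above (Wednesday for 1900, Tuesday for 2000), so it is limited to 1900-2099; the names are mine, and `datetime` is used purely as a cross-check.

```python
import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
CENTURY_DOOMSDAY = {1900: "Wednesday", 2000: "Tuesday"}

def doomsday(year):
    """Weekday of the last day of February, by the divide-by-12 rule."""
    century, yy = year - year % 100, year % 100
    q, r = divmod(yy, 12)
    shift = q + r + r // 4
    start = DAYS.index(CENTURY_DOOMSDAY[century])
    return DAYS[(start + shift) % 7]

def weekday_name(year, month, day):
    return DAYS[datetime.date(year, month, day).weekday()]

def last_day_of_february(year):
    return datetime.date(year, 3, 1) - datetime.timedelta(days=1)

print(doomsday(2007))  # Wednesday

# The easy-to-remember dates (month, day) for a common year such as 2007.
ANCHORS = [(4, 4), (6, 6), (8, 8), (10, 10), (12, 12), (5, 9), (9, 5),
           (7, 11), (11, 7), (3, 7), (2, 28), (1, 3)]
print(all(weekday_name(2007, m, d) == doomsday(2007) for m, d in ANCHORS))
```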

A real challenge would be to figure out rules that let someone calculate the day of the week of a given date in the Hebrew calendar in their head. I suspect it's not possible.

13 September 2007

Rosh Hashanah, and somehow the circle of fifths.

Today is Rosh Hashanah, the Jewish New Year.

The Hebrew calendar is a lunisolar calendar, which means that the months remain roughly in step with the phase of the moon (months start approximately at the new moon) but the beginning of the year is always at about the same time with respect to the seasons. This is arranged by having "leap months"; some Hebrew years have 12 months (of, on average, 29.5 or so days each) and some have 13.

It turns out that the leap years in the Hebrew calendar follow the Metonic cycle, which is a name for the fact that 235 lunar months is very nearly 19 solar years. In fact, 235 months are .087 days longer than 19 years, which seems to indicate that the calendar drifts by .087 days per 19 years, or one day per 219 years, with respect to the seasons. The Rosh Hashanah article backs this up; currently Rosh Hashanah can occur no earlier than September 5, but after 2089 it will occur no earlier than September 6. (It's legitimate to use the Gregorian calendar here, as the corresponding error in that calendar is something like one day in three thousand years.) The Hebrew calendar was codified in its present form several centuries before the Gregorian calendar, which probably explains why the Gregorian is more accurate; I have no doubt that the creators of the Hebrew calendar would have found a way to make it more accurate if they'd had the data to do so.

I learned from the Wikipedia article on the Hebrew calendar that

Another mnemonic [for remembering the pattern of common and leap years] is that the intervals of the major scale follow the same pattern as do Hebrew leap years: a whole step in the scale corresponds to two common years between consecutive leap years, and a half step to one common between two leap years.

For a moment this seemed surprising, but then I stopped to think about it. One has to have seven leap years (i. e. 13-month years) out of every nineteen. This is slightly more than one out of every three, so one could imagine taking a 21-year cycle of the form

LCCLCCLCCLCCLCCLCCLCC

where L represents a leap year and C a common year, and then deleting two of the C's. It is reasonable to delete two common years that are as far apart as possible, so that the calendar doesn't get too far ahead or behind. Similarly, in the case of the major scale we have two half steps and five whole steps to be arranged; one "wants" the half steps to be as far apart as possible. (This is a little more dubious, but seems reasonable.)
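The mnemonic can be checked directly. The sketch below assumes the usual placement of leap years in the 19-year cycle (years 3, 6, 8, 11, 14, 17 and 19), which is not stated in this post, and compares the gaps between them with the major scale's whole and half steps.

```python
# Assumption: the standard positions of the leap years within the cycle.
LEAP_YEARS_IN_CYCLE = [3, 6, 8, 11, 14, 17, 19]
MAJOR_SCALE = "WWHWWWH"   # whole and half steps of the major scale

def leap_gap_pattern(leap_years, cycle_length=19):
    """Write W for two common years between leap years and H for one."""
    steps = []
    for i, year in enumerate(leap_years):
        following = leap_years[(i + 1) % len(leap_years)]
        common_between = (following - year) % cycle_length - 1
        steps.append({2: "W", 1: "H"}[common_between])
    return "".join(steps)

pattern = leap_gap_pattern(LEAP_YEARS_IN_CYCLE)
rotations = [MAJOR_SCALE[i:] + MAJOR_SCALE[:i] for i in range(len(MAJOR_SCALE))]
print(pattern, pattern in rotations)
```

The pattern only matches the scale up to rotation, because the cycle does not start on a leap year.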

In music we have the "circle of fifths", so one note is a fifth, or four diatonic tones higher than the one before but seven half-steps higher; this only breaks down when seven fifths has to be twenty-eight diatonic tones (four octaves) but forty-nine half-steps (four octaves plus a half-step). A similar construction could exist in the Hebrew calendar, where seven is replaced by eleven; "usually" a period of eleven years has four leap years and seven common years, but if one had nineteen such periods that would give 76 leap years and 133 common years, while eleven nineteen-year cycles ought to have 77 leap years and 132 common years.

In fact, both 4/11 and 7/19 are convergents of a certain continued fraction, the one which is the expansion of the actual number of lunar months per solar year, minus twelve. There is apparently a "tabulated Muslim calendar" that uses a similar-looking 11/30 ratio, although for a different reason; Islam forbids leap months, but it turns out that to keep the calendar in sync with the moon one has to add a day every so often. The average lunar month is slightly longer than 29.5 days, and this "tabular Islamic calendar" begins by giving alternate months 30 and 29 days and then adding a day to the last month of eleven years out of thirty. I was actually about to state that 11/30 was the next convergent of the continued fraction in question, but it's not; it's actually 123/334, so one could have a cycle of 123 leap years out of every 334. This would be built up from seventeen of the 19-year cycles and one 11-year cycle. Of course, the problem with such a complicated leap-year rule would be that nobody can remember which years are leap years!
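One way to see these fractions without doing the continued fraction by hand is to ask for the closest fraction under a denominator bound, since convergents are exactly the fractions that win that contest at their own denominator. A sketch using `Fraction.limit_denominator` from the standard library; the month length 29.530588 days is an assumed mean value, not a figure from this post, and 365.24219 is the year length quoted in the leap-day post.

```python
from fractions import Fraction

TROPICAL_YEAR = Fraction("365.24219")   # days
SYNODIC_MONTH = Fraction("29.530588")   # days; assumed mean lunar month

# Lunar months per solar year, minus twelve.
excess = TROPICAL_YEAR / SYNODIC_MONTH - 12
print(float(excess))

# Closest fraction with denominator at most 11, 19, 30 and 334.
for bound in (11, 19, 30, 334):
    print(bound, excess.limit_denominator(bound))
```

With a bound of 30 the winner is still a fraction that has already appeared, which is another way of saying that 11/30 is not a convergent here.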

(Incidentally, there are also rules about which day of the week Rosh Hashanah can fall on; it has to be Monday, Tuesday, Thursday, or Saturday. The reason for this is as follows:

Yom Kippur, which is the tenth day of the year, should not fall on a Friday or Sunday, i. e. a day adjacent to a Saturday, for this would make two consecutive days when no work can be done. This means that the year should not begin on a Wednesday or Friday.

The twenty-first day of the year, which is the last day of Sukkot, called Hoshanah Rabbah, for some reason should not fall on a Saturday. (I once read why, but I forgot.) So the year should not begin on a Sunday.

Therefore Rosh Hashanah doesn't always actually start on the day of the new moon; sometimes it's moved around by a day or two to avoid it falling on such a day of the week.)
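The weekday bookkeeping in that aside is just arithmetic modulo 7. A small sketch that starts from the two constraints as stated (day 10 may not be a Friday or Sunday, day 21 may not be a Saturday) and works back to the first day of the year:

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

# (day of the year, weekdays that day must avoid)
CONSTRAINTS = [
    (10, ["Friday", "Sunday"]),   # Yom Kippur
    (21, ["Saturday"]),           # Hoshanah Rabbah
]

forbidden_starts = set()
for day_of_year, banned in CONSTRAINTS:
    offset = day_of_year - 1      # days after the first day of the year
    for weekday in banned:
        forbidden_starts.add(DAYS[(DAYS.index(weekday) - offset) % 7])

allowed_starts = [d for d in DAYS if d not in forbidden_starts]
print(sorted(forbidden_starts))
print(allowed_starts)  # ['Monday', 'Tuesday', 'Thursday', 'Saturday']
```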

11 September 2007

Fencepost errors, intervals, redeeming coupons, and drinking

A friend of mine asked today: if a coupon says it expires on a certain date, is that the last date that it's usable, or the first date that it's not usable? (He was referring in particular to the customized coupons that print out on receipts at CVS if you participate in their "loyalty program".)

My answer: the last day it's usable. This is basically by analogy with more "traditional" coupons, which are likely to expire on, say, "September 30". It seems reasonable that one could use the coupon on every day in September, and on no day in October; it would not seem reasonable to be able to use the coupon on every day in September but one. Coupons printed out by this program expire on seemingly random dates, but it seems that the same principle should apply there.

Similarly, if I said "the homework is due on Thursday" to my students, and then one of them handed it in on Thursday at noon, I would not say that what I really meant is "Thursday is the first day on which I will not accept the homework", and that therefore the homework is late. (In practice I usually say what time the homework is due in order to avoid this question.)

I'm not sure what the law actually has to say about this, though. I do know that laws in various places endorse different interpretations with regard to the "sell by" date on milk: in some jurisdictions the "sell by" date is the last date on which the milk can be sold, and in others it is the first day on which it cannot.

I'm inclined to say that one is always allowed to do the thing on the date which is stated, so it should be permitted to redeem a coupon, sell a carton of milk, etc. on the date listed on it. Similarly, if there's something that I'm allowed to do if I am "at least N years old" I should be able to do it on my Nth birthday, and the law agrees with me. (There's an exception of sorts for drinking, actually; in some states one cannot drink at midnight on one's 21st birthday, but must wait until the bars have closed and re-opened again. A friend of mine turned 21 in Massachusetts and was disappointed to learn this. Apparently the reason for these laws is that some people were trying to take 21 shots between midnight and last call, often 1am or 2am, and they were dying. Dying was seen by the lawmakers to be a Bad Thing.) There's a symmetry here, in that both of these are "permissive" interpretations.

But the scenario in which one is always forbidden to do a thing on the date stated on the appropriate piece of paper (so no redeeming a coupon on its expiration date, no drinking on your 21st birthday) also is symmetrical, in a way -- one might call it a "strict" interpretation. One could also have "early" and "late" interpretations -- the "early" interpretation would be that you can't redeem the coupon but you can drink, and the "late" interpretation the reverse.

These four cases are roughly analogous to the four sets of inequalities a ≤ x ≤ b, a < x < b, a ≤ x < b, and a < x ≤ b, although making the analogy precise takes more time than I want to spend on this.

31 July 2007

how to break up time and space

Squaring the hexagon, from Strange Maps. This map illustrates a proposal for dividing France into perfectly rectangular (well, as rectangular as something on a sphere can be -- but France is small enough that one needn't worry too much about this) regions, of which there were 80 or so. (As you may note, France is not a rectangle.)

This proposal was made around the time of the French Revolution, and it has the ring of proposals from that period; these are the same people that invented the French Republican calendar and the metric system. The metric system has turned out to be a good idea (although some people will argue that many of its units don't correspond to anything on a human scale). The Republican calendar, for those who aren't familiar with it, broke up the year into twelve thirty-day months, of three ten-day "decades" each (these are analogous to weeks); there seems to be no a priori reason why this doesn't work, except that we're used to seven-day weeks. It's rather inconvenient that seven is prime, in fact; it would be useful to have a unit of the "half-week".

(If you look at a Major League Baseball schedule some time, you'll see that that's actually the fundamental unit of baseball scheduling; teams generally play two series a week, one of them running from Monday to Wednesday, Tuesday to Thursday, or Monday to Thursday, and the other from Thursday or Friday to Sunday. You might notice that Thursdays are kind of ambiguous in this scheme; sometimes Thursday is the end of a series, sometimes it's the beginning, and sometimes it's neither. Teams often have Mondays or Thursdays off. The weirdness of Thursday in baseball scheduling is a direct consequence of the fact that seven is odd.) In fact, I'd argue that the fact that the week has seven days is a lot more inconvenient than a thirteen-month year would be; the fact that thirteen is prime, and that therefore no simple fraction of a hypothetical 13-month year is a whole number of months, is pretty much irrelevant. When was the last time you ever did something that took exactly three months (one quarter) or six months (half a year)?

Returning to geography, drawing straight lines as boundaries often seems to be a bad idea; lines that are drawn without regard for population centers inevitably, if you draw enough of them, pass through some population center and divide it up. This causes the people in charge on either side of the line to ignore the other side in government, making the region less able to function as a whole. Surprisingly, it's hard to think of population centers in the U.S. that lie along a state border that's just a line on the map; most of the big population centers near state borders are along rivers, such as Philadelphia, New York, or Washington (note that even if the District of Columbia didn't exist, Washington would be on the Maryland-Virginia border). This isn't surprising; rivers are natural dividing lines. But I suspect that the situation would be different in more densely populated France. And drawing lines on a map without regard for settlement patterns is a cause of the chaos in the Middle East.

The difference between the U. S. and Europe is that in Europe the lines have evolved along with the historical population patterns (which haven't changed that much even in centuries), whereas in the U. S. a lot of the lines between states were drawn before the states in question were settled. One can't expect the people making the maps to forecast in advance where people are going to choose to live, especially in modern times when the population centers grow up along roads (which people can build) and not rivers (which are where they are).

A lot more thoughts on how territory is often divided up -- countries into states, states into counties, etc. -- can be found in Ed Stephan's book The Division of Territory in Society, text available online. His hypothesis is that there's a relationship between the size of a state, county, etc. and its population density; smaller states within a country, counties within a state, and so on tend to be denser. This can be derived from the assumption that "social structures evolve in such a way as to minimize the time expended in their operation", although it only gives a relation between density and county area. It doesn't make forecasts about shape, and it seems to assume that densities are locally uniform when this is very far from the case.

13 July 2007

roller-coaster of a week: Friday the 13th

Just six days after "the luckiest day of the century", we have Friday the 13th, supposedly the unluckiest day.

The 13th day of a month occurs on Friday more often than on any other day. But it's not that often -- 688 times every 400 years. You'd expect 4800/7 = 685.714... times.

Perhaps there should be a tradition of, say, Wednesday the 25th being a day of good luck. (Let's see -- maybe Saint Nicholas was born on a Wednesday, and the 25th of December is Christmas...) Of course Friday the 13ths and Wednesday the 25ths would always occur in the same month. I wonder how long it would take the average person to realize that the good-luck-day and the bad-luck-day always happening in the same month was more than a coincidence. The most likely day for Christmas, by the way, is not Wednesday -- it's a tie between Sunday, Tuesday, and Friday.
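
The counts quoted above are easy to check by brute force. Here is a minimal sketch, assuming Python's standard `datetime` module (which uses the Gregorian rules throughout) and an arbitrary 400-year window starting in 2001; the helper name `weekday_counts` is made up for illustration.

```python
from collections import Counter
from datetime import date

# Ordered to match date.weekday(), where Monday is 0.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def weekday_counts(day, months=range(1, 13), start_year=2001, years=400):
    """Count how often the given day of the month lands on each weekday."""
    counts = Counter()
    for year in range(start_year, start_year + years):
        for month in months:
            counts[DAY_NAMES[date(year, month, day).weekday()]] += 1
    return counts

thirteenths = weekday_counts(13)               # every 13th in 400 years
christmases = weekday_counts(25, months=[12])  # December 25 only

print(thirteenths.most_common())
print(christmases.most_common())
```

Any 400 consecutive years give the same totals, since the calendar repeats with that period; the starting year only matters for readability.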

I'd heard that the 13th occurred most commonly on a Friday before. A few weeks ago I tried to compute the probabilities in my head. Unfortunately they are all so close to one-seventh that I couldn't, at least not while walking.

To calculate days of the week in general, in your head, you can use the Doomsday Algorithm, created by J. H. Conway. I seem to recall reading once that Conway had his computer programmed so that he couldn't log onto it unless he could determine the day of the week corresponding to a given date within some (short) amount of time; this may or may not be true.
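
For the curious, here is a rough sketch of the Doomsday idea in Python. It follows the usual textbook description of the method (a century anchor day, a per-year "doomsday", and one easy-to-remember date per month that always falls on it), not anything Conway himself wrote; the function names are hypothetical, and the cross-check against `datetime` is only there to catch slips.

```python
from datetime import date

DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

def is_leap(year):
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def doomsday_weekday(year, month, day):
    """Day of the week for a Gregorian date, via the Doomsday method."""
    # Century anchor (0 = Sunday): Wednesday for the 1900s, Tuesday for the 2000s.
    anchor = (5 * ((year // 100) % 4) + 2) % 7
    # The year's doomsday: add the number of twelves, the remainder,
    # and the number of fours in that remainder.
    y = year % 100
    doomsday = (anchor + y // 12 + y % 12 + (y % 12) // 4) % 7
    # One date in each month that always falls on the year's doomsday:
    # 1/3 (1/4 in leap years), last day of February, 3/14, 4/4, 5/9, 6/6,
    # 7/11, 8/8, 9/5, 10/10, 11/7, 12/12.
    leap = is_leap(year)
    doomsdates = [4 if leap else 3, 29 if leap else 28,
                  14, 4, 9, 6, 11, 8, 5, 10, 7, 12]
    return DAY_NAMES[(doomsday + day - doomsdates[month - 1]) % 7]

# Cross-check one date against the standard library (isoweekday: Sunday is 7).
check = date(2007, 7, 13)
print(doomsday_weekday(2007, 7, 13), DAY_NAMES[check.isoweekday() % 7])
```

Doing this in your head is mostly a matter of memorizing the monthly dates and being quick with arithmetic mod 7.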

Last but not least, the Phillies. I have tickets tonight. They will lose, although I don't want them to, and that will be their 10,000th loss. How do I know this? Because they're playing the Cardinals. And game four of the 2004 World Series -- the one the Red Sox won in four games -- was played against the Cardinals, during a total lunar eclipse. Historic things happen when the Cardinals are playing and there's already some sort of bad omen going on.

(Note to the stupid: I don't really believe the previous paragraph.)

07 July 2007

the 7/7/07 baseball coincidence

From Monday's New York Times, With a Big Day Ahead, Marketers Are Turning to Numerology:

The winner’s proposal is to be seen on computer-generated signs behind home plate during the three Major League Baseball games that Fox plans to broadcast on Saturday in different parts of the country: Atlanta Braves-San Diego Padres, Los Angeles Angels-New York Yankees and Minnesota Twins-Chicago White Sox.

Wait a minute. “Atlanta,” seven letters. “Angeles,” seven letters. “New York,” seven letters. “Yankees,” likewise. Ditto “Chicago.”

And “Minnesota Twins” and “San Diego Padres” are each 14 letters long, twice 7.

Spooky.

With standards that loose, it's easy to manufacture coincidence. By my count, the following teams have seven letters in either their city or their nickname, or fourteen letters in the combination of the two:

AL East: Toronto Blue Jays, New York Yankees, Baltimore Orioles
AL Central: Cleveland Indians, Detroit Tigers, Minnesota Twins, Chicago White Sox
AL West: Los Angeles Angels (of Anaheim), Seattle Mariners, Oakland Athletics, Texas Rangers
NL East: New York Mets, Atlanta Braves, Florida Marlins
NL Central: Milwaukee Brewers, Chicago Cubs, St. Louis Cardinals, Pittsburgh Pirates, Houston Astros, Cincinnati Reds
NL West: San Diego Padres, Los Angeles Dodgers, Arizona Diamondbacks, Colorado Rockies

It's probably easier to list the teams that aren't included: the Red Sox, Devil Rays, Royals, Phillies and Nationals. If you pick six teams at random, there are (25 choose 6) = 177100 ways that they'll be among the 25 with at least one seven-letter part in their name, out of (30 choose 6) = 593775 total ways to pick six teams -- thus there's a 30% chance that this would happen, if you're lenient with the definition of "seven-letter team name". (For the truly pedantic among you, the schedule isn't balanced -- so certain teams are more likely to play each other than you'd expect by chance -- and also FOX tends to favor large media markets and winning teams for their Saturday afternoon coverage. This makes me happy because it means lots of Phillies and Red Sox games, which works against the coincidence; but it also means lots of games involving New York, Los Angeles, or Chicago teams, which works for it.)
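
The arithmetic here is a one-liner with `math.comb`. This sketch takes the post's figures at face value (25 qualifying teams out of 30, six teams chosen uniformly at random, which the parenthetical above already admits is a simplification); the function name is made up for illustration.

```python
from math import comb

def chance_all_qualify(qualifying, total, picked):
    """Chance that `picked` teams, drawn at random without replacement
    from `total`, all come from the `qualifying` ones."""
    return comb(qualifying, picked) / comb(total, picked)

favorable = comb(25, 6)  # ways to pick six of the 25 qualifying teams
possible = comb(30, 6)   # ways to pick any six of the 30 teams
p = chance_all_qualify(25, 30, 6)

print(favorable, possible, f"{p:.1%}")
```

Changing the first argument shows how quickly the chance drops as the definition of a "seven-letter team name" gets stricter.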

Over the rest of the season, this same sort of coincidence happens again on July 7, July 28, August 4, August 18 (but there are only two games that day), and September 1, for five out of the eleven weeks that remain. (FOX makes it difficult to find their schedule for the first half of the season -- obviously nobody would care about that because it already happened.)

(Yes, the city of St. Louis is actually called "Saint Louis" -- but when was the last time you saw it written out?)

Oh, and today's games start at 3:55 PM Eastern. Could one of them end at 7:07?

04 July 2007

the first and the fourth of July

It's often been pointed out by Americans that for some reason, a lot of countries have their national holidays in July.

This is because if Americans can remember the national holidays of two countries, they're the U.S. (July 4) and Canada (July 1); if they can remember a third there's a good chance it's France (July 14).

What are the chances that two countries which border each other have their national days within three days of each other? You can pick when the first one should be at random, on the Nth day of the year; then the other one must have its day sometime between N-3 and N+3, a seven-day span. So the probability is about one in fifty-two (7/365); given a hundred or so borders, one expects there to be about two pairs of countries which border each other and have national days within three days of each other.
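
As a quick sanity check on that estimate, here is a small sketch. It assumes national days are independent and uniform over a 365-day year, treats the year as circular (so late December counts as close to early January), and uses a purely hypothetical border count to show how the expectation scales.

```python
def near_collision_probability(window_days=3, days_in_year=365):
    """Chance that a second, uniformly random date lands within
    window_days of a first one (year treated as circular)."""
    return (2 * window_days + 1) / days_in_year

p = near_collision_probability()
print(f"about one in {1 / p:.1f}")

borders = 104  # hypothetical count, chosen only to illustrate the expectation
expected = borders * p
print(f"expected near-collisions among {borders} borders: {expected:.2f}")

same_day = near_collision_probability(window_days=0)
print(f"same-day chance: one in {1 / same_day:.0f}")
```

Shared history would push the real numbers above these, so the uniform model is best read as a baseline.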

Two bordering countries have a one-in-365 chance of having the same national day, if such days are chosen at random. But since such days often commemorate historical events, and two adjacent countries probably share some history, I'd think that such events are more likely than one in 365.

Of course, this is all made a bit trickier by the fact that some countries seem to have more than one "national day". Mexico's Independence Day, for example, is September 16 -- but I bet a lot of Americans think it's May 5. Canada celebrates July 1 -- but Quebec calls June 24 "la fête nationale du Québec".

A quick look at the Wikipedia list of national holidays reveals the following coincidences:

Canada has Canada Day on July 1; the U. S. has Independence Day on July 4

Pakistan has Independence Day on August 14; India has Independence Day on August 15

Costa Rica, El Salvador, Guatemala, Honduras, and Nicaragua all have Independence Day on September 15; Mexico has it on September 16. (Chile declared its independence from Spain two days after Mexico did, but it looks like they were separate events. Belize declared independence on September 21, but over a century later.)

The bunch of Central American countries on September 15 all commemorate the same event, Central America's declaration of independence from Spain in 1821; the federation they went on to form split up into those countries about twenty years later. Mexico declared its independence on September 16, 1810.

The Canadian and American holidays of course celebrate different events -- unless somehow Canada started in Philadelphia and nobody told me. (Incidentally, the Continental Congress actually voted for independence on July 2. On this basis I think I should not have had to serve jury duty on Monday, since that should have been a holiday -- and less than a mile from where it all happened, no less!)

The Pakistan and India days seem to both commemorate the 1947 partition of India; it's not clear to me why they're not the same day.

But there seem to be at least two cases in which adjacent countries celebrate their national days within three days of each other and they're not commemorating the same event -- namely Mexico-Guatemala and U.S.-Canada. (And, of course, the U.S. borders Mexico.) This is less than I would have expected -- you'd expect two such near-collisions if there were about 104 borders between countries in the world, when there are clearly more than that. But I am working from a list which is clearly incomplete. This list is more complete but not sorted by date.

29 June 2007

when's the Fourth of July weekend this year?

Phillyist asks:

We've always wondered what the protocol is for celebrating a holiday weekend if the actual holiday falls squarely in the middle of the week. Should we be celebrating Independence Day this weekend? Or next weekend? Or should we just celebrate both weekends and spend two weekends in a row gorging ourselves on various grilled meats and icy-cold Coronas and margaritas? (This Phillyist votes the latter.)

and Jacqueline Urgo of the Philadelphia Inquirer asks the same question:

Surely there'll be a Fourth of July weekend at the Jersey Shore. But when?
Because the Fourth falls on Wednesday this year, schedule shilly-shallying has driven the Shore into a near panic.
Will bars and restaurants need those extra ice cubes this week or next week? What about more linens for the tables? More food for the hungry?

In general, I imagine people are taking off more time for the Fourth on average this year. My guess is that the most common behavior among people taking vacations is as follows, depending on the day of the week on which the 4th falls:

Thursday: people take off Thursday the 4th through Sunday the 7th.

Friday: people take off Friday the 4th through Sunday the 6th.

Saturday: people take off Friday the 3rd through Sunday the 5th.

Sunday: people take off Saturday the 3rd through Monday the 5th.

Monday: people take off Saturday the 2nd through Monday the 4th.

Tuesday: people take off Saturday the 1st through Tuesday the 4th.

But this year, do you take off from Saturday, June 30 through the 4th? Or from the 4th through Sunday, July 8? Or just throw up your hands and take the whole week? (One person is quoted in the Inquirer article as saying that people will take the weekend after, not the weekend before, because the weekend before falls partially in June. He might be on to something, although I'm not totally sure how much month boundaries affect people.)

But the Inquirer article makes it sound like this never happens. In fact, it happens about one year in seven. You'd think that the people who have been in business for a while could go back and see what happened in 2001. Or 1990. Or 1984. Or... you get the idea. To be precise, it happens in 58 years out of every 400, very nearly one in seven. (The link is to the Wikipedia article on "Dominical letter", which is the Catholic Church's system for encoding how the days of the week fall in a given year with a single letter; in every year with dominical letter G or AG, the Fourth of July falls on a Wednesday. Looking at the table there makes it easy to count.)

A few random facts about the Gregorian calendar:

The Gregorian calendar repeats itself every 400 years. In any 400-year period there are 97 leap years, so the total number of days is (365*400) + 97 = 146,097, or exactly 20,871 weeks. (This is only true if you don't care about the date of Easter; if you do, the period is 5,700,000 years.)

The 13th of a month is more likely to occur on a Friday than on any other day. And in any of these years when the 4th of July is a Wednesday, the 13th is...?

I'll leave that as an exercise for the reader.
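
The 400-year arithmetic, and the 58-in-400 count for a Wednesday Fourth, can both be checked mechanically. A minimal sketch, assuming Python's `datetime` module and an arbitrary window of 2001 through 2400 (it deliberately leaves the exercise above alone):

```python
from datetime import date

# Length of one full 400-year Gregorian cycle, in days.
days = (date(2401, 1, 1) - date(2001, 1, 1)).days
weeks, leftover = divmod(days, 7)

# Leap years in the cycle: February 29 exists only in leap years.
leap_years = sum(
    1 for y in range(2001, 2401)
    if (date(y, 3, 1) - date(y, 2, 28)).days == 2
)

# Years in which July 4 is a Wednesday (date.weekday(): Monday is 0).
wednesday_fourths = sum(
    1 for y in range(2001, 2401) if date(y, 7, 4).weekday() == 2
)

print(days, weeks, leftover, leap_years, wednesday_fourths)
```

Because the leftover is zero, the days of the week line up again after exactly 400 years, which is why counts like 58 out of 400 are exact rather than approximate.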

21 June 2007

The summer solstice and the longest days

Quick, what's the longest day of the year? (In the northern hemisphere.)

If you answered June 21 (today!), you're probably right. That's the date of the summer solstice, at least in most years and in most time zones. (You may have thought that the solstice was an entire day, but in fact it's just a moment in time, the moment when the sun is furthest north. The sun doesn't move, of course, at least not in the usual treatment of astronomy -- but the Earth moves around the sun, so sometimes its northern part is pointed more towards the sun and sometimes its southern part is.)

But on what day does the sun rise the earliest? Or set the latest?

This is a trickier question. "Trickier", here, means "I don't remember the answer". But the U. S. Naval Observatory makes available a sun or moon rise/set table for one year. You can enter your location and it'll tell you when the sun rises and sets on each day in, say, 2007. The patterns don't change from year to year, because the Gregorian calendar is what we call a "solar calendar" and is pretty well correlated with the seasons. Its predecessor, the Julian calendar, didn't have this property -- it slipped relative to the seasons by a bit under a day per century. For more than you ever wanted to know about calendars, see Claus Tondering's calendar FAQ.

If I enter my location -- Philadelphia -- into the table, it tells me that the day the sun rises the earliest is any day between June 10 and June 18, when it rises at 5:31 am. Let's say that the actual earliest sunrise is in the middle of this period, June 14. Similarly, the latest sunset is on any day between June 26 and June 29; let's call it June 27. These are a week earlier and later than the solstice. On the winter side of things, the shortest day is December 20 (only nine hours and nineteen minutes - sunrise is at 7:19, sunset at 4:38), but the earliest sunset is around December 7 (4:35) and the latest sunrise is around January 5, 2008 (7:23).

What's the cause of this? It's a little something known as the equation of time, which basically says that the earth runs "fast" in some seasons and "slow" in others. In December it's running faster than in January, and in early June it's running faster than in late June.
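
To put rough numbers on "fast" and "slow", here is a sketch using a common textbook approximation to the equation of time. The formula and its coefficients are an assumption on my part (a standard curve fit, good to within a minute or so), not something taken from the Naval Observatory tables; the result is apparent solar time minus mean solar time, in minutes, so a positive value means the sun is running ahead of the clock.

```python
import math
from datetime import date

def equation_of_time_minutes(d):
    """Approximate equation of time (apparent minus mean solar time), in minutes.

    Uses the curve fit 9.87*sin(2B) - 7.53*cos(B) - 1.5*sin(B),
    with B = 360*(N - 81)/365 degrees and N the day of the year.
    """
    n = d.timetuple().tm_yday
    b = math.radians(360.0 * (n - 81) / 365.0)
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

for d in (date(2007, 6, 14), date(2007, 6, 27),
          date(2007, 12, 7), date(2008, 1, 5)):
    print(d.isoformat(), round(equation_of_time_minutes(d), 1))
```

The value falls through both June and December, which means solar noon keeps drifting later by the clock around each solstice. That drift is what pulls the earliest sunrise (or earliest sunset) before the solstice and pushes the latest sunset (or latest sunrise) after it.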

I first really became aware of this phenomenon when I lived in Boston. In Boston winters, night comes very early -- it's not uncommon to see the pink and purple shades of sunset at around 3:30 on a December afternoon. The actual earliest sunset comes at 4:12 on the 8th of December; as you head further north the earliest sunset and latest sunrise both move towards December 21, because the "equation of time" becomes less significant with respect to the variation in day length. (In Miami, for example, they're November 29 and January 14; in Anchorage, they're December 15 and December 26.) You end up getting some strange asymmetries. You'd think that on dates equidistant from the winter solstice, you'd have the same time of sunset.

But you don't. In mid-November and late January, the sun sets in alignment with MIT's Infinite Corridor, which is a very long hallway running through the center of campus. In November it happens around November 12 (thirty-nine days before the winter solstice), at 4:20 pm; in January, it happens around January 29 (thirty-nine days after), at 4:50 pm. That actually helps in the Boston winters, believe it or not -- by the time it's getting really cold at least it feels like the sunlight is starting to come back. At least if you were someone like me who was never awake for sunrise.

Finally, in the summer the sun lines up with Manhattan streets at sunset, on May 28 and July 12. This is called "Manhattanhenge", and some people claim that the alignments are cosmic signs of Memorial Day and baseball's All-Star break. Of course, they're not; Manhattan just isn't aligned with the "north" that we usually call by that name. Most places with a regular grid of streets will have a day like this, although I haven't seen references to it happening in places other than Manhattan.
