So I asked on Sunday the following question: pick two points on a unit sphere uniformly at random. What is the expected distance between them?
Without loss of generality we can fix one of the points to be (1, 0, 0). The other will be chosen uniformly at random and will be (X, Y, Z). The distance between the two points is therefore
√((1-X)² + Y² + Z²)
which does not look all that pleasant. But the point is on the sphere! So X² + Y² + Z² = 1, and this can be rewritten as
√((1-X)² + 1 - X²)
or after some simplification
√(2-2X).
But by a theorem of Archimedes (Wolfram Alpha calls it Archimedes' Hat-Box Theorem but I don't know if this name is standard), X is uniformly distributed on (-1, 1). Let U = 2-2X; U is uniformly distributed on (0, 4). The expectation of √(U) is therefore
∫₀⁴ (1/4) u^(1/2) du
and integrating gives 4^(3/2)/6 = 8/6 = 4/3.
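(As a quick sanity check -- mine, not part of the original argument -- R's numerical integrator agrees:)
integrate(function(u) sqrt(u)/4, lower = 0, upper = 4)   # about 1.333333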
(The commenter "inverno" got this.)
Of course it's not hard to simulate this in, say, R, if you know that the distribution of three independent standard normals is spherically symmetric, and so one way to simulate a random point on a sphere is to take a vector of three standard normals and normalize it to have unit length. This code does that:
# simulate 10^6 points uniformly on the unit sphere by normalizing vectors of standard normals
xx1 <- rnorm(10^6, 0, 1); yy1 <- rnorm(10^6, 0, 1); zz1 <- rnorm(10^6, 0, 1)
d1 <- sqrt(xx1^2 + yy1^2 + zz1^2)
x1 <- xx1/d1; y1 <- yy1/d1; z1 <- zz1/d1
# a second, independent batch of 10^6 random points on the sphere
xx2 <- rnorm(10^6, 0, 1); yy2 <- rnorm(10^6, 0, 1); zz2 <- rnorm(10^6, 0, 1)
d2 <- sqrt(xx2^2 + yy2^2 + zz2^2)
x2 <- xx2/d2; y2 <- yy2/d2; z2 <- zz2/d2
# distances between corresponding pairs of points
d <- sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2)
and then mean(d), the average of the simulated distances, comes out as 1.333659; the histogram of the distances d looks like a right triangle. (The code doesn't make the assumption that one point is (1, 0, 0); that's a massive simplification if you want to do the problem analytically, but not nearly as important in simulation.)
15 December 2011
11 December 2011
A geometric probability problem
Here's a cute problem (from Robert M. Young, Excursions in Calculus, p. 244): "What is the average straight line distance between two points on a sphere of radius 1?"
(Answer to follow.)
If any of my students are reading this: no, this should not be interpreted as a hint to what will be on the final exam.
16 November 2011
In which I declare four things which my probability class is not about
In class today, I said approximately this:
So people decide whether to have children by flipping a coin, and if it comes up tails they have a kid, and if it comes up heads they don't. They repeat this until it comes up heads. This is probably not a good model of how people decide whether or not to have children, but maybe it's good in the aggregate. And anyway this isn't a class about how people decide whether to have kids.
Then there are two kinds of children, girls and boys -- well, not always, but this isn't a class about that -- and each child is equally likely to be a boy or a girl -- well, wait, that's not exactly true, but it's not a horrible assumption about how reproduction works on a cellular level, but this isn't a class about that either.
And people's decisions to stop having kids is independent of the sex of the children they've had -- which says this isn't China, because people do interesting things under the one-child policy -- but this isn't a class about that.
(Then I actually did some math -- namely, assume that the number of children a random family has is geometrically distributed with some parameter p, and assume that all children are equally likely to be male or female and that their genders are independent of the gender of any other children or the number of children in the family. Pick a random family with no boys. What is the distribution of the number of children they have?)
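(For anyone who'd rather sneak a peek at the answer empirically, here's a minimal R sketch of that exercise; I'm taking the coin to be fair, so p = 1/2, and the variable names are mine.)
set.seed(1)
n <- 10^6
kids <- rgeom(n, 1/2)                          # children per family: "tails" before the first "heads"
girls <- rbinom(n, size = kids, prob = 1/2)    # how many of those children are girls
no.boys <- girls == kids                       # families with no boys (childless families count too)
round(prop.table(table(kids[no.boys]))[1:5], 3)   # empirical distribution of family size, given no boys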
11 November 2011
11/11/11
You may have heard that it's 11/11/11. (Or, if you live in the UK, 11/11/11.) When I was growing up, I'd get confused and think that World War II ended on this day, one hundred years ago. You know, at the eleventh hour of the eleventh day of the eleventh month of the eleventh year.
The New York Times says that marketers are viewing this as a singular event -- but they went on about this four years, four months, and four days ago.
The Corduroy Appreciation Club says it is Corduroy Appreciation Day.
A bit more mathematically, you can watch a video about the number eleven by James Grime, which appears to be the first of a series of Numberphile videos.
Edited, November 12, 12:29 pm: from the New York Times, a hundred years ago: "To-day it is possible to write the date with the repetition six times of a single digit." The article also points out that a digit will probably never occur again seven times in the date -- we'd have to make it to November 11, 10011 for that to happen.
09 November 2011
Small sample sizes lead to high margins of error, unemployment version
The ten college majors with the lowest unemployment rates, from yahoo.com. I've heard about this from a friend who majored in astronomy and a friend who majored in geology; both of these are on the list, with an unemployment rate of zero.
The unemployment rates of the ten majors they list are 0, 0, 0, 0, 0, 0, 1.3, 1.4, 1.6, and 2.2 percent.
I would bet that the six zeroes are just the majors for which there were no unemployed people in the sample. The data apparently comes from the Georgetown Center on Education and the Workforce; there's a summary table at the Wall Street Journal, and indeed the majors which have zero unemployment are among the least popular. Just eyeballing the data, some of the majors with the highest unemployment are also among the least popular. The red flag here would be, say, an unemployment rate of 16.7% (one out of six) or 20.0% (one out of five) for some major near the bottom of the popularity table, but I don't see it; I guess their sample is big enough that no major is that small, or maybe they actually made some adjustments for this issue.
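(To see how easily a small sample produces a zero: if a major's true unemployment rate were, say, 4% and the survey happened to include only 50 of its graduates -- both numbers invented for illustration -- then seeing nobody unemployed wouldn't be surprising at all. One line of R:)
dbinom(0, size = 50, prob = 0.04)   # about 0.13 -- a 4% rate shows up as 0% in a sample of 50 roughly one time in eight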
The actual Georgetown report seems to be available here but I am having trouble viewing it.
In case you were wondering, mathematics is the 28th most popular major (of 173) and has 5.0% unemployment; "statistics and decision science" is 128th most popular and has 6.9% unemployment, which seems to go against the popular wisdom these days that statistics majors are more employable than math majors. (But I work in a statistics department, so my view of the popular wisdom may be biased.)
06 October 2011
Solution to a puzzle from a few months ago
I never posted a solution to this puzzle, and today one of my students asked me about it.
The puzzle was to find all three-digit numbers that, when multiplied by their successor, give a number concatenated with itself.
So of course when you concatenate a three-digit number x with itself, you get 1001x. So the question becomes: when is k(k+1) a multiple of 1001?
1001 is 7 times 11 times 13. k and k+1 have no prime factors in common, so we have to have that some subset of 7, 11, and 13 are prime factors of k, and the rest are prime factors of k + 1. Furthermore these are going to be proper subsets; if we have that 7, 11, and 13 are prime factors of k and none of those are prime factors of k+1, or vice versa, then we get that k doesn't in fact have three digits.
So we can have the following six situations:
(1) k is a multiple of 7 and k + 1 is a multiple of 143;
(2) k is a multiple of 11 and k + 1 is a multiple of 91;
(3) k is a multiple of 13 and k + 1 is a multiple of 77;
(4) k is a multiple of 77 and k + 1 is a multiple of 13;
(5) k is a multiple of 91 and k + 1 is a multiple of 11;
(6) k is a multiple of 143 and k + 1 is a multiple of 7.
In each case we can then use the Chinese remainder theorem to find k. Let's consider the first case: we have that k ≡ 0 (mod 7), k ≡ -1 (mod 11), k ≡ -1 (mod 13).
The CRT tells us that the solution to the system of congruences
k ≡ a₇ (mod 7), k ≡ a₁₁ (mod 11), k ≡ a₁₃ (mod 13)
is
k ≡ a₇ (143)(143⁻¹)₇ + a₁₁ (91)(91⁻¹)₁₁ + a₁₃ (77)(77⁻¹)₁₃ (mod 1001)
where I'm using (a⁻¹)_b to stand for the inverse of a mod b. The other solutions differ from this by multiples of 1001. We can find the inverses by brute force:
(143⁻¹)₇ = (3⁻¹)₇ = 5, (91⁻¹)₁₁ = (3⁻¹)₁₁ = 4, (77⁻¹)₁₃ = (12⁻¹)₁₃ = 12.
So we finally get
k ≡ 715 a₇ + 364 a₁₁ + 924 a₁₃ (mod 1001).
Now, the case (1) corresponds to a₇ = 0, a₁₁ = -1, a₁₃ = -1; so we get
k ≡ 0 - 364 - 924 (mod 1001)
and so we get the solution k = -364 - 924 + (2)(1001) = 714. Indeed (714)(715) = 510510. (714 and 715 are a Ruth-Aaron pair.) The other five situations lead to, respectively, k = 363, 923, 77, 637, 286 and k(k+1) = 132132, 852852, 6006, 406406, 82082.
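(As a check on all that modular arithmetic, a brute-force search in R over the three-digit values of k -- which therefore skips the two-digit solution k = 77 above:)
k <- 100:999
k[(k * (k + 1)) %% 1001 == 0]   # 286 363 637 714 923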
Labels:
chinese remainder theorem,
number theory,
puzzle
20 September 2011
Using generating functions to prove that the only sequences which are their own sequence of running averages are the constant sequences
Here's a problem that occurred to me yesterday: consider a sequence of real numbers a₀, a₁, a₂, ... . Let bₖ = (a₀ + a₁ + ... + aₖ)/(k+1) be the average of the first (k+1) of the aᵢ. When is a sequence equal to its own sequence of averages? That is, if we see a bunch of numbers, when is it true that every number we see is the average of all the numbers we've seen so far?
Of course the answer is the constant sequences. But can we prove this using generating functions?
It turns out we can. If we have
f(z) = a₀ + a₁ z + a₂ z² + ...
then
f(z)/(1-z) = a₀ + (a₀ + a₁)z + (a₀ + a₁ + a₂)z² + ... = b₀ + 2b₁ z + 3b₂ z² + ...
and let's call the right-hand side here g(z). What can we do to g(z) to transform it into
h(z) = b₀ + b₁ z + b₂ z² + ... ?
Well, a linear operator that takes the function (of z) zᵏ to the function (of z) zᵏ/(k+1) could be applied to g in order to get h. Clearly this is related to integration. In fact the operator
I φ(z) = (1/z) ∫₀^z φ(s) ds
does the trick. So we have
h(z) = I g(z) = (1/z) ∫₀^z f(s)/(1-s) ds.
But remember that we wanted the generating function h of the averages to be just the generating function f of the original sequence. So this gives
f(z) = (1/z) ∫₀^z f(s)/(1-s) ds
and this is in fact a differential equation. Multiply through by z and differentiate both sides to get
z f'(z) + f(z) = f(z)/(1-z).
A bit of rearranging gives
f'(z)/f(z) = 1/(1-z)
and we recognize that the left-hand side is the derivative of log f(z). Integrating both sides gives
log f(z) = -log(1-z) + C
where C is a constant of integration, and finally we get f(z) = e^C/(1-z). But this is just the generating function of a constant sequence.
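(And a quick numerical illustration of the original question -- not of the generating-function argument itself -- in R: the running averages of a constant sequence reproduce the sequence, and those of a non-constant one don't.)
running.avg <- function(a) cumsum(a) / seq_along(a)   # b_k = (a_0 + ... + a_k)/(k+1)
a <- rep(5, 10)
all.equal(running.avg(a), a)   # TRUE: a constant sequence is its own sequence of averages
b <- 1:10
all.equal(running.avg(b), b)   # not TRUE: the averages are 1, 1.5, 2, ..., not 1, 2, 3, ...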
13 September 2011
The quadratic equation, Dr. Seuss style
I've been busy, but here's a quick one: The quadratic equation, Dr. Seuss style, by Katie Benedetto. My favorite stanza:
Each value there was, well she renamed each one
Claiming that nice names would make this more fun.
“Variables, well, they’re whatever we wish
So like one, two, red, blue, I’ll call this one “fish”!
This is the antidote to if the IRS had discovered the quadratic formula.
19 August 2011
Some thoughts on the mathematics of tax withholding
Here's a question of some practical importance: let's say that I make $2,000 in each of the first six months of the year, and $4,000 in each of the second six months. Will I have more or less tax withheld than if I make $3,000 every month?
(This is inspired by the fact that I taught this summer, and was paid for doing so, but because of the way my contract is written, my pay for the academic year is spread out over twelve months. As a result I got extra-large paychecks over the summer, partially for the work I was doing in the summer and partially for work that I had already done in the previous academic year or will be doing in the next academic year.)
For those not familiar with the US tax system: if your gross pay is X you don't get a check for X, but for some smaller amount, because various taxes are withheld. Chief among these is the federal income tax. Now, the federal income tax is not a flat tax, but a progressive tax -- if your income is higher then you pay a larger percentage of your income in tax. Tax returns have to be filled out on a yearly basis, but most people get paid more often than yearly. So the amount of tax withheld is determined, based on the amount of the paycheck and the period that the paycheck is for, in such a way that the total amount of tax withheld is somewhere near the amount that you're expected to owe. (Most Americans actually end up overpaying through this system, and get a small refund back at tax-filing time.)
So say that if you make 36x per year, then your taxes will be f(36x). Then you'd expect that if you make 3x in a given month, you will have f(36x)/12 withheld, for a total of f(36x) over the course of the year. If instead you make 2x in each of six months and 4x in each of six months, then in each of the months in which you make 2x tax will be withheld as if you make 24x per year, and in each month in which you make 4x tax will be withheld as if you make 48x per year. So total withholding will be
6f(24x)/12 + 6f(48x)/12
or, simplifying, [f(24x) + f(48x)]/2. Call this T'. Is this less than or greater than f(36x), which we'll call T?
We can easily see that T' ≥ T if and only if
f(48x)-f(36x) ≥ f(36x) - f(24x).
That is, T' ≥ T if and only if the amount of extra tax owed when you go from $36,000 to $48,000 is more than the extra amount owed when you go from $24,000 to $36,000. But since marginal tax rates are increasing -- since the tax is progressive -- this is true.
More generally, given progressive taxation, withholding is smallest for a given annual income if that income is spread out exactly evenly throughout the year. This is a consequence of Jensen's inequality. The more unevenly spread out the earnings are, the more money will be withheld.
In reality this is slightly more complicated because there are tax brackets, which are reflected in the withholding formulas, so f is actually piecewise linear (see page 36 of this IRS publication). For example, a single person paid between $883 and $3,050 per month (after subtracting withholding allowances) will have $70.80 + .15(x-$883) withheld from a paycheck of x; a single person paid between $3,050 and $7,142 will have $395.85 + .25(x-$3050) withheld. (Note that putting $3,050 into either of these formulas gives $395.85; the amount withheld is a continuous function of the amount earned.) So if every paycheck is under $3,050, or if every paycheck is over $3,050, then the amount withheld ends up being the same no matter how the pay is distributed. But a person who makes, say, $3,000 in each of two months will have $388.35 withheld from each, for a total of $776.70; a person who makes $2,000 in one month and $4,000 in another will have $238.35 withheld from the first and $633.35 withheld from the second, for a total of $871.70 withheld.
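(Here's a small R sketch of just those two bracket formulas -- the 2011 monthly rows quoted above, ignoring withholding allowances and the other brackets -- reproducing the numbers in the previous paragraph:)
withheld <- function(x) {   # monthly withholding for a single filer, middle two brackets only
  if (x <= 3050) 70.80 + 0.15 * (x - 883) else 395.85 + 0.25 * (x - 3050)
}
withheld(3000) * 2                  # two $3,000 months: 776.70
withheld(2000) + withheld(4000)     # a $2,000 month plus a $4,000 month: 871.70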
I had known all this intuitively before this afternoon but I'd never bothered to actually write down why it is...
This all applies to people who make varying amounts in differing pay periods from a single job. People who have multiple jobs can be burnt in the withholding process because we have progressive taxation; if you make x in each of two jobs you have less withheld than if you make 2x in a single job. If that's you, be careful.
(I'm not an accountant. None of this should be taken as financial advice.)
16 August 2011
God tweets about playing dice
There's a book coming out in November, The Last Testament: A Memoir by God. "God" tweets at TheTweetOfGod, and "He" just twote:
"Do you ever lose when you play cards?" Remember Einstein! I play not cards, but dice. And I never lose. The dice are loaded. And so am I.
There you have it, folks. The title of this blog is correct.
14 August 2011
867-5309
The number 8,675,309 is the title of a song. It is also a prime number, and as far as I can tell, it's the largest prime number to appear in a song title. According to this list of songs with numbers in the title from Wikipedia (which is currently in the "Wikilists" space because it was deleted from mainstream Wikipedia), larger numbers in song titles are: 9 million, 30 million, 93 million, 100 million, 1 billion, 1 trillion, 1 quadrillion, 1 googolplex, infinity minus one, and infinity.
I'm not surprised to see that most of these are "round numbers", simply because they're easier to say; in particular by inspection they are not prime, since they're all multiples of large powers of 10. But I'm wondering if anybody out there has written a song titled, say, "six billion and one".
(No fair going out and writing such a song.)
12 August 2011
A puzzle about splitting up numbers into groups
A puzzle from David Radcliffe: "Split {1,2,...,16} into two groups of the same size having equal sums, equal sums of squares, and equal sums of cubes."
Solution: one group is A, D, F, G, J, K, M, P; the other is B, C, E, H, I, L, N, O. (I've replaced each number with the corresponding letter of the English alphabet to obscure things a bit.)
Here's how I found this. It's often useful to look at cubes mod 9, because any cube of an integer is either a multiple of 9, one less than a multiple of 9, or one more than a multiple of 9. These cases correspond to the integer itself being a multiple of 3, one less than a multiple of 3, or one more than a multiple of 3. So 1³ + 2³ + 3³ is a multiple of 9; so are 4³ + 5³ + 6³, ..., 13³ + 14³ + 15³. Since 16³ is one more than a multiple of 9, so is the sum of the first sixteen cubes, which we'll call N.
We want to find eight integers in 1, ..., 16 that have sum of cubes equal to N/2. Now, if N is congruent to 1 mod 9, then N/2 is congruent to 5 mod 9. This means it's relatively far from a multiple of 9, so a lot of the numbers whose cubes are one more than a multiple of 9 are going to have to go into the same group. In particular, each group must contain either five more integers congruent to 1 mod 3 than to 2 mod 3, or four more congruent to 2 mod 3 than to 1 mod 3. Keeping in mind the limited supplies -- we only have six integers in our set congruent to 1 mod 3, and five each congruent to 0 and 2 mod 3 -- the only possible groups are:
≡ 0 (mod 3) | ≡ 1 (mod 3) | ≡ 2 (mod 3)
3 | 5 | 0
1 | 6 | 1
4 | 0 | 4
2 | 1 | 5
Furthermore we either have a group of the type given in the first row and one of the type given in the fourth row, or one of the second-row type and one of the third-row type, in order to meet the supply restrictions.
But it's not possible to have a group of the first-row type and a group of the fourth-row type. Let's consider a group of the fourth-row type -- that is, it has two multiples of three, one number that's one more than a multiple of three, and five that are one less than a multiple of three. The five that are one less than a multiple of three must be 2, 5, 8, 11, and 14. The square of each of these is one more than a multiple of three, so 2² + 5² + 8² + 11² + 14² is congruent to 2 mod 3. Call the other three members of this group x, y, and z. From the table above, two of these are multiples of 3.
But the square of every integer is either 0 mod 3 (if it's a multiple of 3) or 1 mod 3 (if it's not a multiple of 3) and so the sum of the first sixteen squares is 2 mod 3. So each group must have sum of squares congruent to 1 mod 3. The five integers that we've already put in the group have squares summing to 2 mod 3; thus x² + y² + z² is also 2 mod 3, contradicting the fact that exactly one of x, y, and z is not a multiple of 3.
So we must have groups of the type in the second row and of the type in the third row. Let's look at the group of the second-row type. It contains all six integers in {1, 2, ..., 16} congruent to 1 mod 3 -- that's 1, 4, 7, 10, 13, and 16. By symmetry the other two elements -- call them t and u -- must sum to 17. Furthermore the sum of the first 16 squares is (16)(16+1)(2×16+1)/6 = 1496; so the sums of the squares in each group must be 1496/2 = 748. 1² + 4² + 7² + 10² + 13² + 16² = 591 and so we must have t² + u² = 748 - 591 = 157. So we need two integers that sum to 17, with squares summing to 157; solve the quadratic t² + (17-t)² = 157 to get t = 6 or t = 11, and correspondingly u = 11 or u = 6.
So if there's any way at all to do what we were asked, it's with one group being 1, 4, 6, 7, 10, 11, 13, 16 -- and this checks out.
Of course, it would be trivial to write a program to check this...
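(And in that spirit, the trivial program, in R -- a brute-force check over all 8-element subsets:)
x <- 1:16
splits <- combn(x, 8, simplify = FALSE)   # all 12,870 eight-element subsets
good <- Filter(function(g) {
  h <- setdiff(x, g)
  sum(g) == sum(h) && sum(g^2) == sum(h^2) && sum(g^3) == sum(h^3)
}, splits)
good   # just {1, 4, 6, 7, 10, 11, 13, 16} and its complement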
08 August 2011
Dimensional analysis for gravity trains
A surprising fact: drill a straight tunnel through the earth, between any two points. Drop a burrito in at one end. Assuming that you could actually build the tunnel, and that there's no friction, the burrito comes out the other end in 42 minutes. This is called a gravity train and it's not hard to prove (the version I link to is due to Alexandre Eremenko of Purdue) that the time it takes the burrito to get from one end to the other is (3π/(4Gρ))^(1/2), where G is the gravitational constant and ρ is the density of the Earth. Alternatively this can be written as π(r³/(Gm))^(1/2), where r is the radius of the earth and m its mass.
Everyone's so surprised, when they see this, that the time doesn't depend on the distance between the two points! And this is interesting, but as a result you don't see the more subtle fact that the time doesn't depend on the size of the planet. If I make a super-Earth that is twice the radius but made of the same stuff, so it's eight times as massive, then the density stays the same -- and so does the 42 minutes. Somewhat surprisingly you can see this using dimensional analysis. It's "obvious" that this time, if it exists, can only depend on the mass of the earth, the radius of the earth, and the gravitational constant. The mass of the earth, m, has dimension M; the radius, r, has dimension L; the gravitational constant has dimension L³ M⁻¹ T⁻². The only combination m^α r^β G^γ that has units of time is G^(-1/2) m^(-1/2) r^(3/2).
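(And plugging rounded textbook values into that combination -- a quick check in R, not a precision calculation:)
G <- 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
m <- 5.972e24     # mass of the Earth, kg
r <- 6.371e6      # mean radius of the Earth, m
pi * sqrt(r^3 / (G * m)) / 60   # about 42.2 minutes, whatever two points you pick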
Of course I'm making a big assumption there -- that the constant time "42 minutes" is actually a constant! It seems perfectly reasonable that it could depend on the distance between the two termini. I'll handwave that away by saying that it depends on the angle formed by the two termini and the center of the earth. And angles are of course dimensionless.
(The Alameda-Weehawken burrito tunnel, being non-fictional, uses magnets to accelerate and decelerate the foil-wrapped burritos and takes 64 minutes instead of the theoretical 42.)
03 August 2011
Give your money to Samuel Hansen's math kickstarter!
Alright, folks. You should give some money to Samuel Hansen's Kickstarter project Relatively Prime: Stories from the Mathematical Domain. This is what it sounds like -- Samuel will tell eight hours' worth of stories about mathematics. Samuel is perhaps the world's most accomplished mathematical podcaster -- or at least the most prolific -- and is the force behind Combinations and Permutations (sort of a mathematical comedy podcast), Strongly Connected Components (interviews with mathematicians), and Math/Maths with Peter Rowlett. Here's Samuel's blurb about the project:
Relatively Prime will be an 8 episode audio podcast featuring stories from the world of mathematics. Tackling questions like: is it true that you are only seven handshakes from the President, what exactly is a micromort, and how did 39 people commenting on a blog manage to prove a deep theorem. Relatively Prime will feature interviews with leaders of mathematics, as well as the unsung foot soldiers that push the mathematical machine forward. With each episode structured around topics such as: The Shape of Things, Risk, and Calculus Wars, Relatively Prime will illuminate each area by delving into the history, applications, and people that underlie the subject that is the foundation of all science.
He needs $8,000 in pledges. With 12 hours to go (until 11:13 PM US Eastern Time tonight) he's got $7,115. The way Kickstarter works is that Samuel only gets any of the money if at least $8,000 gets pledged. So if
Maybe the puppets from Avenue Q will convince you, although they're raising money to found a Monsterssori school. But hey, they're both good causes. On some level storytelling is what all education is about.
07 July 2011
A cute number theory puzzle
Today's MAA number of the day, when multiplied by its successor, gives a number concatenated with itself. (Like 455 * 539 = 245245, except 539 isn't the successor of 455.)
Puzzle: find all such three-digit numbers. (The fact that I'm phrasing it this way should indicate to you that there's more than one.)
Meta-puzzle: generalize this. (I have some generalizations in mind.)
17 June 2011
X-Y correspondence
Google hits for "Pascal-Fermat correspondence": 3,230. For "Fermat-Pascal correspondence": 206.
Does anyone have any idea why? In particular it seems like one would sort either on importance (but which of these two is more important?) or alphabetically (but that gives the wrong result).
And given two individuals X and Y, how can we predict whether "X-Y correspondence" or "Y-X correspondence" is more common? I'm sure there are no hard and fast rules here, but there must at least be some trends; I'd expect a situation similar to how gender gets assigned to words in languages (foreign to me) that have grammatical gender.
16 June 2011
A mathematics quiz on Sporcle that isn't just basic arithmetic
From Sporcle: name the famous mathematicians associated with some theorems.
I don't think I'm giving anything away by telling you that one of the answers is Euler. I also take issue with the result they give under this name (link goes to Wikipedia article with the same name as one of the answers).
26 May 2011
The most well-read cities in the United States
From Berkeleyside: Berkeley is the third most well-read city in the US, according to amazon.com data.
This is among cities with population 100,000 or greater. Number 1 is Cambridge, Massachusetts (105K people); number 2 is Alexandria, Virginia (140K); number 3 is Berkeley, California (112K); number 4 is Ann Arbor, Michigan (114K); number 5 is Boulder, Colorado (100K). There are 275 cities of population greater than 100,000 in the US; Alexandria, the most populous of these five, is ranked 177.
My first thought upon seeing this is that these are all small cities, and of course you expect to see more extreme results in small cities than in large cities. Small cities are perhaps more likely to be homogeneous. (This seems especially likely to be true for small cities that are part of larger metropolitan areas.) Actually, my quick analysis of the top five doesn't hold up for the top twenty; the average rank of the top twenty cities listed at amazon is 127.1, which is LOWER than (although not significantly different from) the 138.5 you'd expect if being on this top-twenty list were independent of size. But it's certainly possible that, say, some 100,000-person section of the city of San Francisco actually has higher amazon.com sales than Berkeley. (There are a surprisingly large number of bookstores in the Mission.)
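(A quick check of that "not significantly different" aside, under the crude null model that the top-twenty list is just a uniform random draw of 20 of the 275 population ranks:)
sim <- replicate(10^4, mean(sample(275, 20)))   # average population rank of 20 cities chosen at random
mean(sim)                          # about 138
quantile(sim, c(0.025, 0.975))     # the observed 127.1 sits comfortably inside this interval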
Also, people in college towns tend to read a lot -- that's no surprise (although one does hear that students don't read much these days). Four of the top five (all but Alexandria) are college towns; also in the top 20 are Gainesville (Florida), Knoxville (Tennessee), and Columbia (South Carolina). And in case you're wondering, Alexandria is not named after the city in Egypt with the Great Library.
25 May 2011
xkcd, philosophy, and Wikipedia
If you hover the cursor over today's xkcd, you'll see the following:
Wikipedia trivia: if you take any article, click on the first link in the article text not in parentheses or italics, and then repeat, you will eventually end up at "Philosophy".
I first heard this a few days ago, but with "Philosophy" replaced by "Mathematics". Here's an example:
I clicked on "Random article" which took me to Billy Mercer (footballer born 1896). Following the instructions goes to England (Mercer was English), Country, Geography, Earth, Orbit, Physics, Natural science, Science, Knowledge, Fact, Verification, Formal verification, Mathematical proof, Mathematics.
(A few days ago "fact" went to "information"; the article starts "The word fact can refer to verified information" and someone made "verified" into a link recently. In that case the sequence is fact, information, sequence, mathematics.)
If you keep going you get "quantity", "property (philosophy)", "modern philosophy", "philosophy", "reason", "rationality", "mental exercise", "Alzheimer's disease", "dementia", "cognition", "thought", "consciousness", "mind", "panpsychism", and back to "philosophy".
("rationality" used to go to "philosophy", until someone edited it, leaving the note "Raised the period of the Philosophy article... it was ridiculously low." Of course once someone points out some property of Wikipedia, people will tamper with it.
This doesn't seem to happen if you click on random links, or even second links. The basic reason seems to be a quirk of Wikipedia style -- the article for X often starts out "X is a Y" or "In the field of Y, X is..." or something like that, so there's a tendency for the first link in an article to point to something "more general". Does this mean that "mathematics" necessarily has to be the attractor? Of course not. But it does mean that the attractor, if it exists, will probably be some very broad article.
Edited to add, Thursday, 10:26 am: Try the same thing at the French wikipedia; it doesn't work. This seems to depend on certain conventions that English-language Wikipedians have adopted. However, it seems to work at the Spanish wikipedia, with Filosofía as the target.
07 May 2011
Two no-hitters four days apart is not that rare
Justin Verlander just threw a no-hitter for the Detroit Tigers. On May 3rd, Francisco Liriano threw one for the Minnesota Twins.
There have only been 271 no-hitters in one hundred years of Major League Baseball, so two separated by four days seems unusual.
But two no-hitters within four days of each other has happened several times before. From Wikipedia, there have been two no-hitters within four days of each other on the following dates:
August 19 and 20, 1880
September 19 and 20, 1882
two on April 22, 1898
September 18 and 20, 1908
August 26 and 30, 1916
May 2, 5, and 6, 1917
September 4 and 7, 1923
June 11 and 15, 1938
June 26 and 30, 1962
September 17 and 18, 1968
September 26 and 29, 1983
June 1 and 2, 1990
two on June 29, 1990
September 4 and 8, 1993
May 11 and 14, 1996
Is this list surprisingly long? If you assume that baseball has been played 180 days a year for 130 years, then that's 23,400 days on which baseball has been played. There have been 271 no-hitters, so on an average baseball-playing day there are 0.01158 no-hitters. After any given no-hitter there's a four-day window in which the list I gave above could be added to. So you'd expect (271)(0.01158)(4) = 12.5 pairs in that list. There are 17 pairs on the list. (I'm counting the 1917 triplet as three pairs. I'm not counting today's no-hitter.) So there doesn't seem to be particularly strong evidence for no-hitters somehow causing more no-hitters in their wake. (Although my model of the baseball schedule is, I admit, ridiculously crude. In particular I have ignored the fact that the number of teams isn't constant.)
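(Here's a small R simulation of that crude model -- 271 no-hitters scattered uniformly and independently over 23,400 playing days -- counting pairs that land one to four days apart, to match the back-of-the-envelope number above:)
set.seed(1)
n.days <- 180 * 130   # crude model: 23,400 baseball-playing days
n.nh <- 271           # no-hitters, scattered uniformly over those days
pairs <- replicate(1000, {
  d <- sample(n.days, n.nh, replace = TRUE)
  gaps <- abs(outer(d, d, "-"))
  sum(gaps[upper.tri(gaps)] %in% 1:4)   # pairs of no-hitters one to four days apart
})
mean(pairs)   # comes out close to the 12.5 computed above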
17 April 2011
Get drunk not broke
Dreading the upcoming work week, because it's Sunday night and you still don't make enough money to buy fancy booze? Get drunk not broke sorts drinks by alcohol to cost ratio. See, math is good for something.
There's also Get drunk not fat (alcohol to calories).
And finally there's Brian Gawalt's Get drunk but neither broke nor fat. This page finds the three drinks such that every other drink is either more expensive or more caloric per ounce of alcohol.
12 April 2011
On the inclusion of solutions in textbooks
From Jones, Game Theory: Mathematical Models of Conflict (link goes to Google Books), in the preface:
"Some teachers may be displeased with me for including fairly detailed solutions to the problems, but I remain unrepentant [...] In my view any author of a mathematical textbook should be required by the editor to produce detailed solutions for all the problems set, and these should be included where space permits."
By the way, Jones was writing this in 1979; presumably if space does not permit, in the present day solutions can be posted on the author's web site. (This will pose a problem if websites move, though; perhaps an arXiv-like electronic repository of solutions would be a good idea?) A reviewer at Amazon points out that the inclusion of solutions to problems might be an issue for those choosing to assign the textbook in a course where homework is collected and graded. Jones has a PhD from Cambridge and as far as I can tell was at Imperial College, London at the time of writing; the willingness to include solutions may have something to do with the difference between the British and American educational systems.
I've seen frustration about the lack of provided solutions in textbooks on the part of my more conscientious students. (This isn't with regard to this text - I'm not currently teaching game theory - but with regard to other texts I've used in other courses.) They want to do as many problems as they can, which is good. This practice of leaving out the solutions is perhaps aimed at the median student - in my experience the median student does all of the homework problems but would never consider trying anything that's not explicitly assigned. (And although I don't know for sure, the student who goes out of their way to get a bootleg solutions manual is probably not the conscientious student I'm referring to.)
"Some teachers may be displeased with me for including fairly detailed solutions to the problems, but I remain unrepentant [...] In my view any author of a mathematical textbook should be required by the editor to produce detailed solutions for all the problems set, and these should be included where space permits."
By the way, Jones was writing this in 1979; presumably if space does not permit, in the present day solutions can be posted on the author's web site. (This will pose a problem if websites move, though; perhaps an arXiv-like electronic repository of solutions would be a good idea?) A reviewer at Amazon points out that the inclusion of solutions to problems might be an issue for those choosing to assign the textbook in a course where homework is collected and graded. Jones has a PhD from Cambridge and as far I can tell was at Imperial College, London at the time of writing; the willingness to include solutions may have something to do with the difference between the British and American educational systems.
I've seen frustration about the lack of provided solutions in textbooks on the part of my more conscientious students. (This isn't with regard to this text - I'm not currently teaching game theory - but with regard to other texts I've used in other courses.) They want to do as many problems as they can, which is good. This practice of leaving out the solutions is perhaps aimed at the median student - in my experience the median student does all of the homework problems but would never consider trying anything that's not explicitly assigned. (And although I don't know for sure, the student who goes out of their way to get a bootleg solutions manual is probably not the conscientious student I'm referring to.)
07 April 2011
The probability of Asian-Hispanic food
From tomorrow's New York Times: an article on the growing prevalence of Asian-Hispanic fusion food in California. It's part of a series on the Census. Orange County, California is 17.87% Asian and 34.13% Hispanic -- so the majority of the population, 52.00%, is either Asian or Hispanic. Not surprisingly there's a guy there with a food truck named Dos Chinos, which serves such food as sriracha-tapatito-tamarind cheesecake.
This is accompanied by a little map showing the sum of Asian and Hispanic population in any given county. (Well, it might be the sum; to be honest I don't know, as in the Census "Hispanic" vs. "non-Hispanic" is orthogonal to "race", which takes values White, Black, Asian, American Indian, and Native Hawaiian.) In many places in the southern half of the state it's over 50%.
But wouldn't the relevant statistic for this article be not (0.1787) + (0.3413), but 2*(0.1787)*(0.3413) = 0.1219, the probability that if two random Orange County residents run into each other, one of them will be Asian and the other will be Hispanic? Fresno County, for example, is 50.3% Hispanic and 9.6% Asian -- that's 59.9% "Hispanic or Asian" -- but there wouldn't seem to be quite as many opportunities for such fusion, as the probability of a Hispanic-Asian pair in Fresno County is only 2*(50.3%)*(9.6%) = 9.7%.
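If you want to play with the comparison yourself, here's a two-line R sketch of that 2pq computation, using the county percentages quoted above and treating the two categories as disjoint (which, as noted below, they aren't quite):
mixed_pair <- function(p1, p2) 2 * p1 * p2   # chance a random pair is one of each
mixed_pair(0.1787, 0.3413)   # Orange County: about 0.122
mixed_pair(0.503, 0.096)     # Fresno County: about 0.097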
(Except that 97% of Fresno County's Asians are Hispanic, according to the frustratingly hard-to-navigate American FactFinder. So maybe some "fusion" has already taken place.)
06 April 2011
Folding toilet paper thirteen times
James Tanton, of the St. Mark's Math Institute at St. Mark's School in Southborough, Massachusetts, makes excellent short mathematical videos.
He and his students also folded a very long piece of paper 13 times -- that is, they created 2^13-ply toilet paper. This is a world record. (There's a bit of a question about whether they actually got 13 folds or just 12 -- the 13th fold has to be held in place. 12 has been done before.) You can read about it in a local newspaper, or see a video on YouTube. They did it in the "Infinite Corridor" of MIT, which is not infinite but is very long, about 800 feet. On a Sunday, apparently, and on what must be the third or fourth floor. They got access thanks to OrigaMIT, MIT's origami club. I am only very mildly surprised that such a club exists.
This whole thing may be the only known good use of single-ply toilet paper.
02 April 2011
A street-fighting approach to the variance of a hypergeometric random variable
So you all[1] know that if I have a biased coin with probability p of coming up heads, and I flip it n times, then the expected number of heads is np and the variance is npq. That's the binomial distribution. Alternatively, if I have an urn containing pN white balls and qN black balls, with p + q = 1, and I draw n balls with replacement, then the distribution of the number of white balls has that mean and variance.
Some of you know that if I sample without replacement from that same urn -- that is, if I take balls out and don't put them back -- then the expected number of white balls is np and the variance is npq(N-n)/(N-1). The distribution of the number of white balls is the hypergeometric distribution.
So it makes sense, I think, to think of (N-n)/(N-1) as a "correction factor" for going from sampling with replacement to sampling without replacement. This is the approach taken in Freedman, Pisani, and Purves, for example, which is the book I'm teaching intro stats from this semester.
How do you prove this? On this, FPP are silent. The proof I know -- see, for example, Pitman -- is as follows. Write the number of white balls, when sampling without replacement, as
S_n = I_1 + ... + I_n
where I_k is 1 if the kth draw gives a white ball and 0 otherwise. Then E(I_k) is just the probability of getting a white ball on the kth draw, and so it's equal to p by symmetry. By linearity of expectation E(S_n) = np. To get the variance, it's enough to get E(S_n^2). And by expanding out that sum of indicators there, you get
S_n^2 = (I_1^2 + ... + I_n^2) + (I_1 I_2 + I_1 I_3 + ... + I_{n-1} I_n).
There are n terms inside the first set of parentheses, and n(n-1) inside the second set, which includes every pair I_j I_k where j and k aren't equal. By linearity of expectation and symmetry,
E(S_n^2) = nE(I_1) + n(n-1)E(I_1 I_2).
The first term, we already know, is np. The second term is n(n-1) times the probability that both the first and second draws yield white balls. The first draw yields a white ball with probability p. For the second draw there are N-1 balls left, of which pN-1 are white, so that draw yields a white ball with probability (pN-1)/(N-1). The probability is the product of these. Do the algebra, let the dust settle, and you get the formula I claimed.
But this doesn't explain things in terms of the correction factor. It doesn't refer back to the binomial distribution at all! But in the limit where your sample is small compared to your population, sampling without replacement and sampling with replacement are the same! So can we use this somehow? Let's try to guess the correction factor without writing down any random variables. We'll write
Variance without replacement = f(N,n) npq
where n is the sample size and N is the population size, and think about what we know about f(N,n).
First, f(N,1) = 1. If you have a sample of size 1, sampling with and without replacement are actually the same thing.
Second, f(N,N) = 0. If your sample is the entire population, you always get the same result.
But most important is that if we sample without replacement, and take samples of size n or of size N-n, we should get the same variance! Taking a sample of size N-n is the same as taking a sample of size n and deciding to take all the other balls instead. So for each sample of size n with w white balls, there's a corresponding sample of size N-n with pN-w white balls. The distributions of numbers of white balls are mirror images of each other, so they have the same variance. So you get
nf(N,n)pq = (N-n)f(N, N-n)pq.
Of course the pq factors cancel. For ease of notation, let g(x) = f(N,x). Then we need to find some function g such that g(1) = 1, g(N)=0, and ng(n) = (N-n)g(N-n). Letting n = 1 you get g(1) = (N-1)g(N-1), so g(N-1) = 1/(N-1). The three values of g that we have so far are consistent with the guess that g is linear. So let's assume it is -- why should it be anything more complicated? And that gives you the formula. This strikes me as the Street-Fighting Mathematics approach to this problem.
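If you don't trust the linearity guess, it's easy to check the claimed formula by simulation; here's a quick R sketch, with parameter values I picked arbitrarily for illustration:
# draw n balls without replacement from an urn with pN white and qN black balls,
# many times, and compare the sample variance of the white-ball count to npq(N-n)/(N-1)
N <- 50; n <- 12; p <- 0.3
urn <- c(rep(1, p * N), rep(0, (1 - p) * N))    # 1 = white, 0 = black
whites <- replicate(10^5, sum(sample(urn, n)))  # sample() is without replacement by default
var(whites)                                     # simulated variance
n * p * (1 - p) * (N - n) / (N - 1)             # claimed formula: about 1.95 here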
Question: Is there a way to rigorize this "guess" -- some functional equation I'm not seeing, for example?
1. I use "all" in the mathematician's sense. This means I wish you knew this, or I think you should know it. Some of you probably don't. That's okay.
Labels: hypergeometric, probability, statistics, street-fighting
30 March 2011
How long is the coast of Maryland?
Last night's Final Jeopardy clue: "With 301 miles, it has the most coastline of current states that were part of the original 13 colonies." (Thanks to the Jeopardy! forum for the wording.)
This agrees with the Wikipedia list which is sourced from official US government data.
But as Mandelbrot told us, coastlines are self-similar. (Link goes to the paper How Long is the Coast of Britain as reproduced on Mandelbrot's web page, which unfortunately doesn't have pictures. I'm not sure if the original version of this paper in Science did. Wikipedia's article on the paper does.) That is, the length of a coastline depends on the size of your ruler. Furthermore, I would suspect that the fractal dimension of some states' coastlines is larger than others'. Wikipedia states that "Measurements were made using large-scale nautical charts", which seems to imply that all the measurements were done at the same length scale, but if you did the measurements at a smaller scale, the states whose coastlines have higher fractal dimension would move up the list.
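To make the "size of your ruler" point concrete, here's a toy R illustration of Richardson's empirical law, where measured length scales like (ruler length)^(1-D) for a coast of fractal dimension D. The constants and dimensions here are made up, not data for any actual state:
measured_length <- function(ruler, C, D) C * ruler^(1 - D)
rulers <- c(10, 1, 0.1)    # ruler lengths, in (say) miles
measured_length(rulers, C = 300, D = 1.05)   # tamer coast: about 267, 300, 337 miles
measured_length(rulers, C = 250, D = 1.25)   # crinklier coast: about 141, 250, 445 miles
# at 10-mile or 1-mile rulers the first coast ranks longer; at 0.1 miles the ranking flips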
So last night I spent twenty seconds yelling this at the TV, and ten seconds getting out the answer Maryland. Which is wrong. Also wrong: Maine, New York. (Maine used to be part of Massachusetts; the wording is a bit ambiguous.) It appears that only the south shore of Long Island counts as "coast"; the north shore, which borders Long Island Sound, doesn't.
And of course Chesapeake Bay doesn't count either.
25 March 2011
Vallentin's probability cheat sheet
John Allen Paulos pointed me to Matthias Vallentin's probability and statistics cheat sheet. It's a big "sheet" -- twenty-seven pages -- but maybe you have a big blank wall to put it on.
(To my students, if you read this: remember that you only get one page of notes on the midterm, and you have to write it yourself.)
du Sautoy on symmetry
An interesting TED talk: Marcus du Sautoy on symmetry -- interesting to watch, lots of pictures; ignore the fact that it starts with the standard slightly overwrought version of Galois' story. If you want a more accurate version, I recommend Amir Alexander's Duel at Dawn: Heroes, Martyrs, and the Rise of Modern Mathematics . About halfway through he gives a presentation of a proof that the two groups of order 6 are different. (One, the cyclic group of order six, is abelian; the other, the symmetric group on three elements, is not.) I particularly like the pictures of the Alhambra, on which du Sautoy has overlaid animations showing the effect of rotating them. Presumably they won't let you do this if you actually go there.
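(If you want the order-6 fact without watching the video: permutations compose by indexing in R, and two transpositions in the symmetric group on three elements already fail to commute, while addition mod 6 obviously does. A two-line check, just as a toy:)
p <- c(2, 1, 3)   # the permutation swapping 1 and 2
q <- c(1, 3, 2)   # the permutation swapping 2 and 3
p[q]; q[p]        # 2 3 1 versus 3 1 2: composition in S_3 depends on the order
(2 + 5) %% 6; (5 + 2) %% 6   # addition mod 6 doesn't, so the cyclic group is abelian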
23 March 2011
22 March 2011
Are food-borne pathogen survival times really exponentially distributed?
From Scientific American, an excerpt from Modernist Cuisine: The Art and Science of Cooking on the complex origins of food safety rules.
This is the six-volume, six-hundred-dollar magnum opus of Nathan Myhrvold (former chief technology officer at Microsoft) and chefs Chris Young and Maxime Bilet; you can read more about it at the Wall Street Journal.
In particular I noticed the following:
"If a 1D reduction requires 18 minutes at 54.4 degrees C / 130 degrees F, then a 5D reduction would take five times as long, or 90 minutes, and a 6.5D reduction would take 6.5 times as long, or 117 minutes."
An "nD" reduction is one that kills all but 10^-n of the foodborne pathogens.
What struck me here is that the distribution of the pathogen lifetimes, assuming these numbers are actually correct, is exponential. And, therefore, memoryless -- if you're a bacterium under these conditions, your chances of dying in the first eighteen minutes are ninety percent, and if you're still alive at ninety minutes, your chances of dying in the next eighteen minutes are still ninety percent. This surprised me. The decay of radioactive atoms can be described in this way -- but are bacteria really so simple?
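To see the memorylessness in the quoted numbers, here's a small R sketch assuming the surviving fraction after t minutes is 10^(-t/18) -- that is, an exponential lifetime with one "D" of reduction every 18 minutes:
surviving <- function(t) 10^(-t / 18)   # fraction of pathogens still alive at time t
1 - surviving(18)                        # 0.9: ninety percent die in the first 18 minutes
1 - surviving(108) / surviving(90)       # also 0.9: same for minutes 90 through 108
surviving(90); surviving(117)            # 1e-5 and about 3.2e-7: the 5D and 6.5D reductions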
The excerpt as a whole is quite interesting -- apparently a lot more than just science is going into recommendations of how long food should be cooked.
(Myhrvold has a bachelor's degree in math and a PhD in mathematical economics, among other degrees; Young has a bachelor's degree in math and was working on a doctoral degree before he left for the culinary world. So perhaps it is fair of me to think that they would get this right.)
18 March 2011
What is the origin of Kirillov's lucky number problem?
I've run across a few references to a Russian superstition as follows: a bus ticket has a six-digit number, and a ticket is said to be "lucky" if the sum of the first three digits equals the sum of the last three digits. See, for example, Lectures on Generating Functions by Sergei Lando -- an excellent little book I accidentally discovered in the library a couple of days ago. Lando gives the problem of figuring out how many lucky tickets there are, and says that in the early 1970s A. A. Kirillov would often open his seminar this way. The cataloging information in the book says that Lando was born in 1955; Olshanski writes, in his preface to Kirillov's seminar on representation theory, that many students attended Kirillov's seminar. I suspect that Lando actually heard Kirillov pose this problem at some point.
What I'm interested in is whether this is an actual superstition, perhaps held by non-mathematicians. It seems like the sort of thing that a mathematician would make up to create a good problem. This web page mentions the superstition in a non-mathematical context, with the twist that you're supposed to eat lucky tickets. This one does as well, and links to these people who sell lucky ticket cookies. It seems quite likely that:
1. the superstition exists among some segment of the Russian population, or at least did at some point, and
2. the entrance of the problem into the mathematical culture is due to Kirillov -- if I were still at Penn I'd ask him.
But did the problem travel from the rest of the world to mathematics? Or, because this sum condition feels so mathematical, did it travel from mathematics to the rest of the world?
(Also, can you solve the problem? How many lucky tickets are there?)
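If you'd rather let the machine count than do the generating-function exercise, here's a brute-force R check over all thousand three-digit blocks; I'll leave the closed-form derivation, and the number it spits out, to you.
digit_sum <- function(n) (n %/% 100) + (n %/% 10) %% 10 + n %% 10
counts <- table(sapply(0:999, digit_sum))   # how many three-digit blocks have each digit sum
sum(counts^2)                               # pairs of blocks with equal sums = lucky tickets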
17 March 2011
Devlin has an upcoming biography of Fibonacci
From Keith Devlin's twitter feed (@nprmathguy): he's meeting with his publisher about a biography of Fibonacci which will come out this July, entitled The Man of Numbers: Fibonacci's Arithmetic Revolution.
What surprises me most about this is that Devlin says that this is the first biography of Fibonacci. The St. Andrews' biographical page on him, for what it's worth, only lists two books among its sources. One is entitled Leonard of Pisa and the New Mathematics of the Middle Ages and the other Leonardi Pisani Liber Abbaci oder Lesevergnügen eines Mathematikers -- these don't sound like biographies. I'm surprised because you'd think there'd be a built-in market for such a book -- everyone knows about the Fibonacci sequence. And you could put bunnies on the cover! It looks as if Devlin's publishers are more serious than I am, though, and have not done so.
24 February 2011
How do graphing calculators do numerical integration?
Here's a question I don't know the answer to, which Sam Shah asked: how does the TI-83 do ∫_{-z}^{z} exp(-x^2) dx, or more generally numerical integration? Of course as z goes to ∞ this approaches √π, with very small tails. (The link goes to an old post of mine that unfortunately has broken LaTeX; you can read the alt text for the images. The idea is that in ∫_z^∞ exp(-x^2) dx, the integrand can be bounded above by the exponential exp(-z^2 - 2z(x-z)); integrating this, the original integral is less than exp(-z^2)/(2z), and this is pretty tight. And yes, I know, I should switch to a platform with LaTeX support.)
So you expect to get something near √π for any reasonably large value of z. But if z is large enough -- say 1000 -- then you get a very small value, in this case on the order of 10^-7. Presumably if the range of integration is wide enough, then the integration method used by the calculator doesn't actually pick up the central region where the action is actually happening.
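R's adaptive quadrature has the same failure mode, for what it's worth -- if I remember right, the help page for integrate() warns about exactly this with dnorm. Something to try, though the exact output will depend on your system and tolerances:
f <- function(x) exp(-x^2)
integrate(f, -2, 2)       # about 1.764, already close to sqrt(pi) = 1.7725
integrate(f, 0, 20000)    # true value is sqrt(pi)/2, but on many systems the adaptive
                          # rule's sample points all miss the bump near 0 and it
                          # reports something very close to zero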
18 February 2011
Experimental mathematics for small children
a six year old solves the subset sum problem: a post on an operations research science fair project.
Roll five dice. Can you make a subset of the numbers appearing on them add up to 10? To 12? Laura McLay's six-year-old daughter did this experiment as a science fair project, solving the problem visually with bars of length 1 inch to 6 inches fitting in a 10-inch or 12-inch frame.
This is both the cutest and most awesome thing I have seen so far this morning. (But be aware that I have only been awake for half an hour.) There should be more things like it.
It's also what I'd do if I were seeing this problem for the first time. (Although, being a Grownup, I'd skip the manipulatives. Which would probably make it less fun.)
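For the grownup version, here's an R enumeration over all 6^5 ordered rolls -- no manipulatives, hence less fun:
rolls <- as.matrix(expand.grid(rep(list(1:6), 5)))   # all 7776 ordered rolls
can_make <- function(roll, target) {
  sums <- unlist(lapply(1:5, function(k) combn(roll, k, sum)))  # all subset sums
  target %in% sums
}
mean(apply(rolls, 1, can_make, target = 10))   # fraction of rolls where 10 is reachable
mean(apply(rolls, 1, can_make, target = 12))   # and the same for 12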
13 February 2011
Which comes first, the nominal or real record high?
A quick puzzle: say that gas prices hit a record high in nominal dollars. And say they also, around the same time, hit a record high in real (inflation-adjusted) dollars. Which comes first?
Say the nominal price is f(t) and the real price is h(t) = f(t)g(t), where g is monotone decreasing. Then h'(t) = f'(t) g(t) + f(t) g'(t). So if the nominal price is at a maximum at time T, then f'(T) = 0 and so h'(T) = f(T) g'(T). f(T) is positive because it's a price and g'(T) is negative by assumption. So h'(T) is negative and at the time of the nominal high, the real price is already decreasing. The real high comes first, and there's a short period in which the real price is decreasing but the nominal price is still increasing. This makes sense - a nominally constant price is decreasing in real dollars.
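A throwaway numerical check in R, with a made-up price path and a made-up deflator:
t <- seq(0, 10, by = 0.01)
f <- 5 - (t - 6)^2 / 10      # nominal price, peaking at t = 6
g <- exp(-0.03 * t)          # deflator: steady 3%-ish inflation
h <- f * g                   # real price
t[which.max(f)]              # 6: the nominal record
t[which.max(h)]              # a bit before 6, as the derivative argument predicts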
This is brought to you by an example from Geoff Nunberg's book Talking Right, on how the Right has distorted political language in the United States in an effort to marginalize the Left. At the particular point I'm commenting on, Nunberg argues that the Right likes to proclaim "record highs" in gas prices which are basically always in nominal dollars and therefore make the gas price rise look worse than it is. My argument, however, says nothing about that - taking derivatives makes this a local problem, and "record highs" (nominal or real) are global maxima.
09 February 2011
How to know if your hiring practices work?
So one of the courses I'm teaching this semester is introductory probability, with a calculus prerequisite. Not surprisingly this attracts a lot of students in various subjects that are math-heavy but that are not actually mathematics or statistics. They also tend to be juniors and seniors.
In office hours today, one of my students got to criticizing the fact that lots of consulting, finance, etc. jobs like to ask questions like "how many ping pong balls fit in a 747" in interviews, on the basis that this has nothing to do with what they'd actually have to do on the job. I pointed out that there's some difficulty in knowing whether your hiring practices are working. You'd like to compare the success of the people that you hire into your company with the success that the people that you don't hire would have had in your company. The former might be difficult to measure, but it's probably not impossible assuming your company does something quantifiable. But the latter is essentially impossible.
So what's the right way to design this sort of study? Is there any serious research on which interview techniques actually succeed in finding the best people?
06 February 2011
Correlation in betting on the NFL.
Nate Silver points out that just because the spread in today's Super Bowl is small (the Packers are something like a three-point favorite) doesn't mean that the game will necessarily be close. It just means that it's almost equally likely to be a blowout in one team's favor as in the other's.
Not surprisingly, though, the regression line for margin of victory, as predicted from point spread, is very close to having slope 1 and passing through the origin. As it should, because otherwise bettors would be able to take advantage of it! Say that 7-point favorites won, on average, by 9 points. Assume that the distribution of actual margin of victory, conditioned on point spread, is symmetrical; then half of 7-point favorites would win by 9 points or more, so more than half would win by 7 points or more, and one could make money by betting on them. On the other hand, say that 7-point favorites won, on average, by 5 points; then you could make money by betting against them.
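A quick simulation of both scenarios, assuming -- this is my assumption, not Silver's -- that actual margins are roughly normal around their mean with a standard deviation of about 13 points:
margins <- 9 + 13 * rnorm(10^6)    # 7-point favorites who actually win by 9 on average
mean(margins > 7)                  # about 0.56: betting them to cover would pay
mean(5 + 13 * rnorm(10^6) > 7)     # about 0.44: now you'd bet the underdog instead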
(For what it's worth, I don't have a particular interest in this game. In fact I probably won't even watch it. I have no connection to either Pittsburgh or Green Bay, and as longtime readers know, I'm a baseball fan.)
Pixar mathematics
Pixar's use of harmonic functions (by David Austin) describes mathematical techniques used by Pixar. Incidentally, apparently there exists something called Pixar University, which I learned when I went to the excellent Pixar exhibit at the Oakland Museum of California. As far as I can tell they are not hiring; it's really an internal training program, and anyway I don't know anything about animation. (The exhibit's next stop is in Hong Kong.)
The problem that the article addresses is basically this: 3-D animated characters have many "control points" on their surface. Animators would not like to have to specify how all of these move individually, so they don't; they only specify the motion of a subset of these called the "cage". How do we interpolate? When there are just three control points this is obvious -- use barycentric coordinates in both triangles. This is basically exploiting the fact that there's a unique affine transformation taking a generic triangle to another generic triangle.
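Here's the three-control-point case in R, just to make "use barycentric coordinates" concrete -- a 2-D toy with a triangle and a deformation I made up:
A <- c(0, 0); B <- c(1, 0); C <- c(0, 1)      # the cage: a triangle
P <- c(0.2, 0.3)                              # an interior control point
bc <- solve(cbind(B - A, C - A), P - A)       # coefficients on B and C
bary <- c(1 - sum(bc), bc)                    # barycentric coordinates of P (sum to 1)
# move the cage; the point keeps its barycentric coordinates
A2 <- c(0, 0); B2 <- c(2, 0.5); C2 <- c(-0.5, 1.5)
bary[1] * A2 + bary[2] * B2 + bary[3] * C2    # where P ends up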
But what if there are more than three control points? One generalization is "mean value coordinates", which are a linear transformation, but which are problematic when the cage is not convex. Since the cage is often, say, the outline of a human being, this is a real problem! Apparently the newer technique is "harmonic coordinates" -- define the coordinates of the n boundary points of a cage to be (1,0,...,0), (0,1,...,0), ..., (0,0,...,1). Then for any point p in 3-space, let h_i(p) be linear on the edges of the cage, and harmonic (having zero Laplacian) elsewhere. Then the non-cage control point p has harmonic coordinates h_1(p), ..., h_n(p); if you move the cage points then these points keep the same harmonic coordinates but change their coordinates in 3-space. Unfortunately inverting the h-function is a bit harder, but it seems worth the trouble.
There's a video explaining this, with lots of images!
05 February 2011
Computing distance in the Facebook graph
Is there some nice algorithm for computing the distance between two nodes in a graph, that gracefully fails when they're far apart? I'm asking this prompted by this metafilter question on how to compute the distance between two individuals in the Facebook graph; it seems to me that if someone is at distance 5 versus 6 from me, for example, I don't really care which it is.
Dijkstra's algorithm, for example, runs in time proportional to the number of edges of the graph. (Well, O(E+V log V), but I'm guessing that log V is smaller than E/V.) This is the search algorithm I learned in an algorithms class once. And most algorithms that I've seen -- of course this isn't an area I know a huge amount about -- tend to run in time at least as large as the number of edges of the graph. (Of course this makes sense because one could have to actually look at all the edges of the graph -- such are the pitfalls of worst-case analysis.)
It seems like bidirectional search would work -- if my friends of friends overlap with your friends of friends, then we're at a distance of at most 4, for example. (I didn't realize this had a name until just now; it just seems like an obvious idea to me.) But is there some reason I'm overlooking that this wouldn't be practical?
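Here's a rough sketch in R of that idea -- nothing tuned for Facebook-sized graphs, just the shape of the bidirectional search, on a tiny adjacency list I'm making up:
# alternate growing balls around the two endpoints, one BFS layer at a time;
# the first time the balls touch, the distance is the number of layers grown.
# give up (return Inf) once each side has grown max_r layers, i.e. beyond distance 2*max_r.
bidir_dist <- function(adj, s, t, max_r = 3) {
  if (s == t) return(0)
  seen_s <- s; seen_t <- t
  front_s <- s; front_t <- t
  for (layers in 1:(2 * max_r)) {
    if (layers %% 2 == 1) {   # grow the s-ball on odd steps, the t-ball on even steps
      front_s <- setdiff(unique(unlist(adj[front_s])), seen_s)
      seen_s <- c(seen_s, front_s)
    } else {
      front_t <- setdiff(unique(unlist(adj[front_t])), seen_t)
      seen_t <- c(seen_t, front_t)
    }
    if (length(intersect(seen_s, seen_t)) > 0) return(layers)
    if (length(front_s) == 0 && length(front_t) == 0) return(Inf)
  }
  Inf   # "far apart" -- which is all I'd really want to know anyway
}
adj <- list(me = "alice", alice = c("me", "bob"), bob = c("alice", "you"), you = "bob")
bidir_dist(adj, "me", "you")   # 3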
03 February 2011
Snowdecahedron redux
A few days ago I posted about the Porter Square snowdecahedron. Here are more pictures of it and pictures of some smaller ones. The project is due to Dan Beyer. He previously made a dodecahedron out of a tree stump (apparently the trick is to make a wedge corresponding to the dihedral angle of the dodecahedron beforehand) and has proposed dodecahedra built of traffic cones as public art for construction sites.
(I found Dan Beyer's site via metafilter; I don't recall how I found the original picture on flickr linked to in the previous post.)
02 February 2011
How to win the Ontario lottery's tic-tac-toe game
Jonah Lehrer writes for Wired on breaking a scratch-off game in the Ontario lottery. In 2003, Mohan Srivastava, a geostatistician, figured out a way to crack a tic-tac-toe game that the Ontario lottery was running at the time. In this game, you're given a set of eight three-by-three grids with numbers between one and thirty-nine on them (seventy-two numbers in total) -- these are visible to you when you buy the tickets. After buying the ticket, you then scratch off a set of "your numbers"; if three of these numbers appear in a row in one of the grids, in tic-tac-toe fashion, you win. Since there are 72 numbers on the ticket and they are between 1 and 39, there is much repetition. It turned out that if a ticket contained three non-repeated numbers in a row it was very likely to be a winner.
The article doesn't say how the tickets come to be this way, though; what sort of algorithm could produce this behavior? But for Srivastava's purpose of demonstrating that it's possible to tell winning tickets from losing tickets with high probability, this was not necessary. Srivastava also points out that this isn't worth it as a way to make money, unless, hypothetically, you could get your hands on a pile of tickets, go through them at home, and return the losing ones to the store.
(I learned about this from metafilter. The commenters there, a usually reliable bunch, seem to be split on whether you could return the losing tickets or not.)
A database of applied math problems
A friend of mine teaches at the British Columbia Institute of Technology. They are building a database of applied math problems, at the 11th or 12th-grade level. Their goal is to give students a better idea of "why do we need to learn this?", which is the bane of all math teachers.
They're not asking for calculus problems. But I've taught calculus and I often had the sense, while teaching the "applied" problems, that they were just straight-up asking the students to do derivatives or integrals, with some words added purely as a red herring to confuse the students. I mean, really, if a ladder leaning against a wall falls down, is there any situation in which one cares how quickly the area underneath the ladder is changing? My memory of pre-calculus classes is hazy, because I haven't taught at that level, but I do remember having a pervasive sense that the applications were contrived.
30 January 2011
Porter Square snowdecahedron
You may appreciate the Porter Square snowdecahedron. This is what it sounds like, a dodecahedron made of snow. I couldn't find an isnowsahedron, but I found these people rolling a giant icosahedral die down a snow-covered hill.
As for me, there's a mysterious dodecahedron-shaped lawn ornament in the backyard of the house I live in in Oakland, and we don't have snow.
26 January 2011
How to draw Voronoi diagrams
How to draw Voronoi diagrams by hand. Just to remind you, a Voronoi diagram is associated with a set of points x_1, ..., x_n in the plane; for each point x_i, the cell containing it is the set of points in the plane that are closer to x_i than to any other x_j. I've spent a fair amount of time doodling Voronoi diagrams in boring places, so this is interesting. Apparently they also run into the problem that sometimes it's hard to tell which cells will border which other cells.
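If you'd rather let the computer doodle, here's a brute-force R version -- classify a fine grid of points by nearest site and plot; no clever geometry, just the definition:
set.seed(1)
sites <- cbind(runif(8), runif(8))            # eight random sites in the unit square
gx <- seq(0, 1, length.out = 300)
gy <- seq(0, 1, length.out = 300)
grid <- as.matrix(expand.grid(x = gx, y = gy))
nearest <- apply(grid, 1, function(p)
  which.min((sites[, 1] - p[1])^2 + (sites[, 2] - p[2])^2))
image(gx, gy, matrix(nearest, 300, 300), col = rainbow(8), xlab = "", ylab = "")
points(sites, pch = 19)                       # the sites themselves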
Yes, I know, it's been six months since I posted. As you can see, I'm still alive, and I made it through my first semester here at Berkeley. I live in North Oakland, closer to downtown Berkeley than to downtown Oakland, and I find myself describing where I live as "the part of Oakland that's really more Berkeley-ish". More formally, I'm in downtown Berkeley's Voronoi cell, not downtown Oakland's.
(Via Proof Math is Beautiful.)