09 March 2009
Knuth on solitaire
I'm browsing through Knuth's The Art of Computer Programming (Volume 1, Volume 2, Volume 3), because it's Spring Break, so I have time. I'm reading the mathematical bits, which are perhaps half the work; I'm less interested in the algorithms.
Anyway, we find on page 158 of Volume 2: "Some people spend a lot of valuable time playing card games of solitaire, and perhaps automation will make an important inroad in this area." This is part of Chapter 3, on the generation and testing of random numbers. Of course, this book was published in 1969; Windows Solitaire didn't exist then. (It's also amusing to see Knuth describing things that will be in, say, Chapter 10; he's currently working on Chapter 7, which will be the first half of Volume 4.)
30 October 2008
Mandelbrot, Knuth, and open-ended citations
A citation reproduced verbatim from Benoit Mandelbrot's The Fractal Geometry of Nature (1983):
The "1968-" is not Knuth's birth date, of course, but the date at which he started writing the work in question, which was fifteen years in the making when Mandelbrot wrote. It's still not done.
Incidentally, Mandelbrot's book is a good one, full of pretty pictures (although of course not as intricate as one might see now, because the book is a quarter-century old); it's also fairly casually written. Mandelbrot describes it as an "essay", in what I take to be the original Francophone sense of the word that means roughly an "attempt" at something, and so it's rather discursive, personal, and nonrigorous; these idiosyncrasies are good for a book one wants to read casually, as I do right now, but someone who wanted a rigorous understanding of the concepts might look elsewhere. I'm tempted to say it's a good series of lectures crystallized into paper form, although as far as I know it was never intended as such.
I think I'd heard of the book before, but it was Nova's Hunting the Hidden Dimension (aired on Tuesday night) that got me to actually head to the library and find it. I suspect I'm not alone, because apparently it's selling quite well on Amazon.
(The usual disclaimer on books I'm reading that I borrowed from the library applies: ignore my recommendations if you're at Penn, because I have the library's copy.)
Knuth, D. 1968-. The art of computer programming. Reading, MA: Addison Wesley.
The "1968-" is not Knuth's birth date, of course, but the date at which he started writing the work in question, which was fifteen years in the making when Mandelbrot wrote. It's still not done.
Incidentally, Mandelbrot's book is a good one, full of pretty pictures (although of course not as intricate as one might see now, because the book is a quarter-century old); it's also fairly casually written. Mandelbrot describes it as an "essay", in what I take to be the original Francophone sense of the word that means roughly an "attempt" at something, and so it's rather discursive, personal, and nonrigorous; these idiosyncrasies are good for a book one wants to read casually, as I do right now, but someone who wanted a rigorous understanding of the concepts might look elsewhere. I'm tempted to say it's a good series of lectures crystallized into paper form, although as far as I know it was never intended as such.
I think I'd heard of the book before, but it was Nova's Hunting the Hidden Dimension (aired on Tuesday night) that got me to actually head to the library and find it. I suspect I'm not alone, because apparently it's selling quite well on Amazon.
(The usual disclaimer on books I'm reading that I borrowed from the library applies: ignore my recommendations if you're at Penn, because I have the library's copy.)
No more Knuth checks
Donald Knuth is no longer giving reward checks to those who find errors in his books. Apparently people have been stealing his banking information, and he's had to close some checking accounts. He's maintaining a web page listing the rewards people would have received, which is basically as good, because nobody was cashing the checks anyway. And perhaps it's better, because now everybody in the world can see that you found a bug in one of Knuth's books! If you just had a framed check on your wall, only the people you invited in could see it.
16 September 2008
Something interesting about libraries
At least two volumes of archival versions of Donald Knuth's papers, Selected Papers on Discrete Mathematics and Selected Papers on Analysis of Algorithms, are kept in Penn's Van Pelt library, which for the most part houses the humanities and social science collections. (This occurred to me while I was going to the library today to pick up the second of these.)
Of course, the journals in which these papers were originally published, and his other books, are for the most part in the Math, Physics, and Astronomy library, which isn't important enough for any rich benefactor to have the naming rights to. (Are you wondering why those three subjects share a library? I think it's because they share a building.) I know for sure that Mathematics for the Analysis of Algorithms, Concrete Mathematics, and The Art of Computer Programming live in the math library. (Okay, to be totally honest, the library's copy of Concrete Mathematics lives in my apartment.)
I suspect this is because books of "selected papers" are seen as historical documents; the books on mathematics, broadly defined, that are shelved in Van Pelt are books on the history of mathematics, mathematics education, the philosophy of mathematics, etc.
14 April 2008
Knuth on teaching calculus
In 1998, Donald Knuth wrote a letter on using the O-notation in teaching calculus, which Alexandre Borovik has reproduced.
10 January 2008
Knuth is 70
Donald Knuth is seventy. The four posts I've linked to say more about this than I can.
In his honor, I hope to take a class in the analysis of algorithms this term. (Okay, I was planning to anyway.) In fact, just now I got it approved that I could take this class. It's not in my department, so doing so required a couple e-mails -- but the analysis of algorithms is a legitimate field of mathematics. I'm not going to explain why here, because in doing so I would probably just say lots of foolish things that I don't fully understand. This is why I'm taking the class -- I am interested in the analysis of algorithms but I don't know much about it.
I hope that one day I have enough money that I feel like I can give $2.56 to everybody who catches a mistake that I make and not be bankrupted by this. (Knuth gives this amount -- "one hexadecimal dollar" -- to anyone who finds an error in one of his books.) I would be willing to adopt such a scheme right now if all other authors adopted it as well, though; since I read much more than I write I'm reasonably sure I'd come out ahead -- so long as I was the only one eligible to receive such money. (Is there anybody who reads less than they write? That seems like it would be a very strange state of affairs.) However, if all other authors adopted such a scheme they would probably also be more diligent in proofreading their work, and I'd have competition for finding the errors; in the end some sort of equilibrium would be reached among all people who read and/or write, and I'm not sure whether I would end up paying out more in such bounties than I take in.
15 December 2007
Tilings of the plane
For various reasons, I've been thinking about tilings of the plane a lot lately. This reminded me of a paper I haven't thought about in a while, by Cris Moore, Some Polyomino Tilings of the Plane, which "calculates generating functions for the number of tilings of rectangles of various widths by" various polyominoes. Let $T_{m,n}$ be the number of tilings of an m-by-n board by a given set of tiles; then the bivariate generating function
$$\sum_{m,n \geq 1} T_{m,n} \, x^m y^n$$
is in general a quite complicated object. (I don't know of any explicit examples of this function, which I think just goes to show that it's complicated!) Even the "diagonal" generating function,
$$\sum_{n \geq 1} T_{n,n} \, x^n,$$
which has as the coefficient of $x^n$ the number of tilings of an n-by-n square, seems to be on the ugly side. By contrast, the generating function for strips of some fixed width m,
$$\sum_{n \geq 1} T_{m,n} \, x^n,$$
is usually a rational function. In a sense tilings of strips of fixed width are a "one-dimensional" problem -- for example, one could encode tilings of fixed-width strips with a certain set of tiles as words over some finite alphabet, I suspect. The fact that these generating functions are rational seems to be a "folk theorem" -- anybody who's thought about tilings knows it, but I've never seen a proof.
For example, the numbers of tilings of 2-by-n rectangles with dominoes are just the Fibonacci numbers! (See Mathworld's pictures.) This is easy to see. Let $T_n$ be the number of ways to tile a 2-by-n rectangle with dominoes. The left edge of a 2-by-n rectangle (that's 2 rows, n columns) tiled with dominoes is either a single vertical domino, in which case there are $T_{n-1}$ ways to finish the tiling, or two horizontal ones, in which case there are $T_{n-2}$ ways to finish the tiling. So $T_n = T_{n-1} + T_{n-2}$; noting that $T_1 = 1$, $T_2 = 2$ finishes the proof.
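To make the "one-dimensional" picture concrete, here's a little Python sketch of the column-by-column ("transfer matrix") computation; the function name is mine, and this isn't taken from Moore's paper or from Knuth. The state is the set of cells in the current column already covered by horizontal dominoes sticking out of the previous column; since each column applies the same linear map to the vector of state counts, the fixed-width counts satisfy a linear recurrence, which is why the generating function in that direction is rational.

```python
def domino_tilings(m, n):
    """Count domino tilings of an m-by-n rectangle, column by column.

    A "state" is a bitmask of cells in the current column that are already
    covered by horizontal dominoes protruding from the previous column."""

    def successors(state):
        """All protrusion patterns into the next column, given that the cells
        in `state` are already covered in the current column."""
        results = []

        def fill(row, covered, protruding):
            if row == m:
                results.append(protruding)
            elif covered & (1 << row):
                fill(row + 1, covered, protruding)           # already covered
            else:
                # horizontal domino, protruding into the next column
                fill(row + 1, covered, protruding | (1 << row))
                # vertical domino, covering this cell and the one below it
                if row + 1 < m and not covered & (1 << (row + 1)):
                    fill(row + 2, covered, protruding)

        fill(0, state, 0)
        return results

    transitions = [successors(s) for s in range(1 << m)]
    counts = [0] * (1 << m)
    counts[0] = 1                        # nothing protrudes into column 1
    for _ in range(n):
        new_counts = [0] * (1 << m)
        for state, ways in enumerate(counts):
            if ways:
                for nxt in transitions[state]:
                    new_counts[nxt] += ways
        counts = new_counts
    return counts[0]                     # nothing may protrude past column n

# domino_tilings(2, n) for n = 1, 2, 3, ... gives 1, 2, 3, 5, 8, ...
# and domino_tilings(8, 8) should give 12988816.
```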
But the number of tilings of an m-by-n rectangle by dominoes is given by the quite difficult formula
$$T_{m,n} = \prod_{j=1}^{\lceil m/2 \rceil} \prod_{k=1}^{\lceil n/2 \rceil} \left( 4\cos^2 \frac{j\pi}{m+1} + 4\cos^2 \frac{k\pi}{n+1} \right),$$
which is one of those curious formulas that occur in combinatorics, in which some expression that has no business even being an integer is not only an integer, but manages to count something! (Notice that if both m and n are odd, then the number of domino tilings $T_{m,n}$ should be zero; the formula takes care of this nicely. Namely, if m and n are both odd, then the j = (m+1)/2, k = (n+1)/2 term of the product is zero.) (I'm quoting Graham, Knuth, and Patashnik for this formula; it's originally due to Kasteleyn. Anybody who mentions domino tilings is required by law to cite Kasteleyn's paper.) There's a rather strange interpretation of this formula I read somewhere once: given an m-by-n board, you can write the number
$$\left( 4\cos^2 \frac{j\pi}{m+1} + 4\cos^2 \frac{k\pi}{n+1} \right)^{1/4}$$
in the (j, k) square, and the product of those numbers is the number of domino tilings of the board.
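Here's a quick numerical sanity check of that interpretation, assuming the per-square number is the fourth root written above (which is my reconstruction of the formula); the function name is just for this sketch.

```python
import math

def kasteleyn_count(m, n):
    """Multiply the per-square numbers
    (4 cos^2(j*pi/(m+1)) + 4 cos^2(k*pi/(n+1)))^(1/4)
    over all squares (j, k) of an m-by-n board; the product should be the
    number of domino tilings.  Fine for small boards; for big ones you would
    sum logarithms instead to avoid overflow."""
    product = 1.0
    for j in range(1, m + 1):
        for k in range(1, n + 1):
            term = (4 * math.cos(j * math.pi / (m + 1)) ** 2
                    + 4 * math.cos(k * math.pi / (n + 1)) ** 2)
            product *= term ** 0.25
    return round(product)   # the product is an integer up to rounding error

# kasteleyn_count(2, 3) == 3 and kasteleyn_count(2, 4) == 5 (Fibonacci numbers),
# kasteleyn_count(3, 3) == 0 (odd-by-odd board), kasteleyn_count(8, 8) == 12988816.
```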
Furthermore, if you consider $\log T_{m,n}$, the product becomes a sum:
$$\log T_{m,n} = \frac{1}{4} \sum_{j=1}^{m} \sum_{k=1}^{n} \log \left( 4\cos^2 \frac{j\pi}{m+1} + 4\cos^2 \frac{k\pi}{n+1} \right),$$
and we can approximate that sum by an integral, since it's a Riemann sum; thus
$$\log T_{m,n} \approx \frac{mn}{4} \int_0^1 \int_0^1 \log \left( 4\cos^2 \pi u + 4\cos^2 \pi v \right) \, du \, dv.$$
The integral isn't elementary, but it can be evaluated numerically; thus we get $\log T_{m,n} \approx 0.29156 \, mn$.
Perhaps a bit surprisingly, this only depends on the size of the rectangle being tiled, not its shape; that's not quite true, though, because the approximation by an integral is a bit crude. Still, for example, an 80 by 80 rectangle has 1.18 × 10^800 domino tilings, while a 40 by 160 rectangle has 4.26 × 10^797; in terms of the logarithm (which is how these things should be viewed) those are quite close. There are deep connections to statistical physics here that I don't quite understand, although I could if I put my mind to it; the limit of $\log T_{m,n}/(mn)$ as m and n get large is called the entropy of a set of tiles (Moore), and I believe something like this is true for any set of tiles, although entropies aren't exactly easy to calculate.
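And here's a crude check of the 0.29156: a midpoint Riemann sum for the double integral above (again a sketch under my reconstruction of the integral; the function name is mine).

```python
import math

def entropy_estimate(N=400):
    """Midpoint Riemann sum for (1/4) * integral over [0,1]^2 of
    log(4 cos^2(pi u) + 4 cos^2(pi v)) du dv.  N should be even so that no
    sample point lands exactly on the logarithmic singularity at (1/2, 1/2);
    the singularity is integrable, so the sum still behaves."""
    total = 0.0
    for i in range(N):
        u_term = 4 * math.cos(math.pi * (i + 0.5) / N) ** 2
        for j in range(N):
            v_term = 4 * math.cos(math.pi * (j + 0.5) / N) ** 2
            total += math.log(u_term + v_term)
    return total / (4 * N * N)

# entropy_estimate() should come out near 0.2916, matching the 0.29156 above.
```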
References
1. Ronald L. Graham, Donald E. Knuth, Oren Patashnik. Concrete Mathematics, 2nd edition. Addison-Wesley, 1994.
2. P. W. Kasteleyn, "The statistics of dimers on a lattice, I: The number of dimer arrangements on a quadratic lattice." Physica 27 (1961) 1209-1225.
3. Cris Moore, "Some Polyomino Tilings of the Plane". Available from the author's web site, http://www.santafe.edu/~moore/.
29 October 2007
Enumerating multiset partitions
Brent Yorgey on enumerating multiset partitions. He came to the problem the same way I did -- I was trying to count the number of factorizations of an integer. An integer of the form $p^n$, where p is a prime, has p(n) factorizations, where p(n) is the partition function; for example,
$$p^5 = p^4 \cdot p = p^3 \cdot p^2 = p^3 \cdot p \cdot p = p^2 \cdot p^2 \cdot p = p^2 \cdot p \cdot p \cdot p = p \cdot p \cdot p \cdot p \cdot p,$$
giving the p(5) = 7 factorizations of $p^5$. An integer of the form $p_1 p_2 \cdots p_n$ has $B_n$ factorizations (the nth Bell number); for example, 2310 = (2)(3)(5)(7)(11) has fifty-two factorizations. But what about, say, an integer of the form $p^3 q r$? How many factorizations does it have? Naturally, this associates an integer with each partition: from the number $p^5$ we get the partition 5 and the number of factorizations 7; from the number $p_1 p_2 p_3 p_4 p_5$ we get the partition 1 + 1 + 1 + 1 + 1, to which we associate 52.
Unfortunately, I don't really follow Yorgey's work because I don't know Haskell. And I don't know a general way to compute these yet, except in the trivial cases. But, for example, $p^2 q$ has four factorizations:
$$(p^2 q) = (p^2)(q) = (p)(pq) = (p)(p)(q),$$
and so we associate to 2 + 1 the number 4; this is intermediate between p(3) (which is 3) and $B_3$ (which is 5). This makes sense; in general, p(n) counts the number of partitions of {1, ..., n} when no elements are distinguishable, and $B_n$ counts them when all elements are distinguishable. Multiset partitions interpolate between integer partitions and set partitions.
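For what it's worth, the numbers themselves are easy to get by brute force: just count the factorizations of an actual integer with the right prime signature. Here's a Python sketch (the function name is mine, and this is nothing like Yorgey's Haskell or Knuth's methods):

```python
def count_factorizations(n, max_factor=None):
    """Number of ways to write n >= 2 as an unordered product of integers >= 2,
    counting n itself as the trivial one-factor product.  Factors are chosen in
    nonincreasing order: pick the largest factor d first, then factor n/d into
    pieces no bigger than d, so each factorization is counted exactly once."""
    if max_factor is None:
        max_factor = n
    total = 1 if n <= max_factor else 0      # n itself as a single factor
    for d in range(2, min(max_factor, n // 2) + 1):
        if n % d == 0:
            total += count_factorizations(n // d, d)
    return total

# count_factorizations(2**5) == 7          (the partition number p(5))
# count_factorizations(2*3*5*7*11) == 52   (the Bell number B_5)
# count_factorizations(2**2 * 3) == 4      (the p^2 q case above)
```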
(Yorgey says that the problem is also solved in Volume 4, Fascicle 3 of Knuth, which I may track down. But I want the pleasure of working this out for myself, to some extent.)
Yorgey also says that he found the problem via Project Euler, which gives a couple hundred examples of problems that can, in general, be solved by ridiculously inefficient brute force methods or by elegant methods that require not so much computation.
19 June 2007
Names for large numbers
Language Log on number delimitation. In the United States, we put commas between every three digits: 123,456,789. In Europe they use periods instead of commas, but in the same places.
In China, they group into sets of four digits instead of three: 1,2345,6789. (I've seen this a few times written with the commas, in English-language texts; it's very disorienting.) The ancient Greeks also did something like this: they referred to "myriad", "myriad myriad", and so on, where "myriad" is 10^4. ("Myrio-" and "myria-" are also obsolete metric prefixes for 10^4 and 10^-4 respectively.)
In India, they break into a low-order group of 3 and then groups of 2: 12,34,56,789. This reflects the structure of the language -- certain odd powers of 10 have names, with 10^5 being "lakh" and 10^7 being "crore".
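All three groupings are easy to mimic mechanically; here's a small Python sketch (the function name and scheme labels are mine):

```python
def group_digits(n, scheme="western"):
    """Insert comma separators into the decimal digits of a positive integer.
    'western' groups by threes (123,456,789), 'chinese' by fours (1,2345,6789),
    and 'indian' uses a low-order group of three and then twos (12,34,56,789)."""
    digits = str(n)
    if scheme == "chinese":
        sizes = [4] * len(digits)
    elif scheme == "indian":
        sizes = [3] + [2] * len(digits)
    else:
        sizes = [3] * len(digits)
    groups, end = [], len(digits)
    for size in sizes:                     # peel groups off the low-order end
        if end <= 0:
            break
        groups.append(digits[max(0, end - size):end])
        end -= size
    return ",".join(reversed(groups))

# group_digits(123456789)            -> '123,456,789'
# group_digits(123456789, "chinese") -> '1,2345,6789'
# group_digits(123456789, "indian")  -> '12,34,56,789'
```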
In a way, though, we do the same thing, at least if you consider the etymology of the number names thousand, million, billion, trillion, and so on. (By "billion" I mean an American billion, 10^9.) "Thousand" is somehow special; we're treating the first power of 10^3 differently than all the others. The British system, where successive powers of 1000 are named thousand, million, milliard, billion, billiard, trillion, ... is more "logical". But shouldn't "thousand" be "thousard" or something like that? And "billiard" is also a name for that game where you hit the balls with the sticks.
There's also the Knuth -yllion notation, which answers a question that Poser asked: are there systems with groups that double in size? The answer is yes, although I don't think anyone seriously uses this system.