04 August 2007

geographical random walks

Let's say you have a map of the streets in an area. Could you guess, from looking at that map, which intersections the locals consider "most important"?

You could probably make a decent first guess by assuming that the importance of an intersection goes up with the number of roads at the intersection.

Why? Imagine you are randomly walking around a town. Each time you get to an intersection you choose uniformly at random from each of the possible ways you could go. (For the sake of simplicity, I'll assume that this includes going back the way you came.) It's a well-known result that this sort of random walk on a graph leads to a certain stationary distribution; that is, if a bunch of people walk around randomly you'll get to a point where the number of people at any given intersection is roughly constant. And that result is that number of people at any intersection is proportional to the number of roads coming into it. (I'll count the two directions of a road coming into the same intersection as two distinct roads.)

This has implications if, say, you want to start a business. If you're located in the middle of a block, there are two roads coming in. If you're at a "T"-shaped intersection, there are three. A normal grid intersection, four. An intersection like 48th and Baltimore or 23rd and South, five -- both of these have two streets that go through and a third that starts there. (Most of the intersections on Philadelphia's Baltimore Avenue are five-pointed, because the street grid has a sort of discontinuity there; this may explain why it's sort of the main street of its part of the city.) An intersection like Broad and McKean, which has three streets going through it, six. You want to be where there are more roads, so that more people will walk by. In Washington they have intersections like Dupont Circle where five roads go through, for a total of ten "arms". (The counting gets a bit tricky, because some of the roads don't actually go "through".) In Paris there's an intersection with eleven arms, fittingly called L'Etoile. (Much more than that seems impractical.)

Of course, this neglects a huge number of facts. Not all roads are equal. For example, I'm ascribing Baltimore Avenue's primacy in its part of Philadelphia to the fact that it's got five-pointed intersections; but probably more important is that, historically, it was the road that went to Baltimore. (Hence the name.) Also, a trolley runs down Baltimore Avenue and has for over one hundred years; but why is it there, and not somewhere else? Because people already were living or working on or near that street. And they were doing that because other people were. Once a slight imbalance is created -- say by the random-walk model I alluded to above -- people are naturally going to gravitate, even if it's very slightly, towards the place where people already are. The roads that lead towards the important nodes will become important in their own right. And street networks aren't static -- they can change. (River networks can't, though -- and a lot of cities are located where two rivers come together to form a third, the most famous U. S. example probably being Pittsburgh.) But the initial imbalance has to come from somewhere, and this is as good a place as any, especially in places where the roads mostly form a grid and hence their arrangement is determined well in advance of their actually being built.

Incidentally, this is my theory on why Boston and Philadelphia -- two cities which are of similar size and population density, at least if you consider them as the center of their respective metropolitan areas, and the two cities I am most familiar with -- differ in their transit system. Boston is able to have subways because for the most part it's obvious where the subway stops should be. Radical Cartography has a map of Boston as a series of squares; the original transit system was basically built around people getting from one square to the next. Philadelphia would have trouble supporting a subway system more extensive than its current two lines, because it's hard to point to a place that a huge number of people go which isn't served by the existing system; the population is more diffuse, because for the most part Philadelphia's streets form a grid and no node is any different from any other. If I wanted to I could draw, say, six or eight subway lines that I'd like to exist in Philadelphia -- but none of them seem essential to me, at least at the price that it costs to build a subway. (Some Philadelphians will point out that there's a long-standing plan to put a subway down Roosevelt Boulevard; to them, I will say that I have very rarely had any reason to be in Northeast Philly, so I forget about the Boulevard. I'm sorry.) Manhattan has subways -- even though it's for the most part a grid -- because it's so dense. (But I think I heard somewhere that Times Square is the busiest subway stop -- and it's under Broadway, which cuts through the grid diagonally.)


Unknown said...

You might guess from a map that Baltimore and Broadway are important streets, because in cities built in grids, diagonal streets are more likely to be used to minimize walking distance (or alternately, are so old that they predate the grid pattern).

Anonymous said...

Should two way streets be counted as two streets coming into the intersection, and one way streets be counted as one? I think so, but perhaps a two way street is not worth quite as much as two one-ways at an angle. In fact, I wonder if you should multiply the streets by something like c + (1-c)sin(alpha), where c is some constant in [0,1], maybe 1/2, and alpha is the angle between each street in the intersection and the next one clockwise. (In the case of only two streets, use the smallest angle, so alpha is always in [0, pi/2].) I suggest this because an intersection between streets that are at a greater angle may be more likely to bring together travellers from places that are further apart, and may thus serve a wider area.

But ignoring these considerations, the general approach you mention -- analyzing the intersections -- is essentially the same as the Google algorithm for page rank, where links like intersections and pages are like streets. Indeed, one way of thinking about page rank is to think of it as a rough reading of the likelihood you'd wind up on a particular page by browsing somewhat randomly. And, analogously to your considerations of weighing some streets more heavily than others, Google weighs some links more heavily than others, based on position on the page and based on fonts used, as well as some other factors.

The computation involves reading out the eigenvector of the square adjacency matrix, where the entry for each page gives page rank, and pages are adjacent if there is a link between them. (But there are some manipulations needed to ensure that this can be found and is stable. The matrix is transformed to make it an irreducible, stochastic matrix.) The AMS published
a nice article on it.

Google also takes some measures to identify completely "fake" links, which exist only to boost page rank. Of course there's no equivalent problem with roads.