Terence Tao gives a (well-known) proof of the crossing number inequality. This inequality states that for a graph of average degree at least eight,
cr(G) ≥ E3/(64V2)
where V is the number of vertices of G, E is the number of edges of G, and cr(G) is the so-called "crossing number" -- the minimum number of crossings of edges in a drawing of the graph. (As is usual in the "probabilistic method", a fairly simple proof gets the same form as the best known proof but with weaker constants; a paper of Pach, Radiocic, Rados, Tardos, and Gabor1 improves the constant to 31.08. (From a quick skim through the paper, it doesn't look too interesting, being mostly a bunch of messy computations.)
Why should the crossing number inequality have these exponents? Say that we expect a lower bound for cr(G) of the form cEαVβ. (We have this, with c = 1/64, α = 3, β = -2.)
Say we replace G with two copies of G. Then we double its number of edges, its number of vertices, and its number of crossings. Our proposed lower bound now becomes c(2E)α(2V)β, and we'd like this to equal twice the old bound. So we have
cEαVβ = c(2E)α(2V)β
and canceling what will cancel, we get α + β = 1. (Of course, this is all technically wrong, because I've been adding inequalities like they're equalities.) So if we can guess what either α or β will be, we win! (What do we win? A cup of bad coffee. Please go to any mathematics department to claim your prize.)
So now let's say we double the number of edges in a graph; we want to know why this should multiply the number of crossings by 23 = 8. This seems not so promising. Imagine two graphs on the same vertex set, G and G', which we've smushed together to make a new graph. There is some number of crossings among edges of G, and some number among edges of G'; we want the total nubmer of crossings on this new graph to be eight times the number on either of the original graphs. So the number of crossings involving an edge of G and an edge of G' must be six times the crossing number of G or G'; this might be true but seems hopeless. (Throwing three graphs on top of each other only makes it worse; now we need twenty-seven times as many crossings on G + G' + G'' as we would on any of the three summands, and so we'll have to get eight times cr(G) crossings from each pair of summands. It only gets worse with more summands.
Okay... here's another idea. Hold V constant. Then the inequality says that the crossing number is at least proportional to the cube of the number of edges. This is strange, as it seems to imply somehow that a crossing is going to involve three edges. But now think about how many potential crossings there are in a graph with E edges. These are just pairs of edges, so there are about E2/2 of them. So the inequality is just saying that the probability of two edges crossing in a graph is proportional to the total number of edges!
But what makes things cross? Well, edges that are long are more likely to cross than edges that are short. (It's like how you find yourself maneuvering around fat people on the sidewalk, but not skinny people.) In particular, if we have two curves of length l1 and l2, their probability of crossing -- assuming they move around "randomly" -- will be proportional to the product of their lengths. But how long are the edges in a drawing of a graph? In order to minimize crossings, we probably want edges which are connected to be near each other. And the vertices are probably "evenly spaced" in the plane; say each one has an area of order 1 that is closer to it than to any other vertex. Let d = 2E/V be the average degree of our graph; then the optimal drawing of a graph should have the edges connected to any given vertex v within a circle of area ~d around that vertex... or a circle of radius ~d1/2. Thus edge length is proportional to the square root of the average degree... and the probability of two edges crossing is proportional to the square of average edge length. So the probability of two edges crossing is proportional to average degree, which if we're holding V constant is proportional to the number of edges.
Just like we wanted.
Of course, there are a lot of places where this argument doesn't make sense; in particular I feel a bit wrong about my argument for the probability of random curves crossing being proportional to the product of their lengths. But it's still how I will understand why this result is true.
(There is a way to reinterpret the crossing number inequality, if for some reason that V in the denominator bothers you, as it did me at first We can interpret it in terms of the average degree d of a graph G, which is just 2E/V. Then we have
cr(G) ≥ Ed2/256
and this form seems clearer to me. Crossing number increases linearly with number of edges, which makes perfect sense -- if you take two disjoint copies of a graph, they each have the same number of crossings, so the total number of crossings should be twice what you started with. And it increases with the square of the average degree. None of this silly stuff with the number of vertices in the denominator. The problem with this, as I discovered, is that the average degree depends on the number of edges...)
1. Pach, János; Radoi\v ci\'c, Rado\v s; Tardos, Gábor; Tóth, Géza Improving the crossing lemma by finding more crossings in sparse graphs. Discrete Comput. Geom. 36 (2006), no. 4, 527--552. Available from Tardos' web site at http://www.renyi.hu/~tardos/crossinglemma.ps.