The Phillies' current record is 46 wins and 46 losses.
When I heard this, I thought "hmm, the Phillies have been at .500 quite often this season". Baseball-reference.com tells us that they have been 0-0 (yes, that counts!), 20-20, 21-21, 22-22, 23-23, 24-24, 26-26, 28-28, 29-29, 44-44, and 46-46 this season; that's eleven times. Is that a lot? (I remember first noticing this between the 40th and 48th games of the season; after they were 20-20 they lost, won, lost, won, lost, won, lost, and won, in that order.)
Given that the team is 46-46, how many times should we expect them to have had the same number of wins and losses? It's a lot easier to work this out, of course, if we replace "46" with some smaller number.
For example, say the team had won two games and lost two games. Then there are C(4,2) = 6 ways we can arrange their two wins and two losses: WWLL, WLWL, WLLW, LWWL, LWLW, LLWW. In the first and last of these, the team was at .500 only at 0-0 and 2-2; in all the others they were also 1-1 after two games. This seems kind of obtuse, but let's flip things around. In six of these possibilities (which we treat as equally likely, as they would be for a team that wins each game with probability one-half), they're 0-0 after 0 games. In four of them, they're 1-1 after 2 games. In six of them, they're 2-2 after 4 games. The expected number of times that the team is at .500? It's (6+4+6)/6, or 16/6.
Sixteen is a power of two.
If we try this again for a 3-3 team, there are C(6,3) = 20 ways we can arrange three wins and three losses; there are 20, 12, 12, and 20 ways to arrange them so that the team is at some point 0-0, 1-1, 2-2, and 3-3 respectively. So the total number of times we expect them to be at .500? It's (20+12+12+20)/20, or 64/20.
Sixty-four is again a power of two. Hmm, this can't be a coincidence.
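The two small cases above are easy to check by brute force. Here is a minimal Python sketch that lists every ordering of n wins and n losses and averages the number of times the team is at .500, counting 0-0. The function name expected_times_at_500 is just made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def expected_times_at_500(n):
    """Average, over all orderings of n wins and n losses, of the number
    of moments (including 0-0) at which wins equal losses."""
    games = 2 * n
    total_visits = 0
    orderings = 0
    # Each ordering is determined by which of the 2n games are wins.
    for win_slots in combinations(range(games), n):
        wins = set(win_slots)
        diff = 0    # wins minus losses so far
        visits = 1  # the 0-0 start counts
        for g in range(games):
            diff += 1 if g in wins else -1
            if diff == 0:
                visits += 1
        total_visits += visits
        orderings += 1
    return Fraction(total_visits, orderings)


print(expected_times_at_500(2))  # 16/6, printed in lowest terms as 8/3
print(expected_times_at_500(3))  # 64/20, printed in lowest terms as 16/5
```

This only works for small n, since the number of orderings C(2n,n) grows very quickly; it is meant as a sanity check, not as a way to handle n=46.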
Let's try to find that sum in the numerator in general. If the team has n wins and n losses (so eventually I'll set n=46 to solve the original problem), then how many ways are there to arrange the wins and losses so that the team wins exactly m of the first 2m games? Clearly this is C(2m,m) C(2(n-m), n-m); we first have to pick which of the first 2m games are the first m wins, then which of the remaining 2(n-m) games are the n-m remaining wins. So what we want to find is the sum
C(0,0) C(2n,n) + C(2,1) C(2n-2, n-1) + ... + C(2n-2, n-1) C(2,1) + C(2n, n) C(0,0)
and I don't see how to do this directly. However, consider the (infinite) power series
1 + 2z + 6z^2 + 20z^3 + 70z^4 + ...
where the coefficients are C(0,0) = 1, C(2,1) = 2, C(4,2) = 6, C(6,3) = 20, C(8,4) = 70, and so on. (This is called the generating function of this sequence; generating functions are a ridiculously powerful tool which I will only scratch the surface of here.) This turns out to be the Taylor series of the function (1-4z)^(-1/2) at z=0. Now, consider what happens if we multiply this power series by itself, so we have
(1 + 2z + 6z^2 + 20z^3 + 70z^4 + ...)(1 + 2z + 6z^2 + 20z^3 + 70z^4 + ...)
= (1)(1) + [(2)(1) + (1)(2)]z + [(6)(1)+(2)(2)+(1)(6)] z^2 + [(20)(1)+(6)(2)+(2)(6)+(1)(20)] z^3 + ...
and the coefficient of z^n is exactly the sum we want to find! But the power series multiplied by itself is just (1-4z)^(-1), so the coefficient of z^n is 4^n.
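If you don't trust the generating-function argument, the identity is easy to spot-check numerically. This is a small Python sketch (the helper name numerator_sum is just for illustration) that computes the sum directly and compares it with the corresponding power of four.

```python
from math import comb


def numerator_sum(n):
    """Sum over m of C(2m, m) * C(2(n-m), n-m), for m = 0, 1, ..., n."""
    return sum(comb(2 * m, m) * comb(2 * (n - m), n - m) for m in range(n + 1))


# Compare the direct sum with 4^n for the first few values of n.
for n in range(6):
    print(n, numerator_sum(n), 4 ** n)
```

Of course this checks only finitely many cases; the power series argument is what shows it for every n.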
Finally, we conclude that if we work out the expected number of times at .500 for a team with n wins and n losses, it's 4^n/C(2n,n). But it's well known that C(2n,n) is approximately 4^n/(πn)^(1/2). So a team with n wins and n losses is expected to have been at .500 very nearly (πn)^(1/2) times.
When n=46, this approximation gives 12.021. (The exact number 4^46/C(92,46) is, to three decimal places, 12.054.) The Phillies have been at .500 eleven times so far; this is actually less than the expectation, which surprised me. A team which is .500 at the end of the season is expected to have been at .500 sixteen times during the season. For the Phillies, though, since they're already at 46-46, that adjusts the estimate upward, to around twenty-two.
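For anyone who wants to reproduce these numbers, here is a short Python sketch comparing the exact expectation 4^n/C(2n,n) with the approximation (πn)^(1/2). The function names are hypothetical; n=46 is the Phillies' current situation and n=81 is a full 162-game season ending at .500.

```python
from fractions import Fraction
from math import comb, pi, sqrt


def exact_expectation(n):
    """Exact expected number of times at .500: 4^n / C(2n, n)."""
    return Fraction(4 ** n, comb(2 * n, n))


def approx_expectation(n):
    """The approximation (pi * n)^(1/2)."""
    return sqrt(pi * n)


for n in (46, 81):
    exact = float(exact_expectation(n))
    approx = approx_expectation(n)
    print(f"n={n}: exact {exact:.3f}, approximation {approx:.3f}")
```

Using Fraction keeps the exact ratio as a rational number until the final conversion, so the huge integers involved don't cause any rounding trouble along the way.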
In general, though, one might not want to use the expectation of a random variable like this. It's possible that most teams which are .500 at the end of the season rarely hit that mark in the middle of the season, but a very few teams are at .500 some ridiculously large number of times. However, the most times a team can be at .500 over a 162-game regular season is 82 (0-0, 1-1, 2-2, ..., 81-81), so the expectation probably is a decent guess. Also, the expectation is often a lot more accessible than more detailed information. It is in this case, because I haven't figured out how to get the whole distribution yet, so I don't know the probability, say, that a 46-46 team has been at .500 eleven or more times. That seems harder to figure out, and the best way to find that number would probably be via a simulation; getting an exact, analytic answer doesn't seem easy.
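Here is roughly what such a simulation might look like in Python. This is only a sketch under the same assumption as above, namely that every ordering of 46 wins and 46 losses is equally likely; the function names, the trial count, and the seed are arbitrary choices for illustration. It shuffles the season many times and records how often the team is at .500 eleven or more times.

```python
import random


def times_at_500(season):
    """Count the moments (including 0-0) at which wins equal losses."""
    diff = 0   # wins minus losses so far
    count = 1  # the 0-0 start counts
    for result in season:
        diff += 1 if result == "W" else -1
        if diff == 0:
            count += 1
    return count


def simulate(n, threshold, trials=20000, seed=0):
    """Estimate the mean number of times at .500 for an n-n team, and the
    probability of being at .500 at least `threshold` times, by sampling
    uniformly random orderings of n wins and n losses."""
    rng = random.Random(seed)
    season = ["W"] * n + ["L"] * n
    total = 0
    hits = 0
    for _ in range(trials):
        rng.shuffle(season)  # uniformly random ordering of the results
        k = times_at_500(season)
        total += k
        hits += k >= threshold
    return total / trials, hits / trials


mean, prob = simulate(46, 11)
print("estimated mean times at .500:", mean)
print("estimated P(eleven or more times):", prob)
```

The estimated mean should land close to the exact expectation of about 12.054, which is a useful check that the simulation is counting the right thing; the probability estimate is only as good as the number of trials.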