## 02 December 2007

### Redistricting maps

I've previously mentioned the shortest splitline algorithm for determining congressional districts in the United States. (This could work in other districting situations, as well.) The algorithm breaks states into districts by breaking them up along the shortest possible lines; for a more thorough description see this.

Well, today I got an e-mail from the good people at rangevoting.org saying that they now had computer-generated maps of their algorithm's redistrictings for all fifty states. These, I assume, supersede their approximate sketches.

I'm not entirely sure how good an idea this particular redistricting algorithm is. Basically, assuming that straight lines are the right way to break things up seems to imply that all directions should be treated equally, when actual settlement patterns aren't isotropic. But the beauty of any algorithm which doesn't include any "tunable" parameters -- of which this is an example -- is that there is zero possibility of gerrymandering. If we imagine an algorithm that takes into account "travel time" between points, for example, instead of as-the-crow-flies distance, then how do you define travel time? And next thing you know, you'll see new roads getting built because of how you'd expect them to change congressional districting. As-the-crow-flies distance along the surface of the earth doesn't have these issues.

Not surprisingly, the people at rangevoting.org also support something called range voting, which would basically allow people to give scores to candidates in an election, and the candidate with the highest total score would win. I haven't thought much about it, but it seems like a good idea. And here's their page for mathematicians!
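
A minimal sketch of such a tally, in Python, might look like the following. The function name `range_voting_winner`, the ballot format (one dict of scores per voter), and the decision to compare score totals rather than averages are assumptions made for illustration; the rules proposed at rangevoting.org may differ in details such as blank scores or tie-breaking.

```python
from collections import defaultdict

def range_voting_winner(ballots):
    """Return (winner, totals) for a list of range-voting ballots.

    Each ballot maps candidate name -> numeric score.
    Assumption: a candidate left off a ballot gets no points from that voter.
    """
    totals = defaultdict(float)
    for ballot in ballots:
        for candidate, score in ballot.items():
            totals[candidate] += score
    # Ties are broken by whichever candidate was seen first (an arbitrary choice).
    winner = max(totals, key=totals.get)
    return winner, dict(totals)

# Three hypothetical voters scoring three hypothetical candidates on a 0-10 scale.
ballots = [
    {"A": 10, "B": 6, "C": 0},
    {"A": 0, "B": 7, "C": 10},
    {"A": 3, "B": 10, "C": 0},
]
winner, totals = range_voting_winner(ballots)
print(winner, totals)
```

Here B wins with 23 points even though only one voter scored B highest, which is the kind of outcome a score-based method allows and a plurality count would not.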

Ian said...

Thanks for the links to the site. I'd not seen it before, despite a deep interest in voting methods and systems.

After losing about an hour scouring around the site, it strikes me as a "doth protest too much" sort of problem. The hint is when the site claims that a change to range voting would provide more social benefit than the original move to democracy in the US.

The only way this is shown is in some aggregation of utility gleaned from how people enter their values in the voting vector. The assumption that's bad isn't a mathematical one, it's a social one. The method purports to measure some form of "regret" (that it's Bayesian is a statistical issue that seems largely like it could be skipped and just straight sums of votes counted) that takes the difference between the social welfare had in the case of one candidate and the "optimal" candidate. Of course, the optimization is again done using individual "utility". Which is fine, as far as it goes.

The real problem is that this skips the whole problem, well talked about in economics and political science and elsewhere, that a social welfare function is impossible. You cannot sum up the revealed preferences of voters and say "ah ha! I know how to make you all the happiest". You can't do it because of the unmeasured disutility I might get from not only seeing your candidate beat mine, but from my just generally not liking to see you happy. I'm sure the suggestion that there is a full range from -1 to +1 inclusive that you could enter as your preference for a candidate is meant to address this, but in the site's discussion of objections, it explicitly says it doesn't want to measure distance (a metric) for happiness. But this is precisely what would be required for a true social welfare function. One would have to know how much disutility I might get (if any) from seeing you get utility. And you'd have to know just HOW much I don't like a certain candidate.

Does ranking George Bush a -1 versus a Gore at +1 really capture the full level of dislike one might have for GWB (or vice versa, to take all points of view into account)? Changing the scale does nothing either. I'm pretty sure you'll be able to find a lot of people for whom a "Negative Gazillion" wouldn't be enough to rate GWB properly. This isn't because they are strategic; just the opposite, as some people are trying to get a real measure for their honest feelings. I just don't think you can sum up a couple of values, take their distance and claim you've minimized "regret" while at the same time claiming you don't want a metric.

The real problem is that this skips the whole problem, well talked about in economics and political science and elsewhere, that a social welfare function is impossible. You cannot sum up the revealed preferences of voters and say "ah ha! I know how to make you all the happiest". You can't do it because of the unmeasured disutility I might get from not only seeing your candidate beat mine, but from my just generally not liking to see you happy.

Well, that's just wrong. The individual utilities of the voters represent all of that. They represent the sum of the satisfaction derived from the candidates' hairstyles, voices, policies, clothing choices, etc. as well as from the satisfaction of seeing others be happy, or suffer, or what have you. The voters in the utility simulations vote on the basis of how happy they'd be to see the various candidates win or lose. So the utilities they have are right, and the simulations were done properly.

in the site's discussion of objections, it explicitly says it doesn't want to measure distance (a metric) for happiness.

I'm not sure what you're talking about. This is clearly a misunderstanding. Please explain precisely what you mean.

But this is precisely what would be required for a true social welfare function. One would have to know how much disutility I might get (if any) from seeing you get utility. And you'd have to know just HOW much I don't like a certain candidate.

The utilities in the calculations represent whatever the real utilities of these voters would be, without any regard for HOW they were determined. It doesn't matter whether a voter wanted Obama to win just to make his friend upset, or to make his mother happy, or anything like that. What one ultimately has is utilities, and then he votes in order to maximize his derived utility. Election simulations DO NOT CARE why you like the candidates you like. It's completely and utterly irrelevant, aside from ensuring the distribution of utilities is relatively realistic (and the Princeton math Ph.D. who did these sims took pains to make them suitably realistic).

He also added ignorance factors, to represent the disparity between the perceived value of a candidate, and the real value of a candidate. That is, maybe I really wanted Hillary to win, just because I wanted to see my country go down in socialist/military-industrial-complex flames. But then say she actually surprised me by re-instating the Constitution, and totally disappointing me. Behold, the disparity between what I expected and what I got. You could also relate that to the imperfection in one's judgment of how much the result would tweak someone else's nose hairs (since he can't read that person's mind and know his precise utilities).

The point I'm rather long-windedly making here, is that such simulations only have to measure how well a voting system satisfies your preferences. It is agnostic to your specific reasons for liking what you like, and there's no reason these simulations have to care about that. Make sense?

Does ranking George Bush a -1 versus a Gore at +1 really capture the full level of dislike one might have for GWB (or vice versa, to take all points of view into account)?

Of course not. That's why Range Voting gets between a 78% and a 95% social utility efficiency, and not a (perfect/mind-reading) 100%.

I'm pretty sure you'll be able to find a lot of people for whom a "Negative Gazillion" wouldn't be enough to rate GWB properly.

Which is why the Bayesian regret calculations on RangeVoting.org show that Range Voting is close to perfect, but not quite actually perfect. And hugely better than e.g. IRV, Condorcet, Borda, plurality, etc.

I just don't think you can sum up a couple of values, take their distance and claim you've minimized "regret" while at the same time claiming you don't want a metric.

You're absolutely not understanding. Here is a helpful page that will explain these measurements better to you.

Ian said...

Thanks for the link to the page, but I'm pretty sure I get what's going on.

And this is probably a debate the site proprietor would like to have elsewhere. But I'll take another shot, briefly.

I don't doubt the simulations -- and especially the maths -- were done correctly. (And, frankly, I'm not sure this required a "Princeton Ph.D". The school, even with a nice name, seems largely irrelevant. Is math more right when it's from an Ivy?)

Getting "around" Arrow's impossibility theorem is pretty standard fare, so long as you're clear that you're violating one of the criteria. It's admitted on the site that the IIA criterion is broken, but since Arrow is said to have given a "silly" definition I'm not sure this is a fruitful area of argument.

What I have true problems with is the focus on maximizing social welfare. What this seems to do, and well I think, is find the winner who is most liked by the most people. But this isn't the same as social welfare. It's revealed preference over a limited range of choices. No one cares how a person comes up with their ordering of preferences, but the amount of value on each is an important factor in societal utility. If you want to maximize the societal utility of an election, you'd also need to minimize the disutility people have from seeing their lower-ranked choice win. Say I give A a -1 and B a +1, and you do the opposite. Then say A wins. But I may hate this more than 10 times the amount you like it. While you may have only disliked a B win 3 times as much as an A win. If the population is evenly balanced between people like us and the pivotal voter voted for A just barely above B and is nearly indifferent over the outcomes, the A win doesn't really maximize "social welfare" (largely because you can't get a good measure). The system cannot make this kind of inter-personal utility comparison.

What one ultimately has is utilities, and then he votes in order to maximize his derived utility.

That is, maybe I really wanted Hillary to win, just because I wanted to see my country go down in socialist/military-industrial-complex flames. But then say she actually surprised me by re-instating the Constitution, and totally disappointing me. Behold, the disparity between what I expected and what I got.

And this is what I mean. Now the voting system solves the problem of inter-personal utility comparison AND the problem of intertemporal utility comparison?

The intensity of preference isn't captured since the range of numbers is only ordinal, not cardinal. The difference between a 1 and a .9 isn't necessarily me liking the second 10% less. For some it might be, but if for any one person it isn't, then any claims to interpersonal comparison are lost. And my preference the day before the election isn't what it might be the day after. I might want socialism today, then tonight after an attack by Canada, really want fascism to reign so the northern menace is crushed.

But to be fair, the system only wants to be better than current methods. Which it definitely appears to be in terms of getting a clearer measure of voter intent at that time.

John Armstrong said...

The problem with "social welfare" functions and utility functions in general is that they're either oversimplifications or meaningless.

Ian claims that they oversimplify, and that they don't take into account all sorts of other sources of motivation. He seems to take the narrower view of utilities as something that can be counted up from this source or that.

Broken Ladder claims that the more ephemeral sources are actually contained in the utility functions already, and we don't have to explicitly list where utility comes from. This is the wide-ranging view that leads Benthamites into the tautology that people act to maximize utility because utility is what is maximized by people's actions.

Here the wide view is meaningless because it, by definition, can't be accounted for. And so it can't be used to make predictions or set policy. It's a great idea for theories, but it's completely useless in practice.

Ian said...

John,

That's essentially my point, with one modification. In this kind of analysis, I don't really care how people arrive at their utility functions. Revealed preference is the only useful thing that can be taken away from voting, precisely because the interpersonal comparison is impossible. It's not that I want to sum things up from this or that, I just exclude the notion that if you take some stated ranking and find an optimum, you have somehow included the intensity of preference. And so if you can't get intensity of preference, you can't claim to have minimized (max'ed) social disutility (utility).

Ian,

Range Voting does not violate any of Arrow's criteria, as it is a cardinal voting method, and Arrow's theorem (and the more important Gibbard-Satterthwaite theorem) apply only to ordinal (rank-order) voting methods. This is a common mistake.

And Range Voting passes IIA in the sense that, looking at a set of RV ballots, adding or removing an irrelevant alternative to the set that we are evaluating will never change the outcome. That is not true of any rank-order system. Range Voting only fails IIA in the sense that people will actually cast different scores in real life, because of things like strategy, and normalization. No voting method can escape such effects of course (unless we use hedonimeters to cast honest utility values, free of any insincerity).

What I have true problems with is the focus on maximizing social welfare. What this seems to do, and well I think, is find the winner who is most liked by the most people.

Absolutely. This is mathematically "proven" to be the ideal goal of a social choice system ("voting method").

But this isn't the same as social welfare. It's revealed preference over a limited range of choices.

Incorrect. Range Voting is (somewhat) revealed preference, over a limited range of choices. If you had read the link I posted, you would understand that we do not calculate social utility efficiency (an inverse, scaled expression of Bayesian regret) by treating the Range Voting scores as actual revealed utility values. The digital voters have real private utilities in their "minds", that they are free to lie about when they cast their votes. We then compare voting methods based upon the ACTUAL net social welfare, considering strategy, ignorance, and other factors - which we can do in these simulations, since we can precisely read the minds of the voters.

See, in real life we could only do this by reading the minds of voters, which is impossible. But it's quite possible to start with a realistic distribution of utilities, and then model how real voters would vote, given those utilities. A sincere voter who prefers Nader, votes for Nader. A strategic voter casts a vote for his favorite front-runner, realizing that a sincere vote would have a lower expected value. We tweak the proportion of strategic/honest voters, and other variables like that, and average millions of elections together, and see what happens.
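
To make the sincere/strategic distinction concrete, here is a minimal Python sketch of how one simulated voter might cast a plurality ballot. The function `plurality_vote`, the utility numbers, and the idea that the front-runners are supplied as a fixed list are all assumptions for illustration; this is not the code used in the simulations discussed here.

```python
def plurality_vote(utilities, frontrunners, strategic):
    """Pick the single candidate a voter names on a plurality ballot.

    utilities: dict candidate -> this voter's private utility
    frontrunners: candidates assumed to have a realistic chance of winning
    strategic: if True, the voter only considers the front-runners
    """
    options = frontrunners if strategic else list(utilities)
    return max(options, key=lambda c: utilities[c])

# Hypothetical private utilities for one voter.
voter = {"Nader": 1.0, "Gore": 0.6, "Bush": 0.0}
frontrunners = ["Gore", "Bush"]

sincere_choice = plurality_vote(voter, frontrunners, strategic=False)
strategic_choice = plurality_vote(voter, frontrunners, strategic=True)
print(sincere_choice, strategic_choice)
```

A full simulation would draw many such voters from a utility distribution, mix strategic and sincere behavior in some chosen proportion, and repeat over many elections.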

So Smith knew what he was doing, and did it correctly.

If you want to maximize the societal utility of an election, you'd also need to minimize the disutility people have from seeing their lower-ranked choice win.

The simulations calculate the net (social) utility of each election, based on who won. Maybe voter 12334 preferred McCain, and voter 33245 hated him. That will be reflected in the summed utility. So the utility calculations do that, as explained in the link I posted.

Say I give A a -1 and B a +1, and you do the opposite. Then say A wins. But I may hate this more than 10 times the amount you like it.

Which the simulations take into account, as explained in the link I posted.

While you may have only disliked a B win 3 times as much as an A win. If the population is evenly balanced between people like us and the pivotal voter voted for A just barely above B and is nearly indifferent over the outcomes, the A win doesn't really maximize "social welfare" (largely because you can't get a good measure). The system cannot make this kind of inter-personal utility comparison.

Sure it can, because it knows the exact utility values for each voter, and knows all this stuff about intensity of preference. You need to understand the difference between actual utility values (up in the voter's head) and scores that are marked down (on the ballot).

Now the voting system solves the problem of inter-personal utility comparison AND the problem of intertemporal utility comparison?

It doesn't "solve" it. That is, Range Voting doesn't make voters clairvoyant. It just better satisfies the electorate, even when we take voter non-clairvoyance into account.

The intensity of preference isn't captured since the range of numbers is only ordinal, not cardinal.

Range Voting is cardinal, not ordinal. An ordinal voting method is one that only considers order of preference, and nothing more. Range Voting uses scores, not a description of orders.

The difference between a 1 and a .9 isn't necessarily me liking the second 10% less.

We (arbitrarily) describe an "honest" Range Voting ballot as having one's least liked candidate be a 0 (or whatever the minimum score is), and the most liked be a "10" (or whatever the biggest score is), such that a 9, for example would represent a decrease in utility (relative to a 10) equal to 10% of the difference in utility between the favorite and least favorite candidates.
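
The normalization just described can be sketched in a few lines of Python. The function name `honest_scores`, the 0 to 10 default range, and the handling of a completely indifferent voter are assumptions for illustration only.

```python
def honest_scores(utilities, min_score=0, max_score=10):
    """Linearly rescale private utilities so that the least liked candidate
    gets min_score and the most liked candidate gets max_score."""
    lo = min(utilities.values())
    hi = max(utilities.values())
    if hi == lo:
        # Assumption: a voter with no preference gives everyone the top score.
        return {c: max_score for c in utilities}
    span = max_score - min_score
    return {c: min_score + span * (u - lo) / (hi - lo)
            for c, u in utilities.items()}

# Hypothetical utilities: B is worth 10% less of the best-to-worst gap than A.
scores = honest_scores({"A": 2.0, "B": 1.8, "C": 0.0})
print(scores)
```

With these inputs A maps to 10, C to 0, and B to 9, since B's utility falls short of A's by one tenth of the gap between A and C.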

Now, real human voters can exaggerate, or make mistakes, or whatever. The simulations take all of that into account.

Moral: all this stuff that you think we have not thought of, we have thought of. And you would know that if you had...

For some it might be, but if for any one person it isn't, then any claims to interpersonal comparison are lost.

Wrong. We can make perfect interpersonal comparisons of utility, because we can read (and even re-write) the digital voters' actual private utility values, in their digital brains. Let me repeat, one more time: scores are a function of, among other things, honest utility values. Scores are NOT intended to be synonymous with utility values.

And my preference the day before the election isn't what it might be the day after.

Which is why the simulations incorporate random "ignorance factors", which represent the difference between prior estimation of utility, and ultimate satisfaction (e.g. in retrospect, after the candidate has served his term, or at the time of the voter's death).

To my knowledge (you are free to pore over the full datasets if you like), voter ignorance levels don't have a very strong differential effect among voting methods. That is, if you turn the knob on the simulations from "society is completely ignorant" to "society can see the future that would result from the election of any candidate", the relative quality of the voting methods would not change hugely.

In any case, you are but one in an endless string of critics to make such arguments, and think you've spotted things that were missed. Moral: they weren't missed, and the simulations are correct.

John,

The simulations were certainly not "oversimplifications". They included 5 knobs governing such variables as the proportion of strategic-vs.-honest voters, the number of candidates, voter ignorance levels, etc. Different utility distribution systems were used, and many other factors were taken into account.

The results show that Range Voting is an improvement in net satisfaction comparable to the advent of democracy. That is most certainly a reason to implement Range Voting for all sorts of social choices, especially political ones. You have not cited any flaws in these simulations, or any reason whatsoever to call them "meaningless". Do you believe that having a "more happy" or "more satisfied" society is a meaningless concept?

Ian said...

BL,

We're clearly not going to come to agreement on the central problem. I believe that in no way RV captures anything other than a revealed choice. Precisely because you cannot know what is in someone's mind you cannot maximize social welfare. It is, to be frank, a ludicrous notion.

Your assertion that the scale is cardinal is also beyond support. That I rank people finely or crudely makes no difference. This is still putting people in order. You ASSUME some distance between preference means something -- a 10 to a 9 is a 10% decrease -- when in fact you have no way to know that it means such for each person. You are, then, violating the dictatorship criterion, since you are restricting certain methods of making choices. You state this explicitly by saying you assume some normal range for people. People are mapping their understanding of their preferences into your scale. Your scale isn't some method for capturing a "true" utility.

And, in re: the Poundstone book -- since no one argued that the current system is the best choice, this seems entirely tangential at best, or an example of confirmation bias at worst.

Ian,

You are correct that we cannot know what is in someone's mind. Bayesian regret for a voting method is the difference between

1. The utility we get by using a particular voting method, and
2. The utility we'd have gotten if we could have read the voters' minds

So if we could select a winner by putting hedonimeters on every voter's head, and precisely measuring his honest utility, we'd pick the social welfare maximizer, and have a Bayesian regret of precisely ZERO, i.e. a "social utility efficiency" of 100%.
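
For a single simulated election, the difference between items 1 and 2 can be sketched in Python as follows. The function name `regret` and the sample utilities are assumptions for illustration; Bayesian regret proper would be the average of this quantity over many randomly generated elections, which this sketch does not attempt.

```python
def regret(true_utilities, winner):
    """Social utility lost in one election by electing `winner`
    instead of the candidate with the highest summed utility.

    true_utilities: list of dicts, one per voter, candidate -> private utility
    """
    candidates = true_utilities[0].keys()
    social = {c: sum(voter[c] for voter in true_utilities) for c in candidates}
    best = max(social.values())  # what a "mind-reading" method would achieve
    return best - social[winner]

# Hypothetical private utilities for three simulated voters.
voters = [
    {"A": 1.0, "B": 0.25},
    {"A": 0.0, "B": 1.0},
    {"A": 0.5, "B": 0.75},
]
regret_if_a_wins = regret(voters, "A")  # summed utility: A = 1.5, B = 2.0
regret_if_b_wins = regret(voters, "B")
print(regret_if_a_wins, regret_if_b_wins)
```

Electing B gives zero regret here because B has the higher summed utility; electing A costs 0.5 units of total utility.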

Range Voting instead achieves around a 78% social utility efficiency if voters are strategic, and around a 95% if they are honest. So, by definition, Range Voting doesn't maximize social welfare - not in the absolute sense. But it is far superior to the other common methods that have been proposed, such as plurality, IRV, Condorcet, and Borda. And yes, there are methods that are even more utilitarian, but they come at a price of increased complexity/cost.

Your assertion that the scale is cardinal is also beyond support.

Well, no. It's a simple fact that you can check in the glossary of various mathematics texts.

That I rank people finely or crudely makes no difference. This is still putting people in order.

On the contrary, it makes all the difference, if you understand the definition of "ordinal" vs. "cardinal".

You ASSUME some distance between preference means something -- a 10 to a 9 is a 10% decrease -- when in fact you have no way to know that it means such for each person.

False. We don't make any such assumption.

You are, then, violating the dictatorship criterion

False. Range Voting does not specify that any voter have more power than any other voter, much less complete dictatorship against any number of opposing voters. This is a complete fabrication on your part.

..since you are restricting certain methods of making choices.

All voters are equally "restricted". They are given an allowed range of scores, and can use any score within that range.

You state this explicitly by saying you assume some normal range for people. People are mapping their understanding of their preferences into your scale. Your scale isn't some method for capturing a "true" utility.

Of course it's not, as I explained to you at great length. The voters have private "true utility" values in their heads, which are considered when they cast their ballots, along with factors like strategy, ignorance, and normalization. We never ever ever said that the scores are an exact measure of utility.

And, in re: the Poundstone book -- since no one argued that the current system is the best choice, this seems entirely tangential at best, or an example of confirmation bias at worst.

His book specifically argues for Range Voting, not just for something better than the current system.

In any case, please read the link I posted, and read my previous responses to you, and actually stop and process what has been said before sending another reply that demonstrates a total lack of understanding of this subject.

Thank you.