Problem 1.1(c). Most present-day [mid-1980s] computers only have one central processor -- to use our analogy, one clerk. This single file clerk sits there all day long working away like a fiend, taking cards in and out of the store like mad. Ultimately, the speed of the whole machine is determined by the speed at which the clerk -- that is, the central processor -- can do these operations. Let's see how we can maybe improve the machine's performance. Suppose we want to compare two *n*-bit numbers, where *n* is a large number like 1024; we want to see if they're the same... [suppose] we can hire *n* file clerks, or 2*n*, or perhaps 3*n*; it's up to us to decide how many, but the number must be proportional to *n*. [How can you get the comparison time to be proportional to log *n*?]

The proof is by induction. The base case, when *n* = 1, is clear -- we can check that two 1-bit numbers are the same in one step. We want to show that if *n* clerks can do this in time *t*(*n*), then 2*n* clerks can do it in time *t*(*n*) + *c*, where *c* is a constant. But that's easy -- split each of the two 2*n*-bit numbers *a* and *b* in half: *a* = *a*₁*a*₂, *b* = *b*₁*b*₂. If *a*₁ = *b*₁ AND *a*₂ = *b*₂, return TRUE; otherwise return FALSE. That's just one more time step once you've figured out whether *a*₁ = *b*₁ and whether *a*₂ = *b*₂.
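The induction above can be sketched in code. This is my own illustration, not anything from Feynman: a recursive comparison that tracks *parallel* time steps, where the two recursive calls are imagined running simultaneously on separate teams of clerks, so each doubling of *n* costs only one extra combining step.

```python
# Hypothetical sketch: compare two equal-length bit strings recursively,
# counting parallel time steps. Since both halves are handled at the same
# time (by different clerks), t(2n) = t(n) + 1, so t(n) = log2(n) + 1.

def compare(a: str, b: str) -> tuple[bool, int]:
    """Return (a == b, number of parallel time steps)."""
    n = len(a)
    if n == 1:
        return a == b, 1  # base case: one clerk, one step
    mid = n // 2
    eq1, t1 = compare(a[:mid], b[:mid])  # one team of clerks...
    eq2, t2 = compare(a[mid:], b[mid:])  # ...in parallel with another team
    # AND-ing the two halves' answers together costs one more step.
    return eq1 and eq2, max(t1, t2) + 1

print(compare("10110100", "10110100"))  # prints "(True, 4)": 1 + log2(8) steps
```

For *n* = 1024 this reports 11 steps -- logarithmic, even though the total *work* (number of clerk-steps) is still linear.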

The picture I get in my head of this situation makes me smile. You need n clerks -- n parallel processors. The *k*th clerk compares the *k*th bit of the first number with the *k*th bit of the second number; ey returns a 0 if the two bits are different and a 1 if they're the same. (This is, of course, bitwise exclusive OR.) Then the clerks combine their answers in pairs with AND, and pairs of pairs, and so on, so the final answer comes out after time proportional to log₂ *n*.
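That two-phase picture can be sketched as follows -- again my own toy illustration, with the pairwise-AND tree made explicit (note the per-bit test returns 1 when the bits *match*):

```python
# Sketch of the clerks' two phases: each of the n clerks tests one bit pair
# in a single step; then the n results are AND-ed together pairwise in a
# tree, taking ceil(log2(n)) combining rounds.

def parallel_equal(a: str, b: str) -> tuple[bool, int]:
    """Return (equal?, number of pairwise combining rounds)."""
    # Phase 1: one step, all n clerks working simultaneously.
    results = [1 if x == y else 0 for x, y in zip(a, b)]
    rounds = 0
    # Phase 2: halve the list each round by AND-ing adjacent pairs.
    while len(results) > 1:
        results = [results[i] & results[i + 1] if i + 1 < len(results)
                   else results[i]
                   for i in range(0, len(results), 2)]
        rounds += 1
    return bool(results[0]), rounds

print(parallel_equal("10110100", "10110100"))  # prints "(True, 3)"
```

For *n* = 1024 the combining phase takes 10 rounds, matching log₂ 1024.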

The next subproblem in Feynman is to *add* two *n*-bit numbers in logarithmic time by using such parallel processing. (This is harder because you have to worry about carrying!) Again, it reduces to a problem of combining information on words of one length to words of twice that length -- if you have *a* = *a*₁*a*₂ (where *a*₁ is the "high", or more significant, half and *a*₂ is the low half), and *b* = *b*₁*b*₂, what is *a* + *b*? Breaking it down into its high half and its low half, *a* + *b* = (*a*₁ + *b*₁ + *c*)(*a*₂ + *b*₂), where *c* denotes a *carry* from the addition of *a*₂ and *b*₂, and *a*₂ + *b*₂ denotes addition modulo 2^*k*, where *k* is the length of the summands. We assume *a*₁ + *b*₁, *c*, and *a*₂ + *b*₂ are already known. The problem is... what if *a*₁ + *b*₁ ends in a bunch of 1's, and *c* is 1? We have something like

...0111111 + 1

and we want to add the 1, so we think "okay, I'll just change the last bit in the first summand from 0 to 1"... but it's 1... so you change the last two bits from 01 to 10... but no, that won't work either... so you change the last *three* bits from 011 to 100...

*On average* this isn't a problem. *On average* the number of 1s at the end of *a*₁ + *b*₁ is unity.

But not always. Somebody could be deliberately feeding you numbers that are hard to add. Some sort of sadistic grade school teacher.
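The recursive identity *a* + *b* = (*a*₁ + *b*₁ + *c*)(*a*₂ + *b*₂) can be sketched in code. This is my own sketch, not Feynman's circuit; it also shows the standard dodge around the ripple problem ("carry-select"): compute the high half both with and without an incoming carry, in parallel, then pick the right one once *c* is known -- one extra step per doubling, so O(log *n*) overall.

```python
# Sketch: add two k-bit numbers (k a power of 2), returning the k-bit sum
# modulo 2**k together with the carry-out. The "+ c" into the high half is
# exactly where the worst-case ripple lives; precomputing both versions of
# the high half (carry-in 0 and carry-in 1) sidesteps it.

def add(a: int, b: int, k: int) -> tuple[int, int]:
    """Return (a + b mod 2**k, carry_out) for k-bit a and b."""
    if k == 1:
        s = a + b
        return s & 1, s >> 1
    h = k // 2
    mask = (1 << h) - 1
    # Low and high halves, summed "in parallel" by separate clerks.
    lo_sum, lo_carry = add(a & mask, b & mask, h)
    hi_sum0, hi_carry0 = add(a >> h, b >> h, h)  # assuming carry-in 0
    # Carry-select: the carry-in-1 version of the high half.
    hi_sum1 = (hi_sum0 + 1) & mask
    hi_carry1 = 1 if hi_sum0 == mask else hi_carry0
    if lo_carry:
        return (hi_sum1 << h) | lo_sum, hi_carry1
    return (hi_sum0 << h) | lo_sum, hi_carry0

print(add(0b01111111, 0b00000001, 8))  # prints "(128, 0)": 0111111... flips over
```

The ripple the post worries about still happens inside `hi_sum0 + 1`, of course -- a real carry-select adder precomputes that too, at every level, which is why the hardware version needs so many more "clerks" than the comparison problem did.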

edited, 10:48 am: Welcome to those of you who came here from reddit. Be aware that I am not a computer scientist -- I'm a mathematician -- so I might not really know what I'm talking about here.

## 8 comments:

> But not always. Somebody could be deliberately feeding you numbers that are hard to add. Some sort of sadistic grade school teacher.

From what I remember of my halcyon computer-science days, this is the difference between "worst-time" analysis and "amortized" analysis.

I've heard of amortized analysis; I'm not sure if that's exactly what I'm doing here, or if what I did here was what they call "average-case" analysis.

Amortized analysis is where you prove that aggregating a bunch of consecutive operations takes a given order of time (often less than the worst case for a single operation).

As you said, what you mentioned above is closer to average-case analysis.

Amortized analysis usually has some concept of which operations are "cheap" and which ones are "expensive", and then constructs a proof that there are enough cheap operations to cancel out the expensive ones.

You didn't mention the complexity of addition, but using O(n^3) clerks (assuming a clerk is a transistor) you can add in worst-case O(log n) time.

Sorry, colin's right. As I said, it's been a long time...

But the two aren't wholly unrelated. Let's say you're given the task of adding up a big list of pairs of binary numbers. You can partition them into rough difficulty classes and say "these are easy, those are hard". Then you get a running time for easy sums and hard sums, and average over some probability distribution. Is this amortized or average-case?

That's average case, because you can't guarantee the proportions of easy to hard numbers across all possible inputs.

Imagine something like a resizable array, where each time the amount of space allocated to the array runs out we double the amount of space allocated. Then adding to the array is amortized O(1), since most of the time we just write into the available space (O(1)) and each time we run out of space we have to allocate more and copy all the old data over (O(n)).

Call the amount of space allocated the "size" of the array, and the number of items in it the "length". If the array starts with size 4 and length 0, adding the first four items takes O(1) time each. Adding the fifth item (n = 5) takes O(n), as we have to copy all 5 items across. The size is now 8, so adding the next three items (n = 6, 7, 8) takes O(1) time. Adding the ninth takes O(n). And so on.

Because we expand the size of the array by a constant proportion (multiply by two in this case), each doubling takes O(n) time but only happens 1/n as often (at n = 4, 8, 16, 32 ...) so *on average* this is an O(1) operation. Because this property has nothing to do with the nature of the inputs, it is amortized O(1) time.
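[A toy version of the resizable array described above, to make the constant concrete -- my own sketch, counting unit-cost writes and copies per append:]

```python
# Hypothetical sketch: a resizable array where "size" is allocated capacity
# and "length" is items stored. Each append costs 1 write, plus "length"
# copy operations whenever we run out of room and double the allocation.

class ResizableArray:
    def __init__(self):
        self.size = 4          # allocated slots
        self.length = 0        # items actually stored
        self.total_work = 0    # unit-cost writes + copies

    def append(self, item):
        if self.length == self.size:
            self.size *= 2                  # allocate double the space...
            self.total_work += self.length  # ...and copy everything over
        self.length += 1
        self.total_work += 1                # write the new item

arr = ResizableArray()
for i in range(10_000):
    arr.append(i)
print(arr.total_work / arr.length)  # prints "2.638": a small constant
```

The work per append stays below 3 no matter how many appends you do -- and, as noted above, that bound holds for *every* input sequence, which is what makes it amortized rather than average-case.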

See also http://en.wikipedia.org/wiki/Amortized_analysis and http://en.wikipedia.org/wiki/Dynamic_array

> returns a 0 if the two bits are different and a 1 if they're the same. (This is, of course, bitwise exclusive OR.)

I think you meant XNOR.
