C / C++ FAQs & Programming Resources - ProkutFAQ : SwapTwoNumbersWithoutTemporary

HomePage Recent Changes Recently Commented Login/Register
Quick Links
Categories
Links

How to swap two numbers without using temporary variable ?

This one would figure in anyone's list of elite questions in Orkut programming communities that are guaranteed a minimum of 50 responses everytime someone raises it, if at all anyone were to be jobless enough to compile one such list. This question is often asked in many interviews in India and so obviously many would be interested in knowing the answer.

And it would come as a rude jolt to many when confronted and told that there is no standard conformant way of doing this which would apply equally well to all built-in types.

I would like to discuss all the solutions that have been offered so far.

a=a+b;
b=a-b;
a=a-b;


The above method wont work because it doesn't take into account overflow errors. Suppose for example if we had to swap two signed numbers 1 and 32767 on a system where int is 16 bits long. On such a system, the maximum value an int can store (INT_MAX) would typically be 32767.

So when the first step a+b is executed the result would be 32768, which would be larger than the maximum value that can be stored in an int variable. Consequently this value cant be stored in a and hence an overflow occurs. What happens after this is undefined. No matter what happens under the hood, the result is bound to be undefined and hence this method fails whenever an overflow occurs.

a=a*b;
b=a/b;
a=a/b;


This method would fail for the same reason the previous code snippet would fail. Here a*b can overflow. This has more problems as well. a/b can overflow too. This method would also fail if b were zero, as it would cause divide by zero error. So this is another worthless suggestion.

a=a^b;
b=b^a;
a=a^b;


The method above wont work for floating point values. Bitwise operators are not defined for floating points and hence this solution can be used only when the numbers to be swapped are not floating points. Usage of this method on variables other than int (char, long, short) would result in Undefined Behavior(UB). In all the three methods above, the compiler implicitly uses temporary values to store the result from the mathematical operations and this temporary value is then copied into the l-value. So technically we are using temporary values here too. But let us proceed assuming that the question here is to swap two variables without defining any temporary variable in the user code.

a^=b^=a^=b;


a=a+b-b=a;


The two code snippets shown above impresses almost every newbie the first time they see it. They would leave no stone unturned in praising the author of this code. Little do they realize that this code is not guaranteed to produce the same result on every system they test it on.

Here we are entering the realm of Undefined Behavior often abbreviated as UB.

There are three kinds of operators: unary, binary and ternary. Unary operators are those which operate on one and only one operand. Unary+, Unary-,dereferencing operator* are some examples. Binary operators are those which operate on two operands. Addition (+), Subtraction (-), Logical Or (||) are a few examples of binary operators. There is only one ternary operator in C++ viz ?:.

When the compiler sees an expression, it breaks them into smaller units depending on a table called the operator precedence table. The operators at the top of the table are the ones executed first followed by other operators in the decreasing order of precedence.

Now each of these operators can operate on one, two or three operands. These operands can themselves be expressions.

For example : int a=(b+c)*(d+e);

Here the operator * has two operands. The former is a simple expression (b+c) and the latter is another simple expression (d+e).

Now the C++ standard doesn't place any condition on the order in which these expressions must be evaluated. A normal human being would first calculate (b+c) and store it in a temporary variable. He would then calculate (d+e) and store it in another temporary variable. He would then retrieve these two temporary values and then multiply them and store the result in the variable a. But the standard lays down no such conditions or rules. It leaves the choice to the compilers. So the compilers can evaluate either (d+e) first or (b+c) first. Different compilers choose to do it differently.

For the simple expression given above this doesn't cause much confusion. The reader would probably be wondering as to why I am rambling on about it when evaluating (b+c) or (d+e) first doesn't change the result.

Surprise!!!

Consider this expression:

int i=1;
int temp=i++*(i-1);

Here the operands of the multiplicative operator * are i++ and i-1;

Now as mentioned before the compiler can evaluate either i++ first or (i-1) first. Let us see what happens when i++ is evaluated first. After i++ is evaluated the value that would be returned would be the existing value i.e. 1 and i would now be 2. After this i-1 is evaluated which would be 1. So i++*(i-1) would be 1*1 which is equal to 1. So temp would be 1 now.

Now let us see what happens when (i-1) is evaluated first. Since i is 1, i-1 becomes zero. Irrespective of what i++ is now, i++*(i-1) would be zero. So temp is initialized to zero here.

Do you realize the mistake? The result here depends on the order of evaluation and hence is unpredictable. It is unreasonable to expect all compilers to follow the same order of evaluation. Hence temp will be initialized to 1 on some systems and it will be initialized to zero on some other systems. This is Undefined Behavior and it is more often than not better to avoid coding practices like these.

Now this brings me to the discussion of sequence points. I have tried hard and exhausted myself in the process of trying to convince people that they should pay special attention to this topic. Please check C-faq and Wikipedia article for more info on sequence points.

After reading those topics some users confuse themselves between operator precedence and sequence points. Sequence points has got nothing to do with operator precedence. This requires more explanation.

When a compiler sees an expression with many operators, it breaks the expression into many logical and simpler units. How it does this depends on operator precedence.

For example when the compiler sees an expression like a+b*c+d, here the two operands are binary multiplicative operator * and binary addition operator +. * has higher precedence.

So the first step would be to choose * as the point where the expression can be split into simpler units. So the expression would now become a+(b*c)+d.

This expression is evaluated from left to right as + has left to right associativity. So the expression would become ( a + (b*c) ) + d

Now assuming a and d are simple expressions themselves. In this case consider the first binary addition operator.

Its operands are d and ( a + (b*c) ). Now as I mentioned before, the compiler can evaluate either d first or it can evaluate ( a + (b*c) ) first.

Now if it chooses to evaluate ( a+ (b*c) ) first, it can further choose to evaluate (b*c) first or a first. And further if b and c were expressions themselves, then we cant be sure about which amongst b or c would be evaluated first. The standard doesn't guarantee much here.

So the expression a+b*c+d which appears so simple to the naked eye has so many complications when we enter the realm of compilers and it is always better not to assume anything.

Hope that clears things a bit.

Now coming back to the original discussion. Why do those two code snippets lead to UB?

For precisely the same reason (i++)*(i-1) leads to UB. In the first code snippet there are many XOR operations operating on the two variables a and b. The author of the code ends up modifying the variables a and b more than once between two sequence points which results in UB.

In the second code snippet, the author assumes that b=a would be evaluated first before all other expressions. It may or may not be evaluated first. The results would be different for both the cases and hence leads to be UB again.

Now to summarize, there is no known way to swap two variables without using a third variable that would apply equally well to all built-in types.

Related links

Swapping two variables without using a temporary in C or C++By Paul J Herring.


CategoryPuzzles
 Comments [Hide comments/form]
From the view of ISO C standard, there /are/ temporary variables involved here.

1)
a=a+b;
b=a-b;
a=a-b;
2)
a=a^b;
b=b^a;
a=a^b;
3)
a=a*b;
b=a/b;
a=a/b;
-- SpS (2006-11-07 09:29:32)
@SpS

Thanks. Let us just assume that the question was to swap two variables without defining another temporary variable in the user code. :D

And yeah have taken note of your point. Will add it to the explanation above.
-- PriyaSridharan (2006-11-08 13:42:26)
This statement not works in VC++. y?
a=a+b-b=a;
-- 221.132.113.154 (2007-01-03 09:51:23)
@above

a=(a+b)-(b=a)

This statement may or may not work on certain compilers. The author of that bit of code intended a+b to be executed before b=a. But this may not be so and if the compiler evaluates b=a before a+b then you will get the wrong answer.

Please read the whole explanation above. I have already discussed this issue.
-- PriyaSridharan (2007-01-04 09:09:04)
Vejita:
a=a+b-b=a; code is not working on my compiler either!
-- 209.190.85.92 (2007-01-20 10:31:54)
It will not because of the statement
a+b-b=a, beacuse we are assigning the value of a to constant variable
for eg a = 5 b =2
a+b-b will evaluates to 5 , and then we are asigning the value of variable a that is 5 to a constant that is 5 , hence is the Error
-- 209.190.85.92 (2007-01-20 15:12:53)
I`ve got no words to say hw simple u make probs easier... This is for all the solutions given....
---Vinothini Kantharaj
-- 59.92.12.4 (2007-04-07 01:29:22)
Looks good. A few comments:

1) The original problem statement and one of the comments describe the problem differently. "Swap two variables" and "swap two numbers" are not at all the same problem. The variables could be structures, in which case none of the solutions would work. (Oh, and then of course there's another version of the problem that allows the supplied values to be the same variable or location, in which case obviously no swap is needed, but you have to be sure not to zero out one of the values via XOR with itself, or something...)

2) The first version, with additive operators, also cannot work reliably with floating point values, because of round-off errors as well as overflow. And it won't work with pointers, obviously, since you can't add two pointers. However, I believe it WILL work with unsigned integral types.

3) The XOR version, under a strict interpretation, may not work with negative integral values. The C standard allows three representations of signed integers. Twos-complement is the most popular by far, and would work okay, but the other two forms have a "negative zero" representation, and the C standard allows a negative zero to be converted to a normal zero when saving the value. So the result could be wrong in the sign bit.

4) It's a shame the undefined behavior issue above seems to be so difficult for people to understand, because the reality is even more subtle. The compiler can make any bizarre optimization it wants in hopes of speeding up the program, and it's allowed to assume that your program won't break the rules. Which means if your program does break the rules -- even with something as simple as this apparent "order of evaluation" error -- then there is no guarantee that the program will do *anything* reasonable. It can crash computing these (erroneous) arithmetic expressions. With most compilers and most real-world cases, it's unlikely, but it's allowed. Someday I'd like to construct a good example of this....

So, in the end, I think the additive and XOR versions (as coded without the sequence point bugs) both work okay for swapping variables of unsigned integral type, or signed integral type when all values are guaranteed to be non-negative. I don't think I've seen a solution that works on anything else.

--Ken
-- c-65-96-188-63.hsd1.ma.comcast.net (2007-04-18 05:30:03)
How abt this codelet: it doesnt use temporary variables

if(a>=0&&b>=0||a<0&&b;<0)
{
...if(a>=b)
...{
......a=a-b;
......b=a+b;
......a=b-a;
....}
....else
...{
....b=b-a;
....a=a+b;
....b=a-b;
...}
}
else
{
a=a+b;
b=a-b;
a=a-b;
}
-- 59.93.196.134 (2007-05-05 09:35:03)
Great coding yaar..
-- 59.92.32.123 (2007-05-14 06:41:38)
Still Overflow error can occur
-- 92-235-216-78.cable.ubr23.edin.blueyonder.co.uk (2008-11-29 12:29:38)
@ above
plzz gve ur explanation on how overflow occurs in the above code.....
-- ws157-229-252-122.rcil.gov.in (2009-01-09 08:18:14)
Page was generated in 0.0547 seconds