CTGD The Principle of Mathematical Induction

Section B.1 The Principle of Mathematical Induction

Subsection B.1.1 The ideas behind mathematical induction

There is a variant of one of the bijections we used to prove the Pascal Equation that comes up in counting the subsets of a set. In the next problem it will help us compute the total number of subsets of a set, regardless of their size. Our main goal in this problem, however, is to introduce some ideas that will lead us to one of the most powerful proof techniques in combinatorics (and many other branches of mathematics), the principle of mathematical induction.

Problem 360

(a)

Write down a list of the subsets of \(\{1, 2 \}\text{.}\) Don't forget the empty set! Group the sets containing containing 2 separately from the others.

(b)

Write down a list of the subsets of \(\{1, 2, 3 \}\text{.}\) Group the sets containing 3 separately from the others.

(c)

Look for a natural way to match up the subsets containing 2 in Part a with those not containing 2. Look for a way to match up the subsets containing 3 in Part b containing 3 with those not containing 3.

(d)

On the basis of the previous part, you should be able to find a bijection between the collection of subsets of \(\{1, 2, \ldots , n \}\) containing \(n\) and those not containing \(n\text{.}\) (If you are having difficulty figuring out the bijection, try rethinking Parts a and b, perhaps by doing a similar exercise with the set \(\{1,2,3,4\}\text{.}\)) Describe the bijection (unless you are very familiar with the notation of sets, it is probably easier to describe to describe the function in words rather than symbols) and explain why it is a bijection. Explain why the number of subsets of \(\{1, 2, \ldots , n \}\) containing \(n\) equals the number of subsets of \(\{1, 2, \ldots, n-1 \}\text{.}\)

(e)

Parts a and b suggest strongly that the number of subsets of a \(n\)-element set is \(2^n\text{.}\) In particular, the empty set has \(2^0\) subsets, a one-element set has \(2^1\) subsets, itself and the empty set, and in Parts a and b we saw that two-element and three-element sets have \(2^2\) and \(2^3\) subsets respectively. So there are certainly some values of \(n\) for which an \(n\)-element set has \(2^n\) subsets. One way to prove that an \(n\)-element set has \(2^n\) subsets for all values of \(n\) is to argue by contradiction. For this purpose, suppose there is a nonnegative integer \(n\) such that an \(n\)-element set doesn't have exactly \(2^n\) subsets. In that case there may be more than one such \(n\text{.}\) Choose \(k\) to be the smallest such \(n\text{.}\) Notice that \(k -1\) is still a positive integer, because \(k\) can't be 0, 1, 2, or 3. Since \(k\) was the smallest value of \(n\) we could choose to make the statement “An \(n\)-element set has \(2^n\) subsets” false, what do you know about the number of subsets of a \((k - 1)\)-element set? What do you know about the number of subsets of the \(k\)-element set \(\{1, 2, \ldots, k \}\) that don't contain \(k\text{?}\) What do you know about the number of subsets of \(\{1, 2, \ldots, k \}\) that do contain \(k\text{?}\) What does the sum principle tell you about the number of subsets of \(\{1, 2, \ldots, k \}\text{?}\) Notice that this contradicts the way in which we chose \(k\text{,}\) and the only assumption that went into our choice of \(k\) was that ``there is a nonnegative integer \(n\) such that an \(n\)-element set doesn't have exactly \(2^n\) subsets." Since this assumption has led us to a contradiction, it must be false. What can you now conclude about the statement ``for every nonnegative integer \(n\text{,}\) an n-element set has exactly \(2^n\) subsets?"

Problem 361

The expression

\begin{equation*} 1+3+5+\cdots+2n-1 \end{equation*}

is the sum of the first \(n\) odd integers. Experiment a bit with the sum for the first few positive integers and guess its value in terms of \(n\text{.}\) Now apply the technique of Problem 360 to prove that you are right.

Hint

You've probably guessed that the sum is \(n^2\text{.}\) To prove this by contradiction, you have to assume it is false, that is, that there is an \(n\) such that \(1 + 3 + 5 + \cdots + 2n - 1 \ne n^2\text{.}\) Then the method of Problem 360 says there must be a smallest such \(n\) and suggests we call it \(k\text{.}\) Why do you know that \(1 + 3 + 5 + \cdots + 2k - 3 = (k-1)2\text{?}\) What happens if you add \(2n-1\) to both sides of the equation?

In Problems 360 and 361 our proofs had several distinct elements. We had a statement involving an integer \(n\text{.}\) We knew the statement was true for the first few nonnegative integers in Problem 360 and for the first few positive integers in Problem 361. We wanted to prove that the statement was true for all nonnegative integers in Problem 360 and for all positive integers in Problem 361. In both cases we used the method of proof by contradiction; for that purpose we assumed that there was a value of \(n\) for which our formula wasn't true. We then chose \(k\) to be the smallest value of \(n\) for which our formula wasn't true. This meant that when \(n\) was \(k-1\text{,}\) our formula was true, (or else that \(k-1\) wasn't a nonnegative integer in Problem 360 or that \(k-1\) wasn't a positive integer in Problem 361). What we did next was the crux of the proof. We showed that the truth of our statement for \(n=k-1\) implied the truth of our statement for \(n=k\text{.}\) This gave us a contradiction to the assumption that there was an \(n\) that made the statement false. In fact, we will see that we can bypass entirely the use of proof by contradiction. We used it to help you discover the central ideas of the technique of proof by mathematical induction.

The central core of mathematical induction is the proof that the truth of a statement about the integer \(n\) for \(n=k-1\) implies the truth of the statement for \(n=k\text{.}\) For example, once we know that a set of size 0 has \(2^0\) subsets, if we have proved our implication, we can then conclude that a set of size 1 has \(2^1\) subsets, from which we can conclude that a set of size 2 has \(2^2\) subsets, from which we can conclude that a set of size 3 has \(2^3\) subsets, and so on up to a set of size \(n\) having \(2^n\) subsets for any nonnegative integer \(n\) we choose. In other words, although it was the idea of proof by contradiction that led us to think about such an implication, we can now do without the contradiction at all. What we need to prove a statement about \(n\) by this method is a place to start, that is a value \(b\) of \(n\) for which we know the statement to be true, and then a proof that the truth of our statement for \(n=k-1\) implies the truth of the statement for \(n=k\) whenever \(k>b\text{.}\)

Subsection B.1.2 Mathematical induction

The principle of mathematical induction states that

In order to prove a statement about an integer \(n\text{,}\) if we can

Prove the statement when \(n=b\text{,}\) for some fixed integer \(b\)

Show that the truth of the statement for \(n=k-1\) implies the truth of the statement for \(n=k\) whenever \(k>b\text{,}\)

then we can conclude the statement is true for all integers \(n\ge b\text{.}\)

As an example, let us return to Problem 360. The statement we wish to prove is the statement that “A set of size \(n\) has \(2^n\) subsets.”

Our statement is true when \(n=0\text{,}\) because a set of size 0 is the empty set and the empty set has \(1=2^0\) subsets. (This step of our proof is called a base step.) Now suppose that \(k>0\) and every set with \(k-1\) elements has \(2^{k-1}\) subsets. Suppose \(S=\{a_1,a_2,\ldots a_k\}\) is a set with \(k\) elements. We partition the subsets of \(S\) into two blocks. Block \(B_1\) consists of the subsets that do not contain \(a_n\) and block \(B_2\) consists of the subsets that do contain \(a_n\text{.}\) Each set in \(B_1\) is a subset of \(\{a_1,a_2,\ldots a_{k-1}\}\text{,}\) and each subset of \(\{a_1,a_2, \ldots a_{k-1}\}\) is in \(B_1\text{.}\) Thus \(B_1\) is the set of all subsets of \(\{a_1,a_2,\ldots a_{k-1}\}\text{.}\) Therefore by our assumption in the first sentence of this paragraph, the size of \(B_1\) is \(2^{k-1}\text{.}\) Consider the function from \(B_2\) to \(B_1\) which takes a subset of \(S\) including \(a_n\) and removes \(a_n\) from it. This function is defined on \(B_2\text{,}\) because every set in \(B_2\) contains \(a_n\text{.}\) This function is onto, because if \(T\) is a set in \(B_1\text{,}\) then \(T\cup \{a_k\}\) is a set in \(B_2\) which the function sends to \(T\text{.}\) This function is one-to-one because if \(V\) and \(W\) are two different sets in \(B_2\text{,}\) then removing \(a_k\) from them gives two different sets in \(B_1\text{.}\) Thus we have a bijection between \(B_1\) and \(B_2\text{,}\) so \(B_1\) and \(B_2\) have the same size. Therefore by the sum principle the size of \(B_1\cup B_2\) is \(2^{k-1} +2^{k-1}=2^k\text{.}\) Therefore \(S\) has \(2^k\) subsets. This shows that if a set of size \(k-1\) has \(2^{k-1}\) subsets, then a set of size \(k\) has \(2^k\) subsets. Therefore by the principle of mathematical induction, a set of size \(n\) has \(2^n\) subsets for every nonnegative integer \(n\text{.}\)

The first sentence of the last paragraph is called the inductive hypothesis. In an inductive proof we always make an inductive hypothesis as part of proving that the truth of our statement when \(n=k-1\) implies the truth of our statement when \(n=k\text{.}\) The last paragraph itself is called the inductive step of our proof. In an inductive step we derive the statement for \(n=k\) from the statement for \(n=k-1\text{,}\) thus proving that the truth of our statement when \(n=k-1\) implies the truth of our statement when \(n=k\text{.}\) The last sentence in the last paragraph is called the inductive conclusion. All inductive proofs should have a base step, an inductive hypothesis, an inductive step, and an inductive conclusion.

There are a couple details worth noticing. First, in this problem, our base step was the case \(n=0\text{,}\) or in other words, we had \(b=0\text{.}\) However, in other proofs, \(b\) could be any integer, positive, negative, or 0. Second, our proof that the truth of our statement for \(n=k-1\) implies the truth of our statement for \(n=k\) required that \(k\) be at least 1, so that there would be an element \(a_k\) we could take away in order to describe our bijection. However, condition (2) of the principle of mathematical induction only requires that we be able to prove the implication for \(k>0\text{,}\) so we were allowed to assume \(k>0\text{.}\)

Problem 362

Use mathematical induction to prove your formula from Problem 361.

Subsection B.1.3 Proving algebraic statements by induction

Problem 363

Use mathematical induction to prove the well-known formula that for all positive integers \(n\text{,}\)

\begin{equation*} 1+2 + \cdots +n = \frac{n(n+1)}{2}. \end{equation*}

Problem 364

Experiment with various values of \(n\) in the sum

\begin{equation*} \frac{1}{1\cdot2}+\frac{1}{2\cdot3} + \frac{1}{3\cdot 4}+\cdots+\frac{1}{n\cdot (n+1)} = \sum_{i=1}^n \frac{1}{i\cdot(i+1)}. \end{equation*}

Guess a formula for this sum and prove your guess is correct by induction.

Problem 365

For large values of \(n\text{,}\) which is larger, \(n^2\) or \(2^n\text{?}\) Use mathematical induction to prove that you are correct.

Hint 1

You've probably already seen that, with small values of \(n\text{,}\) sometimes \(n^2\) and sometimes \(2^n\) is bigger. But if you keep experimenting one of the functions seems to get bigger and stay bigger than the other. The number \(n=b\) where this change occurs is a good choice for a base case. So as not to spoil the problem for you, we won't say here what this value of \(b\) is. However you shouldn't be surprised later in the proof if you need to use the assumption that \(n /gt b\text{.}\)

Hint 2

You may have reached the point of assuming that \(2^{k-1} \gt (k-1)^2\) and found yourself wondering how to prove that \(2^k \gt k^2\text{.}\) A natural thing to try is multiplying both sides of \(2^{k-1} \gt (k-1)^2\) by \(2\text{.}\) This ends up giving you \(2^k \gt 2k^2 - 4k +2\text{.}\) Based on previous experience it is natural for you to expect to see how to turn this new right hand side into \(k^2\) but not see how to do it. Here is the hint. You only need to show that the right hand side is greater than or equal to \(k^2\text{.}\) For this purpose you need to show that one of the two \(k^2\)s in \(2k^2\) somehow balances out the \(-4k\text{.}\) See if you can figure out how the fact that you are only considering \(k\)s with \(k \gt b\) can help you out.

Problem 366

What is wrong with the following attempt at an inductive proof that all integers in any consecutive set of \(n\) integers are equal for every positive integer \(n\text{?}\) For an arbitrary integer \(i\text{,}\) all integers from \(i\) to \(i\) are equal, so our statement is true when \(n=1\text{.}\) Now suppose \(k>1\) and all integers in any consecutive set of \(k-1\) integers are equal. Let \(S\) be a set of \(k\) consecutive integers. By the inductive hypothesis, the first \(k-1\) elements of \(S\) are equal and the last \(k-1\) elements of \(S\) are equal. Therefore all the elements in the set \(S\) are equal. Thus by the principle of mathematical induction, for every positive \(n\text{,}\) every \(n\) consecutive integers are equal.

Hint

When you suspect an argument is not valid, it may be helpful to explicitly try several values of \(n\) to see if it makes sense for them. Often small values of \(n\) are adequate to find the flaw. If you find one flaw, it invalidates everything that comes afterwards (unless, of course, you can fix the flaw).

Subsection B.1.4 Strong Induction

One way of looking at the principle of mathematical induction is that it tells us that if we know the “first” case of a theorem and we can derive each other case of the theorem from a smaller case, then the theorem is true in all cases. However the particular way in which we stated the theorem is rather restrictive in that it requires us to derive each case from the immediately preceding case. This restriction is not necessary, and removing it leads us to a more general statement of the principal of mathematical induction which people often call the strong principle of mathematical induction. It states:

In order to prove a statement about an integer \(n\) if we can

prove our statement when \(n=b\) and

prove that the statements we get with \(n=b\text{,}\) \(n=b+1\text{,}\) … \(n=k-1\) imply the statement with \(n=k\text{,}\)

then our statement is true for all integers \(n\ge b\text{.}\)

Problem 367

What postage do you think we can make with five and six cent stamps? Is there a number \(N\) such that if \(n\ge N\text{,}\) then we can make \(n\) cents worth of postage?

You probably see that we can make \(n\) cents worth of postage as long as \(n\) is at least 20. However you didn't try to make 26 cents in postage by working with 25 cents; rather you saw that you could get 20 cents and then add six cents to that to get 26 cents. Thus if we want to prove by induction that we are right that if \(n\ge 20\text{,}\) then we can make \(n\) cents worth of postage, we are going to have to use the strong version of the principle of mathematical induction.

We know that we can make 20 cents with four five-cent stamps. Now we let \(k\) be a number greater than 20, and assume that it is possible to make any amount between 20 and \(k-1\) cents in postage with five and six cent stamps. Now if \(k\) is less than 25, it is 21, 22, 23, or 24. We can make 21 with three fives and one six. We can make 22 with two fives and two sixes, 23 with one five and three sixes, and 24 with four sixes. Otherwise \(k-5\) is between 20 and \(k-1\) (inclusive) and so by our inductive hypothesis, we know that \(k-5\) cents can be made with five and six cent stamps, so with one more five cent stamp, so can \(k\) cents. Thus by the (strong) principle of mathematical induction, we can make \(n\) cents in stamps with five and six cent stamps for each \(n\ge 20\text{.}\)

Some people might say that we really had five base cases, \(n=20\text{,}\) 21, 22, 23, and 24, in the proof above and once we had proved those five consecutive base cases, then we could reduce any other case to one of these base cases by successively subtracting 5. That is an appropriate way to look at the proof. A logician would say that it is also the case that, for example, by proving we could make 22 cents, we also proved that if we can make 20 cents and 21 cents in stamps, then we could also make 22 cents. We just didn't bother to use the assumption that we could make 20 cents and 21 cents! So long as one point of view or the other satisfies you, you are ready to use this kind of argument in proofs.

Problem 368

A number greater than one is called prime if it has no factors other than itself and one. Show that each positive number is either a prime or a power of a prime or a product of powers of prime numbers.

Problem 369

Show that the number of prime factors of a positive number \(n\ge 2\) is less than or equal to \(\log_2 n\text{.}\) (If a prime occurs to the \(k\)th power in a factorization of \(n\text{,}\) you can consider that power as \(k\) prime factors.) (There is a way to do this by induction and a way to do it without induction. It would be ideal to find both ways.)

Problem 370

One of the most powerful statements in elementary number theory is Euclid's Division Theorem.¹In a curious twist of language, mathematicians have long called The Division Algorithm or Euclid's Division Algorithm. However as computer science has grown in importance, the word algorithm has gotten a more precise definition: an algorithm is now a method to do something. There is a method (in fact there are more than one) to get the \(q\) and \(r\) that Euclid's Division Theorem gives us, and computer scientists would call these methods algorithms. Your author has chosen to break with mathematical tradition and restrict his use of the word algorithm to the more precise interpretation as a computer scientist probably would. We aren't giving a method here, so this is why the name used here is “Euclid's Division Theorem.” This states that if \(m\) and \(n\) are positive integers, then there are unique nonnegative intergers \(q\) and \(r\) with \(0 \le r \lt n\text{,}\) such that \(m = nq + r\text{.}\) The number \(q\) is called the quotient and the number \(r\) is called the remainder. In computer science it is common to denote \(r\) by \(m \mod n\text{.}\) In elementary school you learned how to use long division to find \(q\) and \(r\text{.}\) However, it is unlikely that anyone ever proved for you that for any pair of positive intgers, \(m\) and \(n\text{,}\) there is such a pair of nonnegative numbers \(q\) and \(r\text{.}\) You now have the tools needed to prove this. Do so.

Hint

You might start out by ignoring the word unique and give a proof of the simpler theorem that results. Then look at your proof to see how you can include uniqueness in it.