
Merge Sort, Divide & Conquer, Master Theorem

Class Material

  • Slides can be found here.

Notations: Big-O vs. Little-o

In computational complexity theory, Big-O (denoted $O(\cdot)$) and Little-o (denoted $o(\cdot)$) are both used to describe the growth rates of functions, but they differ subtly in how tightly they bound the growth of one function relative to another. Below is a detailed comparison between the two:


Big-O Notation (O)

  • Definition: f(n)∈O(g(n))f(n) \in O(g(n)) means that f(n) does not grow faster than g(n) up to a constant factor for sufficiently large n.

    • There exist positive constants CC and n0n_0 such that: f(n)≀Cβ‹…g(n)forΒ allnβ‰₯n0f(n) \leq C \cdot g(n) \quad \text{for all} \quad n \geq n_0
  • Intuition:
    Big-O provides an upper bound on the growth rate of f(n)f(n). It tells us that f(n)f(n) grows at most as fast as g(n)g(n) asymptotically. It allows equality β€” meaning f(n)f(n) and g(n)g(n) could potentially grow at the same rate.

  • Example:
    2n2+3n+5∈O(n2)2n^2 + 3n + 5 \in O(n^2) because, asymptotically, n2n^2 is the dominant term (the other terms grow more slowly). So 2n2+3n+52n^2 + 3n + 5 does not grow faster than n2n^2.
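For intuition, here is a quick finite-range check in Python; the constants $C = 3$ and $n_0 = 5$ are one valid choice among many, picked for this illustration:

```python
# Spot check (not a proof): f(n) = 2n^2 + 3n + 5 stays below C * n^2
# for C = 3 once n >= 5, since n^2 >= 3n + 5 from there on.
def f(n):
    return 2 * n**2 + 3 * n + 5

C, n0 = 3, 5
assert all(f(n) <= C * n**2 for n in range(n0, 10_000))
print("bound holds on the sampled range")
```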


Little-o Notation (o)

  • Definition: f(n)∈o(g(n))f(n) \in o(g(n)) means that f(n) grows strictly slower than g(n), even without any constant factor.

    • Formally, for every positive constant CC, there exists n0n_0 such that: f(n)<Cβ‹…g(n)forΒ allnβ‰₯n0f(n) < C \cdot g(n) \quad \text{for all} \quad n \geq n_0
    • This implies that the ratio f(n)/g(n)β†’0f(n) / g(n) \to 0 as nβ†’βˆžn \to \infty.
  • Intuition:
    Little-o provides a strict upper bound on the growth rate. It says that f(n)f(n) grows strictly slower than g(n)g(n) as nn becomes large, and they cannot grow at the same rate.

  • Example:
    n∈o(n2)n \in o(n^2) because nn grows slower than n2n^2 as nβ†’βˆžn \to \infty. Specifically, the ratio n/n2=1/nβ†’0n / n^2 = 1/n \to 0.

    $n^2 \notin o(n^2)$ because $n^2$ grows at the same rate as itself (not strictly slower): the ratio $n^2 / n^2 = 1$ for all $n$. (There is no need for L'HΓ΄pital's Rule here: since numerator and denominator are identical, the ratio simplifies to the constant 1 before any limit is taken, so nothing is indeterminate.)
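
A quick numeric illustration of both claims, sampling the two ratios at a few values of $n$:

```python
# The ratio n / n^2 = 1/n shrinks toward 0 (so n is in o(n^2)),
# while n^2 / n^2 stays pinned at 1 (so n^2 is not in o(n^2)).
for n in [10, 100, 1_000, 10_000]:
    print(n, n / n**2, n**2 / n**2)
```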

Application of L'HΓ΄pital's Rule in asymptotic analysis

The expression $\frac{\infty}{\infty}$ (infinity over infinity) is considered an indeterminate form in calculus. This means it is not immediately clear what the value should be, and further analysis is required to evaluate it properly.


Why is $\frac{\infty}{\infty}$ Indeterminate?

  • Infinity (∞\infty) is a concept, not a real number. It indicates that a value grows without bound.
  • When we encounter ∞∞\frac{\infty}{\infty}, both the numerator and the denominator are growing indefinitely, but we don’t know how fast they grow relative to each other. Depending on the functions involved, the ratio could result in different outcomes:
    • Finite value: n2nβ†’12\frac{n}{2n} \to \frac{1}{2} as nβ†’βˆžn \to \infty.
    • Infinity: n2nβ†’βˆž\frac{n^2}{n} \to \infty as nβ†’βˆžn \to \infty.
    • Zero: nn2β†’0\frac{n}{n^2} \to 0 as nβ†’βˆžn \to \infty.

Resolving $\frac{\infty}{\infty}$ using L'HΓ΄pital's Rule

When a limit yields the form $\frac{\infty}{\infty}$, we often apply L'HΓ΄pital's Rule to evaluate it. L'HΓ΄pital’s Rule states:

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)},$$

provided that both $f(x) \to \infty$ and $g(x) \to \infty$, the derivatives $f'(x)$ and $g'(x)$ exist (with $g'(x) \neq 0$ for large $x$), and the limit on the right-hand side exists.


Example Using L'HΓ΄pital's Rule

Evaluate:

$$\lim_{x \to \infty} \frac{x}{e^x}.$$

Both the numerator $x$ and the denominator $e^x$ go to infinity, giving $\frac{\infty}{\infty}$. Applying L'HΓ΄pital's Rule:

$$\lim_{x \to \infty} \frac{x}{e^x} = \lim_{x \to \infty} \frac{1}{e^x} = 0.$$
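
The same limit can also be checked symbolically; a small sketch assuming SymPy is installed:

```python
# Symbolic confirmation of the worked example: x / e^x -> 0 as x -> infinity.
from sympy import symbols, exp, limit, oo

x = symbols('x')
print(limit(x / exp(x), x, oo))  # 0
```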

Summary

$\frac{\infty}{\infty}$ is indeterminate because it can lead to different results depending on the growth rates of the functions involved. To resolve it, we need more information, often using tools like L'HΓ΄pital's Rule or algebraic manipulation.


Key Differences between Big-O and Little-o

| Aspect | Big-O ($O$) | Little-o ($o$) |
| --- | --- | --- |
| Bound type | Upper bound (not strict) | Strict upper bound |
| Equality allowed? | Yes (same growth rate possible) | No (strictly slower growth) |
| Limit behavior | $\limsup_{n \to \infty} f(n) / g(n) < \infty$ | $\lim_{n \to \infty} f(n) / g(n) = 0$ |
| Example relationship | $n^2 \in O(n^2)$ | $n \in o(n^2)$ |

Summary

  • Big-O notation allows for equal growth rates (or f(n)f(n) can grow slower or at the same rate as g(n)g(n)).
  • Little-o notation requires that f(n)f(n) grows strictly slower than g(n)g(n).
    Both notations are useful tools in asymptotic analysis to describe how algorithms perform as input sizes grow large.

In asymptotic analysis, $n_0$ is the threshold beyond which the behavior of the function we’re analyzing matches the specified growth bound. Essentially, it marks the point where the asymptotic relationship becomes valid.


Role of $n_0$ in Asymptotic Notations
  β€’ When we say that $f(n) \in O(g(n))$, it means that after a certain input size $n \geq n_0$, the inequality holds: $f(n) \leq C \cdot g(n)$ for all $n \geq n_0$.
  β€’ For little-o notation $f(n) \in o(g(n))$, the idea is that for all positive constants $C$, there exists an $n_0$ beyond which: $f(n) < C \cdot g(n)$ for all $n \geq n_0$.

Intuition

  • n0n_0 ensures that the asymptotic behavior is true only for large enough input sizes. For smaller values n<n0n < n_0, the functions may behave unpredictably, but asymptotic notations focus on large nn.

Example

Say we claim that $f(n) = 3n + 2 \in O(n)$.

  β€’ For small $n$, the $+2$ term might dominate. However, for large enough $n$, $3n + 2$ behaves similarly to $3n$.
  β€’ In this case, $n_0$ is the point from which $3n + 2 \leq C \cdot n$ for some constant $C$. For example, with $C = 4$, the inequality holds for all $n \geq n_0 = 2$.

Summary

  • n0n_0 is the starting point beyond which the asymptotic bound (like O(g(n))O(g(n)) or o(g(n))o(g(n))) reliably holds.
  • Asymptotic notations don't care about small inputsβ€”only what happens as nβ†’βˆžn \to \infty matters. n0n_0 ensures that we focus on that asymptotic behavior.

Sorting Algorithms

Merge Sort

Resources

Lemma: No comparison-based sorting algorithm can sort $n$ arbitrary numbers in $o(n \log n)$ comparisons

  β€’ This means that Merge Sort is optimal in terms of the number of comparisons it makes.

Comparison Trees


In the slides: Using Comparison Trees to describe any comparison-based algorithm.

  • The height of the tree represents the number of comparisons made by the algorithm: log⁑(n)\log(n).

    To arrive at a specific leaf, which is the correct result – a sorted permutation, we've only had to traverse at worst the max height of the tree.

  • The number of leaves represents the number of possible outcomes: n!n!. (For example: {a,b,c}\{a, b, c\} has 3!3! possible permutations)

Note
  β€’ These statements assume that the tree is a binary tree.
  β€’ Comparisons have exactly 2 outcomes: "$a < b$" or "$a \geq b$".

Showing that Comparison Trees are $\Omega(n \log n)$

To show that $\log_2(m!)$ is $\Omega(n \log n)$ without using Stirling's approximation, we need to lower bound $\log_2(m!)$ and compare it asymptotically with $n \log n$ (we work with a general $m$ and set $m = n$ at the end).


Step 1: Express $\log_2(m!)$

We start with the definition of the factorial's logarithm:

$$\log_2(m!) = \log_2(1 \cdot 2 \cdot 3 \cdots m).$$

Using the logarithm of a product:

$$\log_2(m!) = \log_2(1) + \log_2(2) + \log_2(3) + \cdots + \log_2(m).$$

Step 2: Lower Bound $\log_2(m!)$

Notice that the sum $\log_2(1) + \log_2(2) + \cdots + \log_2(m)$ can be lower-bounded by keeping only the second half of the terms (those from $m/2$ to $m$; assume $m$ is even for simplicity):

$$\log_2(m!) \geq \sum_{k = m/2}^{m} \log_2(k).$$

Tip: We can do this magic step of keeping only the second half because of the "$\geq$" sign. We're looking for a lower bound, and each dropped term $\log_2(k)$ is nonnegative, so removing them can only make the sum smaller. For graphical intuition, the prof's half-baked example here might help, though it is not entirely complete.


Since for all $k$ in this range, $k \geq m/2$, we can lower bound each $\log_2(k)$ by $\log_2(m/2)$:

$$\log_2(m!) \geq \sum_{k = m/2}^{m} \log_2\left(\frac{m}{2}\right).$$

There are $m - m/2 + 1 = m/2 + 1$ terms in this sum, so:

$$\log_2(m!) \geq \left(\frac{m}{2} + 1\right) \log_2\left(\frac{m}{2}\right).$$

Step 3: Simplify the Lower Bound

We simplify the expression $\left(\frac{m}{2} + 1\right) \log_2\left(\frac{m}{2}\right)$. Dropping the $+1$ only makes the right-hand side smaller, so the inequality still holds:

$$\log_2(m!) \geq \frac{m}{2} \log_2\left(\frac{m}{2}\right).$$

Using $\log_2\left(\frac{m}{2}\right) = \log_2(m) - \log_2(2) = \log_2(m) - 1$, we get:

$$\log_2(m!) \geq \frac{m}{2} \left(\log_2(m) - 1\right).$$

This time the $-1$ cannot simply be dropped (that would increase the right-hand side), but it only costs a constant factor: for $m \geq 4$, we have $\log_2(m) - 1 \geq \frac{1}{2} \log_2(m)$, so:

$$\log_2(m!) \geq \frac{m}{4} \log_2(m).$$
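
A quick numeric spot check of the bounds derived so far (finite samples only, not a proof), using the sum-of-logs form from Step 1:

```python
# Check log2(m!) >= (m/2 + 1) * log2(m/2), and, for m >= 4,
# log2(m!) >= (m/4) * log2(m).
import math

for m in [4, 16, 256, 4096]:
    lhs = sum(math.log2(k) for k in range(1, m + 1))  # log2(m!)
    print(m,
          lhs >= (m / 2 + 1) * math.log2(m / 2),
          lhs >= (m / 4) * math.log2(m))
```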

Step 4: Compare with $n \log n$

Now we want to compare $\log_2(m!)$ with $n \log n$. If we set $m = n$, the lower bound becomes:

$$\log_2(n!) \geq \frac{n}{4} \log_2(n) \quad \text{for } n \geq 4.$$

This shows that $\log_2(n!)$ grows asymptotically at least as fast as $\frac{n}{4} \log_2(n)$, which is clearly $\Omega(n \log n)$. Therefore:

$$\log_2(n!) = \Omega(n \log n).$$

Conclusion

We have shown that $\log_2(n!)$ is $\Omega(n \log n)$ without using Stirling’s approximation, by directly analyzing a lower bound of the sum $\log_2(1) + \log_2(2) + \cdots + \log_2(n)$.

Master Theorem

Resources

Aside
  • If you're having a hard time understanding the proof for Master Theorem, watching the Merge Sort video first will 100% help.