
Reductions and NP-Completeness

Learning Goals

Understand the definitions of P, NP, NP-Hard, and NP-Complete.
Understand polynomial-time reductions ($A \le_p B$) and their direction.
Know the Cook-Levin Theorem and why it matters.
Be able to recognize and precisely state classic NP-Complete problems.
Follow a reduction proof and understand the recipe for showing NP-Completeness.

Complexity Classes: P and NP

Decision Problems

We focus on decision problems: problems with a YES or NO answer.
Example: "Does this graph have a cycle of length exactly 5?" (YES/NO).
Optimization problems ("Find the shortest path") can usually be rephrased as decision problems ("Is there a path of length $\le k$?").

Class P

P = the set of decision problems that can be solved in polynomial time.
Meaning: there exists an algorithm that, given any input of size $n$, always outputs the correct YES/NO answer in $O(n^c)$ time for some constant $c$.
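Examples (phrased as decision problems): Shortest Path ("Is there a path of length $\le k$?"), 2-SAT, Maximum Matching ("Is there a matching of size $\ge k$?"). Sorting is also solvable in polynomial time, although strictly speaking it is not a decision problem.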

Class NP (Intuitive Definition)

NP = the set of decision problems where, if the answer is YES, there exists a short certificate (proof) that lets you verify the answer quickly.
More precisely: there is a polynomial-time verifier such that every YES instance has a certificate of polynomial size that the verifier accepts, and no certificate makes the verifier accept a NO instance.
NP does NOT mean "Not Polynomial". It stands for "Nondeterministic Polynomial", but think of it as "Quickly Verifiable".

NP Example: Clique

Problem: "Does graph $G$ have a clique of size $k$?"
If the answer is YES:
Certificate = the set of $k$ vertices forming the clique.
Verifier = check that all $\binom{k}{2}$ pairs are connected by edges.
Verification runs in polynomial time. ✓
So Clique $\in$ NP.
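The verifier described above can be sketched in a few lines. This is a minimal illustration, assuming the graph is given as a list of undirected edges and the certificate as a list of vertices; the function name `verify_clique` is hypothetical.

```python
from itertools import combinations

def verify_clique(edges, k, certificate):
    """Return True iff `certificate` names k distinct, pairwise adjacent vertices."""
    edge_set = {frozenset(e) for e in edges}   # undirected: (u, v) == (v, u)
    vertices = set(certificate)
    if len(certificate) != k or len(vertices) != k:
        return False                            # wrong size or repeated vertex
    # Check all C(k, 2) pairs, which is O(k^2) set lookups.
    return all(frozenset(pair) in edge_set for pair in combinations(vertices, 2))

edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(verify_clique(edges, 3, [1, 2, 3]))  # True
print(verify_clique(edges, 3, [2, 3, 4]))  # False: 2 and 4 are not adjacent
```

Note that the verifier never searches for a clique; it only checks the one it is handed, which is why it runs in polynomial time.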

P vs NP

Every problem in P is also in NP. (If you can solve it fast, you can certainly verify it fast.)
The big open question: Does P = NP?
Most computer scientists believe $P \neq NP$: there are problems easy to verify but hard to solve.
No one has proven this yet. It is widely regarded as the most important open problem in CS.

Polynomial-Time Reductions

The Core Idea

How do we compare the difficulty of two problems?
We say problem $A$ reduces to problem $B$, written $A \le_p B$, if:
We can convert any instance of $A$ into an instance of $B$ in polynomial time, such that the answer is preserved.
Intuition: "If I had a magic solver for $B$, I could use it to solve $A$."
Implication: $B$ is at least as hard as $A$.

Reduction: Step by Step

1. Start with an instance $I_A$ of problem $A$.
2. In polynomial time, construct an instance $I_B$ of problem $B$.
3. Feed $I_B$ to the solver for $B$.
4. Return whatever the solver says (YES or NO).
Key requirement: $I_A$ is a YES-instance of $A$ if and only if $I_B$ is a YES-instance of $B$.
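The four steps above can be sketched as a small wrapper. The two toy problems here are assumptions chosen only to keep the example short (both are easy problems in P); the names `solve_a_via_b`, `sort_instance`, and `has_equal_neighbours` are hypothetical.

```python
def solve_a_via_b(instance_a, transform, solve_b):
    """Steps 1-4: build I_B from I_A, ask the solver for B, return its answer unchanged."""
    instance_b = transform(instance_a)   # step 2: must run in polynomial time
    return solve_b(instance_b)           # steps 3-4: pass the answer through

# Toy problem A: "Does this list contain a repeated value?"
# Toy problem B: "Does this list have two equal neighbouring entries?"
def sort_instance(values):
    return sorted(values)                # the transformation from A to B

def has_equal_neighbours(values):
    return any(x == y for x, y in zip(values, values[1:]))

print(solve_a_via_b([3, 1, 4, 1, 5], sort_instance, has_equal_neighbours))  # True
print(solve_a_via_b([3, 1, 4, 2, 5], sort_instance, has_equal_neighbours))  # False
```

The key requirement holds here because a list has a repeated value if and only if its sorted version has two equal neighbours.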

Why Direction Matters

$A \le_p B$ means: "$A$ is no harder than $B$" — $B$ is at least as hard.
To prove $B$ is hard, we reduce FROM a known hard problem TO $B$.
Common mistake: reducing in the wrong direction.
Reducing $B \to A$ (new → known hard) only shows $B$ is no harder than $A$, which is not useful for proving $B$ is hard.

NP-Hardness and NP-Completeness

NP-Hard

A problem $B$ is NP-Hard if:
Every problem $A$ in NP satisfies $A \le_p B$.
In plain English: $B$ is at least as hard as every problem in NP.
An NP-Hard problem does not have to be in NP itself — it could be even harder (e.g., undecidable).

NP-Complete (NPC)

A problem $B$ is NP-Complete if it satisfies both:
1. $B \in NP$ (solutions can be verified in polynomial time).
2. $B$ is NP-Hard (it is at least as hard as every problem in NP).
NP-Complete problems are the hardest problems within NP.
They sit exactly at the boundary: hard enough to capture all of NP, but not so hard that solutions can't even be verified.

The Key Consequence

All NP-Complete problems are equivalent in difficulty (each reduces to every other).
If any single NPC problem can be solved in polynomial time, then every problem in NP can be solved in polynomial time, meaning $P = NP$.
Since we believe $P \neq NP$, we believe no NPC problem has a polynomial-time algorithm.
This is why showing a problem is NPC is strong evidence that it is intractable.

The First NPC Problem: Cook-Levin Theorem

The Bootstrapping Problem

To prove problem $C$ is NPC, we reduce a known NPC problem to $C$.
But how do we get the first NPC problem?
Someone had to prove, from scratch, that one specific problem is NPC.
This was done independently by Stephen Cook (1971) and Leonid Levin (1973).

SAT (Boolean Satisfiability)

Input: A boolean formula $\phi$ over variables $x_1, x_2, \ldots, x_n$ using AND ($\land$), OR ($\lor$), NOT ($\neg$).
Question: Is there an assignment of TRUE/FALSE to each variable that makes $\phi$ evaluate to TRUE?
Example: $\phi = (x_1 \lor \neg x_2) \land (\neg x_1 \lor x_3)$. Answer: YES (set $x_1 = T, x_2 = T, x_3 = T$).
SAT is in NP: the certificate is the satisfying assignment; verification means plugging it in and checking.
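A minimal sketch of "plugging it in and checking", assuming the formula is given in CNF as a list of clauses, with each literal written as a signed integer (`2` for $x_2$, `-2` for $\neg x_2$). This encoding and the function names are assumptions for illustration.

```python
from itertools import product

def satisfies(cnf, assignment):
    """Verifier: every clause must contain at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf, num_vars):
    """Solver by exhaustive search: tries up to 2^n assignments."""
    for values in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(values, start=1))
        if satisfies(cnf, assignment):
            return assignment
    return None

# phi = (x1 OR NOT x2) AND (NOT x1 OR x3)
phi = [[1, -2], [-1, 3]]
print(satisfies(phi, {1: True, 2: True, 3: True}))   # True
print(satisfies(phi, {1: True, 2: True, 3: False}))  # False
print(brute_force_sat(phi, 3) is not None)           # True
```

Checking one assignment takes time linear in the formula size, while the exhaustive solver may try exponentially many assignments. That gap is the difference between verifying and solving.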

Cook-Levin Theorem

Theorem (Cook-Levin): SAT is NP-Complete.
Proof idea (high level, no machine details):
Any problem in NP has a polynomial-time verifier — a program that takes an input and a certificate and says YES or NO.
The key insight: the entire execution of that verifier program on a given input can be encoded as a boolean formula of polynomial size, whose variables include the bits of the certificate.
The formula is satisfiable if and only if there exists a certificate that makes the verifier say YES.
So solving SAT lets you solve any NP problem. Therefore SAT is NP-Hard, and since it is also in NP, it is NP-Complete.

Why Cook-Levin Matters

It gives us the anchor for all NP-Completeness proofs.
Once we know SAT is NPC, we can reduce SAT to other problems to prove them NPC.
This created a chain: SAT → 3-SAT → Clique → Independent Set → Vertex Cover → ... hundreds of problems.

The Recipe for Proving NP-Completeness

Two-Step Recipe

To prove a new problem $C$ is NP-Complete:
Step 1: Show $C \in NP$.
→ State what the certificate is and how to verify it in polynomial time.
Step 2: Show $C$ is NP-Hard by reducing a known NPC problem $B$ to $C$.
→ Pick a known NPC problem $B$.
→ Give a polynomial-time transformation from any instance of $B$ to an instance of $C$.
→ Prove correctness: $B$-instance is YES $\iff$ $C$-instance is YES.

Choosing Which Problem to Reduce From

Pick a known NPC problem that "looks similar" to your target.
Common starting points:
Graph problems → reduce from Clique, Vertex Cover, or Independent Set.
Logic/constraint problems → reduce from 3-SAT.
Number/subset problems → reduce from Subset Sum.
Path/tour problems → reduce from Hamiltonian Cycle.
The simpler the reduction, the better.

Classic NP-Complete Problems: Precise Definitions

3-SAT

Input: A boolean formula in Conjunctive Normal Form (CNF) where every clause has exactly 3 literals. A literal is a variable $x_i$ or its negation $\neg x_i$.
Question: Is there a TRUE/FALSE assignment to the variables that satisfies (makes TRUE) every clause?
Example: $(x_1 \lor \neg x_2 \lor x_3) \land (\neg x_1 \lor x_2 \lor \neg x_4)$.
NPC: Proven by reducing SAT to 3-SAT (splitting long clauses, padding short ones).
This is typically the most useful starting point for reductions.
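The clause splitting and padding mentioned above can be sketched as follows. This is an illustration under several assumptions: the input is already in CNF (converting an arbitrary formula to CNF is a separate step not shown), clauses are non-empty lists of signed integers, and repeating a literal inside a clause counts as valid padding. The names `to_3cnf` and `satisfiable` are hypothetical.

```python
from itertools import product

def to_3cnf(cnf, num_vars):
    """Return (clauses with exactly 3 literals each, new variable count)."""
    out, next_var = [], num_vars + 1
    for clause in cnf:
        lits = list(clause)
        if len(lits) <= 3:
            # Pad short clauses by repeating the last literal.
            out.append(lits + [lits[-1]] * (3 - len(lits)))
            continue
        # Split (l1 v l2 v ... v lk) into a chain linked by fresh variables y:
        # (l1 v l2 v y1), (~y1 v l3 v y2), ..., (~y_last v l_{k-1} v l_k)
        prev, next_var = next_var, next_var + 1
        out.append([lits[0], lits[1], prev])
        for lit in lits[2:-2]:
            cur, next_var = next_var, next_var + 1
            out.append([-prev, lit, cur])
            prev = cur
        out.append([-prev, lits[-2], lits[-1]])
    return out, next_var - 1

def satisfiable(cnf, num_vars):
    """Exhaustive check, used only to compare the two formulas on small inputs."""
    return any(
        all(any(vals[abs(l) - 1] == (l > 0) for l in c) for c in cnf)
        for vals in product([False, True], repeat=num_vars)
    )

print(to_3cnf([[1, -2, 3, -4, 5]], 5))
# ([[1, -2, 6], [-6, 3, 7], [-7, -4, 5]], 7)
```

The output formula is not equivalent to the input (it has extra variables), but it is satisfiable exactly when the input is, which is all a reduction needs.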

Clique

Input: An undirected graph $G = (V, E)$ and a positive integer $k$.
Question: Does $G$ contain a clique of size $k$? (A clique is a subset $S \subseteq V$ of $k$ vertices such that every pair of vertices in $S$ is connected by an edge.)
Example: In a social network, a clique of size $k$ is a group of $k$ people who are ALL mutual friends.
In NP: Certificate = the $k$ vertices. Verify all $\binom{k}{2}$ edges exist.
NPC: Proven by reduction from 3-SAT.

3-SAT → Clique Reduction

Given a 3-SAT formula with $m$ clauses, construct a graph $G$:
Vertices: For each clause, create 3 nodes (one per literal). So $3m$ nodes total.
Edges: Connect two nodes $u$ and $v$ if: (1) they come from different clauses, AND (2) they are not contradictory (i.e., $u \neq \neg v$).
Set $k = m$ (number of clauses).
Claim: The formula is satisfiable $\iff$ $G$ has a clique of size $m$.
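Why ($\Rightarrow$): A satisfying assignment picks one true literal per clause; these $m$ literals are pairwise consistent (no contradictions) and come from different clauses, so they form a clique.
Why ($\Leftarrow$): A clique of size $m$ has no edges inside a clause, so it contains exactly one node per clause; setting each chosen literal to TRUE is consistent because no two chosen nodes are contradictory, and it satisfies every clause.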
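The construction above can be sketched directly. This is a minimal illustration, assuming clauses are lists of signed integers and each node is represented as a `(clause_index, literal)` pair; the brute-force `has_clique` is included only to check small examples and is not part of the reduction.

```python
from itertools import combinations

def three_sat_to_clique(clauses):
    """Build (nodes, edges, k) from a list of 3-literal clauses."""
    nodes = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = {
        frozenset((u, v))
        for u, v in combinations(nodes, 2)
        if u[0] != v[0]        # (1) different clauses
        and u[1] != -v[1]      # (2) not a variable and its negation
    }
    return nodes, edges, len(clauses)

def has_clique(nodes, edges, k):
    """Exponential-time check, for demonstration on tiny graphs only."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(group, 2))
        for group in combinations(nodes, k)
    )

# (x1 v ~x2 v x3) ^ (~x1 v x2 v ~x4): satisfiable
nodes, edges, k = three_sat_to_clique([[1, -2, 3], [-1, 2, -4]])
print(len(nodes), len(edges), k)     # 6 7 2
print(has_clique(nodes, edges, k))   # True

# (x1 v x1 v x1) ^ (~x1 v ~x1 v ~x1): unsatisfiable
nodes2, edges2, k2 = three_sat_to_clique([[1, 1, 1], [-1, -1, -1]])
print(has_clique(nodes2, edges2, k2))  # False
```

The transformation itself examines every pair of the $3m$ nodes, so it runs in $O(m^2)$ time; only the demonstration solver is slow.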

Independent Set

Input: An undirected graph $G = (V, E)$ and a positive integer $k$.
Question: Does $G$ contain an independent set of size $k$? (An independent set is a subset $S \subseteq V$ of $k$ vertices such that no two vertices in $S$ are connected by an edge.)
Example: In a conflict graph (edges = conflicts), an independent set is a group of $k$ items with no conflicts between them.
In NP: Certificate = the $k$ vertices. Verify no edge exists between any pair.
NPC: Proven by reduction from Clique. ($G$ has a $k$-clique $\iff$ the complement graph $\bar{G}$ has an independent set of size $k$.)
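The complement-graph argument can be checked on a small example. This sketch assumes an undirected graph given as a vertex list and an edge list; the helper names are hypothetical.

```python
from itertools import combinations

def complement(vertices, edges):
    """Edges of the complement graph: all pairs that are NOT edges of G."""
    edge_set = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - edge_set

def is_clique(edge_set, group):
    return all(frozenset(p) in edge_set for p in combinations(group, 2))

def is_independent(edge_set, group):
    return not any(frozenset(p) in edge_set for p in combinations(group, 2))

vertices = [1, 2, 3, 4]
g_edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
co_edges = complement(vertices, g_edges)

print(sorted(sorted(e) for e in co_edges))   # [[1, 4], [2, 4]]
print(is_clique(g_edges, [1, 2, 3]))         # True
print(is_independent(co_edges, [1, 2, 3]))   # True
```

Building the complement takes $O(|V|^2)$ time and $k$ is left unchanged, so the transformation is polynomial.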

Vertex Cover

Input: An undirected graph $G = (V, E)$ and a positive integer $k$.
Question: Does $G$ have a vertex cover of size $\le k$? (A vertex cover is a subset $S \subseteq V$ such that every edge in $E$ has at least one endpoint in $S$.)
Example: Place guards at $k$ intersections so every street has at least one guard on it.
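In NP: Certificate = a set $S$ of at most $k$ vertices. Verify every edge has at least one endpoint in $S$.
NPC: Proven by reduction from Independent Set. ($S$ is an independent set $\iff$ $V \setminus S$ is a vertex cover, so $G$ has an independent set of size $k$ $\iff$ $G$ has a vertex cover of size $\le |V| - k$.)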
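The complement relationship between independent sets and vertex covers can be checked exhaustively on a small graph. This is a sketch with hypothetical helper names, assuming an undirected graph given as an edge list.

```python
from itertools import combinations

def is_independent_set(edges, s):
    """No edge has both endpoints in s."""
    return not any(u in s and v in s for u, v in edges)

def is_vertex_cover(edges, s):
    """Every edge has at least one endpoint in s."""
    return all(u in s or v in s for u, v in edges)

vertices = {1, 2, 3, 4}
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]

s = {1, 4}
print(is_independent_set(edges, s))            # True
print(is_vertex_cover(edges, vertices - s))    # True: {2, 3} touches every edge

# The equivalence holds for every subset, not just this one.
print(all(
    is_independent_set(edges, set(sub)) == is_vertex_cover(edges, vertices - set(sub))
    for r in range(len(vertices) + 1)
    for sub in combinations(vertices, r)
))  # True
```

For the reduction, the graph is left unchanged and only the parameter changes: an Independent Set instance $(G, k)$ maps to the Vertex Cover instance $(G, |V| - k)$.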

Hamiltonian Cycle

Input: An undirected (or directed) graph $G = (V, E)$.
Question: Does $G$ contain a Hamiltonian cycle? (A cycle that visits every vertex exactly once and returns to the starting vertex.)
Example: Can a delivery driver visit every city exactly once and return home, using only the available roads?
In NP: Certificate = the ordering of all $n$ vertices. Verify each consecutive pair (and wrap-around) is an edge, and all vertices appear exactly once.
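A minimal sketch of this verifier, assuming an undirected graph with at least 3 vertices, given as a vertex list and an edge list. The function name is hypothetical.

```python
def verify_hamiltonian_cycle(vertices, edges, order):
    """Check that `order` lists every vertex once and consecutive vertices are adjacent."""
    edge_set = {frozenset(e) for e in edges}
    # Every vertex must appear exactly once.
    if len(order) != len(vertices) or set(order) != set(vertices):
        return False
    n = len(order)
    # Consecutive pairs, including the wrap-around from last back to first.
    return all(
        frozenset((order[i], order[(i + 1) % n])) in edge_set
        for i in range(n)
    )

vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]   # a 4-cycle
print(verify_hamiltonian_cycle(vertices, edges, [1, 2, 3, 4]))  # True
print(verify_hamiltonian_cycle(vertices, edges, [1, 3, 2, 4]))  # False: 1-3 is not an edge
print(verify_hamiltonian_cycle(vertices, edges, [1, 2, 3, 3]))  # False: 4 is missing
```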
NPC: Proven by reduction from 3-SAT (or Vertex Cover). The construction is intricate.

Traveling Salesperson Problem (TSP)

Input: A set of $n$ cities, a distance $d(i,j) \ge 0$ between every pair of cities, and a budget $B$.
Question: Is there a tour that visits every city exactly once, returns to the starting city, and has total distance $\le B$?
Example: A salesperson must visit clients in 20 cities. Can they do it while driving at most 500 miles total?
In NP: Certificate = the ordering of cities. Verify all cities appear, sum the distances, check $\le B$.
NPC: Proven by reduction from Hamiltonian Cycle.

HamCycle → TSP Reduction

Given graph $G = (V, E)$ for the Hamiltonian Cycle problem, construct a TSP instance:
Cities = vertices of $G$.
Distances: If edge $(u,v) \in E$, set $d(u,v) = 1$. Otherwise, set $d(u,v) = 2$.
Budget $B = |V|$.
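Correctness: $G$ has a Hamiltonian cycle $\iff$ the TSP instance has a tour of total cost $\le |V|$. Any tour has exactly $|V|$ legs, each costing at least 1, so a cost of $|V|$ is possible only when every leg has weight 1, that is, when the tour uses only original edges of $G$.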
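The construction can be sketched as follows, assuming an undirected graph with at least 3 vertices. The brute-force `tsp_within_budget` stands in for the "magic solver" and is exponential; it is included only so the example can be run on tiny inputs. All names are hypothetical.

```python
from itertools import permutations

def ham_cycle_to_tsp(vertices, edges):
    """Map a Hamiltonian Cycle instance to a TSP decision instance."""
    edge_set = {frozenset(e) for e in edges}
    cities = list(vertices)
    dist = {
        (u, v): 1 if frozenset((u, v)) in edge_set else 2
        for u in cities for v in cities if u != v
    }
    return cities, dist, len(cities)      # budget B = |V|

def tsp_within_budget(cities, dist, budget):
    """Try every tour starting at cities[0]; demonstration only."""
    first, rest = cities[0], cities[1:]
    for perm in permutations(rest):
        tour = [first, *perm]
        cost = sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))
        if cost <= budget:
            return True
    return False

# A 4-cycle has a Hamiltonian cycle.
print(tsp_within_budget(*ham_cycle_to_tsp([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])))  # True
# A star (centre 0, three leaves) does not: the cheapest tour costs 6 > 4.
print(tsp_within_budget(*ham_cycle_to_tsp([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])))          # False
```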

Graph Coloring (3-Coloring)

Input: An undirected graph $G = (V, E)$.
Question: Can you assign one of 3 colors to each vertex so that no two adjacent vertices share the same color?
Example: You have 3 time slots for exams. Each course is a vertex, and two courses share an edge if some student takes both. Can you schedule all exams so no student has a conflict?
In NP: Certificate = the color assignment. Verify every edge has differently-colored endpoints.
NPC: Proven by reduction from 3-SAT.

Subset Sum

Input: A set of $n$ positive integers $\{a_1, a_2, \ldots, a_n\}$ and a target integer $T$.
Question: Is there a subset of these integers that adds up to exactly $T$?
Example: You have bills of sizes \$3, \$7, \$1, \$8, \$4. Can you pick some that total exactly \$12? (Yes: \$3 + \$1 + \$8.)
In NP: Certificate = the chosen subset. Verify the elements are from the input and their sum equals $T$.
NPC: Proven by reduction from 3-SAT.

0/1 Knapsack (Decision Version)

Input: $n$ items, each with a weight $w_i$ and a value $v_i$; a weight capacity $W$; and a value target $V$.
Question: Is there a subset of items with total weight $\le W$ and total value $\ge V$?
Example: Your backpack holds 15 kg. Items have weights and dollar values. Can you pack items worth at least \$100 without exceeding 15 kg?
In NP: Certificate = the chosen items. Verify weight $\le W$ and value $\ge V$.
NPC: Proven by reduction from Subset Sum.
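One standard way to carry out this reduction is to give each item a weight and a value both equal to the corresponding integer, and to set $W = V = T$. A subset then has weight $\le T$ and value $\ge T$ exactly when it sums to $T$. The sketch below assumes that construction; the brute-force `knapsack_decision` is for demonstration only, and all names are hypothetical.

```python
from itertools import combinations

def subset_sum_to_knapsack(numbers, target):
    """Each integer a becomes an item with weight a and value a; W = V = target."""
    items = [(a, a) for a in numbers]
    return items, target, target

def knapsack_decision(items, capacity, value_target):
    """Exhaustive check over all subsets of items."""
    for r in range(len(items) + 1):
        for chosen in combinations(items, r):
            weight = sum(w for w, _ in chosen)
            value = sum(v for _, v in chosen)
            if weight <= capacity and value >= value_target:
                return True
    return False

bills = [3, 7, 1, 8, 4]
print(knapsack_decision(*subset_sum_to_knapsack(bills, 12)))  # True, e.g. 3 + 1 + 8
print(knapsack_decision(*subset_sum_to_knapsack(bills, 2)))   # False
```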

The Reduction Chain

How Classic NPC Proofs Connect

SAT (Cook-Levin: NPC from scratch)
└→ 3-SAT
    ├→ Clique → Independent Set → Vertex Cover
    ├→ 3-Coloring
    ├→ Subset Sum → Knapsack
    └→ Hamiltonian Cycle → TSP
Each arrow $A \to B$ means "$A$ was reduced to $B$" (i.e., $A \le_p B$), proving $B$ is NPC.
You only need one reduction into a problem to prove it NP-Hard.

Interactive Demonstrations

Reduction Visualizer

🚀 Interactive Demo: reduction_demo.html

Practice Problems

Problem 1: Is 2-SAT NP-Complete?

2-SAT: Same as 3-SAT, but every clause has exactly 2 literals.
Example: $(x_1 \lor \neg x_3) \land (\neg x_2 \lor x_4) \land (x_1 \lor x_2)$.
Answer: No, 2-SAT is in P.
It can be solved in linear time using an implication graph and strongly connected components.
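A sketch of the implication-graph approach, assuming clauses are pairs of signed integers over variables $1, \ldots, n$. Each clause $(a \lor b)$ contributes the implications $\neg a \Rightarrow b$ and $\neg b \Rightarrow a$, and the formula is satisfiable exactly when no variable lies in the same strongly connected component as its negation. Kosaraju's two-pass algorithm is used here as one common choice for finding the components; the function name is hypothetical.

```python
def two_sat_satisfiable(num_vars, clauses):
    def node(lit):                       # x_i -> 2(i-1), NOT x_i -> 2(i-1)+1
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    reverse = [[] for _ in range(n)]
    for a, b in clauses:
        for src, dst in ((node(-a), node(b)), (node(-b), node(a))):
            graph[src].append(dst)
            reverse[dst].append(src)

    # Pass 1: iterative DFS, recording vertices in order of finishing time.
    seen, order = [False] * n, []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, iter(graph[start]))]
        while stack:
            v, neighbours = stack[-1]
            for w in neighbours:
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, iter(graph[w])))
                    break
            else:                        # all neighbours explored: v is finished
                order.append(v)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finishing time.
    comp, label = [-1] * n, 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = label
        stack = [start]
        while stack:
            v = stack.pop()
            for w in reverse[v]:
                if comp[w] == -1:
                    comp[w] = label
                    stack.append(w)
        label += 1

    return all(comp[2 * i] != comp[2 * i + 1] for i in range(num_vars))

# (x1 v ~x3) ^ (~x2 v x4) ^ (x1 v x2)
print(two_sat_satisfiable(4, [(1, -3), (-2, 4), (1, 2)]))  # True
# (x1 v x1) ^ (~x1 v ~x1)
print(two_sat_satisfiable(1, [(1, 1), (-1, -1)]))          # False
```

Both passes visit each vertex and edge a constant number of times, which gives the linear running time.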
This shows that small changes in a problem definition (3 literals → 2 literals) can drastically change complexity.

Problem 2: HamCycle → TSP Reduction

We showed this reduction earlier. Verify you understand it:
Q: Why do we set non-edges to weight 2 (not weight 1)?
A: If all distances were 1, every tour would cost $|V|$, and we couldn't distinguish Hamiltonian from non-Hamiltonian.
Q: Why is the budget $B = |V|$?
A: A tour uses exactly $|V|$ edges. Each real edge costs 1. Total = $|V|$ only if every edge in the tour is a real edge from $G$.

Problem 3: Is the Halting Problem NP-Complete?

Halting Problem: Given a program and an input, does the program eventually stop (rather than running forever)?
Answer: No, the Halting Problem is not NP-Complete.
It is undecidable — no algorithm can solve it at all, regardless of time.
It is NP-Hard (every problem in NP reduces to it), but it is not in NP: every problem in NP is decidable, since an algorithm can try all polynomial-size certificates, whereas the Halting Problem is undecidable.
NP-Complete requires being both NP-Hard and in NP.

Problem 4: Reduction Direction Check

Suppose you prove a reduction from your new problem $X$ to 3-SAT ($X \le_p \text{3-SAT}$).
Does this prove $X$ is NP-Complete?
No. This only shows $X$ is no harder than 3-SAT. It implies $X \in NP$, but it says nothing about whether $X$ is NP-Hard; $X$ could even be in P.
To prove $X$ is NPC, you need a reduction from 3-SAT to $X$: ($\text{3-SAT} \le_p X$).

Common Mistakes

Pitfalls to Avoid

Wrong direction: $A \le_p B$ means $B$ is at least as hard as $A$, not the other way around. "Reduce FROM known-hard TO new."
NP ≠ 'Not Polynomial': NP means solutions can be verified quickly. Many NP problems are also in P.
P ⊆ NP: Every problem in P is automatically in NP. P and NP are not disjoint sets.
Forgetting Step 1: To prove NPC, you must show the problem is in NP (state the certificate and verifier), not just do a reduction.
Confusing NP-Hard and NPC: NP-Hard problems might not be in NP (e.g., Halting Problem). NPC = NP-Hard $\cap$ NP.

Summary

Key Takeaways

P = problems solvable in polynomial time.
NP = problems where YES answers have a short proof that can be checked in polynomial time.
Reduction $A \le_p B$: a poly-time conversion showing $B$ is at least as hard as $A$.
NP-Hard = at least as hard as everything in NP.
NP-Complete = NP-Hard and in NP — the hardest problems within NP.
Cook-Levin Theorem: SAT is NP-Complete (the anchor for all NPC proofs).
To prove a new problem is NPC: (1) show it is in NP, (2) reduce a known NPC problem to it.

Supplementary Resources

🚀 Interactive Demo: reduction_demo.html

🚀 Interactive Demo: reading_practice.html

🚀 Interactive Demo: worked_examples.html

🚀 Interactive Demo: handout.html