Greedy Algorithms: Concepts and Applications
Introduction to Greedy Algorithms
Greedy algorithms are a class of algorithms that make locally optimal choices at each step with the hope of finding a global optimum. They are simple, intuitive, and often efficient, but don't always produce the optimal solution for all problems.
Key Characteristics:
- Greedy Choice Property: A global optimum can be reached by making locally optimal choices
- Optimal Substructure: An optimal solution contains optimal solutions to subproblems
- Irrevocable: Choices made cannot be changed later
When to Use Greedy Algorithms
Greedy algorithms work well for problems where:
- Local optimal choices lead to global optimum
- The problem has optimal substructure
- We can make a decision without considering future choices
- The solution can be built incrementally
Classic Greedy Problems with Python Implementations
1. Activity Selection Problem
Problem: Select the maximum number of non-overlapping activities from a set.
Greedy Approach:
- Sort activities by finish time
- Select first activity
- For each remaining activity, select if it doesn't conflict with last selected
def activity_selection(start, finish):
n = len(finish)
selected = []
# Sort activities by finish time
activities = sorted(zip(start, finish), key=lambda x: x[1])
# Select first activity
i = 0
selected.append(i)
for j in range(1, n):
# If activity starts after last finishes, select it
if activities[j][0] >= activities[i][1]:
selected.append(j)
i = j
return selected
# Example
start = [1, 3, 0, 5, 8, 5]
finish = [2, 4, 6, 7, 9, 9]
selected = activity_selection(start, finish)
print("Selected activities:", selected) # Output: [0, 1, 3, 4]
2. Fractional Knapsack Problem
Problem: Fill a knapsack with maximum value, allowing fractions of items.
Greedy Approach:
- Calculate value/weight ratio for each item
- Sort items by ratio in descending order
- Take as much as possible of highest ratio items
def fractional_knapsack(value, weight, capacity):
n = len(value)
items = list(zip(value, weight))
# Sort by value/weight ratio (descending)
items.sort(key=lambda x: x[0]/x[1], reverse=True)
total_value = 0.0
for v, w in items:
if capacity >= w:
total_value += v
capacity -= w
else:
fraction = capacity / w
total_value += v * fraction
break
return total_value
# Example
value = [60, 100, 120]
weight = [10, 20, 30]
capacity = 50
print("Maximum value:", fractional_knapsack(value, weight, capacity))
# Output: 240.0 (all of items 1 and 2, 2/3 of item 3)
3. Huffman Coding (Data Compression)
Problem: Assign variable-length codes to characters to minimize total bits.
Greedy Approach:
- Build frequency table
- Create min-heap of nodes
- Repeatedly combine two smallest nodes
- Assign codes based on tree paths
import heapq
class HuffmanNode:
def __init__(self, char=None, freq=0, left=None, right=None):
self.char = char
self.freq = freq
self.left = left
self.right = right
def __lt__(self, other):
return self.freq < other.freq
def build_huffman_tree(text):
if not text:
return None
# Calculate frequency
freq = {}
for char in text:
freq[char] = freq.get(char, 0) + 1
# Create priority queue
heap = []
for char, count in freq.items():
heapq.heappush(heap, HuffmanNode(char, count))
# Build tree
while len(heap) > 1:
left = heapq.heappop(heap)
right = heapq.heappop(heap)
merged = HuffmanNode(freq=left.freq + right.freq, left=left, right=right)
heapq.heappush(heap, merged)
return heapq.heappop(heap)
def build_codes(node, prefix="", codebook={}):
if node is None:
return
if node.char is not None:
codebook[node.char] = prefix
build_codes(node.left, prefix + "0", codebook)
build_codes(node.right, prefix + "1", codebook)
return codebook
# Example
text = "this is an example for huffman encoding"
huffman_tree = build_huffman_tree(text)
codes = build_codes(huffman_tree)
print("Huffman Codes:", codes)
4. Dijkstra's Algorithm (Shortest Path)
Problem: Find shortest paths from a source to all other vertices.
Greedy Approach:
- Maintain set of unvisited nodes
- At each step, pick node with smallest distance
- Update distances to its neighbors
- Repeat until all nodes visited
import heapq
def dijkstra(graph, start):
distances = {vertex: float('infinity') for vertex in graph}
distances[start] = 0
heap = [(0, start)]
while heap:
current_dist, current_vertex = heapq.heappop(heap)
if current_dist > distances[current_vertex]:
continue
for neighbor, weight in graph[current_vertex].items():
distance = current_dist + weight
if distance < distances[neighbor]:
distances[neighbor] = distance
heapq.heappush(heap, (distance, neighbor))
return distances
# Example
graph = {
'A': {'B': 2, 'C': 5},
'B': {'A': 2, 'C': 1, 'D': 7},
'C': {'A': 5, 'B': 1, 'D': 3},
'D': {'B': 7, 'C': 3}
}
print("Shortest distances:", dijkstra(graph, 'A'))
# Output: {'A': 0, 'B': 2, 'C': 3, 'D': 6}
5. Minimum Spanning Tree (Prim's Algorithm)
Problem: Find a minimum weight spanning tree for a connected weighted graph.
Greedy Approach:
- Start with arbitrary vertex
- At each step, add cheapest edge that connects tree to new vertex
- Repeat until all vertices are included
import heapq
def prim_mst(graph):
mst = []
visited = set()
start_vertex = next(iter(graph))
heap = []
visited.add(start_vertex)
for neighbor, weight in graph[start_vertex].items():
heapq.heappush(heap, (weight, start_vertex, neighbor))
while heap and len(visited) < len(graph):
weight, u, v = heapq.heappop(heap)
if v not in visited:
visited.add(v)
mst.append((u, v, weight))
for neighbor, w in graph[v].items():
if neighbor not in visited:
heapq.heappush(heap, (w, v, neighbor))
return mst
# Example
graph = {
'A': {'B': 2, 'D': 6},
'B': {'A': 2, 'C': 3, 'D': 8},
'C': {'B': 3, 'D': 4},
'D': {'A': 6, 'B': 8, 'C': 4}
}
print("MST edges:", prim_mst(graph))
# Output: [('A', 'B', 2), ('B', 'C', 3), ('C', 'D', 4)]
6. Coin Change Problem (Greedy Version)
Problem: Make change using fewest coins (when greedy works).
Greedy Approach:
- Sort coins in descending order
- Use as many as possible of largest coin
- Move to next smaller coin
def coin_change_greedy(coins, amount):
coins.sort(reverse=True)
change = []
for coin in coins:
while amount >= coin:
amount -= coin
change.append(coin)
return change if amount == 0 else None
# Example (works for standard coin systems)
coins = [25, 10, 5, 1]
amount = 63
print("Change for 63 cents:", coin_change_greedy(coins, amount))
# Output: [25, 25, 10, 1, 1, 1]
7. Job Sequencing with Deadlines
Problem: Schedule jobs to maximize profit before deadlines.
Greedy Approach:
- Sort jobs by profit (descending)
- Schedule each job as late as possible before deadline
def job_sequencing(jobs):
# Sort jobs by profit (descending)
jobs.sort(key=lambda x: x[2], reverse=True)
max_deadline = max(job[1] for job in jobs)
schedule = [None] * (max_deadline + 1)
for job in jobs:
id, deadline, profit = job
# Find latest available slot before deadline
for i in range(deadline, 0, -1):
if schedule[i] is None:
schedule[i] = id
break
# Remove None values and return scheduled jobs
return [job for job in schedule if job is not None]
# Example
jobs = [
('a', 2, 100),
('b', 1, 19),
('c', 2, 27),
('d', 1, 25),
('e', 3, 15)
]
print("Scheduled jobs:", job_sequencing(jobs))
# Output: ['c', 'a', 'e']
Greedy vs Dynamic Programming
Characteristic | Greedy | Dynamic Programming |
---|---|---|
Approach | Makes locally optimal choice | Considers all possibilities |
Optimality | Not always optimal | Always optimal |
Speed | Faster | Slower |
Space | Usually less memory | More memory |
Examples | Dijkstra's, Prim's | Knapsack, LCS |
When Greedy Fails
Greedy algorithms don't always produce optimal solutions. Example: The classic 0/1 Knapsack problem cannot be solved optimally with a greedy approach (except in fractional version).
# Non-optimal greedy solution for 0/1 Knapsack
def greedy_knapsack(value, weight, capacity):
ratio = [v/w for v, w in zip(value, weight)]
indices = sorted(range(len(ratio)), key=lambda i: ratio[i], reverse=True)
total_value = 0
total_weight = 0
for i in indices:
if total_weight + weight[i] <= capacity:
total_value += value[i]
total_weight += weight[i]
return total_value
# Example where greedy fails
value = [60, 100, 120]
weight = [10, 20, 30]
capacity = 50
print("Greedy knapsack:", greedy_knapsack(value, weight, capacity)) # 160
print("Optimal solution:", 220) # Using DP would give 220
Proof Techniques for Greedy Algorithms
To verify a greedy algorithm's correctness:
- Greedy Choice Property: Prove that greedy choice is part of some optimal solution
- Optimal Substructure: Show that remaining subproblem is similar to original
- Exchange Argument: Show that any optimal solution can be transformed to include greedy choice
Greedy algorithms are powerful tools when applicable, offering efficient solutions to many optimization problems. The key is recognizing when the greedy choice property holds and when it doesn't.