Chapter 18: Greedy Algorithms

Welcome to Chapter 18, where we’ll dive into the fascinating world of Greedy Algorithms. A greedy algorithm makes the locally best choice at each step, in the hope that these choices add up to a globally optimal solution. The idea is simple: take the immediate advantage without worrying about the future. While this may sound risky, for many real-world problems the greedy choice does lead to an optimal solution.

In this chapter, we will explore two popular applications of greedy algorithms:

  1. Huffman Coding — for data compression.
  2. Minimum Spanning Trees — using Kruskal’s and Prim’s algorithms for efficient network design.

Let’s get started with Huffman Coding!


18.1 Huffman Coding

Definition:

Huffman Coding is a greedy algorithm used for lossless data compression. It assigns variable-length codes to input characters, with shorter codes assigned to more frequent characters and longer codes to less frequent ones. This minimizes the overall size of the encoded data.

How Huffman Coding Works:

Here’s how you can visualize Huffman Coding:

  1. Frequency Calculation: Start by calculating the frequency of each character in the input data.
  2. Build a Priority Queue: Insert all characters (or symbols) into a priority queue, where the element with the lowest frequency has the highest priority.
  3. Construct the Huffman Tree: Repeatedly extract the two characters with the smallest frequencies from the queue, combine them into a new node (with the combined frequency), and reinsert this node back into the queue. Keep doing this until there’s only one node left – the root of the Huffman tree.
  4. Generate Codes: Starting from the root of the tree, assign 0 to the left edge and 1 to the right edge of each node. Traverse the tree to generate the variable-length codes for each character.
Example:

Let’s say we want to encode the string "AABACD".

  1. Character frequencies:

    • A: 3
    • B: 1
    • C: 1
    • D: 1
  2. Create a priority queue:

    • Insert A(3), B(1), C(1), D(1).
  3. Building the tree:

    • Combine B and C (1 + 1 = 2), now we have A(3), BC(2), D(1).
    • Combine D and BC (1 + 2 = 3), now we have A(3), DBC(3).
    • Combine A and DBC (3 + 3 = 6), this becomes the root.
  4. Generate Huffman Codes:

    • A = 0
    • B = 101
    • C = 100
    • D = 11

So, the encoded string for "AABACD" becomes: 0 0 101 0 100 11.
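As a quick sanity check on this example, the short snippet below totals the encoded size from the frequencies and code lengths listed above. The numbers are copied straight from the example, so this is just an illustration of the savings, not part of the algorithm itself.

#include <stdio.h>

int main(void) {
    // Frequencies and Huffman code lengths taken from the "AABACD" example
    int freq[]    = {3, 1, 1, 1}; // A, B, C, D
    int codeLen[] = {1, 3, 3, 2}; // A = 0, B = 101, C = 100, D = 11

    int totalBits = 0;
    for (int i = 0; i < 4; ++i)
        totalBits += freq[i] * codeLen[i];

    // Prints 11 bits, versus 6 * 8 = 48 bits for plain 8-bit characters
    printf("Encoded size: %d bits\n", totalBits);
    return 0;
}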

Huffman Coding Implementation in C:
#include <stdio.h>
#include <stdlib.h>

// A Huffman tree node
struct MinHeapNode {
    char data;
    unsigned freq;
    struct MinHeapNode *left, *right;
};

// Function to create a new min heap node
struct MinHeapNode* newNode(char data, unsigned freq) {
    struct MinHeapNode* temp = (struct MinHeapNode*) malloc(sizeof(struct MinHeapNode));
    temp->data = data;
    temp->freq = freq;
    temp->left = temp->right = NULL;
    return temp;
}

// A utility function to print an array of size n
void printArray(int arr[], int n) {
    for (int i = 0; i < n; ++i)
        printf("%d", arr[i]);
    printf("\n");
}

// Print Huffman codes from the root of the tree
void printHuffmanCodes(struct MinHeapNode* root, int arr[], int top) {
    if (root->left) {
        arr[top] = 0;
        printHuffmanCodes(root->left, arr, top + 1);
    }

    if (root->right) {
        arr[top] = 1;
        printHuffmanCodes(root->right, arr, top + 1);
    }

    // Leaf nodes hold the actual characters; print the code built so far
    if (!root->left && !root->right) {
        printf("%c: ", root->data);
        printArray(arr, top);
    }
}

// For demonstration, this function builds a tiny two-leaf tree and prints its codes;
// a complete version would construct the tree from a min-heap as described in the steps above
void HuffmanCodes(char data[], int freq[], int size) {
    // Create a simple example of a Huffman Tree (for demonstration purposes)
    struct MinHeapNode* root = newNode('$', 0);
    root->left = newNode('A', freq[0]);
    root->right = newNode('B', freq[1]);

    int arr[100], top = 0;
    printHuffmanCodes(root, arr, top);
}

int main() {
    char arr[] = {'A', 'B'};
    int freq[] = {5, 9};

    int size = sizeof(arr) / sizeof(arr[0]);
    HuffmanCodes(arr, freq, size);

    return 0;
}

This code gives a basic illustration of Huffman Coding. You would need a priority queue or min-heap to fully implement the algorithm, which can be a great exercise for you to try!
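If you want to take that exercise one step further without writing a full min-heap, here is a minimal sketch of the tree-building step. It reuses the MinHeapNode struct and newNode function from the listing above and simply scans an array for the two lowest-frequency nodes on each round, so it runs in O(n²) rather than O(n log n); the buildHuffmanTree name and the fixed 100-node limit are simplifications for this sketch, not part of a standard API.

// Builds the full Huffman tree greedily; the linear scan below stands in for
// the priority queue described in the steps above.
struct MinHeapNode* buildHuffmanTree(char data[], int freq[], int size) {
    struct MinHeapNode* nodes[100]; // simplification: at most 100 symbols
    int count = size;

    for (int i = 0; i < size; ++i)
        nodes[i] = newNode(data[i], freq[i]);

    while (count > 1) {
        // Greedy step: find the two nodes with the smallest frequencies
        int min1 = 0, min2 = 1;
        if (nodes[min2]->freq < nodes[min1]->freq) { min1 = 1; min2 = 0; }
        for (int i = 2; i < count; ++i) {
            if (nodes[i]->freq < nodes[min1]->freq) { min2 = min1; min1 = i; }
            else if (nodes[i]->freq < nodes[min2]->freq) { min2 = i; }
        }

        // Merge the two nodes under a new internal node ('$' marks internal nodes)
        struct MinHeapNode* parent = newNode('$', nodes[min1]->freq + nodes[min2]->freq);
        parent->left = nodes[min1];
        parent->right = nodes[min2];

        // Remove the two merged nodes and put the new parent in their place
        int hi = (min1 > min2) ? min1 : min2;
        int lo = (min1 < min2) ? min1 : min2;
        nodes[hi] = nodes[count - 1]; // move the last node into the freed slot
        nodes[lo] = parent;
        count--;
    }
    return nodes[0]; // root of the Huffman tree
}

Inside HuffmanCodes, the hard-coded demo tree could then be replaced with a single call: struct MinHeapNode* root = buildHuffmanTree(data, freq, size); and the code-printing part stays the same.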

Pros and Cons of Huffman Coding:
  • Pros:

    • Reduces the size of data significantly, leading to efficient storage.
    • Works well for characters with varying frequencies.
  • Cons:

    • Requires prior knowledge of character frequencies (usually gathered in an extra pass over the data).
    • Building and storing the code tree adds overhead, which matters for very large alphabets or datasets.

18.2 Minimum Spanning Trees

Now let’s move to the Minimum Spanning Tree (MST) problem. An MST of a connected, weighted, undirected graph is a subgraph that connects all the vertices with the minimum possible total edge weight, without forming any cycles.

There are two popular algorithms for finding the MST:

  1. Kruskal’s Algorithm
  2. Prim’s Algorithm

Let’s start with Kruskal’s Algorithm.


Kruskal’s Algorithm

Kruskal’s Algorithm is a greedy approach that treats the graph as a forest, where each vertex is its own tree. It keeps adding the smallest edge that connects two separate trees until all vertices are connected.

How Kruskal’s Algorithm Works:
  1. Sort all the edges in increasing order based on their weights.
  2. Initialize an empty forest (collection of trees), where each vertex is its own tree.
  3. Pick the smallest edge. If it connects two different trees, add it to the MST and merge the trees.
  4. Repeat this until all vertices are connected in a single tree.
Kruskal’s Algorithm Example:

Let’s say we have the following graph with edges:

Edge Weight
A-B 1
A-C 4
B-C 3
B-D 2
C-D 5
  1. Sort edges: A-B (1), B-D (2), B-C (3), A-C (4), C-D (5).
  2. Start by adding A-B to the MST.
  3. Add B-D next.
  4. Add B-C.
  5. Skip A-C and C-D (each would create a cycle); with V - 1 = 3 edges added, the MST is complete.
  6. The MST includes edges: A-B, B-D, B-C with a total weight of 1 + 2 + 3 = 6.
Kruskal’s Algorithm Implementation in C:
#include <stdio.h>
#include <stdlib.h>

// Structure to represent an edge
struct Edge {
    int src, dest, weight;
};

// Structure to represent a graph
struct Graph {
    int V, E;
    struct Edge* edges;
};

// Create a new graph
struct Graph* createGraph(int V, int E) {
    struct Graph* graph = (struct Graph*) malloc(sizeof(struct Graph));
    graph->V = V;
    graph->E = E;
    graph->edges = (struct Edge*) malloc(graph->E * sizeof(struct Edge));
    return graph;
}

// A utility function to find the subset of an element
int find(int parent[], int i) {
    if (parent[i] == i)
        return i;
    return find(parent, parent[i]);
}

// A function to do union of two subsets
void Union(int parent[], int x, int y) {
    int xroot = find(parent, x);
    int yroot = find(parent, y);
    parent[xroot] = yroot;
}

// Comparator used by qsort to order edges by increasing weight
int compareEdges(const void* a, const void* b) {
    struct Edge* e1 = (struct Edge*) a;
    struct Edge* e2 = (struct Edge*) b;
    return e1->weight - e2->weight;
}

// Function to implement Kruskal's Algorithm
void KruskalMST(struct Graph* graph) {
    // Sort the edges based on their weight
    qsort(graph->edges, graph->E, sizeof(struct Edge), compareEdges);

    // Initially, every vertex is its own subset (its own tree in the forest)
    int parent[graph->V];
    for (int v = 0; v < graph->V; ++v)
        parent[v] = v;

    int e = 0; // Count of edges in the MST so far
    int i = 0; // Index into the sorted edge array

    while (e < graph->V - 1 && i < graph->E) {
        struct Edge next_edge = graph->edges[i++];
        int x = find(parent, next_edge.src);
        int y = find(parent, next_edge.dest);

        if (x != y) {
            e++;
            printf("Edge included: %d - %d with weight %d\n", next_edge.src, next_edge.dest, next_edge.weight);
            Union(parent, x, y);
        }
    }
}

int main() {
    int V = 4, E = 5; // Vertices and edges
    struct Graph* graph = createGraph(V, E);

    // Edges from the example above, with A = 0, B = 1, C = 2, D = 3
    graph->edges[0] = (struct Edge){0, 1, 1}; // A-B, weight 1
    graph->edges[1] = (struct Edge){0, 2, 4}; // A-C, weight 4
    graph->edges[2] = (struct Edge){1, 2, 3}; // B-C, weight 3
    graph->edges[3] = (struct Edge){1, 3, 2}; // B-D, weight 2
    graph->edges[4] = (struct Edge){2, 3, 5}; // C-D, weight 5

    KruskalMST(graph);

    return 0;
}

The code builds the example graph, sorts its edges by weight, and uses Kruskal’s algorithm to print the MST edges: A-B, B-D, and B-C.
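One note on the helper functions above: find walks the whole parent chain every time, and Union links roots arbitrarily, so long chains can build up. A common refinement is path compression plus union by rank, sketched below with the same parent[] representation plus a rank[] array initialized to zero; the names findCompressed and unionByRank are just illustrative, and this is an optional optimization rather than something the basic algorithm requires.

// Optional refinement: find with path compression, union by rank
int findCompressed(int parent[], int i) {
    if (parent[i] != i)
        parent[i] = findCompressed(parent, parent[i]); // flatten the chain as we go
    return parent[i];
}

void unionByRank(int parent[], int rank[], int x, int y) {
    int xroot = findCompressed(parent, x);
    int yroot = findCompressed(parent, y);
    if (xroot == yroot) return; // already in the same subset

    if (rank[xroot] < rank[yroot])      // attach the shorter tree under the taller one
        parent[xroot] = yroot;
    else if (rank[xroot] > rank[yroot])
        parent[yroot] = xroot;
    else {                              // equal heights: pick one and bump its rank
        parent[yroot] = xroot;
        rank[xroot]++;
    }
}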


Prim’s Algorithm

While Kruskal’s algorithm works edge by edge across the whole graph, Prim’s Algorithm grows the MST from a single starting vertex, at each step adding the cheapest edge that connects the tree to a vertex not yet in it.

How Prim’s Algorithm Works:
  1. Start with an arbitrary vertex.
  2. Find the smallest edge that connects any vertex already in the MST to a vertex outside it.
  3. Add this edge (and the new vertex) and repeat the process until all vertices are included, as in the sketch below.
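The chapter’s listing above covers Kruskal’s algorithm only, so here is a minimal sketch of Prim’s algorithm in C using an adjacency matrix for the same four-vertex example graph (A = 0, B = 1, C = 2, D = 3). The primMST name, the adjacency-matrix representation, and the linear scans are simplifications for this sketch; a production version would typically use an adjacency list with a priority queue.

#include <stdio.h>
#include <limits.h>

#define V 4 // number of vertices in the example graph

// Minimal Prim's sketch over an adjacency matrix; 0 means "no edge"
void primMST(int graph[V][V]) {
    int key[V];    // cheapest known edge weight connecting each vertex to the MST
    int parent[V]; // MST vertex that cheapest edge comes from
    int inMST[V];  // 1 if the vertex is already in the MST

    for (int v = 0; v < V; ++v) {
        key[v] = INT_MAX;
        inMST[v] = 0;
    }
    key[0] = 0;     // start from vertex 0 (A)
    parent[0] = -1; // the start vertex has no parent

    for (int count = 0; count < V; ++count) {
        // Greedy step: pick the vertex outside the MST with the smallest key
        int u = -1;
        for (int v = 0; v < V; ++v)
            if (!inMST[v] && (u == -1 || key[v] < key[u]))
                u = v;
        inMST[u] = 1;

        // Update the keys of u's neighbours that are still outside the MST
        for (int v = 0; v < V; ++v)
            if (graph[u][v] && !inMST[v] && graph[u][v] < key[v]) {
                key[v] = graph[u][v];
                parent[v] = u;
            }
    }

    for (int v = 1; v < V; ++v)
        printf("Edge included: %d - %d with weight %d\n", parent[v], v, graph[v][parent[v]]);
}

int main(void) {
    // The example graph from the Kruskal section: A-B(1), A-C(4), B-C(3), B-D(2), C-D(5)
    int graph[V][V] = {
        {0, 1, 4, 0},
        {1, 0, 3, 2},
        {4, 3, 0, 5},
        {0, 2, 5, 0}
    };
    primMST(graph);
    return 0;
}

On this graph it selects the same edges as Kruskal’s algorithm did: A-B, B-C, and B-D, for a total weight of 6.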

These greedy algorithms – Huffman Coding and Minimum Spanning Trees – are incredibly useful in fields such as network design, file compression, and resource allocation. With this foundation, you are now equipped to dig deeper into the world of optimization!


In the next chapter, we will delve into more advanced graph algorithms.
