DSA – Data Structures and Algorithms

Chapter 23 – Advanced Data Structures

Chapter 23: Advanced Data Structures Advanced data structures are crucial for solving complex computational problems efficiently. They allow us to manage data in a way that reduces the time complexity of operations such as searching, updating, and querying large datasets. In this chapter, we will focus on three key advanced data structures: Hash Tables Tries Segment Trees and Fenwick Trees Each of these structures is designed to address different computational challenges and is widely used in competitive programming and real-world applications. Let’s explore these structures in detail. 23.1 Hash Tables A Hash Table is a data structure that provides an efficient way to store and retrieve data. It uses a hash function to map keys to values, enabling constant time average complexity for operations such as insertion, deletion, and search. 23.1.1 The Structure of Hash Tables A hash table consists of an array, where each position corresponds to an index produced by a hash function. The hash function converts a key (usually a string or an integer) into an index that maps to the array position where the associated value is stored. Hash Function: A function that converts a key into an index. Collisions: When two different keys produce the same index, a collision occurs. Hash tables handle collisions using two primary methods: Chaining: Multiple key-value pairs are stored in a list at the same index. Open Addressing: The hash table searches for the next available index. 23.1.2 Operations in Hash Tables Insert(key, value): Insert a key-value pair into the hash table. Search(key): Retrieve the value associated with the key. Delete(key): Remove the key-value pair from the hash table. Example Implementation of a Hash Table Here’s a simple implementation of a hash table using chaining in C. #include <stdio.h> #include <stdlib.h> #include <string.h> #define TABLE_SIZE 10 struct Node { int key; int value; struct Node* next; }; struct Node* hashTable[TABLE_SIZE]; int hashFunction(int key) { return key % TABLE_SIZE; } void insert(int key, int value) { int hashIndex = hashFunction(key); struct Node* newNode = (struct Node*) malloc(sizeof(struct Node)); newNode->key = key; newNode->value = value; newNode->next = hashTable[hashIndex]; hashTable[hashIndex] = newNode; } int search(int key) { int hashIndex = hashFunction(key); struct Node* node = hashTable[hashIndex]; while (node != NULL) { if (node->key == key) { return node->value; } node = node->next; } return -1; } void delete(int key) { int hashIndex = hashFunction(key); struct Node* node = hashTable[hashIndex]; struct Node* prev = NULL; while (node != NULL && node->key != key) { prev = node; node = node->next; } if (node == NULL) return; if (prev == NULL) { hashTable[hashIndex] = node->next; } else { prev->next = node->next; } free(node); } 23.2 Tries (Prefix Trees) A Trie (pronounced “try”) is a tree-like data structure that is used for storing strings efficiently. It is especially useful for scenarios where we need to quickly search for strings or prefixes of strings, such as in dictionary implementations and autocomplete systems. 23.2.1 Structure of a Trie A Trie stores strings character by character. Each node in the Trie represents a character, and a path from the root to a node represents a string or a prefix. The Trie allows us to check whether a string exists, whether a prefix exists, and retrieve all strings that start with a given prefix. Each node of a Trie has: Children: References to its child nodes, representing the next characters. 
End of Word Marker: A boolean value indicating if the node represents the end of a word. 23.2.2 Trie Operations Insert(word): Inserts a word into the Trie. Search(word): Checks if the word is present in the Trie. Prefix Search(prefix): Checks if there is any word in the Trie that starts with the given prefix. Example Implementation of a Trie #include <stdio.h> #include <stdlib.h> #include <string.h> #define ALPHABET_SIZE 26 struct TrieNode { struct TrieNode* children[ALPHABET_SIZE]; int isEndOfWord; }; struct TrieNode* getNode() { struct TrieNode* node = (struct TrieNode*) malloc(sizeof(struct TrieNode)); node->isEndOfWord = 0; for (int i = 0; i < ALPHABET_SIZE; i++) { node->children[i] = NULL; } return node; } void insert(struct TrieNode* root, const char* key) { struct TrieNode* current = root; for (int i = 0; i < strlen(key); i++) { int index = key[i] – ‘a’; if (!current->children[index]) { current->children[index] = getNode(); } current = current->children[index]; } current->isEndOfWord = 1; } int search(struct TrieNode* root, const char* key) { struct TrieNode* current = root; for (int i = 0; i < strlen(key); i++) { int index = key[i] – ‘a’; if (!current->children[index]) { return 0; } current = current->children[index]; } return current->isEndOfWord; } 23.3 Segment Trees and Fenwick Trees 23.3.1 Segment Trees A Segment Tree is a data structure used for efficiently performing range queries and updates on an array. For example, if we want to calculate the sum, minimum, or maximum of elements in a range of an array and frequently update the array values, a segment tree provides a more efficient solution compared to a simple array. A segment tree divides the array into segments, and each node in the tree stores information about a specific range. It allows range queries and point updates in O(log n) time. Operations on Segment Tree Build Tree: Construct the segment tree from an array. Range Query: Query a range for sum, minimum, maximum, etc. Update: Update an element in the array and reflect the changes in the segment tree. Example Implementation of a Segment Tree #include <stdio.h> int segmentTree[1000]; void buildTree(int arr[], int start, int end, int node) { if (start == end) { segmentTree[node] = arr[start]; } else { int mid = (start + end) / 2; buildTree(arr, start, mid, 2 * node + 1); buildTree(arr, mid + 1, end, 2 * node + 2); segmentTree[node] = segmentTree[2 * node + 1] + segmentTree[2 * node + 2]; } } int rangeQuery(int start, int end, int l, int r, int node) { if (l > end || r < start) { return 0; } if (l <= start && r >= end) { return segmentTree[node]; } int
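The segment tree listing above is cut off in the middle of the range-query function. Below is a minimal sketch of how the sum query and a point update are typically completed; it assumes the same global segmentTree array and child indexing (2 * node + 1, 2 * node + 2) used by buildTree above, and the name update is an illustrative choice.

int rangeQuery(int start, int end, int l, int r, int node) {
    if (l > end || r < start)
        return 0;                          // no overlap with [l, r]
    if (l <= start && r >= end)
        return segmentTree[node];          // node range fully inside [l, r]
    int mid = (start + end) / 2;           // partial overlap: recurse on both children
    return rangeQuery(start, mid, l, r, 2 * node + 1) +
           rangeQuery(mid + 1, end, l, r, 2 * node + 2);
}

void update(int start, int end, int idx, int value, int node) {
    if (start == end) {
        segmentTree[node] = value;         // leaf node for position idx
    } else {
        int mid = (start + end) / 2;
        if (idx <= mid)
            update(start, mid, idx, value, 2 * node + 1);
        else
            update(mid + 1, end, idx, value, 2 * node + 2);
        segmentTree[node] = segmentTree[2 * node + 1] + segmentTree[2 * node + 2];
    }
}

Both operations walk a single root-to-leaf path, which is where the O(log n) bound mentioned above comes from.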

Chapter 22 – Advanced Graph Algorithms

Chapter 22: Advanced Graph Algorithms In this chapter, we delve deeper into Advanced Graph Algorithms that are essential for solving complex problems related to graph traversal, connectivity, and more. These algorithms are foundational in many computer science applications, from navigating networks to solving puzzles and finding connected components. We will explore the following topics: Depth-First Search (DFS) Breadth-First Search (BFS) Topological Sorting Strongly Connected Components (Kosaraju’s Algorithm) 22.1 Depth-First Search (DFS) Depth-First Search (DFS) is a fundamental algorithm used to explore all the vertices and edges of a graph. DFS explores as far as possible along each branch before backtracking, which makes it useful for tasks like detecting cycles or solving maze-like problems. Problem Definition: Given a graph, perform DFS traversal from a starting vertex, visiting each vertex once. DFS Algorithm Steps: Start at a given vertex and mark it as visited. For each unvisited neighbor, recursively perform DFS. If all neighbors are visited, backtrack to the previous vertex. DFS Algorithm (in C): #include <stdio.h> #include <stdbool.h> #define V 5 void DFSUtil(int graph[V][V], int v, bool visited[]) { visited[v] = true; printf("%d ", v); for (int i = 0; i < V; i++) { if (graph[v][i] == 1 && !visited[i]) { DFSUtil(graph, i, visited); } } } void DFS(int graph[V][V], int startVertex) { bool visited[V] = {false}; DFSUtil(graph, startVertex, visited); } int main() { int graph[V][V] = { {0, 1, 0, 0, 1}, {1, 0, 1, 0, 0}, {0, 1, 0, 1, 0}, {0, 0, 1, 0, 1}, {1, 0, 0, 1, 0} }; printf("DFS starting from vertex 0:\n"); DFS(graph, 0); return 0; } Explanation: The graph is represented as an adjacency matrix. The DFSUtil() function recursively visits each vertex and its adjacent vertices. visited[] keeps track of the vertices that have already been explored to avoid revisiting. Time Complexity: The time complexity of DFS is O(V²) for an adjacency matrix representation and O(V + E) for an adjacency list, where V is the number of vertices and E is the number of edges. 22.2 Breadth-First Search (BFS) Breadth-First Search (BFS) is another fundamental graph traversal algorithm that explores the vertices level by level. BFS is often used to find the shortest path in unweighted graphs and in applications such as solving puzzles or network broadcasting. BFS Algorithm Steps: Start from a given vertex and mark it as visited. Visit all the adjacent vertices of the current vertex. Repeat for each vertex, visiting the vertices in a breadth-first manner. BFS Algorithm (in C): #include <stdio.h> #include <stdbool.h> #define V 5 void BFS(int graph[V][V], int startVertex) { bool visited[V] = {false}; int queue[V], front = 0, rear = 0; visited[startVertex] = true; queue[rear++] = startVertex; while (front != rear) { int currentVertex = queue[front++]; printf("%d ", currentVertex); for (int i = 0; i < V; i++) { if (graph[currentVertex][i] == 1 && !visited[i]) { visited[i] = true; queue[rear++] = i; } } } } int main() { int graph[V][V] = { {0, 1, 0, 0, 1}, {1, 0, 1, 0, 0}, {0, 1, 0, 1, 0}, {0, 0, 1, 0, 1}, {1, 0, 0, 1, 0} }; printf("BFS starting from vertex 0:\n"); BFS(graph, 0); return 0; } Explanation: The graph is represented as an adjacency matrix. The queue[] is used to maintain the vertices that need to be explored. The visited[] array keeps track of the visited vertices. Time Complexity: The time complexity of BFS is O(V²) for an adjacency matrix representation and O(V + E) for an adjacency list. 
22.3 Topological Sorting Topological Sorting is an ordering of vertices in a Directed Acyclic Graph (DAG) where, for every directed edge (u, v), vertex u comes before vertex v. It is widely used in scheduling problems where some tasks must precede others. Topological Sort Algorithm (DFS-based): Perform DFS on the graph. Push the vertices onto a stack when DFS finishes visiting all vertices from that vertex. Pop all vertices from the stack to get the topological order. Topological Sort Algorithm (in C): #include <stdio.h> #include <stdbool.h> #define V 6 void topologicalSortUtil(int graph[V][V], int v, bool visited[], int stack[], int *index) { visited[v] = true; for (int i = 0; i < V; i++) { if (graph[v][i] == 1 && !visited[i]) { topologicalSortUtil(graph, i, visited, stack, index); } } stack[(*index)–] = v; } void topologicalSort(int graph[V][V]) { bool visited[V] = {false}; int stack[V], index = V – 1; for (int i = 0; i < V; i++) { if (!visited[i]) { topologicalSortUtil(graph, i, visited, stack, &index); } } printf("Topological Sort:\n"); for (int i = 0; i < V; i++) { printf("%d ", stack[i]); } printf("\n"); } int main() { int graph[V][V] = { {0, 1, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0}, {0, 0, 0, 1, 1, 0}, {0, 0, 0, 0, 0, 1}, {0, 0, 0, 0, 0, 1}, {0, 0, 0, 0, 0, 0} }; topologicalSort(graph); return 0; } Explanation: The graph is represented as an adjacency matrix. The vertices are stored in the stack in a topologically sorted order. The algorithm makes use of DFS to ensure that all dependencies are visited before the vertex itself. Time Complexity: The time complexity of Topological Sort is O(V²) for an adjacency matrix and O(V + E) for an adjacency list. 22.4 Strongly Connected Components (Kosaraju’s Algorithm) In directed graphs, a Strongly Connected Component (SCC) is a maximal subgraph where every vertex is reachable from every other vertex in the subgraph. Kosaraju’s Algorithm is an efficient method to find SCCs in a directed graph. Kosaraju’s Algorithm Steps: Perform DFS on the original graph and push the vertices onto a stack based on their finish times. Reverse the graph. Perform DFS again, using the vertices in the order defined by the stack. Kosaraju’s Algorithm (in C): #include <stdio.h> #include <stdbool.h> #define V 5 void DFSUtil(int graph[V][V], int v, bool visited[]) { visited[v] = true; printf("%d ", v); for (int i = 0; i < V; i++) { if (graph[v][i] == 1 && !visited[i]) { DFSUtil(graph, i, visited); } }
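The Kosaraju listing above stops after the DFS helper. A hedged sketch of the remaining steps, continuing the same adjacency-matrix code and reusing DFSUtil to print each component, could look like the following; fillOrder and kosarajuSCC are illustrative names.

void fillOrder(int graph[V][V], int v, bool visited[], int stack[], int *top) {
    visited[v] = true;
    for (int i = 0; i < V; i++)
        if (graph[v][i] == 1 && !visited[i])
            fillOrder(graph, i, visited, stack, top);
    stack[(*top)++] = v;                   // push v once all its descendants are finished
}

void kosarajuSCC(int graph[V][V]) {
    bool visited[V] = {false};
    int stack[V], top = 0;

    // Step 1: order vertices by DFS finish time
    for (int i = 0; i < V; i++)
        if (!visited[i])
            fillOrder(graph, i, visited, stack, &top);

    // Step 2: reverse every edge to get the transpose graph
    int reversed[V][V];
    for (int i = 0; i < V; i++)
        for (int j = 0; j < V; j++)
            reversed[j][i] = graph[i][j];

    // Step 3: DFS on the transpose in decreasing finish-time order;
    // each DFS tree printed here is one strongly connected component
    for (int i = 0; i < V; i++)
        visited[i] = false;
    while (top > 0) {
        int v = stack[--top];
        if (!visited[v]) {
            DFSUtil(reversed, v, visited);
            printf("\n");
        }
    }
}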

Chapter 21 – Graph Algorithms

Chapter 21: Graph Algorithms Welcome back! In this chapter, we’ll explore Graph Algorithms, which are crucial in solving problems related to networks, connections, and paths. These algorithms help us navigate through graphs and solve problems like finding the shortest path, traversing graphs, and much more. We’ll focus on two widely-used algorithms for solving problems involving graphs: Shortest Path Algorithms Minimum Spanning Tree (MST) Algorithms Let’s dive right in! 21.1 Shortest Path Algorithms In many real-world applications like GPS navigation, computer networks, and transportation systems, we are often asked to find the shortest path from one point to another in a graph. 21.1.1 Dijkstra’s Algorithm Dijkstra’s Algorithm is used to find the shortest path from a source node to all other nodes in a graph with non-negative edge weights. Problem Definition: Given a graph and a source node, find the shortest path from the source to all other nodes. Algorithm Steps: Create a set of all nodes in the graph. Assign a tentative distance value to every node: 0 for the source node and infinity for all other nodes. While there are unvisited nodes: Pick the unvisited node with the smallest tentative distance. Update the tentative distances of its neighboring nodes. Mark the node as visited. Dijkstra’s Algorithm (in C): #include <stdio.h> #include <limits.h> #include <stdbool.h> #define V 9 int minDistance(int dist[], bool sptSet[]) { int min = INT_MAX, min_index; for (int v = 0; v < V; v++) if (sptSet[v] == false && dist[v] <= min) min = dist[v], min_index = v; return min_index; } void printSolution(int dist[]) { printf("Vertex \t Distance from Source\n"); for (int i = 0; i < V; i++) printf("%d \t %d\n", i, dist[i]); } void dijkstra(int graph[V][V], int src) { int dist[V]; bool sptSet[V]; for (int i = 0; i < V; i++) dist[i] = INT_MAX, sptSet[i] = false; dist[src] = 0; for (int count = 0; count < V – 1; count++) { int u = minDistance(dist, sptSet); sptSet[u] = true; for (int v = 0; v < V; v++) if (!sptSet[v] && graph[u][v] && dist[u] != INT_MAX && dist[u] + graph[u][v] < dist[v]) dist[v] = dist[u] + graph[u][v]; } printSolution(dist); } int main() { int graph[V][V] = { {0, 4, 0, 0, 0, 0, 0, 8, 0}, {4, 0, 8, 0, 0, 0, 0, 11, 0}, {0, 8, 0, 7, 0, 4, 0, 0, 2}, {0, 0, 7, 0, 9, 14, 0, 0, 0}, {0, 0, 0, 9, 0, 10, 0, 0, 0}, {0, 0, 4, 14, 10, 0, 2, 0, 0}, {0, 0, 0, 0, 0, 2, 0, 1, 6}, {8, 11, 0, 0, 0, 0, 1, 0, 7}, {0, 0, 2, 0, 0, 0, 6, 7, 0} }; dijkstra(graph, 0); return 0; } Explanation: The graph is represented by an adjacency matrix where graph[i][j] represents the weight of the edge between vertices i and j. If there is no edge, the value is 0. We maintain an array dist[] where dist[i] holds the shortest distance from the source to vertex i. sptSet[] keeps track of vertices for which we have finalized the shortest distance. Time Complexity: The time complexity of this implementation is O(V²), where V is the number of vertices. 21.2 Minimum Spanning Tree (MST) Algorithms Next up, we’ll look at Minimum Spanning Trees (MST). This is a fundamental problem in network design. Problem Definition: Given a connected, undirected graph, a Minimum Spanning Tree is a subset of the edges that connects all the vertices together without any cycles and with the minimum possible total edge weight. 
Two well-known algorithms to find MSTs: Prim’s Algorithm Kruskal’s Algorithm 21.2.1 Prim’s Algorithm Prim’s Algorithm finds a minimum spanning tree by starting from a single vertex and growing the tree one edge at a time, always choosing the smallest edge that connects a new vertex to the tree. Algorithm Steps: Initialize the minimum spanning tree (MST) with a single vertex. Repeatedly add the smallest edge that connects a vertex inside the MST to one outside it. Stop when all vertices are in the MST. Prim’s Algorithm (in C): #include <stdio.h> #include <limits.h> #include <stdbool.h> #define V 5 int minKey(int key[], bool mstSet[]) { int min = INT_MAX, min_index; for (int v = 0; v < V; v++) if (mstSet[v] == false && key[v] < min) min = key[v], min_index = v; return min_index; } void printMST(int parent[], int graph[V][V]) { printf("Edge \tWeight\n"); for (int i = 1; i < V; i++) printf("%d – %d \t%d \n", parent[i], i, graph[i][parent[i]]); } void primMST(int graph[V][V]) { int parent[V]; int key[V]; bool mstSet[V]; for (int i = 0; i < V; i++) key[i] = INT_MAX, mstSet[i] = false; key[0] = 0; parent[0] = -1; for (int count = 0; count < V – 1; count++) { int u = minKey(key, mstSet); mstSet[u] = true; for (int v = 0; v < V; v++) if (graph[u][v] && mstSet[v] == false && graph[u][v] < key[v]) parent[v] = u, key[v] = graph[u][v]; } printMST(parent, graph); } int main() { int graph[V][V] = { {0, 2, 0, 6, 0}, {2, 0, 3, 8, 5}, {0, 3, 0, 0, 7}, {6, 8, 0, 0, 9}, {0, 5, 7, 9, 0} }; primMST(graph); return 0; } Explanation: We represent the graph using an adjacency matrix. graph[i][j] contains the weight of the edge between vertex i and j. key[] keeps track of the minimum edge weights that connect vertices to the MST. parent[] stores the constructed MST. Time Complexity: The time complexity of Prim’s algorithm is O(V²). 21.2.2 Kruskal’s Algorithm Kruskal’s Algorithm works by sorting all the edges of the graph in non-decreasing order of their weight. It then repeatedly adds the smallest edge to the MST unless doing so would form a cycle. Algorithm Steps: Sort all edges in non-decreasing order of their weight. Pick the smallest edge. If it doesn’t form a cycle, add it to the MST. Repeat until there are exactly V-1 edges in the MST. Kruskal’s Algorithm (in C): #include <stdio.h> #include <stdlib.h> #define V 4 #define
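The Kruskal listing above is cut off at the preprocessor defines. As a hedged illustration of the steps just described, here is a compact edge-list version using a simple union-find without path compression; the function names (findSet, kruskalMST) and the sample graph in main are illustrative only.

#include <stdio.h>
#include <stdlib.h>

struct Edge { int src, dest, weight; };

// Find the root of the tree containing vertex i
int findSet(int parent[], int i) {
    while (parent[i] != i)
        i = parent[i];
    return i;
}

// qsort comparator: order edges by non-decreasing weight
int compareEdges(const void *a, const void *b) {
    const struct Edge *ea = (const struct Edge *)a;
    const struct Edge *eb = (const struct Edge *)b;
    return ea->weight - eb->weight;
}

void kruskalMST(struct Edge edges[], int V, int E) {
    qsort(edges, E, sizeof(struct Edge), compareEdges);

    int parent[V];
    for (int i = 0; i < V; i++)
        parent[i] = i;                      // every vertex starts as its own tree

    printf("Edge \tWeight\n");
    int taken = 0;
    for (int i = 0; i < E && taken < V - 1; i++) {
        int ru = findSet(parent, edges[i].src);
        int rv = findSet(parent, edges[i].dest);
        if (ru != rv) {                     // edge joins two different trees: no cycle
            printf("%d - %d \t%d\n", edges[i].src, edges[i].dest, edges[i].weight);
            parent[ru] = rv;                // merge the two trees
            taken++;
        }
    }
}

int main() {
    struct Edge edges[] = {{0, 1, 10}, {0, 2, 6}, {0, 3, 5}, {1, 3, 15}, {2, 3, 4}};
    kruskalMST(edges, 4, 5);
    return 0;
}

Sorting the edges costs O(E log E), which dominates the running time for sparse graphs.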

Chapter 19 – Advanced Graph Algorithms

Chapter 19: Advanced Graph Algorithms Welcome to Chapter 19, where we’ll explore some of the more advanced algorithms used in graph theory. These algorithms are crucial for solving complex problems related to networks, paths, and flows. In this chapter, we will cover: Shortest Path Algorithms: Dijkstra’s Algorithm Bellman-Ford Algorithm Maximum Flow Algorithms: Ford-Fulkerson Method Edmonds-Karp Algorithm 19.1 Shortest Path Algorithms In graph theory, finding the shortest path between two nodes is a common problem. Depending on the type of graph (directed, undirected, weighted, unweighted), different algorithms can be used to find the shortest paths efficiently. 19.1.1 Dijkstra’s Algorithm Definition: Dijkstra’s Algorithm finds the shortest path from a source node to all other nodes in a graph with non-negative edge weights. It follows a greedy approach by choosing the shortest path at each step. How Dijkstra’s Algorithm Works: Initialize Distances: Start by assigning an infinite distance to all nodes except the source node, which gets a distance of 0. Priority Queue: Use a priority queue to select the node with the smallest known distance. Relaxation: For each neighboring node, check if a shorter path can be found through the current node. If so, update the distance. Repeat: Continue this process until all nodes have been visited and their shortest paths are known. Dijkstra’s Algorithm Example: Consider the graph below: A –1–> B A –4–> C B –2–> C B –6–> D C –3–> D To find the shortest path from A to all other nodes: Initialization: Set the distance to A as 0, and all others as infinity: Distances: A = 0, B = ∞, C = ∞, D = ∞ Priority Queue: Start with A (0). Relaxation: From A, update B (0 + 1 = 1) and C (0 + 4 = 4). New distances: A = 0, B = 1, C = 4, D = ∞ Next Node: Select B (1), update C (1 + 2 = 3) and D (1 + 6 = 7). New distances: A = 0, B = 1, C = 3, D = 7 Next Node: Select C (3), update D (3 + 3 = 6). Final distances: A = 0, B = 1, C = 3, D = 6 The shortest paths from A are: A to B: 1 A to C: 3 A to D: 6 Dijkstra’s Algorithm Implementation in C++: #include <iostream> #include <vector> #include <queue> #include <limits.h> using namespace std; #define INF INT_MAX // Structure for a graph edge struct Edge { int dest, weight; }; // Dijkstra’s algorithm to find the shortest path void dijkstra(vector<vector<Edge>>& graph, int src) { int V = graph.size(); vector<int> dist(V, INF); dist[src] = 0; priority_queue<pair<int, int>, vector<pair<int, int>>, greater<pair<int, int>>> pq; pq.push({0, src}); while (!pq.empty()) { int u = pq.top().second; pq.pop(); for (auto edge : graph[u]) { int v = edge.dest; int weight = edge.weight; if (dist[u] + weight < dist[v]) { dist[v] = dist[u] + weight; pq.push({dist[v], v}); } } } cout << "Vertex\tDistance from Source\n"; for (int i = 0; i < V; ++i) cout << i << "\t\t" << dist[i] << "\n"; } int main() { int V = 5; vector<vector<Edge>> graph(V); graph[0].push_back({1, 10}); graph[0].push_back({4, 5}); graph[1].push_back({2, 1}); graph[2].push_back({3, 4}); graph[4].push_back({1, 3}); graph[4].push_back({2, 9}); dijkstra(graph, 0); return 0; } This code illustrates Dijkstra’s algorithm for a graph with 5 vertices. 19.1.2 Bellman-Ford Algorithm Definition: The Bellman-Ford Algorithm is another method for finding the shortest path from a source node to all other nodes, but unlike Dijkstra’s algorithm, it works with graphs that may contain negative weight edges. 
How Bellman-Ford Algorithm Works: Initialize Distances: Set the distance of the source node to 0 and all other nodes to infinity. Relax All Edges: For each edge, update the distance to its destination if a shorter path is found through the source. Repeat: Perform the relaxation process V-1 times, where V is the number of vertices. Negative Cycle Detection: In the final pass, check for any further distance reductions. If found, a negative weight cycle exists. Example: Consider the graph: A –1–> B B –2–> C C –(-1)–> A B –3–> D To find the shortest path from A: Initialize distances: A = 0, B = ∞, C = ∞, D = ∞. After the first pass: A to B: 1 B to C: 1 + 2 = 3 B to D: 1 + 3 = 4 After the second pass, no changes occur, confirming the final shortest paths. Bellman-Ford Algorithm Implementation in C++: #include <iostream> #include <vector> using namespace std; struct Edge { int u, v, weight; }; void bellmanFord(int V, int E, vector<Edge>& edges, int src) { vector<int> dist(V, INT_MAX); dist[src] = 0; for (int i = 1; i <= V – 1; i++) { for (int j = 0; j < E; j++) { int u = edges[j].u; int v = edges[j].v; int weight = edges[j].weight; if (dist[u] != INT_MAX && dist[u] + weight < dist[v]) { dist[v] = dist[u] + weight; } } } // Check for negative-weight cycles for (int i = 0; i < E; i++) { int u = edges[i].u; int v = edges[i].v; int weight = edges[i].weight; if (dist[u] != INT_MAX && dist[u] + weight < dist[v]) { cout << "Graph contains a negative-weight cycle\n"; return; } } cout << "Vertex\tDistance from Source\n"; for (int i = 0; i < V; i++) cout << i << "\t\t" << dist[i] << "\n"; } int main() { int V = 5, E = 8; vector<Edge> edges = {{0, 1, -1}, {0, 2, 4}, {1, 2, 3}, {1, 3, 2}, {1, 4, 2}, {3, 2, 5}, {3, 1, 1}, {4, 3, -3}}; bellmanFord(V, E, edges, 0); return 0; } The Bellman-Ford algorithm is more versatile than Dijkstra’s because it can handle negative weights but is less efficient in practice. 19.2 Maximum Flow Algorithms In network flow problems, we aim to send as much flow as possible from a source node to a destination node, subject to capacity limits on the edges. Two classic algorithms for finding the maximum
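The section breaks off before presenting the two algorithms it names, Ford-Fulkerson and Edmonds-Karp. As a hedged sketch of the idea, here is a compact Edmonds-Karp implementation (Ford-Fulkerson with BFS to find augmenting paths) over an adjacency-matrix capacity graph; the sample capacities and the names bfs and edmondsKarp are illustrative choices, not code from the chapter.

#include <stdio.h>
#include <string.h>
#include <limits.h>
#define V 6

// BFS on the residual graph: returns 1 if an augmenting path from s to t exists,
// recording the path in parent[].
int bfs(int residual[V][V], int s, int t, int parent[]) {
    int visited[V] = {0};
    int queue[V], front = 0, rear = 0;
    queue[rear++] = s;
    visited[s] = 1;
    parent[s] = -1;
    while (front != rear) {
        int u = queue[front++];
        for (int v = 0; v < V; v++) {
            if (!visited[v] && residual[u][v] > 0) {
                parent[v] = u;
                visited[v] = 1;
                queue[rear++] = v;
            }
        }
    }
    return visited[t];
}

int edmondsKarp(int graph[V][V], int s, int t) {
    int residual[V][V];
    memcpy(residual, graph, sizeof(residual));   // start with the full capacities
    int parent[V], maxFlow = 0;

    while (bfs(residual, s, t, parent)) {
        int pathFlow = INT_MAX;                  // bottleneck capacity on the path
        for (int v = t; v != s; v = parent[v])
            if (residual[parent[v]][v] < pathFlow)
                pathFlow = residual[parent[v]][v];
        for (int v = t; v != s; v = parent[v]) { // push flow, update residual edges
            residual[parent[v]][v] -= pathFlow;
            residual[v][parent[v]] += pathFlow;
        }
        maxFlow += pathFlow;
    }
    return maxFlow;
}

int main() {
    int graph[V][V] = {
        {0, 16, 13, 0, 0, 0},
        {0, 0, 10, 12, 0, 0},
        {0, 4, 0, 0, 14, 0},
        {0, 0, 9, 0, 0, 20},
        {0, 0, 0, 7, 0, 4},
        {0, 0, 0, 0, 0, 0}
    };
    printf("Maximum flow: %d\n", edmondsKarp(graph, 0, 5));
    return 0;
}

Because BFS always finds a shortest augmenting path, Edmonds-Karp runs in O(V E^2), improving on the generic Ford-Fulkerson bound for graphs with large capacities.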

Chapter 18 – Greedy Algorithms

Chapter 18: Greedy Algorithms Welcome to Chapter 18, where we’ll dive into the fascinating world of Greedy Algorithms. A Greedy Algorithm works by making the best decision at each step, aiming to reach an optimal solution. The idea is simple: take the immediate advantage without worrying about the future. While this may sound risky, greedy algorithms often lead to optimal solutions in many real-world problems. In this chapter, we will explore two popular applications of greedy algorithms: Huffman Coding — for data compression. Minimum Spanning Trees — using Kruskal’s and Prim’s algorithms for efficient network design. Let’s get started with Huffman Coding! 18.1 Huffman Coding Definition: Huffman Coding is a greedy algorithm used for lossless data compression. It assigns variable-length codes to input characters, with shorter codes assigned to more frequent characters and longer codes to less frequent ones. This minimizes the overall size of the encoded data. How Huffman Coding Works: Here’s how you can visualize Huffman Coding: Frequency Calculation: Start by calculating the frequency of each character in the input data. Build a Priority Queue: Insert all characters (or symbols) into a priority queue, where the element with the lowest frequency has the highest priority. Construct the Huffman Tree: Repeatedly extract the two characters with the smallest frequencies from the queue, combine them into a new node (with the combined frequency), and reinsert this node back into the queue. Keep doing this until there’s only one node left – the root of the Huffman tree. Generate Codes: Starting from the root of the tree, assign 0 to the left edge and 1 to the right edge of each node. Traverse the tree to generate the variable-length codes for each character. Example: Let’s say we want to encode the string "AABACD". Character frequencies: A: 3 B: 1 C: 1 D: 1 Create a priority queue: Insert A(3), B(1), C(1), D(1). Building the tree: Combine B and C (1 + 1 = 2), now we have A(3), BC(2), D(1). Combine D and BC (1 + 2 = 3), now we have A(3), DBC(3). Combine A and DBC (3 + 3 = 6), this becomes the root. Generate Huffman Codes: A = 0 B = 101 C = 100 D = 11 So, the encoded string for "AABACD" becomes: 0 0 101 0 100 11. 
Huffman Coding Implementation in C: #include <stdio.h> #include <stdlib.h> // A Huffman tree node struct MinHeapNode { char data; unsigned freq; struct MinHeapNode *left, *right; }; // Function to create a new min heap node struct MinHeapNode* newNode(char data, unsigned freq) { struct MinHeapNode* temp = (struct MinHeapNode*) malloc(sizeof(struct MinHeapNode)); temp->data = data; temp->freq = freq; temp->left = temp->right = NULL; return temp; } // A utility function to print an array of size n void printArray(int arr[], int n) { for (int i = 0; i < n; ++i) printf("%d", arr[i]); printf("\n"); } // Print Huffman codes from the root of the tree void printHuffmanCodes(struct MinHeapNode* root, int arr[], int top) { if (root->left) { arr[top] = 0; printHuffmanCodes(root->left, arr, top + 1); } if (root->right) { arr[top] = 1; printHuffmanCodes(root->right, arr, top + 1); } if (!(root->left && root->right)) { printf("%c: ", root->data); printArray(arr, top); } } // This function builds the Huffman Tree and prints the codes by traversing the built tree void HuffmanCodes(char data[], int freq[], int size) { // Create a simple example of a Huffman Tree (for demonstration purposes) struct MinHeapNode* root = newNode(‘$’, 0); root->left = newNode(‘A’, freq[0]); root->right = newNode(‘B’, freq[1]); int arr[100], top = 0; printHuffmanCodes(root, arr, top); } int main() { char arr[] = {‘A’, ‘B’}; int freq[] = {5, 9}; int size = sizeof(arr) / sizeof(arr[0]); HuffmanCodes(arr, freq, size); return 0; } This code gives a basic illustration of Huffman Coding. You would need a priority queue or min-heap to fully implement the algorithm, which can be a great exercise for you to try! Pros and Cons of Huffman Coding: Pros: Reduces the size of data significantly, leading to efficient storage. Works well for characters with varying frequencies. Cons: Requires prior knowledge of character frequencies. Can be time-consuming to build for very large datasets. 18.2 Minimum Spanning Trees Now let’s move to the Minimum Spanning Tree (MST) problem. An MST is a subgraph of a graph that connects all vertices with the minimum possible total edge weight, without forming any cycles. There are two popular algorithms for finding the MST: Kruskal’s Algorithm Prim’s Algorithm Let’s start with Kruskal’s Algorithm. Kruskal’s Algorithm Kruskal’s Algorithm is a greedy approach that treats the graph as a forest, where each vertex is its own tree. It keeps adding the smallest edge that connects two separate trees until all vertices are connected. How Kruskal’s Algorithm Works: Sort all the edges in increasing order based on their weights. Initialize an empty forest (collection of trees), where each vertex is its own tree. Pick the smallest edge. If it connects two different trees, add it to the MST and merge the trees. Repeat this until all vertices are connected in a single tree. Kruskal’s Algorithm Example: Let’s say we have the following graph with edges: Edge Weight A-B 1 A-C 4 B-C 3 B-D 2 C-D 5 Sort edges: A-B (1), B-D (2), B-C (3), A-C (4), C-D (5). Start by adding A-B to the MST. Add B-D next. Add B-C. Skip C-D (since it would create a cycle). The MST includes edges: A-B, B-D, B-C with a total weight of 6. 
Kruskal’s Algorithm Implementation in C:

#include <stdio.h>
#include <stdlib.h>

// Structure to represent an edge
struct Edge {
    int src, dest, weight;
};

// Structure to represent a graph
struct Graph {
    int V, E;
    struct Edge* edges;
};

// Create a new graph
struct Graph* createGraph(int V, int E) {
    struct Graph* graph = (struct Graph*) malloc(sizeof(struct Graph));
    graph->V = V;
    graph->E = E;
    graph->edges = (struct Edge*) malloc(graph->E * sizeof(struct Edge));
    return graph;
}

// A utility function to find the subset of an element
int find(int
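The listing is cut off inside the find helper. A short hedged sketch of how the two subset helpers are commonly completed is shown below; it uses path compression in find (an optimization not discussed in the text above), and the name unionSets is illustrative.

// Find the root of the subset containing element i, compressing the path as we go
int find(int parent[], int i) {
    if (parent[i] != i)
        parent[i] = find(parent, parent[i]);   // path compression
    return parent[i];
}

// Merge the subsets containing x and y by attaching one root to the other
void unionSets(int parent[], int x, int y) {
    int rootX = find(parent, x);
    int rootY = find(parent, y);
    if (rootX != rootY)
        parent[rootX] = rootY;
}

With these helpers, Kruskal's main loop sorts the edges by weight and accepts an edge only when find returns different roots for its two endpoints.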

Chapter 17 – Advanced Sorting Algorithms

Chapter 17: Advanced Sorting Algorithms Welcome back, dear reader! Now that we’ve dipped our toes into the world of simple sorting algorithms, it’s time to dive a little deeper. In this chapter, we’ll explore some more powerful sorting algorithms that are designed to handle larger datasets efficiently. Before we jump in, take a moment to appreciate how far you’ve come! Sorting is one of the most important concepts in data structures, and understanding it thoroughly gives you a solid foundation for tackling more complex problems. So, what’s next? Let’s begin with a sorting algorithm that’s fast, efficient, and widely used in real-world applications — Quick Sort. 17.1 Quick Sort Definition: Quick Sort is a divide-and-conquer algorithm. It works by selecting a "pivot" element from the array and partitioning the other elements into two sub-arrays: those smaller than the pivot and those larger. The pivot is then placed in its correct position, and the process is recursively applied to the sub-arrays. How It Works: Quick Sort might sound a little tricky, but let’s break it down step-by-step: Pick an element from the array. This is called the pivot. Partition the array into two parts: elements less than the pivot go to the left, and elements greater than the pivot go to the right. Recursively apply Quick Sort to the left and right sub-arrays. Combine the results, and you have a sorted array! Let’s walk through an example to see how Quick Sort works in action. Quick Sort Example: Imagine we have the following array: [30, 10, 50, 20, 60, 40] We choose 30 as our pivot (you can choose any element, but for simplicity, we’ll take the first one). Elements less than 30: [10, 20] Elements greater than 30: [50, 60, 40] Now we recursively sort these two sub-arrays. The left sub-array [10, 20] is already sorted, so no further action is needed there. But the right sub-array [50, 60, 40] needs sorting. We choose 50 as the pivot for this sub-array: Elements less than 50: [40] Elements greater than 50: [60] Now, both [40] and [60] are sorted individually. We combine everything, and voila! The final sorted array is: [10, 20, 30, 40, 50, 60] See how it works? It’s like dividing the problem into smaller pieces until it’s easy to solve, and then putting everything back together. Time Complexity: Best case: O(n log n) — This happens when the pivot divides the array into two nearly equal parts each time. Worst case: O(n²) — This occurs when the pivot is always the smallest or largest element, leading to unbalanced partitions. Average case: O(n log n) — Generally, Quick Sort is quite efficient, and this is the expected performance for most datasets. Quick Sort Implementation in C: Let’s write a Quick Sort program in C, but instead of using just arrays, we’ll also show you how to use pointers. Pointers are powerful tools in C, and they make managing memory much easier. 
#include <stdio.h> // Function to swap two elements using pointers void swap(int *a, int *b) { int temp = *a; *a = *b; *b = temp; } // Partition function that returns the index of the pivot int partition(int arr[], int low, int high) { int pivot = arr[high]; // Choosing the last element as the pivot int i = (low – 1); // Index of the smaller element for (int j = low; j <= high – 1; j++) { if (arr[j] < pivot) { i++; // Increment the index of the smaller element swap(&arr[i], &arr[j]); } } swap(&arr[i + 1], &arr[high]); return (i + 1); } // Quick Sort function that uses recursion void quickSort(int arr[], int low, int high) { if (low < high) { // Get the pivot element in its correct position int pi = partition(arr, low, high); // Recursively sort the left and right sub-arrays quickSort(arr, low, pi – 1); quickSort(arr, pi + 1, high); } } // Function to print the array void printArray(int arr[], int size) { for (int i = 0; i < size; i++) { printf("%d ", arr[i]); } printf("\n"); } int main() { int arr[] = {30, 10, 50, 20, 60, 40}; int n = sizeof(arr) / sizeof(arr[0]); printf("Original array: \n"); printArray(arr, n); quickSort(arr, 0, n – 1); printf("Sorted array: \n"); printArray(arr, n); return 0; } Explanation: The partition() function is the heart of the Quick Sort algorithm. It rearranges the array by moving smaller elements to the left of the pivot and larger elements to the right. We use pointers in the swap() function to directly modify the elements of the array in memory. The quickSort() function is recursive and sorts the array by dividing it into smaller sub-arrays around the pivot. Why Use Quick Sort? Quick Sort is fast and works well for most real-world data. It’s not only good for small datasets but also scales efficiently for larger ones. And since it’s a divide-and-conquer algorithm, it can be optimized for parallel processing. But wait! We’re not done yet. Next, we’ll look at another very important sorting algorithm — Merge Sort. This one is also based on the divide-and-conquer approach, but it works in a slightly different way. Should we continue with Merge Sort? I promise it’s super interesting! Alright! Let’s keep the momentum going and dive into Merge Sort — another powerful sorting algorithm that you’ll often encounter in the real world. If you liked the divide-and-conquer strategy of Quick Sort, you’re going to love Merge Sort because it takes this strategy to the next level with guaranteed performance. 17.2 Merge Sort Definition: Merge Sort is a divide-and-conquer algorithm that breaks the array into smaller sub-arrays, sorts those sub-arrays, and then merges them back together in the correct order. The key idea in Merge Sort is that merging two sorted arrays is easier than sorting a large unsorted array. So instead of sorting the entire array in one go, we: Break the array into two halves. Recursively sort both halves. Merge the two
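The Merge Sort walkthrough above is cut off mid-sentence. As a hedged sketch of the three steps it describes (split, sort each half, merge the two sorted halves), a minimal C implementation might look like this; the merge step uses a temporary buffer, which is why Merge Sort needs O(n) extra space.

#include <stdio.h>
#include <string.h>

// Merge the two sorted halves arr[left..mid] and arr[mid+1..right]
void merge(int arr[], int left, int mid, int right) {
    int n = right - left + 1;
    int temp[n];                              // temporary buffer for the merged result
    int i = left, j = mid + 1, k = 0;
    while (i <= mid && j <= right)
        temp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= mid)
        temp[k++] = arr[i++];
    while (j <= right)
        temp[k++] = arr[j++];
    memcpy(&arr[left], temp, n * sizeof(int));
}

void mergeSort(int arr[], int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        mergeSort(arr, left, mid);            // sort the left half
        mergeSort(arr, mid + 1, right);       // sort the right half
        merge(arr, left, mid, right);         // merge the two sorted halves
    }
}

int main() {
    int arr[] = {38, 27, 43, 3, 9, 82, 10};
    int n = sizeof(arr) / sizeof(arr[0]);
    mergeSort(arr, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
    return 0;
}

Unlike Quick Sort, the splits here are always down the middle, so the running time is O(n log n) in every case.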

Chapter 16 – Sorting Algorithms

Chapter 16: Sorting Algorithms Sorting algorithms are fundamental in computer science and are used in nearly every application that involves data. Sorting helps in arranging data in a specific order, usually ascending or descending, making data retrieval and manipulation easier. In this chapter, we will explore various sorting algorithms, their logic, code implementations, and analyze their efficiency. Let’s get started with one of the simplest sorting algorithms — Bubble Sort. 16.1 Bubble Sort Definition: Bubble Sort is a simple sorting algorithm that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. The process is repeated until the list is sorted. How It Works: Compare the first two elements. If the first element is larger, swap them. Move to the next pair of elements and repeat the comparison and swapping process. Continue doing this for the entire list. At the end of the first pass, the largest element will "bubble up" to its correct position at the end of the list. Repeat the process for the rest of the list, ignoring the last sorted elements. Continue until no more swaps are needed. Time Complexity: Worst case: O(n²) — This happens when the list is in reverse order. Best case: O(n) — This occurs when the list is already sorted. Bubble Sort Implementation in C: #include <stdio.h> void bubbleSort(int arr[], int n) { for (int i = 0; i < n-1; i++) { for (int j = 0; j < n-i-1; j++) { if (arr[j] > arr[j+1]) { // Swap arr[j] and arr[j+1] int temp = arr[j]; arr[j] = arr[j+1]; arr[j+1] = temp; } } } } void printArray(int arr[], int size) { for (int i = 0; i < size; i++) { printf("%d ", arr[i]); } printf("\n"); } int main() { int arr[] = {64, 34, 25, 12, 22, 11, 90}; int n = sizeof(arr)/sizeof(arr[0]); bubbleSort(arr, n); printf("Sorted array: \n"); printArray(arr, n); return 0; } Explanation: The bubbleSort() function performs repeated passes through the array, comparing adjacent elements and swapping them if necessary. The largest element "bubbles" up to the correct position after each pass. The printArray() function is used to print the sorted array after the algorithm finishes. Pros and Cons of Bubble Sort: Pros: Easy to understand and implement. Good for small datasets or nearly sorted data. Cons: Not efficient for large datasets due to its O(n²) time complexity. Performs unnecessary comparisons even when the array is already sorted. 16.2 Selection Sort Definition: Selection Sort is another simple sorting algorithm. It divides the input list into two parts: a sorted part and an unsorted part. Initially, the sorted part is empty, and the unsorted part contains the entire list. The algorithm proceeds by repeatedly selecting the smallest element from the unsorted part and swapping it with the leftmost unsorted element. How It Works: Start with the first element and assume it’s the minimum. Compare it with every other element in the array. If a smaller element is found, mark that as the new minimum. At the end of the first pass, swap the minimum element with the first element. Move to the next element and repeat the process. Continue until the entire list is sorted. 
Time Complexity: Worst, Average, and Best case: O(n²) Selection Sort Implementation in C++: #include <iostream> using namespace std; void selectionSort(int arr[], int n) { for (int i = 0; i < n-1; i++) { int minIdx = i; for (int j = i+1; j < n; j++) { if (arr[j] < arr[minIdx]) { minIdx = j; } } // Swap the found minimum element with the first element int temp = arr[minIdx]; arr[minIdx] = arr[i]; arr[i] = temp; } } void printArray(int arr[], int size) { for (int i = 0; i < size; i++) { cout << arr[i] << " "; } cout << endl; } int main() { int arr[] = {64, 25, 12, 22, 11}; int n = sizeof(arr)/sizeof(arr[0]); selectionSort(arr, n); cout << "Sorted array: \n"; printArray(arr, n); return 0; } Explanation: The selectionSort() function selects the smallest element from the unsorted portion of the array and swaps it with the leftmost unsorted element. The printArray() function prints the sorted array. Pros and Cons of Selection Sort: Pros: Simple and easy to understand. Reduces the number of swaps compared to bubble sort. Cons: Like Bubble Sort, it’s inefficient for large datasets. Time complexity remains O(n²) even if the array is partially sorted. 16.3 Insertion Sort Definition: Insertion Sort works by dividing the array into a sorted and an unsorted section. The sorted section starts with the first element. The algorithm picks elements from the unsorted section and places them into the correct position in the sorted section. How It Works: Start with the second element as the key. Compare the key with the elements in the sorted section. Shift larger elements one position to the right to make space for the key. Insert the key into its correct position. Repeat this process for each element in the unsorted section. Time Complexity: Worst and Average case: O(n²) Best case: O(n) (when the array is already sorted) Insertion Sort Implementation in C++: #include <iostream> using namespace std; void insertionSort(int arr[], int n) { for (int i = 1; i < n; i++) { int key = arr[i]; int j = i – 1; // Shift elements that are greater than key to one position ahead while (j >= 0 && arr[j] > key) { arr[j + 1] = arr[j]; j–; } arr[j + 1] = key; } } void printArray(int arr[], int size) { for (int i = 0; i < size; i++) { cout << arr[i] << " "; } cout << endl; } int main() { int arr[] = {12, 11, 13, 5, 6}; int n = sizeof(arr) / sizeof(arr[0]); insertionSort(arr, n); cout << "Sorted array: \n"; printArray(arr, n); return 0; } Explanation: The insertionSort() function inserts each element into its correct position by shifting larger elements one step to the

Chapter 15 – Searching Algorithms

Chapter 15: Searching Algorithms Introduction Searching and sorting algorithms are crucial in Data Structures and Algorithms (DSA) because they help us organize and access data efficiently. Whether you’re building a search engine, a database, or even a simple contact list, you’ll need to understand these algorithms deeply. This chapter focuses on two important tasks: Searching: Finding a specific element within a dataset. Sorting: Arranging data in a certain order (ascending, descending, etc.) for better management. To keep things interesting, we’ll break the chapter into digestible sections. First, we’ll focus on searching algorithms, and then in subsequent parts, we’ll cover sorting algorithms. As we proceed, remember to try the code examples on your IDE or notebook—this practice will cement your understanding. Part 1: Searching Algorithms 1.1 Linear Search Let’s start with one of the simplest searching algorithms: Linear Search. Though not the most efficient for large datasets, linear search is straightforward and can be used on unsorted data. Definition: Linear search traverses the entire data structure from the beginning to the end, comparing each element with the target value. If the target is found, the algorithm returns its position; otherwise, it returns a "not found" result. Use Case: Imagine you have a list of unsorted contact names. If you’re looking for someone, linear search is the only choice unless the list is sorted. Linear search is often used in small datasets where performance isn’t a concern. Algorithm: Here’s a simple breakdown of how linear search works: Start at the first element of the data structure. Compare the current element with the target value. If they match, return the index (position). If not, move to the next element. Repeat this process until you find the target or reach the end. Time Complexity: O(n) — because in the worst case, you might have to check every element. Let’s implement this in C and C++! C Programming: Linear Search in an Array #include <stdio.h> int linearSearch(int arr[], int n, int target) { for (int i = 0; i < n; i++) { if (arr[i] == target) { return i; // Target found, return the index } } return -1; // Target not found } int main() { int arr[] = {10, 23, 45, 70, 11, 15}; int n = sizeof(arr) / sizeof(arr[0]); int target = 70; int result = linearSearch(arr, n, target); if (result != -1) { printf("Element found at index: %d\n", result); } else { printf("Element not found.\n"); } return 0; } C++ Programming: Linear Search in an Array #include <iostream> using namespace std; int linearSearch(int arr[], int n, int target) { for (int i = 0; i < n; i++) { if (arr[i] == target) { return i; // Target found, return the index } } return -1; // Target not found } int main() { int arr[] = {2, 4, 0, 1, 9}; int n = sizeof(arr) / sizeof(arr[0]); int target = 1; int result = linearSearch(arr, n, target); if (result != -1) { cout << "Element found at index: " << result << endl; } else { cout << "Element not found." << endl; } return 0; } Explanation: In both programs, we create an array of integers and then implement the linear search function. The function traverses through the array, comparing each element with the target. If it finds the target, it returns the index of the element. If not, it returns -1, indicating that the element was not found. Discussion on Linked Lists: Linear search works similarly in Linked Lists, but instead of indexing like we do in arrays, we traverse the nodes. 
Since a linked list is a sequential structure without random access, we need to move from node to node. C Programming: Linear Search in a Singly Linked List #include <stdio.h> #include <stdlib.h> struct Node { int data; struct Node* next; }; int linearSearch(struct Node* head, int target) { struct Node* current = head; int index = 0; while (current != NULL) { if (current->data == target) { return index; } current = current->next; index++; } return -1; // Target not found } // Helper function to create a new node struct Node* createNode(int data) { struct Node* newNode = (struct Node*)malloc(sizeof(struct Node)); newNode->data = data; newNode->next = NULL; return newNode; } int main() { struct Node* head = createNode(10); head->next = createNode(20); head->next->next = createNode(30); head->next->next->next = createNode(40); int target = 30; int result = linearSearch(head, target); if (result != -1) { printf("Element found at position: %d\n", result); } else { printf("Element not found.\n"); } return 0; } Explanation: Here, we create a singly linked list and use a linear search algorithm to find the target value. We traverse the list node by node, checking each node’s data. Since there’s no concept of index in a linked list, we maintain a manual counter (index). If we find the target, we return its position. Otherwise, we return -1. Advantages and Disadvantages of Linear Search: Advantages: Simple and easy to implement. Works on both sorted and unsorted datasets. Disadvantages: Inefficient for large datasets, especially when compared to more advanced algorithms like binary search. Takes linear time in the worst case (O(n)). Where to Use Linear Search: When dealing with small datasets or unsorted data. For data structures like linked lists, where other search algorithms might be less efficient due to lack of direct indexing. This concludes our detailed discussion of Linear Search across different data structures. Next Part Preview: Binary Search In the next part of the chapter, we will dive into the more efficient Binary Search, where we optimize the searching process by dividing the dataset in half. But remember, binary search only works on sorted datasets, so sorting comes into play! You’ll also see how binary search works not just with arrays but with Binary Search Trees (BSTs), a crucial data structure in DSA. And as always, feel free to check out resources on digilearn.cloud from Emancipation Edutech Private Limited, where this book is available for free reading! Part 2: Binary Search 1.2 Binary
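Since the preview of binary search is cut off here, a minimal hedged sketch of the iterative version on a sorted array is shown below as a taste of what the next part covers; it assumes the array is already sorted in ascending order.

#include <stdio.h>

// Iterative binary search on a sorted array; returns the index of target, or -1
int binarySearch(int arr[], int n, int target) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;     // avoids overflow of (low + high)
        if (arr[mid] == target)
            return mid;
        else if (arr[mid] < target)
            low = mid + 1;                    // target can only be in the right half
        else
            high = mid - 1;                   // target can only be in the left half
    }
    return -1;                                // target not found
}

int main() {
    int arr[] = {10, 11, 15, 23, 45, 70};     // must already be sorted
    int n = sizeof(arr) / sizeof(arr[0]);
    int result = binarySearch(arr, n, 45);
    if (result != -1)
        printf("Element found at index: %d\n", result);
    else
        printf("Element not found.\n");
    return 0;
}

Each comparison halves the remaining search range, giving O(log n) time instead of linear search's O(n).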

Chapter 14 – Tree Traversal Methods

Chapter 14: Tree Traversal Methods In this chapter, we will explore Tree Traversal Methods, which are vital for accessing and processing the data stored in tree structures. Trees are non-linear data structures that represent hierarchical relationships, making traversing them crucial for various applications, such as searching, sorting, and modifying data. Traversing a tree means visiting all its nodes in a specified order. The choice of traversal method can significantly impact the efficiency and outcome of operations performed on the tree. There are two main categories of tree traversal methods: Depth-First Traversal (DFT) Breadth-First Traversal (BFT) 1. Depth-First Traversal (DFT) Depth-First Traversal explores as far down one branch of the tree as possible before backtracking to explore other branches. This method is typically implemented using recursion or an explicit stack data structure. There are three primary types of Depth-First Traversal: 1.1 Pre-Order Traversal In Pre-Order Traversal, nodes are visited in the following order: Visit the current node. Traverse the left subtree. Traverse the right subtree. Use Cases for Pre-Order Traversal Copying a Tree: If you need to create a duplicate of a tree, pre-order traversal allows you to visit each node, storing its value in a new tree structure. Prefix Expression Evaluation: In expressions written in prefix notation (also known as Polish notation), pre-order traversal is essential for evaluating the expression. Example: Given the binary tree: A / \ B C / \ D E The Pre-Order Traversal would yield: A, B, D, E, C. Implementation: class Node: def __init__(self, key): self.left = None self.right = None self.val = key def pre_order_traversal(root): if root: print(root.val, end=’ ‘) # Visit the node pre_order_traversal(root.left) # Traverse left subtree pre_order_traversal(root.right) # Traverse right subtree # Create the tree root = Node(‘A’) root.left = Node(‘B’) root.right = Node(‘C’) root.left.left = Node(‘D’) root.left.right = Node(‘E’) print("Pre-Order Traversal:") pre_order_traversal(root) # Output: A B D E C 1.2 In-Order Traversal In In-Order Traversal, nodes are visited in this order: Traverse the left subtree. Visit the current node. Traverse the right subtree. Use Cases for In-Order Traversal Binary Search Trees (BST): In-order traversal is used to retrieve the elements of a BST in sorted order. Expression Tree Evaluation: For expression trees, in-order traversal can be useful for generating infix notation. Example: Using the same binary tree, the In-Order Traversal yields: D, B, E, A, C. Implementation: def in_order_traversal(root): if root: in_order_traversal(root.left) # Traverse left subtree print(root.val, end=’ ‘) # Visit the node in_order_traversal(root.right) # Traverse right subtree print("\nIn-Order Traversal:") in_order_traversal(root) # Output: D B E A C 1.3 Post-Order Traversal In Post-Order Traversal, nodes are visited in this order: Traverse the left subtree. Traverse the right subtree. Visit the current node. Use Cases for Post-Order Traversal Deleting a Tree: When deallocating memory for tree nodes, post-order traversal ensures that child nodes are deleted before the parent node. Postfix Expression Evaluation: In postfix notation (Reverse Polish notation), post-order traversal is used for evaluation. Example: Again, using the same binary tree, the Post-Order Traversal yields: D, E, B, C, A. 
Implementation: def post_order_traversal(root): if root: post_order_traversal(root.left) # Traverse left subtree post_order_traversal(root.right) # Traverse right subtree print(root.val, end=’ ‘) # Visit the node print("\nPost-Order Traversal:") post_order_traversal(root) # Output: D E B C A 2. Breadth-First Traversal (BFT) Breadth-First Traversal explores all the nodes at the present depth level before moving on to the nodes at the next depth level. This traversal method is typically implemented using a queue data structure. 2.1 Level Order Traversal In Level Order Traversal, the nodes are visited level by level from top to bottom. Starting from the root, it visits all nodes at the present depth level before moving on to the nodes at the next depth level. Use Cases for Level Order Traversal Finding the Shortest Path: In unweighted trees or graphs, level order traversal is useful for finding the shortest path between nodes. Completeness Checking: To check whether a binary tree is complete, level order traversal can help validate the condition at each level. Example: Using the same binary tree, the Level Order Traversal yields: A, B, C, D, E. Implementation: from collections import deque def level_order_traversal(root): if root is None: return queue = deque([root]) # Initialize the queue with the root node while queue: current = queue.popleft() # Dequeue the front node print(current.val, end=’ ‘) # Visit the node if current.left: # Enqueue left child queue.append(current.left) if current.right: # Enqueue right child queue.append(current.right) print("\nLevel Order Traversal:") level_order_traversal(root) # Output: A B C D E Comparison of Traversal Methods Traversal Method Order of Visiting Nodes Use Cases Pre-Order Node, Left, Right Copying a tree, prefix expression In-Order Left, Node, Right Binary search trees, sorted output Post-Order Left, Right, Node Deleting a tree, postfix expression Level Order Level by Level Finding shortest path in unweighted trees Complexity Analysis Understanding the time and space complexity of these traversal methods is essential: Time Complexity: All traversal methods (pre-order, in-order, post-order, and level order) have a time complexity of O(n), where n is the number of nodes in the tree. Each node is visited exactly once. Space Complexity: Pre-Order, In-Order, and Post-Order: The space complexity for these recursive methods is O(h), where h is the height of the tree. This is due to the stack space used by recursive calls. For balanced trees, this is O(log n), but for skewed trees, it can be O(n). Level Order: The space complexity for level order traversal is O(w), where w is the maximum width of the tree. In the worst case, this can also be O(n). Conclusion Tree traversal methods are fundamental for effectively working with tree data structures. They allow us to explore and manipulate tree nodes efficiently, which is essential for various applications in computer science and programming. In our next chapter, we will delve into Binary Trees and Binary Search Trees (BST), laying the groundwork for understanding more complex tree structures. Remember, for free access to this book and other resources provided by Emancipation Edutech Private Limited, you can visit digilearn.cloud. Happy learning!

Chapter 13 – AVL Trees and Red-Black Trees

Chapter 13: AVL Trees and Red-Black Trees In this chapter, we will dive into AVL Trees and Red-Black Trees, two important types of self-balancing binary search trees. These trees ensure that the height of the tree remains balanced, providing improved efficiency for operations such as insertion, deletion, and searching. Self-balancing trees like AVL Trees and Red-Black Trees guarantee that the operations on the tree have a time complexity of O(log n), even in the worst-case scenario. Let’s explore them in detail! What are AVL Trees? An AVL Tree is a type of self-balancing binary search tree where the difference between the heights of the left and right subtrees of any node (known as the balance factor) is at most 1. The AVL Tree is named after its inventors Adelson-Velsky and Landis, who introduced the concept in 1962. Key Properties of AVL Trees: Balance Factor: For every node in an AVL tree, the balance factor must be either -1, 0, or +1. Balance Factor = Height of Left Subtree – Height of Right Subtree If the balance factor is not within this range, the tree is considered unbalanced and requires rotation to rebalance. Rotations: To maintain the balance of an AVL tree, we use rotations when an insertion or deletion operation causes the tree to become unbalanced. Single Rotations: Right or Left rotation. Double Rotations: A combination of right and left rotations. Example of an AVL Tree Consider inserting the elements 10, 20, and 30 into an empty AVL Tree. Step 1: Insert 10: 10 Step 2: Insert 20: 10 \ 20 Step 3: Insert 30. This causes the tree to become unbalanced because the balance factor of node 10 becomes -2 (left height = 0, right height = 2). To rebalance the tree, we perform a left rotation at node 10: Before rotation: 10 \ 20 \ 30 After rotation: 20 / \ 10 30 Now, the tree is balanced again. Rotations in AVL Trees To maintain the balance factor of AVL trees, we perform rotations. There are four types of rotations used to rebalance an AVL tree: 1. Left Rotation (LL Rotation) A left rotation is performed when a node becomes unbalanced due to the right subtree being taller than the left subtree. This happens when a new node is inserted into the right subtree of the right child. Example: Before left rotation: 10 \ 20 \ 30 After left rotation: 20 / \ 10 30 2. Right Rotation (RR Rotation) A right rotation is performed when a node becomes unbalanced due to the left subtree being taller than the right subtree. This happens when a new node is inserted into the left subtree of the left child. Example: Before right rotation: 30 / 20 / 10 After right rotation: 20 / \ 10 30 3. Left-Right Rotation (LR Rotation) A left-right rotation is a combination of left and right rotations. It is performed when a new node is inserted into the right subtree of the left child. Example: Before LR rotation: 30 / 10 \ 20 First, perform a left rotation on the left child (10), then perform a right rotation on the root (30): After LR rotation: 20 / \ 10 30 4. Right-Left Rotation (RL Rotation) A right-left rotation is a combination of right and left rotations. It is performed when a new node is inserted into the left subtree of the right child. Example: Before RL rotation: 10 \ 30 / 20 First, perform a right rotation on the right child (30), then perform a left rotation on the root (10): After RL rotation: 20 / \ 10 30 What are Red-Black Trees? A Red-Black Tree is another type of self-balancing binary search tree. 
It is similar to an AVL tree, but it has a more relaxed balancing criterion, which makes it faster for insertion and deletion operations. The key feature of a Red-Black Tree is the use of colors to maintain balance. Key Properties of Red-Black Trees: Each node is colored either red or black. The root node must always be black. No two consecutive red nodes can appear on the same path (i.e., a red node cannot have a red parent or red child). Every path from a node to its descendant null nodes must contain the same number of black nodes. The longest path from the root to a leaf is no more than twice as long as the shortest path (this ensures that the tree is balanced). Example of a Red-Black Tree Consider inserting the elements 10, 20, and 30 into an empty Red-Black Tree. Step 1: Insert 10 (the root node is always black). 10 (black) Step 2: Insert 20. Since it is greater than 10, it is inserted as the right child and colored red: 10 (black) \ 20 (red) Step 3: Insert 30. Since this creates a violation of two consecutive red nodes (20 and 30), we perform a left rotation at node 10 and recolor: Before rotation: 10 (black) \ 20 (red) \ 30 (red) After rotation and recoloring: 20 (black) / \ 10 (red) 30 (red) Now, the tree satisfies all the Red-Black Tree properties. Rotations in Red-Black Trees Similar to AVL trees, Red-Black Trees also use rotations to restore balance when a violation occurs. The rotations are the same as those in AVL trees: left rotation, right rotation, left-right rotation, and right-left rotation. Comparison: AVL Trees vs. Red-Black Trees Feature AVL Trees Red-Black Trees Balancing Method Strictly balanced Loosely balanced Height Difference At most 1 No more than twice the shortest path Rebalancing Frequent rotations Fewer rotations Efficiency in Insertion/Deletion Slower in large trees Faster for insertion/deletion Search Efficiency Slightly better Slightly worse than AVL trees Applications of AVL and Red-Black Trees Database Indexing: Both AVL and Red-Black Trees are widely used in databases for maintaining ordered records efficiently. Memory Management: Red-Black Trees are often used in memory allocators (e.g., Linux kernel uses Red-Black Trees). Networking: AVL and Red-Black Trees are useful in routing
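The rotations in this chapter are shown only as diagrams. As a hedged sketch of how the two single rotations and the balance factor are usually coded for an AVL node, the fragment below stores a height field in each node; the struct layout and the function names are illustrative choices rather than the only possible implementation.

struct AVLNode {
    int key;
    int height;                                // height of the subtree rooted here
    struct AVLNode *left, *right;
};

int height(struct AVLNode *n) { return n ? n->height : 0; }
int maxInt(int a, int b) { return a > b ? a : b; }

// Balance factor = height of left subtree - height of right subtree
int balanceFactor(struct AVLNode *n) {
    return n ? height(n->left) - height(n->right) : 0;
}

// Right rotation: used when the left subtree has become too tall
struct AVLNode* rightRotate(struct AVLNode *x) {
    struct AVLNode *y = x->left;
    x->left = y->right;                        // y's right subtree moves under x
    y->right = x;                              // y becomes the new subtree root
    x->height = 1 + maxInt(height(x->left), height(x->right));
    y->height = 1 + maxInt(height(y->left), height(y->right));
    return y;
}

// Left rotation: used when the right subtree has become too tall
struct AVLNode* leftRotate(struct AVLNode *x) {
    struct AVLNode *y = x->right;
    x->right = y->left;                        // y's left subtree moves under x
    y->left = x;                               // y becomes the new subtree root
    x->height = 1 + maxInt(height(x->left), height(x->right));
    y->height = 1 + maxInt(height(y->left), height(y->right));
    return y;
}

After an insertion, the balance factor of each ancestor is rechecked, and one of these rotations (or a left-right / right-left combination) restores the AVL property.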
