Data Structures

Within programming, there are a few standard data structures (usually implemented as classes) which are used to store data in a certain way. These allow data to be accessed and formatted in a certain way and are usually used for groups of data (lists and making sure they are accessed in the right way).

In and Out order

A given data structure can be described using an in and out order which describes the order in which a data item is entered and the relative time it can be accessed.

For example, if a data structure is FIFO (First in First out) such as a Queue then this means that the first item to be entered into the list will be the first item to be accessed in the list.

Likewise, if a data structure is FILO (First in Last out) such as a Stack then this means the first item to be entered into the list will be the last item to be accessed in the list.

Finally, a list can be FIAO (First in Any out) which means that an item entered can be accessed at any point in the list using an index (e.g. Linked List)

Linked List

Up until now, you have likely been using an array to store a sequential set of data. However, there is another option, a linked list. Within an array, all data is stored sequentially in 1 big block, meaning that the data can be accessed really quickly using an offset. However, to edit the size of an array, you must copy the entire array to a new location in memory which can be an incredibly slow process.

A Linked List has the opposite effects.

A linked list stores data in individual elements, each element has 2 components:

Because of this pointer, lists do not need to be stored sequentially or in a big block as they can be accessed by simply following the chain of pointers, thus making them much easier to add new elements to.

However, to access an item in the list, you must go through each item in the pointer chain until you reach that item, therefore linked lists are much slower to access.

A linked list can be single linked or double linked. In a single linked list, the chain of pointers travels from the first item to the last item. In a double linked list, there are 2 chains of pointers which travel in both directions (first to last & last to first)

A linked list is defined as a FIAO (First in Any out) algorithm.

Below is an implementation of a singularly linked list in C++:

// Node structure
struct Node {
    int data;
    Node* next;

    Node(int val) : data(val), next(nullptr) {}
};

// LinkedList class
class LinkedList {
private:
    Node* head;

public:
    LinkedList() : head(nullptr) {}

    // Insert a new node at the beginning
    void insertAtBeginning(int data) {
        Node* newNode = new Node(data);
        newNode->next = head;
        head = newNode;
    }

    // Insert a new node at the end
    void insertAtEnd(int data) {
        Node* newNode = new Node(data);
        if (head == nullptr) {
            head = newNode;
            return;
        }
        Node* temp = head;
        while (temp->next != nullptr) {
            temp = temp->next;
        }
        temp->next = newNode;
    }

    // Delete a node by value
    void deleteNode(int data) {
        if (head == nullptr) return;
        if (head->data == data) {
            Node* temp = head;
            head = head->next;
            delete temp;
            return;
        }
        Node* temp = head;
        while (temp->next != nullptr && temp->next->data != data) {
            temp = temp->next;
        }
        if (temp->next == nullptr) return;
        Node* nodeToDelete = temp->next;
        temp->next = temp->next->next;
        delete nodeToDelete;
    }

    // Display the linked list
    void display() {
        Node* temp = head;
        while (temp != nullptr) {
            cout << temp->data << " -> ";
            temp = temp->next;
        }
        cout << "nullptr" << endl;
    }

    // Destructor to clean up the memory
    ~LinkedList() {
        Node* current = head;
        Node* nextNode;
        while (current != nullptr) {
            nextNode = current->next;
            delete current;
            current = nextNode;
        }
    }
};

Stack

The stack is secretly the most common data structure in programming, although most people dont know about it.

A stack is similar to a Linked List in terms of internal style but differs in terms of access.

Firstly, a stack is a FILO (First in Last out) algorithm. The best way to imagine this is a literal stack of plates. When you add an item, you add it to the top of the stack, and you can only access the item at the top of the stack, so to access another in the middle of the stack, you will have to remove all items on top of the item you want.

To interact with a stack, it has 3 main functions:

Call Stack

As I mentioned earlier, a stack is secretly the most common data structure in programming, as pretty much every single program uses it. This is called a call stack.

Within a program, you can have functions which can be called from functions which can be called from functions, these are managed using the call stack.

Whenever the program enters a new function, it adds this function to the call stack, then when the function returns, it is popped from the call stack and this then allows the program to access the next highest item in the call stack (the function from which the popped function was called).

The static variables within the function are also stored in this call stack to help manage scope

Below is an implementation of a Stack in C++:

// Node structure
struct Node {
    int data;
    Node* next;

    Node(int val) : data(val), next(nullptr) {}
};

// Stack class
class Stack {
private:
    Node* top;  // Pointer to the top node

public:
    // Constructor to initialize stack
    Stack() : top(nullptr) {}

    // Destructor to clean up the memory
    ~Stack() {
        while (top != nullptr) {
            Node* temp = top;
            top = top->next;
            delete temp;
        }
    }

    // Function to add an element to the stack
    void push(int x) {
        Node* newNode = new Node(x);
        newNode->next = top;
        top = newNode;
    }

    // Function to remove an element from the stack
    int pop() {
        if (top == nullptr) {
            throw std::underflow_error("Stack Underflow");
        }
        Node* temp = top;
        int poppedData = temp->data;
        top = top->next;
        delete temp;
        return poppedData;
    }

    // Function to return the top element of the stack
    int peek() const {
        if (top == nullptr) {
            throw std::underflow_error("Stack is Empty");
        }
        return top->data;
    }

    // Function to check if the stack is empty
    bool isEmpty() const {
        return top == nullptr;
    }

    // Function to return the size of the stack
    int size() const {
        int count = 0;
        Node* temp = top;
        while (temp != nullptr) {
            count++;
            temp = temp->next;
        }
        return count;
    }
};

Queue

A queue is another extremely common data structure. This one is rather self explanatory.

A queue stores data in FIFO (first in first out) and is used to store data in a list so that you enter an item at the back and are accessed in order of when they were entered, starting with the most distant.

There are different types of queue:

Basic Queue

A basic queue is the most primitive type of queue, this simply stores items in a single linear queue with no special behaviours.

This works by storing items in an array and then moving a head pointer which points to the first item in the list.

This can lead to an error where space in the array before the head is wasted.

Below is an implementation of a Basic Queue:

class LinearQueue {
private:
    int* arr;
    int front, rear, capacity, size;

public:
    // Constructor to initialize the queue
    LinearQueue(int capacity) : capacity(capacity), size(0), front(0), rear(capacity - 1) {
        arr = new int[capacity];
    }

    // Destructor to clean up the memory
    ~LinearQueue() {
        delete[] arr;
    }

    // Function to add an element to the queue
    void enqueue(int data) {
        if (isFull()) {
            throw std::overflow_error("Queue Overflow");
        }
        rear = (rear + 1) % capacity;
        arr[rear] = data;
        size++;
    }

    // Function to remove an element from the queue
    int dequeue() {
        if (isEmpty()) {
            throw std::underflow_error("Queue Underflow");
        }
        int data = arr[front];
        front = (front + 1) % capacity;
        size--;
        return data;
    }

    // Function to get the front element of the queue
    int getFront() const {
        if (isEmpty()) {
            throw std::underflow_error("Queue is Empty");
        }
        return arr[front];
    }

    // Function to get the rear element of the queue
    int getRear() const {
        if (isEmpty()) {
            throw std::underflow_error("Queue is Empty");
        }
        return arr[rear];
    }

    // Function to check if the queue is empty
    bool isEmpty() const {
        return size == 0;
    }

    // Function to check if the queue is full
    bool isFull() const {
        return size == capacity;
    }

    // Function to get the current size of the queue
    int getSize() const {
        return size;
    }
};

Cyclic Queue

A cyclic queue is a slightly more advanced queue with a second pointer to the tail (last item in the list). This overcomes the error in a Basic Queue where the space in the array before the head is wasted as data is added to the location at the tail rather than simply at the end of the array.

Below is an implementation of a Cyclic Queue:

class CircularQueue {
private:
    int* arr;
    int front, rear, size, capacity;

public:
    // Constructor to initialize queue
    CircularQueue(int capacity) : capacity(capacity), size(0), front(0), rear(capacity - 1) {
        arr = new int[capacity];
    }

    // Destructor to clean up the memory
    ~CircularQueue() {
        delete[] arr;
    }

    // Function to add an element to the queue
    void enqueue(int data) {
        if (isFull()) {
            cout << "Queue Overflow\n";
            return;
        }
        rear = (rear + 1) % capacity;
        arr[rear] = data;
        size++;
    }

    // Function to remove an element from the queue
    int dequeue() {
        if (isEmpty()) {
            cout << "Queue Underflow\n";
            return -1;
        }
        int data = arr[front];
        front = (front + 1) % capacity;
        size--;
        return data;
    }

    // Function to get the front element of the queue
    int getFront() const {
        if (isEmpty()) {
            cout << "Queue is empty\n";
            return -1;
        }
        return arr[front];
    }

    // Function to get the rear element of the queue
    int getRear() const {
        if (isEmpty()) {
            cout << "Queue is empty\n";
            return -1;
        }
        return arr[rear];
    }

    // Function to check if the queue is empty
    bool isEmpty() const {
        return size == 0;
    }

    // Function to check if the queue is full
    bool isFull() const {
        return size == capacity;
    }

    // Function to get the current size of the queue
    int getSize() const {
        return size;
    }
};

Priority Queue

A priority queue has items removed in order of priority as well as the order in which they were entered.

class PriorityQueue {
private:
    std::vector<int> heap;

    // Function to get the parent index of a given index
    int parent(int i) {
        return (i - 1) / 2;
    }

    // Function to get the left child index of a given index
    int leftChild(int i) {
        return 2 * i + 1;
    }

    // Function to get the right child index of a given index
    int rightChild(int i) {
        return 2 * i + 2;
    }

    // Function to heapify up (restore heap property upwards)
    void heapifyUp(int i) {
        if (i && heap[parent(i)] < heap[i]) {
            std::swap(heap[i], heap[parent(i)]);
            heapifyUp(parent(i));
        }
    }

    // Function to heapify down (restore heap property downwards)
    void heapifyDown(int i) {
        int left = leftChild(i);
        int right = rightChild(i);
        int largest = i;

        if (left < heap.size() && heap[left] > heap[largest]) {
            largest = left;
        }

        if (right < heap.size() && heap[right] > heap[largest]) {
            largest = right;
        }

        if (largest != i) {
            std::swap(heap[i], heap[largest]);
            heapifyDown(largest);
        }
    }

public:
    // Function to check if the priority queue is empty
    bool isEmpty() const {
        return heap.size() == 0;
    }

    // Function to get the size of the priority queue
    int size() const {
        return heap.size();
    }

    // Function to add an element to the priority queue
    void push(int element) {
        heap.push_back(element);
        int index = heap.size() - 1;
        heapifyUp(index);
    }

    // Function to remove and return the highest priority element
    int pop() {
        if (isEmpty()) {
            throw std::underflow_error("Priority Queue Underflow");
        }

        int root = heap.front();
        heap[0] = heap.back();
        heap.pop_back();
        heapifyDown(0);

        return root;
    }

    // Function to get the highest priority element without removing it
    int top() const {
        if (isEmpty()) {
            throw std::underflow_error("Priority Queue is Empty");
        }

        return heap.front();
    }
};

Graph

A Graph data structure works by storing a series of nodes, which are then connected using a series of arcs.

A Graph data structure does not have a specific in and out order due to the nature of nodes being stored.

The most typical implementations of Graphs are in social media databases (where a node represents a user or information about a user such as an interest) and in 3D applications such as Maya and Blender (where a node represents a vertex and an arc represents an edge).

Below is an implementation of a graph data structure:

class Graph {
private:
    // Adjacency list representation
    std::unordered_map<int, std::list<int>> adjList;

public:
    // Function to add a vertex to the graph
    void addVertex(int vertex) {
        // Check if vertex already exists
        if (adjList.find(vertex) == adjList.end()) {
            adjList[vertex] = std::list<int>();
        }
    }

    // Function to add an edge to the graph (undirected)
    void addEdge(int src, int dest) {
        // Add the edge from src to dest
        adjList[src].push_back(dest);
        // Since the graph is undirected, add an edge from dest to src as well
        adjList[dest].push_back(src);
    }

    // Function to display the graph
    void display() {
        for (const auto& pair : adjList) {
            std::cout << pair.first << " -> ";
            for (const auto& vertex : pair.second) {
                std::cout << vertex << " ";
            }
            std::cout << std::endl;
        }
    }

    // Function to perform BFS traversal from a given source vertex
    void BFS(int start) {
        std::unordered_map<int, bool> visited;
        std::queue<int> queue;

        visited[start] = true;
        queue.push(start);

        while (!queue.empty()) {
            int vertex = queue.front();
            queue.pop();
            std::cout << vertex << " ";

            for (const int& adjVertex : adjList[vertex]) {
                if (!visited[adjVertex]) {
                    visited[adjVertex] = true;
                    queue.push(adjVertex);
                }
            }
        }
        std::cout << std::endl;
    }

    // Function to perform DFS traversal from a given source vertex
    void DFS(int start) {
        std::unordered_map<int, bool> visited;
        DFSUtil(start, visited);
        std::cout << std::endl;
    }

private:
    // Utility function for DFS traversal
    void DFSUtil(int vertex, std::unordered_map<int, bool>& visited) {
        visited[vertex] = true;
        std::cout << vertex << " ";

        for (const int& adjVertex : adjList[vertex]) {
            if (!visited[adjVertex]) {
                DFSUtil(adjVertex, visited);
            }
        }
    }
};

Trees

A Tree data structure is an implementation of a Graph data structure which works by designating a single node as root from which all nodes can be traced back to. Furtheremore, nodes at the edges of a tree data structure are names leaves.

The most common type of tree is a binary tree, in which a parent node can have anywhere between 0 and 2 child nodes (leaf nodes). This is used in sorting and searching algorithms.

To traverse between nodes in a tree, there are 3 main approaches (when looking at the root -> child description, remember that when this traverses to a new node, it considers that node the root):

Below is an implementation of a binary tree data structure:

// Definition of a node in the binary tree
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;

    // Constructor
    TreeNode(int value) : data(value), left(nullptr), right(nullptr) {}
};

class BinaryTree {
private:
    TreeNode* root;

public:
    // Constructor
    BinaryTree() : root(nullptr) {}

    // Destructor
    ~BinaryTree() {
        // Delete all nodes in the tree
        clear(root);
    }

    // Function to insert a value into the binary tree
    void insert(int value) {
        root = insertRecursive(root, value);
    }

    // Function to perform an in-order traversal of the binary tree
    void inOrderTraversal() {
        inOrderTraversalRecursive(root);
    }

    // Function to perform a pre-order traversal of the binary tree
    void preOrderTraversal() {
        preOrderTraversalRecursive(root);
    }

    // Function to perform a post-order traversal of the binary tree
    void postOrderTraversal() {
        postOrderTraversalRecursive(root);
    }

    // Function to perform a level-order traversal of the binary tree
    void levelOrderTraversal() {
        if (!root) return;

        std::queue<TreeNode*> q;
        q.push(root);

        while (!q.empty()) {
            TreeNode* current = q.front();
            q.pop();
            std::cout << current->data << " ";

            if (current->left) q.push(current->left);
            if (current->right) q.push(current->right);
        }
    }

private:
    // Private helper function to insert a value into the binary tree recursively
    TreeNode* insertRecursive(TreeNode* node, int value) {
        if (!node) {
            return new TreeNode(value);
        }

        if (value < node->data) {
            node->left = insertRecursive(node->left, value);
        } else if (value > node->data) {
            node->right = insertRecursive(node->right, value);
        }

        return node;
    }

    // Private helper function to perform in-order traversal recursively
    void inOrderTraversalRecursive(TreeNode* node) {
        if (!node) return;

        inOrderTraversalRecursive(node->left);
        std::cout << node->data << " ";
        inOrderTraversalRecursive(node->right);
    }

    // Private helper function to perform pre-order traversal recursively
    void preOrderTraversalRecursive(TreeNode* node) {
        if (!node) return;

        std::cout << node->data << " ";
        preOrderTraversalRecursive(node->left);
        preOrderTraversalRecursive(node->right);
    }

    // Private helper function to perform post-order traversal recursively
    void postOrderTraversalRecursive(TreeNode* node) {
        if (!node) return;

        postOrderTraversalRecursive(node->left);
        postOrderTraversalRecursive(node->right);
        std::cout << node->data << " ";
    }

    // Private helper function to clear the binary tree recursively
    void clear(TreeNode* node) {
        if (!node) return;

        clear(node->left);
        clear(node->right);
        delete node;
    }
};