Two Sides of the Same Coin: Data Storage and Algorithms

 

Algorithms and Data Structures are a powerful pair in the world of computing. First and foremost, let's look at Algorithms.

What is an Algorithm? 

Most people only ever hear about algorithms in a scientific context, so many never learn what one actually is. In reality, "algorithm" is a general term for a step-by-step method that completes a task. From that perspective, much of what you do follows an algorithm: cooking food, getting ready for the day, doing your job at work. Anything made up of tasks needs an algorithm to complete them.

For our purposes, though, the ability to perform work is meaningless when there is nothing to work on. That is why good algorithms are so often used in tandem with Data Structures.

What is a Data Structure?  

Simply put, a Data Structure is a way of organizing data. There is more than one sensible way to store data, which has led to many types of Data Structures:

  • Arrays/Lists: Store data in sequence. Each element has a position (1, 2, 3, and so on) and can be looked up by it. Arrays in particular keep their elements side by side in memory.

  • Stacks: Storage using Last-In-First-Out (LIFO) methods. Simply put, only the most recently added element can be interacted with.

  • Trees: The typical File Explorer solution. Data is organized in terms of parents and children, with child data nested below its parent.
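
To make the three structures above a little more concrete, here is a minimal Python sketch. The variable names and sample values (groceries, browser history, a folder layout) are made up for illustration, and the tree is modeled as nested dictionaries, which is only one of several ways to represent one.

```python
# Array/list: every element has a position and can be read by its index.
groceries = ["eggs", "milk", "bread"]
first_item = groceries[0]  # position 0 holds "eggs"

# Stack: last in, first out. A Python list works as a stack when we
# only ever use append() to add and pop() to remove.
history = []
history.append("page 1")
history.append("page 2")
most_recent = history.pop()  # removes and returns "page 2"

# Tree: parents with children nested below them, like folders on disk.
# Here each key is a folder name and each value holds its subfolders.
file_tree = {
    "Documents": {
        "Taxes": {},
        "Photos": {"Vacation": {}},
    }
}


def depth(tree):
    """Count how many levels of folders are nested inside this tree."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())


levels = depth(file_tree)  # Documents -> Photos -> Vacation
```

Notice that the stack never lets us reach "page 1" without first removing "page 2", while the list hands over any position directly.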

Variety is the Spice of Life 

With everything previously discussed, it's easy to see that there is an abundance of Algorithms and Data Structures. They are often designed around specific use cases. Consider the differences between spreadsheet and database software. Both technically do the same thing (store, organize, and read data), but their scope is different. SQL databases can manage terabytes of data. However, they require technical knowledge to build and run, and can become a hassle for smaller datasets. Excel comes prebuilt with an easy-to-use GUI, but it has hard limits on how much it can store (a worksheet tops out at 1,048,576 rows) and on how quickly it can read large amounts of data.

Therefore, picking the correct Algorithm and Data structures comes down to a few things: 

Algorithms... 

  • Time Complexity: Describes how an algorithm's running time grows as the amount of data grows. More data naturally means more work; however, clever algorithms can sort or search efficiently, often skipping large portions of the data altogether.

  • Space Complexity: Describes how much memory an algorithm needs as its input grows. Working with data requires memory, so how an algorithm uses that memory is important. A fast algorithm that takes up too much memory could starve other processes on the system, making it impractical.
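
As a rough illustration of time complexity, the sketch below counts comparisons for two textbook search strategies: a linear search that checks every element, and a binary search that halves the remaining range on each step. The function names and the one-million-item data set are assumptions chosen for the example.

```python
def linear_search(items, target):
    """Check elements one by one. Returns (index, comparisons made)."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons


def binary_search(sorted_items, target):
    """Halve the search range each step. Requires ascending sorted input.

    Returns (index, comparisons made), with index -1 if not found.
    """
    comparisons = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        if sorted_items[mid] == target:
            return mid, comparisons
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1, comparisons


data = list(range(1_000_000))  # already sorted: 0, 1, 2, ...
_, linear_steps = linear_search(data, 999_999)
_, binary_steps = binary_search(data, 999_999)
print(linear_steps, binary_steps)
```

Looking for the last element, the linear search makes 1,000,000 comparisons, while the binary search needs at most 20 for a list this size. Both use only a handful of extra variables, so their space cost is about the same; the difference here is purely in time.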

Data Structures... 

  • Modification: Data is read and written all the time, and it is physically stored in memory. If a structure handles modification poorly, fragmentation or wasted memory can result.

  • Algorithms: The way data is structured and the algorithms that search through it directly affect each other. An algorithm that works best on an ordered data set could bring catastrophic results on an unordered one.
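
The ordered-versus-unordered point can be shown with a small sketch. It uses Python's standard bisect module, which performs a binary search and assumes its input is already sorted. The helper name sorted_contains and the sample numbers are made up for this example.

```python
from bisect import bisect_left


def sorted_contains(items, target):
    """Binary-search membership test. Only correct if items is sorted ascending."""
    position = bisect_left(items, target)
    return position < len(items) and items[position] == target


unordered = [7, 3, 9, 1, 5]
ordered = sorted(unordered)  # [1, 3, 5, 7, 9]

found_in_ordered = sorted_contains(ordered, 1)      # True
found_in_unordered = sorted_contains(unordered, 1)  # False, even though 1 is present
```

No error is raised on the unordered list. The search simply returns the wrong answer, which is arguably worse than a crash because nothing signals that the data and the algorithm were mismatched.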

So... What is the best Search/Sort/Storage solution? 

Two words: It depends. 

The answer to this question will depend on the answers to the following: 

  • What type of data am I storing? 

  • How much of it needs to be stored?

  • How quickly does this data need to be accessed? 

  • How much memory is available for storage and active reads/writes? 

Return to the Database vs Spreadsheet debate. Spreadsheets like Excel can afford to search each cell individually because data sets are often small. In addition, Excel is not built for (and therefore isn't expected to perform) live and constant data retrieval. SQL databases can lean heavily on efficient search and sort algorithms to store and modify large data sets. The backbones of many popular websites run on a well-implemented SQL database.

 
