Skip to main content

CPT 307: Starting to understand algorithm selection

As it turns out, there is a specific science to selecting the best algorithm to apply to data within a program.  When I first started my Data Structures & Algorithms class, I was excited to learn about different algorithms, and how to efficiently store and sort data using advanced data structures.  What I learned was that there are a great many different algorithms to both search and sort information stored in arrays.  Some websites, such as geeksforgeeks.org, have entire lists of different algorithms each with differing complexities, and each tailored for a specific use.  The computer science community describes algorithm efficiency using two different measures of complexity.  Time complexity is a function relating the number of actions (n) that will be performed on an array (a[]).  There are many different kinds of actions that an algorithm can perform on an array of data.  Time can mean the number of memory accesses performed, the number of comparisons between integers, the number of times nested loops are executed, or some other way to define the actions of an algorithm.  It is common for people incorrectly associate wall clock time with the time complexity of an algorithm.  Space complexity is the second measure of complexity.  It refers to a function that is used to describe the amount of memory that an algorithm will consume during the course of its execution.  Computer scientists often speak of the extra memory needed, in addition to the total memory occupied by the inputs.  Bytes can be used to describe the amount of memory occupied, but you can also use the length of an array, the number of fixed size features, etc.

When computer scientists start analyzing the complexities of different algorithms, they use equations in the Big O format.  An example of a Big O equation is O(n^2).  Geekforgeeks.org describes four general steps to complete to determine the efficiency of an algorithm:

1.       Figure out what the input is and what n represents. The number n is often the number of elements in an array, but it can also represent other things.
2.       Express the maximum number of operations, the algorithm performs in terms of n.  In my example equation, the maximum number of operations that will be performed is n^2.
3.       Eliminate all excluding the highest order terms. In the equations a^2 + 2a, the highest order of terms is a^2.
4.       Remove all the constant factors.  In the equation a^2+2a+3, the “3” is the constant factor.

When you apply the above steps to f(n) =a0 + a1.n + a2.n^2 + am.n^m, the Big O notation is O(n^m).  Where the complexity of the algorithm varies based on the number of arrays and the number of elements in each array.  A Big O equation is used to describe the theoretical worst-case running time, but every algorithms has the possibility of completing on the first execution, O(1).  There are many other ways to generally describe the complexity of an algorithm such as logarithmic - O(logn), linear – O(n), superlinear – O(nlogn), plus many more depending on the algorithm.


Because of our need to sort and then search through so many different data sets, we will also have an almost infinite number of algorithms to try to apply to your data sets.  I visualize an algorithm as the various conditions that influence any decision I make in the real world.  My personal experiences in different situations would affect the conditions specified in the loops of an algorithm.  In a computer program, the algorithm will apply specific conditions to data to sort and search it.  A programmer needs to be able to understand precisely what the data is and how much there is before they can select the most efficient algorithm.  A simple comparison between algorithms is demonstrated by the similarities and differences between a Mergesort algorithm and a Bubble sort.  Mergesort can be extremely fast, but because of how it operates on its inputs, the space complexity is high.  Whereas a Bubble Sort is exceptionally slow, but it also has a very low space requirement.  Selecting an algorithm is every bit as complex as the equations used to describe them.  Any different amount of factors will influence the selection and design of both a program’s data structures and the algorithms used to operate on its inputs.

References:

Lysecky, R., Vahid, F., Lysecky, S., & Givargis, T. (2015). Data structures essentials. Retrieved from https://zybooks.zyante.com/#/zybook/DataStructuresEssentialsR25/chapter/1/section/3

Zeigler, Z. (2004). Time, complexity, space complexity, and the O-notation (Links to an external site.)Links to an external site.. Retrieved from http://www.leda-tutorial.org/en/official/ch02s02s03.html

Analysis of Algorithms | Big-O analysis. (2018, June 08). Retrieved November 15, 2018, from https://www.geeksforgeeks.org/analysis-algorithms-big-o-analysis/

Complexity Analysis. (n.d.). Retrieved November 15, 2018, from http://www.cs.utexas.edu/users/djimenez/utsa/cs1723/lecture2.html

Shaffer, C. A. (2013). Data Structures and Algorithm analysis. Dover Publications.



Comments

Post a Comment

Popular posts from this blog

CPT 200: Fundamentals of Programming Languages

    During my quest to obtain a Bachelor of Information Technology from Ashford University, my fourth class was CPT 200: Fundamentals of Programming Languages.  For that class, the programming language that is taught is Python 3.     On the first week of class, we were asked to create code that would ask a user to input several pieces of information about any specific employee.  We were to use the variables: employeeName, employeeSSN, employeePhone, employeeEmail, and employeeSalary.  After the data was inputted, it needed to be printed on the screen.  Below was what I turned in for Functionality 1:     During the second week of class, we were to read two chapters: Chapter 3: Types and Chapter 4: Branching.  These chapters introduced us to the different types of variables that can be used within Python as well as how to use branching in your scripts. For the second functionality, we were instructed to adjust our code to allow for 5 different employees to be input into the system

CPT 200: Employee Management System Final Project

import sys import csv employeeList = {} employeeImport = {} lstMenuOption = ('1', '2', '3', '4', 'Q') edtMenuOption = ('1', '2', '3', '4', '5', 'Q') validation = False #initialize validation check to "False" menuSelection = None numEmployees = None num = 1 def cls():     print('\n'*50)     def main_menu():     totalEmployees = len(employeeList) #Display number of employees in the databse     print('{:~^79}'.format('~'))     print('{:~^79}'.format('                              '))     print('{:~^79}'.format('  Employee Management System  '))     print('{:~^79}'.format('          James Hardy         '))     print('{:~^79}'.format('                              '))     print('{:~^79}'.format('~'))     print(' ')     print('{:^79}'.format('There are (%d) employees in the

CPT 200: Functionality 3 - Loops and Functions

import sys employeeList = {} lstMenuOption = ('1', '2', '3', 'Q') #i = 0 #loop count variable validation = False #initialize validation check to "False" menuSelection = None numEmployees = None num = 1 def main_menu():     print('-----------------------------------------------------')     print('(1) Add new employee(s)')     print('(2) View all employees in the database')     print("(3) Check employee's information")     print('(Q) Exit\n')     totalEmployees = len(employeeList) #Display number of employees in the databse     print('There are (%d) in the database' % totalEmployees)     print('-----------------------------------------------------')     option = input('Please select an option from the menu: ')     return option def valid(option): #Function to validate menu selection     if (option.isnumeric() == True):         if option in lstMenuOption:             return True