Go Back   CodeCall Programming Forum > Software Development > Tutorials > C Tutorials
Register Blogs Search Today's Posts Mark Forums Read

C Tutorials All C Tutorials and Code

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-23-2008, 10:11 PM
WingedPanther's Avatar
Super Moderator
 
Join Date: Jul 2006
Age: 36
Posts: 11,435
WingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud of
Abstract Data Types and Code Efficiency.

Abstract Data Types (ADTs) and code efficiency are two of the most important concepts when moving from basic coding in a language and beginning to create powerful applications. Many of you will never actually use these concepts directly, but if you want to be a professional programmer, you need to be aware of them. If you have used any libraries of any power (.NET, Boost, wxWidgets, Borland's GUI, Lazarus, etc), you have indirectly made use of both of these concepts.

Code efficiency is the first topic, and it deals with how gracefully a piece of code will grow. When we write programs, we tend to create code in a very "nice" environment. We sort arrays of 10 elements, we search for items in a text files with 20 lines, etc. In cases like these, it doesn't really matter how wretched your code is, it will still complete in the blink of an eye. Unfortunately, in the real world, we don't deal with data that small, we deal with arrays with thousands of elements, and text files that contain the US tax code.

To give an example of what I'm talking about, consider these to functions to find a value in an array of ints (that are in ascending order):
Code:
int badsearch(int* array, int size, int val)
{
  for (int i=0;i<size;i++)
    if (array[i]==val) return i;
  return -1;
}

int goodsearch(int* array, int size, int val)
{
  int i=0,j=size-1;
  while (i<=j)
  {
    if (array[(i+j)/2]==val)
      return (i+j)/2
    else if (array[(i+j)/2]<val)
      i=(i+j)/2+1
    else
      j=(i+j)/2-1
  }
  return -1;
}
badsearch() is the obvious way to find an element of an array, go through the array, element by element, until you find what you're looking for. When you find it, you can return its index and you're done. Here's the problem, on average, you have to go halfway through the array. If you have 10 elements, on average you'll find it on about the 5th element. That sounds pretty good, but consider this: if you have one million elements, you will find it on about the 500,000th element. What this means is that if you have X elements, it will take X/2 comparisons, on average, to find the element. While this might not seem bad, realize that every time you multiply the amount of data by 1000, the search time is also multiplied by 1000.

goodsearch(), on the other hand, takes a different approach. It takes advantage of the array being sorted and cuts range of possibilities in half with every 2 comparisons (one to see if we found it, one to see which half it's on). What this does is allow us to double the number of elements at the cost of only 2 more comparisons. This means multiplying the number of elements by 1024 costs us an additional 2*10 (since 1024 = 2^10) comparisons. Using log_2 as log base 2, this means the worst-case number of comparisons is 2*log_2(X) for an X element array. This means that while 10 elements would take 6 comparisons, one million elements would take only 40 comparisons.

Looking at a chart of values to see how they comparisons change as the array size grows:
Code:
Array Size              | 10 | 100 | 1,000 | 10,000 | 100,000 | 1,000,000
badsearch ave if calls  |  5 |  50 |   500 |  5,000 |  50,000 |   500,000
goodsearch max if calls |  6 |  14 |    20 |     26 |      34 |        40
For every algorithm, we can create a function that measures the number of times a line of code gets run. badsearch() has an average number of line executions of X/2+1, where X is the number of elements in the array. goodsearch() has a worst-case number of line executions of 2*log_2(X)+1. In general, we measure how efficient an algorithm is by looking at the fastest growing term for each (X/2 for badsearch(), 2*log_2(X) for goodsearch()), and ignoring any multipliers or dividers by a constant. We use O() (read big-Oh) to represent that piece. badsearch() has O(X), goodsearch has O(log_2(x)).

Rating these from best to worst, O(1) is best, then O(log(log(X))), O(log(X)), O(X), O(X*log(X)), O(X^2), O(X^3), O(X^n), O(e^X), O(X!). There are other values as well, but these are the most common you will see. Determining which big-Oh a function corresponds to can be challenging at best, insane at worst. Also, log doesn't care what the base is, they are all equivalent.

When looking at Abstract Data Types, we will be looking at different options for how to make them work. What will often happen is there will be a trade-off, where one feature will get a higher big-Oh to give another feature a lower big-Oh.

That brings up an important question: what the heck is an Abstract Data Type anyway? An Abstract Data Type (ADT), is a data structure (like an array) that has specific functionality defined, but no specification of HOW the functionality is provided. Common ADTs are: Stack, Queue, Lists, Strings, Trees, Heaps, Tables, and Graphs.

As an example: a stack stores information like a stack of plates. You can use the "push" function to add data to the top of the stack, and you can use the "pop" function to pull the topmost piece of data from the stack. This is much like the stack of plates where you can keep adding plates on top, but if you want to get to the bottom plate, you have to take all the other plates off it (pop them off, one by one). What makes a stack abstract is the fact that you can choose most ANY method of actually storing the data. An array or a linked list are common examples. We will look at how to implement various ADTs in future tutorials.
__________________
CodeCall Blog | CodeCall Wiki | Shareware
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

Last edited by WingedPanther; 12-24-2008 at 11:18 AM.. Reason: fix goodsearch
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 12-24-2008, 09:50 AM
Jordan's Avatar
Administrator
 
Join Date: Nov 2005
Location: Hendersonville, NC
Posts: 24,556
Jordan is a name known to allJordan is a name known to allJordan is a name known to allJordan is a name known to allJordan is a name known to allJordan is a name known to all
Send a message via ICQ to Jordan Send a message via AIM to Jordan Send a message via MSN to Jordan Send a message via Yahoo to Jordan
Re: Abstract Data Types and Code Efficiency.

Simply brilliant! I believe goodsearch() is bad though. I ran test data through it using an array of 10 (1,2,3....10) and a val of 8. I believe this is incorrect:

Code:
    else if (array[(i+j)/2]<val)
      j=(i+j)/2-1
    else
      i=(i+j)/2+1
  }
Should the -1 and +1 be swapped? However, I may have just not done the example correct as I was running it by hand (not in an actual program).
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 12-24-2008, 11:17 AM
WingedPanther's Avatar
Super Moderator
 
Join Date: Jul 2006
Age: 36
Posts: 11,435
WingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud of
Re: Abstract Data Types and Code Efficiency.

You are half-right... the entire lines should be swapped. Fixed above.
__________________
CodeCall Blog | CodeCall Wiki | Shareware
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 12-24-2008, 02:09 PM
chili5's Avatar
Code Slinger
 
Join Date: Mar 2008
Posts: 7,018
chili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond reputechili5 has a reputation beyond repute
Re: Abstract Data Types and Code Efficiency.

Interesting read.

What does this mean?

Quote:
O(log_2(x))
What search algorithm is that? Perhaps write a tutorial on different search algorithms?
__________________
"Whenever you remember, I'll be there/
Remember how we reached that dream together" - Carrie Underwood
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 12-24-2008, 07:38 PM
Jordan's Avatar
Administrator
 
Join Date: Nov 2005
Location: Hendersonville, NC
Posts: 24,556
Jordan is a name known to allJordan is a name known to allJordan is a name known to allJordan is a name known to allJordan is a name known to allJordan is a name known to all
Send a message via ICQ to Jordan Send a message via AIM to Jordan Send a message via MSN to Jordan Send a message via Yahoo to Jordan
Re: Abstract Data Types and Code Efficiency.

He mentions:

Quote:
We use O() (read big-Oh)
but I'm not sure what it is nor did I do as he suggested thus I agree with chili5, a tutorial would be great. lol
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 12-25-2008, 06:56 AM
WingedPanther's Avatar
Super Moderator
 
Join Date: Jul 2006
Age: 36
Posts: 11,435
WingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud ofWingedPanther has much to be proud of
Re: Abstract Data Types and Code Efficiency.

O(log_2(X)) is the search efficiency for a balanced binary tree. Roughly, here's how the various efficiencies work:

Code:
X         | 1 |   5 |      10 |                   20 |     100 |    1000 |     10000
O(1)      | 1 |   1 |       1 |                    1 |       1 |       1 |         1
O(log(X)) | 0 | .70 |       1 |                 1.30 |       2 |       3 |         4
O(X)      | 1 |   5 |      10 |                   20 |     100 |    1000 |     10000
O(X^2)    | 1 |  25 |     100 |                  400 |   10000 | 1000000 | 100000000
O(X^3)    | 1 | 125 |    1000 |                 8000 | 1000000
O(2^X)    | 1 |  32 |    1024 |              1048576
O(X!)     | 1 | 120 | 3628800 | ~2432902008176640000
If X is the size of the problem (records in a database table, number of elements in an array, lines of text in a file, etc), then the numbers under it are a measure of the processing time (in lines of code, milliseconds, seconds, etc). The purpose isn't to provide precise measurements, but to show how rapidly the problem explodes.
__________________
CodeCall Blog | CodeCall Wiki | Shareware
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply



Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes


Similar Threads
Thread Thread Starter Forum Replies Last Post
Increase Build Process added-value with Static Analysis Kernel News 0 12-14-2008 06:20 PM
Introduction to Form Submission PART I Xav PHP Tutorials 7 10-14-2008 03:09 PM
Some Unexpected Code Dependency Issues Kernel News 0 10-08-2008 06:50 PM


All times are GMT -5. The time now is 11:00 AM.


vBulletin v3.8.0 ©2010, Jelsoft Enterprises Ltd.


no new posts

LinkBacks Enabled by vBSEO 3.1.0