0

Data Structures CS301 book HANDOUTS in pdf
Data Structures (CS301)

Handouts | Lectures | Contents | Books



Data Structures
Lecture No. 01
Reading Material
Data Structures and algorithm analysis in C++ Chapter. 3
3.1, 3.2, 3.2.1
Summary
• Introduction to Data Structures
• Selecting a Data Structure
• Data Structure Philosophy
• Goals of this Course
• Array
• List data structure
Welcome to the course of data structure. This is very important subject as the topics
covered in it will be encountered by you again and again in the future courses. Due to
its great applicability, this is usually called as the foundation course. You have
already studied Introduction to programming using C and C++ and used some data
structures. The focus of that course was on how to carry out programming with the
use of C and C++ languages besides the resolution of different problems. In this
course, we will continue problem solving and see that the organization of data in
some cases is of immense importance. Therefore, the data will be stored in a special
way so that the required result should be calculated as fast as possible.
Following are the goals of this course:
􀂃 Prepare the students for (and is a pre-requisite for) the more advanced material
students will encounter in later courses.
􀂃 Cover well-known data structures such as dynamic arrays, linked lists, stacks,
queues, trees and graphs.
􀂃 Implement data structures in C++
You have already studied the dynamic arrays in the previous course. We will now
discuss linked lists, stacks, queues, trees and graphs and try to resolve the problems
with the help of these data structures. These structures will be implemented in C++
language. We will also do programming assignments to see the usage and importance
of these structures.
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 4 of 505
Information about Data Structure subject is available at: “http://www.vu.edu.pk/ds”.
Introduction to Data Structures
Let’s discuss why we need data structures and what sort of problems can be solved
with their use. Data structures help us to organize the data in the computer, resulting
in more efficient programs. An efficient program executes faster and helps minimize
the usage of resources like memory, disk. Computers are getting more powerful with
the passage of time with the increase in CPU speed in GHz, availability of faster
network and the maximization of disk space. Therefore people have started solving
more and more complex problems. As computer applications are becoming complex,
so there is need for more resources. This does not mean that we should buy a new
computer to make the application execute faster. Our effort should be to ensue that the
solution is achieved with the help of programming, data structures and algorithm.
What does organizing the data mean? It means that the data should be arranged in a
way that it is easily accessible. The data is inside the computer and we want to see it.
We may also perform some calculations on it. Suppose the data contains some
numbers and the programmer wants to calculate the average, standard deviation etc.
May be we have a list of names and want to search a particular name in it. To solve
such problems, data structures and algorithm are used. Sometimes you may realize
that the application is too slow and taking more time. There are chances that it may be
due to the data structure used, not due to the CPU speed and memory. We will see
such examples. In the assignments, you will also check whether the data structure in
the program is beneficial or not. You may have two data structures and try to decide
which one is more suitable for the resolution of the problem.
As discussed earlier, a solution is said to be efficient if it solves the problem within its
resource constraints. What does it mean? In the computer, we have hard disk, memory
and other hardware. Secondly we have time. Suppose you have some program that
solves the problem but takes two months. It will be of no use. Usually, you don’t have
this much time and cannot wait for two months. Suppose the data is too huge to be
stored in disk. Here we have also the problem of resources. This means that we have
to write programs considering the resources to achieve some solution as soon as
possible. There is always cost associated with these resources. We may need a faster
and better CPU which can be purchased. Sometimes, we may need to buy memory.
As long as data structures and programs are concerned, you have to invest your own
time for this. While working in a company, you will be paid for this. All these
requirements including computer, your time and computer time will decide that the
solution you have provided is suitable or not. If its advantages are not obtained, then
either program or computer is not good.
So the purchase of a faster computer, while studying this course, does not necessarily
help us in the resolution of the problem. In the course of “Computer Architecture”
you will see how the more efficient solutions can be prepared with the hardware. In
this course, we will use the software i.e. data structures, algorithms and the recipes
through which the computer problems may be resolved with a faster solution.
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 5 of 505
Selecting a Data Structure
How can we select the data structure needed to solve a problem? You have already
studied where to use array and the size of array and when and where to use the
pointers etc. First of all, we have to analyze the problem to determine the resource
constraints that a solution must meet. Suppose, the data is so huge i.e. in Gega bytes
(GBs) while the disc space available with us is just 200 Mega bytes. This problem can
not be solved with programming. Rather, we will have to buy a new disk.
Secondly, it is necessary to determine the basic operations that must be supported.
Quantify the resource constraints for each operation. What does it mean? Suppose you
have to insert the data in the computer or database and have to search some data item.
Let’s take the example of telephone directory. Suppose there are eight million names
in the directory. Now someone asks you about the name of some particular person.
You want that this query should be answered as soon as possible. You may add or
delete some data. It will be advisable to consider all these operations when you select
some data structure.
Finally select the data structure that meets these requirements the maximum. Without,
sufficient experience, it will be difficult to determine which one is the best data
structure. We can get the help from internet, books or from someone whom you know
for already getting the problems solved. We may find a similar example and try to use
it. After this course, you will be familiar with the data structures and algorithms that
are used to solve the computer problems.
Now you have selected the data structure. Suppose a programmer has inserted some
data and wants to insert more data. This data will be inserted in the beginning of the
existing data, or in the middle or in the end of the data. Let’s talk about the arrays and
suppose you have an array of size hundred. Data may be lying in the first fifty
locations of this array. Now you have to insert data in the start of this array. What will
you do? You have to move the existing data (fifty locations) to the right so that we get
space to insert new data. Other way round, there is no space in the start. Suppose you
have to insert the data at 25th location. For this purpose, it is better to move the data
from 26th to 50th locations; otherwise we will not have space to insert this new data at
25th location.
Now we have to see whether the data can be deleted or not. Suppose you are asked to
delete the data at 27th position. How can we do that? What will we do with the space
created at 27th position?
Thirdly, is all the data processed in some well-defined order or random access
allowed? Again take the example of arrays. We can get the data from 0th position and
traverse the array till its 50th position. Suppose we want to get the data, at first from
50th location and then from 13th. It means that there is no order or sequence. We want
to access the data randomly. Random access means that we can’t say what will be the
next position to get the data or insert the data.
Data Structure Philosophy
Let’s talk about the philosophy of data structure. Each data structure has costs and
benefits. Any data structure used in your program will have some benefits. For this,
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 6 of 505
you have to pay price. That can be computer resources or the time. Also keep in mind
that you are solving this problem for some client. If the program is not efficient, the
client will not buy it.
In rare cases, a data structure may be better than another one in all situations. It means
that you may think that the array is good enough for all the problems. Yet this is not
necessary. In different situations, different data structures will be suitable. Sometimes
you will realize that two different data structures are suitable for the problem. In such
a case, you have to choose the one that is more appropriate. An important skill this
course is going to lend to the students is use the data structure according to the
situation. You will learn the programming in a way that it will be possible to replace
the one data structure with the other one if it does not prove suitable. We will replace
the data structure so that the rest of the program is not affected. You will also have to
attain this skill as a good programmer.
There are three basic things associated with data structures. A data structure requires:
– space for each data item it stores
– time to perform each basic operation
– programming effort
Goals of this Course
Reinforce the concept that costs and benefits exist for every data structure. We will
learn this with practice.
Learn the commonly used data structures. These form a programmer's basic data
structure “toolkit”. In the previous course, you have learned how to form a loop,
functions, use of arrays, classes and how to write programs for different problems. In
this course, you will make use of data structures and have a feeling that there is bag
full of different data structures. In case of some problem, you will get a data structure
from the toolkit and use some suitable data structure.
Understand how to measure the cost of a data structure or program. These techniques
also allow you to judge the merits of new data structures that you or others might
develop. At times, you may have two suitable data structures for some problem. These
can be tried one by one to adjudge which one is better one. How can you decide
which data structure is better than other. Firstly, a programmer can do it by writing
two programs using different data structure while solving the same problem. Now
execute both data structures. One gives the result before the other. The data structure
that gives results first is better than the other one. But sometimes, the data grows too
large in the problem. Suppose we want to solve some problem having names and the
data of names grows to10 lakhs (one million). Now when you run both programs, the
second program runs faster. What does it mean? Is the data structure used in program
one not correct? This is not true. The size of the data, being manipulated in the
program can grow or shrink. You will also see that some data structures are good for
small data while the others may suit to huge data. But the problem is how can we
determine that the data in future will increase or decrease. We should have some way
to take decision in this regard. In this course we will do some mathematical analysis
and see which data structure is better one.
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 7 of 505
Arrays
You have already studied about arrays and are well-versed with the techniques to
utilize these data structures. Here we will discuss how arrays can be used to solve
computer problems. Consider the following program:
main( int argc, char** argv )
{
int x[6];
int j;
for(j = 0; j < 6; j++)
x[j] = 2 * j;
}
We have declared an int array of six elements and initialized it in the loop.
Let’s revise some of the array concepts. The declaration of array is as int x[6]; or float
x[6]; or double x[6]; You have already done these in your programming assignments.
An array is collection of cells of the same type. In the above program, we have array x
of type int of six elements. We can only store integers in this array. We cannot put int
in first location, float in second location and double in third location. What is x? x is a
name of collection of items. Its individual items are numbered from zero to one less
than array size. To access a cell, use the array name and an index as under:
x[0], x[1], x[2], x[3], x[4], x[5]
To manipulate the first element, we will use the index zero as x[0] and so on. The
arrays look like in the memory as follows:
Array occupies contiguous memory area in the computer. In case of the above
example, if some location is assigned to x[0], the next location can not contain data
other than x[1]. The computer memory can be thought of as an array. It is a very big
array. Suppose a computer has memory of 2MB, you can think it as an array of size 2
million and the size of each item is 32 bits. You will study in detail about it in the
computer organization, and Assembly language courses. In this array, we will put our
programs, data and other things.
X[0]
X[1]
X[2]
X[3]
X[4]
X[5]
Array cells are
contiguous in
computer memory
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 8 of 505
In the above program, we have declared an array named x. ‘x’ is an array’s name but
there is no variable x. ‘x’ is not an lvalue. If some variable can be written on the lefthand
side of an assignment statement, this is lvalue variable. It means it has some
memory associated with it and some value can be assigned to it. For example, if we
have the code int a, b; it can be written as b = 2; it means that put 2 in the memory
location named b. We can also write as a = b; it means whatever b has assign it to a,
that is a copy operation. If we write as a = 5; it means put the number 5 in the
memory location which is named as a. But we cannot write 2 = a; that is to put at
number 2 what ever the value of a is. Why can’t we do that? Number 2 is a constant.
If we allow assignment to constants what will happen? Suppose ‘a’ has the value
number 3. Now we assigned number 2 the number 3 i.e. all the number 2 will become
number 3 and the result of 2 + 2 will become 6. Therefore it is not allowed.
‘x’ is a name of array and not an lvalue. So it cannot be used on the left hand side in
an assignment statement. Consider the following statements
int x[6];
int n;
x[0] = 5; x[1] = 2;
x = 3; //not allowed
x = a + b; // not allowed
x = &n; // not allowed
In the above code snippet, we have declared an array x of int. Now we can assign
values to the elements of x as x[0] = 5 or x[1] = 2 and so on. The last three statements
are not allowed. What does the statement x = 3; mean? As x is a name of array and
this statement is not clear, what we are trying to do here? Are we trying to assign 3 to
each element of the array? This statement is not clear. Resultantly, it can not be
allowed. The statement x = a + b is also not allowed. There is nothing wrong with a
+ b. But we cannot assign the sum of values of a and b to x. In the statement x = &n,
we are trying to assign the memory address of n to x which is not allowed. The reason
is the name x is not lvalue and we cannot assign any value to it. For understanding
purposes, consider x as a constant. Its name or memory location can not be changed.
This is a collective name for six locations. We can access these locations as x[0], x[1]
up to x[5]. This is the way arrays are manipulated.
Sometimes, you would like to use an array data structure but may lack the information
about the size of the array at compile time. Take the example of telephone directory.
You have to store one lakh (100,000) names in an array. But you never know that the
number of entries may get double or decline in future. Similarly, you can not say that
the total population of the country is one crore (10 million) and declare an array of
one crore names. You can use one lakh locations now and remaining will be used as
the need arrives. But this is not a good way of using the computer resources. You
have declared a very big array while using a very small chunk of it. Thus the
remaining space goes waste which can, otherwise, be used by some other programs.
We will see what can be the possible solution of this problem?
Suppose you need an integer array of size n after the execution of the program. We
have studied that if it is known at the execution of the program that an array of size 20
or 30 is needed, it is allocated dynamically. The programming statement is as follows:
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 9 of 505
int* y = new int[20];
It means we are requesting computer to find twenty memory locations. On finding it,
the computer will give the address of first location to the programmer which will be
stored in y. Arrays locations are contiguous i.e. these are adjacent. These twenty
locations will be contiguous, meaning that they will be neighbors to each other. Now
y has become an array and we can say y[0] =1 or y[5] = 15. Here y is an lvalue.
Being a pointer, it is a variable where we can store the address of some variable.
When we said int* y = new int[20]; the new returns the memory address of first of
the twenty locations and we store that address into y. As y is a pointer variable, so it
can be used on the left-hand side. We can write it as:
y = &x[0];
In the above statement, we get the address of the fist location of the array x and store
it in y. As y is lvalue, so it can be used on left hand side. This means that the above
statement is correct.
y = x;
Similarly, the statement y = x is also correct. x is an array of six elements that holds
the address of the first element. But we cannot change this address. However we can
get that address and store it in some other variable. As y is a pointer variable and
lvalue so the above operation is legal. We have dynamically allocated the memory for
the array. This memory, after the use, can be released so that other programs can use
it. We can use the delete keyword to release the memory. The syntax is:
delete[ ] y;
We are releasing the memory, making it available for use by other programs. We will
not do it in case of x array, as ‘new’ was not used for its creation. So it is not our
responsibility to delete x.
List data structure
This is a new data structure for you. The List data structure is among the most generic
of data structures. In daily life, we use shopping list, groceries list, list of people to
invite to a dinner, list of presents to give etc. In this course, we will see how we use
lists in programming.
A list is the collection of items of the same type (grocery items, integers, names). The
data in arrays are also of same type. When we say int x[6]; it means that only the
integers can be stored in it. The same is true for list. The data which we store in list
should be of same nature. The items, or elements of the list, are stored in some
particular order. What does this mean? Suppose in the list, you have the fruit first
which are also in some order. You may have names in some alphabetical order i.e. the
names which starts with A should come first followed by the name starting with B and
so on. The order will be reserved when you enter data in the list.
It is possible to insert new elements at various positions in the list and remove any
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 10 of 505
element of the list. You have done the same thing while dealing with arrays. You
enter the data in the array, delete data from the array. Sometimes the array size grows
and at times, it is reduced. We will do this with the lists too.
List is a set of elements in a linear order. Suppose we have four names a1, a2, a3, a4
and their order is as (a3, a1, a2, a4) i.e. a3, is the first element, a1 is the second
element, and so on. We want to maintain that order in the list when data is stored in
the list. We don’t want to disturb this order. The order is important here; this is not
just a random collection of elements but an ordered one. Sometimes, this order is due
to sorting i.e. the things that start with A come first. At occasions, the order may be
due to the importance of the data items. We will discuss this in detail while dealing
with the examples.
Now we will see what kind of operations a programmer performs with a list data
structure. Following long list of operations may help you understand the things in a
comprehensive manner.
Operation Name Description
createList() Create a new list (presumably empty)
copy() Set one list to be a copy of another
clear(); Clear a list (remove all elements)
insert(X, ?) Insert element X at a particular position in the list
remove(?) Remove element at some position in the list
get(?) Get element at a given position
update(X, ?) Replace the element at a given position with X
find(X) Determine if the element X is in the list
length() Returns the length of the list.
createList() is a function which creates a new list. For example to create an array, we
use int x[6] or int* y = new int[20]; we need similar functionality in lists too. The
copy() function will create a copy of a list. The function clear() will remove all the
elements from a list. We want to insert a new element in the list, we also have to tell
where to put it in the list. For this purpose insert(X, position) function is used.
Similarly the function remove(position) will remove the element at position. To get an
element from the list get(position) function is used which will return the element at
position. To replace an element in the list at some position the function update(X,
position) is used. The function find(X) will search X in the list. The function length()
tells us about the number of elements in the list.
We need to know what is meant by “particular position” we have used “?” for this in
the above table. There are two possibilities:
􀂃 Use the actual index of element: i.e. insert it after element 3, get element
number 6. This approach is used with arrays
􀂃 Use a “current” marker or pointer to refer to a particular position in the list.
The first option is used in the data structures like arrays. When we have to manipulate
the arrays, we use index like x[3], x[6]. In the second option we do not use first,
second etc for position but say wherever is the current pointer. Just think of a pointer
in the list that we can move forward or backward. When we say get, insert or update
CS301 – Data Structures Lecture No. 01
___________________________________________________________________
Page 11 of 505
while using the current pointer, it means that wherever is the current pointer, get data
from that position, insert data after that position or update the data at that position. In
this case, we need not to use numbers. But it is our responsibility that current pointer
is used in a proper way.
If we use the “current” marker, the following four methods would be useful:
Functions Description
start() Moves the “current” pointer to the very first element
tail() Moves the “current” pointer to the very last element
next() Move the current position forward one element
back() Move the current position backward one element
In the next lecture, we will discuss the implementation of the list data structure and
write the functions discussed today, in C++ language.
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 12 of 505
Data Structures
Lecture No. 02
Reading Material
Data Structures and Algorithm Analysis in C++ Chapter. 3
3.1, 3.2, 3.2.1, 3.2.2
Summary
1) List Implementation
• add Method
• next Method
• remove Method
• find Method
• Other Methods
2) Analysis Of Array List
3) List Using Linked Memory
4) Linked List
Today, we will discuss the concept of list operations. You may have a fair idea of ‘
start operation’ that sets the current pointer to the first element of the list while the
tail operation moves the current pointer to the last element of the list. In the previous
lecture, we discussed the operation next that moves the current pointer one element
forward. Similarly, there is the ‘back operation’ which moves the current pointer one
element backward.
List Implementation
Now we will see what the implementation of the list is and how one can create a list
in C++. After designing the interface for the list, it is advisable to know how to
implement that interface. Suppose we want to create a list of integers. For this
purpose, the methods of the list can be implemented with the use of an array inside.
For example, the list of integers (2, 6, 8, 7, 1) can be represented in the following
manner where the current position is 3.
A 2 6 8 7 1 current size
1 2 3 4 5 3 5
In this case, we start the index of the array from 1 just for simplification against the
usual practice in which the index of an array starts from zero in C++. It is not
necessary to always start the indexing from zero. Sometimes, it is required to start the
indexing from 1. For this, we leave the zeroth position and start using the array from
index 1 that is actually the second position. Suppose we have to store the numbers
from 1 to 6 in the array. We take an array of 7 elements and put the numbers from the
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 13 of 505
index 1. Thus there is a correspondence between index and the numbers stored in it.
This is not very useful. So, it does not justify the non-use of zeroth position of the
array out-rightly. However for simplification purposes, it is good to use the index
from 1.
add Method
Now we will talk about adding an element to the list. Suppose there is a call to add an
element in the list i.e. add(9). As we said earlier that the current position is 3, so by
adding the element 9 to the list, the new list will be (2, 6, 8, 9, 7, 1).
To add the new element (9) to the list at the current position, at first, we have to make
space for this element. For this purpose, we shift every element on the right of 8 (the
current position) to one place on the right. Thus after creating the space for new
element at position 4, the array can be represented as
A 2 6 8 7 1 current size
1 2 3 4 5 3 5
Now in the second step, we put the element 9 at the empty space i.e. position 4. Thus
the array will attain the following shape. The figure shows the elements in the array in
the same order as stored in the list.
A 2 6 8 9 7 1 current size
1 2 3 4 5 6 4 6
We have moved the current position to 4 while increasing the size to 6. The size
shows that the elements in the list. Where as the size of the array is different that we
have defined already to a fixed length, which may be 100, 200 or even greater.
next Method
Now let’s see another method, called ‘next’. We have talked that the next method
moves the current position one position forward. In this method, we do not add a new
element to the list but simply move the pointer one element ahead. This method is
required while employing the list in our program and manipulating it according to the
requirement. There is also an array to store the list in it. We also have two variablescurrent
and size to store the position of current pointer and the number of elements in
the list. By looking on the values of these variables, we can find the state of the list
i.e. how many elements are in the list and at what position the current pointer is.
The method next is used to know about the boundary conditions of the list i.e. the
array being used by us to implement the list. To understand the boundary conditions,
we can take the example of an array of size 100 to implement the list. Here, 100
elements are added to the array. Let’s see what happens when we want to add 101st
element to the array? We used to move the current position by next method and
reached the 100th position. Now, in case of moving the pointer to the next position
(i.e. 101st), there will be an error as the size of the array is 100, having no position
after this point. Similarly if we move the pointer backward and reach at the first
position regardless that the index is 0 or 1. But what will happen if we want to move
backward from the first position? These situations are known as boundary conditions
and need attention during the process of writing programs when we write the code to
use the list. We will take care of these things while implementing the list in C++
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 14 of 505
programs.
remove Method
We have seen that the add method adds an element in the list. Now we are going to
discuss the remove method. The remove method removes the element residing at the
current position. The removal of the element will be carried out as follows. Suppose
there are 6 elements (2, 6, 8, 9, 7, 1) in the list. The current pointer is pointing to the
position 5 that has the value 7. We remove the element, making the current position
empty. The size of the list will become 5. This is represented in the following figure.
A 2 6 8 9 1 current size
1 2 3 4 5 6 5 6
5
We fill in the blank position left by the removal of 7 by shifting the values on the right
of position 5 to the left by one space. This means that we shift the remaining elements
on the right hand side of the current position one place to the left so that the element
next to the removed element (i.e. 1) takes its place (the fifth position) and becomes
the current position element. We do not change the current pointer that is still pointing
to the position 5. Thus the current pointer remains pointing to the position 5 despite
the fact that there is now element 1 at this place instead of 7. Thus in the remove
method, when we remove an element, the element next to it on the right hand side
comes at its place and the remaining are also shifted one place to the right. This step is
represented by the following figure.
A 2 6 8 9 1 current size
1 2 3 4 5 5 5
find Method
Now lets talk about a function, used to find a specific element in the array. The find
(x) function is used to find a specific element in the array. We pass the element, which
is to be found, as an argument to the find function. This function then traverses the
array until the specific element is found. If the element is found, this function sets the
current position to it and returns 1 i.e. true. On the other hand, if the element is not
found, the function returns 0 i.e. false. This indicates that the element was not found.
Following is the code of this find(x) function in C++.
int find (int x)
{
int j ;
for (j = 1; j < size + 1; j++ )
if (A[j] == x )
break ;
if ( j < size + 1) // x is found
{
current = j ; //current points to the position where x found
return 1 ; // return true
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 15 of 505
}
return 0 ; //return false, x is not found
}
In the above code, we execute a for loop to traverse the array. The number of
execution of this loop is equal to the size of the list. This for loop gets terminated
when the value of loop variable (j) increases from the size of the list. However we
terminate the loop with the break statement if the element is found at a position.
When the control comes out from the loop, we check the value of j. If the value of j is
less than the size of the array, it means that the loop was terminated by the break
statement. We use the break statement when we find the required element (x) in the
list. The execution of break statement shows that the required element was found at
the position equal to the value of j. So the program sets the current position to j and
comes out the function by returning 1 (i.e. true). If the value of j is greater than the
size of the array, it means that the whole array has traversed and the required element
is not found. So we simply return 0 (i.e. false) and come out of the function.
Other Methods
There are some other methods to implement the list using an array. These methods are
very simple, which perform their task just in one step (i.e. in one statement). There is
a get() method , used to get the element from the current position in the array. The
syntax of this function is of one line and is as under
return A[current] ;
This statement returns the element to which the current is pointing to (i.e. the current
position) in the list A.
Another function is update(x). This method is used to change (set) the value at the
current position. A value is passed to this method as an argument. It puts that value at
the current position. The following statement in this method carries out this process.
A [current] = x ;
Then there is a method length( ).This method returns the size of the list. The syntax of
this method is
return size ;
You may notice here that we are returning the size of the list and not the size of the
array being used internally to implement the list. This size is the number of the
elements of the list, stored in the array.
The back() method decreases the value of variable current by 1. In other words, it
moves the current position one element backward. This is done by writing the
statement.
current -- ;
The -- is a decrement operator in C++ that decreases the value of the operand by one.
The above statement can also be written as
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 16 of 505
current = current -1 ;
The start() method sets the current position to the first element of the list. We know
that the index of the array starts from 0 but we use the index 1 for the starting
position. We do not use the index zero. So we set the current position to the first
element by writing
current = 1 ;
Similarly, the end() method sets the current position to the last element of the list i.e.
size. So we write
current = size ;
Analysis of Array List
Now we analyze the implementation of the list while using an array internally. We
analyze different methods used for the implementation of the list. We try to see the
level upto which these are efficient in terms of CPU’s time consumption. Time is the
major factor to see the efficiency of a program.
Add
First of all, we have talked about the add method. When we add an element to the list,
every element is moved to the right of the current position to make space for the new
element. So, if the current position is the start of the list and we want to add an
element in the beginning, we have to shift all the elements of the list to the right one
place. This is the worst case of adding an element to the list. Suppose if the size of the
list is 10000 or 20000, we have to do the shift operation for all of these 10000 or
20000 elements. Normally, it is done by shifting of elements with the use of a for
loop. This operation takes much time of the CPU and thus it is not a good practice to
add an element at the beginning of a list. On the other hand, if we add an element at
the end of the list, it can be done by carrying out ‘no shift operation’. It is the best
case of adding an element to the list. However, normally we may have to move half of
the elements. The usage of add method is the matter warranting special care at the
time of implementation of the list in our program. To provide the interface of the list,
we just define these methods.
Remove
When we remove an element at the current position in the list, its space gets empty.
The current pointer remains at the same position. To fill this space, we shift the
elements on the right of this empty space one place to the left. If we remove an
element from the beginning of the list, then we have to shift the entire remaining
elements one place to the left. Suppose there is a large number of elements, say 10000
or 20000, in the list. We remove the first element from the list. Now to fill this space,
the remaining elements are shifted (that is a large number). Shifting such a large
number of elements is time consuming process. The CPU takes time to execute the for
loop that performs this shift operation. Thus to remove an element at the beginning of
the list is the worst case of remove method. However it is very easy to remove an
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 17 of 505
element at the end of the list. In average cases of the remove method we expect to
shift half of the elements. This average does not mean that in most of the cases, you
will have to shift half the elements. It is just the average. We may have to shift all the
elements in one operation (if we remove at the beginning) and in the second
operation, we have to shift no element (if we remove at the end). Similarly, in certain
operations, we have to shift just 10, 15 elements.
Find
We have discussed that the find method takes an element and traverses the list to find
that element. The worst case of the find method is that it has to search the entire list
from beginning to end. So, it finds the element at the end of the array or the element is
not found. On average the find method searches at most half the list.
The other methods get (), length () etc are one-step methods. They carry out their
operation in one instruction. There is no need of any loop or other programming
structures to perform the task. The get() method gets the value from the specified
position just in one step. Similarly the update() method sets a value at the specific
position just in one-step. The length () method returns the value of the size of the list.
The other methods back(), start() and end() also perform their tasks only in one step.
List using Linked Memory
We have seen the implementation of the list with the use of an array. Now we will
discuss the implementation of the list while using linked memory. In an array, the
memory cells of the array are linked with each other. It means that the memory of the
array is contiguous. In an array, it is impossible that one element of the array is
located at a memory location while the other element is located somewhere far from it
in the memory. It is located in very next location in the memory. It is a property of the
array that its elements are placed together with one another in the memory. Moreover,
when we have declared the size of the array, it is not possible to increase or decrease
it during the execution of the program. If we need more elements to store in the array,
there is need of changing its size in the declaration. We have to compile the program
again before executing it. Now array will be of the new size. But what happens if we
again need to store more elements? We will change the code of our program to
change the declaration of the array while recompiling it.
Suppose we have used the dynamic memory allocation and created an array of 100
elements with the use of new operator. In case of need of 200 elements, we will
release this array and allocate a new array of 200 elements. Before releasing the
previous array, it will wise to copy its elements to the new array so that it does not
lose any information. Now this new array is in ‘ready for use’ position. Thus the
procedure of creating a new array is not an easy task.
To avoid such problems, usually faced by the programmers while using an array,
there is need of using linked memory in which the various cells of memory, are not
located continuously. In this process, each cell of the memory not only contains the
value of the element but also the information where the next element of the list is
residing in the memory. It is not necessary that the next element is at the next location
in the memory. It may be anywhere in the memory. We have to keep a track of it.
Thus, in this way, the first element must explicitly have the information about the
location of the second element. Similarly, the second element must know where the
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 18 of 505
third element is located and the third should know the position of the fourth element
and so on. Thus, each cell (space) of the list will provide the value of the element
along with the information about where the next element is in the memory. This
information of the next element is accomplished by holding the memory address of
the next element. The memory address can be understood as the index of the array. As
in case of an array, we can access an element in the array by its index. Similarly, we
can access a memory location by using its address, normally called memory address.
Linked List
For the utilization of the concept of linked memory, we usually define a structure,
called linked list. To form a linked list, at first, we define a node. A node comprises
two fields. i.e. the object field that holds the actual list element and the next that holds
the starting location of the next node.
object next
A chain of these nodes forms a linked list. Now let’s consider our previous list, used
with an array i.e. 2, 6, 8, 7, 1. Following is the figure which represents the list stored
as a linked list.
Head
2 6 8 7 1 size = 5
current
This diagram just represents the linked list. In the memory, different nodes may occur
at different locations but the next part of each node contains the address of the next
node. Thus it forms a chain of nodes which we call a linked list.
While using an array we knew that the array started from index 1that means the first
element of the list is at index 1. Similarly in the linked list we need to know the
starting point of the list. For this purpose, we have a pointer head that points to the
first node of the list. If we don’t use head, it will not be possible to know the starting
position of the list. We also have a pointer current to point to the current node of the
list. We need this pointer to add or remove current node from the list. Here in the
linked list, the current is a pointer and not an index as we used while using an array.
The next field of the last node points to nothing .It is the end of the list. We place the
memory address NULL in the last node. NULL is an invalid address and is
inaccessible.
Now again consider the list 2, 6, 8, 7, 1. The previous figure represents this list as a
linked list. In this linked list, the head points to 2, 2 points to 6, 6 points to 8, 8 points
to 7 and 7 points to 1. Moreover we have the current position at element 8.
This linked list is stored in the memory. The following diagram depicts the process
CS301 – Data Structures Lecture No. 02
___________________________________________________________________
Page 19 of 505
through which this linked list is stored in the memory.
1051 6
1052 1063
current 1053
1054 2
1055 1051
1056
1057 7
1058 1060
1059
1060 1
1061 0
head 1062 1054
1063 8
1064 1057
1065
We can see in the figure that each memory location has an address. Normally in
programming, we access the memory locations by some variable names. These
variable names are alias for these locations and are like labels that are put to these
memory locations. We use head and current variable names instead of using the
memory address in numbers for starting and the current nodes. In the figure, we see
that head is the name of the memory location 1062 and the name current is used for
the memory address 1053. The head holds the address 1054 and the element 2, the
first one in the list, is stored in the location 1054. Similarly current holds the address
1063 where the element 8 is stored which is our current position in the list. In the
diagram, two memory locations comprise a node. So we see that the location 1054
holds the element 2 while the next location 1055 holds the address of the memory
location (1051) where the next element of the list (i.e. 6) is stored. Similarly the next
part of the node that has value 6 holds the memory address of the location occupied
by the next element (i.e. 8) of the list. The other nodes are structured in a similar
fashion. Thus, by knowing the address of the next element we can traverse the whole
list.
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 20 of 505
Data Structures
Lecture No. 03
Reading Material
Data Structures and algorithm analysis in C++ Chapter. 3
3.2.2, 3.2.3, 3.2.5
Summary
• Linked List inside Computer Memory
• Linked List Operations
• Linked List Using C++
• Example Program
In the previous lectures, we used an array to construct a list data structure and
observed the limitation that array being of fixed size can only store a fixed number of
elements. Therefore, no more elements can be stored after the size of the array is
reached.
In order to resolve this, we adopted a new data structure called linked list. We started
discussing, how linked lists are stored in computer memory and how memory chains
are formed.
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 21 of 505
Linked List inside Computer Memory
Fig 1. Linked list in memory
There are two parts of this figure. On the left is the linked list chain that is actually the
conceptual view of the linked list and on the right is the linked list inside the computer
memory. Right part is a snapshot of the computer memory with memory addresses
from 1051 to 1065. The head pointer is pointing to the first element in the linked list.
Note that head itself is present in the memory at address 1062. It is actually a pointer
containing the memory address 1054. Each node in the above linked list has two parts
i.e. the data part of the node and the pointer to the next node. The first node of the
linked list pointed by the head pointer is stored at memory address 1054. We can see
the data element 2 present at that address. The second part of the first node contains
the memory address 1051. So the second linked list’s node starts at memory address
1051. We can use its pointer part to reach the third node of the list and in this way, we
can traverse the whole list. The last node contains 1 in its data part and 0 in its pointer
part. 0 indicates that it is not pointing to any node and it is the last node of the linked
list.
Linked List Operations
The linked list data structure provides operations to work on the nodes inside the list.
The first operation we are going to discuss here is to create a new node in the
memory. The Add(9) is used to create a new node in the memory at the current
1051 6
1052 1063
1053 1063
1054 2
1055 1051
1056
1057 7
1058 1060
1059
1060 1
1061 0
1062 1054
1063 8
1064 1057
1065
currentent
headent
head
2 6 8 7 1
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 22 of 505
position to hold ‘9’. You must remember while working with arrays, to add an
element at the current position that all the elements after the current position were
shifted to the right and then the element was added to the empty slot.
Here, we are talking about the internal representation of the list using linked list. Its
interface will remain the same as in case of arrays.
We can create a new node in the following manner in the add() operation of the linked
list with code in C++:
Node * newNode = new Node(9);
The first part of the statement that is on the left of the assignment is declaring a
variable pointer of type Node. It may also be written as Node * newNode. On the
right of this statement, the new operator is used to create a new Node object as new
Node(9). This is one way in C++ to create objects of classes. The name of the class is
provided with the new operator that causes the constructor of the class to be called.
The constructor of a class has the same name as the class and as this a function,
parameters can also be passed to it. In this case, the constructor of the Node class is
called and ‘9’ is passed to it as an int parameter.
Hence, the whole statement means:
“Call the constructor of the Node class and pass it ‘9’ as a parameter. After
constructing the object in memory, give me the starting memory address of the object.
That address will be stored in the pointer variable newNode.”
To create an object of int type in the same manner, we can write as:
int * i = new int ;
Previously, we used the same technique to allocate memory for an array of ints as:
int * i = new int [10] ;
Now after the node has been created, how the node is fit into the chain of the linked
list.
Fig 2. Insertion of new Node into the linked list
In the above figure, there is a linked list that contains five nodes with data elements as
2, 6, 8, 7 and 1. The current pointer is pointing to the node with element as 8. We
want to insert a new node with data element 9. This new node will be inserted at the
current position (the position where the current pointer is pointing to). This insertion
2
6
8
7
1
9
current 1
newNode
2
3
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 23 of 505
operation is performed in a step by step fashion.
- The first step is to point next pointer of the new node (with data element as 9) to
the node with data element as 7.
- The second step is to point the next pointer of the node with data element 8 to the
node the new node with data element 9.
- The third step is to change the current pointer to point to the new node.
Now, the updated linked list has nodes with data elements as 2, 6, 8, 9, 7 and 1. The
list size has become 6.
Linked List Using C++
/* The Node class */
class Node
{
public:
int get() { return object; };
void set(int object) { this->object = object; };
Node * getNext() { return nextNode; };
void setNext(Node * nextNode) { this->nextNode = nextNode; };
private:
int object;
Node * nextNode;
};
Whenever, we write a class, it begins with the word class followed by the class-name
and the body of the class enclosed within curly braces. In the body, we write its public
variables, methods and then private variables and methods, this is normally the
sequence.
If there is no code to write inside the constructor function of a class, we need not to
declare it ourselves as the compiler automatically creates a default constructor for us.
Similarly, if there is nothing to be done by the destructor of the class, it will be better
not to write it explicitly. Rather, the compiler writes it automatically. Remember, the
default constructor and destructor do nothing as these are the function without any
code statements inside.
Let’s start with the data members first. These are given at the bottom of the class body
with the scope mentioned as private. These data members are actually two parts of a
linked list’s node. First variable is object of type int, present there to store the data
part of the node. The second variable is nextNode, which is a pointer to an object of
type Node. It has the address of the next element of the linked list.
The very first public function given at the top is get(). We have written its code within
the class Node. It returns back the value of the variable object i.e. of the type of int.
When we write class in C++, normally, we make two files (.h and .cpp) for a class.
The .h file contains the declarations of public and private members of that class. The
public methods are essentially the interface of the class to be employed by the users of
this class. The .cpp file contains the implementation for the class methods that has the
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 24 of 505
actual code. This is usually the way that we write two files for one class. But this is
not mandatory. In the code given above, we have only one file .cpp, instead of
separating into two files. As the class methods are very small, so their code is written
within the body of the class. This facilitates us in carrying on discussion. Thus instead
of talking about two files, we will only refer to one file. On the other hand, compiler
takes these functions differently that are called inline functions. The compiler replaces
the code of these inline functions wherever the call to them is made.
The second method in the above-mentioned class is set() that accepts a parameter of
type int while returning back nothing. The accepted parameter is assigned to the
internal data member object. Notice the use of this pointer while assigning the value
to the internal data member. It is used whenever an object wants to talk to its own
members.
The next method is getNext() which returns a pointer to an object of type Node lying
somewhere in the memory. It returns nextNode i.e. a pointer to an object of type
Node. As discussed above, nextNode contains the address of next node in the linked
list.
The last method of the class is setNext() that accepts a pointer of type Node, further
assigned to nextNode data member of the object. This method is used to connect the
next node of the linked list with the current object. It is passed an address of the next
node in the linked list.
Let’s discuss a little bit about classes. A very good analogy of a class is a factory.
Think about a car factory. On the placement of order, it provides us with the number
of vehicles we ordered for. Similarly, you can see number of other factories in your
daily-life that manufacture the specific products.
Let’s take this analogy in C++ language. Suppose, we want to make a factory in C++.
By the way, what is our Node class? It is actually a factory that creates nodes. When
we want to make a new node, a new operator is used. By using new operator with the
Node class, actually, we send an order to Node factory, to make as many as nodes for
us.
So we have a good analogy, to think about a class as a factory. The products that are
made by the factory have their own characteristics. For example, a car made by an
automobile factory has an engine, wheels, steering and seats etc. These variables
inside a class are called state variables. Now the kinds of operations this car can do
are the methods of its class. A car can be driven, engine can be started, gears can be
shifted and an accelerator can be pressed to run it faster.
Similarly, the Node class creates nodes, where every node has two-state variables i.e.
object and nextNode. We have already seen its operations in the above code. We use
new to create new object or an array of new objects, stored in memory.
Let’s see the code below.
/* List class */
#include <stdlib.h>
#include "Node.cpp"
class List
{
public:
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 25 of 505
// Constructor
List() {
headNode = new Node();
headNode->setNext(NULL);
currentNode = NULL;
size = 0;
}
We are creating a list factory here employed to create list objects. Remember the list
operations; add, remove, next, back and start etc. Let’s see the above class declaration
code in detail.
There are two include statements at the start. The first line is to include a standard
library stdlib.h while the second line includes the Node class file Node.cpp. This Node
class is used to create nodes that form a List object. So this List factory will order
Node class to create new nodes. The List class itself carries out the chain management
of these Node objects.
We have written our own constructor of List class as the default constructor is not
sufficient enough to serve the purpose. The List constructor is parameterless. The very
first step it is doing internally is that it is asking Node class to create a new node and
assigning the starting address of the new Node’s object to the headNode data member.
In the second statement, we are calling setNext() method of the Node class for the
object pointed to by the headNode pointer. This call is to set the nextNode data
member to NULL, i.e. Node’s object pointed to by the headNode pointer is not
pointing to any further Node. The next statement is to set the currentNode pointer to
NULL. So at the moment, we have initialized the currentNode pointer to NULL that is
not pointing to any Node object. The next statement is to initialize the size data
member to 0 indicating that there is no node present in the list. All this processing is
done inside the constructor of List class, as we want all this done when a list object is
created. Considering the analogy of car factory, the constructor function can perform
certain tasks: The oil is poured into the engine, the tyres are filled-in with air etc.
Let’s see the add method of the List class:
/* add() class method */
void add (int addObject)
{
1. Node * newNode = new Node();
2. newNode->set(addObject);
3. if( currentNode != NULL )
4. {
5. newNode->setNext(currentNode->getNext());
6. currentNode->setNext( newNode );
7. lastCurrentNode = currentNode;
8. currentNode = newNode;
9. }
10. else
11. {
12. newNode->setNext(NULL);
13. headNode->setNext(newNode);
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 26 of 505
14. lastCurrentNode = headNode;
15. currentNode = newNode;
16. }
17. size ++;
}
The interface or signatures of add() method is similar to the one discussed in case of
an array. This method takes the object to be added as a parameter. The
implementation of this add() method is a bit longer as the method is being
implemented for linked list. In the first statement, a new Node object is created with
its address stored in the newNode pointer variable. The second statement is to call
set() method of the Node object pointed to by the newNode pointer. You can note the
way the method is called. A pointer variable is at the left most side then an arrow sign
(->), then the name of the method with appropriate arguments within parenthesis. It is
followed by the if-statement that checks the currentNode is not NULL to perform
certain operations inside the if-code block. Inside the if-statement, at line 5, the
nextNode pointer of the new node is being set to the nextNode of the object pointed to
by the currentNode pointer. In order to understand the statements given in this code
properly, consider the fig 2 above, where we added a node in the linked list. We have
done step 1 at line5. At line 6, we are performing the second step by setting the
newNode in the nextNode pointer of the object pointed to by the currentNode. At line
7, we are saving the current position (address) of the currentNode pointer in the
pointer variable lastCurrentNode, which might be useful for backward traversing.
Although, the fig 1 (left part) indicates movement in one direction from left to right
but the lastCurrentNode pointer node can be used by the back() member function to
traverse one position back from right to left. At line 8, the currentNode pointer is
assigned the address of the object pointed to by newNode. This way, a new node is
added in already existent linked list.
Line 10 is start of the else part of if-statement. This is executed if the currentNode is
NULL. It means that there is no node present in the list previously and first node is
going to be added. At line 12, we are setting the nextNode pointer of the object
pointed to by newNode pointer. The nextNode is being set to NULL by calling the
setNext() method. Then at line 13, we point the head pointer (headNode) to this new
node pointed to by newNode pointer. Note that headNode is pointing to a node that is
there despite the fact that the size of the linked list is 0. Actually, we have allocated a
Node object for headNode pointer. Although, we don’t need a Node object here, yet it
will be helpful when we perform other operations like remove() and find().
At line 14, the headNode address is being assigned to lastCurrentNode. At line 15,
currentNode pointer is assigned the address of newNode. At the end i.e. at line 17, the
size of the list is incremented by 1.
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 27 of 505
Fig 3. Add operation of linked list
Following is the crux of this add() operation :
Firstly, it will make a new node by calling Node class constructor. Insert the value
e.g. 2. of the node into the node by calling the set method. Now if the list already
exists (has some elements inside or its size is non-zero), it will insert the node after
the current position. If the list does not already exist, this node is added as the first
element inside the list.
Let’s try to add few more elements into the above linked list in the figure. The
following are the lines of code to be executed to add nodes with values 8, 7 and 1 into
the linked list.
list.add(8); list.add(7); list.add(1);
Fig 4. More Nodes added into linked list
Now we will see the remaining methods of the linked list. The get() method of the
List class is given below
List list; headNode size = 0
headNode
currentNode
size = 1
lastCurrentNode
headNodee 2
currentNode
size = 2
lastCurrentNode
list.add(2);
list.add(6);
2
6
2 6 8 7 1
headNode lastCurrentNode
currentNode
size = 5
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 28 of 505
/* get() class method */
int get()
{
if (currentNode != NULL)
return currentNode->get();
}
This method firstly confirms that the currentNode pointer is not NULL. If it is not
NULL, then it must be pointing to some Node object as inside the constructor of the
List class, we have initialized this pointer variable to NULL. That indicates that the
currentNode is NULL when there is no element inside the list. However, when a Node
object is added into it, it starts pointing to it. So, this get() returns the address of the
node pointed to by the currentNode pointer.
Further, we have another method given below:
/* next() class method */
bool next()
{
1. if (currentNode == NULL) return false;
2.
3. lastCurrentNode = currentNode;
4. currentNode = currentNode->getNext();
5. return true;
};
This is next() method, used to advance the currentNode pointer to the next node inside
the linked list. At line 1, the currentNode is being checked to confirm that there are
some elements present in the list to advance further. At line 1, the method is returning
false if there is no element present in the list. At line 3, it is storing the value of the
currentNode pointer into the lastCurrentNode. At line 4, currentNode is calling the
getNext() method to get the address of next node to be stored in the currentNode
pointer to advance the currentNode pointer to the next element. At line 5, it returns
true indicating the method is successful in moving to the next node.
Example Program
Given below is the full source code of the example program. You can copy, paste and
compile it right away. In order to understand the linked list concept fully, it is highly
desirable that you understand and practice with the below code.
#include <iostream.h>
#include <stdlib.h>
/* The Node class */
class Node
{
public:
int get() { return object; };
void set(int object) { this->object = object; };
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 29 of 505
Node * getNext() { return nextNode; };
void setNext(Node * nextNode) { this->nextNode = nextNode; };
private:
int object;
Node * nextNode;
};
/* The List class */
class List
{
public:
List();
void add (int addObject);
int get();
bool next();
friend void traverse(List list);
friend List addNodes();
private:
int size;
Node * headNode;
Node * currentNode;
Node * lastCurrentNode;
};
/* Constructor */
List::List()
{
headNode = new Node();
headNode->setNext(NULL);
currentNode = NULL;
lastCurrentNode = NULL;
size = 0;
}
/* add() class method */
void List::add (int addObject)
{
Node * newNode = new Node();
newNode->set(addObject);
if( currentNode != NULL )
{
newNode->setNext(currentNode->getNext());
currentNode->setNext( newNode );
lastCurrentNode = currentNode;
currentNode = newNode;
}
else
{
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 30 of 505
newNode->setNext(NULL);
headNode->setNext(newNode);
lastCurrentNode = headNode;
currentNode = newNode;
}
size ++;
}
/* get() class method */
int List::get()
{
if (currentNode != NULL)
return currentNode->get();
}
/* next() class method */
bool List::next()
{
if (currentNode == NULL) return false;
lastCurrentNode = currentNode;
currentNode = currentNode->getNext();
if (currentNode == NULL || size == 0)
return false;
else
return true;
}
/* Friend function to traverse linked list */
void traverse(List list)
{
Node* savedCurrentNode = list.currentNode;
list.currentNode = list.headNode;
for(int i = 1; list.next(); i++)
{
cout << "\n Element " << i << " " << list.get();
}
list.currentNode = savedCurrentNode;
}
/* Friend function to add Nodes into the list */
List addNodes()
{
List list;
list.add(2);
list.add(6);
list.add(8);
list.add(7);
list.add(1);
CS301 – Data Structures Lecture No. 03
___________________________________________________________________
Page 31 of 505
cout << "\n List size = " << list.size <<'\n';
return list;
}
main()
{
List list = addNodes();
traverse(list);
}
The output of the example program is as follows:
List size = 5
Element 1 2
Element 2 6
Element 3 8
Element 4 7
Element 5 1
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 32 of 505
Data Structures
Lecture No. 04
Reading Material
Data Structures and algorithm analysis in C++ Chapter. 3
3.2.3, 3.2.4, 3.2.5
Summary
• Methods of Linked List
• Example of list usage
• Analysis of Link List
• Doubly-linked List
• Circularly-linked lists
• Josephus Problem
Methods of Linked List
In the previous lecture, we discussed the methods of linked list. These methods form
the interface of the link list. For further elucidation of these techniques, we will talk
about the start method that has the following code.
// position currentNode and lastCurrentNode at first element
void start() {
lastCurrentNode = headNode;
currentNode = headNode;
};
There are two statements in this method. We assign the value of headNode to both
lastCurrentNode and currentNode. These two pointers point at different nodes of the
list. Here we have pointed both of these pointers at the start of the list. On calling
some other method like next, these pointers will move forward. As we can move in
the singly-linked list in one direction, these pointers cannot go behind headNode.
We will now see how a node can be removed from the link list. We use the method
remove for this purpose.
void remove() {
if( currentNode != NULL && currentNode != headNode) {
(step 1) lastCurrentNode->setNext(currentNode->getNext());
(step 2) delete currentNode;
(step 3) currentNode = lastCurrentNode;
(step 4) size--;
}
};
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 33 of 505
Suppose that the currentNode is pointing at the location that contains the value 6. A
request for the removal of the node is made. Resultantly, the node pointed by
currentNode should be removed. For this purpose, at first, the next pointer of the node
with value 2 (the node pointed by the lastCurrentNode pointer), that is before the
node with value 6, bypasses the node with value 6. It is, now pointing to the node with
value 8. The code of the first step is as:
lastCurrentNode->setNext(currentNode->getNext());
What does the statement currentNode->getNext() do? The currentNode is pointing to
the node with value 6 while the next of this node is pointing to the node with value 8.
That is the next pointer of node with value 6 contains the address of the node with
value 8. The statement lastCurrentNode->setNext(currentNode->getNext()) will set
the next pointer of the node pointed by the lastCurrentNode to the node with value 8.
So the next pointer of the node with value 2 is pointing to the node with value 8.
You see that the next pointer of the node having data element 2 contains the address
of the node having data element 8. The node with value 6 has been disconnected from
the chain while the node with value 2 is connected to the node with the value 8.
The code of the next step is:
delete currentNode;
You already know, in case of allocation of the memory with the help of the new
keyword, the delete statement releases this memory which returns the memory to the
system. Pictorially it can be represented as:
6 7 1
headNode
currentNode
Size = 5
lastCurrentNode
2 8
6 7 1
headNode
currentNode
Size = 5
lastCurrentNode
2 8
Step1
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 34 of 505
In the next step, we have moved the currentNode to point the next node. The code is:
currentNode = lastCurrentNode;
In the fourth step, the size of the list has been reduced by 1 after the deletion of one
node i.e.
size--;
The next method is length() that simply returns the size of the list. The code is as
follows:
// returns the size of the list
int length()
{
return size;
};
The private data members of the list are:
private:
int size; // contains the size of the list
Node *headNode; // points to the first node of the list
Node *currentNode, // current node
Node *lastCurrentNode; // last current node
The list class completed just now, can be termed as list factory. We have included all
7 1
headNode
currentNode
Size = 5
lastCurrentNode
2 8
Step1
Step2
7 1
headNode
currentNode
Size = 4
lastCurrentNode
2 8
Step1
Step2
Step3
Step4
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 35 of 505
the required methods in it. We may employ more methods if required. A programmer
can get the size of the list, add or remove nodes in it besides moving the pointers.
Example of list usage
Now let’s see how we use the link list. Here is an example showing the use of list:
/* A simple example showing the use of link list */
#include <iostream>
#include <stdlib.h>
#include "List.cpp" // This contains the definition of List class
// main method
int main(int argc, char *argv[])
{
List list; // creating a list object
// adding values to the list
list.add(5);
list.add(13);
list.add(4);
list.add(8);
list.add(24);
list.add(48);
list.add(12);
// calling the start method of the list
list.start();
// printing all the elements of the list
while (list.next())
cout << "List Element: "<< list.get()<<endl;
}
The output of the program is:
List Element: 5
List Element: 13
List Element: 4
List Element: 8
List Element: 24
List Element: 48
List Element: 12
Let’s discuss the code of the above program. We have included the standard libraries
besides having the “List.cpp” file. Usually we do not include .cpp files. Rather, the .h
files are included. Whenever you write a class, two files will be created i.e. .h (header
file containing the interface of the class) and .cpp (implementation file). Here for the
sake of explanation, we have combined the two files into “List.cpp” file. At the start
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 36 of 505
of the main method, we have created a list object as:
List list;
Here the default constructor will be called. If you understand the concept of factory,
then it is not difficult to know that we have asked the List factory to create a List
object and named it as list. After creating the object, nodes have been added to it. We
have added the elements with data values 5, 13, 4, 8, 24, 48 and 12. Later, the start()
method of list is called that will position the currentNode and lastCurrentNode at the
start of the list. Now there is no need to worry about the implementation of the List.
Rather, we will use the interface of the List. So the start method will take us to the
start of the list and internally, it may be array or link list or some other
implementation. Then there is a while loop that calls the next() method of the List. It
moves the pointer ahead and returns a boolean value i.e. true or false. When we reach
at the end of the list, the next() method will return false. In the while loop we have a
cout statement that prints the value of the list elements, employing the get() method.
The loop will continue till the next() method returns true. When the pointers reach at
the end of the list the next() will return false. Here the loop will come to an end.
Please keep in mind that list is a very important data structure that will be used in the
entire programming courses.
Analysis of Link List
As stated earlier, we will be going to analyze each data structure. We will see whether
it is useful or not. We will see its cost and benefit with respect to time and memory.
Let us analyze the link list which we have created with the dynamic memory
allocation in a chain form.
􀂃 add
For the addition purposes, we simply insert the new node after the current node.
So ‘add’ is a one-step operation. We insert a new node after the current node in
the chain. For this, we have to change two or three pointers while changing the
values of some pointer variables. However, there is no need of traversing too
much in the list. In case of an array, if we have to add an element in the centre of
the array, the space for it is created at first. For this, all the elements that are after
the current pointer in the array, should be shifted one place to the right. Suppose if
we have to insert the element in the start of the array, all the elements to the right
one spot are shifted. However, for the link list, it is not something relevant. In link
lists, we can create a new node very easily where the current pointer is pointing.
We have to adjust two or three pointers. Its cost, in terms of CPU time or
computing time, is not much as compared to the one with the use of arrays.
􀂃 remove
Remove is also a one-step operation. The node before and after the node to be
removed is connected to each other. Update the current pointer. Then the node to
be removed is deleted. As a result, the node to be removed is deleted. Very little
work is needed in this case. If you compare it with arrays, for the deletion of an
element from the array, space is created. To fill this space, all the right elements
are shifted one spot left. If the array size is two thousand or three thousand, we
need to run a loop for all these elements to shift them to left.
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 37 of 505
􀂃 find
The worst-case in find is that we may have to search the entire list. In find, we
have to search some particular element say x. If found, the currentNode pointer is
moved at that node. As there is no order in the list, we have to start search from
the beginning of the list. We have to check the value of each node and compare it
with x (value to be searched). If found, it returns true and points the currentNode
pointer at that node otherwise returns false. Suppose that x is not in the list, in this
case, we have to search the list from start to end and return false. This is the worst
case scenario. Though time gets wasted, yet we find the answer that x is not in the
list. If we compare this with array, it will be the same. We don’t know whether x
is in the array or not. So we have to search the complete array. In case of finding
it, we will remember that position and will return true. What is the average case? x
can be found at the first position , in the middle or at the end of the list. So on
average, we have to search half of the list.
􀂃 back
In the back method, we move the current pointer one position back. Moving the
current pointer back, one requires traversing the list from the start until the node
whose next pointer points to current node. Our link list is singly linked list i.e. we
can move in one direction from start towards end. Suppose our currentNode
pointer and lastCurrentNode are somewhere in the middle of the list. Now we
want to move one node back. If we have the pointer of lastCurrentNode, it will be
easy. We will assign the value of lastCurrentNode to currentNode. But how can
we move the lastCurrentNode one step back. We don’t have the pointer of
previous node. So the solution for this is to go at the start of the list and traverse
the list till the time you reach the node before the lastCurrentNode is pointing.
That will be the node whose next pointer contains the value lastCurrentNode. If
the currentNode and the lastCurrentNode are at the end of the list, we have to
traverse the whole list. Therefore back operation is not a one step operation. We
not only need a loop here but also require time.
Doubly-linked List
If you look at single link list, the chain is seen formed in a way that every node has a
field next that point to the next node. This continues till the last node where we set the
next to NULL i.e. the end of the list. There is a headNode pointer that points to the
start of the list. We have seen that moving forward is easy in single link list but going
back is difficult. For moving backward, we have to go at the start of the list and begin
from there. Do you need a list in which one has to move back or forward or at the start
or in the end very often? If so, we have to use double link list.
In doubly-link list, a programmer uses two pointers in the node, i.e. one to point to
next node and the other to point to the previous node. Now our node factory will
create a node with three parts.
First part is prev i.e. the pointer pointing to the previous node, second part is element,
prev element next
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 38 of 505
containing the data to be inserted in the list. The third part is next pointer that points to
the next node of the list. The objective of prev is to store the address of the previous
node.
Let’s discuss the code of the node of the doubly-link list. This node factory will create
nodes, each having two pointers. The interface methods are same as used in singly
link list. The additional methods are getPrev and setPrev. The method getPrev returns
the address of the previous node. Thus its return type is Node*. The setPrev method
sets the prev pointer. If we have to assign some address to prev pointer, we will call
this method. Following is the code of the doubly-linked list node.
/* this is the doubly-linked list class, it uses the next and prev pointers */
class Node {
public:
int get() { return object; }; // returns the value of the element
void set(int object) { this->object = object; }; // set the value of the element
Node* getNext() { return nextNode; }; // get the address of the next node
void setNext(Node* nextNode) // set the address of the next node
{ this->nextNode = nextNode; };
Node* getPrev() { return prevNode; }; // get the address of the prev node
void setPrev(Node* prevNode) // set the address of the prev node
{ this->prevNode = prevNode; };
private:
int object; // it stores the actual value of the element
Node* nextNode; // this points to the next node
Node* prevNode; // this points to the previous node
};
Most of the methods are same as those in singly linked list. A new pointer prevNode
is added and the methods to get and set its value i.e. getPrev and setPrev. Now we
will use this node factory to create nodes.
You have to be very cautious while adding or removing a node in a doubly linked list.
The order in which pointers are reorganized is important. Let’s have a pictorial view
of doubly-link list. The diagram can help us understand where the prevNode and
nextNode are pointing.
This is a doubly link list. The arrows pointing towards right side are representing
nextNode while those pointing towards left side are representing prevNode. Suppose
head 2 6 8 7 1 size=5
current
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 39 of 505
we are at the last node i.e. the node with value 1. In case of going back, we usually
take the help of prevNode pointer. So we can go to the previous node i.e. the node
with value 7 and then to the node with value 8 and so on. In this way, we can traverse
the list from the end to start. We can move forward or backward in doubly-link list
very easily. We have developed this facility for the users to move in the list easily.
Let’s discuss other methods of the doubly-linked list. Suppose we have created a new
node from the factory with value 9. We will request the node factory to create a new
object using new keyword. The newly created node contains three fields i.e. object,
prevNode and nextNode. We will store 9 into object and connect this new node in the
chain. Let’s see how the pointers are manipulated to do that. Consider the above
diagram, the current is pointing at the node with value 6. The new node will be
inserted between the node with value 6 and the one with value 8.
In the first step, we assign the address of the node with value 8 to the nextNode of the
new node.
newNode->setNext( current->getNext() );
In the next step, a programmer points the prevNode of the newNode to the node with
value 6.
newNode->setprev( current );
In the third step, we will set the previous node with value 8 to point to the newNode.
(current->getNext())->setPrev(newNode);
head 2 6 8 7 size=5
current
1
newNode 9 1
2
head 2 6 8 7 size=5
current
1
newNode 9 1
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 40 of 505
Now the prevNode of the node with value 8 is pointing to the node with value 9.
In the fourth step, the nextNode of the node with value 6 is pointing to the newNode
i.e. the node with value 9. Point the current to the newNode and add one to the size of
the list.
current->setNext( newNode );
current = newNode;
size++;
Now the newNode has been inserted between node with value 6 and node with value
8.
Circularly-linked lists
Let’s talk about circularly linked list. The next field in the last node in a singly-linked
list is set to NULL. The same is the case in the doubly-linked list. Moving along a
singly-linked list has to be done in a watchful manner. Doubly-linked lists have two
NULL pointers i.e. prev in the first node and next in the last node. A way around this
potential hazard is to link the last node with the first node in the list to create a
circularly-linked list.
The next method in the singly-linked list or doubly-linked list moves the current
pointer to the next node and every time it checks whether the next pointer is NULL or
not. Similarly the back method in the double-linked list has to be employed carefully
if the current is pointing the first node. In this case, the prev pointer is pointing to
NULL. If we do not take care of this, the current will be pointing to NULL. So if we
try to access the NULL pointer, it will result in an error. To avoid this, we can make a
circularly linked list.
head 2 6 8 7 size=6
current
1
newNode 9 1
2 4 3
9 1
head 2 6 8 7 size=5
current
1
newNode
2 3
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 41 of 505
We have a list with five elements. We have connected the last node with the first
node. It means that the next of the last node is pointing towards the first node.
The same list has been shown in a circular shape.
You have noticed that there is no such node whose next field is NULL. What is the
benefit of this? If you use the next or back methods that move the current pointer, it
will never point to NULL. It may be the case that you keep on circulating in the list.
To avoid this, we get help from the head node. If we move the head node in the
circularly linked list, it will not be certain to say where it was pointing in the start. Its
advantages depend on its use. If we do not have to move too much in the list and have
no problem checking the NULL, there is little need a circularly-linked list. But this
facility is available to us.
In this example, we made a circular linked list from a singly link list. In a singly link
list we move in one direction. We point the next pointer of the last node to the first
node. We can do the same with the doubly-linked list. The prev pointer of the first
node will point to the last node and the next pointer of the last node will point to the
first node. If you arrange all the nodes in a circle, one of the pointers (i.e. next
pointer) will move in clockwise direction while the prev pointers in anti-clockwise
direction. With the help of these pointers, you can move in the clockwise direction or
anti-clockwise direction. Head node pointer will remain at its position. You don’t
need to change it. If there is a need to remove the node pointed by head node than you
have to move the head pointer to other node. Now we don’t have any NULL pointer
in the doubly-linked list. We will not get any exception due to NULL pointers.
2
8
7
1
head
current
size=5
6
head 2 6 8 7 1
current
size=5
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 42 of 505
Josephus Problem
Now we will see an example where circular link list is very useful. This is Josephus
Problem. Consider there are 10 persons. They would like to choose a leader. The way
they decide is that all 10 sit in a circle. They start a count with person 1 and go in
clockwise direction and skip 3. Person 4 reached is eliminated. The count starts with
the fifth and the next person to go is the fourth in count. Eventually, a single person
remains.
You might ask why someone has to choose a leader in this way. There are some
historical stories attached to it. This problem is also studied in mathematics. Let’s see
its pictorial view.
We have ten numbers representing the ten persons who are in a circle. The value of M
shows the count. As the value of M is three, the count will be three. N represents the
number of persons. Now we start counting clockwise. After counting up to three, we
have the number four. The number four is eliminated and put in the eliminated
column.
After eliminating the number four, we will start our counting from number five.
Counting up to three, we have number eight which is eliminated and so on.
N=10, M=3
4
5
6
7
9 8
10
1
2
3
N=10, M=3
4
5
6
7
9 8
10
1
2
3
Eliminated
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 43 of 505
In the end, only number five will remain intact.
If we have ten persons (N = 10) in a circle and eliminate after counting up to three (M
= 3). If we start our count from one, who will be the leader? We have studied this
earlier and know that the person who is sitting at the fifth position will become the
leader.
Suppose if the value of N is 300 or 400 and the value of M is 5 or 10. Now who will
be the leader? This is a mathematical problem where we can change the values of N
and M. There is a formula where the values of N, M are allotted. You can calculate
who should become the leader. Here we will not solve it mathematically. Rather, it
will be tackled as a computer problem. If you analyze the pictures shown above, it
gets clear that this can be solved with the circular link list. We arrange these numbers
in a circularly-linked list, point the head pointer at the starting number and after
calling the next method for three times, we will reach the node which is to be
removed. We will use the remove method to remove the node. Then the next method
is called thrice from there and the node is removed. We will continue this till we have
N=10, M=3 4
5
6
7
8
9
10
1
2
3
Eliminated
N=10, M=3
4
5
6
7
8
9
10
1
2
3
Eliminated
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 44 of 505
only one node.
We are not concerned with the NULL pointers, internal to link list. However, if you
want to solve this problem and choose the best data structure, then circular link list is
the best option. We can also use the list to solve this.
Let’s see the code of the program by which we can solve this problem. The code is as
under:
/*This program solves the Josephus Problem */
#include <iostream.h>
#include "CList.cpp" //contains the circularly-linked list definition
// The main method
void main(int argc, char *argv[])
{
CList list; // creating an object of list
int i, N=10, M=3;
for(i=1; i <= N; i++ ) list.add(i); // initializing the list with values
list.start(); // pointing the pointers at the start of the list
// counting upto M times and removing the element
while( list.length() > 1 ) {
for(i=1; i <= M; i++ ) list.next();
cout << "remove: " << list.get() << endl;
list.remove();
}
cout << "leader is: " << list.get() << endl; //displaying the remaining node
}
We have included the “CList.cpp”. It means that we are using the circularly-linked
list. In the main method, CList factory is called to create a circular link list as CList
list; After this, we assign the values to N and M. We have used for loop to add the
nodes in the list. When this loop finishes, we have ten nodes in the list having values
from 1 to 10. But here a programmer may not pay attention to the internal details of
the list. We have created a list and stored ten numbers in it. Then we moved the
pointers of the list at the start of the list using the start method. It means that the
pointers are pointing at the position from where we want to start the counting of the
list.
There is a while loop that will continue executing until only one node is left in the list.
Inside this loop, we have a for loop. It will execute from 1 to M. It has only one
statement i.e. list.next(). This will move the pointer forward three times (as the value
of M is 3). Now the current pointer is at the 4th node. We called the remove method.
Before removing the node, we display its value on the screen using cout. Again we
come into the while loop, now the length of the list is 9. The ‘for loop’ will be
executed. Now the list.next() is not starting from the start. It will start from the
CS301 – Data Structures Lecture No. 04
___________________________________________________________________
Page 45 of 505
position where the current pointer is pointing. The current pointer is pointing at the
next node to the node deleted. The count will start again. The list.next() will be called
for three times. The current pointer will point at the 8th node. Again the remove
method will be called and the current pointer moved to the next node and so on. The
nodes will be deleted one by one until the length of the list is greater than one. When
the length of the list is one, the while loop will be terminated. Now only one node is
left in the list i.e. the leader. We will display its value using the get method.
We can change the values of M and N. Similarly, these values can be read from the
file or can use the command line arguments to get values. There are many variations
of this problem. One variation is that the value of M keeps on changing. Sometimes, it
is 3, sometimes 4 or 5 and so on. Due to this, it will become difficult to think that who
will become leader. Make a picture in your mind that ten persons are sitting in a
circle. Every time the value of M is incremented by one. Now try to ascertain which
position you should sit to get chosen as a leader. You may like to write a program to
solve this or use the mathematical formula.
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 46 of 505
Data Structures
Lecture No. 05
Reading Material
Data Structures and Algorithm Analysis in C++ Chapter. 3
3.1, 3.2.5, 3.3.1, 3.3.2
(array implementation)
Summary
5) Benefits of using circular list
6) Abstract Data Type
7) Stacks
8) Stack Implementation using arrays
In the previous lecture, we demonstrated the use of the circular list for the resolution
of the Josephus problem. After writing a program with the help of this data structure,
a leader among ten persons was selected. You must have noted many things while
trying to solve the problem. These things will help us to understand the usage of data
structures in C++, thus making the programming easy. The code of the program is
given below.
#include "CList.cpp"
void main(int argc, char *argv[])
{
CList list;
int i, N=10, M=3;
for(i=1; i <= N; i++ ) list.add(i);
list.start();
while( list.length() > 1 ) {
for(i=1; i <= M; i++ ) list.next();
cout << "remove: " << list.get() << endl;
list.remove();
}
cout << "leader is: " << list.get() << endl;
}
In the program, we include the file of the class CList and create its object i.e. list.
Then we solve the problem by using the add, start, length, next, remove and get
methods of the class CList.
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 47 of 505
In the program, we have included already-defined data structure CList. After defining
its different methods, we have an interface of Clist. There is no need to be worry
about the nature of the list i.e. whether it is linked list, doubly linked list or an array.
For us, it is only a list to be manipulated according to our requirement. You will see
that a programmer may use different methods of the list object to solve the problem.
We add elements to the list by a simple call of add method and go to the first element
of the list by start method. Here, the length method is used in the condition of the
while loop. Then we remove elements from the list and use the next, get and remove
methods during this process. We get the current element by using the get method, then
remove it by calling the remove method and then go to the next element by the
method next. This way, all the elements are removed from the list except one element,
called the leader. This one element remains there as we execute the while loop one
less than the length of the list.
In singly linked list, the ‘next’ returns false when it reaches to the last node due to the
fact that the next field of the last node is set to NULL. But in a circularly linked list
there is no NULL. It will be there only when there is no node in the list.
The whole process, which we carried out to solve the Josephus problem, can also be
carried out with functions in C++. While adopting this way (of writing functions), we
have to write these functions whenever we write another program that manipulates a
list. In this method, we define a class of the data structure list and its different
methods for the purpose of manipulation. This way, this class, obviously its methods
too, can be used in any program where the manipulation of a list is needed. Thus there
is re-usability of the code. In a class, we encapsulate the data and its methods. This
shows that we are no longer interested in the internal process of the class. Rather, we
simply use it wherever needed. The circular linked list, earlier used for the resolution
of the Josephus problem, can also be employed in other problems. We have a class
CList of this circular linked list through which any number of objects of data type of
circular linked list can be created. Thus we can assume the class CList as a factory,
creating as many objects of list as needed. This class and its objects in any program
can be used to solve the problems with the help of its interface. The interface of this
class consists of some methods like add, remove, next, back, get and some other
simple ones. While carrying out programming, we will see that these classes (objects)
help us very much to solve different problems.
Benefits of using circular list
While solving the Josephus problem, it was witnessed that the usage of circular linked
list helped us make the solution trivial. We had to just write a code of some lines that
solved the whole problem. In the program, we included the class CList (which is of
our data structure i.e. circular linked list) and used all of its methods according to the
requirements. There was no problem regarding the working of the methods. We just
called these methods and their definition in the class CList worked well.
Now we will see what happens if we solve the Josephus problem by using an array
instead of the class in our program. In this case, we have to define an array and write
code to move back and forth in the array and to remove different elements properly in
a particular order. A programmer needs to be very careful while doing this, to reach
the solution of the problem. Thus our code becomes very complex and difficult for
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 48 of 505
someone to understand and modify it. Moreover we cannot use this code in some
other problem. Note that here we are talking about the use of an array in the main
program, not in the class that defines the CList data structure. There is no need to be
worried whether an array, singly linked list, doubly linked list is used or circular
linked list being employed internally in implementing the list in defining the class of
list data type. We only want that it should create objects of list. The usage of the class
of a data structure simplifies the code of the program. We can also use this class
wherever needed in other programs. This shows that the choice of appropriate data
structures can simplify an algorithm. It can make the algorithm much faster and
efficient. In this course, we will see that there are different data structures, which
makes the algorithms very easy to solve our problems. Later, we will see how some
elegant data structures lie at the heart of major algorithms. There is also a course
dedicated to study different algorithms and recipes that can be used to solve host of
complex problems. Moreover, we will study different data structures in detail and see
that with the use of a proper data structure, we can solve a problem efficiently. A
properly constructed data structure will always help in the solution of problems.
Abstract Data Type
A data type is a collection of values and a set of operations on those values. That
collection and these operations form a mathematical construct that may be
implemented with the use of a particular hardware or software data structure. The
term abstract data type (ADT) refers to the basic mathematical concept that defines
the data type. We have discussed four different implementations of the list data
structure. In case of implementation of the list with the use of an array, the size of the
array gives difficulty if increased. To avoid this, we allocate memory dynamically for
nodes before connecting these nodes with the help of pointers. For this purpose, we
made a singly linked list and connected it with the next pointer to make a chain.
Moving forward is easy but going back is a difficult task. To overcome this problem,
we made a doubly linked list using prev and next pointers. With the help of these
pointers, we can move forward and backward very easily. Now we face another
problem that the prev pointer of first node and the next pointer of the last node are
NULL. Therefore, we have to be careful in case of NULL pointers. To remove the
NULL pointers, we made the circular link list by connecting the first and last node.
The program employing the list data structure is not concerned with its
implementation. We do not care how the list is being implemented whether through
an array, singly linked list, doubly linked list or circular linked list. It has been
witnessed that in these four implementations of the list, the interface remained the
same i.e. it implements the same methods like add, get, next, start and remove etc.
This proves that with this encapsulation attained by making a class, we are not
concerned with its internal implementation. The implementation of these abstract data
types can be changed anytime. These abstract data types are implemented using
classes in C++. If the list is implemented using arrays while not fulfilling the
requirements, we can change the list implementation. It can be implemented with the
use of singly-link list or doubly link list. As long as the interface is same, a
programmer can change the internal implementation of the list and the program using
this list will not be affected at all. This is the abstract data type (ADT). What we care
about is the methods that are available for use, with the List ADT i.e. add, get, and
remove etc methods. We have not studied enough examples to understand all the
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 49 of 505
benefits of abstract data types. We will follow this theme while developing other
ADT. We will publish the interface and keep the freedom to change the
implementation of ADT without effecting users of the ADT. The C++ classes provide
a programmer an ability to create such ADTs. What benefits can we get with the help
of these ADTs and classes? When we develop an ADT or a class or factory then the
users of this factory are independent of how this factory works internally. Suppose
that we have ordered the car factory (car class) to produce a new car and it replies
after a long time. If we ordered the remove method to remove one node and we are
waiting and it keeps on working and working. Then we might think that its
implementation is not correct. Although, we are not concerned with the internal
implementation of this ADT yet it is necessary to see whether this ADT is useful for
solving our problem or not. It should not become a bottleneck for us. If the method we
are using is too much time consuming or it has some problem in terms of algorithm
used. On one side, we only use the interfaces provided by these ADTs, classes, or
factories as long as they do what they promise. We are not concerned with the internal
details. On the other hand, we have to be careful that these factories or methods
should not take too much time so that these will not be useful for the problem.
This distinction will always be there. Sometimes, the source code of classes is not
provided. We will be provided libraries, as standard libraries are available with the
compiler. These classes are in compiled form i.e. are in object form or in binary form.
On opening these files, you will not see the C++ code, rather binary code. When you
read the assembly language code, it will give some idea what this binary code is
about. You can view the interface methods in the .h file. As an application
programmer, you have to see that the ADTs being used are written in a better way.
The point to be remembered here is that you should not worry about the internal
implementation of these ADTs. If we want to change the internal implementation of
the ADTs, it can be done without affecting the users of these ADTs. While writing a
program, you should check its performance. If at some point, you feel that it is slow,
check the ADTs used at that point. If some ADT is not working properly, you can ask
the writer of the ADT to change the internal implementation of that ADT to ensure
that it works properly.
Stacks
Let’s talk about another important data structure. You must have a fair idea of stacks.
Some examples of stacks in real life are stack of books, stack of plates etc. We can
add new items at the top of the stack or remove them from the top. We can only
access the elements of the stack at the top. Following is the definition of stacks.
“Stack is a collection of elements arranged in a linear order”
Let’s see an example to understand this. Suppose we have some video cassettes. We
took one cassette and put it on the table. We get another cassette and put it on the top
of first cassette. Now there are two cassettes on the table- one at the top of other. Now
we take the third cassette and stack it on the two. Take the fourth cassette and stack it
on the three cassettes.
Now if we want to take the cassette, we can get the fourth cassette which is at the top
and remove it from the stack. Now we can remove the third cassette from the stack
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 50 of 505
and so on. Suppose that we have fifty cassettes stacked on each other and want to
access the first cassette that is at the bottom of the stack. What will happen? All the
cassettes will fell down. It will not happen exactly the same in the computer. There
may be some problem. It does not mean that our data structure is incorrect. As we see
in the above example that the top most cassette will be removed first and the new
cassette will be stacked at the top. The same example can be repeated with the books.
In the daily life, we deal with the stacked goods very carefully.
Now we will discuss how to create a stack data structure or a factory, going to create
stack object for us. What will be the attributes of this object? During the discussion on
the list, we came to know that a programmer adds values in the list, removes values
from the list and moves forward and backward. In case of a stack too, we want to add
things and remove things. We will not move forward or backward in the stack. New
items can be added or removed at the top only. We can not suggest the removal of the
middle element of the stack.
Let’s talk about the interface methods of the stacks. Some important methods are:
Method Name Description
push(x) Insert x as the top element of the stack
pop() Remove the top element of the stack and return it.
top() Return the top element without removing it from the stack.
The push(x) method will take an element and insert it at the top of the stack. This
element will become top element. The pop() method will remove the top element of
the stock and return it to the calling program. The top() method returns the top-most
stack element but does not remove it from the stack. The interface method names that
we choose has special objective. In case of list, we have used add, remove, get, set as
the suitable names. However, for stack, we are using push, pop and top. We can
depict the activity from the method name like push means that we are placing an
element on the top of the stack and pushing the other elements down.
The example of a hotel’s kitchen may help understand the concept of stacks in a
comprehensive manner. In the kitchen, the plates are stacked in a cylinder having a
spring on the bottom. When a waiter picks a plate, the spring moves up the other
plates. This is a stack of plates. You will feel that you are pushing the plates in the
cylinder and when you take a plate from the cylinder it pops the other plates. The top
method is used to get the top- most element without removing it.
When you create classes, interfaces and methods, choose such names which depicts
what these method are doing. These names should be suitable for that class or factory.
Let’s discuss the working of stack with the help of a diagram.
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 51 of 505
At the start, the stack is empty. First of all, we push the value 2 in the stack. As a
result, the number 2 is placed in the stack. We have a top pointer that points at the top
element. Then we said push(5). Now see how 2 and 5 are stacked. The number 5 is
placed at the top of number 2 and the pointer top moves one step upward. Then we
pushed the number 7 which is placed on the top and the number 2 and 5 are below.
Similarly, we push number 1. The last figure in the first row shows the stacked values
of the numbers- 1, 7, 5 and 2.
Let’s pop the elements from the stack. The first figure of second row shows the pop
operation. As a result, the number 1 is popped. Than again we push the number 21 on
the stack. The number 7, 5, and 2 are already in the stack and number 21 is pushed at
the top. If we pop now, the number 21 is popped. Now number 7 is at the top. If we
pop again, the number 7 is popped. Pop again and the number 5 is popped and number
2 remains in the stack. Here with the help of this diagram, we are proving that the
values are added at the top and removed at the top in a stack.
The last element to go into the stack is the first to come out. That is why, a stack is
known as LIFO (Last In First Out) structure. We know that the last element pushed in
the stack is at the top which is removed when we call pop. Let’s see some other
scenarios. What happens if we call pop() while there is no element? One possible
way-out is that we have isEmpty() function that returns true if stack is empty and false
otherwise. This is a boolean function that returns false if there is no element in the
stack. Otherwise, it will return true. The second option is this that when we call pop
on an empty stack, it throws an exception. This is a concept of advanced C++.
Exception is also a way to convey that some unusual condition has arisen or
something has gone wrong. Suppose, if we have a division method and try to divide
some number with zero. This method will throw ‘division by zero’ exception.
push(2)
top 2
push(5)
top
2
5
push(7)
top
2
5
7
push(1)
top
2
5
7
1
1 pop()
top
2
5
7
push(21)
top
2
5
7
21
21 pop()
top
2
5
7
7 pop()
2
top 5
top 2
5 pop()
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 52 of 505
Currently we will not throw an exception but use the isEmpty() method. The user who
is employing the stack is responsible to call the isEmpty() method before calling the
pop. Call the pop method if isEmpty() returns false . Otherwise, there will be a
problem.
Stack Implementation using array
Let’s discuss the implementation of the stack. Suppose we implement the stack using
the arrays. The stack shown in the above diagram may be considered as an array. Here
the array is shown vertically. We can implement the stack using array. The interface
will remain as push and pop methods. The user of the stack does not need to know
that the stack is internally implemented with the help of array. The worst case for
insertion and deletion from an array may happen when we insert and delete from the
beginning of the array. We have to shift elements to the right for insertion and left for
removal of an element. We face the same problem while implementing the list with
the use of the array. If we push and pop the elements from the start of the array for
stack implementation, this problem will arise. In case of push, we have to shift the
stack elements to the right. However, in case of pop, after removing the element, we
have to shift the elements of stack that are in the array to the left. If we push the
element at the end of the array, there is no need to shift any element. Similarly as the
pop method removes the last element of the stack which is at the end of the array, no
element is shifted. To insert and remove elements at the end of the array we need not
to shift its elements. Best case for insert and delete is at the end of the array where
there is no need to shift any element. We should implement push() and pop() by
inserting and deleting at the end of an array.
In the above diagram, on the left side we have a stack. There are four elements in the
stack i.e. 1, 7, 5 and 2. The element 1 is the extreme-most that means that it is inserted
in the end whereas 7, 5, and 2 have been added before. As this is a LIFO structure so
the element 1 should be popped first. On the right side we have an array with
positions 0, 1, 2, 3 and so on. We have inserted the numbers 2, 5, 7 and 1. We have
decided that the elements should be inserted at the end of the array. Therefore the
most recent element i.e. 1 is at position 3. The top is the index representing the
position of the most recent element. Now we will discuss the stack implementation in
detail using array.
We have to choose a maximum size for the array. It is possible that the array may
‘fill-up’ if we push enough elements. Now more elements cannot be pushed. Now
what should the user of the stack do? Internally, we have implemented the stack
using array which can be full. To avoid this, we write isFull() method that will return
top
2
5
7
1
2 5 7 1
0 1 2 3 4
top = 3
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 53 of 505
a boolean value. If this method returns true, it means that the stack (array) is full and
no more elements can be inserted. Therefore before calling the push(x), the user
should call isFull() method. If isFull() returns false, it will depict that stack is not full
and an element can be inserted. This method has become the part of the stack
interface. So we have two more methods in our interface i.e. isEmpty() and isFull().
Now we will discuss the actual C++ code of these operations. These methods are part
of stack class or stack factory. We have an array named A while current is its index.
The code of pop() method is as:
int pop()
{
return A[current--];
}
In this method, the recent element is returned to the caller, reducing the size of the
array by 1.
The code of push method is:
void push(int x)
{
A[++current] = x;
}
We know that ++current means that add one to the current and then use it. That also
shows that element x should be inserted at current plus one position. Here we are not
testing that this current index has increased from the array size or not. As discussed
earlier that before using the push method, the user must call isFull() method.
Similarly it is the responsibility of the user to call the isEmpty() method before calling
the pop method. Therefore there is no if statement in the push and pop method.
The code of the top() method is:
int top()
{
return A[current];
}
This method returns the element at the current position. We are not changing the
value of current here. We simply want to return the top element.
int isEmpty()
{
return ( current == -1 );
}
This method also tests the value of the current whether it is equal to -1 or not. Initially
when the stack is created, the value of current will be -1. If the user calls the
isEmpty() method before pushing any element, it will return true.
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 54 of 505
int isFull()
{
return ( current == size-1);
}
This method checks that the stack is full or not. The variable size shows the size of the
array. If the current is equal to the size minus one, it means that the stack is full and
we cannot insert any element in it.
We have determined the cost and benefit of all the data structures. Now we will see
how much time these methods take. A quick examination shows that all the five
operations take constant time. In case of list, the find method takes too much time as it
has to traverse the list. Whereas the add and remove methods are relatively quick. The
methods of stack are very simple. There is no complexity involved. We insert element
at one side and also remove from that side not in the middle or some other place.
Therefore we need not to carry out a lot of work. During the usage of the array, the
stack methods push, pop, top, isFull and isEmpty all are constant time operations.
There is not much difference of time between them.
The complete code of the program is:
/* Stack implementation using array */
#include <iostream.h>
/* The Stack class */
class Stack
{
public:
Stack() { size = 10; current = -1;} //constructor
int pop(){ return A[current--];} // The pop function
void push(int x){A[++current] = x;} // The push function
int top(){ return A[current];} // The top function
int isEmpty(){return ( current == -1 );} // Will return true when stack is empty
int isFull(){ return ( current == size-1);} // Will return true when stack is full
private:
int object; // The data element
int current; // Index of the array
int size; // max size of the array
int A[10]; // Array of 10 elements
};
// The main method
int main()
{
Stack stack; // creating a stack object
// pushing the 10 elements to the stack
for(int i = 0; i < 12; i++)
CS301 – Data Structures Lecture No. 05
___________________________________________________________________
Page 55 of 505


Post a Comment

 
Top