Objectives
By the end of this lesson you will be able to:
- Describe and utilize an array data structure
- Describe and utilize a queue data structure
- Describe and utilize a stack data structure
- Describe and utilize linked lists
- Describe and utilize classes from a class library
Introduction To Data Structures
Data structures are the principal method for collecting related data values into a single unit. Data structures are nothing more than techniques for organizing and storing related data in computer memory. How the data is stored affects how the data is accessed and modified. Understanding a data structure involves not only understanding the storage pattern, but also knowing what methods are used to create, access, and manipulate the data structure.
Data structures are the building blocks of most computer programs, and they allow developers to implement complex functionality. Most programming frameworks provide built-in support for a variety of data structures and associated methods to manipulate these data structures. In this section, you will learn about several distinct types of data structures so that you are familiar with the general techniques for manipulating them.
In this tutorial you will learn about some of the most commonly used data structures in computer programming; arrays, queues, stacks, linked lists and classes.
Arrays
An array is a collection of items stored in a contiguous memory location and addressed using one or more indices.
An array is a common data structure that represents a collection of items of a similar type. The items in an array are stored in contiguous memory locations. In most languages (not JavaScript) an array is a homogeneous data structure because the all the items of an array are of the same data type. Any array item can be directly accessed by using an index. In .NET Framework, array indexes are zero-based.
Introducing Arrays
Array Properties and Methods
Internal Representation
In the following C# code, the first statement declares an array variable, and the second statement initializes the variable with an array of four integers:
int[] numbers;
numbers = new int[4];
At first, the variable numbers
is stored on the stack and set to a null value because the array is not yet initialized. However, the second statement initializes the array by allocating a contiguous memory space big enough to store four integers in the memory heap. All the array elements are initialized in this case with the value 0 (zero) because 0 is the default value for an integer. The starting address of the memory allocation is stored with the array variable numbers
on the stack, as shown in figure 3-2.
Figure 3-2: In the first row, the numbers array is created and null is stored for its value on the stack. In row 2, the numbers array is initialized and the beginning address of the contiguous area of memory where the array is stored is placed on the heap in place of the value null.
The variable numbers
then acts as a reference to the memory location assigned to the array. The array name can be used to access each of the array items directly. In the .NET Framework, all arrays are zero-based—that is, the first item of the array is accessed using an index of numbers[0], the second item is accessed by numbers[1], and so on.
yes, we could do this all in one statement if we wanted to, like this:
int[] numbers = new int[4];
Passing Arrays as Arguments
MSDN: Passing arrays as arguments (C# Programming Guide)
Multi-Dimensional Arrays
It is also possible to have multi-dimensional arrays. A two-dimensional array can be thought of as a table in which each cell is an array element and can be addressed using the numbers of the row and column to which it belongs. Both the row number and column number are indexed by zero. For example, the expression table[2, 3]
would refer to an item in the third row and fourth column of an array by the name table
.
Common Operations
Arrays support the following operations:
To work with an array, you will need to allocate the memory initializing the array, as shown previously. Once the address in memory for storing the array is allocated, you can access any array element in any order you please by directly referring to its index. For example, the following code assigns a value of 10 to the fourth item of the array, and twice that value is then assigned to the variable calc
:
number[3] = 10;
int calc = number[3] * 2; // calc is assigned the value of 20
Performance and Usage
The contents of an array are laid out as a contiguous block of memory and can be accessed directly by using the array index. Thus, reading from or writing to an array is extremely fast. However, arrays are limited by the requirements of homogeneity and fixed size. Although array size can be increased, doing so requires reallocation of all the array elements and is a time-consuming operation.
Arrays work best when the number of items in the collection is predetermined and fast, direct access to each item is required.
In the .NET Framework, you can use the ArrayList
class to get around an array’s requirements for homogeneity and fixed size. An ArrayList
is a collection type that can hold items of any data type and dynamically expand when needed. However, an ArrayList
is not as fast as an array.
Visual Studio Solution for this tutorial (link will download zip file).
⇑ Table of Contents
Array Example Code
C#
Here is how the code in our Program.cs
file will look. Place the code inside the Main() method and run it.
string[] array = new string[3];
Console.Write("Enter name: ");
array[0] = Console.ReadLine();
Console.Write("Enter email: ");
array[1] = Console.ReadLine();
Console.Write("Enter phone: ");
array[2] = Console.ReadLine();
Console.WriteLine("Hello {0},
your email address is {1},
and your phone number is {2}",
array[0], array[1], array[2]);
Array Example Dissected
In the example above, you can see that the first line of code added to the Main() method initializes an array which can hold three string values and has an ID of "array". In the next six lines of code the user is prompted three times, once to enter their name, once to enter their email address, and once to enter their phone number. Each time the user types in a value and presses enter, that value is placed into the array at the appropriate index. Remember arrays are "zero-based," so when the user entered values are inserted into the array, the first entry will be stored at array index 0, the second value entered at array index 1 and the third value entered are array index 2 . To retrieve each value we can simply type the name of the array we want to retrieve a value from, "array" in this case, and the index position where the value we want to retrieve is stored in the array, like this array[0], array[1], and array[2].
Manipulating String Output
Look closely at final line of code in the example above, notice how the Console.WriteLine() method has placeholders inserted into the string text (the text between the quotes) that we want to output to the console. By inserting {0} into the string, at runtime the placeholder will be replaced by the value of the first variable listed after the string, array[0] in this case. In this example the {0} is a placeholder for the array[0] value, {1} is the placeholder for the array[1] value and {2} is the placeholder for the array[2] value.
Another concept being demonstrated here is the you can combine one to string to another using the concatenation "+" operator. In this case it was done in order to take a long line of code and display it in the code editor on two lines in order to fit the width of the screen capture image. You could also concatenate strings to variable values like this:
"Hello " + array[0] + ", your email address is " + array[1] +
", and your phone number is " + array[2]
This method can be a little trickier because you have to make sure that the spaces you want in your output are properly added into your concatenated string. That's where the Composite Formatting or the String Interpolation operator can help.
Composite Formatting
Using the example from above, when can modify the output like this and still get the same results:
"Hello {0}, your email address is {1}, and your phonenumber is {2}", array[0], array[1], array[2]
In the case of Composite Formatting we are adding placeholders into the string value to be output using curly braces and a zero-based notation which corresponds to the order of the comma separated variables list that comes after the end of the string; e.g. {0} is replaced with the value of the array element 0, {1} is replaced by the value of the array element 1 and {2} is replaced with the value of the array element 2.
If we change things up a bit we could easily switch the order the variables are listed as to something like this:
"Hello {0}, your email address is {1}, and your phonenumber is {2}", array[0], array[2], array[1]
In this example the output would display the value of array element 2 before it displays the value of array element 1.
String Interpolation Operator
But wait, using the String Interpolation operator we can make our statement even clearer and simpler like this:
$"Hello {array[0]}, your email address is {array[1]}, and your phonenumber is {array[2]}"
By adding the string interpolation operator, $, before the beginning of the string we can insert our variables by name (and index value in the case of our array) directly into the string. This makes the code much easier to read and understand. This is the recommended what to combine variables with string output in C# programs starting with C# version 6.
⇑ Table of Contents
Queues
A queue is a collection of items in which the first item added to the collection is the first one to be removed.
The queue data structure mimics a real-life queue. In a queue, items are processed in the order in which they were added to the queue. In particular, items are always added at the end of the queue and removed from the front of the queue. This is also commonly known as first-in, first-out (FIFO) processing. The capacity of a queue is the number of items the queue can hold. However, as elements are added to the queue, the capacity is automatically increased. A queue is also a heterogeneous data structure, meaning that items in a queue can be of different data types.
Internal Representation
In order to avoid excessive reallocation of memory space and allow easy management, a queue is often internally implemented as a circular array of objects, as shown in Figure 3-3.
Figure 3-3: This figure depicts the internal representation of a queue data structure.
Within a queue, the head index points to the first item, and the tail index points to the last item. In Figure 3-3, for example, the head index points to location 2 on the queue. Because the queue is circular, as long as you can keep track of the head and tail pointers, it doesn’t matter what location the queue starts from. When an item is removed, the head moves to the
next item in the queue. When a new item is added, it always appears at the end of the queue, and the tail starts pointing to the newly added item. Any null slots in a queue (including the one depicted in Figure 3-3) are the empty spots that can be filled before the queue will require a memory reallocation.
The.NET Framework provides an implementation of the queue data structure as part of the Queue
class in the System.Collections
namespace. In programming languages that don’t provide an implementation of a queue, you can write your own Queue class by using an array-like data structure and simulating the queue operations.
Common Operations
A queue supports the following common operations:
- Enqueue: The enqueue operation first checks whether there is enough capacity available in the queue to add one more item. If capacity is available, the item is added to the tail end of the queue. If there is no space available in queue, the array is reallocated by a pre-specified growth factor, and the new item is then added to the queue.
- Dequeue: The dequeue operation removes the current element at the head of the queue and sets the head to point to the next element.
- Peek: The peek operation allows you to look at the current item at the head position without actually removing it from the queue.
- Contains: The contains operation allows you to determine whether a particular item exists in the queue.
Performance and Usage
A queue is a special-purpose data structure that is best suited for an application in which you need to process items in the order they were received. Some examples may include
print spoolers, messaging systems, and job schedulers. Unlike an array, a queue cannot be used to randomly access elements. Operations such as enqueue and dequeue actually add and remove the items from the queue.
Queue Code Example in C#
Notice how in this C# example the System.Collections library is added to the Using directives. This is because the Queue class is located in the System.Collections library of the .NET framework.
using System;
using System.Collections;
public class SamplesQueue
{
public static void Main()
{
// Creates and initializes a new Queue.
Queue myQ = new Queue();
myQ.Enqueue("Hello");
myQ.Enqueue("World");
myQ.Enqueue("!");
// Displays the properties and values of the Queue.
Console.WriteLine("myQ");
Console.WriteLine($"\tCount: {myQ.Count}");
Console.Write("\tValues:");
PrintValues(myQ);
}
public static void PrintValues(IEnumerable myCollection)
{
foreach (Object item in myCollection)
Console.Write($"{item}");
Console.WriteLine();
}
}
/*
This code produces the following output.
myQ
Count: 3
Values: Hello World!
*/
//NOTE: IEnumerable is an interface which exposes an enumerator
//by using simple iteration over a non-generic collection.
⇑ Table of Contents
Stacks
A stack is a collection of items in which the last item added to the collection is the first one to be removed.
As opposed to a queue, a stack is a last-in, first-out (LIFO) data structure. Think of a stack as similar to a stack of dinner plates on a buffet table; here, the last plate to be added is also the first plate to be removed. The capacity of a stack refers to the number of items it can hold. However, as elements are added to a stack, the stack’s capacity is automatically increased. A stack is a heterogeneous data structure, meaning that the items within it can be of different data types.
Internal Representation
Like a queue, a stack is often implemented as a circular buffer in order to avoid excessive reallocation of memory space and permit
easier management. A stack can be visualized just like the queue shown in Figure 3-3, except that the tail is now called the top of the stack and the head is now called the bottom of the stack.
New items are always added to the top of a stack; when this happens, the top of the stack starts pointing to the newly added element. Items are also removed from the top of the stack, and when that happens, the top of the stack is adjusted to point to the next item in the stack.
The .NET Framework provides an implementation of the stack data structure as part of the Stack
class in the System.Collections
namespace. In programming languages that don’t provide an implementation of the stack, you can write your own Stack class by using an array-like data structure and simulating the stack operations.
Common Operations
A stack supports the following common operations:
-
Push: The push operation first checks whether there is enough capacity available in the stack to add one more item. If capacity is available, the item is added to the top of the stack. If there is no space in the stack, the array is reallocated by a pre-specified growth factor, and then the new item is added to the stack.
-
Pop: The pop operation removes the element at the top of the stack and sets the top to point to the next element in the stack.
-
Peek: The peek operation allows you to look at the current item at the top of the stack without actually removing it from the stack.
-
Contains: The contains operation allows you to determine whether a particular item exists in the stack.
Performance and Usage
A stack is a special-purpose data structure that is best suited for applications in which you need to process items in last-in, first-out order. Stacks are useful structures because of their applications in runtime memory management, expression evaluation, method-call tracking, etc. Unlike an array, a stack cannot be used to access elements randomly. Operations such as push and pop actually add and remove the items from the stack.
Stack Code Example In C#
This code demonstrates the last in first out nature of the stack data structure. Instead of being.
using System;
using System.Collections;
public class SamplesStack {
public static void Main() {
// Creates and initializes a new Stack.
Stack myStack = new Stack();
myStack.Push("Hello");
myStack.Push("World");
myStack.Push("!");
// Displays the properties and values of the Stack.
Console.WriteLine( "myStack" );
Console.WriteLine( "\tCount: {0}", myStack.Count );
Console.Write( "\tValues:" );
PrintValues( myStack );
}
public static void PrintValues( IEnumerable myCollection ) {
foreach ( Object obj in myCollection )
Console.Write( "{0}", obj );
Console.WriteLine();
}
}
/*
This code produces the following output.
myStack
Count: 3
Values: ! World Hello
*/
⇑ Table of Contents
Linked Lists
A linked list is a collection of nodes in which each node contains a reference (or link) to the next node in the sequence. Unlike an array, the items in a linked list need not be contiguous; therefore, a linked list does not require reallocation of memory space for the entire list when more items must be added.
Internal Operations
In memory, a linked list can be visualized as a collection of nodes, as shown in Figure 3-4.
Figure 3-4: This image depicts the internal representation of the Linked List data structure.
Each node in a linked list contains two pieces of information: the data corresponding to the node, and the link to the next node. The first node of the list is called the head node. Using the link in the head node, you can get to the next node and continue traversing nodes until the final link is a null value. Often, the term tail is used to refer to the list pointed to by the head node—that is, it refers to everything after the head node. Thus, in Figure 3-4, the tail is the linked list starting from node B.
Several other implementations of linked lists may also be used depending on requirements. For instance, in a circular linked list, the last node in the list points back to the first node to create a circle. In contrast, in a doubly linked list, each node contains two links, as shown in Figure 3-5.
Figure 3-5: Internal depiction of a circular linked list data structure.
At each node of a doubly linked list, one link is a forward reference that points to the next node in the sequence, and the other link is a backward reference that points to the previous node in the sequence. As you can imagine, a doubly linked list is easy to traverse in either direction.
The .NET Framework provides a LinkedList
class as part of the System.Collections.Generic
namespace. This class implements a homogeneous doubly linked list of the specified data type. You can also write your own classes to implement a different kind of linked list implementation.
Common Operations
A linked list supports the following common operations:
- Add: Adding or inserting an item in a linked list is a matter of changing links, as shown in Figure 3-6. Say you want to insert a new node (with value Z) between the nodes with values A and B. First, you need to allocate memory for the new node and assign value Z to the data section of the node. Next, you must copy the link section of node A to the link section of node Z so that node Z is pointing to node B. Finally, you must copy the address of the newly created node Z to the link section of node A so that node A starts pointing to node Z.
- Remove: Similar to the add operation, the remove or delete operation is also a matter of changing links. For example, to delete the third node in Figure 3-4, you would change the link for the second node to a null value. The third node will now be an unreferenced piece of memory, and it will eventually be returned to the pool of avail- able memory.
- Find :The find operation finds a node with a given value in the linked list. To find a value, you generally start from the head node and check whether the value matches. If not, you follow the link to the next node and continue the find operation until you reach the end of the list, which happens when you encounter a null link.
Figure 3-6: Depiction of a new item being added to a linked list.
Performance and Usage
A linked list does not allow random access to its items. The only way to get to an item is to start from the head node and follow the links from there. As a result, linked lists are slow at retrieving data. However, for insert and delete operations, linked lists are extremely fast, because insertion or deletion of a node involves simply changing a link. Linked lists also have no maximum capacity after which their contents need to be reallocated.
In fact, a linked list provides an alternative way to implement the queue and the stack data structures. If your requirements call for frequent access to data but you seldom need to insert or delete data, an array is the preferred implementation. If, however, your requirements call for frequent insert and delete operations, then a linked list may be a better implementation.
⇑ Table of Contents
Classes
Most modern programming languages support a data structure known as a class. A class is an extensible program-code-template for creating objects, providing initial values for state (member variables) and implementations of behavior (member functions or methods). In many languages, the class name is used as the name for the class (the template itself), the name for the default constructor of the class (a subroutine that creates objects), and as the type of objects generated by instantiating the class; these distinct concepts are easily conflated.
In this course we do not teach you how to write and use your own custom classes; you will learn how to do that in a later programming class like C# level 2 and C++ level 2. However, if you completed assignment 1 using the C# language then you have already learned to how to use a class in your program, the built-in Console class provided by the .NET framework. At this stage of your programming career you need to focus on basic coding structures, but knowing how to use some of the common classes that are built-in to all modern-day programming languages is important too.
Class Libraries and Namespaces
As a quick refresher let's remind ourselves that the Microsoft .NET framework is a library of pre-built classes for us to use in our own programs without having to write all the code needed to get the same functionality. In order to begin learning about which classes are available in the .NET Framework we need to visit Microsoft's online documentation for the .NET Framework Class Library (important note: the .NET Framework is similar to but different from the .NET Standard and the newer .NET Core class libraries).
The .NET Framework Class library is organized into namespaces which are collections of classes that have related functionality. Looking at the description for the System namespace we can see that; "the System namespace contains fundamental classes and base classes that define commonly-used value and reference data types, events and event handlers, interfaces, attributes, and processing exceptions." I know that's a lot to digest but, your takeaway at this point need only be an understanding that the System namespaces has classes we will use quite frequently in our own programs. Let's click on the System hypertext link and see what we find.
What we find is a list of all the classes that comprise the System namespace. In fact this would be a good time to remind you of what your HelloWorld program code looked like when written using the C# programming language.
The HelloWorld program written in C#.
Using Classes From The System Namespace
Looking at line 1 of the code you can see that there is a using directive that reads "using System;" this tells the Visual Studio program to load the System namespace into memory thereby giving our HelloWorld program access to the System namespace's library of classes.
If you scroll down through the list of classes in the System namespace you will encounter the Console class and if you read the description you will learn that the Console class represents the standard input, output, and error streams for console applications. Click on the Console hypertext link and you will see this
The top of the C# Console class page.
Scroll down on the C# Console class page; you will see that it is organized into its constituent components of: properties, methods and events. Scroll down to the Methods section and look towards the bottom of the list for the WriteLine() method's description. Again, the Microsoft document editors don't really make it clear what the class does so I'll interpret for you. The WriteLine() method will write to the screen ("the standard output stream") whatever value type you place into its parenthesis and then generates a line feed ("the current line terminator") to move the cursor to the next line.
Compare that with the Write() method which outputs to the screen whatever value type you place in its parenthesis, but it does not generate a line feed, the cursor remains on the line where the string was displayed. From this you might deduce that "the current line terminator" refers to generating a line feed just as if you had pressed the Enter (Return on a Mac) key on your keyboard. Now click on the hypertext link for WriteLine() and on the page that comes up you'll see coding examples of how to use the WriteLine() method.
In addition to the Console class, the .NET class library houses thousands of pre-built classes. In fact most modern languages have libraries of classes to help programmers get things done faster. That is really one of the goals as a computer programmer is to continually learn about new class libraries, their classes, and how to use them. In C# even the data primitives are actually classes with corresponding properties, methods and events built-into them. That means that in C# even a Sting data type has pre-built properties, methods, and event associated to it. You will also find that many modern day languages have classes for commonly performed tasks like storing and manipulating dates - the Date class (or DateTime in C#); etc.
⇑ Table of Contents
Video Lecture
⇑ Table of Contents
Summary
In this lesson you have learned about what data structures are and were provided with several examples of commonly used data structures in computer programming including: arrays, queues, stacks, linked lists and classes.
⇑ Table of Contents