A data structure is a collection of data elements that are organized in some way. Learn about the different types of data structures in programming, such as files, lists, arrays, stacks, queues and trees.
Data structures provide a way of storing and organizing data in a computer. Programming uses a number of different data structures. Within a program, a data structure is a collection of data elements that are organized in some way. For example, a collection of elements could be organized and numbered in sequence. Common data structures are files, lists, arrays, stacks, queues and trees.
A computer file is used for storing information. Computer programs can use files to read the information that needs to be processed and to write the results of the processing. The data inside a file is typically organized in smaller packets of information, which are often referred to as 'records' or 'lines.'
For example, a file can contain the payroll information for each employee, with each employee represented by a line. A computer program can read this information line by line and perform some type of payroll-related operation, such as calculating benefits for a certain pay period. The results could be added to the existing file or written to a new file. Files make it possible for different programs to share information, as one file can be passed onto the next.
A list contains elements of one particular data type. For example, a list could contain strings. An example would be the names of all the players on a soccer team. Each name is a string, but when you organize all the names together, they form a list. A list is the simplest data structure.
For example, a list of strings could look like this:
('John,' 'Paul,' 'George,' 'Ringo')
A list of numbers could look like this:
(67, 84, 92, 52, 81, 75)
Each element in a list is identified by a specific index. All elements in a list are ordered in one particular sequence, and the index for an element tells you at what position in the sequence that element is located. Typically, the index value of the first position is zero (0).
An array is a data structure where the elements are identified by one or more indices. An array is similar to a list, but an array can have multiple dimensions. A one-dimensional array is the same as a list: a linear sequence of elements that are all of the same type.
In a two-dimensional array, the elements are organized in two dimensions, which you can think of as the rows and columns of a table. This type of array uses two indices: one for rows and one for columns. The unique combination of two index values represents a unique cell in the table. Each cell corresponds to an element, which can be a string, a number or some other type of data.
A two-dimensional array is also called a matrix. A three-dimensional array can be represented by a cube and uses three indices. Arrays can have more dimensions, but they are more difficult to visualize.
Arrays make it possible to organize data in an efficient manner because the indices make it possible to retrieve any element. They are relatively easy to implement. However, all the elements have to be of the same type, and searching through a large array can be time consuming if it is not sorted.
A stack is a data structure where elements are only inserted or deleted at the top. A stack is similar to a list, but it can contain any type of element. Contrary to a list, the order of the elements is fixed and cannot be changed. This limits the operations that can be performed on it. There are basically only two operations: push and pop. Push adds an item to the top of the stack, and pop removes an item from the top of the stack.
A stack is a last in, first out (LIFO) data structure. Elements are removed in the reverse order of their addition. The easiest way to visualize a stack is as a pile of books, and the only thing you can do is to add a book to the top or remove a book from the top. This is not a very sophisticated data structure, but it is very efficient. Stacks can be very large, and yet you can work with a stack very quickly since you are only concerned with the item at the top.
A queue is a data structure where the elements are kept in order and where elements are inserted at one end and deleted at the other. A queue is a first in, first out (FIFO) data structure. The first element in a queue will be the first one to be removed. You can also keep adding elements or keep removing elements, and the queue can get longer or shorter.
The easiest way to visualize a queue is a line of people waiting at the bank. New people coming into the bank are added to the back of the line. When a teller opens up, the person leaving the line is at the very front and has been there the longest of all the people in the line.
A queue is very useful in handling an intermediate step in a program. Imagine two processes that run at variable speeds due to the nature of the elements being processed - some elements are more complicated than others and take more time. If the processes are connected in a direct sequence, where the output of the first process becomes the input into the second process, the program may not be very efficient since the two processes may end up having to wait on each other. A queue between the two processes allows the programs to be more efficient.
When the first process runs faster than the second one, the length of the queue will grow, but the first process does not need to slow down. When the second process runs faster than the first one, the length of the queue will shorten, but the second process does not have to wait for the first one. The use of a queue reduces the overall wait time. Of course, if the second process is consistently faster than the first one, the queue will become empty, and some of the efficiency is lost.
A tree is a hierarchical data structure. It looks somewhat like a tree structure with branches but is typically shown upside down. A tree is a type of connected graph where each element is represented as a node in a graph. Relationships between nodes are described in terms of parents and children.
Each node in the tree is connected to one parent node higher up in the structure, with the exception of the node at the very top, which does not have a parent. In the example below, node number two does not have a parent and is considered the root of the tree.
Node #2 is the root.
Each node in the tree can have any number of children lower down in the structure, including zero. In the example, nodes two, six and seven have two children each, while nodes five and nine have one child each. Nodes three, four, five and eleven have no children and are then considered leaves.
Trees are a more complicated data structure. They are typically used to represent hierarchical data or collections of sorted lists. For example, consider the folder and file structure on your computer.
The root is typically the C: drive, and your files are organized in a number of folders. Each folder can contain any number of folders or files. This can go various levels deep, so the actual location of a file could be something like C:\MyDocuments\Projects2013\MyReport.doc. You can represent the entire structure of all your files and folders as one tree structure.
A data structure is a collection of data elements that are organized in some way. Common data structures are files, lists, arrays, stacks, queues and trees.
Upon completing this lesson, you will be able to:
- Define data structure
- Describe common data structures, such as files, lists, arrays, stacks, queues and trees