Abstract Data Types:
Most of the concepts introduced in the previous section should be familiar ideas from a beginning course in programming. The one possibly new notion is that of an abstract data type, and before proceeding it would be useful to discuss the role of abstract data types in the overall program design process. To begin, it is useful to compare an abstract data type with the more familiar notion of a procedure.
Procedures, an essential tool in programming, generalize the notion of an operator. Instead of being limited to the built-in operators of a programming language (addition, subtraction, etc.), by using procedures a programmer is free to define his own operators and apply them to operands that need not be basic types. An example of a procedure used in this way is a matrix multiplication routine.
Another advantage of procedures is that they can be used to encapsulate parts of an algorithm by localizing in one section of a program all the statements relevant to a certain aspect of a program. An example of encapsulation is the use of one procedure to read all input and to check for its validity. The advantage of encapsulation is that we know where to go to make changes to the encapsulated aspect of the problem. For example, if we decide to check that inputs are nonnegative, we need to change only a few lines of code, and we know just where those lines are.
Definition of Abstract Data Type
We can think of an abstract data type (ADT) as a mathematical model with a collection of operations defined on that model. Sets of integers, together with the operations of union, intersection, and set difference, form a simple example of an ADT. In an ADT, the operations can take as operands not only instances of the ADT being defined but other types of operands, e.g., integers or instances of another ADT, and the result of an operation can be other than an instance of that ADT. However, we assume that at least one operand, or the result, of any operation is of the ADT in question.
The two properties of procedures mentioned above generalization and encapsulation -- apply equally well to abstract data types. ADT's are generalizations of primitive data types (integer, real, and so on), just as procedures are generalizations of primitive operations (+, -, and so on). The ADT encapsulates a data type in the sense that the definition of the type and all operations on that type can be localized to one section of the program. If we wish to change the implementation of an ADT, we know where to look, and by revising one small section we can be sure that there is no subtlety elsewhere in the program that will cause errors concerning this data type. Moreover, outside the section in which the ADT's operations are defined, we can treat the ADT as a primitive type; we have no concern with the underlying implementation. One pitfall is that certain operations may involve more than one ADT, and references to these operations must appear in the sections for both ADT's.
To illustrate the basic ideas, consider the procedure greedy of the previous section implemented using primitive operations on an abstract data type LIST (of integers). The operations performed on the LIST newclr were:
1. make a list empty,
2. get the first member of the list and return null if the list is empty,
3. get the next member of the list and return null if there is no next member, and
4. insert an integer into the list.
There are many data structures that can be used to implement such lists efficiently. In Fig. 1.8, if we replace these
operations by the statements
1. MAKENULL(newclr);
2. w := FIRST(newclr);
3. w := NEXT(newclr);
4. INSERT(v, newclr);
then we see an important aspect of abstract data types.
An implementation of an ADT is a translation, into statements of a programming language, of the declaration that defines a variable to be of that abstract data type, plus a procedure in that language for each operation of the ADT.
An implementation chooses a data structure to represent the ADT; each data structure is built up from the basic data types of the underlying programming language using the available data structuring facilities.
Arrays and record structures are two important data structuring facilities that are available in Pascal. For example, one possible implementation for variable S of type SET would be an array that contained the members of S.
One important reason for defining two ADT's to be different if they have the same underlying model but different operations is that the appropriateness of an implementation depends very much on the operations to be performed.
Basic Abstract DataTypes
We consider lists, which are sequences of elements, and two special cases of lists: stacks, where elements are inserted and deleted at one end only, and queues, where elements are inserted at one end and deleted at the other. We then briefly study the mapping or associative store, an ADT that behaves as a function. For each of these ADT's we consider several implementations and compare their relative merits.
The Abstract Data Type "List"
Lists are a particularly flexible structure because they can grow and shrink on demand, and elements can be accessed, inserted, or deleted at any position within a list. Lists can also be concatenated together or split into sublists. Lists arise routinely in applications such as information retrieval, programming language translation, and simulation.
Mathematically, a list is a sequence of zero or more elements of a given type (which we generally call the element type). We often represent such a list by a comma-separated sequence of elements
al, a2, . . . ,an
where n 0, and each ai is of type elementtype. The number n of elements is said to be the length of the list. Assuming n 1, we say that a1 is the first element and an is the last element. If n = 0, we have an empty list, one which has no elements.
An important property of a list is that its elements can be linearly ordered according to their position on the list. We say ai precedes ai+1 for i = 1, 2, . . . , n-1, and ai follows ai-1 for i = 2, 3, . . . ,n. We say that the element ai is at position i. It is also convenient to postulate the existence of a position following the last element on a list. The function END(L) will return the position following position n in an n- element list L. Note that position END(L) has a distance from the beginning of the list that varies as the list grows or shrinks, while all other positions have a fixed distance from the beginning of the list.
To form an abstract data type from the mathematical notion of a list we must define a set of operations on objects of type LIST.† As with many other ADT's we discuss in this book, no one set of operations is suitable for all applications. Here, we shall give one representative set of operations. In the next section we shall offer several data structures to represent lists and we shall write procedures for the typical list operations in terms of these data structures.
To illustrate some common operations on lists, let us consider a typical application in which we have a mailing list from which we wish to purge duplicate entries. Conceptually, this problem can be solved quite simply: for each item on the list, remove all equivalent following items. To present this algorithm, however, we need to define operations that find the first element on a list, step through all successive elements, and retrieve and delete elements.
We shall now present a representative set of list operations. In what follows, L is a list of objects of type elementtype, x is an object of that type, and p is of type position. Note that "position" is another data type whose implementation will vary for different list implementations. Even though we informally think of positions as integers, in practice, they may have another representation.
1. INSERT(x, p, L). Insert x at position p in list L, moving elements at p and following positions to the next higher position. That is, if L is al, a2, . . . ,an, then L becomes a1, a2,. . . ,ap- 1, x, ap, . . . ,an. If p is END(L), then L becomes a1, a2, . . . , an, x. If list L has no position p, the result is undefined.
2. LOCATE(x, L). This function returns the position of x on list L. If x appears more than once, then the position of the first occurrence is returned. If x does not appear at all, then END(L) is returned.
3. RETRIEVE(p, L). This function returns the element at position p on list L. The result is undefined if p = END(L) or if L has no position p. Note that the elements must be of a type that can be returned by a function if RETRIEVE is used. In practice, however, we can always modify RETRIEVE to return a pointer to an object of type elementtype.
4. DELETE(p, L). Delete the element at position p of list L. If L is a1, a2, . . . ,an, then L becomes a1, a2, . . . ,ap- 1, ap+1, . . . ,an. The result is undefined if L has no position p or if p = END(L).
5. NEXT(p, L) and PREVIOUS(p, L) return the positions following and preceding position p on list L. If p is the last position on L, then NEXT(p, L) = END(L). NEXT is undefined if p is END(L). PREVIOUS is undefined if p is
1. Both functions are undefined if L has no position p.
6. MAKENULL(L). This function causes L to become an empty list and returns position END(L).
7. FIRST(L). This function returns the first position on list L. If L is empty, the position returned is END(L).
8. PRINTLIST(L). Print the elements of L in the order of occurrence.
Most of the concepts introduced in the previous section should be familiar ideas from a beginning course in programming. The one possibly new notion is that of an abstract data type, and before proceeding it would be useful to discuss the role of abstract data types in the overall program design process. To begin, it is useful to compare an abstract data type with the more familiar notion of a procedure.
Procedures, an essential tool in programming, generalize the notion of an operator. Instead of being limited to the built-in operators of a programming language (addition, subtraction, etc.), by using procedures a programmer is free to define his own operators and apply them to operands that need not be basic types. An example of a procedure used in this way is a matrix multiplication routine.
Another advantage of procedures is that they can be used to encapsulate parts of an algorithm by localizing in one section of a program all the statements relevant to a certain aspect of a program. An example of encapsulation is the use of one procedure to read all input and to check for its validity. The advantage of encapsulation is that we know where to go to make changes to the encapsulated aspect of the problem. For example, if we decide to check that inputs are nonnegative, we need to change only a few lines of code, and we know just where those lines are.
Definition of Abstract Data Type
We can think of an abstract data type (ADT) as a mathematical model with a collection of operations defined on that model. Sets of integers, together with the operations of union, intersection, and set difference, form a simple example of an ADT. In an ADT, the operations can take as operands not only instances of the ADT being defined but other types of operands, e.g., integers or instances of another ADT, and the result of an operation can be other than an instance of that ADT. However, we assume that at least one operand, or the result, of any operation is of the ADT in question.
The two properties of procedures mentioned above generalization and encapsulation -- apply equally well to abstract data types. ADT's are generalizations of primitive data types (integer, real, and so on), just as procedures are generalizations of primitive operations (+, -, and so on). The ADT encapsulates a data type in the sense that the definition of the type and all operations on that type can be localized to one section of the program. If we wish to change the implementation of an ADT, we know where to look, and by revising one small section we can be sure that there is no subtlety elsewhere in the program that will cause errors concerning this data type. Moreover, outside the section in which the ADT's operations are defined, we can treat the ADT as a primitive type; we have no concern with the underlying implementation. One pitfall is that certain operations may involve more than one ADT, and references to these operations must appear in the sections for both ADT's.
To illustrate the basic ideas, consider the procedure greedy of the previous section implemented using primitive operations on an abstract data type LIST (of integers). The operations performed on the LIST newclr were:
1. make a list empty,
2. get the first member of the list and return null if the list is empty,
3. get the next member of the list and return null if there is no next member, and
4. insert an integer into the list.
There are many data structures that can be used to implement such lists efficiently. In Fig. 1.8, if we replace these
operations by the statements
1. MAKENULL(newclr);
2. w := FIRST(newclr);
3. w := NEXT(newclr);
4. INSERT(v, newclr);
then we see an important aspect of abstract data types.
An implementation of an ADT is a translation, into statements of a programming language, of the declaration that defines a variable to be of that abstract data type, plus a procedure in that language for each operation of the ADT.
An implementation chooses a data structure to represent the ADT; each data structure is built up from the basic data types of the underlying programming language using the available data structuring facilities.
Arrays and record structures are two important data structuring facilities that are available in Pascal. For example, one possible implementation for variable S of type SET would be an array that contained the members of S.
One important reason for defining two ADT's to be different if they have the same underlying model but different operations is that the appropriateness of an implementation depends very much on the operations to be performed.
Basic Abstract DataTypes
We consider lists, which are sequences of elements, and two special cases of lists: stacks, where elements are inserted and deleted at one end only, and queues, where elements are inserted at one end and deleted at the other. We then briefly study the mapping or associative store, an ADT that behaves as a function. For each of these ADT's we consider several implementations and compare their relative merits.
The Abstract Data Type "List"
Lists are a particularly flexible structure because they can grow and shrink on demand, and elements can be accessed, inserted, or deleted at any position within a list. Lists can also be concatenated together or split into sublists. Lists arise routinely in applications such as information retrieval, programming language translation, and simulation.
Mathematically, a list is a sequence of zero or more elements of a given type (which we generally call the element type). We often represent such a list by a comma-separated sequence of elements
al, a2, . . . ,an
where n 0, and each ai is of type elementtype. The number n of elements is said to be the length of the list. Assuming n 1, we say that a1 is the first element and an is the last element. If n = 0, we have an empty list, one which has no elements.
An important property of a list is that its elements can be linearly ordered according to their position on the list. We say ai precedes ai+1 for i = 1, 2, . . . , n-1, and ai follows ai-1 for i = 2, 3, . . . ,n. We say that the element ai is at position i. It is also convenient to postulate the existence of a position following the last element on a list. The function END(L) will return the position following position n in an n- element list L. Note that position END(L) has a distance from the beginning of the list that varies as the list grows or shrinks, while all other positions have a fixed distance from the beginning of the list.
To form an abstract data type from the mathematical notion of a list we must define a set of operations on objects of type LIST.† As with many other ADT's we discuss in this book, no one set of operations is suitable for all applications. Here, we shall give one representative set of operations. In the next section we shall offer several data structures to represent lists and we shall write procedures for the typical list operations in terms of these data structures.
To illustrate some common operations on lists, let us consider a typical application in which we have a mailing list from which we wish to purge duplicate entries. Conceptually, this problem can be solved quite simply: for each item on the list, remove all equivalent following items. To present this algorithm, however, we need to define operations that find the first element on a list, step through all successive elements, and retrieve and delete elements.
We shall now present a representative set of list operations. In what follows, L is a list of objects of type elementtype, x is an object of that type, and p is of type position. Note that "position" is another data type whose implementation will vary for different list implementations. Even though we informally think of positions as integers, in practice, they may have another representation.
1. INSERT(x, p, L). Insert x at position p in list L, moving elements at p and following positions to the next higher position. That is, if L is al, a2, . . . ,an, then L becomes a1, a2,. . . ,ap- 1, x, ap, . . . ,an. If p is END(L), then L becomes a1, a2, . . . , an, x. If list L has no position p, the result is undefined.
2. LOCATE(x, L). This function returns the position of x on list L. If x appears more than once, then the position of the first occurrence is returned. If x does not appear at all, then END(L) is returned.
3. RETRIEVE(p, L). This function returns the element at position p on list L. The result is undefined if p = END(L) or if L has no position p. Note that the elements must be of a type that can be returned by a function if RETRIEVE is used. In practice, however, we can always modify RETRIEVE to return a pointer to an object of type elementtype.
4. DELETE(p, L). Delete the element at position p of list L. If L is a1, a2, . . . ,an, then L becomes a1, a2, . . . ,ap- 1, ap+1, . . . ,an. The result is undefined if L has no position p or if p = END(L).
5. NEXT(p, L) and PREVIOUS(p, L) return the positions following and preceding position p on list L. If p is the last position on L, then NEXT(p, L) = END(L). NEXT is undefined if p is END(L). PREVIOUS is undefined if p is
1. Both functions are undefined if L has no position p.
6. MAKENULL(L). This function causes L to become an empty list and returns position END(L).
7. FIRST(L). This function returns the first position on list L. If L is empty, the position returned is END(L).
8. PRINTLIST(L). Print the elements of L in the order of occurrence.
No comments:
Post a Comment