Advanced Data Structures and Algorithms
Material
UNIT 1
1)
Algorithms: An algorithm is a step-by-step procedure for
calculations. Algorithms are used for calculation, data processing, and automated
reasoning. An algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Starting from an initial state and initial
input (perhaps empty), the instructions describe a computation that, when executed, proceeds through a finite number of well-defined successive states, eventually
producing "output" and terminating at a final ending state.
Performance Analysis: Analysis of algorithms is the determination of the amount of
resources (such as time and storage) necessary to execute them. Most algorithms are designed to work with inputs of arbitrary
length. Usually, the efficiency or running time of an algorithm is stated as a
function relating the input length to the number of steps (time complexity) or storage locations (space complexity). Algorithm analysis is an important part of a
broader computational complexity theory, which provides theoretical estimates for the
resources needed by any algorithm which solves a given computational
problem. These estimates
provide an insight into reasonable directions of search for efficient
algorithms.
In theoretical analysis of algorithms it is
common to estimate their complexity in the asymptotic sense, i.e., to estimate
the complexity function for arbitrarily large input. Big O notation, Big-omega
notation and Big-theta
notation are
used to this end. For instance, binary search is said to run in a number of steps
proportional to the logarithm of the length of the list being searched, or in
O(log n).
Cost
Model: Time efficiency
estimates depend on what we define to be a step. For the analysis to correspond
usefully to the actual execution time, the time required to perform a step must
be guaranteed to be bounded above by a constant. One must be careful here; for
instance, some analyses count an addition of two numbers as one step. This
assumption may not be warranted if the numbers involved can be arbitrarily
large, since then the time required by a single addition can no longer be
assumed to be constant.
Run-time
analysis: Run-time analysis
is a theoretical classification that estimates and anticipates the increase in running time (or run-time) of an algorithm as its input size (usually denoted as n) increases. Run-time
efficiency is a topic of great interest in computer science: A program can take seconds, hours or even years to finish
executing, depending on which algorithm it implements (see also performance analysis, which is the analysis of an algorithm's run-time in
practice).
Time
complexity: time complexity of an algorithm quantifies the
amount of time taken by an algorithm to run as a function of the length of the string representing the input. The time complexity of an algorithm is
commonly expressed using big O notation, which
excludes coefficients and lower order terms. When expressed this way, the time
complexity is said to be described asymptotically, i.e., as the input
size goes to infinity. For example, if the time required by an algorithm on all
inputs of size n is at most 5n³ + 3n, the asymptotic time complexity is O(n³).
Time complexity is commonly
estimated by counting the number of elementary operations performed by the
algorithm, where an elementary operation takes a fixed amount of time to
perform. Thus the amount of time taken and the number of elementary operations
performed by the algorithm differ by at most a constant factor.
Since an algorithm's performance
time may vary with different inputs of the same size, one commonly uses the worst-case time complexity of an algorithm, denoted as T(n), which is defined as the maximum amount of time taken on
any input of size n. Time complexities are classified by the nature of the
function T(n). For
instance, an algorithm with T(n) = O(n) is called a linear time
algorithm, and an algorithm with T(n) = O(2ⁿ) is said to be an
exponential time algorithm.
Name                 | Running time (T(n)) | Examples of running times | Example algorithms
constant time        | O(1)                | 10                        | Determining if an integer (represented in binary) is even or odd
inverse Ackermann    | O(α(n))             |                           |
log-logarithmic      | O(log log n)        |                           |
logarithmic time     | O(log n)            | log n, log(n²)            |
polylogarithmic time | poly(log n)         | (log n)²                  |
Space Complexity: DSPACE (or simply SPACE) is the computational
resource describing
the memory space used by a deterministic
Turing machine. It
represents the total amount of memory space that a "normal" physical
computer would need to solve a given computational
problem with
a given algorithm. It is one of the most well-studied complexity measures,
because it corresponds so closely to an important real-world resource: the
amount of physical computer memory needed to run a given program.
The measure DSPACE is used to define complexity classes, sets of all of the decision problems which can be solved using a certain amount of
memory space. For each function f(n),
there is a complexity class SPACE(f(n)), the set of decision problems which can be solved by a deterministic
Turing machine using space O(f(n)).
There is no restriction on the amount of computation time which can be used, though there may be
restrictions on some other complexity measures.
3)
Asymptotic Notation:
Asymptotic complexity is a
way of expressing the main
component of the cost of an
algorithm, using idealized units of computational work. Consider, for example,
the algorithm for sorting a deck of cards, which proceeds by repeatedly searching
through the deck for the lowest card. The asymptotic complexity of this
algorithm is the square of the number of cards in the deck.
This quadratic behavior is the main term in the complexity formula; it says,
e.g., if you double the size of the deck, then the work is roughly quadrupled.
The O Notation
The O
(pronounced big-oh) is the
formal method of expressing the upper bound of an algorithm's running time.
It's a measure of the longest amount of time it could possibly take for the
algorithm to complete.
More formally, for non-negative functions f(n) and g(n), if there
exists an integer n₀ and a constant c > 0 such that for all integers
n ≥ n₀, f(n) ≤ c·g(n),
then f(n) is Big O of g(n). This is
denoted as "f(n) = O(g(n))". If graphed, g(n) serves as an
upper bound to the curve you are analyzing, f(n).
Note that if f can take on
finite values only (as it should happen normally) then this definition implies
that there exists some constant C (potentially larger than c) such that for
all values of n, f(n) ≤ C·g(n).
An appropriate value for C is the maximum of c and the largest value of
f(n)/g(n) over the finitely many n below n₀.
So, let's take
an example of Big O. Say that f(n) = 2n + 8, and g(n) = n². Can we find a
constant n₀, so that 2n + 8 ≤ n² for all n ≥ n₀? The number 4
works here, giving us 16 ≤ 16. For any number n greater than 4, this will
still work. Since we're trying to generalize this for large values of n, and
small values (1, 2, 3) aren't that important, we can say that f(n) is generally
faster than g(n); that is, f(n) is bounded by g(n), and will always be less than
it.
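The claim can be checked mechanically. The sketch below is illustrative, not part of the notes; it assumes g(n) = n², which is what the 16 ≤ 16 check in the example implies, and tests the inequality 2n + 8 ≤ n² over a range of n:

```c
/* Check the worked Big-O example: with f(n) = 2n + 8 and g(n) = n*n,
   does f(n) <= g(n) hold for every n from 4 up to limit?
   Returns 1 if so, 0 otherwise. */
int bound_holds_up_to(int limit) {
    for (int n = 4; n <= limit; n++)
        if (2 * n + 8 > n * n)
            return 0;
    return 1;
}
```

At n = 4 the bound holds with equality (16 ≤ 16), matching the example, and it holds strictly for every larger n.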
Big-Omega Notation
For non-negative functions f(n) and g(n), f(n) is omega of g(n) if there
exists an integer n₀ and a constant c > 0 such that for all integers
n ≥ n₀, f(n) ≥ c·g(n). This is denoted as "f(n) = Ω(g(n))". If graphed,
g(n) serves as a lower bound to the curve you are analyzing, f(n).
The common growth rates, in increasing order, are:
1 < log n < n < n log n < n² < n³ < 2ⁿ < n!
Theta Notation: For non-negative functions, f(n) and g(n), f(n) is theta of g(n) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)). This is denoted as "f(n) = Θ(g(n))".
This is basically saying that the
function, f(n) is bounded both
from the top and bottom by the same function, g(n).
4) Complexity Analysis: Complexity analysis is the determination of the
amount of computing resources, primarily time and memory space, that an
algorithm requires as a function of its input size n. The time complexity
counts the number of elementary operations performed, and the space
complexity counts the number of storage locations used. The analysis is
usually carried out asymptotically, using the O, Ω and Θ notations defined
above, and may consider the worst case, the best case, or the average case
over all inputs of a given size. For example, linear search over an array of
n elements performs at most n comparisons, so its worst-case time complexity
is O(n), while its space complexity is O(1), since it needs only a constant
number of extra variables.
5)
Data structure: A data
structure is a particular way
of organizing data in a computer so that it can be used efficiently. Different kinds of data structures are
suited to different kinds of applications, and some are highly specialized to
specific tasks. For example, B-trees are particularly well-suited for implementation of databases,
while compiler implementations usually use hash tables to look up identifiers.
Basic
principles: Data structures are generally based on the ability of a computer to
fetch and store data at any place in its memory, specified by an address –
a bit string that can be itself stored in memory and manipulated by the
program. Thus, the array and record data structures are based on computing the addresses
of data items with arithmetic
operations; while the linked
data structures are based on storing addresses of data items within the
structure itself. Many data structures use both principles, sometimes combined
in non-trivial ways (as in XOR linking).
There are numerous types of data
structures:
·
An array is a number of elements in a specific order. They are accessed
using an integer to specify which element is required (although the elements
may be of almost any type). Typical implementations allocate contiguous memory
words for the elements of arrays (but this is not always a necessity). Arrays
may be fixed-length or expandable.
·
Records (also called tuples or structs)
are among the simplest data structures. A record is a value that contains other
values, typically in fixed number and sequence and typically indexed by names.
The elements of records are usually called fields or members.
·
A hash table (also called a dictionary or map) is a more flexible variation on a record, in which name-value pairs can be added and
deleted freely.
·
A union type specifies which of a number of permitted primitive types
may be stored in its instances, e.g. "float or long integer".
Contrast with a record, which could be defined to contain a float and an integer; whereas, in a union, there
is only one value at a time. Enough space is allocated to contain the widest
member datatype.
Data
Structures
There are two types of data structures: 1)
Linear 2) Non-linear
Linear: arrays, lists
Non-linear: binary
trees, heaps
Linear data structures
Linear data structures organize their data elements in a
linear fashion, where data elements are attached one after the other. Data
elements in a linear data structure are traversed one after the other and only
one element can be directly reached while traversing. Linear data structures
are very easy to implement, since the memory of the computer is also organized
in a linear fashion. Some commonly used linear data structures are arrays,
linked lists, stacks and queues. An array is a collection of data elements
where each element could be identified using an index. A linked list is a
sequence of nodes, where each node is made up of a data element and a reference
to the next node in the sequence. A stack is actually a list where data
elements can only be added or removed from the top of the list. A queue is also
a list, where data elements can be added from one end of the list and removed
from the other end of the list.
– Arrays
– Stacks
– Queues
– Singly-Linked Lists
– Doubly-Linked Lists
Nonlinear data structures
In nonlinear data structures, data elements are not
organized in a sequential fashion. A data item in a nonlinear data structure
could be attached to several other data elements to reflect a special
relationship among them and all the data items cannot be traversed in a single
run. Data structures like multidimensional arrays, trees and graphs are some
examples of widely used nonlinear data structures. A multidimensional array is
simply a collection of one-dimensional arrays. A tree is a data structure that
is made up of a set of linked nodes, which can be used to represent a
hierarchical relationship among data elements. A graph is a data structure that
is made up of a finite set of edges and vertices. Edges represent connections
or relationships among vertices that store data elements.
6) ADT Concept: An abstract
data type (ADT) is a mathematical
model for
a certain class of data structures that have similar behavior; or for certain data types of one or more programming
languages that
have similar semantics. An abstract data type is defined indirectly, only by the
operations that may be performed on it and by mathematical constraints on the
effects (and possibly cost) of those operations.
For example, an
abstract stack could be defined by three operations: push,
that inserts some data item onto the structure; pop,
that extracts an item from it (with the constraint that each pop always returns
the most recently pushed item that has not been popped yet); and peek,
that allows data on top of the structure to be examined without removal. When analyzing
the efficiency of algorithms that use stacks, one may also specify that all
operations take the same time no matter how many items have been pushed into
the stack, and that the stack uses a constant amount of storage for each
element.
Linear List ADT:
A list is a linear
structure
• Each item except the first (front, head) has a unique predecessor
• Each item except the last (end, tail) has a unique successor
• First item has no predecessor, and last item has no successor
• An item within a list is specified by its position in the list
Possible Operations
on a List
List --------- Create an empty list
isEmpty ------ Determine whether the list is empty
isFull ------- Determine whether the list is full
getLength ---- Return number of items in the list
insert ------- Add an item to the list
remove ------- Remove a specified item from the list
retrieve ----- Get the list item at a specified position
List ADT
Implementation 1
• We’ll store either simple types (int, char, etc.) or pointers to simple
types or to more complex objects - to avoid copying complex objects
• Underlying structure is an array with a fixed maximum size
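A minimal sketch of this array-based implementation in C; the function names follow the operation list above, while MAXLIST, the List type, and the exact signatures are illustrative assumptions, not from the notes:

```c
#define MAXLIST 100     /* fixed maximum size (assumed value) */

/* Fixed-capacity list of ints backed by an array. */
typedef struct {
    int items[MAXLIST];
    int length;         /* number of items currently in the list */
} List;

void createList(List *l)      { l->length = 0; }
int  isEmpty(const List *l)   { return l->length == 0; }
int  isFull(const List *l)    { return l->length == MAXLIST; }
int  getLength(const List *l) { return l->length; }

/* Insert x at position pos (0-based), shifting later items right.
   Returns 1 on success, 0 if the list is full or pos is invalid. */
int insert(List *l, int pos, int x) {
    if (isFull(l) || pos < 0 || pos > l->length)
        return 0;
    for (int i = l->length; i > pos; i--)
        l->items[i] = l->items[i - 1];
    l->items[pos] = x;
    l->length++;
    return 1;
}

/* Retrieve the item at position pos into *out; 1 on success, 0 otherwise. */
int retrieve(const List *l, int pos, int *out) {
    if (pos < 0 || pos >= l->length)
        return 0;
    *out = l->items[pos];
    return 1;
}
```

A remove operation would mirror insert, shifting later items one position to the left.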
7) Array
Representation: The simplest type of data structure is a linear array. This is
also called a one-dimensional array. An array is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key. An array is stored so that the position of each element can
be computed from its index tuple by a mathematical formula.
For example, an array of 10 32-bit integer variables, with indices 0 through 9,
may be stored as 10 words at memory
addresses 2000, 2004, 2008, …, 2036, so that the element with index i has the address
2000 + 4 × i.
Applications
Arrays are used to implement
mathematical vectors and matrices, as well as other kinds of rectangular tables. Many databases, small and
large, consist of (or include) one-dimensional arrays whose elements are records.
Arrays are used to implement
other data structures, such as heaps, hash tables, deques, queues, stacks, strings, and VLists.
One or more large arrays are
sometimes used to emulate in-program dynamic memory allocation, particularly memory pool allocation.
Historically, this has sometimes been the only way to allocate "dynamic
memory" portably.
One-dimensional arrays
A one-dimensional array (or
single dimension array) is a type of linear array. Accessing its elements
involves a single subscript which can either represent a row or column index.
As an example consider the C
declaration
int anArrayName[10];
Syntax: datatype anArrayName[sizeOfArray];
Multidimensional arrays
For a two-dimensional array, the
element with indices i,j would have
address B + c · i + d · j, where the coefficients c and d are the row and column
address increments, respectively.
More generally, in a k-dimensional
array, the address of an element with indices i1, i2, …, ik is
B + c1 · i1 + c2 · i2 + … + ck · ik.
For example: int a[3][2];
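The address formula can be observed directly in C, where a two-dimensional array is laid out row by row. The helper below is illustrative; for int a[3][2] it recovers the byte offset of a[i][j], which equals c·i + d·j with c = 2·sizeof(int) (row increment) and d = sizeof(int) (column increment):

```c
#include <stddef.h>

/* Byte offset of a[i][j] from the base of an int a[3][2] array,
   computed from the element addresses themselves.  In C's row-major
   layout this equals (2*i + j) * sizeof(int). */
ptrdiff_t offset_in_bytes(int a[3][2], int i, int j) {
    return (char *)&a[i][j] - (char *)&a[0][0];
}
```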
8). Linked Representation:
In the linked representation, a sequence is a (reference to) an
object, which is either an empty node, representing the empty sequence, or a cons node with a field of type T containing the first
element of the sequence and a field of type Seq containing a pointer
to the first node in the rest of the sequence. This data representation, which
is often called a linked list, directly corresponds to the standard inductive
definition of sequences. We defined this sequence representation in Section 1.9.3 as the class List,
but that definition did not support all of the operations listed above. The
following modification of the List composite class hierarchy from Section 1.9.3 defines a linked representation for lists of
objects; it includes all of the FinalSeq operations:
9) Vector representation:
A vector represents a sequence using a resizable one-dimensional array.
The elements are stored contiguously, so any element can be accessed in
constant time through its index, just as in a fixed-size array. Besides the
array itself, a vector records its current size (the number of elements
stored) and its capacity (the number of elements the underlying array can
hold). When an insertion would exceed the capacity, a larger array is
allocated and the existing elements are copied into it; growing the array
by a constant factor each time keeps the amortized cost of appending an
element constant. The class java.util.Vector is a standard example of this
representation.
10)
Singly Linked list:
A linked list is a data structure consisting of a
group of nodes which together represent a sequence. Under the simplest
form, each node is composed of data and a reference (in other words, a link) to the next
node in the sequence; more complex variants add additional links. This
structure allows for efficient insertion or removal of elements from any
position in the sequence.
Advantages:
·
Linked lists are a dynamic data structure, allocating the needed
memory when the program is initiated.
·
Insertion and deletion node operations are easily implemented in
a linked list.
·
Linear data structures such as stacks and queues are easily
executed with a linked list.
·
They can reduce access time and may expand in real time without
memory overhead.
Singly-linked list. Addition (insertion) operation.
Insertion into
a singly-linked list has two special cases: inserting a new node before
the head (at the very beginning of the list) and after the tail (at the very
end of the list). In any other case, the new node is inserted in the middle of the
list and so has a predecessor and successor in the list. There is a
description of all these cases below.
Empty list case
When the list is
empty, which is indicated by the (head == NULL) condition, the insertion is quite
simple. The algorithm sets both head and tail to point to the new node.
Add first
In this case,
the new node is inserted right before the current head node.
It can be done in two steps:
- Update the next
link of the new node, to point to the current head node.
- Update the head link
to point to the new node.
Add last
In this case,
the new node is inserted right after the current tail node.
It can be done in two steps:
- Update the next link of the current tail node, to point to the new node.
- Update the tail link to point to the new node.
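The insertion cases above can be sketched in C. The node type, the make_node helper, and the global head/tail pointers are illustrative assumptions, not definitions from the notes:

```c
#include <stdlib.h>

struct node { int data; struct node *next; };

/* Head and tail pointers of the list (global for this sketch). */
static struct node *head = NULL, *tail = NULL;

static struct node *make_node(int x) {
    struct node *n = malloc(sizeof *n);
    n->data = x;
    n->next = NULL;
    return n;
}

void add_first(int x) {
    struct node *n = make_node(x);
    if (head == NULL) {      /* empty list case */
        head = tail = n;
    } else {
        n->next = head;      /* step 1: link new node to current head */
        head = n;            /* step 2: move head to the new node */
    }
}

void add_last(int x) {
    struct node *n = make_node(x);
    if (head == NULL) {      /* empty list case */
        head = tail = n;
    } else {
        tail->next = n;      /* step 1: link current tail to new node */
        tail = n;            /* step 2: move tail to the new node */
    }
}
```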
Singly-linked list. Removal (deletion) operation.
There are four
cases which can occur while removing the node. These cases are similar to the
cases in the add operation. We have the same four situations, but the order of
algorithm actions is opposite. Notice that the removal algorithm includes the
disposal of the deleted node, which may be unnecessary in languages with
automatic garbage collection (e.g., Java).
List has only one node
When the list has
only one node, which is indicated by the condition that the head points to the
same node as the tail, the removal is quite simple. The algorithm disposes of the
node pointed to by head (or tail) and sets both head and tail to NULL.
Remove first
In this case,
the first node (current head node) is removed from the list.
It can be done in two steps:
- Update the head link
to point to the node next to the head.
- Dispose of the removed
node.
Remove last
In this case,
the last node (current tail node) is removed from the list. This operation is a bit
trickier than removing the first node, because the algorithm must first find the
node which is previous to the tail.
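The removal cases can be sketched the same way, again assuming an illustrative node type with global head and tail pointers (not defined in the notes):

```c
#include <stdlib.h>

struct node { int data; struct node *next; };
static struct node *head = NULL, *tail = NULL;

void remove_first(void) {
    if (head == NULL) return;        /* nothing to remove */
    struct node *old = head;
    if (head == tail)                /* list has only one node */
        head = tail = NULL;
    else
        head = head->next;           /* step 1: move head forward */
    free(old);                       /* step 2: dispose of removed node */
}

void remove_last(void) {
    if (head == NULL) return;
    if (head == tail) {              /* list has only one node */
        free(head);
        head = tail = NULL;
        return;
    }
    struct node *prev = head;        /* find the node previous to the tail */
    while (prev->next != tail)
        prev = prev->next;
    free(tail);                      /* dispose of the old tail */
    prev->next = NULL;
    tail = prev;                     /* move tail back one node */
}
```

The traversal inside remove_last is what makes removing the tail of a singly-linked list an O(n) operation; a doubly-linked list avoids it.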
Search
single linked list
int search(int num)
{
    struct node *temp = start;   /* start points to the first node */
    while (temp != NULL)
    {
        if (temp->data == num)
            return 1;            /* found */
        temp = temp->next;
    }
    return 0;                    /* not found */
}
11)
Doubly linked list:
A doubly-linked list is a linked
data structure that consists of a set of sequentially linked records called nodes. Each node contains two fields, called links,
that are references to the previous and to the next node in the
sequence of nodes. The beginning and ending nodes' previous and next links, respectively, point to some
kind of terminator, typically a sentinel node or null, to facilitate traversal of the list.
A doubly-linked list whose nodes contain three fields: an
integer value, the link to the next node, and the link to the previous node.
Inserting a node
These symmetric functions insert
a node either after or before a given node:
function insertAfter(List list, Node node, Node newNode)
newNode.prev := node
newNode.next := node.next
if node.next == null
list.lastNode := newNode
else
node.next.prev := newNode
node.next := newNode
function insertBefore(List list, Node node, Node newNode)
newNode.prev := node.prev
newNode.next := node
if node.prev == null
list.firstNode := newNode
else
node.prev.next := newNode
node.prev := newNode
Removing a node
Removal of a node is easier than
insertion, but requires special handling if the node to be removed is the firstNode or lastNode:
function remove(List list, Node node)
if node.prev == null
list.firstNode := node.next
else
node.prev.next := node.next
if node.next == null
list.lastNode := node.prev
else
node.next.prev := node.prev
destroy node
Circular list
In the last node of a list, the link field often contains a null reference, a special value used to indicate the lack of
further nodes. A less common convention is to make it point to the first node
of the list; in that case the list is said to be 'circular' or 'circularly
linked'; otherwise it is said to be 'open' or 'linear'.
In the case of a circular doubly
linked list, the only change that occurs is that the end, or "tail",
of the said list is linked back to the front, or "head", of the list
and vice versa.
12)
Sparse matrices:
a sparse
matrix is a matrix populated primarily with zeros as elements of
the table. By contrast, if a larger number of elements differ from zero, then
it is common to refer to the matrix as a dense
matrix. The fraction of zero elements (non-zero elements) in a matrix is
called the sparsity (density).
Example of a sparse matrix:
[ 11 22  0  0  0  0  0 ]
[  0 33 44  0  0  0  0 ]
[  0  0 55 66 77  0  0 ]
[  0  0  0  0  0 88  0 ]
[  0  0  0  0  0  0 99 ]
The above sparse matrix contains
only 9 nonzero elements of the 35, with 26 of those elements as zero.
The basic data structure for a matrix is a
two-dimensional array. Each entry in the array represents an element a(i,j) of the matrix and can be accessed by
the two indices i and j.
Traditionally, i
indicates the row number (top-to-bottom), while j indicates the column
number (left-to-right) of each element in the table. For an m×n matrix, enough memory to store up to (m×n)
entries to represent the matrix is needed.
Dictionary of keys (DOK)
DOK represents non-zero values as
a dictionary (e.g., a hash table or binary search tree) mapping
(row, column) pairs to values. This
format is good for incrementally constructing a sparse array, but poor for
iterating over non-zero values in sorted order.
List of lists (LIL)
LIL stores one list per row,
where each entry stores a column index and value. Typically, these entries are
kept sorted by column index for faster lookup.
A hash function is any function that can be used to map data of arbitrary size to data of fixed size, with slight
differences in input data producing very big differences in output data. The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes. One practical use is a
data structure called a hash table. Hash functions are
also frequently used to accelerate table or database lookup, detecting
duplicated records in a large file, and finding similar stretches in DNA
sequences. They are also useful in cryptography. A cryptographic hash function allows one to easily verify that some input data matches a
stored hash value, but makes it hard to reconstruct the data from the hash alone. This
principle is used by the PGP algorithm for data
validation and by many password checking systems.
UNIT 2
1) Stack
and Queue ADT:
Stack: A stack is a particular kind of abstract data type or collection in which the principal (or only) operations on
the collection are the addition of an entity to the collection, known as push, and removal of an entity, known as pop. The relation between the push and pop
operations is such that the stack is a Last-In-First-Out
(LIFO) data structure. In a
LIFO data structure, the last element added to the structure must be the first
one to be removed.
The push() operation
is used both to initialize the stack, and to store values to it. It is responsible
for inserting (copying) the value into the ps->items[] array and
for incrementing the element counter (ps->size). In a responsible C
implementation, it is also necessary to check whether the array is already full
to prevent an overrun.
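The push() and pop() routines below use a STACK type that these notes never define. A minimal definition consistent with the fields they use (items, size, and the STACKSIZE limit) might look like this; the capacity value is an assumption:

```c
#define STACKSIZE 100       /* capacity limit checked by push() (assumed value) */

typedef struct {
    int items[STACKSIZE];   /* storage for the stacked values */
    int size;               /* number of values currently stored */
} STACK;
```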
void push(STACK *ps, int x)
{
    if (ps->size == STACKSIZE) {
        fputs("Error: stack overflow\n", stderr);
        abort();
    } else
        ps->items[ps->size++] = x;
}
The pop() operation
is responsible for removing a value from the stack, and decrementing the value
of ps->size. A responsible C implementation will also need to check
that the array is not already empty.
int pop(STACK *ps)
{
    if (ps->size == 0) {
        fputs("Error: stack underflow\n", stderr);
        abort();
    } else
        return ps->items[--ps->size];
}
Queue:
A queue (/ˈkjuː/ kew) is a particular kind of abstract data type or collection in which the entities in the collection are
kept in order and the principal (or only) operations on the collection are the
addition of entities to the rear terminal position, known as enqueue, and removal of entities from the front
terminal position, known as dequeue.
This makes the queue a First-In-First-Out
(FIFO) data structure. In a
FIFO data structure, the first element added to the queue will be the first one
to be removed. This is equivalent to the requirement that once a new element is
added, all elements that were added before have to be removed before the new
element can be removed.
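A queue can be implemented with the same fixed-size array technique as the stack above. The QUEUE type and function names below are illustrative assumptions, not from the notes; keeping a front index and an element count makes the buffer circular, so both enqueue and dequeue run in O(1):

```c
#define QUEUESIZE 100

/* Fixed-capacity circular-buffer queue of ints. */
typedef struct {
    int items[QUEUESIZE];
    int front;   /* index of the first (oldest) element */
    int size;    /* number of stored elements */
} QUEUE;

/* Add x at the rear; returns 0 if the queue is full, 1 on success. */
int enqueue(QUEUE *q, int x) {
    if (q->size == QUEUESIZE)
        return 0;
    q->items[(q->front + q->size) % QUEUESIZE] = x;
    q->size++;
    return 1;
}

/* Remove the front element into *out; returns 0 if empty, 1 on success. */
int dequeue(QUEUE *q, int *out) {
    if (q->size == 0)
        return 0;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUESIZE;
    q->size--;
    return 1;
}
```

Because elements leave in the order they arrived, the first value enqueued is always the first one dequeued (FIFO).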
2) Recursion:
Recursion is a method of solving problems that involves
breaking a problem down into smaller and smaller subproblems until you get to a
small enough problem that it can be solved trivially. Usually recursion
involves a function calling itself. While it may not seem like much on the
surface, recursion allows us to write elegant solutions to problems that may
otherwise be very difficult to program.
We will begin our investigation with a simple
problem that you already know how to solve without using recursion. Suppose
that you want to calculate the sum of a list of numbers such as: [1,3,5,7,9].
An iterative function that computes the sum is shown
below. The function uses an
accumulator variable (theSum) to compute a running total of all the
numbers in the list by starting with 0 and adding each number in the
list.
def listsum(numList):
    theSum = 0
    for i in numList:
        theSum = theSum + i
    return theSum

print(listsum([1,3,5,7,9]))
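The same sum can be computed recursively. Since the other code in these notes is C, here is the recursive version as a C sketch over an array (the signature is an assumption):

```c
/* Recursive sum over an array: the sum of an empty list is 0 (the base
   case); otherwise it is the first element plus the sum of the rest. */
int listsum(const int *a, int n) {
    if (n == 0)
        return 0;
    return a[0] + listsum(a + 1, n - 1);
}
```

On {1, 3, 5, 7, 9} this returns 25, matching the iterative Python version above.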
Indirect recursion
Most basic examples of recursion,
and most of the examples presented here, demonstrate direct recursion, in which a function calls itself. Indirect recursion occurs
when a function is called not by itself but by another function that it called
(either directly or indirectly). For example, if f calls f, that is direct
recursion, but if f calls g which calls f, then that is indirect recursion of f. Chains of three
or more functions are possible; for example, function 1 calls function 2,
function 2 calls function 3, and function 3 calls function 1 again.
Indirect recursion is also called mutual recursion,
which is a more symmetric term, though this is simply a difference of emphasis,
not a different notion. That is, if f calls g and then g calls f, which in turn calls g again, from the
point of view of f alone, f is indirectly recursing, while from the point of view of g alone, it is
indirectly recursing, while from the point of view of both, f and g are mutually
recursing on each other. Similarly a set of three or more functions that call
each other can be called a set of mutually recursive functions.
Anonymous recursion
Recursion is usually done by
explicitly calling a function by name. However, recursion can also be done via
implicitly calling a function based on the current context, which is
particularly useful for anonymous functions, and is known as anonymous recursion.
3) Convert Infix Expression to Post-Fix Expression
4) Conventional notation is called infix notation.
The arithmetic operators appear between two operands. Parentheses are required
to specify the order of the operations. For example: a + (b * c).
5) Post fix notation (also, known as reverse Polish
notation) eliminates the need for parentheses. There are no precedence rules to
learn, and parentheses are never needed. Because of this simplicity, some
popular hand-held calculators use postfix notation to avoid the complications
of multiple sets of parentheses. The operator is placed directly after the two
operands it needs to apply. For example: a b c * +
6) This short example makes the move from infix to
postfix intuitive. However, as expressions get more complicated, there will
have to be rules to follow to get the correct result:
7)
8)
Simple heuristic algorithm to visually convert infix to postfix
• Fully parenthesize the expression, to reflect correct operator precedence
• Move each operator to replace the right parenthesis to which it belongs
• Remove parentheses
10) Evaluating expressions
A stack is used in two phases of evaluating an expression such as
3 * 2 + 4 * (A + B)
11) • Convert the infix form to postfix, using a stack
to store operators and then pop them in correct order of precedence.
• Evaluate the postfix expression, using a stack to store operands and then pop them when an operator is reached.
12) Infix to postfix conversion
Scan through the expression, getting one token at a time.
1. Fix a priority level for each operator. For example, from high to low:
3. - (unary negation)
2. * /
1. + - (subtraction)
13) Thus, high priority corresponds to high number
in the table.
2. If the token is an operand, do not stack it. Pass it to the output.
3. If the token is an operator or parenthesis, do the following:
-- Pop the stack until you find a symbol with a lower priority number than the current one. An incoming left parenthesis is considered to have higher priority than any other symbol. A left parenthesis on the stack will not be removed unless an incoming right parenthesis is found. The popped stack elements are written to the output.
-- Stack the current symbol.
14) -- If a right parenthesis
is the current symbol, pop the stack down to (and including) the first left
parenthesis. Write all the symbols except the left parenthesis to the
output (i.e. write the operators to the output).
-- After the last token is read, pop the remainder of the stack and write any symbol (except left parenthesis) to output.
15) Example:
16) Convert A * (B + C) * D to postfix
notation.
Move | Current token | Stack | Output
1 | A | empty | A
2 | * | * | A
3 | ( | (* | A
4 | B | (* | A B
5 | + | +(* | A B
6 | C | +(* | A B C
7 | ) | * | A B C +
8 | * | * | A B C + *
9 | D | * | A B C + * D
10 | (end) | empty | A B C + * D *
Notes:
In this table, the stack grows toward the left; thus, the top of the stack is the leftmost symbol. In move 3, the left parenthesis was pushed without popping the * because * had a lower priority than "(". In move 5, "+" was pushed without popping "(" because a left parenthesis is never popped until an incoming right parenthesis arrives. In other words, there is a distinction between incoming priority, which is very high for a "(", and in-stack priority, which is lower than that of any operator. In move 7, the incoming right parenthesis caused the "+" and "(" to be popped, but only the "+" was written out. In move 8, the incoming "*" had equal priority to the "*" on the stack, so the "*" on the stack was popped and the incoming "*" was pushed onto the stack.
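The two-phase scheme above (convert to postfix, then evaluate) can be sketched in Java. This is an illustrative sketch, not code from the notes: it evaluates a space-separated postfix string of integer operands, using a stack of operands exactly as described in point 11.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch: evaluate a space-separated postfix expression
// containing integer operands and the binary operators + - * /.
public class PostfixEval {
    public static int evalPostfix(String expr) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : expr.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    int right = stack.pop();   // operator applies to the two
                    int left = stack.pop();    // most recently seen operands
                    switch (token) {
                        case "+": stack.push(left + right); break;
                        case "-": stack.push(left - right); break;
                        case "*": stack.push(left * right); break;
                        default:  stack.push(left / right); break;
                    }
                    break;
                }
                default:
                    stack.push(Integer.parseInt(token)); // operand: just push
            }
        }
        return stack.pop(); // the single remaining value is the result
    }

    public static void main(String[] args) {
        // 3 * 2 + 4 * (1 + 5) in postfix is: 3 2 * 1 5 + 4 * +
        System.out.println(evalPostfix("3 2 * 1 5 + 4 * +")); // prints 30
    }
}
```

Note that each operator pops exactly two operands and pushes one result, so a well-formed postfix expression always leaves a single value on the stack.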
4) Array and linked list:
5) Circular
queue:
A circular buffer, cyclic buffer or ring buffer is a data structure that
uses a single, fixed-size buffer as if it were connected end-to-end. This
structure lends itself easily to buffering data streams.
The useful property of a circular buffer is that it does not need to have its elements shuffled
around when one is consumed. (If a non-circular buffer were used, it would
be necessary to shift all elements when one is consumed.) In other words, the
circular buffer is well suited as a FIFO buffer, while a standard, non-circular buffer is
well suited as a LIFO buffer.
How it works[edit]
A circular buffer first starts
empty and of some predefined length. For example, this is a 7-element buffer:
Assume that a 1 is written
into the middle of the buffer (exact starting location does not matter in a
circular buffer):
Then assume that two more
elements are added — 2 & 3 — which get appended after the 1:
If two elements are then removed
from the buffer, the oldest values inside the buffer are removed. The two
elements removed, in this case, are 1 & 2, leaving the buffer with just a
3:
If the buffer has 7
elements then it is completely full:
A consequence of the circular
buffer is that when it is full and a subsequent write is performed, then it
starts overwriting the oldest data. In this case, two more elements — A & B
— are added and they overwrite the 3 & 4:
Alternatively, the routines that
manage the buffer could prevent overwriting the data and return an error or
raise an exception. Whether or not data is overwritten is up to the semantics
of the buffer routines or the application using the circular buffer.
Finally, if two elements are now removed, then what is returned is not 3 & 4
but 5 & 6, because A & B overwrote the 3 & the 4.
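The behaviour described above can be sketched as a minimal fixed-size ring buffer. The class and method names here are illustrative; this version overwrites the oldest element when full, which is only one of the two policies mentioned (the other being to raise an error).

```java
// Minimal sketch of a fixed-size circular (ring) buffer of ints.
// When full, write() overwrites the oldest element.
public class RingBuffer {
    private final int[] buf;
    private int head = 0;   // index of the oldest element
    private int size = 0;   // number of stored elements

    public RingBuffer(int capacity) { buf = new int[capacity]; }

    public void write(int value) {
        int tail = (head + size) % buf.length; // wrap around the end
        buf[tail] = value;
        if (size < buf.length) {
            size++;
        } else {
            head = (head + 1) % buf.length;    // full: oldest element overwritten
        }
    }

    public int read() { // removes and returns the oldest element (FIFO)
        if (size == 0) throw new IllegalStateException("buffer empty");
        int value = buf[head];
        head = (head + 1) % buf.length;
        size--;
        return value;
    }
}
```

Because only `head` and `size` move, no elements are ever shifted, which is exactly the property the text highlights.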
6) Deque (double-ended queue) ADT:
7) Priority Queue ADT:
a priority queue is an abstract data type which is like a regular queue or stack data structure, but where additionally each
element has a "priority" associated with it. In a priority queue, an
element with high priority is served before an element with low priority. If
two elements have the same priority, they are served according to their order
in the queue.
Operations[edit]
A priority queue must at least support the following operations:
• insert_with_priority: add an element to the queue with an associated priority.
• pull_highest_priority_element: remove the element from the queue that has the highest priority, and return it.
This is also known as "pop_element(Off)", "get_maximum_element" or "get_front(most)_element".
Some conventions reverse the order of priorities, considering lower values to be higher priority, so this may also be known as "get_minimum_element", and is often referred to as "get-min" in the literature.
This may instead be specified as separate "peek_at_highest_priority_element" and "delete_element" functions, which can be combined to produce "pull_highest_priority_element".
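As a concrete illustration, java.util.PriorityQueue follows the "get-min" convention described above: the element with the smallest value is served first.

```java
import java.util.PriorityQueue;

// java.util.PriorityQueue is a min-heap: poll() is the
// pull_highest_priority_element operation under the "get-min" convention.
public class PriorityQueueDemo {
    public static int serveFirst() {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        pq.add(5);          // insert_with_priority (the value is the priority)
        pq.add(1);
        pq.add(3);
        return pq.poll();   // removes and returns the smallest element: 1
    }

    public static void main(String[] args) {
        System.out.println(serveFirst()); // prints 1
    }
}
```

peek() gives the corresponding peek_at_highest_priority_element without removing the element.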
Priority Queues and Heaps:
8) java.util package array list:
java.lang.Object
java.util.AbstractCollection
java.util.AbstractList
java.util.ArrayList
Resizable-array implementation of the List interface. Implements all optional list operations, and
permits all elements, including null. In addition to implementing the List interface, this class provides methods to manipulate the
size of the array that is used internally to store the list. (This class is
roughly equivalent to Vector, except that it is unsynchronized.)
The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized
constant time, that is, adding n elements
requires O(n) time. All of the other operations run in linear time (roughly
speaking). The constant factor is low compared to that for the LinkedList implementation.
Each ArrayList instance has a capacity. The capacity is the size of the array used to store the
elements in the list. It is always at least as large as the list size. As
elements are added to an ArrayList, its capacity grows automatically. The details
of the growth policy are not specified beyond the fact that adding an element
has constant amortized time cost.
An application can increase the capacity of an ArrayList instance before adding a large number of elements using the ensureCapacity operation. This may reduce the amount of incremental
reallocation.
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads
modifies the list structurally, it must be synchronized externally. (A structural modification is
any operation that adds or deletes one or more elements, or explicitly resizes
the backing array; merely setting the value of an element is not a structural
modification.) This is typically accomplished by synchronizing on some object
that naturally encapsulates the list. If no such object exists, the list should
be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent
accidental unsynchronized access to the list:
List list = Collections.synchronizedList(new ArrayList(...));
Constructor Index
Constructor | Description
ArrayList() | Constructs an empty list.
ArrayList(Collection) | Constructs a list containing the elements of the specified collection, in the order they are returned by the collection's iterator.
ArrayList(int) | Constructs an empty list with the specified initial capacity.
Method Index
Method | Description
add(int, Object) | Inserts the specified element at the specified position in this list.
add(Object) | Appends the specified element to the end of this list.
addAll(Collection) | Appends all of the elements in the specified Collection to the end of this list, in the order that they are returned by the specified Collection's Iterator.
Class
java.util.LinkedList
java.lang.Object
java.util.AbstractCollection
java.util.AbstractList
java.util.AbstractSequentialList
java.util.LinkedList
Linked list implementation of the List interface. Implements all optional list operations, and
permits all elements (including null). In addition to implementing the List interface, the LinkedList class provides uniformly named methods to get, remove and insert an element at the beginning and end of the list. These
operations allow linked lists to be used as a stack, queue, or double-ended
queue (deque).
All of the stack/queue/deque operations could be easily
recast in terms of the standard list operations. They're included here
primarily for convenience, though they may run slightly faster than the
equivalent List operations.
All of the operations perform as could be expected for a
doubly-linked list. Operations that index into the list will traverse the list
from the beginning or the end, whichever is closer to the specified index.
Note that this implementation is not synchronized. If multiple threads access a list concurrently, and at
least one of the threads modifies the list structurally, it must be synchronized
externally. (A structural modification is any operation that adds or deletes
one or more elements; merely setting the value of an element is not a
structural modification.) This is typically accomplished by synchronizing on
some object that naturally encapsulates the list. If no such object exists, the
list should be "wrapped" using the Collections.synchronizedList
method. This is best done at creation time, to prevent accidental
unsynchronized access to the list:
List list = Collections.synchronizedList(new LinkedList(...));
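A short illustrative sketch of the uniformly named end operations that let a LinkedList act as a stack, queue, or deque, as described above:

```java
import java.util.LinkedList;

// LinkedList's addFirst/addLast/removeFirst/removeLast make it usable
// as a stack (one end) or a queue (both ends).
public class LinkedListEnds {
    public static int[] demo() {
        LinkedList<Integer> list = new LinkedList<>();
        list.addFirst(2);               // use as a stack: push
        list.addFirst(1);               // list is now [1, 2]
        list.addLast(3);                // use as a queue: enqueue at the tail
        int first = list.removeFirst(); // 1 (stack pop / queue dequeue)
        int last = list.removeLast();   // 3 (deque operation at the other end)
        return new int[]{first, last, list.getFirst()}; // remaining head is 2
    }
}
```

Each of these end operations runs in constant time, since the list keeps direct references to both ends.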
Constructor Index
Constructor | Description
LinkedList() | Constructs an empty list.
LinkedList(Collection) | Constructs a list containing the elements of the specified collection, in the order they are returned by the collection's iterator.
Method Index
Method | Description
add(int, Object) | Inserts the specified element at the specified position in this list.
add(Object) | Appends the specified element to the end of this list.
Class
java.util.Vector
java.lang.Object
java.util.AbstractCollection
java.util.AbstractList
java.util.Vector
The Vector class implements a growable array of objects. Like an
array, it contains components that can be accessed using an integer index.
However, the size of a Vector can grow or shrink as needed to accommodate adding and
removing items after the Vector has been created.
Each vector tries to optimize storage management by maintaining a capacity
and a capacityIncrement. The capacity is always at least
as large as the vector size; it is usually larger because as components are
added to the vector, the vector's storage increases in chunks the size of capacityIncrement.
An application can increase the capacity of a vector before inserting a large
number of components; this reduces the amount of incremental reallocation.
Class
java.util.Stack
java.lang.Object
java.util.AbstractCollection
java.util.AbstractList
java.util.Vector
java.util.Stack
The Stack class represents a last-in-first-out (LIFO) stack of
objects. It extends class Vector with five operations that allow a vector to be treated as a
stack. The usual push and pop operations are provided, as well as a method to peek at the top item on the stack, a method to test whether
the stack is empty, and a method to search the stack for an item and discover how far it is from the
top.
When a stack is first created, it contains no items.
Constructor Index
Constructor | Description
Stack() | Creates an empty Stack.
Method Index
Method | Description
empty() | Tests if this stack is empty.
peek() | Looks at the object at the top of this stack without removing it from the stack.
java.util
Interface Iterator<E>
public interface Iterator<E>
An iterator over a collection. Iterator takes the place of
Enumeration in the Java collections framework. Iterators differ from
enumerations in two ways:
- Iterators allow the caller to remove elements from the underlying collection during the iteration with well-defined semantics.
- Method names have been improved.
Since:
1.2
Method Summary
boolean | hasNext() Returns true if the iteration has more elements. |
E | next() Returns the next element in the iteration. |
void | remove() Removes from the underlying collection the last element returned by the iterator (optional operation). |
hasNext
boolean hasNext()
Returns true if the iteration has more elements.
(In other words, returns true if next would return an element rather than
throwing an exception.)
Returns:
true if the iterator has more elements.
next
E next()
Returns the next element in the iteration.
Returns:
the next element in the iteration.
Throws:
NoSuchElementException - if the iteration has no more elements.
remove
void remove()
Removes from the underlying collection the last element returned by the iterator (optional operation).
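A brief sketch of the iterator contract above, including the well-defined remove() semantics that distinguish Iterator from Enumeration; dropEvens is an illustrative helper name:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Iterator.remove() deletes elements from the underlying collection
// while iterating, which a plain for-each loop cannot do safely.
public class IteratorRemoveDemo {
    public static List<Integer> dropEvens(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {      // true while more elements remain
            int value = it.next();  // advance and fetch the next element
            if (value % 2 == 0) {
                it.remove();        // remove the element next() just returned
            }
        }
        return list;
    }
}
```

Calling remove() before the first next(), or twice per next(), throws IllegalStateException, which is what "well-defined semantics" refers to.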
III UNIT
1) Searching-
Linear & Binary search methods:
2)
linear search or sequential search is a method for finding a particular value in a list
that consists of checking every one of its elements, one at a time and in sequence,
until the desired one is found.[1]
3)
Linear search is the
simplest search algorithm;
it is a special case of brute-force search. Its worst case cost is proportional to the number of elements in the list;
and so is its expected cost, if all list elements are equally likely to be searched
for. Therefore, if the list has more than a few elements, other methods (such
as binary search or hashing) will be
faster, but they also impose additional requirements.
5) For a list with n items, the
best case is when the value is equal to the first element of the list, in which
case only one comparison is needed. The worst case is when the value is not in
the list (or occurs only once at the end of the list), in which case n comparisons
are needed.
6) If the value being sought occurs k times
in the list, and all orderings of the list are equally likely, the expected number
of comparisons is (n + 1)/(k + 1).
7) For example, if the value being sought occurs
once in the list, and all orderings of the list are equally likely, the
expected number of comparisons is (n + 1)/2. However, if it
is known that it occurs once, then at most n -
1 comparisons are needed, and the expected number of comparisons is
(n + 1)/2 - 1/n
(for example, for n = 2 this is 1, corresponding to a single if-then-else construct).
11) Either way, asymptotically the worst-case cost and the expected cost of linear search
are both O(n).
Basic algorithm (forward search):
Set i to 1.
Repeat this loop:
    If i > n, then exit the loop.
    If A[i] = x, then exit the loop.
    Set i to i + 1.
Return i.
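The pseudocode above can be rendered in Java. This sketch returns a 0-based index, or -1 when the value is absent (a common convention, rather than the pseudocode's 1-based return of n + 1):

```java
// Linear (sequential) search: scan A left to right and return the index
// of the first element equal to x, or -1 if it is absent.
public class LinearSearch {
    public static int linearSearch(int[] a, int x) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == x) {
                return i;   // best case: one comparison (x at the front)
            }
        }
        return -1;          // worst case: n comparisons, so O(n) overall
    }
}
```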
Binary search methods:
a binary
search or half-interval search algorithm finds the position of a specified input value
(the search "key") within an array sorted by key value.[1][2] For binary search, the array should be arranged
in ascending or descending order. In each step, the algorithm compares the
search key value with the key value of the middle element of the array. If the keys
match, then a matching element has been found and its index, or position, is
returned. Otherwise, if the search key is less than the middle element's key,
then the algorithm repeats its action on the sub-array to the left of the
middle element or, if the search key is greater, on the sub-array to the right.
If the remaining array to be searched is empty, then the key cannot be found in
the array and a special "not found" indication is returned.
Examples[edit]
Example: The list to be searched:
L = 1 3 4 6 8 9 11. The value to be found: X = 4.
Compare X to 6. X is smaller. Repeat with L = 1 3 4.
Compare X to 3. X is bigger. Repeat with L = 4.
Compare X to 4. They are equal. We're done, we found X.
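The halving procedure above, as an illustrative Java sketch for an ascending-sorted array:

```java
// Iterative binary search over an ascending-sorted array: compare the
// key with the middle element and discard half the range each step.
public class BinarySearch {
    public static int binarySearch(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;  // avoids overflow of (lo + hi) / 2
            if (a[mid] == key) {
                return mid;                // keys match: position found
            } else if (key < a[mid]) {
                hi = mid - 1;              // repeat on the left sub-array
            } else {
                lo = mid + 1;              // repeat on the right sub-array
            }
        }
        return -1;                         // remaining range empty: "not found"
    }
}
```

Running it on the example list L = 1 3 4 6 8 9 11 with X = 4 visits 6, then 3, then 4, mirroring the three comparisons shown above.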
2)
Hashing-Hash function:
A hash function is any function that can be used to map data of arbitrary size to data of fixed size, with slight
differences in input data producing very big differences in output data.[1] The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes. One practical use is a
data structure called a hash table. Hash functions are
also frequently used to accelerate table or database lookup, detecting
duplicated records in a large file, and finding similar stretches in DNA
sequences. They are also useful in cryptography. A cryptographic hash function allows one to easily verify that some input data matches a
stored hash value, but hard to reconstruct the data from the hash alone. This
principle is used by the PGP algorithm for data
validation and by many password checking systems.
Uses[edit]
Hash tables[edit]
Hash functions are primarily used
in hash tables, to
quickly locate a data record (e.g., a dictionary definition)
given its search key (the
headword). Specifically, the hash function is used to map the search key to an
index; the index gives the place in the hash table where the corresponding
record should be stored. Hash tables, in turn, are used to implement associative arrays and dynamic sets.
Typically, the domain of a hash
function (the set of possible keys) is larger than its range (the number of different
table indexes), and so it will map several different keys to the same index.
Therefore, each slot of a hash table is associated with (implicitly or
explicitly) aset of records, rather than a single record. For this reason,
each slot of a hash table is often called a bucket,
and hash values are also called bucket
indices.
3)
Collision Resolution -
Separate Chaining
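The heading above carries no accompanying text, so the following is a hedged sketch of separate chaining: each table slot holds a chain (here a linked list) of entries whose keys hashed to the same index. All names and the capacity are illustrative.

```java
import java.util.LinkedList;

// Sketch of collision resolution by separate chaining: each bucket
// holds a list of key/value entries whose keys hashed to that index.
public class ChainedHashTable {
    static class Entry {
        final String key;
        int value;
        Entry(String key, int value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    public ChainedHashTable(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    private int index(String key) {
        return Math.floorMod(key.hashCode(), buckets.length); // key -> bucket
    }

    public void put(String key, int value) {
        for (Entry e : buckets[index(key)]) {
            if (e.key.equals(key)) { e.value = value; return; } // update in place
        }
        buckets[index(key)].add(new Entry(key, value)); // chain a new entry
    }

    public Integer get(String key) {
        for (Entry e : buckets[index(key)]) {   // walk the chain for this slot
            if (e.key.equals(key)) return e.value;
        }
        return null;                            // key absent
    }
}
```

Lookup cost is proportional to the chain length, so performance stays near O(1) as long as keys spread evenly across the buckets.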
4)
Hashing in java.util.hash map:
The java.util.HashMap class is the hash table based implementation of the Map interface. Following are the
important points about HashMap:
• This class makes no guarantees as to the iteration order of the map; in particular, it does not guarantee that the order will remain constant over time.
• This class permits null values and the null key.
Class declaration
Following is the declaration for java.util.HashMap class:
public class HashMap<K,V>
extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable
Parameters
Following are the parameters for java.util.HashMap class:
• K -- This is the type of keys maintained by this map.
• V -- This is the type of mapped values.
Class constructors
S.N. | Constructor & Description
1 | HashMap() | This constructs an empty HashMap with the default initial capacity (16) and the default load factor (0.75).
2 | HashMap(int initialCapacity) | This constructs an empty HashMap with the specified initial capacity and the default load factor (0.75).
3 | HashMap(int initialCapacity, float loadFactor) | This constructs an empty HashMap with the specified initial capacity and load factor.
4 | HashMap(Map<? extends K, ? extends V> m) | This constructs a new HashMap with the same mappings as the specified Map.
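A minimal usage sketch of the points above (null key permitted, duplicate puts replace the old mapping):

```java
import java.util.HashMap;

// Basic HashMap usage: the null key is allowed, and putting an existing
// key replaces its value rather than adding a second entry.
public class HashMapDemo {
    public static String demo() {
        HashMap<String, Integer> map = new HashMap<>(); // capacity 16, load 0.75
        map.put("one", 1);
        map.put("two", 2);
        map.put(null, 0);    // the null key is permitted
        map.put("one", 10);  // replaces the previous mapping for "one"
        return map.get("one") + "," + map.get(null) + "," + map.size();
    }
}
```

Iteration order over the entries is unspecified, which is why the sketch checks values by key rather than by position.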
The java.util.HashSet class implements the Set interface,
backed by a hash table. Following are the important points about HashSet:
• This class makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
• This class permits the null element.
Class declaration
Following is the declaration for java.util.HashSet class:
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, Serializable
Parameters
Following is the parameter for java.util.HashSet class:
• E -- This is the type of elements maintained by this set.
Class constructors
S.N. | Constructor & Description
1 | HashSet() | This constructs a new, empty set; the backing HashMap instance has default initial capacity (16) and load factor (0.75).
2 | HashSet(Collection<? extends E> c) | This constructs a new set containing the elements in the specified collection.
3 | HashSet(int initialCapacity) | This constructs a new, empty set; the backing HashMap instance has the specified initial capacity and default load factor (0.75).
4 | HashSet(int initialCapacity, float loadFactor) | This constructs a new, empty set; the backing HashMap instance has the specified initial capacity and the specified load factor.
Class methods
S.N. | Method & Description
1 | boolean add(E e) | This method adds the specified element to this set if it is not already present.
2 | void clear() | This method removes all of the elements from this set.
3 | Object clone() | This method returns a shallow copy of this HashSet instance; the elements themselves are not cloned.
The java.util.Hashtable class implements a hashtable, which
maps keys to values. Following are the important points about Hashtable:
• Any non-null object can be used as a key or as a value.
• If many entries are to be made into a Hashtable, creating it with a sufficiently large capacity may allow the entries to be inserted more efficiently than letting it perform automatic rehashing as needed to grow the table.
Class declaration
Following is the declaration for java.util.Hashtable class:
public class Hashtable<K,V>
extends Dictionary<K,V>
implements Map<K,V>, Cloneable, Serializable
Class constructors
S.N. | Constructor & Description
1 | Hashtable() | This constructs a new, empty hashtable with a default initial capacity (11) and load factor (0.75).
2 | Hashtable(int initialCapacity) | This constructs a new, empty hashtable with the specified initial capacity and default load factor (0.75).
3 | Hashtable(int initialCapacity, float loadFactor) | This constructs a new, empty hashtable with the specified initial capacity and the specified load factor.
4 | Hashtable(Map<? extends K, ? extends V> t) | This constructs a new hashtable with the same mappings as the given Map.
Class methods
S.N. | Method & Description
1 | void clear() | This method clears this hashtable so that it contains no keys.
2 | Object clone() | This method creates a shallow copy of this hashtable.
3 | boolean contains(Object value) | This method tests if some key maps into the specified value in this hashtable.
5)
Sorting:
Sorting is the process of placing elements from
a collection in some kind of order. For example, a list of words could be
sorted alphabetically or by length. A list of cities could be sorted by
population, by area, or by zip code. We have already seen a number of
algorithms that were able to benefit from having a sorted list (recall the
final anagram example and the binary search).
Bubble Sort: is a simple sorting algorithm that works by repeatedly stepping through the
list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass
through the list is repeated until no swaps are needed, which indicates that
the list is sorted.
Step-by-step example[edit]
Let us take the array of numbers
"5 1 4 2 8", and sort the array from lowest number to greatest number
using bubble sort. In each step, elements written in bold are being compared. Three passes will
be required.
First Pass:
( 5 1 4 2 8 )
( 1 5 4 2 8 ), Here, algorithm compares the
first two elements, and swaps since 5 > 1.
( 1 5 4 2 8 )
( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 )
( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 )
( 1 4 2 5 8 ), Now, since these elements are
already in order (8 > 5), algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 )
( 1 4 2 5 8 )
( 1 4 2 5 8 )
( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
( 1 2 4 5 8 )
procedure bubbleSort( A : list of sortable items )
n = length(A)
repeat
swapped = false
for i = 1 to n-1 inclusive do
/* if this pair is out of order */
if A[i-1] > A[i] then
/* swap them and remember something changed */
swap( A[i-1], A[i] )
swapped = true
end if
end for
until not swapped
end procedure
2). Insertion Sort:
Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time. It is much less
efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:
for i ← 1 to length(A)
j ← i
while j > 0 and A[j-1] > A[j]
swap A[j] and A[j-1]
j ← j - 1
Example: The following table
shows the steps for sorting the sequence {3, 7, 4, 9, 5, 2, 6, 1}. In each
step, the key under consideration is underlined. The key that was moved (or
left in place because it was the largest yet considered) in the previous step is
shown in bold.
3 7 4 9 5 2 6 1
3 7 4 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 5 7 9 2 6 1
2 3 4 5 7 9 6 1
2 3 4 5 6 7 9 1
1 2 3 4 5 6 7 9
3) Quick sort:
Quicksort is a divide and conquer algorithm. Quicksort first
divides a large array into two smaller sub-arrays: the low elements and the high
elements. Quicksort can then recursively sort the sub-arrays.
The steps are:
1. Pick an element, called a pivot, from the array.
2. Reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
3. Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values.
quicksort(A, i, k):
    if i < k:
        p := partition(A, i, k)
        quicksort(A, i, p - 1)
        quicksort(A, p + 1, k)
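The pseudocode calls partition without defining it. The sketch below fills that gap with the Lomuto scheme, using the last element as pivot; this is one common choice, not necessarily the one the notes intend:

```java
// Quicksort with a Lomuto partition: the last element is the pivot,
// smaller elements are moved left, and the pivot lands at its final index.
public class QuickSort {
    public static void quicksort(int[] a, int i, int k) {
        if (i < k) {
            int p = partition(a, i, k);   // pivot is now in final position p
            quicksort(a, i, p - 1);       // sort the smaller values
            quicksort(a, p + 1, k);       // sort the greater values
        }
    }

    private static int partition(int[] a, int i, int k) {
        int pivot = a[k];                 // last element as the pivot
        int store = i;                    // boundary of the "less than" region
        for (int j = i; j < k; j++) {
            if (a[j] < pivot) {
                swap(a, store++, j);      // grow the "less than" region
            }
        }
        swap(a, store, k);                // place the pivot between the halves
        return store;
    }

    private static void swap(int[] a, int x, int y) {
        int t = a[x]; a[x] = a[y]; a[y] = t;
    }
}
```

Other schemes (Hoare partition, median-of-three pivot selection) trade a little code for fewer swaps or better behavior on sorted input.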
4) Merge Sort:
merge sort (also commonly spelled mergesort) is an O(n log n) comparison-based sorting algorithm. Most implementations produce a stable
sort, which means that the implementation preserves
the input order of equal elements in the sorted output.
Conceptually, a merge sort works as follows:
1.
Divide the unsorted list into n sublists, each containing 1 element (a
list of 1 element is considered sorted).
2.
Repeatedly merge sublists to produce new sorted
sublists until there is only 1 sublist remaining. This will be the sorted list.
TopDownMergeSort(A[], B[], n)
{
TopDownSplitMerge(A, 0, n, B);
}
CopyArray(B[], iBegin, iEnd, A[])
{
for(k = iBegin; k < iEnd; k++)
A[k] = B[k];
}
// iBegin is inclusive; iEnd is exclusive (A[iEnd] is not in the set)
TopDownSplitMerge(A[], iBegin, iEnd, B[])
{
if(iEnd - iBegin < 2) // if run size == 1
return; // consider it sorted
// recursively split runs into two halves until run size == 1,
// then merge them and return back up the call chain
iMiddle = (iEnd + iBegin) / 2; // iMiddle = mid point
TopDownSplitMerge(A, iBegin, iMiddle, B); // split / merge left half
TopDownSplitMerge(A, iMiddle, iEnd, B); // split / merge right half
TopDownMerge(A, iBegin, iMiddle, iEnd, B); // merge the two half runs
CopyArray(B, iBegin, iEnd, A); // copy the merged runs back to A
}
// left half is A[iBegin :iMiddle-1]
// right half is A[iMiddle:iEnd-1 ]
TopDownMerge(A[], iBegin, iMiddle, iEnd, B[])
{
    i0 = iBegin, i1 = iMiddle;
    // While there are elements in the left or right runs
    for (j = iBegin; j < iEnd; j++) {
        // If left run head exists and is <= existing right run head.
        if (i0 < iMiddle && (i1 >= iEnd || A[i0] <= A[i1])) {
            B[j] = A[i0];
            i0 = i0 + 1;
        } else {
            B[j] = A[i1];
            i1 = i1 + 1;
        }
    }
}
4) Heap Sort:
The heapsort algorithm can be
divided into two parts.
In the first step, a heap is built out of the data. The heap is often placed in an array
with the layout of a complete binary tree. The complete binary tree maps the
binary tree structure into the array indices; each array index represents a
node; the index of the node's parent, left child branch, or right child branch
are simple expressions. For a zero-based array, the root node is stored at
index 0; if
i
is the index of the current node, then iParent = floor((i-1) / 2)
iLeftChild = 2*i + 1
iRightChild = 2*i + 2
2. Sorting.
Heap | Swap elements | Delete element | Sorted array | Details
8, 6, 7, 4, 5, 3, 2, 1 | 8, 1 | | | swap 8 and 1 in order to delete 8 from heap
1, 6, 7, 4, 5, 3, 2, 8 | | 8 | | delete 8 from heap and add to sorted array
1, 6, 7, 4, 5, 3, 2 | 1, 7 | | 8 | swap 1 and 7 as they are not in order in the heap
7, 6, 1, 4, 5, 3, 2 | 1, 3 | | 8 | swap 1 and 3 as they are not in order in the heap
7, 6, 3, 4, 5, 1, 2 | 7, 2 | | 8 | swap 7 and 2 in order to delete 7 from heap
2, 6, 3, 4, 5, 1, 7 | | 7 | 8 | delete 7 from heap and add to sorted array
2, 6, 3, 4, 5, 1 | 2, 6 | | 7, 8 | swap 2 and 6 as they are not in order in the heap
6, 2, 3, 4, 5, 1 | 2, 5 | | 7, 8 | swap 2 and 5 as they are not in order in the heap
6, 5, 3, 4, 2, 1 | 6, 1 | | 7, 8 | swap 6 and 1 in order to delete 6 from heap
1, 5, 3, 4, 2, 6 | | 6 | 7, 8 | delete 6 from heap and add to sorted array
1, 5, 3, 4, 2 | 1, 5 | | 6, 7, 8 | swap 1 and 5 as they are not in order in the heap
5, 1, 3, 4, 2 | 1, 4 | | 6, 7, 8 | swap 1 and 4 as they are not in order in the heap
5, 4, 3, 1, 2 | 5, 2 | | 6, 7, 8 | swap 5 and 2 in order to delete 5 from heap
2, 4, 3, 1, 5 | | 5 | 6, 7, 8 | delete 5 from heap and add to sorted array
2, 4, 3, 1 | 2, 4 | | 5, 6, 7, 8 | swap 2 and 4 as they are not in order in the heap
4, 2, 3, 1 | 4, 1 | | 5, 6, 7, 8 | swap 4 and 1 in order to delete 4 from heap
1, 2, 3, 4 | | 4 | 5, 6, 7, 8 | delete 4 from heap and add to sorted array
1, 2, 3 | 1, 3 | | 4, 5, 6, 7, 8 | swap 1 and 3 as they are not in order in the heap
3, 2, 1 | 3, 1 | | 4, 5, 6, 7, 8 | swap 3 and 1 in order to delete 3 from heap
1, 2, 3 | | 3 | 4, 5, 6, 7, 8 | delete 3 from heap and add to sorted array
1, 2 | 1, 2 | | 3, 4, 5, 6, 7, 8 | swap 1 and 2 as they are not in order in the heap
2, 1 | 2, 1 | | 3, 4, 5, 6, 7, 8 | swap 2 and 1 in order to delete 2 from heap
1, 2 | | 2 | 3, 4, 5, 6, 7, 8 | delete 2 from heap and add to sorted array
1 | | 1 | 2, 3, 4, 5, 6, 7, 8 | delete 1 from heap and add to sorted array
| | | 1, 2, 3, 4, 5, 6, 7, 8 | completed
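The two phases above (build the heap, then repeatedly delete the maximum) can be sketched in Java using the index formulas given earlier; siftDown is an illustrative helper name:

```java
// Heapsort: phase 1 builds a max-heap in place; phase 2 repeatedly swaps
// the root (maximum) with the last heap element and sifts the new root down.
public class HeapSort {
    public static void heapsort(int[] a) {
        int n = a.length;
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);            // phase 1: build the heap
        }
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);              // phase 2: move the max to the end
            siftDown(a, 0, end);          // restore the heap on a[0..end-1]
        }
    }

    private static void siftDown(int[] a, int i, int n) {
        while (2 * i + 1 < n) {           // iLeftChild = 2*i + 1
            int child = 2 * i + 1;
            if (child + 1 < n && a[child + 1] > a[child]) {
                child++;                  // pick the larger of the two children
            }
            if (a[i] >= a[child]) return; // heap property holds; stop
            swap(a, i, child);
            i = child;
        }
    }

    private static void swap(int[] a, int x, int y) {
        int t = a[x]; a[x] = a[y]; a[y] = t;
    }
}
```

The swap/sift steps in phase 2 correspond one-to-one with the "swap ... in order to delete" and "swap ... as they are not in order" rows of the table.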
5) Radix Sort:
radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping
keys by the individual digits which share the same significant position and value.
Definition[edit]
Each key is first figuratively
dropped into one level of buckets corresponding to the value of the rightmost
digit. Each bucket preserves the original order of the keys as the keys are
dropped into the bucket. There is a one-to-one correspondence between the
number of buckets and the number of values that can be represented by the
rightmost digit. Then, the process repeats with the next neighbouring more
significant digit until there are no more digits to process. In other words:
1.
Take the least significant digit of each key.
2.
Group the keys based on that digit, but otherwise keep the
original order of keys. (This is what makes the LSD radix sort a stable
sort.)
3.
Repeat the grouping process with each more significant digit.
The sort in step 2 is
usually done using bucket sort or counting sort,
which are efficient in this case since there are usually only a small number of
digits.
An example
Original, unsorted list:
170, 45, 75, 90, 802, 2, 24, 66
Sorting by least
significant digit (1s place) gives:
170, 90, 802, 2, 24, 45, 75,
66
Notice that we keep 802 before 2, because 802 occurred before 2
in the original list, and similarly for pairs 170 & 90 and 45 & 75.
Sorting by next digit (10s place)
gives:
802, 2, 24, 45, 66, 170, 75, 90
Notice that 802 again comes before 2 as 802 comes before 2 in
the previous list.
Sorting by most
significant digit (100s place) gives:
2, 24, 45, 66, 75, 90, 170, 802
It is important to realize that
each of the above steps requires just a single pass over the data, since each
item can be placed in its correct bucket without having to be compared with
other items.
Some radix sort implementations
allocate space for buckets by first counting the number of keys that belong in
each bucket before moving keys into those buckets. The number of times that
each digit occurs is stored in an array. Consider
the previous list of keys viewed in a different way:
170, 045, 075, 090, 802, 002, 024, 066
The first counting pass
starts on the least significant digit of each key, producing an array of bucket
sizes:
2 (bucket size for digits of 0: 170, 090)
2 (bucket size for digits of 2: 002, 802)
1 (bucket size for digits of 4: 024)
2 (bucket size for digits of 5: 045, 075)
1 (bucket size for digits of 6: 066)
A second counting pass on the
next more significant digit of each key will produce an array of bucket sizes:
2 (bucket size for digits of 0: 002, 802)
1 (bucket size for digits of 2: 024)
1 (bucket size for digits of 4: 045)
1 (bucket size for digits of 6: 066)
2 (bucket size for digits of 7: 170, 075)
1 (bucket size for digits of 9: 090)
A third and final counting pass
on the most significant digit of each key will produce an array of bucket
sizes:
6 (bucket size for digits of 0: 002, 024, 045, 066, 075, 090)
1 (bucket size for digits of 1: 170)
1 (bucket size for digits of 8: 802)
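The grouping passes above can be sketched as a minimal LSD radix sort in Python, using the bucket version described in steps 1–3 (the function and parameter names are illustrative assumptions, not from the text):

```python
def radix_sort(keys, base=10):
    """LSD radix sort: stable bucket grouping on each digit, least significant first."""
    if not keys:
        return []
    passes = len(str(max(keys)))          # one pass per digit of the largest key
    for d in range(passes):
        place = base ** d                 # 1s, 10s, 100s, ...
        buckets = [[] for _ in range(base)]
        for k in keys:                    # stable: keys keep their relative order
            buckets[(k // place) % base].append(k)
        keys = [k for b in buckets for k in b]
    return keys
```

Running it on the example list 170, 45, 75, 90, 802, 2, 24, 66 reproduces the intermediate orders shown above after the 1s, 10s, and 100s passes.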
10) Comparison of algorithms
In this table, n is the number of records to be sorted.
The columns "Average" and "Worst" give the time complexity in each case, under the assumption that the length of
each key is constant, and that therefore all comparisons, swaps, and other
needed operations can proceed in constant time. "Memory" denotes the
amount of auxiliary storage needed beyond that used by the list itself, under
the same assumption. The run times and the memory requirements listed below
should be understood to be inside big O notation.
Logarithms are of any base; the notation log² n means (log n)².
These are all comparison sorts,
and so cannot perform better than O(n log n) in
the average or worst case.
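The n log n barrier follows from a decision-tree argument: a comparison sort must distinguish all n! possible input orderings, so its decision tree has at least n! leaves and therefore height (worst-case comparison count) at least log₂(n!). Since n! ≥ (n/2)^(n/2),

```latex
\log_2(n!) \;\ge\; \log_2\!\left(\frac{n}{2}\right)^{n/2} \;=\; \frac{n}{2}\,\log_2\frac{n}{2} \;=\; \Omega(n \log n).
```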
Name | Best | Average | Worst | Memory | Stable | Method | Other notes
Quicksort | n log n | n log n | n² | log n | Typical in-place sort is not stable; stable versions exist | Partitioning | Quicksort is usually done in place with O(log n) stack space. Most implementations are unstable, as stable in-place partitioning is more complex. Naïve variants use an O(n) space array to store the partition. A quicksort variant using three-way (fat) partitioning takes O(n) comparisons when sorting an array of equal keys.
Merge sort | n log n | n log n | n log n | n | Yes | Merging | Highly parallelizable (up to O(log n) using the Three Hungarians' Algorithm or, more practically, Cole's parallel merge sort) for processing large amounts of data.
In-place merge sort | — | — | n log² n | 1 | Yes | Merging |
Heapsort | n log n | n log n | n log n | 1 | No | Selection |
Insertion sort | n | n² | n² | 1 | Yes | Insertion |
Introsort | n log n | n log n | n log n | log n | No | Partitioning & Selection |
Selection sort | n² | n² | n² | 1 | No | Selection |
Timsort | n | n log n | n log n | n | Yes | Insertion & Merging | Makes n comparisons when the data is already sorted or reverse sorted.
Bubble sort | n | n² | n² | 1 | Yes | Exchanging | Tiny code size.
Binary tree sort | n log n | n log n | n log n | n | Yes | Insertion |
Cycle sort | — | n² | n² | 1 | No | Insertion | In-place with theoretically optimal number of writes.
Library sort | — | n log n | n² | n | Yes | Insertion |
Patience sorting | — | — | n log n | n | No | Insertion & Selection |
Smoothsort | n | n log n | n log n | 1 | No | Selection |
Strand sort | n | n² | n² | n | Yes | Selection |
Tournament sort | — | n log n | n log n | — | ? | Selection |
Cocktail sort | n | n² | n² | 1 | Yes | Exchanging |
Comb sort | n log n | n² | n² | 1 | No | Exchanging | Small code size.
Gnome sort | n | n² | n² | 1 | Yes | Exchanging | Tiny code size.
UnShuffle sort | kN | kN | kN | In place for linked lists; N*sizeof(link) for arrays | Can be made stable by appending the input order to the key | Distribution and Merge | No exchanges are performed. Performance is independent of data size. The constant k is proportional to the entropy in the input; k = 1 for ordered or reverse-ordered input, so the runtime is equivalent to checking the order, O(N).
Franceschini's method | — | n log n | n log n | 1 | Yes | ? |
Block sort | n | n log n | n log n | 1 | Yes | Insertion & Merging |
Bogosort | n | n · n! | unbounded | 1 | No | Random shuffling |
UNIT-IV Trees:
Sorted binary tree:
2) Properties of binary tree:
·
The number of nodes n in a perfect binary tree can be found
using this formula: n = 2^(h+1) − 1, where h is the depth of the tree.
·
The number of nodes n in a binary tree of height h is at
least n = h + 1 and at most n = 2^(h+1) − 1, where h is the depth of
the tree.
·
The number of leaf nodes l in a perfect binary tree can be found
using this formula: l = 2^h, where h is the depth of the tree.
·
The number of nodes n in a perfect binary tree can also be
found using this formula: n = 2l − 1, where l is the number of leaf
nodes in the tree.
·
The number of null links (i.e., absent children of nodes) in a
complete binary tree of n nodes is (n + 1).
·
The number of internal nodes (i.e., non-leaf nodes, or n − l) in a
complete binary tree of n nodes is ⌊ n/2 ⌋.
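The formulas above can be checked mechanically. A small Python sketch (the helper name is illustrative, not from the text) computes the counts for a perfect binary tree of depth h and verifies the identities for several depths:

```python
def perfect_tree_counts(h):
    """Node counts for a perfect binary tree of depth h (root at depth 0)."""
    n = 2 ** (h + 1) - 1        # total nodes: n = 2^(h+1) - 1
    l = 2 ** h                  # leaf nodes:  l = 2^h
    return n, l

# Check the stated identities for depths 0 through 5.
for h in range(6):
    n, l = perfect_tree_counts(h)
    assert n == 2 * l - 1                    # n = 2l - 1
    assert h + 1 <= n <= 2 ** (h + 1) - 1    # bounds for any binary tree of height h
```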
2) Binary tree ADT: