Nothing Special   »   [go: up one dir, main page]

FS CS

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 12

7.

FILE STRUCTURES SYLLABUS

Semester: VI Year: 2013-14


Subject Title: File Structures Subject Code : 10IS63
Total No. of Lecture Hours : 52 Duration of Exam: 03
Total Exam Marks: 100 Total IA Marks: 25
Staff Name: Sandeep Tengale

PART - A
UNIT - 1
INTRODUCTION: File Structures: The Heart of the file structure Design, A Short History of
File Structure Design, A Conceptual Toolkit; Fundamental File Operations: Physical Files and
Logical Files, Opening Files, Closing Files, Reading and Writing, Seeking, Special Characters,
The Unix Directory Structure, Physical devices and Logical Files, File-related Header Files,
UNIX file System Commands; Secondary Storage and System Software: Disks, Magnetic Tape,
Disk versus Tape; CD-ROM: Introduction, Physical Organization, Strengths and Weaknesses;
Storage as Hierarchy, A journey of a Byte, Buffer Management, Input /Output in UNIX.
7 Hours

UNIT - 2
FUNDAMENTAL FILE STRUCTURE CONCEPTS, MANAGING FILES OF
RECORDS: Field and Record Organization, Using Classes to Manipulate Buffers, Using
Inheritance for Record Buffer Classes, Managing Fixed Length, Fixed Field Buffers, An Object-
Oriented Class for Record Files, Record Access, More about Record Structures, Encapsulating
Record Operations in a Single Class, File Access and File Organization.
6 Hours

UNIT - 3
ORGANIZATION OF FILES FOR PERFORMANCE, INDEXING:
Data Compression, Reclaiming Space in files, Internal Sorting and Binary Searching,
Keysorting; What is an Index? A Simple Index for Entry- Sequenced File, Using Template
Classes in C++ for Object I/O, Object- Oriented support for Indexed, Entry-Sequenced Files of
Data Objects, Indexes that are too large to hold in Memory, Indexing to provide access by
Multiple keys, Retrieval Using Combinations of Secondary Keys, Improving the Secondary
Index structure: Inverted Lists, Selective indexes, Binding.
7 Hours

UNIT - 4
COSEQUENTIAL PROCESSING AND THE SORTING OF LARGE FILES: A Model for
Implementing Cosequential Processes, Application of the Model to a General Ledger Program,
Extension of the Model to include Mutiway Merging, A Second Look at Sorting in Memory,
Merging as a Way of Sorting Large Files on Disk.
6 Hours
PART - B
UNIT-5 MULTI-LEVEL INDEXING AND B-TREES: The invention of B-Tree, Statement of
the problem, Indexing with Binary Search Trees; Multi-Level Indexing, B-Trees, Example of
Creating a B-Tree, An Object-Oriented Representation of B-Trees, B-Tree Methods;
Nomenclature, Formal Definition of B-Tree Properties, Worst-case Search Depth, Deletion,
Merging and Redistribution, Redistribution during insertion; B* Trees, Buffering of pages;
Virtual B-Trees; Variable-length Records and keys. 7 Hours

UNIT - 6
INDEXED SEQUENTIAL FILE ACCESS AND PREFIX B + TREES:
Indexed Sequential Access, Maintaining a Sequence Set, Adding a Simple Index to the Sequence
Set, The Content of the Index: Separators Instead of Keys, The Simple Prefix B+ Tree and its
maintenance, Index Set Block Size, Internal Structure of Index Set Blocks: A Variable-order B-
Tree, Loading a Simple Prefix B+ Trees, B-Trees, B+ Trees and Simple Prefix B+ Trees in
Perspective. 6 Hours

UNIT - 7
HASHING: Introduction, A Simple Hashing Algorithm, Hashing Functions and Record
Distribution, How much Extra Memory should be used?, Collision resolution by progressive
overflow, Buckets, Making deletions, Other collision resolution techniques, Patterns of record
access. 7 Hours

UNIT - 8
EXTENDIBLE HASHING: How Extendible Hashing Works, Implementation, Deletion,
Extendible Hashing Performance, Alternative Approaches. 6 Hours

TEXT BOOK:
1. File Structures-An Object Oriented Approach with C++ - Michel J. Folk, Bill Zoellick,
Greg Riccardi, 3rd Edition, Addison- Wesley, 1998.

REFERENCE BOOKS:
1. File Structures Using C++ - K.R. Venugopal, K.G. Srinivas, P.M. Krishnaraj, Tata
McGraw-Hill, 2008.
2. C++ Components and Algorithms - Scot Robert Ladd, BPB Publications, 1993.
3. Database Management Systems - Raghu Ramakrishan and Johannes Gehrke, 3rd Edition,
McGraw Hill, 2003.

8. FILE STRUCTURES COURSE PLAN


Prerequisites:
1. The students should have idea of data files and its need.
2. Also, the students must have the basic knowledge of any object oriented programming
language like C++.

Course overview and its relevance to program:


Storage capacity and accessing speed is the tradeoff between disks and memory. File structure
deals with the organization of data in the disks to improve the performance in terms of its
accessing speed. This course is intended to study the object oriented approach for file structure
design in a progressive manner.
The course starts with the introduction and need for file structures. Then the fundamental
concepts of file structures are discussed. It helps us to understand how the data in the file is
organized and managed. The course includes the discussion about the progress of research in the
field of file structure design with the objective that minimizes the time taken to retrieve the
required data stored in the disk i.e., getting the data with single access. Binary search is the first
developed approach. Then the high level file structure tools, including indexing, tree structures:
AVL tree, B-tree, B+ tree and finally the concept of hashing and extendible hashing are
introduced and are discussed in detail.
The course includes the extensive discussion of the object-oriented approach to represent
information and algorithms.
Applications:
With the advancement of computers in every field, the need for storing large amount of data and
fast retrieval of stored data is also increasing. This course helps to develop new file structure
tools that best satisfies the present needs.

COS
1. 1. Basics of file structure design
2. Data Compression and selection Select suitable indexing for better performance to a
given Problem
3. Merging of files and Construction of Btree
4. Construction of B+tree to make faster access of data from file
5. Choose appropriate file structure for storage representation (hashing).
6. Types of hashing techniques.
UNIT-I
UNIT WISE PLAN

Chapter Number: 1,2,3 No. of Hours: 07


Unit Title: Introduction

Learning objectives: At the end of this chapter the students will learn
1. Basics of file structures.

2. History of file structure design.

3. Basic operations on files like Read, Write and Seek.

4. The organization of disks and tapes and their differences.

5. CD-Rom, a file structure problem, and its physical organization and strengths &
weaknesses.

Lesson Plan:
L1) File Structures: The Heart of the file structure Design, A Short History of File Structure
Design, A Conceptual Toolkit; Fundamental File Operations: Physical Files and Logical Files
L2) Fundamental File Operations: Physical Files and Logical Files, Opening Files, Closing Files,
Reading and Writing, Seeking, Special Characters, The Unix Directory Structure
L3) Unix Directory Structure, Physical devices and Logical File
L4) File-related Header Files, UNIX file System Commands; Secondary Storage and System
Software
L5) File-related Header Files, UNIX file System Commands; Secondary Storage and System
Software: Disks, Magnetic Tape
L6)Disk versus Tape; CD-ROM: Introduction, Physical Organization, Strengths and
Weaknesses;
L7) Storage as Hierarchy, A journey of a Byte, Buffer Management, Input /Output in UNIX

Assignment Questions:
1) What is File Structure?
2) Explain the conceptual tool kit for the file structure?
3) Explain the OO Tool kit for the file structure?
4) Give the brief history of File Structure?
5) Give the syntax for Reading & Writing of information to the file.
6) Give the list of C & C++ stream classes.
7) How to seek the information in the file.
8) Explain the UNIX directory system of the FS.
9) Why are there inter block gap on linear tapes. Why do we not just jam all records in to
one block.
10) Use the internet to determine the characteristics of the second generation of DVD.

UNIT-II
UNIT WISE PLAN
Chapter Number: 4,5 No. of Hours: 06
Unit Title: Fundamentals File Structure Concepts, Managing Files Of Records

Learning objectives: At the end of this chapter the students will learn
1. File as a collection of records and fields and its organization.

2. Using object oriented approach for manipulating and managing fixed length and fixed
field buffers.

3. Accessing of records from the file using keys, sequential search and direct access.

Lesson Plan:
L8) Field and Record Organization
L9)Using Classes to Manipulate Buffers, Using Inheritance for Record Buffer Classes
L10) Managing Fixed Length, Fixed Field Buffers, An Object-Oriented Class for Record Files
L11) Record Access, More about Record Structures
L12) Encapsulating Record Operations in a Single Class
L13) File Access and File Organization

Assignment Questions:
1) Define the Fields & Records.
2) Explain the field structure.
3) Explain the record structure.
4) Explain the buffer class for delimited text fields.
5) Explain the buffer class for length based & fixed length fields.
6) Explain the inheritance in the C++ stream classes for record buffer class.
7) Explain the UNIX tools for sequential processing.
8) Explain the file access & file organization in FS.
9) Define the metadata.
UNIT-III
UNIT WISE PLAN
Chapter Number: 6,7 No. of Hours: 07
Unit Title: Organization Of Files For Performance, Indexing

Learning objectives: At the end of this chapter the students will learn
1. Different file organization styles to improve performance like Data compression,
reclaiming space.

2. Accessing methods: Internal sorting, binary search.

3. Key sorting, concepts of indexing.

Lesson Plan:
L14) Data Compression, Reclaiming Space in files, Internal Sorting and Binary Searching
L15) Keysorting; What is an Index? A Simple Index for Entry- Sequenced File
L16) Using Template Classes in C++ for Object I/O, Object- Oriented support for Indexed,
Entry-Sequenced Files of Data Objects
L17) Indexes that are too large to hold in Memory
L18) Indexing to provide access by Multiple keys
L19) Retrieval Using Combinations of Secondary Keys
L20) Improving the Secondary Index structure: Inverted Lists, Selective indexes, Binding

Assignment Questions:
1) Define the data compression.
2) Give the compression routines in UNIX.
3) How to reclaim space in files.
4) Explain deleting fixed length records for reclaiming space dynamically.
5) Explain the deleting variable length records.
6) Define the indexing.
7) Give the OO method of entry sequenced files of data objects.
8) Give the different operations required to maintain an indexed file.
9) Explain the indexing to provide access by multiple keys.
10) How to retrieve the information using the combination of secondary keys.
UNIT-IV
UNIT WISE PLAN
Chapter Number: 8 No. of Hours: 06
Unit Title: Consequential Processing And The Sorting Of Large Files

Learning objectives: At the end of this chapter the students will learn
1. An object oriented model for implementing consequential processes.

2. Multiway merging like K-way merge, selection tree.

3. Sorting of large files on disks.

Lesson Plan:
L21) A Model for Implementing Cosequential Processes
L22) Application of the Model to a General Ledger Program
L23) Extension of the Model to include Mutiway Merging
L24) A Second Look at Sorting in Memory
L25) Merging as a Way of Sorting Large Files on Disk
L26) Discussion on chapter

Assignment Questions:
1) Define co-sequential processing.
2) Provide a general OO model for implementing all variety of co-sequential processing.
3) Explain the merging of two lists.
4) Give the applications of the co-sequential processes to ledger program.
5) Explain the K-Way merge algorithm.
6) How much time does a merge sort take in co-sequential processing.
7) How to decrease the number of seeks in co-sequential processing.
8) Explain the co-sequential process for to disc drives with replacement selection.
9) Explain the sorting files on tape.
10) Give the sort merge packages.

UNIT-V
UNIT WISE PLAN
Chapter Number: 9 No. of Hours: 07
Unit Title: Multi-Level Indexing And B-Trees

Learning objectives: At the end of this chapter the students will learn
1. Introduction to the tree structures: AVL tree, paged binary trees (B-tree).

2. Multilevel indexing, working of B-Tree, creation of b-trees..

3. An object oriented representation of B-trees.

4. B-tree operations: Insert, Delete, Merge, Redistribution, search


5. Introduction to B* tree

Lesson Plan:
L27) The invention of B-Tree, Statement of the problem
L28) Indexing with Binary Search Trees; Multi-Level Indexing
L29) B-Trees, Example of Creating a B-Tree
L30) Example of Creating a B-Tree, An Object-Oriented Representation of B-Trees, B-Tree
Methods
L31) Nomenclature, Formal Definition of B-Tree Properties, Worst-case Search Depth, Deletion,
Merging and Redistribution
L32) B* Trees, Buffering of pages; Virtual B-Trees
L33) Variable-length Records and keys.

Assignment Questions:
1) Explain the multi level indexing.
2) What is B-Tree.
3) Explain the implementation of the fundamental operation of the B-Trees.
4) Explain the OO design of the B-Trees.
5) Give the notion of page buffering & virtual B-Trees.
6) Show the B-Trees of order 4 that result from loading following sets of keys in order
a) C G J X
b) C G J X N S U O A E B H I
7) Given B-Tree of order 256 what is the maximum number of descendent from a page.
8) Given B-Tree of order 256 what is the maximum depth of the tree if it contains 100000
keys.
9) A common belief about B-Trees that a B-Tree can not grow deeper unless it is 100% full.
Discuss this.
10) What is the difference between a B* tree and a B-tree.

UNIT-VI
UNIT WISE PLAN
Chapter Number: 10 No. of Hours: 06
Unit Title: Indexed Sequential File Access And Prefix B + Trees

Learning objectives: At the end of this chapter the students will learn
1. Introduction to indexed sequential access.

2. Maintaining sequence set with the use of blocks.

3. Introduction to simple prefix B+ tree and its loading, variable order B trees.

Lesson Plan:
L34) Indexed Sequential Access, Maintaining a Sequence Set, Adding a Simple Index to the
Sequence Set
L35) The Content of the Index: Separators Instead of Keys, The Simple Prefix B+ Tree and its
maintenance
L36) Index Set Block Size, Internal Structure of Index Set Blocks
L37) A Variable-order B- Tree,
L38) Loading a Simple Prefix B+ Trees, B-Trees
L39) B+ Trees and Simple Prefix B+ Trees in Perspective.
Assignment Questions:
1) Explain the B+ Tree.
2) Describe operations on sequence set of block that maintains records in order by key.
3) Describe the structure that permit the each of the following type of access.
a) Sequential access only.
b) Direct access only.
4) The Index set of a B+ Tree is just a B Tree but unlike the B Trees the separation do not
have to be keys. Why the difference.
5) How does block splitting in the sequence set of a simple prefix B+ tree differ from block
splitting in index set.
6) If the key BOLEN in the simple prefix B+ Tree in fig. 10.7 (Page No – 460) is deleted
from the sequence set node, how is the separator BO in the parent node affected.
7) Show a conceptual view of an index set block.
8) If the initial set record is sorted by key, the process of loading a B+ tree can be handled
by using a single pass sequential process instead of randomly inserting new records in to
the trees. What are the advantages of this approach.
UNIT-VII
UNIT WISE PLAN
Chapter Number: 11 No. of Hours: 07
Unit Title: Hashing

Learning objectives: At the end of this chapter the students will learn
1. Introduction to hashing with simple hashing algorithm.

2. Use of hashing function for record distribution.

3. Many collision resolution techniques like: progressive overflow, buckets, double hashing,
chained progressive overflow,chaining with a separate overflow area, scatter tables.

Lesson Plan:
L40) Introduction
L41) A Simple Hashing Algorithm
L42) Hashing Functions and Record Distribution
L43) How much Extra Memory should be used?
L44) Collision resolution by progressive overflow.
L45) Buckets, Making deletions, Other collision resolution techniques
L46) Patterns of record access.

Assignment Questions:
1) What is hashing.
2) Explain the collisions in Hashing.
3) Explain the simple hashing algorithm.
4) Explain hashing function and record distributions.
5) Explain the Poisson distribution.
6) Make a table showing Poisson function values for r/N = 0.1, 0.5, 0.8, 1, 2, 5 & 11.
examine the table and discuss any features and patterns that provide useful information
about hashing.
UNIT-VIII
UNIT WISE PLAN
Chapter Number: 12 No. of Hours: 06
Unit Title: Extendible Hashing

Learning objectives: At the end of this chapter the students will learn
1. Working of Extendible hashing (EH) technique.

2. Implementation of EH.

3. Implementing and handling deletion operation.

Lesson Plan:
L47) How Extendible Hashing Works
L48) Implementation
L49) Deletion
L50) Extendible Hashing Performance
L51) Alternative Approaches
L52) Chapter discussion

Assignment Questions:
1) Define extendable hashing.
2) Briefly describe the differences between extendable hashing, dynamic hashing, and linear
hashing.
3) What are the strengths & weakness dynamic hashing, and linear hashing.
4) If buckets are large, a bucket contain only a few records is not much less wasteful than an
empty bucket. How could we minimize empty buckets.
9. FILE STRUCTURES IA PORTION
I.A. Test No. Units
I Unit 1, Unit 2, Unit 3 (3 Hrs)
II Unit 3 (4 Hrs), Unit 5, Unit 6
III Unit 4, Unit 7, Unit 8

You might also like