
WO2004077219A2 - System and method of mapping patterns of data, optimising disk read and write, verifying data integrity across clients and servers of different functionality having shared resources - Google Patents

System and method of mapping patterns of data, optimising disk read and write, verifying data integrity across clients and servers of different functionality having shared resources

Info

Publication number
WO2004077219A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
data pattern
recited
file
pattern
Prior art date
Application number
PCT/IN2004/000030
Other languages
French (fr)
Other versions
WO2004077219A3 (en)
Inventor
Vinayak K. Rao
Original Assignee
Vaman Technologies (R & D) Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vaman Technologies (R & D) Limited filed Critical Vaman Technologies (R & D) Limited
Publication of WO2004077219A2 publication Critical patent/WO2004077219A2/en
Publication of WO2004077219A3 publication Critical patent/WO2004077219A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0661 Format or protocol conversion arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0607 Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • a server typically shares hardware resources and persistent data generated by the server across clients, maintaining data integrity and state across concurrent users sharing this persistent data.
  • the pattern of this persistent data used or generated by the server is purely as per the server functional scope.
  • database servers generate data in a table column format, web servers in an Operating System (OS) native file directory structure and mail servers in their own proprietary format.
  • OS Operating System
  • a bitmap is graphical data popularly generated by any painting software like Microsoft Paintbrush
  • various derivatives of the persistent data format exist for the same image data, like Graphics Interchange Format (GIF) or TGA or Joint Photographic Experts Group (JPG) etc.
  • GIF Graphics Interchange Format
  • JPG Joint Photographic Experts Group
  • the digital image data remains the same but the archiving patterns vary; that is, the preamble or postamble data pattern and the compression technologies meant for the fastest saving and retrieval vary.
  • a trade-off exists between time and space; that is, when the disk space required to save data is to be minimized, time is lost in compressing and uncompressing the data.
  • if time is to be saved, space is wasted in saving the uncompressed data.
  • Databases are an important tool for the storage and management of information for businesses. Both relational database management systems (RDBMS) and non-relational database management systems exist for this purpose. Examples of RDBMSs include ORACLE, DB2 and INFORMIX. Examples of non-relational databases include custom databases created with the operating system, developed by IBM. The operating system allows programmers to create custom databases, which support keyed, sequential and binary file types. Database Management Systems (DBMS) provide users the capabilities of controlling read/write access, specifying report generation and analyzing usage.
  • RDBMS relational database management systems
  • non-relational databases include custom databases created with the operating system, developed by IBM. The operating system allows programmers to create custom databases, which support keyed, sequential and binary file types.
  • Database Management Systems provide users the capabilities of controlling read/write access, specifying report generation and analyzing usage.
  • DBMS Prior to the arrival of RDBMS, DBMS existed on various Operating Systems (OS), which used to maintain data in files (known as file systems) with tree structures supporting different directories and subdirectories. These OS file systems evolved with different vendors, with each vendor supporting different features and access mechanisms to enhance disk operations. As a result of these multiple approaches by different vendors, popular file systems developed, like NTFS for Windows, File Allocation Table (FAT) for DOS, FAT32 for Windows 9x, NDS for NetWare and HPFS for OS/2. However, there existed a similarity in the way they maintained the root directory entries and locations on the hard disk, known as the File Allocation Tables. The hard disk was divided into cylinders/tracks/sectors/heads, and combinations of sectors formed clusters.
  • OS Operating Systems
  • ODBC Open Database Connectivity
  • 'SQL Data Types'; any form of these types in application-specific combinations formed data.
  • Outlook Express has its own proprietary file format (.mbx, .pst), which is the persistent part of its data capturing mechanism.
  • the same data can be classified as table and column entities from a database perspective, and a table with columns capturing "To:", "From:", "CC:", "BCC:", etc. can be archived as a generic database rather than a specific and inaccessible one that only Outlook Express can read and write.
  • the criteria which can affect this search are factors such as the volume of data; resource availability (hard disk type, IDE or SCSI, speed of the device, and default caching supported); the value of data for that specific pattern (unbalanced tree, sorted); the associated arithmetic and logical operations required by the application using it, which may dictate the search and archival patterns for performance benefits; and the security, data integrity and audit policies associated with this type of data (encryption / compression).
  • mapping patterns of data irrespective of the server protocols and data formats in which requests are received in a relational database management system.
  • all known techniques have additional drawbacks: they do not provide security features; it is difficult to conceal information in directories when a directory or file is shared between users; OS-specific processes manage communication across different processes; the OS cannot track or take user-specific backups unless allocated separate disk partitions or directories; and different patterns of data, for maintenance purposes and functionality, require different types of servers to handle different protocols and data patterns / formats.
  • the present invention provides a software-implemented process, system and method for use in a computing environment, whereby a multiplicity of data patterns are mapped, irrespective of the server protocols and data formats in which requests are received in a RDBMS.
  • the present invention maps file systems as table structures, with each of their elements as columns of the table. The entire file content is saved as a blob record rather than managing the file as a series of clusters.
  • the present invention provides very high security, as all features of a database can be applied user-schema wise. Each user is isolated from the others and the entire directory info can be concealed, which is practically impossible in an operating system. Further, every action on every file access is through an internal object query, and for each of these actions triggers can be programmed to give interactivity, which is not possible in an OS file system.
  • FTP / HTTP server requests / commands are translated into internal queries to the database and results are given as expected by the client.
  • any FTP client or a web browser is not even aware that they are serviced by a DB rather than a FTP / HTTP server.
  • the present invention translates every user into a DB user entity as per the schema space allocated for his inbox / outbox and maps it as a table with to, from, cc, bcc, subject, body and attachments as columns of the table.
  • in the present invention, all rules for concurrency and disk caching, currently managed by OS processes like SmartDrive or FindFast, are taken care of by features of the DB itself rather than by OS-specific processes managing communication across different processes.
  • the present invention centralizes the different patterns of data for maintenance purposes and functionality, which normally requires different types of servers to handle different protocols and data patterns / formats.
  • the present invention centralizes these patterns based on usage frequency / expected changes / expected retrieval time / security constraints into table / column data-type formats supporting these features. Further, it provides for management of all data types along with security features, with a view to optimizing the disk space requirement and minimizing retrieval time, while considering possibilities of updating data of various data types.
  • the present invention manages the read and write features of an RDBMS and also handles the log file and database writer.
  • the log file is a part of the recovery management process, in order to recover data in case any abnormal shutdown or other system failure occurs, while the database writer is used to persist committed data blocks.
  • FSM Finite State Machine
  • Fig. 1 is a block diagram illustrating the interaction of components of the pattern analyzer with the disk optimizer of the current invention.
  • Fig. 2 is a block diagram illustrating the mechanism of storage of persistent data in the preferred embodiment of the current invention.
  • Fig. 3 illustrates the file types such as LOG file, Database files, SWAP file, Roll Back Segment in which persistent data exists.
  • Fig. 4 is a block diagram of the Disk Agent showing how it is used to write to database files.
  • Fig. 5 is a flow diagram illustrating the preferred embodiment of the current invention.
  • Fig. 6 is a block diagram illustrating the interaction of the Disk Agent of the current invention with various other agents.
  • the disk resource is used commonly by any functional server, but the pattern of disk usage varies because of the type of data, size of data, frequency of disk access requests, nature of disk access requests, etc. Since any multi-user functional server using the disk resource finally accesses some files, we derive patterns of functionalities and map patterns of data to arrive at a common event-based functional feature in a disk resource agent, which can encapsulate and deliver any disk resource functionality demanded by any multi-user server. This guarantees data exchange between functional servers, since the proposed agent always persists data in accepted compatible standards (e.g. ODBC).
  • Fig. 1 illustrates the working of the pattern analyzer.
  • the pattern analyzer is made up of a functional pattern analyzer 115, data pattern analyzer 120, functional resource pattern heuristics 100, data resource pattern heuristics 105.
  • the functional resource pattern heuristics 100 and data resource pattern heuristics 105 form a part of the Lookup table 110.
  • the components mentioned are interfaced with an optimizer, which segregates operations based on the type of operation to be performed on the type of data. We always assume that the final base object is a file whenever data is persisted.
  • the modes of usage and their sequence vary based on the state in which persistence is required, i.e. we generally categorize any data state entity into three parts:
  • Pre-result: transitory persistence required before the final result is derived. This may be required when temporary buffers, which are used to derive the final result, exceed RAM availability and need a swap from primary to secondary storage. This may also be the state of data between the beginning of a transaction and the final commit or rollback (generally RBS or SWAP)
  • Final persistence: this is the state of the final data, when the actual database file is updated after a successful commit to intermediate files like the log or archive logs.
  • the pattern and size of data persisted can be broadly classified into static and dynamic.
  • the static patterns of data can be various headers - database, extent, block, tuple, etc. which are database design entities and are part of metadata or meta-transactions.
  • the dynamic data is generally the data resulting from DML operation on user objects other than the metadata tables. These patterns are applications design specific and are variable in size based on amount of data in each record inserted, updated or deleted by the user.
  • the general functional pattern classification for any file operation is open, close, read, write, seek and flush. It does not matter whether the data file is a LOG file, Database file or a SWAP file. These operations being universal, the patterns of data and the sequence of operations generally dictate the flow of events the Disk Agent has to follow.
  • Cursor operations dictate the pattern of read / write and associated operations.
  • isolation of various states of data from the beginning of the transaction till the data has been committed needs to be maintained and optimized as per atomicity, consistency, isolation and durability ('ACID') properties.
  • the transactional isolation is managed through the roll back segment ('RBS') of each client sessional instance.
  • the disk optimizer uses a SWAP and RBS combination to achieve this objective.
  • the deferring of concurrent write operations typically an INSERT operation
  • INSERT operation concurrent write operations
  • Regular flushing of data (i.e. dirty blocks) of LOG or Database file is enforced, either time based (checkpointing) or priority based (flushing) to maintain the best state of data integrity of committed transactions.
  • the flushing logic is derived based on analytical heuristics of object usage across concurrent sessions and operations performed on them. These may include size of data, number of Tuples in a transaction, size of the transaction, source of data (local or distributed), replenishment of resource required, age of the dirtied block of data in cache etc.
  • for static or dynamic data, care is taken to ensure minimal seek and offset traversal, so as to achieve the fastest disk read and write times while not burdening the server with overuse of its kernel resources.
  • a database 200, which is logically divided into segments, which consist of groups of contiguous blocks.
  • the user determines the segment size and extent size at the time of database object creation.
  • This extent is also a logical data block size. It is invoked only when auto size expansion is enabled, and it automatically allocates the extra block for that object.
  • data stored into the logical blocks corresponds to a specific number of bytes of physical database space on the disk as shown in the figure.
  • An extent 205 consists of blocks 202, 203, 204, 205, 206, 207.
  • a block 202 further consists of a block header 208, and Tuples 210, 211 , 212.
  • the Block header 208 contains transaction information 209 such as Session ID, Number of Tuples, Save Point Level, Rollback Segment block ID.
  • a block header 208 also includes Page Index 213
  • a Tuple 212 further consists of a Tuple Header 214 and Tuple Buffer 215.
  • a Tuple Header 214 contains information such as the block id, block attributes, number of records stored in the block, next block, next free block, etc.
  • the Tuple Header 214 allows insertion of new records only while 60% of free block space is found in the Tuple Buffer 215; otherwise it automatically transfers the data into the next free block.
  • the header information, be it Block Header 208 or Tuple Header 214, is always classified as static data, whereas the Tuple Buffer 215 is classified as dynamic data.
  • each Tuple Buffer 215 comprises columns of data, each of similar or dissimilar data types or data patterns.
  • All static patterns are related to disk management operations. They contain information for the utilization of disk resource with optimum operations under various read / write operation workloads.
  • the static data is designed to contain information for recovering from media failures or corruption of dynamic data. This facilitates the recovery of such corrupted data by analyzing data patterns that are contiguous. Periodic checksums or cyclic redundancy checks are carried out using information contained within the static data to maintain data integrity.
  • Fig 3 illustrates the file types that are created when a new database file is opened.
  • the transaction can be in two states: one in which the data is persistent (committed data) and one in which it is transitory (uncommitted data). The data is in a transitory state in case of a non-committing type of transaction, for example a SELECT in a database.
  • in a commit type of transaction the data is in a persistent state.
  • when one persists the data, for example with an INSERT, UPDATE or DELETE in a database, care needs to be taken to make the persistence correct. Hence one can write to disk irrespective of concurrent requests from various clients and irrespective of server functionality.
  • a log file is always created 300.
  • the sequential Reads or Writes are appended to the LOG file.
  • the LOG Writer writes to the log file.
  • the current state, such as the transaction records of a file, can be maintained even in case of power failure. This transaction-committed LOG file is useful for recovery.
  • a Database (DB) writer writes at random Read or Write to DB files.
  • the DB file consists of Final Persistence data 305.
  • the SWAP file is continuously Read and Written 310. That means the SWAP file is a non-transactional file: it is temporary, holds provisional data for all transactions, and is independent of any transaction.
  • a Roll back Segment 315 depends on transaction and holds transitory data. It maintains the Transactional log and hence aids in recovery.
  • the random Read or Write (the transaction) is recorded 316.
  • the data is either committed or uncommitted. If the data is committed then a non-sessional LOG is prepared 317. Finally a final persistence is created when the file is saved for example as xyz.dbf 318.
  • Fig. 4 is a block diagram of the Disk Agent 426, consisting of Pattern Translator 400, Recovery Manager 401, Resource Manager 402, Error Handler 403, Disk Optimizer/Operation Manager 404, Memory Map File 405, Database (DB) Reader 406, DB Writer 407, LOG Writer 408, Log Reader 425, Disk I/O 427 and Roll Back Segment or Swap 409.
  • the Memory Map File 405 consists of Read Cache 410, Write or Dirty Cache 411, Roll Back Segment or Swap Cache 412 and Memory Map Cache 413.
  • also shown in the figure are two databases, namely DB1 414 and DB2 415. Each of these databases has its own database files 416, 419, Active Log files 417, 420 and Swap or Rollback Segment files 418, 421, plus some common files such as Control files 422, configuration files 423 and dat files 424.
  • the present invention is a Disk Agent 426, which primarily archives data from any server in formats of SQL types like Tables and records.
  • the Disk Agent is responsible for caching, concurrency, locking and managing data integrity across sessions accessing and modifying data simultaneously and finally recovery in case of media failure.
  • the final database file (DBF) has committed entries saved in a LOG file prior to final commit in database.
  • the Disk Agent 426 of the preferred embodiment of the present invention is a multithreaded agent and each thread is designed for a specific object and operation to work and translate these patterns of data as per operation.
  • the Disk Agent 426 can support 255 databases, each having 255 data files with sizes extending up to 64-bit offsets. As the size of a data file increases, the amount of seek and traversal increases. To support a fast seek operation, the agent has an option to open multiple handles for the same data file, which can operate between sets of offsets dictated by the control files and the configuration settings, shown in the figure as .cfg files.
  • the Disk Agent 426 has five threads, namely a DB Reader Thread 406, DB Writer Thread 407, LOG Writer Thread 408, LOG Reader Thread 425 and Roll Back Segment or Swap Thread 409, to perform specific functionality.
  • the preferred embodiment of the present invention takes care of recovery in case of media failure.
  • the final database file (DBF) has committed entries saved in a LOG file prior to final commit in database and any transactional but uncommitted data managed in session wise rollback segments.
  • the application may require temporary storage for transitional phases, which is managed in a central SWAP independent of sessions or transactions.
  • based on the type of operation, the nature of the request and the pattern of data types to be persisted, the Pattern Translator 400 creates certain metadata for the preamble or postamble, which comprises static data.
  • the pattern translator is an intelligent functional and data pattern engine, which associates a pattern of data with a set of functionalities under various utility conditions. These conditions may be query specific, object specific, operation specific or transaction specific, the correlation of whose dependencies is determined at run time based on various parameters such as the static or dynamic pattern of data, size of the data pattern, frequency and type of usage, change in state of the same data pattern and the set of associated operations required (e.g. for fault tolerance, data integrity, concurrency workload).
  • the query operation and conditions in the query along with transactional integrity associated with the query dictate the sequencing of functional patterns on the data generated as a result of the query execution.
  • the disk agent provides these as a standard set of generic event features, which can be used by any application requiring disk as a resource.
  • the set of disk resource functionality is governed and dictated as a set of generic patterns such as Data Pattern and Functional Pattern.
  • Functional Pattern comprises a set of generic functional modules sequenced to deliver desired functionality, such as managing optimum journaling of data so as to recover and rebuild data after any abnormal termination. This involves maintaining data in various state entities such as the RBS file, Log file and the final DB file.
  • another module is responsible for managing the functional data pattern entities, which clear and archive logs, terminate or truncate logs, and manage overflows and underflows without much user intervention.
  • another module is responsible for managing the user interface data pattern, which is equipped with language support or command-set APIs or other messaging events that can interact, communicate and expose various functionalities to developers and end-users, irrespective of vendor hardware or operating system, in a uniform and simple manner.
  • only in case of an abnormal shutdown or data corruption is the Recovery Manager 401 thread invoked. This happens only when, during database instance startup, the integrity check on the database and its supporting synchronization files fails. A recovery process is triggered which analyses every smallest tuple in a transaction and commits or rolls back data between the database file and log file so as to guarantee transactional integrity.
  • the Recovery Manager 401 is also responsible for generating a status log of instance recovery, or for applying incremental changes from the log file between specified periods to any previously backed-up database file.
  • the Resource Manager 402 allocates resources as per the requirements. It takes care and manages the different shared resources.
  • the Error Handler 403 carries out the error handling function and reporting mechanism according to the nature and type of errors.
  • the Disk Optimizer / Operation Manager 404 analyzes the type of requests and optimizes block requests across sessions and natures of operation, such as reads or writes, so as to merge and manage input and output and the corresponding resource allocation for these operations, which can be shared. Based on the result of request analysis, the operation manager schedules the other read or write threads. If the current volume of data to be flushed to the database file does not justify the write-process overheads, the operation manager may even ignore it until a minimum critical threshold is reached that justifies the database write cost. Such lazy or deferred writes may be force-committed by the checkpointing process.
  • the Optimizer module tries to optimize input output by analyzing the current pending requests for read or write across sessions, transactions or databases.
  • the Memory Map File 405 consists of Read Cache 410, Write or Dirty Cache 411, Roll Back Segment/Swap Cache 412 and Memory Map Cache 413, which all aid the process of caching. Hence disk space can be mapped and used as memory. Depending upon usage, these different caches can be clubbed and used as mapped memory along with RAM.
  • the DB Reader (DBRD) 406 threads perform only read operations on the database file, as specified by the database index in the request.
  • the translation of disk block data to a page data and corresponding caching of these disk blocks is managed by the DBRD thread.
  • the DB Writer (DBWR) 407 thread performs only write operations on the database files, for blocks that are marked dirty and acknowledged by a successful write in the LOG file.
  • the translation of page data into database blocks is managed by the DBWR thread.
  • Log writing is generally a sequential append write request managed by Log Writer (LOGWR) 408 thread.
  • LOGWR Log Writer
  • the log data may have to save various states of nested committed transactional data and their prior states in case of conditional or partial rollbacks.
  • LOGWR 408 guarantees immediate flushing of committed data taking care of Log size restrictions and switching over to an active new log file in case of size overflows.
  • FIG. 5 is a flow diagram of the Disk Agent 426.
  • the Operation Manager 404 classifies the request 501.
  • the isolation of the Database 502 is done with respect to the event parameters and the active database index for the operation.
  • the Disk Agent 426 proceeds to isolate the operand 503.
  • after the isolation of the operand, depending on the nature of the request, it is passed to the LOG 504, Database 505, SWAP 506 or Roll Back Segment (RBS) 507.
  • the Disk Agent 426 then proceeds to isolate the request with respect to current position and handle 508. After the isolation of the request with respect to current position and handle, the isolation of the offset is carried out 509.
  • after isolating the offset, the Disk Agent 426 then proceeds to isolate the operation and associated thread 510. Based on this isolation, the request is passed to the DBRD 406 / DBWR 407 / Log Writer 408 / Log Reader 425 and Roll Back Segment/Swap 409 threads for optimization 511. The pattern association of the outputs of the respective threads is then carried out 512. After the process of pattern association, the translation of these patterns is carried out 513. The Disk Agent 426 then proceeds to check if there is a read 514. In the event of no read request, the Disk Agent 426 proceeds to page the patterns to a persistent state 515. In the event of a successful read, the Disk Agent 426 proceeds to translate the block to a pattern page 516. The Disk Agent 426 then proceeds to insert the data into the associated cache 517.
  • Fig 6 shows the interaction of the Disk Agent 426 with various other agents.
  • for the Disk Agent 426, three threads have been considered, namely the LogWR thread 408, the DBWR 407 and the Disk I/O 427.
  • the LogWR 408 and LogRD 425 threads are used in case of recovery. Whenever any file is opened, a transaction record is always made in the LogWR 408 file. The LogRD 425 is rarely used; it is useful in case of file recovery. As mentioned earlier, the LogWR 408 file is a transaction-committed file. This is done to maintain data integrity irrespective of the sessions that are working on the same data. Further, as shown in the figure, through the interaction of the Scheduler Agent with the Disk Agent 426, the combination takes care of concurrent queries, caching, locking and management, irrespective of the number and nature of clients and across server functionality having shared resources.
  • the DBWR 407 thread is activated only in case of a database file. Every database write is first made to the RAM of the computer system: a block is translated to a page, and a page is a block of data held in memory. The DBWR 407 writes to a database file in case of a database Read or database Write. LogWR 408 has higher priority: as soon as a commit is given, everything is written to disk. The DBWR 407 is a lazy writer, that is, it delays the write to the disk (see the sketch following this list).
  • the Timer Agent 600 checks these two conditions; this Timer Agent 600 can be customized as per the configuration settings, and it forces synchronicity.
  • checkpointing can be of a synchronous or asynchronous type. Checkpointing is the process by which the Timer Agent 600 forces all the uncommitted data to the disk.
  • one unique aspect of the current invention is that there is no separate checkpointing thread; it is merged with the Disk Agent 426, and the Timer Agent 600 forces synchronicity. Further, the Timer Agent 600 is customizable, and hence both synchronous and asynchronous checkpointing are handled.
  • the Disk Agent 426, along with the Server Agent 601, DML Agent 602 and Index Agent 603, works to manage every disk read and write activity.
  • the Disk Agent 426 talks to the database and, according to the type of query, either gives the query to the DML Agent 602 in case of a DML query or to the Server Agent 601 in case of DDL or DCL commands. According to the active database, it can also be given to the Index Agent 603. Also, the DML Agent 602 as well as the Server Agent 601 can check with the Index Agent 603.
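Following up the forward reference in the DBWR bullet above: a minimal Python sketch of the lazy writer plus a timer-driven checkpoint. The checkpoint interval, page representation and thread structure are assumptions for illustration, not the patent's implementation.

```python
import threading, time

class LazyWriter:
    """Stands in for DBWR 407: collects dirty pages, defers DB-file writes."""
    def __init__(self):
        self.dirty = []                      # pages committed in the log,
        self.lock = threading.Lock()         # not yet in the .dbf file

    def mark_dirty(self, page: bytes):
        with self.lock:
            self.dirty.append(page)

    def checkpoint(self):
        # The Timer Agent forcing synchronicity: flush everything pending.
        with self.lock:
            flushed, self.dirty = self.dirty, []
        print(f"checkpoint: {len(flushed)} page(s) written to DB file")

writer = LazyWriter()
stop = threading.Event()

def timer_agent(interval_s: float = 1.0):    # interval is configurable, per the text
    while not stop.wait(interval_s):
        writer.checkpoint()

t = threading.Thread(target=timer_agent, daemon=True)
t.start()
writer.mark_dirty(b"page-1")
writer.mark_dirty(b"page-2")
time.sleep(1.5)                              # let one checkpoint fire
stop.set()
```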

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates generally to a system and method for mapping patterns of persistent data in a format independent of application functionality, so that the archived data allows accessibility, data exchange and sharing across multi-user applications, independent of functionality, without any external process of import or export, in a common format understood by the sharing applications.

Description

TITLE OF INVENTION
System and Method of mapping patterns of data, optimising disk read and write, verifying data integrity across clients and servers of different functionality having shared resources
BACKGROUND OF THE INVENTION
Normally, in client-server technology, a server shares hardware resources and persistent data generated by the server across clients, maintaining data integrity and state across concurrent users sharing this persistent data. The pattern of this persistent data used or generated by the server is purely as per the server's functional scope. In other words, database servers generate data in a table-column format, web servers in an Operating System (OS) native file directory structure, and mail servers in their own proprietary formats.
What separates these persistent data patterns is the interpretation of data bytes specific to the application scope. For example, though a bitmap is graphical data popularly generated by any painting software like Microsoft Paintbrush, there exist various derivatives of the persistent data format for the same image data, like Graphics Interchange Format (GIF) or TGA or Joint Photographic Experts Group (JPG) etc. The digital image data remains the same but the archiving patterns vary; that is, the preamble or postamble data pattern and the compression technologies meant for the fastest saving and retrieval vary. Hence there exists a trade-off between time and space: when the disk space required to save data is to be minimized, time is lost in compressing and uncompressing the data. Similarly, if time is to be saved, space is wasted in saving the uncompressed data.
Apart from these trade-offs, which are purely as per application-specific needs, the following are the critical factors dictating shared data access: 1) the hardware resource availability for the application's need (generally a ratio of Central Processing Unit (CPU) time to Hard Disk Drive (HDD) access time, and the capacity to handle the single largest block of data);
2) sharing or concurrent access of this digitized data across clients;
3) archiving modes, generally dictated by application need.
Since need dictated these patterns, data exchange was impossible, and software such as middleware, which could understand and translate these patterns, evolved. Hence contemporary image editing software supports a wide variety of these file formats. This was easy because the pattern of data these applications worked on was very restricted, and very few applications had tools supported by or developed on them. Unlike a Relational Database Management System (RDBMS), where every application is built over and around it with a lot of business intelligence linked to operational functionality, image editing software never had portability issues, nor was there a need to share data concurrently.
Databases are an important tool for the storage and management of information for businesses. Both relational database management systems (RDBMS) and non-relational database management systems exist for this purpose. Examples of RDBMSs include ORACLE, DB2 and INFORMIX. Examples of non-relational databases include custom databases created with the operating system, developed by IBM. The operating system allows programmers to create custom databases, which support keyed, sequential and binary file types. Database Management Systems (DBMS) provide users the capabilities of controlling read/write access, specifying report generation and analyzing usage.
Prior to the arrival of RDBMS, DBMS existed on various Operating Systems (OS), which used to maintain data in files (known as file systems) with tree structures supporting different directories and subdirectories. These OS file systems evolved with different vendors, with each vendor supporting different features and access mechanisms to enhance disk operations. As a result of these multiple approaches by different vendors, popular file systems developed, like NTFS for Windows, File Allocation Table (FAT) for DOS, FAT32 for Windows 9x, NDS for NetWare and HPFS for OS/2. However, there existed a similarity in the way they maintained the root directory entries and locations on the hard disk, known as the File Allocation Tables. The hard disk was divided into cylinders/tracks/sectors/heads, and combinations of sectors formed clusters. Each file occupied a series of these clusters, which were generally 2 KB, while sectors were 512 bytes. So even if a file of 1 byte was created, an entire cluster was allocated. This was a result of a compromise with the existing file access mechanism, which locates a file on the hard disk by mapping the cluster numbers as indexes, while the root directory entries saved the other information of the file along with the mapping of the location, i.e. the cluster number. The root entry saved the filename, file size, file attributes, file date and time of creation, etc. However, a drawback of this practice was that it resulted in excess space being wasted.
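To put numbers on that waste, the arithmetic can be sketched in a few lines of Python; a minimal illustration assuming the 2 KB cluster and 512-byte sector sizes quoted above (the sample file sizes are invented):

```python
import math

SECTOR_BYTES = 512
SECTORS_PER_CLUSTER = 4                  # 4 * 512 = 2 KB cluster, as in the text
CLUSTER_BYTES = SECTOR_BYTES * SECTORS_PER_CLUSTER

def allocated_bytes(file_size: int) -> int:
    """Bytes actually reserved on disk: whole clusters only."""
    clusters = max(1, math.ceil(file_size / CLUSTER_BYTES))
    return clusters * CLUSTER_BYTES

for size in (1, 2048, 2049, 10_000):
    alloc = allocated_bytes(size)
    print(f"file of {size:>6} B occupies {alloc:>6} B (waste: {alloc - size} B)")
# A 1-byte file still occupies a full 2 KB cluster, which is exactly
# the excess-space drawback described above.
```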
Furthermore, the data patterns, which were the final persistent part of any digitized data capturing mechanism, could be broadly classified into a few specific data types already existing in the ODBC mechanism. ODBC termed these patterns 'SQL Data Types', and any form of these types in application-specific combinations formed data. For example, Outlook Express has its own proprietary file format (.mbx, .pst), which is the persistent part of its data capturing mechanism. The same data can be classified as table and column entities from a database perspective, and a table with columns capturing "To:", "From:", "CC:", "BCC:", etc. can be archived as a generic database rather than a specific and inaccessible one that only Outlook Express can read and write.
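As an illustration of this idea, the sketch below archives Outlook-style mail fields as columns of a generic table using Python's built-in sqlite3 module. The table and column names are illustrative assumptions, not anything the patent specifies:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mailbox (
        id         INTEGER PRIMARY KEY,
        sender     TEXT,     -- "From:"
        recipient  TEXT,     -- "To:"
        cc         TEXT,
        bcc        TEXT,
        subject    TEXT,
        body       TEXT,
        attachment BLOB      -- raw attachment bytes, if any
    )
""")
conn.execute(
    "INSERT INTO mailbox (sender, recipient, subject, body) "
    "VALUES (?, ?, ?, ?)",
    ("alice@example.com", "bob@example.com", "Hello", "Hi Bob"),
)
# Any SQL-capable client can now read the mail store:
for row in conn.execute("SELECT sender, subject FROM mailbox"):
    print(row)
```

Once in this form, the data is no longer inaccessible to everything but one mail client, which is the accessibility argument the paragraph makes.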
These data patterns (Date, Time, String, Float, etc.) are associated with a search engine best suited to that pattern of data, and with an archival mechanism which can, with minimum overheads, save and help retrieve this data in the fastest possible manner.
The criteria which can affect this search are factors such as: the volume of data; resource availability (hard disk type, IDE or SCSI, speed of the device, and default caching supported); the value of data for that specific pattern (unbalanced tree, sorted); the associated arithmetic and logical operations required by the application using it, which may dictate the search and archival patterns for performance benefits; and the security, data integrity and audit policies associated with this type of data (encryption / compression).
These patterns of data types form the smallest entity of a database, called columns, and a collection of these patterns, dictated by the business logic of the application, forms tables. Based on the frequency of usage of a table and certain of its columns, the search demands INDEXING, i.e. a persistent sorted pattern of data.
There exists no known technique for mapping patterns of data irrespective of the server protocols and data formats in which requests are received in a relational database management system. Further, all known techniques have additional drawbacks: they do not provide security features; it is difficult to conceal information in directories when a directory or file is shared between users; OS-specific processes manage communication across different processes; the OS cannot track or take user-specific backups unless allocated separate disk partitions or directories; and different patterns of data, for maintenance purposes and functionality, require different types of servers to handle different protocols and data patterns / formats.
SUMMARY OF THE INVENTION
To meet the foregoing needs, the present invention provides a software-implemented process, system and method for use in a computing environment, whereby a multiplicity of data patterns are mapped irrespective of the server protocols and data formats in which requests are received in an RDBMS. The present invention maps file systems as table structures, with each of their elements as columns of the table. The entire file content is saved as a blob record rather than managing the file as a series of clusters. The present invention provides very high security, as all features of a database can be applied user-schema wise. Each user is isolated from the others, and the entire directory info can be concealed, which is practically impossible in an operating system. Further, every action on every file access is through an internal object query, and for each of these actions triggers can be programmed to give interactivity, which is not possible in an OS file system.
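A minimal sketch of the file-system-as-table idea follows, again using sqlite3; the schema, the owner column standing in for per-user schema isolation, and the helper names are assumptions for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE fs (
        path    TEXT PRIMARY KEY,   -- replaces the directory entry
        owner   TEXT,               -- stand-in for per-user schema isolation
        attrs   INTEGER,            -- attribute bits
        created TEXT,
        content BLOB                -- entire file body as one blob record
    )
""")

def write_file(path: str, owner: str, data: bytes) -> None:
    db.execute("INSERT OR REPLACE INTO fs VALUES (?, ?, 0, datetime('now'), ?)",
               (path, owner, data))

def read_file(path: str, owner: str) -> bytes:
    # A user sees only rows under his own "schema"; other users'
    # directory info stays concealed, as the text claims.
    row = db.execute("SELECT content FROM fs WHERE path=? AND owner=?",
                     (path, owner)).fetchone()
    if row is None:
        raise FileNotFoundError(path)
    return row[0]

write_file("/home/a/readme.txt", "a", b"hello")
print(read_file("/home/a/readme.txt", "a"))   # b'hello'
```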
In the present invention, FTP / HTTP server requests / commands are translated into internal queries to the database, and results are given as expected by the client. Hence an FTP client or a web browser is not even aware that it is serviced by a DB rather than an FTP / HTTP server. Further, in the case of an email server, which works on the SMTP and POP protocols, the present invention translates every user into a DB user entity as per the schema space allocated for his inbox / outbox and maps it as a table with to, from, cc, bcc, subject, body and attachments as columns of the table.
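The translation step might look like the following sketch, which maps a small assumed subset of FTP verbs onto parameterized queries against the hypothetical fs table from the previous example; the patent does not specify command coverage or query forms at this level:

```python
def ftp_to_query(command: str) -> tuple:
    """Translate an FTP command line into (sql, params) for the fs table."""
    verb, _, arg = command.partition(" ")
    if verb == "LIST":              # directory listing -> metadata SELECT
        return ("SELECT path, length(content), created FROM fs WHERE path LIKE ?",
                (arg.rstrip("/") + "/%",))
    if verb == "RETR":              # file download -> content SELECT
        return "SELECT content FROM fs WHERE path = ?", (arg,)
    if verb == "DELE":              # file deletion -> DELETE
        return "DELETE FROM fs WHERE path = ?", (arg,)
    raise ValueError(f"unsupported verb: {verb}")

sql, params = ftp_to_query("RETR /home/a/readme.txt")
print(sql, params)
```

The FTP client only ever sees normal protocol responses, so it has no way of knowing a database produced them.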
Further, in the present invention, all rules for concurrency and disk caching, which are currently managed by OS processes like SmartDrive or FindFast, are taken care of by features of the DB itself rather than by OS-specific processes managing communication across different processes.
User quota, currently a difficult task for the server to handle, is monitored, and triggers can be used to interact with other users and specified tasks, which is currently not possible except for functions such as reply/back. Attachments that are common across users can be exposed as views; hence a lot of disk space can be saved when the same attachment is CCed to multiple users, a situation in which email servers die for want of disk space.
The present invention centralizes the different patterns of data for maintenance purposes and functionality, which normally require different types of servers to handle different protocols and data patterns / formats. The present invention centralizes these patterns, based on usage frequency / expected changes / expected retrieval time / security constraints, into table / column data-type formats supporting these features. Further, it provides for management of all data types along with security features, with a view to optimizing the disk space requirement and minimizing retrieval time, while considering possibilities of updating data of various data types. The present invention manages the read and write features of an RDBMS and also handles the log file and database writer. The log file is a part of the recovery management process, in order to recover data in case any abnormal shutdown or other system failure occurs, while the database writer is used to persist committed data blocks.
The entire design is based on state machines and modules comprising various events communicating via messages; that is, it is event driven using the Finite State Machine (FSM) concept, and the functionality is broken down into a series of events scheduled by the kernel.
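As a toy illustration of such an event-driven FSM, the sketch below drives state changes from a queue of events; the states, events and transition table are invented, not taken from the patent:

```python
from collections import deque

# (current state, event) -> next state
TRANSITIONS = {
    ("idle",     "request"): "classify",
    ("classify", "read"):    "fetch",
    ("classify", "write"):   "log",
    ("log",      "logged"):  "flush",
    ("fetch",    "done"):    "idle",
    ("flush",    "done"):    "idle",
}

def run(events):
    state, queue = "idle", deque(events)
    while queue:
        event = queue.popleft()          # the "kernel" scheduling step
        state = TRANSITIONS.get((state, event), state)
        print(f"event={event:<8} -> state={state}")

run(["request", "write", "logged", "done"])
```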
BRIEF DESCRIPTION OF THE DRAWINGS
The various objects and advantages of the present invention will become apparent to those of ordinary skill in the relevant art after reviewing the following detailed description and accompanying drawings, wherein:
Fig. 1 is a block diagram illustrating the interaction of components of the pattern analyzer with the disk optimizer of the current invention.
Fig. 2 is a block diagram illustrating the mechanism of storage of persistent data in the preferred embodiment of the current invention.
Fig. 3 illustrates the file types, such as the LOG file, Database files, SWAP file and Roll Back Segment, in which persistent data exists.
Fig. 4 is a block diagram of the Disk Agent showing how it is used to write to database files.
Fig. 5 is a flow diagram illustrating the preferred embodiment of the current invention.
Fig. 6 is a block diagram illustrating the interaction of the Disk Agent of the current invention with various other agents.
DETAILED DESCRIPTION OF THE INVENTION
While the present invention is susceptible to embodiment in various forms, there is shown in the drawings and will hereinafter be described a presently preferred embodiment with the understanding that the present disclosure is to be considered an exemplification of the invention and is not intended to limit the invention to the specific embodiment illustrated.
In the present disclosure, the words "a" or "an" are to be taken to include both the singular and the plural. Conversely, any reference to plural items shall, where appropriate, include the singular.
The disk resource is used commonly by any functional server, but the pattern of disk usage varies because of the type of data, size of data, frequency of disk access requests, nature of disk access requests, etc. Since any multi-user functional server using the disk resource finally accesses some files, we derive patterns of functionalities and map patterns of data to arrive at a common event-based functional feature in a disk resource agent, which can encapsulate and deliver any disk resource functionality demanded by any multi-user server. This guarantees data exchange between functional servers, since the proposed agent always persists data in accepted compatible standards (e.g. ODBC).
Fig. 1 illustrates the working of the pattern analyzer. The pattern analyzer is made up of a functional pattern analyzer 115, data pattern analyzer 120, functional resource pattern heuristics 100 and data resource pattern heuristics 105. The functional resource pattern heuristics 100 and data resource pattern heuristics 105 form a part of the Lookup table 110. The components mentioned are interfaced with an optimizer, which segregates operations based on the type of operation to be performed on the type of data. We always assume that the final base object is a file whenever data is persisted. The modes of usage and their sequence vary based on the state in which persistence is required, i.e. we generally categorize any data state entity into three parts:
Pre-result: transitory persistence required before the final result is derived. This may be required when temporary buffers, which are used to derive the final result, exceed RAM availability and need a swap from primary to secondary storage. This may also be the state of data between the beginning of a transaction and the final commit or rollback (generally RBS or SWAP).
Post-result: when commit is specified explicitly in a transaction, the data has to be persisted immediately so as to maintain adherence to standards and Codd's rules. Also, every fault-tolerance measure has to be taken to guarantee data integrity and recovery in case of hardware malfunction or failures.
Final persistence: this is the state of the final data, when the actual database file is updated after a successful commit to intermediate files like the log or archive logs.
The nature of the command, state of objects and database, concurrent usage, resource / parameterized constraints, server / transactional settings, size and distribution of transactional sub-entities, etc. dictate the sequence in which any of these files will be written first. Based on the operation and recovery expectations, one of the cases of Undo/Redo, Undo/No-Redo, No-Undo/Redo or No-Undo/No-Redo is adopted, and the corresponding entities are updated. The pattern and size of the data persisted can be broadly classified into static and dynamic. The static patterns of data can be various headers (database, extent, block, tuple, etc.), which are database design entities and are part of metadata or meta-transactions. The dynamic data is generally the data resulting from DML operations on user objects other than the metadata tables. These patterns are application-design specific and are variable in size, based on the amount of data in each record inserted, updated or deleted by the user.
The general functional pattern classification for any file operation is open, close, read, write, seek and flush. It does not matter whether the data file is a LOG file, Database file or a SWAP file. These operations being universal, the patterns of data and the sequence of operations generally dictate the flow of events the Disk Agent has to follow.
Cursor operations dictate the pattern of read / write and associated operations. During concurrent read / write operations on the same object data and its associated linked dependencies, isolation of the various states of data, from the beginning of the transaction until the data has been committed, needs to be maintained and optimized as per the atomicity, consistency, isolation and durability ('ACID') properties. The transactional isolation is managed through the roll back segment ('RBS') of each client sessional instance. Whenever the RBS or primary storage (i.e. RAM) is insufficient, or prioritization needs it to be in a passive state temporarily, the disk optimizer uses a SWAP and RBS combination to achieve this objective. The deferring of concurrent write operations (typically an INSERT operation) on the same object across various sessional instances is also dynamically decided by the optimizer.
Regular flushing of data (i.e. dirty blocks) of the LOG or Database file is enforced, either time based (checkpointing) or priority based (flushing), to maintain the best state of data integrity of committed transactions. The flushing logic is derived from analytical heuristics of object usage across concurrent sessions and the operations performed on them. These may include the size of data, number of Tuples in a transaction, size of the transaction, source of data (local or distributed), replenishment of resources required, age of the dirtied block of data in cache, etc. During any of these block writes or reads (static or dynamic data), care is taken to ensure minimal seek and offset traversal, so as to achieve the fastest disk read and write times while not burdening the server with overuse of its kernel resources.
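A flush decision of this kind could be sketched as follows; the thresholds and the particular rule order are invented, while the inputs (age of the dirty block, size, local versus distributed source) are the ones the paragraph lists:

```python
import time

class DirtyBlock:
    def __init__(self, size: int, tuples: int, distributed: bool):
        self.size = size
        self.tuples = tuples
        self.distributed = distributed
        self.dirtied_at = time.monotonic()

    def age(self) -> float:
        return time.monotonic() - self.dirtied_at

def should_flush(block: DirtyBlock,
                 max_age_s: float = 5.0,        # checkpoint interval (assumed)
                 min_bytes: int = 64 * 1024) -> bool:
    if block.age() >= max_age_s:                # time based: checkpointing
        return True
    if block.size >= min_bytes:                 # priority based: flushing
        return True
    return block.distributed                    # remote-origin data flushed eagerly

blk = DirtyBlock(size=128 * 1024, tuples=40, distributed=False)
print(should_flush(blk))   # True: size threshold exceeded
```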
Referring now to the drawings, particularly Fig. 2, shown is a database 200, which is logically divided into segments, which consist of groups of contiguous blocks. The user determines the segment size and extent size at the time of database object creation. This extent is also a logical data block size. It is invoked only when auto size expansion is enabled, and it automatically allocates the extra block for that object. At the base level, data stored in the logical blocks corresponds to a specific number of bytes of physical database space on the disk, as shown in the figure. An extent 205 consists of blocks 202, 203, 204, 205, 206, 207. A block 202 further consists of a block header 208 and Tuples 210, 211, 212. The Block header 208 contains transaction information 209 such as the Session ID, Number of Tuples, Save Point Level and Rollback Segment block ID. A block header 208 also includes a Page Index 213. A Tuple 212 further consists of a Tuple Header 214 and a Tuple Buffer 215.
A Tuple Header 214 contains information such as the block id, block attributes, number of records stored in the block, next block, next free block, etc. The Tuple Header 214 allows insertion of new records only while 60% of free block space is found in the Tuple Buffer 215; otherwise it automatically transfers the data into the next free block. The header information, be it Block Header 208 or Tuple Header 214, is always classified as static data, whereas the Tuple Buffer 215 is classified as dynamic data. Each Tuple Buffer 215 comprises columns of data, each of similar or dissimilar data types or data patterns.
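To make the header layout concrete, here is a sketch using Python's struct module; the field widths and byte order are assumptions, since the patent names the fields but not a binary format, while the 60% rule is taken directly from the text:

```python
import struct

# Block header: Session ID, Number of Tuples, Save Point Level,
# Rollback Segment block ID (four unsigned 32-bit ints, assumed widths).
BLOCK_HDR = struct.Struct("<IIII")

# Tuple header: block id, attribute bits, record count, next block,
# next free block (five unsigned 32-bit ints, assumed widths).
TUPLE_HDR = struct.Struct("<IIIII")

def can_insert(free_bytes: int, buffer_bytes: int) -> bool:
    # The text's rule: insert into this Tuple Buffer only while 60%
    # of the block space is free; otherwise move to the next free block.
    return free_bytes >= 0.6 * buffer_bytes

hdr = BLOCK_HDR.pack(7, 3, 1, 42)       # session 7, 3 tuples, savepoint 1, RBS 42
print(len(hdr))                          # 16 bytes of static header data
print(can_insert(free_bytes=700, buffer_bytes=1000))   # True
```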
All static patterns are related to disk management operations. They contain information for the utilization of the disk resource with optimum operations under various read / write workloads. The static data is designed to contain information for recovering from media failures or corruption of dynamic data. This facilitates the recovery of such corrupted data by analyzing data patterns that are contiguous. Periodic checksums or cyclic redundancy checks are carried out using information contained within the static data to maintain data integrity.
Fig. 3 illustrates the file types that are created when a new database file is opened. The transaction can be in two states: one in which the data is persistent (committed data) and one in which it is transitory (uncommitted data). The data is in a transitory state in case of a non-committing type of transaction, for example a SELECT in a database. In case of a commit type of transaction, the data is in a persistent state. When one persists the data, for example with an INSERT, UPDATE or DELETE in a database, care needs to be taken to make the persistence correct. Hence one can write to disk irrespective of concurrent requests from various clients and irrespective of server functionality. When a new database file is opened or created, the following four types of file are made:
Before writing to a final file, a log file is always created 300. The sequential Reads or Writes are appended to the LOG file. The LOG Writer writes to the log file. In a LOG file, the current state, such as the transaction records of a file, can be maintained even in case of power failure. This transaction-committed LOG file is useful for recovery.
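The log-before-data discipline described here is essentially a write-ahead log. A minimal sketch, with an invented record format and demo path, might look like this:

```python
import os, tempfile

def log_commit(log_path: str, session_id: int, payload: bytes) -> None:
    """Append a committed record sequentially and fsync before returning,
    so the transaction survives a power failure."""
    record = b"%d|%d|" % (session_id, len(payload)) + payload + b"\n"
    with open(log_path, "ab") as log:     # sequential append only
        log.write(record)
        log.flush()
        os.fsync(log.fileno())            # durable before we acknowledge

def replay(log_path: str):
    """Recovery: re-read committed records in log order."""
    with open(log_path, "rb") as log:
        for line in log:
            sid, length, payload = line.rstrip(b"\n").split(b"|", 2)
            yield int(sid), payload

path = os.path.join(tempfile.gettempdir(), "demo.log")
log_commit(path, 1, b"INSERT ...")
print(list(replay(path)))
```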
Whenever the database is idle, a lazy write to disk takes place. A Database (DB) Writer performs random Reads or Writes to DB files. The DB file consists of the Final Persistence data 305.
The SWAP file is continuously Read and Written 310. That means the SWAP file is a non-transactional file: it is temporary, holds provisional data for all transactions, and is independent of any transaction.
A Roll back Segment 315 depends on transaction and holds transitory data. It maintains the Transactional log and hence aids in recovery.
The random Read or Write (the transaction) is recorded 316. Depending on the status of the file, the data is either committed or uncommitted. If the data is committed, then a non-sessional LOG is prepared 317. Finally, final persistence is achieved when the file is saved, for example as xyz.dbf 318.
Fig. 4 is a block diagram of the Disk Agent 426, consisting of Pattern Translator 400, Recovery Manager 401, Resource Manager 402, Error Handler 403, Disk Optimizer/Operation Manager 404, Memory Map File 405, Database (DB) Reader 406, DB Writer 407, LOG Writer 408, Log Reader 425, Disk I/O 427 and Roll Back Segment or Swap 409. Further, the Memory Map File 405 consists of Read Cache 410, Write or Dirty Cache 411, Roll Back Segment or Swap Cache 412 and Memory Map Cache 413. Also shown in the figure are two databases, namely DB1 414 and DB2 415. Each of these databases has its own database files 416, 419, Active Log files 417, 420 and Swap or Rollback Segment files 418, 421, plus some common files such as Control files 422, configuration files 423 and dat files 424.
The present invention is a Disk Agent 426, which primarily archives data from any server in SQL-type formats such as tables and records. The Disk Agent is responsible for caching, concurrency, locking, managing data integrity across sessions that access and modify data simultaneously and, finally, recovery in case of media failure. As part of standard data journaling practice, the final database file (DBF) has committed entries saved in a LOG file prior to the final commit in the database.
As seen from Figure 4, the Disk Agent 426 of the preferred embodiment of the present invention is a multithreaded agent, and each thread is designed for a specific object and operation, working on and translating these patterns of data according to the operation. In the preferred embodiment, the Disk Agent 426 can support 255 databases, each having 255 data files with sizes addressable up to 64 bits. As the size of a data file increases, the amount of seeking and traversal increases. To support fast seek operations, the agent has an option to open multiple handles for the same data file, each operating between a set of offsets dictated by the control files and the configuration settings (shown in the figure as .cfg files). As depicted in the figure, the Disk Agent 426 has five threads, namely a DB Reader Thread 406, DB Writer Thread 407, LOG Writer Thread 408, LOG Reader Thread 425 and Rollback Segment or Swap Thread 409, each performing specific functionality.
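The multiple-handle fast-seek scheme can be sketched as follows, assuming a 16 KiB offset range per handle in place of the ranges dictated by the control and .cfg files.

```python
import os
import tempfile

# Illustrative multi-handle scheme: several read handles are opened on one
# data file, each owning a 16 KiB offset range (an assumed stand-in for the
# ranges dictated by the control and .cfg files), so seeks in different
# regions do not contend for a single file pointer.

path = os.path.join(tempfile.mkdtemp(), "db1.dbf")
with open(path, "wb") as f:
    f.write(os.urandom(1 << 16))          # 64 KiB stand-in data file

RANGE = 1 << 14                           # 16 KiB per handle (assumed)
handles = {start: open(path, "rb") for start in range(0, 1 << 16, RANGE)}

def read_at(offset: int, n: int) -> bytes:
    h = handles[(offset // RANGE) * RANGE]   # handle owning this offset range
    h.seek(offset)
    return h.read(n)

print(len(read_at(40000, 32)))            # 32 bytes via the third handle
```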
The preferred embodiment of the present invention takes care of recovery in case of media failure. As noted above, committed entries are saved in a LOG file prior to the final commit in the database file, and any transactional but uncommitted data is managed in session-wise rollback segments. During the course of execution, and before the generation of the final result data to be persisted, the application may require temporary storage for transitional phases, which is managed in a central SWAP independent of sessions or transactions. Based on the type of operation, the nature of the request and the pattern of data types to be persisted, the Pattern Translator 400 creates certain metadata as a preamble or postamble, which comprises static data. This metadata is generated per operation on the current database usage instance and per the nature of the operation, session and transaction, so that any abnormal shutdown of the server or hardware failure still guarantees instance and committed-transactional recovery. The Pattern Translator is an intelligent functional and data pattern engine which associates a pattern of data with a set of functionalities under various utility conditions. These conditions may be query-specific, object-specific, or operation- or transaction-specific; the correlation of their dependencies is determined at run time based on various parameters such as the static or dynamic nature of the data pattern, the size of the data pattern, the frequency and type of usage, changes in the state of the same data pattern and the set of associated operations required (e.g. for fault tolerance, data integrity or concurrency workload).
During any disk operation required by a server, the query operation and the conditions in the query, along with the transactional integrity associated with the query, dictate the sequencing of functional patterns on the data generated as a result of the query execution. The Disk Agent provides these as a standard set of generic event features which can be used by any application requiring the disk as a resource. The set of disk resource functionality is governed and dictated as a set of generic patterns, namely the Data Pattern and the Functional Pattern.
A Functional Pattern comprises a set of generic functional modules sequenced to deliver the desired functionality, such as managing optimum journaling of data so as to recover and rebuild data after any abnormal termination; this involves maintaining data at various state entities such as the RBS file, the LOG file and the final DB file. Another module manages the functional data pattern entities: it clears and archives logs, terminates or truncates logs, and manages overflows and underflows without much user intervention. A further module monitors the data pattern to check, prevent and notify the user about data fragmentation so that freed or available resources are allocated optimally. Yet another module manages the user interface data pattern, which is equipped with language support, command-set APIs or other messaging events that interact, communicate and expose the various functionalities to developers and end users, irrespective of vendor hardware or operating system, in a uniform and simple manner.
The Recovery Manager 401 thread is invoked only in the case of an abnormal shutdown or data corruption, that is, only when the integrity check on the database and its supporting synchronization files fails during database instance startup. A recovery process is then triggered which analyses every smallest tuple in a transaction and commits or rolls back data between the database file and the log file so as to guarantee transactional integrity. The Recovery Manager 401 is also responsible for generating a status log of instance recovery, and for applying incremental changes from the log file between specified periods to any previously backed-up database file.
The Resource Manager 402 allocates resources as per requirements and manages the different shared resources. The Error Handler 403 carries out the error handling and reporting mechanism according to the nature and type of the errors.
The Disk Optimizer/Operation Manager 404 analyzes the type of requests and optimizes block requests across sessions and across the nature of operations, such as reads or writes, so as to merge and manage input and output and the corresponding resource allocation for operations that can be shared. Based on the result of request analysis, the Operation Manager schedules the other read or write threads. If the current volume of data to be flushed to the database file does not justify the write-process overheads, the Operation Manager may defer it until a minimum critical threshold is reached that justifies the database write cost. Such a lazy or deferred write may be force-committed by the checkpointing process. The Optimizer module tries to optimize input/output by analyzing the currently pending read or write requests across sessions, transactions or databases. Also, as per the nature of a request, certain inputs or outputs are prioritized, and the precedences of read or write requests are dynamically scheduled. It also manages the read and write operations to the disk, taking care of optimum disk usage. The Memory Map File 405 consists of the Read Cache 410, Write or Dirty Cache 411, Rollback Segment/Swap Cache 412 and Memory Map Cache 413, which all aid the process of caching. Hence disk space can be mapped and used as memory; depending upon usage, these different caches can be clubbed and used as mapped memory along with RAM.
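The deferred-write policy can be sketched as follows; the threshold value is an assumed configuration setting, and the checkpoint flag stands in for the force-commit issued by the checkpointing process.

```python
# Illustrative deferred-write policy: dirty data is held back until a
# minimum volume justifies the write cost, while a checkpoint forces the
# flush regardless.

class LazyWriter:
    THRESHOLD = 64 * 1024                  # assumed minimum bytes per flush

    def __init__(self):
        self.dirty = bytearray()

    def write(self, data: bytes, checkpoint: bool = False):
        self.dirty += data
        if checkpoint or len(self.dirty) >= self.THRESHOLD:
            self.flush()

    def flush(self):
        if self.dirty:
            print(f"flushing {len(self.dirty)} bytes to the DB file")
            self.dirty.clear()

w = LazyWriter()
w.write(b"x" * 1024)                       # deferred: below threshold
w.write(b"y" * 1024, checkpoint=True)      # force-committed by checkpointing
```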
The DB Reader (DBRD) 406 thread performs only read operations on the database file specified by the database index in the request. The translation of disk block data to page data, and the corresponding caching of these disk blocks, is managed by the DBRD thread. The DB Writer (DBWR) 407 thread performs only write operations on the database files, for blocks that are marked dirty and acknowledged by a successful write to the LOG file. The translation of page data into database blocks is managed by the DBWR thread.
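The block-to-page translation and read caching handled by the DBRD thread might look like the following sketch; the 512-byte block and 4096-byte page sizes are assumptions.

```python
import io

# Illustrative block/page translation with read caching, as handled by the
# DBRD thread.

BLOCK = 512
PAGE = 4096                                # one page spans 8 disk blocks here

def blocks_for_page(page_no: int) -> list:
    first = page_no * (PAGE // BLOCK)
    return list(range(first, first + PAGE // BLOCK))

read_cache = {}                            # Read Cache 410: page no -> bytes

def read_page(f, page_no: int) -> bytes:
    if page_no not in read_cache:          # cache miss: fetch from "disk"
        f.seek(page_no * PAGE)
        read_cache[page_no] = f.read(PAGE)
    return read_cache[page_no]

f = io.BytesIO(bytes(1 << 16))             # stand-in for a database file
print(blocks_for_page(3))                  # disk blocks backing page 3
print(len(read_page(f, 3)))                # 4096, now served from the cache
```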
Log writing is generally a sequential append-write request managed by the Log Writer (LOGWR) 408 thread. Depending on the operation, the log data may have to save various states of nested committed transactional data, and their prior states in case of conditional or partial rollbacks. LOGWR 408 guarantees immediate flushing of committed data, taking care of log size restrictions and switching over to a new active log file in case of size overflow.
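The size-restricted log switching performed by LOGWR 408 can be sketched as follows, assuming a 1 MiB active-log limit and an illustrative naming scheme.

```python
import os

# Illustrative size-restricted log switching: committed records are flushed
# immediately, and a size overflow switches writing to a new active log file.

MAX_LOG_SIZE = 1 << 20                     # assumed 1 MiB active-log limit

def append_committed(base: str, seq: int, record: bytes) -> int:
    path = f"{base}.{seq}.log"
    if os.path.exists(path) and os.path.getsize(path) + len(record) > MAX_LOG_SIZE:
        seq += 1                           # overflow: switch to a new log
        path = f"{base}.{seq}.log"
    with open(path, "ab") as log:
        log.write(record)
        log.flush()
        os.fsync(log.fileno())             # immediate flush of committed data
    return seq

seq = append_committed("db1", 0, b"COMMIT txn=1\n")
```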
FIG. 5 is a flow diagram of the Disk Agent 426. As soon as a request for the Disk Agent 426 is received 500, the Operation Manager 404 classifies the request 501. Next, the isolation of the database 502 is done with respect to the event parameters and the active database index for the operation. After isolation of the database, the Disk Agent 426 proceeds to isolate the operand 503. After the isolation of the operand, depending on the nature of the request, it is passed to the LOG 504, Database 505, SWAP 506 or Rollback Segment (RBS) 507. The Disk Agent 426 then proceeds to isolate the request with respect to the current position and handle 508, after which the isolation of the offset is carried out 509. After isolating the offset, the Disk Agent 426 proceeds to isolate the operation and the associated thread 510. Based on this isolation, the request is passed to the DBRD 406, DBWR 407, LOG Writer 408, LOG Reader 425 or Rollback Segment/Swap 409 thread for optimization 511. The pattern association of the outputs of the respective threads is then carried out 512, after which the translation of these patterns is carried out 513. The Disk Agent 426 then checks whether the request is a read 514. If there is no read request, the Disk Agent 426 proceeds to page the patterns to a persistent state 515. In the event of a successful read, the Disk Agent 426 proceeds to translate the block to a pattern page 516. The Disk Agent 426 then inserts the data in the associated cache 517.
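Condensed into code, the isolation pipeline of Fig 5 might look like the following sketch; the request fields and the dispatch table are illustrative only.

```python
# Condensed sketch of the Fig 5 pipeline: classify, isolate the database,
# operand, offset and operation, then dispatch to the matching thread.

THREADS = {"LOG_WRITE": "LOGWR", "LOG_READ": "LOGRD",
           "DB_READ": "DBRD", "DB_WRITE": "DBWR",
           "SWAP": "RBS/SWAP", "RBS": "RBS/SWAP"}

def handle_request(req: dict) -> str:
    db = req["db_index"]                   # 502: isolate the database
    target = req["target"]                 # 503-507: isolate the operand
    offset = req.get("offset", 0)          # 508-509: isolate handle/offset
    thread = THREADS[target]               # 510: isolate operation + thread
    return f"db={db} target={target} offset={offset} -> {thread}"

print(handle_request({"db_index": 1, "target": "DB_READ", "offset": 4096}))
```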
Fig 6 shows the interaction of the Disk Agent 426 with various other agents. In the figure, three components of the Disk Agent 426 have been considered, namely the LogWR thread 408, the DBWR thread 407 and the Disk I/O 427.
The LogWR 408 and LogRD 425 threads are used in the case of recovery. Whenever any file is opened, a transaction record is always made in the LogWR 408 file. The LogRD 425 is rarely used; it is useful in the case of file recovery. As mentioned earlier, the log written by LogWR 408 is a transaction-committed file. This is done to maintain data integrity irrespective of the sessions working on the same data. Further, as shown in the figure, through the interaction of the Scheduler Agent with the Disk Agent 426, the combination takes care of concurrent queries, caching, locking and management irrespective of the number and nature of clients, and across server functionality having shared resources.
Depending on the DBWR 407 and DBRD 406 sizes specified at the time of database file creation, these threads are created only when a database file is opened. The DBWR 407 thread is activated only in the case of a database file. Every database write is first made to the RAM of the computer system: a block is translated to a page, and a page is a block of data written to memory. The DBWR 407 writes to a database file in the case of a database read or database write. LogWR 408 has higher priority: as soon as a commit is given, everything is written to disk. The DBWR 407, by contrast, is a lazy writer; that is, it delays the write to disk.
Two conditions, the size of a transaction and the number of transactions, as specified in the configuration, dictate the frequency with which the data is flushed. The Timer Agent 600 checks these two conditions; this Timer Agent 600 can be customized as per the configuration settings, and it forces synchronicity. Checkpointing can be of the synchronous or asynchronous type; it is the process by which the Timer Agent 600 forces all uncommitted data to the disk. One of the unique aspects of the current invention is that there is no separate checkpointing thread: it is merged with the Disk Agent 426, and the Timer Agent 600 forces synchronicity. Further, since the Timer Agent 600 is customizable, both synchronous and asynchronous checkpointing are handled.
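The two-condition checkpoint test applied by the Timer Agent 600 can be sketched as follows; the limit values stand in for the customizable configuration settings.

```python
# Illustrative two-condition checkpoint test: either limit alone forces all
# uncommitted data to disk.

def checkpoint_due(pending_bytes: int, pending_txns: int,
                   max_bytes: int = 1 << 20, max_txns: int = 100) -> bool:
    return pending_bytes >= max_bytes or pending_txns >= max_txns

print(checkpoint_due(2 << 20, 3))    # True: transaction-size condition met
print(checkpoint_due(1024, 150))     # True: transaction-count condition met
print(checkpoint_due(1024, 3))       # False: flush deferred
```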
The Disk Agent 426, along with the Server Agent 601, DML Agent 602 and Index Agent 603, works to manage every disk read and write activity. The Disk Agent 426 talks to the database and, according to the type of query, passes the query either to the DML Agent 602 in the case of a DML query or to the Server Agent 601 in the case of DDL or DCL commands. According to the active database, it can also be given to the Index Agent 603. Both the DML Agent 602 and the Server Agent 601 can also consult the Index Agent 603.

Claims

What is claimed is:
1. A system for managing a plurality of disk resources irrespective of functional servers, comprising: a translator means to associate a pattern of data and a set of functionalities of said data using a plurality of predetermined conditions; an optimizer means to analyze and optimize the type of instructions received on said patterns of data in a concurrent environment based on results received from said translator means; and a caching means to store and facilitate the process of said optimization.
2. The system as recited in claim 1 wherein said disk resource functionality is dictated using a set of generic patterns including data pattern and functional pattern.
3. The system as recited in claim 2 wherein the allocation of said disk resource using data pattern is based on correlating dependencies determined at run-time based on predetermined parameters comprising said data pattern being static or dynamic, size of said data pattern, frequency and usage of said data pattern, change in state of said data pattern and set of operations required on said data pattern.
4. The system as recited in claim 2 wherein said functional patterns are used to manage optimum journaling of said data pattern to recover and rebuild said data pattern from any abnormal termination.
5. The system as recited in claim 1 wherein data pattern from any functional server is persisted in the form of one or a plurality of database files.
6. The system as recited in claim 5 wherein said system is capable of performing a seek operation efficiently by opening multiple instances of said database files with file pointer of said database files positioned at different said database file offsets.
7. The system as recited in claim 5 wherein said system provides greater security by inheriting the security properties of databases.
8. The system as recited in claim 1 wherein said system enforces industry standard compliant format for persistence of said data exchange between functionally different server objects without any layer of conversion.
9. The system as recited in claim 1 wherein said system imparts atomicity, consistency, isolation and durability properties to any server objects irrespective of server functionality.
10. The system as recited in claim 1 wherein said system exposes any functionality expected by any functional server in a multi-user environment from said disk resource for final or intermediate persistence to achieve simplicity, standardization, operating system portability, said data pattern and functional event portability across vendors and versions of said functional servers.
11. A method of allocating disk resources irrespective of functional servers, comprising: associating a pattern of data and a set of functionalities of said data under a plurality of conditions; analyzing and optimizing the type of requests received on said patterns of data; and caching said patterns of data, whereby said disk can be mapped and utilized as memory.
12. The method as recited in claim 11 wherein said disk resource functionality is dictated using a set of generic patterns including data pattern and functional pattern.
13. The method as recited in claim 12 wherein the process of allocating disk resources using data pattern is done by correlating dependencies determined at run-time based on parameters comprising said data pattern being static or dynamic, size of said data pattern, frequency and usage of said data pattern, change in state of said data pattern and set of operations required on said data pattern.
14. The method as recited in claim 12 wherein the process of allocating disk resource using functional pattern is done by using said functional patterns for managing optimum journaling of said data pattern to recover and rebuild said data pattern from any abnormal termination.
15. The method as recited in claim 11 wherein data pattern from any functional server is finally persisted in the form of one or a plurality of database files.
16. The method as recited in claim 15 wherein said system is capable of performing a seek operation efficiently by opening multiple instances of said database files with file handles of said database files positioned at different said database file offsets.
17. The method as recited in claim 15 wherein the process of intermediate persistence to final persistence comprises the steps of: saving said intermediate data pattern in a rollback segment file; saving said data pattern in the rollback segment file in a log file for recovery in case of abnormal termination; and saving said final data pattern in a database file for final persistence.
18. The method as recited in claim 17 wherein the user is given the option of saving said data pattern either by using a secure process involving said rollback segment file, said log file and thereafter said database file, or by persisting data directly into said database file.
19. The method as recited in claim 11 enforces industry standard compliant format for persistence of said data exchange between functionally different server objects without any layer of conversion.
20. The method as recited in claim 11 imparts atomicity, consistency, isolation and durability properties to any server objects irrespective of server functionality.
21. The method as recited in claim 11 exposes any functionality expected by any functional server in a multi-user environment from said disk resource for final or intermediate persistence to achieve simplicity, standardization, operating system portability, said data pattern and functional event portability across vendors and versions of said functional servers.
PCT/IN2004/000030 2003-01-30 2004-01-29 System and method of mapping patterns of data, optimising disk read and write, verifying data integrity across clients and servers of different functionality having shared resources WO2004077219A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN125/MUM/2003 2003-01-30
IN125MU2003 2003-01-30

Publications (2)

Publication Number Publication Date
WO2004077219A2 true WO2004077219A2 (en) 2004-09-10
WO2004077219A3 WO2004077219A3 (en) 2005-05-19

Family

ID=32922939

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2004/000030 WO2004077219A2 (en) 2003-01-30 2004-01-29 System and method of mapping patterns of data, optimising disk read and write, verifying data integrity across clients and servers of different functionality having shared resources

Country Status (1)

Country Link
WO (1) WO2004077219A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011031899A2 (en) 2009-09-09 2011-03-17 Fusion-Io, Inc. Apparatus, system, and method for power reduction in a storage device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11219286A (en) * 1998-02-02 1999-08-10 Meidensha Corp Construction method for machine environment
JPH11331161A (en) * 1998-05-08 1999-11-30 Cai Kk Network maintenance control system

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7536428B2 (en) * 2006-06-23 2009-05-19 Microsoft Corporation Concurrent read and write access to a linked list where write process updates the linked list by swapping updated version of the linked list with internal list
US11573909B2 (en) 2006-12-06 2023-02-07 Unification Technologies Llc Apparatus, system, and method for managing commands of solid-state storage using bank interleave
US11960412B2 (en) 2006-12-06 2024-04-16 Unification Technologies Llc Systems and methods for identifying storage resources that are not in use
US11847066B2 (en) 2006-12-06 2023-12-19 Unification Technologies Llc Apparatus, system, and method for managing commands of solid-state storage using bank interleave
US11640359B2 (en) 2006-12-06 2023-05-02 Unification Technologies Llc Systems and methods for identifying storage resources that are not in use
EP2652623A4 (en) * 2010-12-13 2014-04-30 Fusion Io Inc Apparatus, system, and method for auto-commit memory
CN103262054B (en) * 2010-12-13 2015-11-25 桑迪士克科技股份有限公司 For automatically submitting device, the system and method for storer to
US9218278B2 (en) 2010-12-13 2015-12-22 SanDisk Technologies, Inc. Auto-commit memory
US9767017B2 (en) 2010-12-13 2017-09-19 Sandisk Technologies Llc Memory device with volatile and non-volatile media
US9772938B2 (en) 2010-12-13 2017-09-26 Sandisk Technologies Llc Auto-commit memory metadata and resetting the metadata by writing to special address in free space of page storing the metadata
US10817421B2 (en) 2010-12-13 2020-10-27 Sandisk Technologies Llc Persistent data structures
US10817502B2 (en) 2010-12-13 2020-10-27 Sandisk Technologies Llc Persistent memory management
US9047178B2 (en) 2010-12-13 2015-06-02 SanDisk Technologies, Inc. Auto-commit memory synchronization
EP2652623A2 (en) * 2010-12-13 2013-10-23 Fusion-io, Inc. Apparatus, system, and method for auto-commit memory
CN103262054A (en) * 2010-12-13 2013-08-21 弗森-艾奥公司 Apparatus, system, and method for auto-commit memory
WO2012082792A2 (en) 2010-12-13 2012-06-21 Fusion-Io, Inc. Apparatus, system, and method for auto-commit memory

Also Published As

Publication number Publication date
WO2004077219A3 (en) 2005-05-19

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase