
US20090063486A1 - Data replication using a shared resource - Google Patents

Data replication using a shared resource

Info

Publication number
US20090063486A1
Authority
US
United States
Prior art keywords
node
write
shared disk
disk
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US11/846,704
Other versions
US8527454B2 (en
Inventor
Dhairesh Oza
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EMC Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US11/846,704 priority Critical patent/US8527454B2/en
Assigned to NOVELL, INC. reassignment NOVELL, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: OZA, DHAIRESH
Publication of US20090063486A1 publication Critical patent/US20090063486A1/en
Assigned to EMC CORPORATION reassignment EMC CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CPTN HOLDINGS LLC
Assigned to CPTN HOLDINGS, LLC reassignment CPTN HOLDINGS, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: NOVELL, INC.
Application granted granted Critical
Publication of US8527454B2 publication Critical patent/US8527454B2/en
Assigned to CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT reassignment CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT SECURITY AGREEMENT Assignors: ASAP SOFTWARE EXPRESS, INC., AVENTAIL LLC, CREDANT TECHNOLOGIES, INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL SOFTWARE INC., DELL SYSTEMS CORPORATION, DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., MAGINATICS LLC, MOZY, INC., SCALEIO LLC, SPANNING CLOUD APPS LLC, WYSE TECHNOLOGY L.L.C.
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT SECURITY AGREEMENT Assignors: ASAP SOFTWARE EXPRESS, INC., AVENTAIL LLC, CREDANT TECHNOLOGIES, INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL SOFTWARE INC., DELL SYSTEMS CORPORATION, DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., MAGINATICS LLC, MOZY, INC., SCALEIO LLC, SPANNING CLOUD APPS LLC, WYSE TECHNOLOGY L.L.C.
Assigned to EMC IP Holding Company LLC reassignment EMC IP Holding Company LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EMC CORPORATION
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A. reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A. SECURITY AGREEMENT Assignors: CREDANT TECHNOLOGIES, INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., WYSE TECHNOLOGY L.L.C.
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A. reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A. SECURITY AGREEMENT Assignors: CREDANT TECHNOLOGIES INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., WYSE TECHNOLOGY L.L.C.
Assigned to DELL MARKETING L.P., WYSE TECHNOLOGY L.L.C., DELL INTERNATIONAL, L.L.C., DELL PRODUCTS L.P., FORCE10 NETWORKS, INC., MOZY, INC., EMC CORPORATION, CREDANT TECHNOLOGIES, INC., AVENTAIL LLC, SCALEIO LLC, ASAP SOFTWARE EXPRESS, INC., EMC IP Holding Company LLC, DELL USA L.P., DELL SYSTEMS CORPORATION, DELL SOFTWARE INC., MAGINATICS LLC reassignment DELL MARKETING L.P. RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH
Assigned to DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), DELL PRODUCTS L.P., DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), DELL USA L.P., EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), SCALEIO LLC, DELL INTERNATIONAL L.L.C., EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC) reassignment DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.) RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001) Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Assigned to DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), DELL INTERNATIONAL L.L.C., DELL PRODUCTS L.P., DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), SCALEIO LLC, DELL USA L.P., EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC) reassignment DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.) RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001) Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Active legal-status Critical Current
Adjusted expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2056Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
    • G06F11/2071Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring using a plurality of controllers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2056Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
    • G06F11/2082Data synchronisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/52Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/526Mutual exclusion algorithms

Definitions

  • an enterprise may dynamically replicate the state of its data for a particular volume with an entirely different and remote volume. If something should happen to the particular volume, a user can have uninterrupted access to the remote volume with little noticeable or detectable loss of service from the viewpoint of the user. Additionally, both volumes can be independently accessed by different sets of users. Thus, replication permits access in the event of failure and can help alleviate load for any particular volume.
  • a method is provided for data replication.
  • a write lock is acquired on a first node and blocks associated with the write lock are written to one or more local disks of the first node. Simultaneously, the blocks associated with the write lock are written to a shared disk over a network connection. The shared disk is shared with a second node, and the first node and second node are replicated with one another. Next, the write lock is released after the writing to the one or more local disks and after the writing to the shared disk complete.
  • FIG. 1 is a diagram of a method for data replication using a shared resource, according to an example embodiment.
  • FIG. 2 is a diagram of a method for managing data replication failure situations, according to an example embodiment.
  • FIG. 3 is a diagram of a data replication system, according to an example embodiment.
  • FIG. 4 is a diagram of a data replication failure management system, according to an example embodiment.
  • a “node” refers to a machine or collection of machines within a local processing environment.
  • a “machine” refers to a processing device, such as but not limited to a computer, a phone, a peripheral device, a television (TV), a set-top-box (STB), a personal digital assistant (PDA), etc.
  • Any particular node processes within a local processing environment, which may include a local area network (LAN), and that particular node has access within that local processing environment to one or more local storage disks or devices.
  • Each local disk for a particular node is a replica of another local disk for a different node.
  • the nodes communicate over a wide area network (WAN), such as but not limited to the Internet.
  • each of the nodes communicates with an independent shared resource.
  • the “shared resource” is a shared storage device or disk that is accessible over the WAN to each of the nodes.
  • the shared disk is on a node that is independent of the other nodes involved in data replication.
  • the techniques presented herein may be implemented within Novell products distributed by Novell, Inc. of Provo, Utah; such as but not limited to Novell's Distributed File Services (DFS).
  • FIG. 1 is a diagram of a method 100 for data replication using a shared resource, according to an example embodiment.
  • the method 100 (hereinafter “replication service”) is implemented in a machine-accessible and machine-readable medium and is accessible over a network.
  • the network may be wired, wireless, or a combination of wired and wireless.
  • the replication service processes on a node that is engaged in replication with at least one other different node of the network.
  • each node involved in replicating its local disks can include a processing instance of the replication service. In this manner, each node involved in replication performs the processing discussed herein with reference to the FIG. 1 .
  • the replication service acquires a write lock from a first node.
  • this is achieved by hooking onto the Input/Output (I/O) path of the first node so that I/O is detected as it is being requested on the first node and before it is actually processed.
  • the write lock is associated with a write request that includes data and one or more blocks. The write request is to be performed on one or more local disks of the first node.
  • the replication service writes the blocks associated with the write lock to the one or more local disks of the first node.
  • the replication service writes the blocks or messages associated with the write lock to a shared disk over a network connection.
  • By messages it is meant that the details of the write, or other control information identifying the replication service and its actions, are recorded in the shared disk.
  • the shared disk is also shared with at least one other second node over the network connection. Furthermore, the first node and the second node are dynamically and actively being replicated and synchronized with one another. This means the first node's local disks are replicas of the second node's local disks.
  • the replication service releases the write lock after the blocks are written to the one or more local disks of the first node and after the blocks or messages are written to the shared disk. That is, after writes to both the local disks and the shared disk complete, the write lock is released by the replication service.
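As a rough illustration of the write path described in the bullets above, the following Python sketch models the local and shared disks as in-memory dictionaries. All names here (`ReplicationService`, `write`, `read`) are illustrative assumptions, not terms from the patent, and a single lock stands in for the separate write and read locks.

```python
import threading

class ReplicationService:
    """Illustrative sketch: while the write lock is held, blocks are
    written to the local disk and to the shared disk simultaneously,
    and the lock is released only after BOTH writes complete."""

    def __init__(self, local_disk, shared_disk):
        self.local_disk = local_disk    # dict: block id -> data (stands in for a real disk)
        self.shared_disk = shared_disk  # dict shared with the replicated peer nodes
        self.lock = threading.Lock()    # models both the write lock and the read lock

    def write(self, blocks):
        with self.lock:  # acquire the write lock on the I/O path
            local = threading.Thread(target=self.local_disk.update, args=(blocks,))
            shared = threading.Thread(target=self.shared_disk.update, args=(blocks,))
            local.start(); shared.start()  # simultaneous local and shared writes
            local.join(); shared.join()    # wait for both before releasing the lock
        # write lock released here by the context manager

    def read(self, block_id):
        with self.lock:  # short-lived read lock, served entirely from the local disk
            return self.local_disk.get(block_id)
```

A second node would run its own instance of this service against its own local disk while sharing `shared_disk` with the first.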
  • the replication service acquires a read lock associated with a read request to service data.
  • the replication service serves the read request from the local disks of the first node and immediately releases the read lock.
  • the addition of one write accounts for the extra write that is made to the shared disk.
  • N−1 reads are needed, since each of the remaining nodes not initially processing the write has to read the write that occurred from the shared disk and write the blocks of data associated with the write to its local disks.
  • the reads are performed locally, on the node holding the read lock, from its local disks. Therefore, reads are processed rapidly and served efficiently with little overhead involved. So, in situations where shared data is read more often than it is written or modified, the above scenario creates an optimal replication processing environment.
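The I/O accounting in the preceding bullets can be tallied in a short sketch (the function name and the returned dictionary shape are assumptions): for N replicated nodes, one logical write costs N + 1 writes in total (the initiator's local write, the extra shared-disk write, and a local write on each of the N − 1 remaining nodes) plus N − 1 shared-disk reads.

```python
def replication_cost(n_nodes):
    """Tally the I/O for one logical write across n_nodes replicated nodes,
    per the scheme above: the initiator writes locally and to the shared
    disk; every other node reads from the shared disk and writes locally."""
    extra_shared_write = 1             # the "addition of one" to the shared disk
    local_writes = 1 + (n_nodes - 1)   # initiator plus each remaining node
    shared_reads = n_nodes - 1         # each remaining node reads the write once
    return {"writes": local_writes + extra_shared_write,  # N + 1
            "reads": shared_reads}                        # N - 1
```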
  • the replication service determines that the shared disk becomes unresponsive. In response, the replication service automatically and dynamically switches to an alternative shared disk to communicate with for subsequent write locks involved in the replication process.
  • a third-party service can dynamically identify the alternative shared disk to communicate with.
  • configuration, policy, or profile information can provide the identity of the alternative shared disk.
  • the alternative shared disk may itself be replicated with the failed shared disk.
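The failover behavior in the last few bullets might look like the following minimal sketch. The list of alternatives stands in for whatever a third-party service or configuration/policy data would supply, and `IOError` stands in for whatever condition signals an unresponsive disk; all names are illustrative.

```python
class SharedDiskFailover:
    """Illustrative sketch: direct writes at the active shared disk and,
    when it becomes unresponsive, switch dynamically to an alternative."""

    def __init__(self, primary, alternatives):
        self.active = primary
        self.alternatives = list(alternatives)  # e.g. from policy or configuration

    def write(self, blocks):
        while True:
            try:
                self.active.update(blocks)  # attempt the shared-disk write
                return self.active          # report which disk took the write
            except IOError:                 # shared disk became unresponsive
                if not self.alternatives:
                    raise                   # no alternative left: fail the write
                self.active = self.alternatives.pop(0)  # switch shared disks
```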
  • the replication service determines that the shared disk becomes unresponsive. In response, the replication service automatically fails processing associated with subsequent write and read locks during the period of unresponsiveness.
  • the replication service acquires a bitmap from the shared disk after the first node has failed and starts up after a recovery.
  • the bitmap identifies blocks that were modified during the period within which the first node was down. This can be done by setting each bit that represents a changed block of the first node's local disks.
  • the replication service inspects the bitmap for set bits and acquires those blocks from the second node to update and re-synchronize itself with the second node.
  • the replication service replays a shared data log from the shared disk after the first node recovers from a failure situation on the first node to re-synchronize the first node with the second node. That is, the first node acquires the missing data blocks from the shared disk once it comes online following a failure situation.
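The bitmap-driven resynchronization described in the preceding bullets can be sketched as follows. Disks are modeled as lists of block payloads and the function name is an illustrative assumption.

```python
def resynchronize(bitmap, peer_disk, local_disk):
    """Recovery-side sketch: each set bit in the bitmap (read from the
    shared disk) marks a block modified while this node was down; pull
    exactly those blocks from a surviving, up-to-date peer."""
    for block_id, changed in enumerate(bitmap):
        if changed:
            local_disk[block_id] = peer_disk[block_id]
    return local_disk
```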
  • FIG. 2 is a diagram of a method 200 for managing data replication failure situations, according to an example embodiment.
  • the method 200 (hereinafter “monitor service”) is implemented in a machine-accessible and readable medium and is accessible over a network.
  • the network may be wired, wireless, or a combination of wired and wireless.
  • the monitor service processes on any node of the network. This means that the monitor service can process on a node that is actively engaged in replication or on a node associated with the shared disk. In some embodiments, the monitor service may also be duplicated, similar to the replication service, and may coordinate with other processing instances of itself on the network.
  • the replication service represented by the method 100 of the FIG. 1 demonstrates how a shared disk is used to manage multi-node replication of disks.
  • An enhancement to that processing involves the use of a different service, namely the monitor service that provides a variety of administrative and management features discussed herein and below.
  • the monitor service actively and dynamically monitors the progress of two or more nodes that are replicas of one another and that are being synchronized with one another.
  • the monitor service detects a failure in a particular node (one of the two or more nodes being replicated). The failure is detected while a write to the shared disk was being attempted but had not successfully completed on the shared disk.
  • the monitor service clears a log entry in the shared disk associated with the attempted write, a partial write that never completed.
  • the monitor service releases the write lock or clears it out.
  • the write lock was associated with a write that the particular node was attempting to process before it failed or encountered a failure situation. That is, the write lock is pending and being held by the particular node (an outstanding and unprocessed write lock) when the failure is detected. Clearing it ensures that the particular node will know the write failed when it recovers, since it will see no indication of the write in the shared disk when it comes back online, and it ensures that the other replicated nodes will not hang or attempt to process a partial and incomplete write request.
  • the monitor service detects a different failure associated with a particular node (again one of the two or more nodes being replicated with one another).
  • the failing node had an unprocessed read lock associated with a pending read request when it failed.
  • the monitor service checks for any and all pending and unprocessed read locks on the failed node and clears each of them, so that the failed node will reprocess them or know that they failed when it comes back online.
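The failure cleanup performed by the monitor service in the bullets above might be sketched as follows. The shared-disk layout (a dict holding a write log and a lock table) and every name here are assumptions made for illustration.

```python
def clean_up_failed_node(shared_disk, node_id):
    """Monitor-side sketch: on detecting a node failure, drop any partial
    write the node logged to the shared disk and clear every lock it still
    holds, so the recovered node sees no trace of the failed write and the
    surviving nodes never replay a partial, incomplete request."""
    # drop log entries for writes the failed node never completed
    shared_disk["log"] = [
        entry for entry in shared_disk["log"]
        if not (entry["node"] == node_id and not entry["complete"])
    ]
    # release every outstanding write or read lock held by the failed node
    shared_disk["locks"] = {
        lock_id: owner
        for lock_id, owner in shared_disk["locks"].items()
        if owner != node_id
    }
    return shared_disk
```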
  • the monitor service detects yet another different type of failure situation.
  • the particular failing node fails after it successfully completes a write request and the shared disk has complete information regarding the write.
  • each of the remaining and non failing nodes of the two or more nodes is updated with the successful write that occurred just before the particular node failed. This ensures each of the other nodes is synchronized with this write when the failed node recovers and comes back online.
  • the monitor service can perform a variety of useful actions that will assist the particular failing node when it recovers and comes back on line.
  • the monitor service writes all subsequent writes made by non failing nodes while the failing node is out of commission to an area of the shared disk associated with the particular failing node. This area is accessed by the failing node when it comes back online and the information is used to re-synchronize the failed node when it recovers.
  • the monitor service populates a bitmap to identify modified and changed blocks (associated with the failed node's local disks) when the failed node is offline.
  • when the failed node comes online, it acquires the bitmap to determine which blocks of its local disks need to be updated.
  • the data associated with the blocks can be acquired from one of the non-failing nodes or from a different area associated with the shared disk.
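The monitor-side bookkeeping just described can be sketched as a small tracker; the fixed block count and all names are illustrative assumptions.

```python
class DowntimeTracker:
    """Monitor-side sketch: while a node is offline, set a bit for every
    block the surviving nodes modify; on recovery the node asks which
    blocks it must fetch from a healthy peer (or from the shared disk)."""

    def __init__(self, n_blocks):
        self.bitmap = [0] * n_blocks

    def record_write(self, block_id):
        self.bitmap[block_id] = 1  # block changed while the node was down

    def blocks_to_update(self):
        # the recovered node pulls exactly these blocks to resynchronize
        return [i for i, bit in enumerate(self.bitmap) if bit]
```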
  • the entire processing associated with the monitor service can execute in a processing environment that is local to and associated with the shared disk.
  • the entire processing associated with the monitor service can execute in a processing environment that is local to and associated with any one of the non failing nodes that are being replicated with one another and a failed node.
  • FIG. 3 is a diagram of a data replication system 300 , according to an example embodiment.
  • the data replication system 300 is implemented in a machine-accessible and readable medium and is accessible over a network.
  • the network may be wired, wireless, or a combination of wired and wireless.
  • the data replication system 300 implements, among other things, the process associated with the replication service represented by the method 100 of the FIG. 1 .
  • the data replication system 300 includes a replication service 301 and a shared disk 302 . Each of these and their interactions with one another will now be discussed in turn.
  • the replication service 301 is implemented in a machine-accessible and readable medium and is to process on a machine of the network, which is associated with a first node. Example processing of the replication service 301 was presented in detail above with reference to the method 100 of the FIG. 1 .
  • the replication service 301 keeps a first local disk associated with the first node in synchronization with a third local disk associated with a third node.
  • the third node has its own processing instance of the replication service 301 and the two instances cooperate and communicate with one another. This is done by acquiring write locks and performing the writes for blocks of data against the first local disk and communicating the writes to the shared disk 302 and then releasing the write locks.
  • the replication service 301 first communicates the writes to the shared disk 302 and waits for an indication that the third node has completed the writes before the writes are processed or performed against the first local disk. This can also be a processing parameter or option configured by an administrator.
  • the replication service 301 simultaneously performs the writes against the first local disk while communicating the writes to the shared disk 302 .
  • the replication service 301 acquires a read lock and services data associated with a read request from the first local disk and then releases the read lock. This is entirely local and requires very little overhead to perform.
  • the replication service 301 communicates the writes to a designated area of the shared disk 302 that is accessible to just the first node. This is discussed more completely below.
  • the shared disk 302 is implemented as a device accessible to the first node over a WAN network connection or the network. Additionally, the shared disk is locally accessible to a different machine associated with a second node.
  • the shared disk 302 is a block or network accessible device accessible to all nodes being replicated with one another. This can be a SAN, Logical Unit Number (LUN), iSCSI, NFS, Server Message Block (SMB), Network Control Program (NCP), etc.
  • the shared disk 302 is structured to include at least three areas: a data log, lock data information, and administrative data. Other data and partitions can be included according to the needs of a particular enterprise. For example, tracking and audit information that includes identity information for resources making write transactions can be retained.
  • the shared disk 302 includes a designated partitioned or reserved area accessible to the first node and another different and designated partitioned area accessible to the third node. This permits the first node and the third node to simultaneously communicate information to the shared disk 302 without conflict and without delay.
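The shared-disk layout described in the last two bullets (a data log, lock data, administrative data, plus a reserved area per node) might be modeled as follows; the dictionary representation and all names are purely illustrative.

```python
def make_shared_disk(node_ids):
    """Sketch of the shared-disk structure: three common areas plus one
    reserved area per replicated node, so nodes can write concurrently
    to their own areas without conflict or delay."""
    return {
        "data_log": [],    # record of replicated writes
        "lock_data": {},   # outstanding write/read lock information
        "admin_data": {},  # administrative, tracking, and audit information
        "node_areas": {node: [] for node in node_ids},  # one per node
    }
```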
  • the shared disk 302 provides the novel mechanism by which the replication and failure services traditionally handled by the replicated nodes themselves are offloaded to a shared resource, namely the shared disk 302 .
  • FIG. 4 is a diagram of a data replication failure management system 400 , according to an example embodiment.
  • the data replication failure management system 400 is implemented in a machine-accessible and readable medium and is accessible over a network.
  • the network may be wired, wireless, or a combination of wired and wireless.
  • the data replication failure management system 400 implements various aspects associated with the method 200 of the FIG. 2 .
  • the data replication failure management system 400 includes a monitor service 401 and a shared disk 402 . Each of these and their interactions with one another will now be discussed in detail.
  • the monitor service 401 is implemented in a machine-accessible and readable medium and is to process on a machine of the network; the machine is associated with a first node. Example processing associated with the monitor service 401 was presented above with reference to the method 200 of the FIG. 2 .
  • the monitor service 401 dynamically and in real time keeps track of replication services that process on the at least one additional node and, when a particular node fails, writes information to the shared disk 402 that permits that particular node to resynchronize with the other nodes when it recovers from the failure situation.
  • the monitor service 401 records write messages in a particular partition or with particular identifying information within the shared disk 402 .
  • the write messages are associated with non failing nodes and their write activity while the particular node is in the failure situation.
  • the particular partition or the particular identifying information permits the particular node to identify the write messages and replay them when the particular node recovers.
  • the monitor service 401 maintains a bitmap in the shared disk 402 that identifies blocks that non failing nodes modified while the particular node was in the failure situation.
  • the particular node uses the bitmap recorded in the shared disk 402 to synchronize with the non failing nodes when the particular node recovers.
  • the monitor service 401 clears any attempted write to the shared disk 402 that does not complete successfully and which was attempted by the particular node before it encountered the failure situation. Additionally, the monitor service 401 releases a write lock associated with the attempted write.
  • the shared disk 402 is implemented as a device that is accessible to the first node over a network connection and that is accessible to at least one additional node. Various aspects of the shared disk 402 were discussed above with reference to the system 300 of the FIG. 3 .

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Techniques are presented for data replication. Two or more nodes are engaged in data replication with one another. When a node acquires a write lock, the write is processed to a local disk associated with the node and communicated to a shared disk. During recovery of a particular node following a failure, the shared disk is accessed to assist in synchronizing that particular node.

Description

    BACKGROUND
  • Data and information are rapidly becoming the lifeblood of enterprises. Transactions with customers, operational data, financial data, corporate intelligence data; in fact, all types of information are now captured, indexed, stored, and mined by enterprises in today's highly competitive world economy.
  • Since information is vital to the enterprise, it is often made available twenty-four hours a day, seven days a week, and three hundred sixty-five days a year. To achieve such a feat, enterprises have to implement a variety of data replication, data backup, and data versioning techniques against their data warehouses.
  • For example, an enterprise may dynamically replicate the state of its data for a particular volume with an entirely different and remote volume. If something should happen to the particular volume, a user can have uninterrupted access to the remote volume with little noticeable or detectable loss of service from the viewpoint of the user. Additionally, both volumes can be independently accessed by different sets of users. Thus, replication permits access in the event of failure and can help alleviate load for any particular volume.
  • Today, most approaches have the volumes that are synchronized with one another communicate directly and exclusively with one another to perform replication. This can lead to complex synchronization when more than two volumes are being replicated and multiple failures occur. Additionally, it places the management and communication associated with recovery from failures on the nodes themselves, which can degrade performance in servicing a user when a failing node needs to be resynchronized.
  • As a result, there is a need for improved data replication techniques.
  • SUMMARY
  • In various embodiments, techniques are provided for data replication using a shared resource. More particularly and in an embodiment, a method is provided for data replication. A write lock is acquired on a first node and blocks associated with the write lock are written to one or more local disks of the first node. Simultaneously, the blocks associated with the write lock are written to a shared disk over a network connection. The shared disk is shared with a second node, and the first node and second node are replicated with one another. Next, the write lock is released after the writing to the one or more local disks and after the writing to the shared disk complete.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a diagram of a method for data replication using a shared resource, according to an example embodiment.
  • FIG. 2 is a diagram of a method for managing data replication failure situations, according to an example embodiment.
  • FIG. 3 is a diagram of a data replication system, according to an example embodiment.
  • FIG. 4 is a diagram of a data replication failure management system, according to an example embodiment.
  • DETAILED DESCRIPTION
  • A “node” refers to a machine or collection of machines within a local processing environment. A “machine” refers to a processing device, such as but not limited to a computer, a phone, a peripheral device, a television (TV), a set-top-box (STB), a personal digital assistant (PDA), etc.
  • Any particular node processes within a local processing environment, which may include a local area network (LAN), and that particular node has access within that local processing environment to one or more local storage disks or devices.
  • Multiple nodes from multiple processing environments communicate with one another to replicate their local disks with one another. Each local disk for a particular node is a replica of another local disk for a different node. The nodes communicate over a wide area network (WAN), such as but not limited to the Internet.
  • In addition, and as is discussed in greater detail herein and below, each of the nodes communicates with an independent shared resource. The "shared resource" is a shared storage device or disk that is accessible over the WAN to each of the nodes. In an embodiment, the shared disk is on a node that is independent of the other nodes involved in data replication.
  • A variety of architectures or technologies can be used to access the shared disk, such as but not limited to a Storage Area Network (SAN), Internet Small Computer System Interface (iSCSI), Network File System (NFS), etc.
  • According to an embodiment, the techniques presented herein may be implemented within Novell products distributed by Novell, Inc. of Provo, Utah; such as but not limited to Novell's Distributed File Services (DFS). Of course it is to be understood that any network architecture, device, proxy, operating system (OS), or product may be enhanced to utilize and deploy the techniques presented herein and below.
  • It is within this context that embodiments of the present invention are discussed with reference to the FIGS. 1-4.
  • FIG. 1 is a diagram of a method 100 for data replication using a shared resource, according to an example embodiment. The method 100 (hereinafter “replication service”) is implemented in a machine-accessible and machine-readable medium and is accessible over a network. The network may be wired, wireless, or a combination of wired and wireless. The replication service processes on a node that is engaged in replication with at least one other different node of the network.
  • It is noted that each node involved in replicating its local disks can include a processing instance of the replication service. In this manner, each node involved in replication performs the processing discussed herein with reference to the FIG. 1.
  • At 110, the replication service acquires a write lock on a first node. In an embodiment, at 111, this is achieved by hooking onto the Input/Output (I/O) path of the first node so that I/O is detected as it is being requested on the first node and before it is actually processed. The write lock is associated with a write request that includes data and one or more blocks; the write request is to be performed on one or more local disks of the first node.
  • At 120, the replication service writes the blocks associated with the write lock to the one or more local disks of the first node. Simultaneously, at 130, the replication service writes the blocks or messages associated with the write lock to a shared disk over a network connection. By messages it is meant that the details of the write or other control information identifying the replication service and its actions are recorded in the shared disk.
  • The shared disk is also shared with at least one other second node over the network connection. Furthermore, the first node and the second node are dynamically and actively being replicated and synchronized with one another. This means the first node's local disks are replicas of the second node's local disks.
  • At 140, the replication service releases the write lock after the blocks are written to the one or more local disks of the first node and after the blocks or messages are written to the shared disk. That is, the write lock is released by the replication service only after the writes to both the local disks and the shared disk complete.
  • In an embodiment, at 150, the replication service acquires a read lock associated with a read request to service data. In response to this, the replication service serves the read request from the local disks of the first node and immediately releases the read lock.
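  • A minimal sketch of this write and read path follows, with Python dicts mapping block ids to data standing in for the disks; the class and method names are illustrative and not part of the disclosure.

```python
import threading

class ReplicationService:
    """Sketch of the write and read paths described above. Dicts
    mapping block ids to data stand in for the disks."""

    def __init__(self, local_disk, shared_disk):
        self.local_disk = local_disk    # this node's local disk
        self.shared_disk = shared_disk  # disk shared by all replicated nodes
        self.write_lock = threading.Lock()

    def write(self, blocks):
        # Acquire the write lock for the incoming write request.
        with self.write_lock:
            # Write the blocks to the shared disk and to the local
            # disk simultaneously, on separate threads.
            t = threading.Thread(target=self.shared_disk.update, args=(blocks,))
            t.start()
            self.local_disk.update(blocks)
            t.join()
        # The lock is released only once both writes have completed.

    def read(self, block_id):
        # Reads are served entirely from the local disk.
        return self.local_disk.get(block_id)

shared = {}
first_node = ReplicationService({}, shared)
first_node.write({7: b"payload"})
```

A second node's instance would watch the shared disk for blocks it has not yet applied and write them to its own local disks, which is the read side of the protocol.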
  • It can be seen from the above processing that a single write results in N+1 writes, where N is an integer representing the number of nodes being replicated (in the present example N=2 for the first and second nodes). The addition of one accounts for the extra write that is done to the shared disk. This also results in N−1 reads, since each of the remaining nodes not initially processing the write has to read the write that occurred from the shared disk and write the blocks of data associated with the write to its local disks.
  • However, as is shown at 150, reads are performed locally, from the local disks of the node holding the read lock. Therefore, reads are processed rapidly and served efficiently with little overhead involved. So, in situations where shared data is read more often than it is written or modified, the above scenario creates an optimal replication processing environment.
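  • The cost arithmetic above can be stated directly; the helper below is only an illustration of the counts just described.

```python
def replication_cost(n_nodes):
    """Per logical write with N replicated nodes: every node eventually
    writes the blocks to its own local disk (N writes) plus the one
    extra write to the shared disk, and every node except the
    originator reads the blocks back from the shared disk (N-1 reads)."""
    writes = n_nodes + 1
    reads = n_nodes - 1
    return writes, reads

# Two-node example from the text: 3 writes and 1 read per logical write.
two_node = replication_cost(2)
```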
  • According to an embodiment, at 160, the replication service determines that the shared disk becomes unresponsive. In response, the replication service automatically and dynamically switches to an alternative shared disk to communicate with for subsequent write locks involved in the replication process. A third-party service can dynamically identify the alternative shared disk to communicate with. Alternatively, configuration, policy, or profile information can provide the identity of the alternative shared disk. The alternative shared disk may itself be replicated with the failed shared disk.
  • In a different embodiment, at 170, the replication service determines that the shared disk becomes unresponsive. In response, the replication service automatically fails processing associated with subsequent write and read locks during the period of unresponsiveness.
  • In an embodiment, at 180, the replication service acquires a bitmap from the shared disk after the first node recovers from a failure and starts back up. The bitmap identifies blocks that were modified during the period within which the first node was down; it is maintained by setting a bit for each block of the first node's local disks that was changed. When the first node comes back online, the replication service inspects the bitmap for set bits and acquires those blocks from the second node to update and re-synchronize itself with the second node.
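  • A sketch of this bitmap-driven recovery follows, assuming a simple list-of-bits representation and dicts for the disks; all names are illustrative.

```python
def resync_from_bitmap(bitmap, local_disk, peer_disk):
    """Each set bit marks a block that other nodes modified while this
    node was down; pull the current data for those blocks from a
    surviving peer and return the ids that were refreshed."""
    refreshed = []
    for block_id, dirty in enumerate(bitmap):
        if dirty:
            local_disk[block_id] = peer_disk[block_id]
            refreshed.append(block_id)
    return refreshed

peer = {0: b"a", 1: b"b2", 2: b"c", 3: b"d2"}   # up-to-date second node
local = {0: b"a", 1: b"b", 2: b"c", 3: b"d"}    # stale copies of blocks 1 and 3
bitmap = [0, 1, 0, 1]                           # bits set while the node was down
refreshed = resync_from_bitmap(bitmap, local, peer)
```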
  • In still another embodiment, at 190, the replication service replays a shared data log from the shared disk after the first node recovers from a failure situation, in order to re-synchronize the first node with the second node. That is, the first node acquires the missing data blocks from the shared disk once it comes online following a failure.
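  • The log-replay variant can be sketched the same way; the record format used here (sequence number, block id, data) is an assumption made for illustration.

```python
def replay_shared_log(log, local_disk, last_applied_seq):
    """Apply, in order, every write record the recovering node missed
    while it was down, and return the new high-water mark."""
    for seq, block_id, data in log:
        if seq > last_applied_seq:
            local_disk[block_id] = data
            last_applied_seq = seq
    return last_applied_seq

log = [(1, 0, b"a"), (2, 1, b"b"), (3, 1, b"b2")]
local = {0: b"a"}          # the node failed after applying record 1
high = replay_shared_log(log, local, last_applied_seq=1)
```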
  • FIG. 2 is a diagram of a method 200 for managing data replication failure situations, according to an example embodiment. The method 200 (hereinafter “monitor service”) is implemented in a machine-accessible and readable medium and is accessible over a network. The network may be wired, wireless, or a combination of wired and wireless.
  • The monitor service processes on any node of the network. This means that the monitor service can process on a node that is actively engaged in replication or on a node associated with the shared disk. In some embodiments, the monitor service may also be duplicated, similar to the replication service, and may coordinate with other processing instances of itself on the network.
  • The replication service represented by the method 100 of the FIG. 1 demonstrates how a shared disk is used to manage multi-node replication of disks. An enhancement to that processing involves the use of a different service, namely the monitor service that provides a variety of administrative and management features discussed herein and below.
  • At 210, the monitor service actively and dynamically monitors the progress of two or more nodes that are replicas of one another and that are being synchronized with one another.
  • At some point, at 220, the monitor service detects a failure in a particular node (one of the two or more nodes that are being replicated). The failure is detected while a write to the shared disk was being attempted but did not successfully complete.
  • In response to this failed write attempt, at 230, the monitor service clears the entry in the shared disk's log associated with the partial write that never completed.
  • Next, at 240, the monitor service releases the write lock or clears it out. The write lock was associated with a write that the particular node was attempting to process when it failed; that is, the write lock was pending and being held by the particular node (an outstanding and unprocessed write lock) when the failure was detected. This ensures that the particular node will know that the write failed when it recovers, since it will not see any indication of the write in the shared disk when it comes back online. It also ensures that the other nodes being replicated will not hang or attempt to process a partial and incomplete write request.
  • In another failure situation, at 250, the monitor service detects a different failure associated with a particular node (again one of the two or more nodes being replicated with one another). In this different failure situation, the failing node had an unprocessed read lock associated with a pending read request when it failed. Here, the monitor service checks for any and all pending and unprocessed read locks on the failed node and clears each of them; so that the failed node will reprocess them or know that they failed when it comes back on line.
  • In still another failure situation, at 260, the monitor service detects yet another different type of failure situation. Here, the particular failing node fails after it successfully completes a write request and the shared disk has complete information regarding the write. In such a situation, each of the remaining and non failing nodes of the two or more nodes is updated with the successful write that occurred just before the particular node failed. This ensures each of the other nodes is synchronized with this write when the failed node recovers and comes back online.
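  • The three failure cases just described can be sketched together; the shared-disk record layout below is an assumption made for illustration, not the structure of the disclosure.

```python
class MonitorService:
    """Sketch of failure handling: partial writes by the failed node are
    erased and their locks cleared, while writes that completed just
    before the failure are propagated to the surviving nodes."""

    def __init__(self, shared_disk):
        # shared_disk: {"log": [records], "write_locks": {...}, "read_locks": {...}}
        self.shared = shared_disk

    def handle_failure(self, failed_node, survivor_disks):
        log = self.shared["log"]
        # Clear any partial write the failed node left in the log and
        # release its outstanding write lock.
        log[:] = [r for r in log
                  if not (r["node"] == failed_node and not r["complete"])]
        self.shared["write_locks"].pop(failed_node, None)
        # Clear any pending read locks held by the failed node.
        self.shared["read_locks"].pop(failed_node, None)
        # Propagate writes that did complete to the non failing nodes.
        for r in log:
            if r["node"] == failed_node and r["complete"]:
                for disk in survivor_disks:
                    disk.update(r["blocks"])

shared = {
    "log": [
        {"node": "n1", "complete": False, "blocks": {1: b"x"}},  # partial
        {"node": "n1", "complete": True, "blocks": {2: b"y"}},   # completed
    ],
    "write_locks": {"n1": "blk-1"},
    "read_locks": {"n1": ["blk-9"]},
}
survivor = {}
MonitorService(shared).handle_failure("n1", [survivor])
```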
  • During a failure, the monitor service can perform a variety of useful actions that will assist the particular failing node when it recovers and comes back on line.
  • For example, at 270, the monitor service writes all subsequent writes made by non failing nodes while the failing node is out of commission to an area of the shared disk associated with the particular failing node. This area is accessed by the failing node when it comes back online and the information is used to re-synchronize the failed node when it recovers.
  • In a different situation, at 280, the monitor service populates a bitmap to identify modified and changed blocks (associated with the failed node's local disks) while the failed node is offline. When the failed node comes back online, it acquires the bitmap to determine which blocks of its local disks need to be updated. The data associated with the blocks can be acquired from one of the non-failing nodes or from a different area associated with the shared disk.
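  • The bookkeeping at 270 and 280 might be combined as follows; the per-node replay area and dirty-block bitmap layout are assumptions for illustration.

```python
def track_offline_writes(shared_disk, failed_node, blocks):
    """While a node is down, append each surviving node's write to the
    failed node's replay area and flip the matching dirty-block bits."""
    area = shared_disk.setdefault(("replay", failed_node), [])
    area.append(dict(blocks))
    dirty = shared_disk.setdefault(("dirty", failed_node), set())
    dirty.update(blocks)  # record only the ids of the modified blocks

sd = {}
track_offline_writes(sd, "n2", {3: b"z"})
track_offline_writes(sd, "n2", {5: b"w"})
```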
  • According to an embodiment, at 290, the entire processing associated with the monitor service can execute in a processing environment that is local to and associated with the shared disk. In another situation, the entire processing associated with the monitor service can execute in a processing environment that is local to and associated with any one of the non failing nodes that are being replicated with one another and a failed node.
  • FIG. 3 is a diagram of a data replication system 300, according to an example embodiment. The data replication system 300 is implemented in a machine-accessible and readable medium and is accessible over a network. The network may be wired, wireless, or a combination of wired and wireless. In an embodiment, the data replication system 300 implements, among other things, the process associated with the replication service represented by the method 100 of the FIG. 1.
  • The data replication system 300 includes a replication service 301 and a shared disk 302. Each of these and their interactions with one another will now be discussed in turn.
  • The replication service 301 is implemented in a machine-accessible and readable medium and is to process on a machine of the network, which is associated with a first node. Example processing of the replication service 301 was presented in detail above with reference to the method 100 of the FIG. 1.
  • The replication service 301 keeps a first local disk associated with the first node in synchronization with a third local disk associated with a third node. Again, the third node has its own processing instance of the replication service 301 and the two instances cooperate and communicate with one another. This is done by acquiring write locks and performing the writes for blocks of data against the first local disk and communicating the writes to the shared disk 302 and then releasing the write locks.
  • In an embodiment, the replication service 301 first communicates the writes to the shared disk 302 and waits for an indication that the third node has completed the writes before the writes are processed or performed against the first local disks. This can also be a processing parameter or option configured by an administrator.
  • In another case, the replication service 301 simultaneously performs the writes against the first local disk while communicating the writes to the shared disk 302.
  • According to an embodiment, the replication service 301 acquires a read lock and services data associated with a read request from the first local disk and then releases the read lock. This is entirely local and requires very little overhead to perform.
  • In some cases, the replication service 301 communicates the writes to a designated area of the shared disk 302 that is accessible to just the first node. This is discussed more completely below.
  • The shared disk 302 is implemented as a device accessible to the first node over a WAN network connection or the network. Additionally, the shared disk is locally accessible to a different machine associated with a second node.
  • The shared disk 302 is a block-level or network-accessible device available to all nodes being replicated with one another. This can be a SAN, a Logical Unit Number (LUN), iSCSI, NFS, Server Message Block (SMB), NetWare Core Protocol (NCP), etc.
  • In an embodiment, the shared disk 302 is structured to include at least three areas: a data log, lock data information, and administrative data. Other data and partitions can be created according to the needs of a particular enterprise. For example, tracking and audit information that includes identity information for resources making write transactions can be retained.
  • In an embodiment, the shared disk 302 includes a designated partitioned or reserved area accessible to the first node and another different and designated partitioned area accessible to the third node. This permits the first node and the third node to simultaneously communicate information to the shared disk 302 without conflict and without delay.
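  • The layout just described might be initialized as follows; the field names are illustrative and not taken from the disclosure.

```python
def make_shared_disk(node_names):
    """Three common areas (data log, lock data, administrative data)
    plus one reserved area per node, so that nodes never contend for
    the same region of the shared disk."""
    return {
        "data_log": [],
        "lock_data": {},
        "admin_data": {"audit": []},
        "node_areas": {name: [] for name in node_names},
    }

disk = make_shared_disk(["first", "third"])
```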
  • The shared disk 302 provides the novel mechanism by which replication and failure services are offloaded from the individual nodes to a shared resource.
  • FIG. 4 is a diagram of a data replication failure management system 400, according to an example embodiment. The data replication failure management system 400 is implemented in a machine-accessible and readable medium and is accessible over a network. The network may be wired, wireless, or a combination of wired and wireless. In an embodiment, the data replication failure management system 400 implements various aspects associated with the method 200 of the FIG. 2.
  • The data replication failure management system 400 includes a monitor service 401 and a shared disk 402. Each of these and their interactions with one another will now be discussed in detail.
  • The monitor service 401 is implemented in a machine-accessible and readable medium and is to process on a machine of the network; the machine is associated with a first node. Example processing associated with the monitor service 401 was presented above with reference to the method 200 of the FIG. 2.
  • The monitor service 401 dynamically and in real time keeps track of replication services that process on the at least one additional node. When a particular node fails, the monitor service 401 writes information to the shared disk 402 that permits that particular node to resynchronize with the other nodes when it recovers from the failure situation.
  • In an embodiment, the monitor service 401 records write messages in a particular partition or with particular identifying information within the shared disk 402. The write messages are associated with non failing nodes and their write activity while the particular node is in the failure situation. The particular partition or the particular identifying information permits the particular node to identify the write messages and replay them when the particular node recovers.
  • In yet another embodiment, the monitor service 401 maintains a bitmap in the shared disk 402 that identifies blocks that non failing nodes modified while the particular node was in the failure situation. The particular node uses the bitmap recorded in the shared disk 402 to synchronize with the non failing nodes when the particular node recovers.
  • In other situations, the monitor service 401 clears any attempted write to the shared disk 402 that does not complete successfully and which was attempted by the particular node before it encountered the failure situation. Additionally, the monitor service 401 releases a write lock associated with the attempted write.
  • The shared disk 402 is implemented as a device that is accessible to the first node over a network connection and that is accessible to at least one additional node. Various aspects of the shared disk 402 were discussed above with reference to the system 300 of the FIG. 3.
  • It is now appreciated how a shared resource can be integrated into a replication process to enhance and improve failure recovery and improve read processing.
  • The above description is illustrative, and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of embodiments should therefore be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
  • The Abstract is provided to comply with 37 C.F.R. § 1.72(b) and will allow the reader to quickly ascertain the nature and gist of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
  • In the foregoing description of the embodiments, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting that the claimed embodiments have more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Description of the Embodiments, with each claim standing on its own as a separate exemplary embodiment.

Claims (24)

1. A method, comprising:
acquiring a write lock on a first node;
writing blocks associated with the write lock to one or more local disks of the first node;
simultaneously writing the blocks associated with the write lock to a shared disk over a network connection, wherein the shared disk is shared with a second node and wherein the first node and second node are replicated with one another; and
releasing the write lock after the writing to the one or more local disks and the writing to the shared disk complete.
2. The method of claim 1 further comprising:
acquiring a read lock;
serving a read request associated with the read lock from the one or more local disks; and
releasing the read lock after the read request is served.
3. The method of claim 1 further comprising:
determining that the shared disk is unresponsive; and
automatically switching to an alternative shared disk to communicate with for subsequent write locks.
4. The method of claim 1 further comprising:
determining that the shared disk is unresponsive; and
failing processing associated with subsequent write locks and subsequent read locks on the first node while the shared disk remains unresponsive.
5. The method of claim 1 further comprising:
acquiring a bitmap from the shared disk after recovering from a failure situation on the first node; and
using the bitmap to determine which blocks have to be updated and from which other nodes to acquire the blocks in order to resynchronize the first node after the failure situation.
6. The method of claim 1 further comprising, replaying a shared data log from the shared disk after recovering from a failure situation on the first node in order to resynchronize the first node after the failure situation.
7. The method of claim 1, wherein acquiring further includes hooking onto an Input/Output (I/O) path of the first node.
8. A method, comprising:
monitoring progress of two or more nodes that are replicas and synchronized with one another;
detecting a failure in a particular node while a write was attempting to take place but does not complete, and wherein the write was being attempted to a shared disk associated with the two or more nodes; and
clearing a log in the shared disk associated with a partial write associated with the attempted write to ensure that there is no indication of the attempted write in the log.
9. The method of claim 8 further comprising, releasing a write lock associated with the attempted write and held by the particular node.
10. The method of claim 8 further comprising:
detecting a different failure with the particular node associated with a read lock; and
clearing each additional read lock held by the particular node.
11. The method of claim 8 further comprising:
detecting a different failure with the particular node, wherein a particular write completes successfully to the shared disk before the different failure is detected; and
updating each of the two or more nodes that did not fail with the write from the shared disk.
12. The method of claim 8 further comprising, writing subsequent writes made by the non failing ones of the two or more nodes after the particular node fails to an area of the shared disk associated with the particular node thereby permitting the particular node to replay the subsequent writes when the particular node recovers.
13. The method of claim 8 further comprising, populating a bitmap to identify blocks that were updated while the particular node is in a failure situation on the shared disk thereby permitting the particular node to resynchronize with those blocks from the non failing ones of the two or more nodes when the particular node recovers.
14. The method of claim 8, processing the method as a service executing in a processing environment associated with a non failing one of the two or more nodes or associated with the shared disk.
15. A system, comprising:
a replication service implemented in a machine-accessible and readable medium and to process on a machine associated with a first node; and
a shared disk implemented as a device accessible to the first node over a network connection and locally accessible to a different machine associated with a second node;
wherein the replication service is to keep a first local disk associated with the first node synchronized with a third local disk associated with a third node by acquiring write locks for writes and performing the writes against the first local disk and communicating the writes to the shared disk and then releasing the write locks.
16. The system of claim 15, wherein the shared disk includes a designated partitioned area accessible to the first node and a different designated partitioned area accessible to the third node thereby permitting the first node and the third node to simultaneously communicate information without waiting on one another.
17. The system of claim 16, wherein the replication service communicates the writes to the designated area accessible to the first node on the shared disk.
18. The system of claim 15, wherein the replication service first communicates the writes to the shared disk and waits for an indication that the third node has completed the writes before performing the writes against the first local disk.
19. The system of claim 15, wherein the replication service simultaneously performs the writes against the first local disk while communicating the writes to the shared disk.
20. The system of claim 15, wherein the replication service acquires a read lock and services data associated with a read request and the read lock from the first local disk and then releases the read lock.
21. A system, comprising:
a monitor service implemented in a machine-accessible and readable medium and to process on a machine associated with a first node; and
a shared disk implemented as a device accessible to the first node over a network connection and accessible to at least one additional node;
wherein the monitor service is to keep track of replication services that process on the at least one additional node and writes information to the shared disk when a particular node fails that permits the particular node to resynchronize with other nodes when the particular node recovers from a failure situation.
22. The system of claim 21, wherein the monitor service records write messages in a particular partition or with particular identifying information within the shared disk, the write messages associated with non failing nodes and their write activity while the particular node is in the failure situation and the particular partition or the particular identifying information permits the particular node to identify the write messages and replay them when the particular node recovers.
23. The system of claim 21, wherein the monitor service maintains a bitmap in the shared disk that identifies blocks that non failing nodes modified while the particular node was in the failure situation, and wherein the particular node uses the bitmap recorded in the shared disk to synchronize with the non failing nodes when the particular node recovers.
24. The system of claim 21, wherein the monitor service clears any attempted write to the shared disk that does not complete successfully, which was attempted by the particular node and releases a write lock associated with the attempted write.
US11/846,704 2007-08-29 2007-08-29 Data replication using a shared resource Active 2029-12-22 US8527454B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/846,704 US8527454B2 (en) 2007-08-29 2007-08-29 Data replication using a shared resource

Publications (2)

Publication Number Publication Date
US20090063486A1 true US20090063486A1 (en) 2009-03-05
US8527454B2 US8527454B2 (en) 2013-09-03

Family

ID=40409086

Family Applications (1)

Application Number Title Priority Date Filing Date
US11/846,704 Active 2029-12-22 US8527454B2 (en) 2007-08-29 2007-08-29 Data replication using a shared resource

Country Status (1)

Country Link
US (1) US8527454B2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090271579A1 (en) * 2008-04-24 2009-10-29 Hitachi, Ltd. Storage subsystem and storage system
US20140101099A1 (en) * 2012-10-04 2014-04-10 Sap Ag Replicated database structural change management
US20150066855A1 (en) * 2013-09-04 2015-03-05 Red Hat, Inc. Outcast index in a distributed file system
CN108959390A (en) * 2018-06-01 2018-12-07 新华三云计算技术有限公司 Resource-area synchronous method and device after shared-file system node failure
US10853200B2 (en) * 2019-02-01 2020-12-01 EMC IP Holding Company LLC Consistent input/output (IO) recovery for active/active cluster replication
US20210286526A1 (en) * 2020-03-12 2021-09-16 EMC IP Holding Company LLC Method, device and computer program products for storage management
CN113890817A (en) * 2021-08-27 2022-01-04 济南浪潮数据技术有限公司 Communication optimization method and device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10235249B1 (en) * 2016-07-01 2019-03-19 EMC IP Holding Company LLC System and method for PaaS replication
CN110389905B (en) 2018-04-20 2023-12-19 伊姆西Ip控股有限责任公司 Resource release method, resource allocation method, device and computer program product

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5418966A (en) * 1992-10-16 1995-05-23 International Business Machines Corporation Updating replicated objects in a plurality of memory partitions
US5463733A (en) * 1993-06-14 1995-10-31 International Business Machines Corporation Failure recovery apparatus and method for distributed processing shared resource control
US20030033487A1 (en) * 2001-08-09 2003-02-13 International Business Machines Corporation Method and apparatus for managing data in a distributed buffer system
US20030172106A1 (en) * 2002-02-14 2003-09-11 Iti, Inc. Method of increasing system availability by assigning process pairs to processor pairs
US20030187991A1 (en) * 2002-03-08 2003-10-02 Agile Software Corporation System and method for facilitating communication between network browsers and process instances
US20040128470A1 (en) * 2002-12-27 2004-07-01 Hetzler Steven Robert Log-structured write cache for data storage devices and systems
US20040215640A1 (en) * 2003-08-01 2004-10-28 Oracle International Corporation Parallel recovery by non-failed nodes
US6823349B1 (en) * 2001-09-21 2004-11-23 Emc Corporation Method and system for establishing, maintaining, and using a persistent fracture log
US20080235245A1 (en) * 2005-12-19 2008-09-25 International Business Machines Corporation Commitment of transactions in a distributed system
US7617369B1 (en) * 2003-06-30 2009-11-10 Symantec Operating Corporation Fast failover with multiple secondary nodes
US7631009B1 (en) * 2007-01-02 2009-12-08 Emc Corporation Redundancy check of transaction records in a file system log of a file server

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IT1239122B (en) 1989-12-04 1993-09-28 Bull Hn Information Syst MULTIPROCESSOR SYSTEM WITH DISTRIBUTED RESOURCES WITH DYNAMIC REPLICATION OF GLOBAL DATA
DE69022716T2 (en) 1990-03-19 1996-03-14 Bull Hn Information Syst Multi-computer system with distributed common resources and dynamic and selective duplication of global data and procedures for it.
GB2263797B (en) 1992-01-31 1996-04-03 Plessey Telecomm Object orientated system
US7469331B2 (en) 2004-07-22 2008-12-23 International Business Machines Corporation Method and apparatus for supporting shared library text replication across a fork system call
US7403958B2 (en) 2005-01-19 2008-07-22 International Business Machines Corporation Synchronization-replication concurrency using non-shared snapshot query on a history table at read-uncommitted isolation level

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5418966A (en) * 1992-10-16 1995-05-23 International Business Machines Corporation Updating replicated objects in a plurality of memory partitions
US5463733A (en) * 1993-06-14 1995-10-31 International Business Machines Corporation Failure recovery apparatus and method for distributed processing shared resource control
US20030033487A1 (en) * 2001-08-09 2003-02-13 International Business Machines Corporation Method and apparatus for managing data in a distributed buffer system
US6823349B1 (en) * 2001-09-21 2004-11-23 Emc Corporation Method and system for establishing, maintaining, and using a persistent fracture log
US20030172106A1 (en) * 2002-02-14 2003-09-11 Iti, Inc. Method of increasing system availability by assigning process pairs to processor pairs
US20030187991A1 (en) * 2002-03-08 2003-10-02 Agile Software Corporation System and method for facilitating communication between network browsers and process instances
US20040128470A1 (en) * 2002-12-27 2004-07-01 Hetzler Steven Robert Log-structured write cache for data storage devices and systems
US7617369B1 (en) * 2003-06-30 2009-11-10 Symantec Operating Corporation Fast failover with multiple secondary nodes
US20040215640A1 (en) * 2003-08-01 2004-10-28 Oracle International Corporation Parallel recovery by non-failed nodes
US20080235245A1 (en) * 2005-12-19 2008-09-25 International Business Machines Corporation Commitment of transactions in a distributed system
US7631009B1 (en) * 2007-01-02 2009-12-08 Emc Corporation Redundancy check of transaction records in a file system log of a file server

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090271579A1 (en) * 2008-04-24 2009-10-29 Hitachi, Ltd. Storage subsystem and storage system
US20140101099A1 (en) * 2012-10-04 2014-04-10 Sap Ag Replicated database structural change management
US9720994B2 (en) * 2012-10-04 2017-08-01 Sap Se Replicated database structural change management
US20150066855A1 (en) * 2013-09-04 2015-03-05 Red Hat, Inc. Outcast index in a distributed file system
US10120868B2 (en) * 2013-09-04 2018-11-06 Red Hat, Inc. Outcast index in a distributed file system
CN108959390A (en) * 2018-06-01 2018-12-07 新华三云计算技术有限公司 Resource-area synchronous method and device after shared-file system node failure
US10853200B2 (en) * 2019-02-01 2020-12-01 EMC IP Holding Company LLC Consistent input/output (IO) recovery for active/active cluster replication
US20210286526A1 (en) * 2020-03-12 2021-09-16 EMC IP Holding Company LLC Method, device and computer program products for storage management
US11829604B2 (en) * 2020-03-12 2023-11-28 EMC IP Holding Company LLC Method, device and computer program products for storage management
CN113890817A (en) * 2021-08-27 2022-01-04 济南浪潮数据技术有限公司 Communication optimization method and device

Also Published As

Publication number Publication date
US8527454B2 (en) 2013-09-03

Similar Documents

Publication Publication Date Title
US8527454B2 (en) Data replication using a shared resource
WO2019154394A1 (en) Distributed database cluster system, data synchronization method and storage medium
US11704207B2 (en) Methods and systems for a non-disruptive planned failover from a primary copy of data at a primary storage system to a mirror copy of the data at a cross-site secondary storage system without using an external mediator
US9904605B2 (en) System and method for enhancing availability of a distributed object storage system during a partial database outage
EP2619695B1 (en) System and method for managing integrity in a distributed database
US9940205B2 (en) Virtual point in time access between snapshots
US9785691B2 (en) Method and apparatus for sequencing transactions globally in a distributed database cluster
US20240291887A1 (en) Commissioning and decommissioning metadata nodes in a running distributed data storage system
US8856091B2 (en) Method and apparatus for sequencing transactions globally in distributed database cluster
US7657581B2 (en) Metadata management for fixed content distributed data storage
US8229893B2 (en) Metadata management for fixed content distributed data storage
JP6225262B2 (en) System and method for supporting partition level journaling to synchronize data in a distributed data grid
US9268659B2 (en) Detecting failover in a database mirroring environment
US20050193245A1 (en) Internet protocol based disaster recovery of a server
JP2005196683A (en) Information processing system, information processor and control method of information processing system
JP2005243026A (en) Method, system, and computer program for system architecture for arbitrary number of backup components
TW201514684A (en) Speculative recovery using storage snapshot in a clustered database
KR102016095B1 (en) System and method for persisting transaction records in a transactional middleware machine environment
JP2011530127A (en) Method and system for maintaining data integrity between multiple data servers across a data center
KR20060004915A (en) Recovery from failures within data processing systems
JP2008059583A (en) Cluster system, method for backing up replica in cluster system, and program product
CN104994168A (en) distributed storage method and distributed storage system
US20120278429A1 (en) Cluster system, synchronization controlling method, server, and synchronization controlling program
CN106325768B (en) A kind of two-shipper storage system and method
WO2017014814A1 (en) Replicating memory volumes

Legal Events

Date Code Title Description
AS Assignment

Owner name: NOVELL, INC., UTAH

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:OZA, DHAIRESH;REEL/FRAME:019890/0996

Effective date: 20070829

AS Assignment

Owner name: EMC CORPORATON, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CPTN HOLDINGS LLC;REEL/FRAME:027016/0160

Effective date: 20110909

AS Assignment

Owner name: CPTN HOLDINGS, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NOVELL, INC.;REEL/FRAME:027169/0200

Effective date: 20110427

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:ASAP SOFTWARE EXPRESS, INC.;AVENTAIL LLC;CREDANT TECHNOLOGIES, INC.;AND OTHERS;REEL/FRAME:040136/0001

Effective date: 20160907

Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT, NORTH CAROLINA

Free format text: SECURITY AGREEMENT;ASSIGNORS:ASAP SOFTWARE EXPRESS, INC.;AVENTAIL LLC;CREDANT TECHNOLOGIES, INC.;AND OTHERS;REEL/FRAME:040134/0001

Effective date: 20160907

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EMC CORPORATION;REEL/FRAME:040203/0001

Effective date: 20160906

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES, INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:049452/0223

Effective date: 20190320

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:053546/0001

Effective date: 20200409

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8

AS Assignment

Owner name: WYSE TECHNOLOGY L.L.C., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: MOZY, INC., WASHINGTON

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: MAGINATICS LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: FORCE10 NETWORKS, INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL SYSTEMS CORPORATION, TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL SOFTWARE INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL MARKETING L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL INTERNATIONAL, L.L.C., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: CREDANT TECHNOLOGIES, INC., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: AVENTAIL LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: ASAP SOFTWARE EXPRESS, INC., ILLINOIS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

AS Assignment

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL INTERNATIONAL L.L.C., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

AS Assignment

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL INTERNATIONAL L.L.C., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329