CN108984345B - Big data backup method based on virtual shared directory - Google Patents
- Publication number
- CN108984345B (application CN201810776448.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- big data
- backup
- medium
- nfs
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1464—Management of the backup or restore process for networked environments
- G06F11/1461—Backup scheduling policy
- G06F11/1469—Backup restoration techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A big data backup method based on a virtual shared directory. Local storage on a media server exposes a file sharing protocol interface to external systems and establishes a virtual shared directory. The interface is provided to a big data platform A that is to be backed up. When platform A needs a backup, it mounts the partition locally and obtains sharing rights to the virtual directory. After the backup finishes, the partition is disconnected and returned to the media server, which then provides the shared directory service to another storage server.
Description
Technical Field
The invention belongs to the technical field of data backup, and particularly relates to a big data backup method for improving big data backup efficiency.
Background
Data is increasingly valuable in the big data era, and the safety of data on big data platforms must be guaranteed. A faster and more universal backup technology is therefore needed to back up data from various big data platforms while ensuring backup efficiency and compatibility.
At present, data backup for a big data platform generally follows an architecture with three parts: backup agents, media servers, and storage media.
Existing implementations fall roughly into two types:
(1) The backup agent is installed on the big data host to be backed up, collects the backup data, and transmits it to the media server over HTTP. The media server is usually deployed independently; it collects data from each backup agent, deduplicates and compresses it, and then stores it on a storage medium (such as disk) through an iSCSI interface.
(2) The backup agent is installed on the big data host to be backed up, collects the backup data, and transmits it to the media server over HTTP. The media server is deployed independently; it collects data from each backup agent, deduplicates and compresses it, and then stores it on a storage medium (such as object storage) through an HTTP interface.
In prior art (1), a dedicated collection client is required for each kind of backup object. The agent must first transfer data from the real data source, such as a hadoop name, to a temporary directory on the host. The data in that directory is then cut into blocks (for example, one 64K data block at a time), and each block is sent to the media server over HTTP. After the media server receives the data and performs deduplication and compression, it sends the data to a dedicated storage medium (such as disk) over an FC network using the iSCSI protocol. The data therefore passes through four time-consuming steps: temporary storage local to the agent, local block cutting, network transmission to the media server, and network transmission from the media server to the storage medium. Backup efficiency is hard to guarantee, and the large number of links increases the operational risk of the system.
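The block-cutting stage of this prior-art pipeline can be illustrated with a minimal Python sketch. It assumes the 64K example block size from the text means 64 KiB, and it uses an in-memory byte stream as a hypothetical stand-in for the agent's temporary directory; the function name `iter_blocks` is illustrative and does not come from the patent.

```python
import io
from typing import BinaryIO, Iterator

BLOCK_SIZE = 64 * 1024  # 64 KiB, assumed from the "64K" example in the text


def iter_blocks(stream: BinaryIO, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield successive fixed-size blocks; the last block may be shorter."""
    while True:
        block = stream.read(block_size)
        if not block:
            return
        yield block


# Hypothetical staged data: 150 KiB standing in for the temporary directory contents.
staged = io.BytesIO(b"x" * (150 * 1024))
blocks = list(iter_blocks(staged))
block_sizes = [len(b) for b in blocks]
```

Each block would then need its own network transfer, which is one reason the text counts block cutting among the time-consuming steps.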
Prior art (2) differs from prior art (1) only in the back-end storage protocol: after the data reaches the media server, it is not sent to the storage medium over iSCSI but is cut into blocks again and sent to object storage over HTTP. The overall storage inefficiency and risk are not avoided. A dedicated client agent must still be developed for each type of big data platform, so the complexity and compatibility of the backup system are not improved. A new solution is therefore needed in the art.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a big data backup method based on a virtual shared directory that improves the compatibility of a data backup system across heterogeneous big data platforms, simplifies the backup process of the big data platform backup system, and improves backup efficiency.
A big data backup method based on a virtual shared directory, characterized in that the method comprises the following steps:
Step one: establish a virtual shared data storage backup system comprising a big data platform, a backup medium layer, a medium service layer, and a storage medium;
Step two: the big data platform initiates a backup request to the system; the backup medium layer remotely mounts the NFS (Network File System) agent on the big data platform, provides the platform with a virtual shared directory based on the NFS protocol, and temporarily stores the data in an internal directory of the NFS agent;
Step three: after the NFS agent provided by the backup medium layer finishes temporary storage, the virtual sharing link is disconnected, and the data of the big data platform now belongs to the backup medium layer;
Step four: after the backup medium layer processes the data, the NFS agent is sent to the storage medium, and the data of the big data platform is retained in the storage medium;
Step five: the big data platform initiates a data recovery request; the backup medium layer takes the corresponding data on the storage medium, establishes a shared virtual directory through the NFS agent, and sends it to the medium service layer;
Step six: the medium service layer mounts the NFS agent on the big data platform again, and the big data platform obtains file-level access to the data;
Step seven: the big data platform restores the data to the production environment and performs the restoration operation, completing the big data backup based on the virtual shared directory.
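The backup half of the steps above (steps two to four) can be sketched as a toy Python model. This is only an illustration under stated assumptions: the class `VirtualSharedDirectory`, the function `backup`, and the use of dictionaries in place of real NFS mounts and disks are all hypothetical and are not defined by the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VirtualSharedDirectory:
    """Toy stand-in for the NFS agent's internal directory (hypothetical model)."""
    files: Dict[str, bytes] = field(default_factory=dict)
    mounted_on: Optional[str] = None  # platform currently holding the mount

    def mount(self, platform: str) -> None:
        if self.mounted_on is not None:
            raise RuntimeError(f"already mounted on {self.mounted_on}")
        self.mounted_on = platform

    def write(self, platform: str, name: str, data: bytes) -> None:
        if self.mounted_on != platform:
            raise PermissionError(f"{platform} does not hold the mount")
        self.files[name] = data

    def disconnect(self) -> None:
        self.mounted_on = None


def backup(platform: str, data: Dict[str, bytes],
           storage: Dict[str, Dict[str, bytes]]) -> List[str]:
    """Run steps two to four and return a log of the stages passed."""
    log: List[str] = []
    share = VirtualSharedDirectory()
    share.mount(platform)                  # step two: remote mount on the platform
    log.append("mounted")
    for name, payload in data.items():     # step two: stage data in the share
        share.write(platform, name, payload)
    log.append("staged")
    share.disconnect()                     # step three: data now belongs to the backup medium layer
    log.append("disconnected")
    storage[platform] = dict(share.files)  # step four: retain the data on the storage medium
    log.append("persisted")
    return log


storage_medium: Dict[str, Dict[str, bytes]] = {}
steps = backup("platform-A", {"part-0000": b"rows"}, storage_medium)
```

The model keeps the one property the steps emphasize: the platform can write only while it holds the mount, and ownership passes to the backup side once the link is disconnected.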
The storage medium is the physical device that actually stores the data. It can be partitioned automatically and can hold backup data for more than one big data platform at the same time.
The backup medium layer adapts the data receiving layer associated with the NFS agent to the storage medium and is used for temporary storage and processing of data.
With this design, the invention brings the following beneficial effects: it improves the compatibility of a data backup system across heterogeneous big data platforms, simplifies the backup process of the big data platform backup system, and improves backup efficiency.
The invention brings the following further beneficial effect: it creates the virtual shared directory through two remote mounts, which removes the complexity caused by the repeated processing and transmission in existing backup software and improves the efficiency of backup and recovery.
The remote mounting relies on the NFS protocol. Because this general-purpose file protocol can be adapted to various big data platforms, no platform-specific client of the kind required by traditional backup software is needed, and the compatibility of big data platform backup is improved.
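As an illustration of why NFS needs no platform-specific client, the following Python sketch builds the standard `mount -t nfs` and `umount` command lines that a Linux host would typically use. The commands are only constructed, not executed, and the host name and paths are hypothetical; the patent does not specify any particular commands.

```python
from typing import List


def nfs_mount_command(server: str, export_path: str, mount_point: str) -> List[str]:
    """Build (but do not run) the argv for mounting an NFS export."""
    return ["mount", "-t", "nfs", f"{server}:{export_path}", mount_point]


def unmount_command(mount_point: str) -> List[str]:
    """Build (but do not run) the argv for detaching the share after backup."""
    return ["umount", mount_point]


# Hypothetical host name and paths, for illustration only.
mount_argv = nfs_mount_command("media-server.example", "/exports/platform-a", "/mnt/backup")
umount_argv = unmount_command("/mnt/backup")
```

Running such commands would require root privileges and a reachable NFS server, so the sketch stops at building the argument lists.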
Drawings
The invention is further described with reference to the following figures and detailed description:
Fig. 1 is a schematic block diagram of the process of the big data backup method based on a virtual shared directory according to the invention.
Detailed Description
A big data backup method based on a virtual shared directory, characterized in that the method comprises the following steps:
Step one: establish a virtual shared data storage backup system comprising a big data platform, a backup medium layer, a medium service layer, and a storage medium;
Step two: the big data platform initiates a backup request to the system; the backup medium layer remotely mounts the NFS (Network File System) agent on the big data platform, provides the platform with a virtual shared directory based on the NFS protocol, and temporarily stores the data in an internal directory of the NFS agent;
Step three: after the NFS agent provided by the backup medium layer finishes temporary storage, the virtual sharing link is disconnected, and the data of the big data platform now belongs to the backup medium layer;
Step four: after the backup medium layer processes the data, the NFS agent is sent to the storage medium, and the data of the big data platform is retained in the storage medium. The virtual shared directory is provided on the storage medium by remote mounting, so that the backup data is persisted to disk on the storage medium; that is, the shared directory is kept at the storage medium as storage. When another big data platform needs to be backed up at that point, a new partition is created on the storage medium to hold the new backup data;
Step five: the big data platform initiates a data recovery request; the backup medium layer takes the corresponding data on the storage medium, establishes a shared virtual directory through the NFS agent, and sends it to the medium service layer;
Step six: the medium service layer mounts the NFS agent on the big data platform again, and the big data platform obtains file-level access to the data;
Step seven: the big data platform restores the data to the production environment and performs the restoration operation, completing the big data backup based on the virtual shared directory.
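The partitioning behavior described in step four, where each additional platform receives a new partition on the storage medium, can be sketched as follows. The class `StorageMedium`, the method `partition_for`, and the partition naming scheme are hypothetical choices made for illustration; the patent does not say how partitions are named or sized.

```python
from typing import Dict


class StorageMedium:
    """Toy model: one partition per backed-up platform (names are hypothetical)."""

    def __init__(self) -> None:
        self._partitions: Dict[str, str] = {}

    def partition_for(self, platform: str) -> str:
        """Return the platform's partition, creating a new one on first use."""
        if platform not in self._partitions:
            self._partitions[platform] = f"partition-{len(self._partitions) + 1}"
        return self._partitions[platform]


medium = StorageMedium()
first = medium.partition_for("platform-A")
second = medium.partition_for("platform-B")
again = medium.partition_for("platform-A")  # an existing platform reuses its partition
```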
The invention exposes a file sharing protocol interface to external systems through local storage on the media server and establishes a virtual shared directory. When the interface is provided to a big data platform A that needs to be backed up, platform A mounts the partition locally when a backup is required and thereby obtains sharing rights to the virtual directory. After the backup finishes, the partition is disconnected and returned to the media server, which then provides the shared directory service to the other storage server. Backup of big data files is thus achieved simply by file copying.
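Because the mounted share looks like an ordinary directory, the copy-based backup described here reduces to a recursive file copy. The sketch below assumes Python 3.8 or later and uses temporary directories as hypothetical stand-ins for the platform's data directory and the mounted virtual shared directory; the function name `backup_by_copy` and the file names are illustrative only.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def backup_by_copy(source_dir: Path, mounted_share: Path) -> List[str]:
    """Copy everything under source_dir into mounted_share; return copied relative paths."""
    shutil.copytree(source_dir, mounted_share, dirs_exist_ok=True)
    return sorted(
        p.relative_to(mounted_share).as_posix()
        for p in mounted_share.rglob("*")
        if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "platform_data"  # stand-in for the platform's data directory
    share = Path(tmp) / "mnt_backup"      # stand-in for the mounted virtual shared directory
    (source / "warehouse").mkdir(parents=True)
    (source / "warehouse" / "part-0000").write_bytes(b"rows")
    share.mkdir()
    copied = backup_by_copy(source, share)
    copied_bytes = (share / "warehouse" / "part-0000").read_bytes()
```

No chunking or HTTP transfer logic appears in the sketch, which reflects the simplification the text claims over agent-based pipelines.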
The recovery process is the reverse of the backup process, except that the order of the two data shares is different.
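The reversed order can be made concrete with a toy restore function covering steps five to seven: the share is built from the storage medium first and only then exposed to the platform. Dictionaries stand in for the storage medium and the production environment, and the names `restore`, `storage_medium`, and `production_env` are hypothetical.

```python
from typing import Dict, List

Files = Dict[str, bytes]


def restore(platform: str, storage: Dict[str, Files], production: Files) -> List[str]:
    """Walk the restore path: storage medium first, then the platform."""
    # Step five: the stored data is exposed as a shared virtual directory.
    share: Files = dict(storage[platform])
    # Step six: the share is mounted on the platform again; this toy model
    # represents the mount as direct access to the `share` dictionary.
    # Step seven: the platform copies the files back into production.
    production.update(share)
    return sorted(share)


storage_medium = {"platform-A": {"part-0000": b"rows", "part-0001": b"more"}}
production_env: Files = {}
restored = restore("platform-A", storage_medium, production_env)
```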
Claims (3)
1. A big data backup method based on a virtual shared directory, characterized by comprising the following steps:
Step one: establish a virtual shared data storage backup system comprising a big data platform, a backup medium layer, a medium service layer, and a storage medium;
Step two: the big data platform initiates a backup request to the system; the backup medium layer remotely mounts the NFS (Network File System) agent on the big data platform, provides the platform with a virtual shared directory based on the NFS protocol, and temporarily stores the data in an internal directory of the NFS agent;
Step three: after the NFS agent provided by the backup medium layer finishes temporary storage, the virtual sharing link is disconnected, and the data of the big data platform now belongs to the backup medium layer;
Step four: after the backup medium layer processes the data, the NFS agent is sent to the storage medium, and the data of the big data platform is retained in the storage medium;
Step five: the big data platform initiates a data recovery request; the backup medium layer takes the corresponding data on the storage medium, establishes a shared virtual directory through the NFS agent, and sends it to the medium service layer;
Step six: the medium service layer mounts the NFS agent on the big data platform again, and the big data platform obtains file-level access to the data;
Step seven: the big data platform restores the data to the production environment and performs the restoration operation, completing the big data backup based on the virtual shared directory.
2. The big data backup method based on a virtual shared directory as claimed in claim 1, wherein the storage medium is a disk that actually stores the data, can be partitioned automatically, and can hold backup data for more than one big data platform at the same time.
3. The big data backup method based on a virtual shared directory as claimed in claim 1, wherein the backup medium layer adapts the data receiving layer associated with the NFS agent to the storage medium and is used for temporary storage and processing of data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810776448.0A CN108984345B (en) | 2018-07-11 | 2018-07-11 | Big data backup method based on virtual shared directory |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810776448.0A CN108984345B (en) | 2018-07-11 | 2018-07-11 | Big data backup method based on virtual shared directory |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108984345A (en) | 2018-12-11 |
CN108984345B (en) | 2020-06-23 |
Family
ID=64548399
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810776448.0A Active CN108984345B (en) | 2018-07-11 | 2018-07-11 | Big data backup method based on virtual shared directory |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108984345B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111399984A (en) * | 2020-03-19 | 2020-07-10 | 上海英方软件股份有限公司 | File recovery method and system based on virtual machine backup data |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1554055A (en) * | 2001-07-23 | 2004-12-08 | | High-availability cluster virtual server system |
CN102375955A (en) * | 2010-08-17 | 2012-03-14 | 伊姆西公司 | System and method for locking files in combined naming space in network file system |
US8429140B1 (en) * | 2010-11-03 | 2013-04-23 | NetApp, Inc. | System and method for representing application objects in standardized form for policy management |
US8655851B2 (en) * | 2011-04-08 | 2014-02-18 | Symantec Corporation | Method and system for performing a clean file lock recovery during a network filesystem server migration or failover |
CN103761168A (en) * | 2014-01-26 | 2014-04-30 | 上海爱数软件有限公司 | Method for mounting backup virtual machine based on nfs volume |
CN104461776A (en) * | 2014-11-26 | 2015-03-25 | 上海爱数软件有限公司 | Application disaster tolerance method based on CDP and iSCSI virtual disk technology |
CN105224256A (en) * | 2015-10-13 | 2016-01-06 | 浪潮(北京)电子信息产业有限公司 | A kind of storage system |
CN105740052A (en) * | 2016-01-28 | 2016-07-06 | 浪潮(北京)电子信息产业有限公司 | Method, device and system for online migration of virtual machines of non-shared memories |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7103638B1 (en) * | 2002-09-04 | 2006-09-05 | Veritas Operating Corporation | Mechanism to re-export NFS client mount points from nodes in a cluster |
US8694469B2 (en) * | 2009-12-28 | 2014-04-08 | Riverbed Technology, Inc. | Cloud synthetic backups |
US10108687B2 (en) * | 2015-01-21 | 2018-10-23 | Commvault Systems, Inc. | Database protection using block-level mapping |
CN105468476B (en) * | 2015-11-18 | 2019-03-08 | 盛趣信息技术(上海)有限公司 | Data disaster recovery and backup systems based on HDFS |
CN106250270B (en) * | 2016-07-28 | 2019-05-21 | 广东奥飞数据科技股份有限公司 | A kind of data back up method under cloud computing platform |
Non-Patent Citations (2)
Title |
---|
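NetBackup Disk Based Data Protection Options; Alex Davies; eval.symantec.com/enterprise/white_papers; 20090430; full text * |
Research and Implementation of a Three-Level Storage Scheme Based on Virtualization Technology; Han Xue; Wanfang Data Knowledge Service Platform; 20150730; full text * |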
Also Published As
Publication number | Publication date |
---|---|
CN108984345A (en) | 2018-12-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108255641B (en) | CDP disaster recovery method based on cloud platform | |
CN106250270B (en) | A kind of data back up method under cloud computing platform | |
CN107256182B (en) | Method and device for restoring database | |
CN107526626B (en) | Docker container thermal migration method and system based on CRIU | |
CN112084098A (en) | Resource monitoring system and working method | |
CN103875229B (en) | asynchronous replication method, device and system | |
US11921597B2 (en) | Cross-platform replication | |
CN109144785B (en) | Method and apparatus for backing up data | |
CN106302806B (en) | A kind of method of data synchronization, system, synchronous obtaining method and relevant apparatus | |
CN106294585A (en) | A kind of storage method under cloud computing platform | |
US10534796B1 (en) | Maintaining an active-active cloud across different types of cloud storage services | |
CN101808127B (en) | Data backup method, system and server | |
CN109976941B (en) | Data recovery method and device | |
US20070294310A1 (en) | Method and apparatus for storing and recovering fixed content | |
CN105677507B (en) | A kind of business data cloud standby system and method | |
US20220091749A1 (en) | Resilient implementation of client file operations and replication | |
US8315986B1 (en) | Restore optimization | |
CN103780417A (en) | Database failure transfer method based on cloud hard disk and device thereof | |
CN104035837A (en) | Method for backing up isomorphic/isomerous UNIX/Linux host on line | |
CN108710550B (en) | Double-data-center disaster tolerance system for public security traffic management inspection and control system | |
CN115658390A (en) | Container disaster tolerance method, system, device, equipment and computer readable storage medium | |
CN114185484A (en) | Method, device, equipment and medium for clustering document storage | |
CN108984345B (en) | Big data backup method based on virtual shared directory | |
CN116560904A (en) | NAS data backup disaster recovery method, system, terminal and storage medium | |
CN105323271B (en) | Cloud computing system and processing method and device thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||