WO2016120988A1 - Database system and database management method - Google Patents

Database system and database management method

Info

Publication number
WO2016120988A1
Authority
WO
WIPO (PCT)
Prior art keywords
log
database
logs
integrated
generated
Prior art date
Application number
PCT/JP2015/052160
Other languages
French (fr)
Japanese (ja)
Inventor
田中 剛
敦 友田
有哉 礒田
上村 哲也
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Priority to PCT/JP2015/052160
Publication of WO2016120988A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures

Definitions

  • The present invention generally relates to database management, for example, to the output of transaction processing logs.
  • In Patent Document 1, a transaction processing log (Tx log) is output.
  • a plurality of clients in an application program each pass a Tx log to a single log management system, and the log management system stores a plurality of logs in a single log disk.
  • Multiple clients and log management systems exist within a computing device.
  • In an HA (High Availability) configuration, the database system includes an active server and a standby server.
  • In the HA configuration, it is necessary to transfer the Tx log from the active server to the standby server.
  • Patent Document 2 is known as this type of technology.
  • In Patent Document 2, a plurality of Tx logs generated in the active server are transferred in parallel to the standby server.
  • In Patent Document 1, a conflict may occur when writing or transferring a Tx log, and thus it is necessary to acquire a lock.
  • The client needs to acquire a lock before writing a Tx log to the single log management system.
  • The transfer parallelism is the number of Tx logs that can be simultaneously transferred from the active server to the standby server.
  • When a Tx log is transferred from the active server to the standby server, it is necessary to acquire the lock of the transfer path before transferring the Tx log.
  • The active server has a first execution unit that executes a plurality of first sub-execution units in parallel, and an integrated log management unit.
  • The plurality of first sub-execution units executed in parallel execute a plurality of transactions on the first database managed by the active server, generate a plurality of logs respectively corresponding to the plurality of transactions, and write the plurality of logs to a plurality of log storage areas, respectively.
  • The integrated log management unit reads the plurality of logs from the plurality of log storage areas, generates an integrated log including the read logs, and transfers the generated integrated log to the standby server.
  • Log writing and transfer can be performed at high speed.
  • FIG. 1 shows the configuration of a database system according to the first embodiment.
  • FIG. 2 shows the data structures of the Tx log and the integrated Tx log according to the first embodiment.
  • FIG. 3 shows an example of log output.
  • FIG. 4 is a flowchart of the Tx process.
  • FIG. 5 is a flowchart of the log output process according to the first embodiment.
  • FIG. 6 is a flowchart of the integrated Tx log writing process according to the first embodiment.
  • FIG. 7 shows the configuration of the database system according to the second embodiment.
  • FIG. 8 shows the data structure of the integrated Tx log according to the second embodiment.
  • FIG. 9 shows the configuration of the partition access map.
  • FIG. 10 is a flowchart of the log output process according to the second embodiment.
  • FIG. 11 is a flowchart of the integrated Tx log writing process according to the second embodiment.
  • FIG. 12 is a flowchart of the integrated Tx log reflection process.
  • PDEV indicates a physical storage device, and may typically be a nonvolatile storage device (for example, an auxiliary storage device).
  • the PDEV may be, for example, an HDD (Hard Disk Drive) or an SSD (Solid State Drive).
  • the “storage unit” may be one or more storage devices including a memory.
  • The storage unit may be at least the main storage device out of a main storage device (typically a volatile memory) and an auxiliary storage device (typically a nonvolatile storage device).
  • In the following description, processing may be described with a functional unit (for example, a query reception unit, a query execution plan generation unit, a query execution unit, an integrated log management unit, or an LLSN management unit) as the subject.
  • A functional unit is realized by a processor (for example, a CPU (Central Processing Unit)) executing a program, so that predetermined processing is performed using a storage unit (for example, a memory) and/or an interface device (for example, a communication port) as appropriate.
  • The subject of the processing may therefore be a processor.
  • The processing described with the functional unit as the subject may be processing performed by a processor or by an apparatus or system having the processor.
  • the processor may include a hardware circuit that performs a part or all of the processing.
  • The program may be installed from a program source into a device such as a computer.
  • The program source may be, for example, a program distribution server or a computer-readable storage medium.
  • the program distribution server may include a processor (for example, a CPU) and a storage unit, and the storage unit may further store a distribution program and a program to be distributed.
  • the processor of the program distribution server executes the distribution program, so that the processor of the program distribution server may distribute the distribution target program to other computers.
  • two or more functional units may be realized as one functional unit, or one functional unit may be realized as two or more functional units.
  • When elements of the same type are described without being distinguished from one another, the common part of their reference codes is used (for example, DBMS 101).
  • The reference codes of the active server and its constituent elements include "C", meaning the active system, and the reference codes of the standby server and its constituent elements include "S", meaning the standby system.
  • FIG. 1 shows a configuration of a database system according to the first embodiment.
  • the database system includes an active server 100C and a standby server 100S.
  • An in-memory database is realized in each of the active server 100C and the standby server 100S. That is, the memory 117C of the active server 100C stores the DB partitions 109CA and 109CB constituting the DB, and the memory 117S of the standby server 100S stores the DB partitions 109SA and 109SB, which correspond to the DB partitions 109CA and 109CB and likewise constitute the DB.
  • the DB is divided into two DB partitions, but the DB may be divided into three or more DB partitions.
  • the DB partitions 109SA and 109SB have the same contents (replication) as the DB partitions 109CA and 109CB.
  • The following description takes the active server 100C as an example. At least a part of the description of the active server 100C can be applied to the standby server 100S.
  • The elements illustrated for the standby server 100S are those related to its operation as the standby server 100S. However, because the standby server 100S operates as the active server after a failover from the active server 100C, it can have the same functional units as the active server 100C, although they are not shown.
  • The active server 100C includes a PDEV I/F (interface device) 119C, a network I/F 120C, a memory 117C, and a processor 118C connected to them.
  • A plurality of local log PDEVs 121C are connected to the PDEV I/F 119C.
  • the local log PDEV 121C is a PDEV that stores a local log file (described later).
  • The local log file (for example, the local log PDEV 121C) and the DB partition 109C may correspond 1:1.
  • the standby server 100S is connected to the network I / F 120C via a communication medium such as a communication network.
  • The memory 117C is an example of a storage unit, and includes at least the main memory out of a main memory (for example, a volatile memory such as a DRAM (Dynamic Random Access Memory)) and an auxiliary memory (for example, a non-volatile memory such as a flash memory).
  • the DBMS 101C is realized by the processor 118C executing a DBMS (Database Management System) program.
  • the DBMS 101C includes a query reception unit 102C, a query execution plan generation unit 103C, a dictionary 104C, an integrated log management unit 105C, and a query execution unit 106C.
  • The DBMS 101C manages the integrated log buffer 115C.
  • the DBMS 101C has an LLSN management unit 111C for each DB partition 109C, and manages a log buffer 113C provided for each DB partition 109C.
  • the processor 118C executes an OS (Operating System) 116C.
  • the DBMS 101C is executed on the OS 116C.
  • the log buffer 113C temporarily stores a Tx log including the update history of the corresponding DB partition 109C.
  • A Tx log written to the log buffer 113C is not deleted from the log buffer 113C even after it is written to the local log PDEV 121C; it remains there at least until it is acquired by the integrated log management unit 105C (that is, included in an integrated Tx log).
  • The processing thread 107C and the log buffer 113C may correspond 1:1.
  • the LLSN management unit 111C is an example of an order management unit, and manages the LLSN.
  • “LLSN” is an abbreviation for local log sequence number.
  • the LLSN is a number that does not overlap in one DB partition 109C.
  • the LLSN is numbered when outputting the Tx log.
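  • As a minimal sketch of the LLSN management described above, the following Python snippet keeps one counter per DB partition. The class name `LLSNManager`, the starting value of 1, and the use of a lock are illustrative assumptions, not details taken from this document.

```python
import threading


class LLSNManager:
    """Hypothetical order manager for one DB partition (illustrative only)."""

    def __init__(self) -> None:
        # Assumption: numbering starts at 1; the source does not say.
        self._next_llsn = 1
        # Assumption: a lock guards the counter, because a transaction that
        # updates several partitions may number LLSNs from another thread.
        self._lock = threading.Lock()

    def number(self) -> int:
        """Issue the next LLSN; values never repeat within this partition."""
        with self._lock:
            llsn = self._next_llsn
            self._next_llsn += 1
            return llsn


# One manager per DB partition, keyed by the partition's reference code.
managers = {"109CA": LLSNManager(), "109CB": LLSNManager()}
llsns_a = [managers["109CA"].number() for _ in range(3)]
llsns_b = [managers["109CB"].number() for _ in range(2)]
```

  • Because each partition has its own counter, the same number can appear in two partitions, but never twice within one partition.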
  • the dictionary 104C is information indicating the position of a database element (for example, a table and an index).
  • the query receiving unit 102C receives a query issued by a query issuer.
  • The query is described in, for example, SQL (Structured Query Language).
  • a plurality of transactions may be described by one query, and a plurality of transactions may be described by a plurality of queries.
  • the query issuer may be a functional unit in the DBMS 101C or a functional unit outside the DBMS 101C (for example, a client computer (not shown)).
  • the query execution plan generation unit 103C generates a query execution plan including one or more database operations necessary for executing the query from the query received by the query reception unit 102C.
  • the query execution plan is information including, for example, a relationship between one or more database operations and the execution order of the database operations, and may be stored as query execution plan information.
  • a query execution plan may be represented by a tree structure in which a database operation is a node and a relation of execution order of database operations is an edge.
  • One or a plurality of transaction sets can be specified from one query execution plan or a combination of a plurality of query execution plans.
  • The query execution unit 106C executes the query received by the query reception unit 102C according to the query execution plan generated by the query execution plan generation unit 103C, and returns the execution result to the query issuer. At this time, the query execution unit 106C issues a data read request (reference request) necessary for executing a database operation, and executes the database operation using the data read from the DB partition 109C in accordance with the read request (for example, it calculates new data using the read data (value) and issues a write request for updating the data in the read-source record to the calculated data).
  • the query execution unit 106C performs a database operation by executing the processing thread 107C. That is, the processing thread 107C can execute the database operation with reference to the dictionary 104C as appropriate.
  • the processor 118C has a plurality of cores.
  • a plurality of cores exist in one or a plurality of processors 118C.
  • the processing thread 107C may be called a task.
  • a user thread realized by a library or the like may be used in addition to a process and kernel thread realized by the OS 116C.
  • One transaction corresponding to one or more database operations may be executed by one processing thread 107C.
  • the subject of processing performed by the query execution unit 106C executing the processing thread 107C may be the processing thread 107C.
  • The query execution unit 106C (processing thread 107C) issues an I/O request for the local log PDEV 121C to the OS 116C in order to write a Tx log to the local log file in the local log PDEV 121C during execution of the transaction.
  • The OS 116C accepts the I/O request and issues an I/O request to the local log PDEV 121C.
  • The PDEV I/F 119C may be provided with a plurality of I/O queues (not shown).
  • The processing thread 107C issues an I/O request for writing the Tx log, and the I/O request may be stored in an I/O queue.
  • The I/O request may be stored in the I/O queue by the OS 116C.
  • The local log PDEV 121C stores the local log file.
  • The Tx log that is the write target of the I/O request is recorded in the local log file.
  • The DB partition 109C, the I/O queue, and the local log file may correspond 1:1:1. That is, there may be one I/O queue and one local log file for each DB partition 109C. There may be one or more processing threads 107C per local log file. However, I/O request processing may be reduced by sending the interrupt that notifies completion of an I/O request to a specific processing thread 107C for each I/O queue. For example, the processing thread 107C, the DB partition 109C, and the I/O queue may correspond 1:1:1. In this embodiment, to make the explanation easy to follow, the processing thread 107C is also assumed to correspond 1:1 to the local log file.
  • The processing thread 107CA issues, to the local log file corresponding to the DB partition 109CA, an I/O request for a Tx log indicating that a record in the DB partition 109CA has been updated.
  • The issued I/O request is sent to the OS 116C via the log buffer 113CA.
  • The OS 116C receives the I/O request for the local log file, and stores the I/O request in the I/O queue corresponding to the local log file.
  • The OS 116C sends the I/O request stored in the I/O queue from the I/O queue to the local log PDEV 121C that stores the local log file of the I/O destination.
  • the plurality of processing threads 107CA and 107CB write Tx logs in the plurality of log buffers 113CA and 113CB, respectively.
  • The integrated log management unit 105C extracts a plurality of Tx logs from the plurality of log buffers 113CA and 113CB, generates one integrated Tx log including the plurality of Tx logs, and writes the integrated Tx log to the integrated log buffer 115C. Then, the integrated log management unit 105C transfers the integrated Tx log written in the integrated log buffer 115C to the standby server 100S.
  • the integrated Tx log transferred to the standby server 100S and received by the standby server 100S is written to the integrated log PDEV (PDEV in which the integrated Tx log is stored) 134S through the integrated log buffer 115S by the integrated log management unit 105S.
  • a processing thread (processing thread for log expansion processing) 107S for each DB partition 109S copies the integrated Tx log from the integrated log PDEV 134S to the memory 117S.
  • Based on the position information in the integrated Tx log, the processing thread 107S refers to the Tx logs in the integrated Tx log from the first to the last and, in the order of the LLSNs written in the integrated Tx log, applies the updates indicated by the record change history 204 to the corresponding DB partition 109S.
  • FIG. 2 shows the data structure of the Tx log and the integrated Tx log.
  • There may be one Tx log 201 per transaction. The Tx log 201 includes a TxID 202 that is the ID of the transaction, one or more LLSNs 203 that are numbered during the execution of the transaction, and a record change history 204 that is information representing the update history of the transaction.
  • the number of LLSNs 203 included in the Tx log 201 is the same as the number of DB partitions 109C updated in the execution of one corresponding transaction.
  • the checkpoint log may be a log indicating that a checkpoint has been generated.
  • the checkpoint log may include the TxID of the transaction that generates the checkpoint, all the LLSNs numbered in the generation of the checkpoint, and the ID of the generated checkpoint.
  • the checkpoint ID can be used to specify the restoration point.
  • the checkpoint ID may be a value representing the time when the checkpoint is generated.
  • The integrated Tx log 209 includes position information 210 and a plurality of Tx logs 201A, 201B, ... read from the plurality of log buffers 113CA and 113CB, respectively.
  • The position information 210 is at least a part of the header information in the integrated Tx log 209 and includes a plurality of Tx start positions 211A, 211B, ... respectively corresponding to the plurality of Tx logs 201A, 201B, ....
  • The Tx start position 211 is information (for example, an address) that is a position in the integrated Tx log 209 and that indicates the start position of the corresponding Tx log 201.
  • The integrated log management unit 105S can identify the positions of the plurality of Tx logs 201A, 201B, ... in the integrated Tx log 209 by referring to the plurality of Tx start positions 211A, 211B, ....
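  • The layout described above (position information followed by sequentially arranged Tx logs) can be sketched as follows. The byte layout is an assumption made for illustration: a little-endian 32-bit count, then one 32-bit start offset per Tx log, then the Tx log bodies. The document only states that the header holds a start position for each Tx log, and the function names are hypothetical.

```python
import struct


def build_integrated_log(tx_logs: list[bytes]) -> bytes:
    """Pack Tx logs behind a header of start positions (assumed layout)."""
    header_size = 4 + 4 * len(tx_logs)  # count field + one offset per log
    positions = []
    offset = header_size
    for log in tx_logs:
        positions.append(offset)        # Tx start position inside the blob
        offset += len(log)
    header = struct.pack(f"<I{len(tx_logs)}I", len(tx_logs), *positions)
    return header + b"".join(tx_logs)


def read_tx_logs(integrated: bytes) -> list[bytes]:
    """Locate each Tx log from the start positions in the header."""
    (count,) = struct.unpack_from("<I", integrated, 0)
    positions = list(struct.unpack_from(f"<{count}I", integrated, 4))
    ends = positions[1:] + [len(integrated)]
    return [integrated[start:end] for start, end in zip(positions, ends)]


logs = [b"tx-A", b"tx-BB"]
blob = build_integrated_log(logs)
restored = read_tx_logs(blob)
```

  • Since each start position is derived from the sizes of the preceding Tx logs, a reader can jump straight to any Tx log without parsing the ones before it.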
  • Fig. 3 shows an example of log output.
  • the local log file 211C is stored in the local log PDEV 121C.
  • the Tx log 201 is stored in the local log file 211C.
  • a plurality of Tx logs 201 are arranged sequentially.
  • the integrated log file 219S is stored in the integrated log PDEV 134S.
  • the integrated Tx log 209 is stored in the integrated log file 219S.
  • a plurality of integrated Tx logs 209 are arranged sequentially.
  • FIG. 4 is a flowchart of the Tx process.
  • In the following description, the thread that executes transaction A is the processing thread 107CA, the DB partitions updated by transaction A are the DB partitions 109CA and 109CB, and the local log files corresponding to the DB partitions 109CA and 109CB are the local log files 211CA and 211CB, respectively.
  • When transaction A is started, the processing thread 107CA generates a reference/update set for each of the DB partitions 109CA and 109CB based on an instruction corresponding to transaction A (an instruction in the query) (S301).
  • The reference/update set is a set of record references (read requests for a partition) and record updates (write requests for a partition).
  • The reference/update set is a request set for updating the partitions, but at the time of S301 the DB partitions 109CA and 109CB are not changed; the reference/update set is held in a local memory area (an area, not shown, allocated in the main memory 461) corresponding to transaction A.
  • The processing thread 107CA makes a commit determination (S302).
  • The commit determination is made according to the isolation level of the database; for example, it determines whether the changes that transaction A makes to the DB partitions 109CA and 109CB based on the reference/update set are consistent with other transactions.
  • the processing thread 107CA executes a log output process (S304).
  • the processing thread 107CA updates the DB partitions 109CA and 109CB based on the reference / update set (S305), issues a commit completion notification to the query issuer (S306), and ends the transaction.
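  • The order of steps in the Tx process (stage the changes, decide on commit, output the log, then update the partitions) can be illustrated with the sketch below. The function `run_transaction`, its callbacks, and the "aborted" result for a failed commit determination are assumptions added for illustration; the document does not describe what happens when the commit determination fails.

```python
def run_transaction(partitions, updates, is_consistent, log_output):
    """Hypothetical rendering of S301 to S306 for one transaction."""
    # S301: build the reference/update set in a local area; the DB
    # partitions themselves are not touched yet.
    staged = {(pid, key): value for pid, key, value in updates}

    # S302: commit determination (the real check depends on the isolation
    # level; here it is delegated to a caller-supplied predicate).
    if not is_consistent(staged):
        return "aborted"  # assumption: behaviour not stated in the source

    # S304: output the Tx log before any partition is changed.
    log_output(staged)

    # S305: apply the staged updates to the DB partitions.
    for (pid, key), value in staged.items():
        partitions[pid][key] = value

    # S306: report commit completion to the query issuer.
    return "committed"


partitions = {"109CA": {}, "109CB": {}}
logged = []
result = run_transaction(
    partitions,
    [("109CA", "x", 1), ("109CB", "y", 2)],
    is_consistent=lambda staged: True,
    log_output=logged.append,
)
```

  • Writing the log before S305 means the update history exists before the in-memory partitions change, which is the ordering FIG. 4 describes.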
  • FIG. 5 is a flowchart of the log output process.
  • the LLSN management unit of the local log file 211CA associated with the processing thread 107CA executing the transaction A is the LLSN management unit 111CA.
  • the processing thread 107CA acquires the log file address from the LLSN management unit 111CA, and adds the log size of the transaction A to the log file address of the LLSN management unit 111CA (S401).
  • The processing thread 107CA numbers the LLSN of the DB partition 109CA or 109CB updated by executing transaction A (S402).
  • If there is an updated DB partition whose LLSN has not yet been numbered (S403: Yes), the processing thread 107CA performs S402 on the unnumbered DB partition. On the other hand, if the LLSNs of all the updated DB partitions 109CA and 109CB have been numbered (S403: No), the processing thread 107CA generates the Tx log 201, writes the Tx log 201 to the log buffer 113CA, and issues a write request for the Tx log 201 (a write request specifying the log file address acquired from the LLSN management unit 111CA) (S404). The processing thread 107CA receives a write completion notification from the local log PDEV 121CA via the PDEV I/F 119C (S405).
  • If the mode is the log synchronous mode (S406: Yes), the processing thread 107CA executes the integrated Tx log writing process (S407), and ends the log output process when the integrated Tx log writing process is completed.
  • If the mode is the log asynchronous mode (S406: No), the processing thread 107CA ends the log output process without executing the integrated Tx log writing process.
  • the “log synchronous mode” is a mode in which the log output process ends when the integrated Tx log write process is executed in the log output process and the integrated Tx log write process ends.
  • the “log asynchronous mode” is a mode in which the integrated Tx log writing process is executed asynchronously with the log output process.
  • the current mode may be registered in the memory 117C.
  • Whether the mode is the log synchronous mode or the log asynchronous mode may be set or changed by the user, or may be changed dynamically depending on the status of the active server 100C.
  • The synchronous mode is used for systems that handle information that must not be lost, such as the bank account information handled by bank ATMs, that is, in cases where reliability is more important than performance.
  • In the synchronous mode, the transaction logs are completely identical between the active system and the standby system. Therefore, even if a failure occurs at a single point of failure, it can be reliably determined, by referring to either log disk, up to which transaction processing has completed, and processed data is not lost.
  • The asynchronous mode is used in systems that place importance on response time, even though a failure may leave some logs that could not be copied to the standby system.
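  • The log output process and the two modes can be sketched as below. The `TxLog` dataclass, the plain-dictionary LLSN counters, and the in-memory lists standing in for the log buffer and the local log file are all illustrative assumptions; S401 (reserving a log file address) is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class TxLog:
    tx_id: int
    llsns: dict    # partition id -> LLSN numbered for that partition
    history: list  # record change history


def log_output(tx_id, history, updated, counters, log_buffer, local_log,
               sync_mode, write_integrated):
    """Hypothetical rendering of S402 to S407 for one transaction."""
    llsns = {}
    for pid in updated:           # S402/S403: number every updated partition
        counters[pid] += 1
        llsns[pid] = counters[pid]
    tx_log = TxLog(tx_id, llsns, history)
    log_buffer.append(tx_log)     # S404: write to the thread's log buffer
    local_log.append(tx_log)      # stands in for the PDEV write and S405
    if sync_mode:                 # S406
        write_integrated()        # S407: finish before returning
    return tx_log


counters = {"109CA": 0, "109CB": 0}
buffer_a, local_file_a, calls = [], [], []
first = log_output(7, ["x=1"], ["109CA", "109CB"], counters, buffer_a,
                   local_file_a, sync_mode=True,
                   write_integrated=lambda: calls.append("integrated"))
second = log_output(8, ["x=2"], ["109CA"], counters, buffer_a,
                    local_file_a, sync_mode=False,
                    write_integrated=lambda: calls.append("integrated"))
```

  • In the synchronous call the integrated write completes inside `log_output`; in the asynchronous call it is left to a separate trigger, which is why only one entry ends up in `calls`.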
  • FIG. 6 is a flowchart of the integrated Tx log writing process.
  • In the log synchronous mode, the integrated Tx log writing process starts when the processing thread 107CA executing transaction A invokes the integrated log management unit 105C.
  • In the log asynchronous mode, the integrated Tx log writing process starts at a preset interval or when a predetermined event occurs.
  • the integrated log management unit 105C polls all the log buffers 113C and determines whether Tx logs are stored in two or more log buffers 113C (for example, all log buffers 113C) (S501). If the determination result in S501 is negative (S501: No), the integrated Tx log writing process ends.
  • If the determination result in S501 is affirmative (S501: Yes), the integrated log management unit 105C performs S502. That is, it acquires all Tx logs from all log buffers 113C in which Tx logs are stored.
  • The integrated log management unit 105C determines a Tx start position 211 (a position (address) in the integrated Tx log) for each Tx log based on the Tx log sizes of the acquired Tx logs, and generates position information 210 including the plurality of Tx start positions 211 respectively corresponding to the Tx logs.
  • the integrated log management unit 105C generates an integrated Tx log 209 including the generated position information 210 and a plurality of acquired and sequentially arranged Tx logs.
  • the integrated log management unit 105C writes the generated integrated Tx log 209 in the integrated log buffer 115C.
  • the integrated log management unit 105C transfers the integrated Tx log 209 (the integrated Tx log 209 generated in S502) in the integrated log buffer 115C to the standby server 100S through the network I / F 120C (S503).
  • the standby server 100S receives the integrated Tx log 209, and the integrated log management unit 105S writes the received integrated Tx log 209 in the integrated log file 219S in the integrated log PDEV 134S (S504).
  • the integrated log management unit 105S notifies the active server 100C of the reception completion (S505).
  • The active server 100C receives the notification of reception completion. In response to the notification, the integrated log management unit 105C clears the integrated Tx log 209 from the integrated log buffer 115C and notifies the processing threads 107C (at least the processing threads 107C corresponding to the log buffers 113C from which Tx logs were acquired in S502) of the completion of log writing in the standby server 100S (S506).
  • the processing thread 107C that has received the notification clears the Tx log from the log buffer 113C corresponding to the processing thread 107C (S507).
  • the processing thread 107C writes the Tx log 201 to the log buffer 113C corresponding to the processing thread 107C.
  • A dedicated log buffer 113C is provided for each processing thread 107C. Therefore, in the active server 100C, a plurality of Tx logs can be written to the plurality of log buffers 113C in parallel, and there is no contention for writing to the log buffers 113C. Further, since the integrated Tx log is generated by reading a plurality of Tx logs from the plurality of log buffers 113C and is written to the integrated log buffer 115C, there is no contention when the integrated Tx log is generated or written.
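  • A minimal sketch of the integrated Tx log writing process (S501 to S507) follows. Each per-thread log buffer is modelled as a Python list and the transfer to the standby server as a callback that returns True on reception completion; both are assumptions for illustration, and the position information is left out to keep the sketch short.

```python
def integrated_log_write(log_buffers, send_to_standby):
    """Hypothetical rendering of S501 to S507 over per-thread buffers."""
    # S501: poll every buffer; proceed only if two or more hold Tx logs.
    non_empty = [buf for buf in log_buffers if buf]
    if len(non_empty) < 2:
        return False

    # S502: gather all Tx logs from every non-empty buffer, in sequence.
    integrated = [tx_log for buf in non_empty for tx_log in buf]

    # S503 to S505: transfer as one unit and wait for reception completion.
    if not send_to_standby(integrated):
        return False

    # S506/S507: after the acknowledgement, clear the source buffers.
    # Simplification: a real system must not drop logs appended meanwhile.
    for buf in non_empty:
        buf.clear()
    return True


standby_file = []


def send(integrated):
    standby_file.append(list(integrated))  # stands in for S504
    return True                            # stands in for S505


buffers = [[b"tx1"], [], [b"tx2", b"tx3"]]
done = integrated_log_write(buffers, send)
skipped = integrated_log_write([[b"tx4"], []], send)
```

  • Only the single integrated unit crosses the network, so the per-thread buffers never compete for the transfer path.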
  • A plurality of Tx logs are serially arranged in the integrated Tx log, and the single integrated Tx log is transferred from the active server 100C to the standby server 100S.
  • Example 2 will be described. At that time, differences from the first embodiment will be mainly described, and description of common points with the first embodiment will be omitted or simplified.
  • In the first embodiment, the integrated Tx log includes Tx logs that the processing thread 107S does not need to refer to, specifically, Tx logs that include the update history of a DB partition 109S different from the DB partition 109S corresponding to the processing thread 107S.
  • Nevertheless, the processing thread 107S of each DB partition 109S refers to all the Tx logs of the integrated Tx log. For this reason, it takes time to restore the DB partition 109S. Particularly in an in-memory database, it is desirable to shorten the database restoration time as much as possible.
  • In the second embodiment, when reflecting the integrated Tx log, the processing thread refers only to the Tx logs related to the update of the DB partition corresponding to the processing thread among the plurality of Tx logs in the integrated Tx log, and can skip the Tx logs that are not related to the update of that DB partition. For this reason, the database restoration time can be shortened.
  • Example 2 will be described in detail.
  • elements different from those in the first embodiment are denoted by reference numerals different from those in the first embodiment.
  • FIG. 7 shows the configuration of the database system according to the second embodiment.
  • the DBMS 1010C manages the partition access map 151C for each processing thread 1070C.
  • the partition access map 151C is information indicating the DB partition 109C updated by the corresponding processing thread 1070C.
  • FIG. 8 shows the data structure of the integrated Tx log according to the second embodiment.
  • The position information 2100 of the integrated Tx log 2090 includes a partition access map 151C in addition to the Tx start position 211 for each Tx log 201.
  • The processing thread (processing thread for log expansion processing) 1070SA determines whether or not the partition access map 151CA indicates an update of the DB partition 109SA. When the determination result is affirmative, the processing thread 1070SA updates the DB partition 109SA according to the Tx log 201A specified from the Tx start position 211A. On the other hand, if the determination result is negative, the processing thread 1070SA skips reading the Tx start position 211A (Tx log 201A).
  • FIG. 9 shows the configuration of the partition access map 151C.
  • the partition access map 151C is a bitmap.
  • the partition access map 151C is composed of a plurality of bits respectively corresponding to the plurality of DB partitions 109C. Bit “1” means that the DB partition 109 corresponding to the bit has been updated. Bit “0” means that the DB partition 109 corresponding to the bit has not been updated.
  • the bit map is an example of the partition access map 151C.
  • the partition access map 151C may have another configuration, for example, a list of updated DB partition 109C IDs.
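  • The bitmap form of the partition access map can be sketched with an integer whose bit i stands for the i-th DB partition. The helper names and the bit-to-partition numbering are assumptions for illustration.

```python
def mark_updated(access_map: int, partition_index: int) -> int:
    """Set the bit for a partition that the transaction updated."""
    return access_map | (1 << partition_index)


def is_updated(access_map: int, partition_index: int) -> bool:
    """Return True if the bit for the partition is 1."""
    return bool((access_map >> partition_index) & 1)


# Assumption: index 0 stands for DB partition 109CA, index 1 for 109CB.
access_map = mark_updated(0, 1)
```

  • A list of updated partition IDs, the alternative mentioned above, carries the same information; the bitmap simply makes the per-partition check a single bit test.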
  • FIG. 10 is a flowchart of the log output process according to the second embodiment.
  • For each DB partition 109C updated in the transaction processing, the processing thread 1070C updates to "1" the bit corresponding to that DB partition 109C in the partition access map 151C corresponding to the processing thread 1070C (S441). The rest is the same as in the first embodiment.
  • FIG. 11 is a flowchart of the integrated Tx log writing process according to the second embodiment.
  • S5020 is performed instead of S502.
  • In S5020, all Tx logs are acquired from all log buffers 113C in which Tx logs are stored.
  • The integrated log management unit 1050C determines a Tx start position 211 (a position (address) in the integrated Tx log) for each Tx log based on the Tx log sizes of the acquired Tx logs.
  • The integrated log management unit 1050C generates position information 2100 that includes, in addition to the plurality of Tx start positions 211 respectively corresponding to the plurality of Tx logs included in the integrated Tx log, a plurality of partition access maps 151C respectively corresponding to the plurality of Tx logs.
  • The integrated log management unit 1050C generates an integrated Tx log 2090 that includes the generated position information 2100 and the plurality of acquired and sequentially arranged Tx logs.
  • The integrated log management unit 1050C writes the generated integrated Tx log 2090 to the integrated log buffer 115C. The rest is the same as in the first embodiment.
  • FIG. 12 is a flowchart of the integrated Tx log reflection process.
  • The integrated Tx log reflection process may be started, for example, when the administrator of the standby server 100S gives an instruction to start the integrated Tx log reflection process, or when an integrated Tx log is stored in the integrated log PDEV 134S.
  • the plurality of processing threads (log development processing thread) 1070S are executed in parallel, for example, and the following processing is performed for each processing thread 1070S.
  • the processing thread 1070S reads the integrated Tx log from the integrated log PDEV 134S to the work area of the processing thread 1070S (S1201).
  • Each processing thread 1070S refers to the first partition access map 151C in the integrated Tx log stored in the work area, and determines whether the partition access map 151C indicates the update of the DB partition 109S (S1202).
  • If the result of the determination in S1202 is affirmative, the processing thread 1070S performs a reflection process of the Tx log 201 specified from the Tx start position 211x corresponding to the partition access map 151C (S1203). Specifically, the processing thread 1070S refers to the identified Tx log 201, and updates the DB partition 109Sk according to the referenced Tx log 201k.
  • If the result of the determination in S1202 is negative, the processing thread 1070S skips the Tx start position 211 (Tx log 201) corresponding to the referenced partition access map 151C (S1204).
  • The processing thread 1070S determines whether or not the partition access map 151C referenced in the immediately preceding S1202 is the last partition access map 151C (S1205).
  • If the result of the determination in S1205 is negative, the processing thread 1070S performs S1202 for the next partition access map 151C.
  • the processing thread 1070S selects only the Tx log related to the update of the DB partition corresponding to the processing thread 1070S among the plurality of Tx logs in the integrated Tx log.
  • the Tx log that is not related to the update of the DB partition corresponding to the processing thread 1070S can be skipped. For this reason, the database restoration time can be shortened.
  • the Tx log may be deleted from the log buffer 113C when it is written from the log buffer 113C to the local log PDEV 121C.
  • In the integrated Tx log writing process, the integrated log management unit 105C may acquire a Tx log to be included in the integrated Tx log (a Tx log that has not yet been included in an integrated Tx log) from the local log PDEV 121C.
  • LLSNs may be numbered when the set of transactions executed between checkpoints is at least the first type transaction set out of a first type transaction set and a second type transaction set.
  • the first type transaction set may be a set of transactions whose results change depending on the transaction execution order. For example, according to the partial order, a plurality of transactions that update the same record must be executed in a defined order, but a plurality of transactions that update different records may be executed in any order.
  • the second type transaction set may be a transaction set in which the execution order of transactions does not affect the result. Whether the transaction to be executed belongs to the first type or the second type transaction set may be determined from one or a plurality of query execution plans, for example.
  • the LLSN (order) managed by the LLSN management unit 111C may be updated each time a Tx log of a transaction in which the DB partition 109C corresponding to the LLSN management unit 111C is updated is generated.
  • For a transaction that updates N DB partitions 109C among the plurality of DB partitions 109C (N is an integer of 2 or more), M Tx logs may be generated (M is an integer of 1 or more and N or less).
  • At least one of the M Tx logs may include two or more LLSNs among the N LLSNs respectively corresponding to the N DB partitions 109C.
  • the Tx log including one LLSN is written to the log buffer 113C corresponding to the LLSN and includes two or more LLSNs.
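The second-embodiment flow described above (a partition access map per Tx log, checked in S1202 and used to reflect or skip in S1203 and S1204) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the bit assignment `PARTITION_BIT`, the dictionary layout of a Tx log, and the function names are all hypothetical.

```python
# Hypothetical bit assignment: one bit per DB partition in the access map.
PARTITION_BIT = {"109CA": 0, "109CB": 1}


def access_map(updated_partitions):
    """Build a bitmap with a 1 for every partition the transaction updated."""
    bits = 0
    for partition_id in updated_partitions:
        bits |= 1 << PARTITION_BIT[partition_id]
    return bits


def build_integrated_tx_log(tx_logs):
    """Pair every Tx log's start position with its partition access map."""
    position_info = [
        (start, access_map(log["changes"].keys()))
        for start, log in enumerate(tx_logs)
    ]
    return {"position_info": position_info, "tx_logs": list(tx_logs)}


def reflect(integrated, partition_id, partition):
    """Apply only the Tx logs whose access map marks this partition."""
    mask = 1 << PARTITION_BIT[partition_id]
    applied = 0
    for start, bitmap in integrated["position_info"]:
        if not bitmap & mask:
            continue  # corresponds to the skip in S1204
        # corresponds to the reflection in S1203
        partition.update(integrated["tx_logs"][start]["changes"][partition_id])
        applied += 1
    return applied


tx_logs = [
    {"tx_id": 1, "changes": {"109CA": {"k": 1}}},
    {"tx_id": 2, "changes": {"109CB": {"k": 2}}},
    {"tx_id": 3, "changes": {"109CA": {"k": 3}, "109CB": {"j": 9}}},
]
integrated = build_integrated_tx_log(tx_logs)
standby_a, standby_b = {}, {}
applied_a = reflect(integrated, "109CA", standby_a)
applied_b = reflect(integrated, "109CB", standby_b)
```

Each standby thread reads only the small position information to decide which Tx logs to open, which is the source of the shorter restoration time the text mentions.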

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

In this database system, an active system server has a first execution unit, which executes a plurality of first sub-execution units in a parallel manner, and an integrated log management unit. The plurality of first sub-execution units executed in a parallel manner perform a plurality of transactions with respect to a first database managed by the active system server, generate a plurality of logs, each corresponding to a respective one of the plurality of transactions, and write each of the generated plurality of logs to a corresponding one of a plurality of log storage regions. The integrated log management unit reads the plurality of logs from the plurality of log storage regions, generates an integrated log including the read plurality of logs, and transfers the generated integrated log to a standby system server.

Description

Database system and database management method
The present invention generally relates to database management, for example, to the output of transaction processing logs.
In database management, a transaction processing log (Tx log) is generally output. Patent Document 1, for example, is known as a technique of this type. In Patent Document 1, a plurality of clients in an application program each pass a Tx log to a single log management system, and the log management system stores the plurality of logs on a single log disk. Both the plurality of clients and the log management system reside in one computing device.
When a database is applied to a system such as that of a financial institution, an HA (High Availability) configuration is desirable, that is, a database system that includes an active server and a standby server. In an HA configuration, the Tx log must be transferred from the active server to the standby server. Patent Document 2, for example, is known as a technique of this type. In Patent Document 2, a plurality of Tx logs generated in the active server are transferred to the standby server in parallel.
US2012/0078854
US 7,305,421
In both Patent Documents 1 and 2, contention can occur when a Tx log is written or transferred, so a lock must be acquired. Specifically, in Patent Document 1, a client must acquire a lock before writing a Tx log to the single log management system. In Patent Document 2, particularly when the number of Tx logs to be transferred exceeds the transfer parallelism (the number of Tx logs that can be transferred simultaneously from the active server to the standby server), a Tx log must be transferred after a lock on the transfer path from the active server to the standby server has been acquired.
Thus, contention can arise both in serial log writing as in Patent Document 1 and in parallel log transfer as in Patent Document 2, so a lock must be acquired.
The active server has a first execution unit that executes a plurality of first sub-execution units in parallel, and an integrated log management unit. The plurality of first sub-execution units executed in parallel execute a plurality of transactions on a first database managed by the active server, generate a plurality of logs respectively corresponding to the plurality of transactions, and write the generated logs to a plurality of log storage areas, respectively. The integrated log management unit reads the plurality of logs from the plurality of log storage areas, generates an integrated log including the read logs, and transfers the generated integrated log to the standby server.
No contention occurs in either log writing or log transfer, so log writing and transfer can be performed at high speed.
FIG. 1 shows the configuration of a database system according to the first embodiment.
FIG. 2 shows the data structures of the Tx log and the integrated Tx log according to the first embodiment.
FIG. 3 shows an example of log output.
FIG. 4 is a flowchart of Tx processing.
FIG. 5 is a flowchart of the log output process according to the first embodiment.
FIG. 6 is a flowchart of the integrated Tx log writing process according to the first embodiment.
FIG. 7 shows the configuration of a database system according to the second embodiment.
FIG. 8 shows the data structure of the integrated Tx log according to the second embodiment.
FIG. 9 shows the configuration of a partition access map.
FIG. 10 is a flowchart of the log output process according to the second embodiment.
FIG. 11 is a flowchart of the integrated Tx log writing process according to the second embodiment.
FIG. 12 is a flowchart of the integrated Tx log reflection process.
Several embodiments are described below with reference to the drawings.
In the following description, “PDEV” denotes a physical storage device, typically a nonvolatile storage device (for example, an auxiliary storage device). The PDEV may be, for example, an HDD (Hard Disk Drive) or an SSD (Solid State Drive).
In the following description, a “storage unit” may be one or more storage devices including a memory. For example, the storage unit may be at least the main storage device out of a main storage device (typically a volatile memory) and an auxiliary storage device (typically a nonvolatile storage device).
In the following description, processing may be described with a functional unit (for example, a query reception unit, a query execution plan generation unit, a query execution unit, an integrated log management unit, or an LLSN management unit) as the subject. Because a functional unit performs its defined processing when a program is executed by a processor (for example, a CPU (Central Processing Unit)), using a storage unit (for example, a memory) and/or an interface device (for example, a communication port) as appropriate, the subject of the processing may instead be the processor. Processing described with a functional unit as the subject may be processing performed by a processor, or by an apparatus or system having that processor. The processor may include a hardware circuit that performs part or all of the processing. At least some of the functional units may be realized by hardware circuits. A program may be installed from a program source onto an apparatus such as a computer. The program source may be, for example, a program distribution server or a computer-readable storage medium. When the program source is a program distribution server, the program distribution server may include a processor (for example, a CPU) and a storage unit, and the storage unit may store a distribution program and the program to be distributed. By executing the distribution program, the processor of the program distribution server may distribute the program to be distributed to other computers. In the following description, two or more functional units may be realized as one functional unit, and one functional unit may be realized as two or more functional units.
In the following description, when elements of the same kind are described without distinction, the common part of their reference signs is used (for example, DBMS 101); when they are described with distinction, the full reference signs are used (for example, DBMS 101C and 101S).
Hereinafter, the reference signs of the active server and its components include “C”, denoting the active system, and the reference signs of the standby server and its components include “S”, denoting the standby system.
FIG. 1 shows the configuration of a database system according to the first embodiment.
The database system has an active server 100C and a standby server 100S. An in-memory database is realized in each of the active server 100C and the standby server 100S. That is, the memory 117C of the active server 100C stores the DB partitions 109CA and 109CB constituting the DB, and the memory 117S of the standby server 100S stores the DB partitions 109SA and 109SB, which correspond to the DB partitions 109CA and 109CB, respectively, and constitute the DB. In this embodiment the DB is divided into two DB partitions, but it may be divided into three or more. While the active server 100C and the standby server 100S are synchronized, the DB partitions 109SA and 109SB have the same contents as (are replicas of) the DB partitions 109CA and 109CB.
The following description mainly takes the active server 100C as an example. At least part of the description of the active server 100C is applicable to the standby server 100S. The elements illustrated for the standby server 100S are those related to its operation as a standby server. However, because the standby server 100S operates as the active server after a failover from the active server 100C, it can have the same functional units as the active server 100C, although they are not illustrated.
The active server 100C has a PDEV I/F (interface device) 119C, a network I/F 120C, a memory 117C, and a processor 118C connected to them. A plurality of local log PDEVs 121C are connected to the PDEV I/F 119C. A local log PDEV 121C is a PDEV that stores a local log file (described later). Local log files (for example, local log PDEVs 121C) and DB partitions 109C may correspond 1:1. The standby server 100S is connected to the network I/F 120C via a communication medium such as a communication network.
The memory 117C is an example of a storage unit and includes at least the main memory out of a main memory (for example, a volatile memory such as a DRAM (Dynamic Random Access Memory)) and an auxiliary memory (for example, a nonvolatile memory such as a flash memory).
The DBMS 101C is realized by the processor 118C executing a DBMS (Database Management System) program. The DBMS 101C has a query reception unit 102C, a query execution plan generation unit 103C, a dictionary 104C, an integrated log management unit 105C, and a query execution unit 106C. The DBMS 101C manages the integrated log buffer 115C. The DBMS 101C also has an LLSN management unit 111C for each DB partition 109C and manages a log buffer 113C provided for each DB partition 109C. The processor 118C executes an OS (Operating System) 116C, and the DBMS 101C runs on the OS 116C.
The log buffer 113C temporarily stores Tx logs containing the update history of the corresponding DB partition 109C. In this embodiment, a Tx log written to the log buffer 113C is not deleted from the log buffer 113C even after it is written to the local log PDEV 121C. It remains in the log buffer 113C at least until it is acquired by the integrated log management unit 105C (until it is included in an integrated Tx log). Processing threads 107C and log buffers 113C may correspond 1:1.
The LLSN management unit 111C is an example of an order management unit and manages LLSNs. “LLSN” stands for local log sequence number. An LLSN is a number that is unique within one DB partition 109C. An LLSN is assigned when a Tx log is output.
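As a minimal sketch of the per-partition numbering described above, the following keeps one counter per DB partition so that numbers are unique within a partition but may repeat across partitions. The class name `LLSNManager`, its method, and the choice to start numbering at 1 are assumptions made for illustration only.

```python
class LLSNManager:
    """Hypothetical stand-in for one LLSN management unit (one per DB partition)."""

    def __init__(self):
        self._last = 0  # starting value is an assumption of this sketch

    def next_llsn(self):
        # Called when a Tx log for this partition is output.
        self._last += 1
        return self._last


# One manager per DB partition, keyed by the reference signs used in the text.
llsn_managers = {"109CA": LLSNManager(), "109CB": LLSNManager()}

issued = [
    ("109CA", llsn_managers["109CA"].next_llsn()),
    ("109CA", llsn_managers["109CA"].next_llsn()),
    ("109CB", llsn_managers["109CB"].next_llsn()),
]
```

Because each counter belongs to a single partition, two partitions can hand out the same number without any coordination between them.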
The dictionary 104C is information indicating the locations of database elements (for example, tables and indexes).
The query reception unit 102C receives a query issued by a query issuer. The query is written in, for example, Structured Query Language (SQL). A plurality of transactions may be described by one query or by a plurality of queries. The query issuer may be a functional unit inside the DBMS 101C or a functional unit outside the DBMS 101C (for example, a client computer, not shown).
The query execution plan generation unit 103C generates, from the query received by the query reception unit 102C, a query execution plan containing one or more database operations needed to execute that query. The query execution plan is, for example, information containing one or more database operations and the relationships among their execution order, and may be stored as query execution plan information. A query execution plan may be represented as a tree structure whose nodes are database operations and whose edges are the execution-order relationships between them. One or more transaction sets can be identified from one query execution plan or from a combination of query execution plans.
The query execution unit 106C executes the query received by the query reception unit 102C according to the query execution plan generated by the query execution plan generation unit 103C, and returns the execution result to the query issuer. In doing so, the query execution unit 106C issues a read request (reference request) for the data needed to execute a database operation, and performs the database operation using the data read from the DB partition 109C in response to that request (for example, it calculates new data from the read data (value) and issues a write request that updates the data in the source record to the calculated data). The query execution unit 106C performs database operations by executing processing threads 107C. That is, a processing thread 107C can execute a database operation while referring to the dictionary 104C as appropriate. In the DBMS 101C, a plurality of processing threads 107C are executed in parallel. For this purpose, the processor 118C has a plurality of cores, which reside in one or more processors 118C. A processing thread 107C may also be called a task. A processing thread 107C may be implemented, for example, as a process or kernel thread provided by the OS 116C, or as a user thread provided by a library or the like. One processing thread 107C may execute one transaction corresponding to one or more database operations. Hereinafter, processing performed by the query execution unit 106C executing a processing thread 107C may be described with the processing thread 107C as the subject.
In executing a transaction, the query execution unit 106C (processing thread 107C) issues to the OS 116C an I/O request directed at the local log PDEV 121C in order to write a Tx log to the local log file in the local log PDEV 121C. The OS 116C accepts the I/O request and issues an I/O request to the local log PDEV 121C.
A plurality of I/O queues (not shown) may be provided in the PDEV I/F 119C. In transaction processing, a processing thread 107C issues an I/O request for writing a Tx log, and that I/O request may be stored in an I/O queue. Specifically, the I/O request may be stored in the I/O queue by the OS 116C.
The local log PDEV 121C stores the local log file. The Tx log that is the write target of an I/O request is recorded in the local log file.
In this embodiment, DB partitions 109C, I/O queues, and local log files may correspond 1:1:1. That is, there may be one I/O queue and one local log file for each DB partition 109C. There may be one or more processing threads 107C per local log file. However, I/O request processing can sometimes be reduced by sending the interrupt that signals completion of an I/O request to a specific processing thread 107C for each I/O queue. In that case it may be advantageous to have, for example, processing threads 107C, DB partitions 109C, and I/O queues correspond 1:1. In this embodiment, to keep the description simple, processing threads 107C may also correspond 1:1 to local log files. For example, the processing thread 107CA issues an I/O request for a Tx log indicating that a record in the DB partition 109CA has been updated, directed at the local log file corresponding to that DB partition 109CA. The issued I/O request is sent to the OS 116C via the log buffer 113CA. The OS 116C receives the I/O request for the local log file and stores it in the I/O queue corresponding to that local log file. The OS 116C then sends the I/O request from the I/O queue to the local log PDEV 121C that stores the destination local log file.
As described above, the processing threads 107CA and 107CB write Tx logs to the log buffers 113CA and 113CB, respectively. The integrated log management unit 105C takes the Tx logs out of the log buffers 113CA and 113CB, generates one integrated Tx log containing them, and writes the integrated Tx log to the integrated log buffer 115C. The integrated log management unit 105C then transfers the integrated Tx log written to the integrated log buffer 115C to the standby server 100S. With this configuration, contention cannot occur when a Tx log is written to a log buffer 113C, when the integrated Tx log is written to the integrated log buffer 115C, or when the integrated Tx log is transferred to the standby server 100S, so no lock needs to be acquired.
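The arrangement described above, in which each processing thread is the only writer of its own log buffer and the integrated log management unit is the only reader, can be sketched as follows. This is an illustration under assumptions rather than the patented mechanism: it uses Python's `collections.deque`, whose appends and pops from either end are thread-safe, and the function names and record layout are hypothetical.

```python
import threading
from collections import deque

# One log buffer per DB partition; the keys reuse the reference signs from the text.
log_buffers = {"109CA": deque(), "109CB": deque()}


def processing_thread(partition_id, tx_count):
    """Writer: the only thread that appends to this partition's log buffer."""
    buffer = log_buffers[partition_id]
    for llsn in range(1, tx_count + 1):
        buffer.append({"partition": partition_id, "llsn": llsn})


def build_integrated_tx_log():
    """Reader: the single consumer, draining every buffer into one integrated log."""
    integrated = []
    for buffer in log_buffers.values():
        while buffer:
            integrated.append(buffer.popleft())
    return integrated


threads = [
    threading.Thread(target=processing_thread, args=(pid, 3)) for pid in log_buffers
]
for t in threads:
    t.start()
for t in threads:
    t.join()

integrated_tx_log = build_integrated_tx_log()
```

No explicit lock appears in the sketch because no two writers ever share a buffer, and only one component assembles and forwards the integrated log.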
The integrated Tx log transferred to and received by the standby server 100S is written by the integrated log management unit 105S, through the integrated log buffer 115S, to the integrated log PDEV 134S (the PDEV in which integrated Tx logs are stored). The processing thread 107S for each DB partition 109S (a processing thread for log expansion processing) then copies the integrated Tx log from the integrated log PDEV 134S to the memory 117S. Based on the position information in the integrated Tx log, the processing thread 107S refers to the Tx logs in the integrated Tx log from the first to the last and applies the updates indicated by the record change histories 204 to the corresponding DB partition 109S in the order of the LLSNs written in the integrated Tx log.
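The standby-side reflection described above can be illustrated with a short sketch in which one function plays the role of the processing thread for one DB partition. The dictionary layout of a Tx log and the explicit sort by LLSN are assumptions of this sketch. The text only states that updates are applied in LLSN order.

```python
def reflect_integrated_tx_log(integrated_tx_log, partition_id, partition):
    """Apply to one standby partition the changes that concern it, in LLSN order."""
    relevant = [log for log in integrated_tx_log if partition_id in log["llsns"]]
    # Sorting by this partition's LLSN is an assumption of the sketch.
    for log in sorted(relevant, key=lambda entry: entry["llsns"][partition_id]):
        for key, value in log["changes"].get(partition_id, {}).items():
            partition[key] = value
    return partition


# Hypothetical integrated Tx log whose entries arrive out of LLSN order.
integrated_tx_log = [
    {"tx_id": 2, "llsns": {"109SA": 2}, "changes": {"109SA": {"balance": 70}}},
    {
        "tx_id": 1,
        "llsns": {"109SA": 1, "109SB": 1},
        "changes": {"109SA": {"balance": 100}, "109SB": {"owner": "x"}},
    },
]

partition_sa = reflect_integrated_tx_log(integrated_tx_log, "109SA", {})
partition_sb = reflect_integrated_tx_log(integrated_tx_log, "109SB", {})
```

Applying by LLSN means the later update to `balance` wins on partition 109SA even though its Tx log appears first in the list.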
FIG. 2 shows the data structures of the Tx log and the integrated Tx log.
There may be one Tx log 201 per transaction. The Tx log 201 contains a TxID 202, which is the ID of the transaction; one or more LLSNs 203 assigned during execution of the transaction; and a record change history 204, which is information representing the update history of the transaction. The number of LLSNs 203 contained in the Tx log 201 equals the number of DB partitions 109C updated in the execution of the corresponding transaction.
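The three fields of the Tx log described above map naturally onto a small record type. The sketch below is illustrative only: the field names, the keying of LLSNs by partition ID, and the tuple layout of a record change are assumptions, while the check in `__post_init__` encodes the stated rule that the number of LLSNs equals the number of updated DB partitions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class TxLog:
    tx_id: int                                  # TxID 202
    llsns: Dict[str, int]                       # LLSN 203, one per updated partition
    record_changes: List[Tuple[str, str, Any]]  # record change history 204

    def __post_init__(self):
        # Each change is (partition ID, record key, new value) in this sketch.
        updated = {partition for partition, _key, _value in self.record_changes}
        if updated != set(self.llsns):
            raise ValueError("expected exactly one LLSN per updated DB partition")


# A transaction that updated two partitions carries two LLSNs.
log_a = TxLog(
    tx_id=1,
    llsns={"109CA": 5, "109CB": 9},
    record_changes=[("109CA", "r1", 10), ("109CB", "r7", 20)],
)
```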
A checkpoint log may also exist as a Tx log. A checkpoint log may be a log recording that a checkpoint was generated. It may contain the TxID of the transaction that generates the checkpoint, all LLSNs assigned in generating the checkpoint, and the ID of the generated checkpoint. When the database is restored, the checkpoint ID can be used to identify the point in time to restore to. The checkpoint ID may be a value representing the time at which the checkpoint was generated.
The integrated Tx log 209 contains position information 210 and a plurality of Tx logs 201A, 201B, … read from the log buffers 113CA and 113CB, respectively. The position information 210 is at least part of the header information of the integrated Tx log 209 and contains a plurality of Tx start positions 211A, 211B, … corresponding respectively to the Tx logs 201A, 201B, … contained in the integrated Tx log 209. A Tx start position 211 is information (for example, an address) representing the position in the integrated Tx log 209 at which the corresponding Tx log 201 starts. In the standby server 100S, the integrated log management unit 105S can identify the position of each of the Tx logs 201A, 201B, … in the integrated Tx log 209 by referring to the Tx start positions 211A, 211B, ….
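One way to picture the position information is as a header of byte offsets placed in front of the concatenated Tx logs. The following sketch assumes a layout that the text does not specify: a little-endian 32-bit count followed by one 32-bit start offset per Tx log. The function names are hypothetical.

```python
import struct


def build_integrated_log(tx_logs):
    """Prefix the concatenated Tx logs with a header of their start offsets."""
    header_size = 4 + 4 * len(tx_logs)  # count field plus one offset per log
    starts = []
    offset = header_size
    for log in tx_logs:
        starts.append(offset)
        offset += len(log)
    header = struct.pack(f"<I{len(tx_logs)}I", len(tx_logs), *starts)
    return header + b"".join(tx_logs)


def read_tx_log(integrated, index):
    """Locate one Tx log through the start positions, without scanning the others."""
    (count,) = struct.unpack_from("<I", integrated, 0)
    starts = struct.unpack_from(f"<{count}I", integrated, 4)
    end = starts[index + 1] if index + 1 < count else len(integrated)
    return integrated[starts[index]:end]


integrated = build_integrated_log([b"txA", b"txBB"])
first = read_tx_log(integrated, 0)
second = read_tx_log(integrated, 1)
```

Because each start offset is known up front, a reader can jump straight to the Tx log it needs, which is what lets the standby side identify each Tx log's position from the header alone.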
 図3は、ログ出力の一例を示す。 Fig. 3 shows an example of log output.
 ローカルログPDEV121Cに、ローカルログファイル211Cが格納される。ローカルログファイル211Cに、Txログ201が格納される。ローカルログファイル211Cにおいて、シーケンシャルに、複数のTxログ201が並ぶ。 The local log file 211C is stored in the local log PDEV 121C. The Tx log 201 is stored in the local log file 211C. In the local log file 211C, a plurality of Tx logs 201 are arranged sequentially.
 同様に、統合ログPDEV134Sに、統合ログファイル219Sが格納される。統合ログファイル219Sに、統合Txログ209が格納される。統合ログファイル219Sにおいて、シーケンシャルに、複数の統合Txログ209が並ぶ。 Similarly, the integrated log file 219S is stored in the integrated log PDEV 134S. The integrated Tx log 209 is stored in the integrated log file 219S. In the integrated log file 219S, a plurality of integrated Tx logs 209 are arranged sequentially.
The processing performed in this embodiment is described below.
FIG. 4 is a flowchart of the Tx processing. The following description takes a single transaction A as an example. Accordingly, the thread that executes transaction A is the processing thread 107CA, the DB partitions updated by transaction A are the DB partitions 109CA and 109CB, and the local log files corresponding to the DB partitions 109CA and 109CB are the local log files 211CA and 211CB, respectively.
When transaction A starts, the processing thread 107CA generates a reference/update set for each of the DB partitions 109CA and 109CB based on the instruction corresponding to transaction A (an instruction in the query) (S301). A reference/update set is a set of record references (read requests to a partition) and record updates (write requests to a partition). Although the reference/update set is a set of requests for updating a partition, the DB partitions 109CA and 109CB are not changed at the time of S301; the reference/update set is held in a local memory area corresponding to transaction A (an area secured in the main memory 461, not shown).
Next, the processing thread 107CA performs a commit determination (S302). The commit determination checks, for example, in accordance with the isolation level of the database, whether the changes that transaction A would make to the DB partitions 109CA and 109CB based on the reference/update set remain consistent with other transactions.
If the commit determination is NG (S303: No), the processing thread 107CA performs abort processing (S307).
If the commit determination is OK (S303: Yes), the processing thread 107CA executes the log output processing (S304). Next, the processing thread 107CA updates the DB partitions 109CA and 109CB based on the reference/update set (S305), issues a commit completion notification to the query issuer (S306), and ends the transaction.
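The flow of S301 to S307 can be sketched roughly as below. This is only an illustration under assumptions: the function name `run_transaction`, the dictionary-based partitions, and the `commit_ok` and `log_sink` callables are hypothetical stand-ins for the isolation-level check and the log output processing, neither of which is specified in code form here.

```python
def run_transaction(partitions, writes, commit_ok, log_sink):
    """Sketch of S301-S307 for one transaction.

    partitions: dict partition_id -> dict of records (the DB partitions)
    writes:     dict partition_id -> dict of record updates (the update half
                of the reference/update set, held outside the partitions)
    commit_ok:  callable deciding S302; stands in for the isolation-level check
    log_sink:   callable receiving the log entry (stands in for S304)
    """
    # S301: the reference/update set is built in transaction-local memory;
    # the partitions themselves are not modified yet.
    update_set = {pid: dict(recs) for pid, recs in writes.items()}

    # S302/S303: commit determination.
    if not commit_ok(partitions, update_set):
        return "aborted"                      # S307

    log_sink(update_set)                      # S304: log output comes first
    for pid, recs in update_set.items():      # S305: then apply to partitions
        partitions[pid].update(recs)
    return "committed"                        # S306
```

The point of the ordering is that the partitions are touched only after the log has been output, so an aborted transaction leaves no trace in either.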
FIG. 5 is a flowchart of the log output processing.
The LLSN management unit for the local log file 211CA associated with the processing thread 107CA executing transaction A is the LLSN management unit 111CA. The processing thread 107CA acquires a log file address from the LLSN management unit 111CA and adds the log size of transaction A to the log file address held by the LLSN management unit 111CA (S401).
Next, the processing thread 107CA numbers an LLSN for the DB partition 109CA or 109CB updated by the execution of transaction A (S402). Depending on the isolation level and data structure of the database, it may also be necessary to number an LLSN for a partition that was only referenced by transaction A.
If any of the updated DB partitions 109CA and 109CB has not yet had an LLSN numbered (S403: No), the processing thread 107CA performs S402 for that DB partition. If LLSNs have been numbered for all of the updated DB partitions 109CA and 109CB (S403: Yes), the processing thread 107CA generates the Tx log 201, writes it to the log buffer 113CA, and issues a write request for the Tx log 201 (a write request specifying the log file address acquired from the LLSN management unit 111CA) (S404). The processing thread 107CA then receives a write completion notification from the local log PDEV 121CA via the PDEV I/F 119C (S405).
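Steps S401 to S404 can be illustrated with the sketch below. It rests on assumptions: the class `LLSNManager`, its fields, and the function `output_log` are hypothetical, LLSNs are modeled as per-partition counters starting at 1, and the write to the local log PDEV (S405) is omitted.

```python
from dataclasses import dataclass

@dataclass
class LLSNManager:
    """Per-partition state: next log file address and next LLSN (assumed fields)."""
    log_file_address: int = 0
    next_llsn: int = 1

    def reserve(self, log_size):
        # S401: return the current address, then advance it by the log size.
        address = self.log_file_address
        self.log_file_address += log_size
        return address

    def issue_llsn(self):
        # S402: number one LLSN for this partition.
        llsn = self.next_llsn
        self.next_llsn += 1
        return llsn

def output_log(tx_id, payload, own_pid, updated_pids, managers, log_buffers):
    """Sketch of S401-S404 for the thread that owns partition own_pid."""
    address = managers[own_pid].reserve(len(payload))
    # S402/S403: repeat until every updated partition has an LLSN.
    llsns = {pid: managers[pid].issue_llsn() for pid in updated_pids}
    tx_log = {"tx_id": tx_id, "llsns": llsns, "address": address, "payload": payload}
    log_buffers[own_pid].append(tx_log)       # S404: write to the thread's own buffer
    return tx_log
```

Reserving the address before the log is written means each log's location in the local log file is fixed up front, independent of when the write actually completes.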
When the mode is the log synchronous mode (S406: Yes), the processing thread 107CA executes the integrated Tx log writing processing (S407) and ends the log output processing once that processing has completed. When the mode is the log asynchronous mode (S406: No), the processing thread 107CA ends the log output processing without executing the integrated Tx log writing processing. The "log synchronous mode" is a mode in which the integrated Tx log writing processing is executed within the log output processing, and the log output processing ends when the integrated Tx log writing processing ends. The "log asynchronous mode" is a mode in which the integrated Tx log writing processing is executed asynchronously with the log output processing. The current mode may be registered in the memory 117C. Whether the mode is the log synchronous mode or the log asynchronous mode may be set or changed by the user, or may be changed dynamically according to the status of the active server 100C. The synchronous mode suits systems that handle information that must not be lost, such as bank account information in a bank ATM system, where reliability matters more than performance. In the synchronous mode, the transaction logs of the active system and the standby system match completely, so even if a failure occurs at a single point of failure, referring to either log disk reliably shows how far transaction processing has completed, and no processed data is lost. The asynchronous mode, in contrast, suits performance-oriented systems that prioritize response time and accept that some logs may not yet have been copied to the standby system when a failure occurs.
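The difference between the two modes can be sketched as follows. This is a simplified illustration, assuming a hypothetical function `log_output`, a string-valued `mode`, and a `ship_to_standby` callable that returns only after the standby has received the logs; none of these names come from the embodiment.

```python
def log_output(tx_log, local_log, pending, mode, ship_to_standby):
    """Sketch of S404-S407: local write, then mode-dependent shipping."""
    local_log.append(tx_log)            # local log write (S404/S405)
    pending.append(tx_log)              # still to be sent to the standby
    if mode == "sync":                  # S406: Yes
        ship_to_standby(list(pending))  # S407: returns only after shipping
        pending.clear()
    # S406: No -> a separate task ships `pending` later.
    return len(pending)
```

In the asynchronous case the return value is the number of logs that would be missing from the standby if the active server failed at that moment, which is the trade-off the paragraph above describes.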
FIG. 6 is a flowchart of the integrated Tx log writing processing.
When the mode is the log synchronous mode, the integrated Tx log writing processing starts when the processing thread 107CA executing transaction A activates the integrated log management unit 105C. When the mode is the log asynchronous mode, the processing starts at a preset interval or when triggered by a predetermined event.
The integrated log management unit 105C polls all the log buffers 113C and determines whether Tx logs are stored in two or more log buffers 113C (for example, in all the log buffers 113C) (S501). If the determination result in S501 is negative (S501: No), the integrated Tx log writing processing ends.
If the determination result in S501 is affirmative (S501: Yes), the integrated log management unit 105C performs S502. Specifically, it acquires all the Tx logs from every log buffer 113C in which Tx logs are stored. Based on the Tx log sizes of the acquired Tx logs, it determines the Tx start position 211 (the position (address) within the integrated Tx log) of each Tx log and generates position information 210 containing the Tx start positions 211 corresponding to the respective Tx logs. It then generates an integrated Tx log 209 containing the generated position information 210 and the acquired Tx logs arranged sequentially, and writes the generated integrated Tx log 209 to the integrated log buffer 115C.
Next, the integrated log management unit 105C transfers the integrated Tx log 209 in the integrated log buffer 115C (the integrated Tx log 209 generated in S502) to the standby server 100S through the network I/F 120C (S503).
The standby server 100S receives the integrated Tx log 209, and the integrated log management unit 105S writes the received integrated Tx log 209 to the integrated log file 219S in the integrated log PDEV 134S (S504). After writing the received integrated Tx log 209 to the integrated log file 219S, the integrated log management unit 105S notifies the active server 100C that reception is complete (S505).
The active server 100C receives the notification of reception completion. In response, the integrated log management unit 105C clears the integrated Tx log 209 from the integrated log buffer 115C and notifies the processing threads 107C (at least those corresponding to the log buffers 113C from which Tx logs were acquired in S502) that log writing on the standby server 100S has completed (S506). Each notified processing thread 107C clears the Tx logs from its corresponding log buffer 113C (S507).
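The sequence S501 to S507 can be sketched as below. The sketch makes several assumptions: the function name `write_integrated_log` is hypothetical, log buffers are plain lists keyed by thread ID, the start positions are counted from the start of the concatenated body rather than from the start of the header, and `send_to_standby` stands in for the transfer and the completion notification.

```python
def write_integrated_log(log_buffers, send_to_standby):
    """Sketch of S501-S507.

    log_buffers:     dict thread_id -> list of Tx logs (bytes)
    send_to_standby: callable that returns True once the standby has
                     persisted the integrated log (S503-S505)
    """
    # S501: poll every buffer; proceed only if two or more hold Tx logs.
    non_empty = {tid: buf for tid, buf in log_buffers.items() if buf}
    if len(non_empty) < 2:
        return None

    # S502: take all Tx logs and compute start positions from their sizes.
    tx_logs = [log for buf in non_empty.values() for log in buf]
    positions, offset = [], 0
    for log in tx_logs:
        positions.append(offset)
        offset += len(log)
    integrated = {"positions": positions, "body": b"".join(tx_logs)}

    # S503-S505: transfer and wait for the completion notice.
    if send_to_standby(integrated):
        for buf in non_empty.values():   # S506/S507: clear the source buffers
            buf.clear()
    return integrated
```

The source buffers are cleared only after the standby has confirmed the write, so a failed transfer leaves the Tx logs available for the next attempt.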
As described above, according to the first embodiment, each processing thread 107C writes the Tx log 201 to the log buffer 113C corresponding to that processing thread 107C. In other words, a dedicated log buffer 113C is provided for each processing thread 107C. The active server 100C can therefore write a plurality of Tx logs to a plurality of log buffers 113C in parallel, with no contention for writes to the log buffers 113C. In addition, because the integrated Tx log is generated by reading a plurality of Tx logs from the plurality of log buffers 113C and is then written to the integrated log buffer 115C, no contention arises when the integrated Tx log is generated or written. Furthermore, the Tx logs are arranged serially within the integrated Tx log, and that single integrated Tx log is transferred from the active server 100C to the standby server 100S, so the contention that could arise with parallel transfer of Tx logs does not occur. Thus, in the first embodiment, no contention can arise when a Tx log is written to the log buffer 113C, when the integrated Tx log is written to the integrated log buffer 115C, or when the integrated Tx log is transferred to the standby server 100S, and consequently no lock needs to be acquired.
The second embodiment will now be described. The description focuses on the differences from the first embodiment; points in common with the first embodiment are omitted or simplified.
The integrated Tx log also contains Tx logs that a given processing thread 107S does not need to refer to, specifically Tx logs containing the update history of a DB partition 109S other than the DB partition 109S corresponding to that processing thread 107S. In the first embodiment, when the integrated Tx log is reflected on the standby server 100S, the processing thread 107S of each DB partition 109S refers to all the Tx logs in the integrated Tx log. Restoring the DB partitions 109S therefore takes time. For an in-memory database in particular, it is desirable to make the database restoration time as short as possible.
In the second embodiment, therefore, when the integrated Tx log is reflected, a processing thread refers only to those Tx logs in the integrated Tx log that relate to updates of the DB partition corresponding to that processing thread, and can skip the Tx logs unrelated to updates of that DB partition. The database restoration time can thereby be shortened.
The second embodiment is described in detail below. In the following description, elements that differ from the first embodiment are given reference numerals different from those of the first embodiment.
FIG. 7 shows the configuration of the database system according to the second embodiment.
In the memory 127C of the active server 100C, the DBMS 1010C manages a partition access map 151C for each processing thread 1070C. The partition access map 151C is information indicating the DB partitions 109C updated by the corresponding processing thread 1070C.
FIG. 8 shows the data structure of the integrated Tx log according to the second embodiment.
The position information 2100 of the integrated Tx log 2090 includes, for each Tx log 201, a partition access map 151C in addition to the Tx start position 211. With this configuration, for example, the processing thread 1070SA (a processing thread for log expansion processing) determines whether the partition access map 151CA indicates an update of the DB partition 109SA. If the result is affirmative, the processing thread 1070SA updates the DB partition 109SA in accordance with the Tx log 201A identified from the Tx start position 211A. If the result is negative, the processing thread 1070SA skips the Tx start position 211A (the Tx log 201A).
FIG. 9 shows the configuration of the partition access map 151C.
The partition access map 151C is a bitmap composed of a plurality of bits corresponding respectively to the plurality of DB partitions 109C. A bit value of "1" means that the DB partition 109 corresponding to that bit has been updated. A bit value of "0" means that the DB partition 109 corresponding to that bit has not been updated.
A bitmap is only one example of the partition access map 151C. The partition access map 151C may have another structure, for example, a list of the IDs of the updated DB partitions 109C.
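Both representations mentioned above can be illustrated with a short sketch. The function names are hypothetical, and the use of a Python integer as the bitmap with bit i standing for the i-th DB partition is an assumption made for the example.

```python
def mark_updated(access_map, partition_index):
    """Set the bit for a partition the thread has updated."""
    return access_map | (1 << partition_index)

def is_updated(access_map, partition_index):
    """True if the bit for this partition is 1."""
    return bool(access_map & (1 << partition_index))

def to_id_list(access_map, partition_count):
    """Alternative representation: the list of updated partition indices."""
    return [i for i in range(partition_count) if is_updated(access_map, i)]
```

The bitmap keeps the per-log overhead in the header fixed regardless of how many partitions a transaction touched, whereas the ID list is smaller when only a few of many partitions are updated.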
FIG. 10 is a flowchart of the log output processing according to the second embodiment.
For each DB partition 109C updated in the transaction processing, the processing thread 1070C updates to "1" the bit corresponding to that DB partition 109C in the partition access map 151C corresponding to the processing thread 1070C (S441). The rest is the same as in the first embodiment.
FIG. 11 is a flowchart of the integrated Tx log writing processing according to the second embodiment.
S5020 is performed instead of S502. In S5020, the integrated log management unit 1050C acquires all the Tx logs from every log buffer 113C in which Tx logs are stored. Based on the Tx log sizes of the acquired Tx logs, it determines the Tx start position 211 (the position (address) within the integrated Tx log) of each Tx log. It then generates position information 2100 containing, in addition to the Tx start positions 211 corresponding to the respective Tx logs to be included in the integrated Tx log, the partition access maps 151C corresponding to those Tx logs. The integrated log management unit 1050C generates an integrated Tx log 2090 containing the generated position information 2100 and the acquired Tx logs arranged sequentially, and writes the generated integrated Tx log 2090 to the integrated log buffer 115C. The rest is the same as in the first embodiment.
FIG. 12 is a flowchart of the integrated Tx log reflection processing.
The integrated Tx log reflection processing may be started, for example, when an administrator of the standby server 100S instructs it to start, or when an integrated Tx log is stored in the integrated log PDEV 134S. A plurality of processing threads 1070S (processing threads for log expansion processing) are executed, for example, in parallel, and the following processing is performed for each processing thread 1070S.
The processing thread 1070S reads the integrated Tx log from the integrated log PDEV 134S into the work area of that processing thread 1070S (S1201).
The processing thread 1070S refers to the first partition access map 151C in the integrated Tx log stored in the work area and determines whether that partition access map 151C indicates an update of the DB partition 109S corresponding to the processing thread 1070S (S1202).
If the determination result in S1202 is affirmative, the processing thread 1070S performs reflection processing for the Tx log 201 identified from the Tx start position 211 corresponding to that partition access map 151C (S1203). Specifically, the processing thread 1070S refers to the identified Tx log 201 and updates its corresponding DB partition 109S in accordance with that Tx log 201.
If the determination result in S1202 is negative, the processing thread 1070S skips the Tx start position 211 (the Tx log 201) corresponding to the referenced partition access map 151C (S1204).
After S1203 or S1204, the processing thread 1070S determines whether the partition access map 151C referenced in the immediately preceding S1202 is the last partition access map 151C (S1205).
If the determination result in S1205 is affirmative, the processing thread 1070S ends the reflection of the integrated Tx log.
If the determination result in S1205 is negative, the processing thread 1070S performs S1202 for the next partition access map 151C.
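The loop of S1202 to S1205 can be sketched for a single standby thread as follows. This is an illustration under assumptions: the function name `reflect_integrated_log` is hypothetical, a list index stands in for the Tx start position, each Tx log is modeled as a dictionary from partition index to record updates, and the partition access map is an integer bitmap.

```python
def reflect_integrated_log(entries, tx_logs, my_partition, partition_store):
    """Sketch of S1202-S1205 for one processing thread on the standby.

    entries: list of (access_map, index) pairs from the position information;
             index stands in for the Tx start position
    tx_logs: list of Tx logs, each a dict partition_index -> {key: value}
    """
    applied = 0
    for access_map, index in entries:
        if access_map & (1 << my_partition):              # S1202: bit set?
            partition_store.update(tx_logs[index][my_partition])  # S1203
            applied += 1
        # otherwise S1204: the Tx log is skipped without being read
    return applied                                        # loop end is S1205
```

Each thread decides from the header alone whether a Tx log concerns it, so the cost of a skipped log is one bit test rather than a read of the log body.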
As described above, according to the second embodiment, in the integrated Tx log reflection processing, the processing thread 1070S refers only to those Tx logs in the integrated Tx log that relate to updates of the DB partition corresponding to that processing thread 1070S, and can skip the Tx logs unrelated to updates of that DB partition. The database restoration time can thereby be shortened.
Several embodiments have been described above, but they are examples for explaining the present invention and are not intended to limit the scope of the present invention to those embodiments. The present invention can also be carried out in various other forms.
For example, a Tx log may be deleted from the log buffer 113C once it has been written from the log buffer 113C to the local log PDEV 121C. In that case, in the integrated Tx log writing processing, the integrated log management unit 105C may acquire the Tx logs to be included in the integrated Tx log (Tx logs that have not yet been included in any integrated Tx log) from the local log PDEV 121C.
Also, for example, LLSNs may be numbered in the case where the set of transactions executed between checkpoints is at least the first-type transaction set, out of a first-type transaction set and a second-type transaction set. A first-type transaction set may be a set of transactions whose result changes depending on the transaction execution order. For example, under a partial order, a plurality of transactions that update the same record must be executed in a defined order, whereas a plurality of transactions that update different records may be executed in any order. A second-type transaction set may be a set of transactions whose execution order does not affect the result. Whether a transaction to be executed belongs to the first-type or the second-type transaction set may be determined, for example, from one or more query execution plans.
Also, for example, the LLSN (order) managed by an LLSN management unit 111C may be updated each time a Tx log is generated for a transaction that updated the DB partition 109C corresponding to that LLSN management unit 111C. As the logs of a transaction that updated N DB partitions 109C among the plurality of DB partitions 109C (N is an integer of 2 or more), M Tx logs may be generated (M is an integer of 1 or more and N or less). At least one of the M Tx logs may include two or more of the N LLSNs respectively corresponding to the N DB partitions 109C. Among the N log buffers 113C respectively corresponding to the N DB partitions 109C, a Tx log including one LLSN is written to the log buffer 113C corresponding to that LLSN, and a Tx log including two or more LLSNs may be written to any one of the two or more log buffers 113C respectively corresponding to those LLSNs. M may be 1.
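The M = 1 case and the buffer-selection rule in the preceding paragraph can be illustrated as follows. This is only a sketch under assumptions: the function names are hypothetical, LLSNs are modeled as per-partition counters, and the tie-breaking rule (the smallest partition ID) is arbitrary, since the text only requires that a Tx log carrying two or more LLSNs go to one of the corresponding log buffers.

```python
def choose_log_buffer(llsns):
    """Pick the log buffer for a Tx log.

    llsns: dict mapping partition ID -> LLSN carried by the Tx log.
    A log with one LLSN must go to that partition's buffer; a log with
    two or more LLSNs may go to any of the corresponding buffers.
    Choosing the smallest partition ID is an arbitrary, assumed policy.
    """
    if not llsns:
        raise ValueError("a Tx log must carry at least one LLSN")
    return min(llsns)

def emit_single_log(tx_id, updated, next_llsn, log_buffers):
    """M = 1 case: one Tx log carrying the LLSNs of all N updated partitions."""
    llsns = {}
    for pid in updated:
        llsns[pid] = next_llsn[pid]
        next_llsn[pid] += 1          # the order advances per generated log
    target = choose_log_buffer(llsns)
    log_buffers[target].append({"tx_id": tx_id, "llsns": llsns})
    return target
```

Carrying all N LLSNs in one log lets the log be written once while still recording its position in the order of every partition it touched.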
100C: Active server
100S: Standby server

Claims (10)

  1.  A database system comprising at least an active server, out of the active server and a standby server connected to the active server, wherein
     the active server comprises:
     a first execution unit that executes, in parallel, a plurality of first sub-execution units that execute a plurality of transactions on a first database, generate a plurality of logs respectively corresponding to the plurality of transactions, and write the generated plurality of logs to a plurality of log storage areas, respectively; and
     an integrated log management unit that reads the plurality of logs from the plurality of log storage areas, generates an integrated log including the read plurality of logs, and transfers the generated integrated log to the standby server.
  2.  The database system according to claim 1, wherein
     each of the plurality of first sub-execution units may update, by executing a transaction, two or more first database portions among a plurality of first database portions into which the first database is divided,
     each of the plurality of first sub-execution units updates the updated-portion map corresponding to that sub-execution unit, among a plurality of updated-portion maps respectively corresponding to the plurality of sub-execution units, to information from which the first database portions updated by that sub-execution unit in executing the transaction can be identified, and
     the integrated log includes, in addition to the plurality of logs, a plurality of updated-portion maps respectively corresponding to the plurality of logs.
  3.  The database system according to claim 2, further comprising the standby server, wherein
     the standby server manages a plurality of second database portions into which a second database is divided,
     the standby server comprises a second execution unit that executes a plurality of second sub-execution units in parallel,
     the plurality of second database portions respectively correspond to the plurality of first database portions, and
     each of the plurality of second sub-execution units, for each updated-portion map in the integrated log:
      refers to the updated-portion map in the integrated log;
      when it identifies from the referenced updated-portion map that the first database portion corresponding to the second database portion corresponding to that second sub-execution unit has been updated, refers to the log, among the plurality of logs in the integrated log, that corresponds to the referenced updated-portion map, and updates the second database portion corresponding to that second sub-execution unit in accordance with the referenced log; and
      when it identifies from the referenced updated-portion map that the first database portion corresponding to the second database portion corresponding to that second sub-execution unit has not been updated, does not refer to the log, among the plurality of logs in the integrated log, that corresponds to the referenced updated-portion map.
  4.  The database system according to claim 1, further comprising an order management unit that manages an order of logs for each of a plurality of first database portions into which the first database is divided, wherein
     the order managed by each order management unit is updated each time a log is generated for a transaction that updated the first database portion corresponding to that order management unit,
     M logs are generated as the logs of a transaction that updated N first database portions among the plurality of first database portions (N is an integer of 2 or more, and M is an integer of 1 or more and N or less), and
     at least one of the M logs includes two or more of the N orders respectively corresponding to the N first database portions.
  5.  There are N log storage areas respectively corresponding to the N first database portions,
     a log including one order is written to the log storage area corresponding to that order, and
     a log including the two or more orders is written to any one of the two or more log storage areas respectively corresponding to the two or more orders,
    The database system according to claim 4.
  6.  M = 1,
    The database system according to claim 4.
  7.  In the integrated log, a plurality of logs are arranged serially,
    The database system according to claim 1.
  8.  The active server has a first memory,
     the standby server has a second memory,
     the first database is stored in the first memory, and
     the second database is stored in the second memory,
    The database system according to claim 3.
  9.  Executing in parallel a plurality of transactions on a first database managed by an active server, generating in parallel a plurality of logs respectively corresponding to the plurality of transactions, and writing the generated plurality of logs in parallel to a plurality of log storage areas, respectively, and
     reading the plurality of logs from the plurality of log storage areas, generating an integrated log including the read plurality of logs, and transferring the generated integrated log to a standby server,
    A database management method.
  10.  An interface device connected to a standby server,
     a storage unit that stores a first database and has a plurality of log storage areas, and
     a processor connected to the interface device and the storage unit,
     wherein the processor,
      by executing a plurality of threads in parallel, executes a plurality of transactions on the first database, generates a plurality of logs respectively corresponding to the plurality of transactions, and writes the generated plurality of logs to the plurality of log storage areas, respectively, and
      reads the plurality of logs from the plurality of log storage areas, generates an integrated log including the read plurality of logs, and transfers the generated integrated log to the standby server,
    A computer.
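
The flow recited in the claims above can be illustrated with a minimal sketch. It is not the applicant's implementation: the class and field names (`ActiveServer`, `StandbyServer`, `Log`, `orders`, `NUM_PARTS`) are hypothetical, the update partial map is modeled as a set of portion identifiers, the transfer to the standby server is modeled as passing a Python list, and sorting by per-portion order on the standby side is an assumption made to keep the example short.

```python
import threading
from dataclasses import dataclass

NUM_PARTS = 2  # hypothetical number of database portions


@dataclass
class Log:
    orders: dict  # portion id -> order number assigned for that portion
    writes: dict  # portion id -> {key: value} updates made by the transaction


class ActiveServer:
    def __init__(self):
        self.db = [{} for _ in range(NUM_PARTS)]      # first database, divided into portions
        self.areas = [[] for _ in range(NUM_PARTS)]   # one log storage area per portion
        self.order = [0] * NUM_PARTS                  # per-portion order counters
        self.locks = [threading.Lock() for _ in range(NUM_PARTS)]

    def run_tx(self, writes):
        parts = sorted(writes)            # fixed lock order, so threads cannot deadlock
        for p in parts:
            self.locks[p].acquire()
        try:
            orders = {}
            for p in parts:
                self.db[p].update(writes[p])
                self.order[p] += 1
                orders[p] = self.order[p]
            # One log per transaction (M = 1), written to one of the areas it touched.
            self.areas[parts[0]].append(Log(orders, writes))
        finally:
            for p in reversed(parts):
                self.locks[p].release()

    def build_integrated_log(self):
        # Each entry pairs an update map (set of updated portions) with its log.
        return [(frozenset(log.orders), log) for area in self.areas for log in area]


class StandbyServer:
    def __init__(self):
        self.db = [{} for _ in range(NUM_PARTS)]      # second database, same division

    def apply(self, integrated):
        def worker(p):
            # Skip entries whose update map does not name portion p.
            mine = [log for updated, log in integrated if p in updated]
            for log in sorted(mine, key=lambda entry: entry.orders[p]):
                self.db[p].update(log.writes[p])

        threads = [threading.Thread(target=worker, args=(p,)) for p in range(NUM_PARTS)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
```

In this sketch, calling `run_tx` from several threads, then passing the result of `build_integrated_log()` to `StandbyServer.apply`, leaves the standby portions equal to the active ones, because each standby worker replays only the logs whose update map names its portion, in that portion's recorded order.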






PCT/JP2015/052160 2015-01-27 2015-01-27 Database system and database management method WO2016120988A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/052160 WO2016120988A1 (en) 2015-01-27 2015-01-27 Database system and database management method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/052160 WO2016120988A1 (en) 2015-01-27 2015-01-27 Database system and database management method

Publications (1)

Publication Number Publication Date
WO2016120988A1 true WO2016120988A1 (en) 2016-08-04

Family

ID=56542646

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/052160 WO2016120988A1 (en) 2015-01-27 2015-01-27 Database system and database management method

Country Status (1)

Country Link
WO (1) WO2016120988A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH096658A (en) * 1995-06-21 1997-01-10 Shikoku Nippon Denki Software Kk Transaction journal management system
JP2006268531A (en) * 2005-03-24 2006-10-05 Hitachi Ltd Data processing system and method for managing database
JP2006277208A (en) * 2005-03-29 2006-10-12 Hitachi Ltd Backup system, program and backup method
JP2010287142A (en) * 2009-06-15 2010-12-24 Hitachi Ltd Fault tolerant computer system and method in fault tolerant computer system
WO2014097475A1 (en) * 2012-12-21 2014-06-26 株式会社Murakumo Information processing method, information processing device, and program

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15879885; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 15879885; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)