KR101428649B1

KR101428649B1 - Encryption system for mass private information based on map reduce and operating method for the same

Info

Publication number: KR101428649B1
Application number: KR1020140027447A
Authority: KR
Inventors: 김현욱
Original assignee: (주)케이사인
Priority date: 2014-03-07
Filing date: 2014-03-07
Publication date: 2014-08-13

Abstract

A method of operating a system that encrypts mass private information based on map-reduce includes the steps of: distributing a plurality of private information file blocks, into which a private information sequential access method (SAM) file is divided, to a plurality of slave servers, and copying and storing the private information file blocks; setting a private information encryption policy in units of private information fields of a private information record included in the private information SAM file; distributing encryption processing tasks that encrypt the private information file blocks to the slave servers; dividing, in the slave servers, each private information record included in the private information file blocks into a plurality of private information field values, and encrypting each of the private information field values in accordance with the private information encryption policy to generate encrypted field values; reassembling the encrypted field values generated from the same private information record to generate an encrypted record; and sorting and merging the encrypted records in one of the slave servers to generate an encrypted SAM file.

Description

The present invention relates to a method of processing large-capacity personal information and, more particularly, to a map-reduce-based large-capacity personal information encryption system and a method of operating the same.

Recently, incidents involving the leakage of personal information have occurred frequently, for example when a server storing personal information is hacked and the information is leaked, so the management of personal information has become a very serious problem. In addition, with the enactment of laws such as the Personal Information Protection Act and the Information and Communications Network Act, the protection of personal information has become a legal obligation, and the importance of personal information management has increased greatly.

Database management systems that store personal information mainly encrypt the personal information before storing it, in order to prevent leakage due to hacking or the like. For example, methods of encrypting personal information and storing it in a database management system include encrypting the personal information with an encryption module at the application level and storing the result in the database, and applying encryption to the columns that store personal information by using an encryption module that runs at the database level. When a large volume of personal information is handled, these encryption methods may place a heavy load on the server during encryption and may degrade the overall performance of the system.

An object of the present invention is to provide a large-capacity personal information encryption system capable of distributed processing of the encryption of large-capacity personal information.

Another object of the present invention is to provide a method of operating the large-capacity personal information encryption system.

However, the objects of the present invention are not limited to the above objects, and may be variously extended without departing from the spirit and scope of the present invention.

In order to achieve one object of the present invention, a method of operating a map-reduce-based large-capacity personal information encryption system according to embodiments of the present invention may include: distributing, replicating, and storing a plurality of personal information file blocks divided from a personal information SAM (Sequential Access Method) file on a plurality of slave servers; setting a personal information encryption policy for each personal information field of the personal information records included in the personal information SAM file; distributing encryption processing tasks that encrypt the personal information file blocks to the plurality of slave servers; dividing, in the slave servers, each personal information record included in the personal information file blocks into a plurality of personal information field values, and encrypting each of the personal information field values according to the personal information encryption policy to generate encrypted field values; reassembling the encrypted field values generated from the same personal information record to generate an encrypted record; and sorting and merging the encrypted records in one of the slave servers to generate an encrypted SAM file.

According to an embodiment, the method may further include, in each of the slave servers, sorting the encrypted records generated from the personal information records included in the distributed personal information file block in the same order as the personal information records, and then transmitting the sorted encrypted records to the one slave server.

According to an embodiment, the encrypted field values may be generated in at least one encryption server in response to an encryption request that uses the personal information encryption policy and the personal information field values.

According to an embodiment, a predetermined number of the personal information records may be provided to the at least one encryption server through a single encryption request.

According to an embodiment, generating the encrypted record may include removing duplicate encrypted field values from among the encrypted field values, and sorting the encrypted field values that belong to the same encrypted record in the same order as the personal information fields of the personal information record.

According to an embodiment, the large-capacity personal information encryption system may use a Hadoop system.

In order to achieve another object of the present invention, a map-reduce-based large-capacity personal information encryption system according to embodiments of the present invention may include a master server that sets a policy for an encryption processing task that encrypts personal information and distributes the encryption processing task for distributed processing, and a plurality of slave servers that perform the encryption processing task distributed from the master server. The master server may include a personal information SAM file management unit that manages information on a plurality of personal information file blocks divided from a personal information SAM file, a policy setting unit that sets a personal information encryption policy for each personal information field of the personal information records included in the personal information SAM file, and a task distribution unit that distributes the encryption processing task for the personal information file blocks to the slave servers. Each slave server may include a personal information file block storage unit that stores the personal information file blocks, an encryption unit that divides the personal information record included in the personal information file blocks into a plurality of personal information field values and encrypts each of the personal information field values according to the personal information encryption policy to generate encrypted field values, an encrypted record generation unit that reassembles the encrypted field values generated from the same personal information record to generate an encrypted record, and an encrypted SAM file generation unit that sorts and merges the encrypted records to generate an encrypted SAM file.

According to an embodiment, the encrypted record generation unit may sort the encrypted records generated from the personal information records included in the personal information file block in the same order as the personal information records.

According to an embodiment, the encryption unit may send an encryption request to at least one encryption server using the personal information encryption policy and the personal information field values, and may receive the encrypted field values generated by the encryption server.

According to an embodiment, the encryption unit may provide a predetermined number of the personal information records to the encryption server through a single encryption request.

According to an embodiment, the personal information SAM file management unit and the personal information file block storage unit may use the Hadoop Distributed File System (HDFS), and the policy setting unit, the task distribution unit, the encryption unit, the encrypted record generation unit, and the encrypted SAM file generation unit may use Hadoop's Map-Reduce.

The method of operating a map-reduce-based large-capacity personal information encryption system according to embodiments of the present invention distributes the encryption of large-capacity personal information across a plurality of computer clusters, thereby spreading the load caused by personal information encryption and increasing the speed at which large-capacity personal information is encrypted.

The map-reduce-based large-capacity personal information encryption system according to other embodiments of the present invention can increase the speed at which large-capacity personal information is encrypted. In addition, the system can be built from a small computer cluster and is easy to expand as the volume of personal information to be processed grows, so the system can be configured flexibly.

However, the effects of the present invention are not limited to the above effects, and may be variously extended without departing from the spirit and scope of the present invention.

FIG. 1 is a block diagram illustrating a map-reduce-based large-capacity personal information encryption system according to embodiments of the present invention.
FIG. 2 is a flowchart illustrating a method of operating a map-reduce-based large-capacity personal information encryption system according to embodiments of the present invention.
FIG. 3 is a diagram illustrating an example of setting a personal information encryption policy in the method of operating the large-capacity personal information encryption system of FIG. 2.
FIG. 4 is a diagram illustrating an example in which a personal information SAM file is processed in a distributed manner in the method of operating the large-capacity personal information encryption system of FIG. 2.
FIG. 5 is a flowchart illustrating a method of generating encrypted field values from a personal information record in the map task of FIG. 4.
FIG. 6 is a diagram illustrating an example in which encrypted field values are generated by the method of FIG. 5.
FIG. 7 is a flowchart illustrating a method of generating encrypted records in the method of operating the large-capacity personal information encryption system of FIG. 2.
FIG. 8 is a diagram illustrating an example in which encrypted records are generated by the method of FIG. 7.
FIG. 9 is a diagram illustrating an example in which an encrypted SAM file is generated in the method of operating the large-capacity personal information encryption system of FIG. 2.
FIG. 10 is a diagram illustrating an effect of the large-capacity personal information encryption system according to embodiments of the present invention.

Hereinafter, embodiments of the present invention will be described in more detail with reference to the accompanying drawings. The same or similar reference numerals are used for the same components in the drawings.

FIG. 1 is a block diagram illustrating a map-reduce-based large-capacity personal information encryption system according to embodiments of the present invention.

Referring to FIG. 1, the large-capacity personal information encryption system may include a master server 120 and a plurality of slave servers 140. The master server 120 may set a policy for the encryption processing task that encrypts personal information, and may distribute the encryption processing task to the slave servers 140 for distributed processing. Each slave server 140 may perform the encryption processing task distributed from the master server 120. In an embodiment, the large-capacity personal information encryption system may use a Hadoop system, one of several cloud computing approaches, to process personal information data in a distributed and parallel manner. Hadoop consists of the Hadoop Distributed File System (HDFS), which stores data in a distributed manner, and Map-Reduce, which processes and analyzes data in a distributed manner. In a large-capacity personal information encryption system using the Hadoop system, one job tracker (Job Tracker) may be included in the master server 120, and a plurality of task trackers (Task Tracker) may be included in the slave servers 140. The job tracker may receive a Map-Reduce job execution request, split the job into map and reduce parts, and assign them to the task trackers in units of tasks. Each task tracker may execute the tasks assigned to it by the job tracker.

The master server 120 may include a personal information SAM (Sequential Access Method) file management unit 122, a policy setting unit 124, and a task distribution unit 126.

The personal information SAM file management unit 122 may manage information on a plurality of personal information file blocks divided from a personal information SAM file. In an embodiment, when a client requests storage of a personal information SAM file, the personal information SAM file management unit 122 may tell the client which slave server each of the personal information file blocks divided from the personal information SAM file should be stored on. For example, the client may divide the personal information SAM file into a plurality of personal information file blocks and send a storage request for the personal information SAM file to the personal information SAM file management unit 122. In response, the personal information SAM file management unit 122 may provide the client with information on which slave server each personal information file block should be stored on. The client may then use the information received from the personal information SAM file management unit 122 to send storage requests for the personal information file blocks to the slave servers. In an embodiment, the personal information SAM file management unit 122 may manage information about the personal information file blocks. For example, the personal information SAM file management unit 122 may store and manage metadata such as how the personal information SAM file is divided into personal information file blocks, the sizes of the personal information file blocks, and the storage locations of the personal information file blocks. In an embodiment, the personal information SAM file management unit 122 may use HDFS. In a large-capacity personal information encryption system using the Hadoop system, the personal information SAM file management unit 122 may use HDFS to manage information about the personal information SAM file. That is, the personal information SAM file management unit 122 corresponds to the name node (Name Node) of HDFS, and may generate and manage the information describing how the personal information SAM file is divided into personal information file blocks and stored in a distributed and replicated manner. When HDFS is used, the personal information SAM file is distributed and replicated across the slave servers 140, so this redundancy can improve the stability and reliability of the data.
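
The block division and replica bookkeeping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `plan_blocks`, `BlockInfo`, the fixed block size, and the round-robin replica placement are hypothetical and are not taken from the patent, and round-robin placement is not HDFS's actual placement policy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BlockInfo:
    """Metadata kept per file block (hypothetical structure)."""
    index: int            # position of the block within the SAM file
    size: int             # block size in bytes
    locations: List[str]  # slave servers that hold a replica


def plan_blocks(data: bytes, block_size: int, servers: List[str],
                replication: int = 2) -> Tuple[List[bytes], List[BlockInfo]]:
    """Split data into fixed-size blocks and pick replica servers round-robin."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if replication > len(servers):
        raise ValueError("replication factor exceeds number of slave servers")
    blocks: List[bytes] = []
    metadata: List[BlockInfo] = []
    for index, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]
        # Assumption: replicas go to consecutive servers, starting at index.
        locations = [servers[(index + r) % len(servers)]
                     for r in range(replication)]
        blocks.append(block)
        metadata.append(BlockInfo(index, len(block), locations))
    return blocks, metadata
```

The metadata list plays the role of the information the management unit keeps (division, block sizes, storage locations), while the blocks themselves would be sent to the listed servers.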

The policy setting unit 124 may set a personal information encryption policy for each personal information field of the personal information records included in the personal information SAM file. In an embodiment, the policy setting unit 124 may designate which personal information fields are to be encrypted and which are not, and, for each personal information field to be encrypted, may designate an encryption code that can be mapped to an encryption method. In an embodiment, the policy setting unit 124 may use Hadoop's Map-Reduce. In a large-capacity personal information encryption system using the Hadoop system, the policy setting unit 124 may be defined in a driver class (Driver Class).
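
As a rough illustration of a field-level policy, the following Python sketch marks each field as encrypted or not and attaches an encryption code to the encrypted ones. The field names, the method names, and the codes `P01` and `P02` are hypothetical placeholders; the patent does not specify a concrete policy format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldPolicy:
    """Policy for one field of a record (hypothetical structure)."""
    name: str
    encrypt: bool
    method: Optional[str] = None  # hypothetical encryption method label
    code: Optional[str] = None    # hypothetical code mapped to the method


# Example policy for a four-field record; all values are illustrative.
POLICY = [
    FieldPolicy("name", encrypt=False),
    FieldPolicy("resident_id", encrypt=True, method="AES-256", code="P01"),
    FieldPolicy("phone", encrypt=True, method="AES-128", code="P02"),
    FieldPolicy("address", encrypt=False),
]


def fields_to_encrypt(policy: List[FieldPolicy]) -> List[int]:
    """Return the positions of the fields that the policy marks for encryption."""
    return [i for i, field in enumerate(policy) if field.encrypt]


def validate(policy: List[FieldPolicy]) -> None:
    """Reject a policy whose encrypted fields lack a method or a code."""
    for field in policy:
        if field.encrypt and not (field.method and field.code):
            raise ValueError(f"field {field.name!r} needs a method and a code")
```

Keeping the policy as plain data means a driver could validate it once and hand the same policy to every map task.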

The task distribution unit 126 may distribute the encryption processing tasks for the personal information file blocks divided from the personal information SAM file to the slave servers 140, so that the personal information encryption can be processed in a distributed manner on the slave servers 140. In an embodiment, the task distribution unit 126 may use Hadoop's Map-Reduce. In a large-capacity personal information encryption system using the Hadoop system, the task distribution unit 126 may be included in the job tracker, and may receive a job execution request and assign it, in units of tasks, to the task trackers of the slave servers 140.

Each slave server 140 may include a personal information file block storage unit 141, an encryption unit 142, an encrypted record generation unit 144, and an encrypted SAM file generation unit 146.

The personal information file block storage unit 141 may receive storage requests for the personal information file blocks divided from the personal information SAM file and store the personal information file blocks. In an embodiment, the personal information file block storage unit 141 may use HDFS. In a large-capacity personal information encryption system using the Hadoop system, the personal information file block storage unit 141 may store the personal information file blocks using HDFS. That is, the personal information file block storage unit 141 corresponds to a data node (Data Node) of HDFS, and may receive storage requests for the personal information file blocks and store them.

The encryption unit 142 may divide a personal information record included in the personal information file blocks into a plurality of personal information field values and encrypt each of the personal information field values according to the personal information encryption policy to generate encrypted field values. That is, the encryption unit 142 may obtain personal information field values by splitting one personal information record included in the personal information file blocks field by field. The encryption unit 142 may generate the encrypted field values using each personal information field value and the personal information encryption policy corresponding to that field value. The encryption unit 142 may encrypt the personal information field values internally, or may perform the encryption using an external system such as the encryption server 30. In an embodiment, the encryption unit 142 may send an encryption request to at least one encryption server 30 using the personal information encryption policy and the personal information field values, and may receive the encrypted field values generated by the encryption server 30. If a separate encryption request is issued for every personal information field value, the number of encryption requests and responses becomes very large, and the network load and the system load increase accordingly. Therefore, the encryption unit 142 may bundle a plurality of personal information field values into a single encryption request. In an embodiment, the encryption unit 142 may provide a predetermined number of personal information records to the encryption server 30 through one encryption request. For example, the encryption unit 142 may bundle the personal information field values of 10,000 personal information records into one encryption request. To this end, the encryption unit 142 may issue encryption requests for the personal information records in the personal information file block in groups of 10,000 and, on reaching the end of the file, may issue an encryption request for the remaining personal information records. In an embodiment, the encryption unit 142 may use Hadoop's Map-Reduce. In a large-capacity personal information encryption system using the Hadoop system, the encryption unit 142 may be defined in a mapper class (Mapper Class).
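
The batching idea described above (one encryption request per group of records, plus a final request for the remainder) can be sketched in Python as follows. The `|` delimiter, the function names, and `fake_encryption_server` are assumptions made for illustration; in particular, the stand-in server only wraps values in `ENC(...)` and performs no real encryption.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Request = List[List[str]]  # one list of field values per record


def batches(records: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield groups of batch_size records; the last group may be smaller."""
    batch: List[str] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # remaining records at the end of the file block
        yield batch


def fake_encryption_server(request: Request) -> Request:
    """Stand-in for an external encryption server; NOT real encryption."""
    return [[f"ENC({value})" for value in fields] for fields in request]


def encrypt_block(records: Iterable[str], encrypt_indexes: List[int],
                  batch_size: int,
                  send_request: Callable[[Request], Request] = fake_encryption_server,
                  delimiter: str = "|") -> Tuple[List[List[str]], int]:
    """Encrypt the selected fields of each record using one request per batch."""
    encrypted: List[List[str]] = []
    request_count = 0
    for batch in batches(records, batch_size):
        split = [record.split(delimiter) for record in batch]
        # Only the fields marked for encryption are sent to the server.
        request = [[fields[i] for i in encrypt_indexes] for fields in split]
        response = send_request(request)
        request_count += 1
        for fields, enc_values in zip(split, response):
            out = list(fields)
            for i, enc in zip(encrypt_indexes, enc_values):
                out[i] = enc
            encrypted.append(out)
    return encrypted, request_count
```

With a batch size of 10,000, as in the example in the text, a block of 25,000 records would produce three requests instead of one request per field value.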

The encrypted record generation unit 144 may reassemble the encrypted field values generated from the same personal information record to generate an encrypted record. That is, the encrypted record generation unit 144 may reassemble the encrypted field values generated by the encryption unit 142 into an encrypted record that corresponds to the personal information record. In an embodiment, the encrypted record generation unit 144 may reassemble the encrypted field values generated on the individual slave servers 140 on a single slave server 140 to generate the encrypted records. In another embodiment, the encrypted record generation unit 144 may reassemble the encrypted field values generated on each slave server 140 on that same slave server 140 to generate the encrypted records. In this case, the encrypted record generation unit 144 may sort the encrypted records generated from the personal information records included in the personal information file block in the same order as the personal information records. That is, by partially sorting the encrypted records reassembled on each slave server 140, the encrypted record generation unit 144 can speed up the sorting of the encrypted records performed by the encrypted SAM file generation unit 146. In an embodiment, the encrypted record generation unit 144 may use Hadoop's Map-Reduce. In a large-capacity personal information encryption system using the Hadoop system, the encrypted record generation unit 144 may be defined in a combiner class (Combiner Class) or a reducer class (Reducer Class).
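
A minimal Python sketch of the reassembly step is shown below. It assumes, hypothetically, that each encrypted field value arrives tagged with a record number and a field number, and that fields are joined with `|`; the patent does not specify these keys or the delimiter.

```python
from typing import Dict, Iterable, List, Tuple


def reassemble(pairs: Iterable[Tuple[int, int, str]],
               delimiter: str = "|") -> List[Tuple[int, str]]:
    """Rebuild encrypted records from (record_no, field_no, value) triples.

    Duplicate triples collapse to a single field value, fields are ordered
    by field number, and records are returned in record-number order.
    """
    by_record: Dict[int, Dict[int, str]] = {}
    for record_no, field_no, value in pairs:
        # One slot per (record, field): a repeated value overwrites itself.
        by_record.setdefault(record_no, {})[field_no] = value
    records: List[Tuple[int, str]] = []
    for record_no in sorted(by_record):
        fields = by_record[record_no]
        line = delimiter.join(fields[i] for i in sorted(fields))
        records.append((record_no, line))
    return records
```

Sorting by record number inside each server corresponds to the partial sort described above, which leaves only a merge for the final step.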

The encrypted SAM file generation unit 146 may sort and merge the encrypted records generated by the encrypted record generation unit 144 to generate an encrypted SAM file. The encrypted SAM file generation unit 146 may run on a single slave server 140. In an embodiment, the encrypted SAM file generation unit 146 may use Hadoop's Map-Reduce. In a large-capacity personal information encryption system using the Hadoop system, the encrypted SAM file generation unit 146 may be defined in a reducer class (Reducer Class).
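
If each slave server delivers its encrypted records already sorted, the final step reduces to a k-way merge. The Python sketch below uses the standard library's `heapq.merge` for that; the record number used as the sort key and the newline-terminated output format are assumptions for illustration, not details from the patent.

```python
import heapq
from typing import Iterable, List, Tuple


def merge_encrypted_records(
        partial_lists: Iterable[List[Tuple[int, str]]]) -> str:
    """Merge per-server lists of (record_no, record) into one file body.

    Each input list must already be sorted by record_no.
    """
    merged = heapq.merge(*partial_lists, key=lambda item: item[0])
    # One encrypted record per line, in original record order.
    return "".join(record + "\n" for _, record in merged)
```

Because the inputs are pre-sorted, the merge reads each list sequentially and never needs to sort the full set of records on the single server that writes the file.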

FIG. 2 is a flowchart illustrating a method of operating a map-reduce-based large-capacity personal information encryption system according to embodiments of the present invention.

Referring to FIG. 2, a personal information SAM file may be generated (S110) by collecting personal information from various databases or file systems and putting it into the format to be processed. In an embodiment, the personal information SAM file may be generated at a client. For example, the client may include a personal information SAM file generation unit, which may collect personal information and generate at least one personal information SAM file. The personal information SAM file generation unit may load personal information from one or more databases to generate the personal information SAM file. Accordingly, the personal information SAM file generation unit may load personal information from a plurality of databases running on various platforms to generate and store personal information SAM files, and one or more personal information SAM files may be generated.

The personal information SAM file may be divided into a plurality of personal information file blocks, and the personal information file blocks may be distributed to, replicated on, and stored in the slave servers (S120). In a large-capacity personal information encryption system using the Hadoop system, HDFS may divide the personal information SAM file into personal information file blocks and store them on the slave servers.
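Step S120 can be pictured with a minimal sketch. It assumes blocks of a fixed number of records and round-robin placement of replicas. Real HDFS splits files by byte size and applies its own placement policy, so the function names `split_into_blocks` and `place_blocks` and all parameters here are hypothetical and only illustrate the idea of splitting and replicating.

```python
from typing import Dict, List


def split_into_blocks(records: List[str], records_per_block: int) -> List[List[str]]:
    """Split the SAM file's records into consecutive blocks."""
    if records_per_block <= 0:
        raise ValueError("records_per_block must be positive")
    return [records[i:i + records_per_block]
            for i in range(0, len(records), records_per_block)]


def place_blocks(num_blocks: int, servers: List[str], replicas: int) -> Dict[int, List[str]]:
    """Assign each block to `replicas` distinct servers, round-robin (an assumption)."""
    if not 1 <= replicas <= len(servers):
        raise ValueError("replicas must be between 1 and the number of servers")
    return {b: [servers[(b + r) % len(servers)] for r in range(replicas)]
            for b in range(num_blocks)}


# Made-up sample records in the ID / name / resident number / phone layout.
records = [f"id{n:03d} name{n} 7801011234567 01012341234" for n in range(1, 6)]
blocks = split_into_blocks(records, records_per_block=2)
placement = place_blocks(len(blocks), ["slave1", "slave2", "slave3"], replicas=2)
```

With five records and two records per block, the last block holds a single record, and every block ends up on two different servers.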

The personal information encryption policy may be set per personal information field of the personal information records included in the personal information SAM file (S130). The policy may distinguish the personal information fields that are encryption targets from those that are not, and, for each field that is an encryption target, may specify an encryption code that maps to the encryption method to be applied. In a large-capacity personal information encryption system using the Hadoop system, the encryption policy may be defined in a Driver class.

Encryption processing tasks that encrypt the personal information file blocks may be distributed to the plurality of slave servers and executed on each slave server (S140). That is, the tasks that encrypt the personal information file blocks stored on each slave server may be distributed so that personal information encryption is processed in a distributed manner across the slave servers. In a large-capacity personal information encryption system using the Hadoop system, the MapReduce JobTracker may assign the work, task by task, to the TaskTrackers of the slave servers.

Each personal information record included in the personal information file blocks may be divided, on the slave servers, into a plurality of personal information field values, and each personal information field value may be encrypted according to the personal information encryption policy to generate encrypted field values (S150). That is, the personal information field values are obtained by splitting one personal information record in a personal information file block field by field, and each encrypted field value is generated from a personal information field value and the encryption policy that corresponds to it. In one embodiment, the encrypted field values may be generated on at least one encryption server in response to an encryption request that carries the personal information encryption policy and the personal information field value. In one embodiment, a predetermined number of personal information records may be provided to the at least one encryption server in a single encryption request. The method of encrypting personal information records using the encryption server has been described above, so a duplicate description is omitted. In a large-capacity personal information encryption system using the Hadoop system, the task of generating the encrypted field values may be defined in a Mapper class and assigned to a TaskTracker.

An encrypted record may be generated by reassembling the encrypted field values generated from the same personal information record (S160). In one embodiment, the encrypted field values generated on the slave servers may be reassembled on a single slave server to generate the encrypted records. In another embodiment, the encrypted field values generated on each slave server may be reassembled on that same slave server. In this case, each slave server may sort the encrypted records generated from the personal information records in its personal information file block into the same order as those personal information records, and may transmit the sorted encrypted records to one slave server for generation of the encrypted SAM file. That is, partially sorting the reassembled encrypted records on each slave server can speed up the sorting of encrypted records performed when the encrypted SAM file is generated. In a large-capacity personal information encryption system using the Hadoop system, the task of generating encrypted records may be defined in a Combiner class or a Reducer class and assigned to a TaskTracker.

The encrypted SAM file may be generated by sorting and merging the encrypted records on one of the slave servers (S170). In a large-capacity personal information encryption system using the Hadoop system, the task of generating the encrypted SAM file may be defined in a Reducer class and assigned to a TaskTracker.

FIG. 3 is a diagram illustrating an example of setting a personal information encryption policy in the method of operating the large-capacity personal information encryption system of FIG. 2.

Referring to FIG. 3, the personal information encryption policy may be set per personal information field of the personal information records included in the personal information SAM file. The policy may distinguish the personal information fields that are encryption targets from those that are not, and, for each field that is an encryption target, may specify an encryption code that maps to the encryption method to be applied. For example, the personal information SAM file may consist of ID, name, resident registration number, and telephone number fields, in that order, and the encryption policy file may consist of field sequence numbers and encryption codes. Because the encryption policy file can set an encryption policy for each field, the ID and name fields, which do not need encryption, may be left unencrypted, while the resident registration number field is assigned the first encryption policy (P001) and the telephone number field the second encryption policy (P002). The first encryption policy (P001) may be set to partially encrypt the last six characters of the string, and the second encryption policy (P002) to partially encrypt the last five characters of the string.
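The per-field policy of FIG. 3 can be sketched as two small lookup tables. This is a minimal illustration under stated assumptions: the table names, the `protect_tail` helper, and the key are hypothetical, and the HMAC-based transformation is only a stand-in that shows which part of a value changes. It is not reversible and is not the cipher used by the described system.

```python
import hashlib
import hmac
from typing import Dict, List, Optional

# Hypothetical policy table: policy code -> number of trailing characters to protect.
POLICY_TAIL_LENGTH: Dict[str, int] = {"P001": 6, "P002": 5}

# Hypothetical policy file for the FIG. 3 layout: 1-based field index -> policy code.
# Fields 1 (ID) and 2 (name) are absent, so they are left unencrypted.
FIELD_POLICY: Dict[int, str] = {3: "P001", 4: "P002"}


def protect_tail(value: str, tail_length: int, key: bytes) -> str:
    """Stand-in for partial encryption: replace the last `tail_length` characters.

    NOT a real cipher (it is not reversible); it only marks which part changes.
    """
    head, tail = value[:-tail_length], value[-tail_length:]
    digest = hmac.new(key, tail.encode("utf-8"), hashlib.sha256).digest()
    masked = "".join(chr(ord("a") + b % 26) for b in digest[:len(tail)])
    return head + masked


def apply_policy(fields: List[str], key: bytes) -> List[str]:
    """Apply the per-field policy to one record's field values."""
    out = []
    for index, value in enumerate(fields, start=1):
        code: Optional[str] = FIELD_POLICY.get(index)
        out.append(value if code is None
                   else protect_tail(value, POLICY_TAIL_LENGTH[code], key))
    return out


record = "id001 홍길동 7801011234567 01012341234".split(" ")
protected = apply_policy(record, key=b"demo-key")
```

Keeping the policy as data rather than code mirrors the text: changing which fields are protected only requires editing the field-to-code table.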

FIG. 4 is a diagram illustrating an example in which a personal information SAM file is processed in a distributed manner in the method of operating the large-capacity personal information encryption system of FIG. 2.

Referring to FIG. 4, the personal information SAM file may be divided into a plurality of personal information file blocks and stored on the slave servers; encryption processing tasks that encrypt the respective personal information file blocks may be distributed to the plurality of slave servers; and each slave server may divide each personal information record in its personal information file blocks into a plurality of personal information field values and encrypt each field value according to the personal information encryption policy to generate encrypted field values. In a large-capacity personal information encryption system using the Hadoop system, HDFS may divide the personal information SAM file into personal information file blocks and store them on the slave servers, and the MapReduce JobTracker may assign the work, task by task, to the TaskTrackers of the slave servers. The task of generating the encrypted field values may be defined in a Mapper class and performed as a map task on each of the slave servers. For example, the first personal information record (id001 홍길동 7801011234567 01012341234) stored in the personal information SAM file is stored in one of the personal information file blocks, and its fields may be split into the personal information field values 'id001', '홍길동', '7801011234567', and '01012341234'. According to the encryption policy, the ID and name are not encrypted, while the resident registration number and telephone number are each encrypted according to their respective policies. In one embodiment, the personal information field values are encrypted to generate encrypted field values, and an encryption result (key, value) consisting of a key and a value that contains the encrypted field value may be generated. The encryption result may use the line number as the key and 'field sequence number*encrypted field value' as the value. Because encryption may be performed on bundles of personal information fields drawn from a plurality of personal information records, a plurality of encrypted field values belonging to the same personal information record may be stored in a single value. Storing the encrypted field values of one record in a single value reduces the sorting and merging work needed when encrypted records are generated, and so can speed up encrypted record generation.
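The (key, value) format described above can be sketched as a plain Python generator standing in for the map task. This is an illustration only, not Hadoop Mapper code: `map_record` and `fake_encrypt` are hypothetical names, fields are assumed to be separated by single spaces, and the placeholder "encryption" just tags the value so the output is easy to read.

```python
from typing import Callable, Dict, Iterator, List, Tuple


def map_record(line_number: int, record: str, field_policy: Dict[int, str],
               encrypt: Callable[[str, str], str]) -> Iterator[Tuple[int, str]]:
    """Emit (line number, 'field index*value') pairs for one record."""
    fields = record.split(" ")
    encrypted_parts: List[str] = []
    for index, value in enumerate(fields, start=1):
        code = field_policy.get(index)
        if code is None:
            # Not an encryption target: emit the plain value on its own.
            yield line_number, f"{index}*{value}"
        else:
            encrypted_parts.append(f"{index}*{encrypt(code, value)}")
    if encrypted_parts:
        # Encrypted fields of the same record share one value.
        yield line_number, " ".join(encrypted_parts)


def fake_encrypt(code: str, value: str) -> str:
    """Placeholder that tags the value instead of encrypting it."""
    return f"<{code}:{value}>"


pairs = list(map_record(1, "id001 홍길동 7801011234567 01012341234",
                        {3: "P001", 4: "P002"}, fake_encrypt))
```

The third emitted pair carries both encrypted fields of the record in one value, which is the bundling that the text credits with reducing later sorting and merging work.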

Meanwhile, the key of the encryption result may be something other than the line number. Likewise, the value of the encryption result may take various formats other than 'field sequence number*encrypted field value'. For example, when every personal information field is an encryption target, the value format may simply be 'encrypted field value'.

FIG. 5 is a flowchart illustrating a method of generating encrypted field values from a personal information record in the map task of FIG. 4, and FIG. 6 is a diagram illustrating an example in which encrypted field values are generated by the method of FIG. 5.

Referring to FIG. 5, one personal information record may be read from the personal information file block and divided into personal information fields (S210). Whether each personal information field is an encryption target is checked (S220). If it is, the encryption policy and the personal information field value are stored in a first list so that encryption can be performed, and the key value is stored in a second list and the field sequence number in a third list so that the encryption result can be generated and later reassembled into an encrypted record (S230). Encryption is performed using the first list to generate the encrypted field values (S240). The generated encrypted field values are output (S250), and whether any unprocessed personal information field remains is checked (S260); the above steps are repeated while such fields remain.

Referring to FIG. 6, when the same personal information SAM file and encryption policy file as in FIG. 3 are set, the first personal information record (id001 홍길동 7801011234567 01012341234) may be read from the personal information file block and divided into the personal information fields 'id001', '홍길동', '7801011234567', and '01012341234'. The ID and name are not encryption targets, so they are output without encryption as the encryption results (key, value) '1 1*id001' and '1 2*홍길동', respectively. The resident registration number corresponds to the first encryption policy (P001) and the telephone number to the second encryption policy (P002), so 'P001, 7801011234567' and 'P002, 01012341234' are stored in the first list for encryption, the line numbers '1' and '1' are stored in the second list as key values, and the field sequence numbers '3' and '4' are stored in the third list. Personal information encryption may be performed using a separate encryption server. In this case, to reduce network and system load, personal information field values from a plurality of personal information records may be sent in a single encryption request. Accordingly, values for a predetermined number of personal information records may be stored in the first, second, and third lists and provided to at least one encryption server in one encryption request. The encryption server encrypts the personal information fields to generate the encrypted field values ('7801011iqxkqf', '010123ajqjk'), which may be written back into the first list. The first list may thus hold data containing encrypted field values, such as 'P001, 7801011iqxkqf', 'P002, 010123ajqjk', 'P001, 5602011diqjkj', and 'P002, 010998qksjq'. Using the updated first list together with the second and third lists, encryption results such as '1 3*7801011iqxkqf', '1 4*010123ajqjk', '2 3*5602011diqjkj', and '2 4*010998qksjq' may be output. Alternatively, to speed up the later step of ordering the encrypted field values to match the fields of the personal information record when the encrypted record is generated, the encrypted field values may be sorted and merged per record in advance, so that encryption results such as '1 3*7801011iqxkqf 4*010123ajqjk' and '2 3*5602011diqjkj 4*010998qksjq' are output.
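The three-list bookkeeping and the batched encryption request can be sketched as follows. This is a minimal illustration under assumptions: `encrypt_in_batches`, `send_batch`, and `fake_server` are hypothetical names, the server is a local placeholder that tags values instead of encrypting them, and the plaintext values for records 2 and 3 are made up for the example.

```python
from typing import Callable, Dict, List, Tuple

PolicyValue = Tuple[str, str]


def encrypt_in_batches(records: List[Tuple[int, List[str]]],
                       field_policy: Dict[int, str],
                       send_batch: Callable[[List[PolicyValue]], List[str]],
                       records_per_request: int) -> List[Tuple[int, str]]:
    """Collect target fields in three parallel lists and encrypt them per batch."""
    results: List[Tuple[int, str]] = []
    for start in range(0, len(records), records_per_request):
        first: List[PolicyValue] = []   # (policy code, field value)
        second: List[int] = []          # key: line number
        third: List[int] = []           # field sequence number
        for line_number, fields in records[start:start + records_per_request]:
            for index, value in enumerate(fields, start=1):
                code = field_policy.get(index)
                if code is not None:
                    first.append((code, value))
                    second.append(line_number)
                    third.append(index)
        if not first:
            continue
        encrypted = send_batch(first)   # one request for the whole batch
        if len(encrypted) != len(first):
            raise ValueError("encryption server returned a wrong number of values")
        for key, index, value in zip(second, third, encrypted):
            results.append((key, f"{index}*{value}"))
    return results


calls: List[int] = []


def fake_server(batch: List[PolicyValue]) -> List[str]:
    """Placeholder for the encryption server; records the size of each request."""
    calls.append(len(batch))
    return [f"<{code}:{value}>" for code, value in batch]


# Sample rows; the plaintext values for records 2 and 3 are made up.
rows = [(1, ["id001", "홍길동", "7801011234567", "01012341234"]),
        (2, ["id002", "이순신", "5602011234567", "01099812345"]),
        (3, ["id003", "name3", "9001011234567", "01055512345"])]
out = encrypt_in_batches(rows, {3: "P001", 4: "P002"}, fake_server,
                         records_per_request=2)
```

With two records per request, the three sample records produce two requests instead of six, which is the load reduction the paragraph describes.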

FIG. 7 is a flowchart illustrating a method of generating encrypted records in the method of operating the large-capacity personal information encryption system of FIG. 2, and FIG. 8 is a diagram illustrating an example in which encrypted records are generated by the method of FIG. 7.

Referring to FIG. 7, duplicated encrypted field values may first be removed from the encrypted field values in order to generate an encrypted record (S310). Owing to the characteristics of cloud computing or of sending and receiving encryption requests over a network, the same personal information field value may be encrypted more than once, producing duplicate encrypted field values. To preserve data integrity, such duplicates are removed. It is then determined whether any encrypted field values are not in field sequence order (S320). If there are none, the encrypted field values have already been sorted per record, so the encrypted record can be generated without further sorting. For example, when every personal information field is an encryption target, all encrypted field values can be sorted as they are generated, so the encrypted record may be generated without a separate sort (S340). On the other hand, when some personal information field values are not encryption targets and unsorted encrypted field values therefore exist, the encrypted field values belonging to the same encrypted record are sorted into the same order as the fields of the personal information record (S330), and the encrypted record is then generated (S340).
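Steps S310 to S340 can be sketched for the field values of a single record. This is a minimal illustration, assuming a duplicate means an identical (field sequence number, value) pair and that fields are joined with single spaces. The function names `prepare_field_values` and `build_record` are hypothetical.

```python
from typing import List, Tuple


def prepare_field_values(values: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Drop duplicates (S310), then sort by field index only if needed (S320-S330)."""
    seen = set()
    unique: List[Tuple[int, str]] = []
    for item in values:
        if item not in seen:          # same field index and same value seen before
            seen.add(item)
            unique.append(item)
    indices = [index for index, _ in unique]
    already_sorted = all(a <= b for a, b in zip(indices, indices[1:]))
    return unique if already_sorted else sorted(unique, key=lambda item: item[0])


def build_record(values: List[Tuple[int, str]]) -> str:
    """Join the prepared field values into one encrypted record (S340)."""
    return " ".join(value for _, value in prepare_field_values(values))


# Field values of one record, received out of order and with one duplicate.
received = [(3, "7801011iqxkqf"), (1, "id001"), (3, "7801011iqxkqf"),
            (4, "010123ajqjk"), (2, "홍길동")]
encrypted_record = build_record(received)
```

The order check before sorting reflects the shortcut in the text: input that is already in field order skips the sort entirely.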

Referring to FIG. 8, an encrypted record may be generated by reassembling the encrypted field values generated from the same personal information record. In one embodiment, each slave server may sort the encrypted records generated from the personal information records in its personal information file block into the same order as those records. That is, partially sorting the reassembled encrypted records on each slave server can make generation of the encrypted SAM file more efficient. In a large-capacity personal information encryption system using the Hadoop system, the task of generating encrypted records may be defined in a Reducer class and performed as a combiner step on each slave server. For example, when the encryption results '1 1*id0001', '1 2*홍길동', '1 3*7801011iqxkqf 4*010123ajqjk', '2 1*id0002', ... have been generated, the encrypted field values generated from the first personal information record may be reassembled into the encrypted record '1 id0001 홍길동 7801011iqxkqf 010123ajqjk'. Likewise, the encrypted field values generated from the second personal information record may be reassembled into the encrypted record '2 id0002 이순신 5602011diqjkj 010998qksjq', and the encrypted field values generated from further personal information records may be reassembled into further encrypted records. The encrypted records generated from personal information records in the same personal information file block, processed on the same slave server, may be sorted into the same order as those personal information records.
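The reassembly in FIG. 8 can be sketched as a grouping step over the (key, value) results. This is an illustration rather than Hadoop combiner code: `reassemble` is a hypothetical name, and the sketch assumes that field values contain neither spaces nor the '*' separator.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def reassemble(results: Iterable[Tuple[int, str]]) -> List[str]:
    """Group results by line number and rebuild each record in field order."""
    by_line: Dict[int, Dict[int, str]] = defaultdict(dict)
    for line_number, value in results:
        # One value may carry several 'index*field' parts separated by spaces.
        for part in value.split(" "):
            index, _, field = part.partition("*")
            by_line[line_number][int(index)] = field
    records = []
    for line_number in sorted(by_line):
        fields = by_line[line_number]
        ordered = " ".join(fields[i] for i in sorted(fields))
        records.append(f"{line_number} {ordered}")
    return records


results = [(1, "1*id0001"), (1, "2*홍길동"), (1, "3*7801011iqxkqf 4*010123ajqjk"),
           (2, "1*id0002"), (2, "2*이순신"), (2, "3*5602011diqjkj 4*010998qksjq")]
encrypted_records = reassemble(results)
```

Sorting the line numbers before emitting gives the partial ordering per file block that the paragraph describes, so the final merge has less work to do.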

Meanwhile, in a large-capacity personal information encryption system using the Hadoop system, the task of generating encrypted records may be defined not in a Combiner class that runs on each slave server but in a Reducer class that runs on a single slave server.

FIG. 9 is a diagram illustrating an example in which an encrypted SAM file is generated in the method of operating the large-capacity personal information encryption system of FIG. 2.

Referring to FIG. 9, the encrypted SAM file may be generated by sorting and merging the encrypted records on one of the slave servers. In a large-capacity personal information encryption system using the Hadoop system, the task of generating the encrypted SAM file may be defined in a Reducer class and performed as a reduce task on one of the slave servers. In one embodiment, the encrypted records generated on each slave server may be collected, sorted, and merged on one slave server to generate the encrypted SAM file. For example, encrypted records may be generated and partially sorted in a combiner step performed on each slave server. The encrypted records generated on each slave server are then transmitted to the slave server on which the reduce task runs, and the reduce task collects, sorts, and merges them to generate the encrypted SAM file.
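The benefit of partial sorting on each slave server can be sketched with a k-way merge. This is a minimal illustration, assuming each slave delivers its encrypted records already sorted by line number. `merge_sorted_runs` is a hypothetical name, and the records for lines 3 to 5 are placeholders rather than values from the source.

```python
import heapq
from typing import Iterable, List, Tuple

Record = Tuple[int, str]  # (line number, encrypted record text)


def merge_sorted_runs(runs: Iterable[List[Record]]) -> List[str]:
    """Merge runs that are each sorted by line number into one output sequence."""
    merged = heapq.merge(*runs, key=lambda record: record[0])
    return [text for _, text in merged]


slave_a = [(1, "id0001 홍길동 7801011iqxkqf 010123ajqjk"),
           (2, "id0002 이순신 5602011diqjkj 010998qksjq"),
           (5, "record-5")]                                # placeholder text
slave_b = [(3, "record-3"), (4, "record-4")]               # placeholder text
sam_lines = merge_sorted_runs([slave_b, slave_a])
```

Because every run is already ordered, the reduce side only interleaves the runs instead of sorting all records from scratch.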

In another embodiment, the encrypted records and the encrypted SAM file may both be generated by a reduce task performed on one slave server, without a combiner step. That is, the encrypted field values generated on each slave server are transmitted to one slave server, the encrypted records are generated on that slave server, and the encrypted SAM file is generated by sorting and merging those records. However, generating the encrypted records on a single slave server may increase the load on that server.

FIG. 10 is a diagram illustrating the effect of the large-capacity personal information encryption system according to embodiments of the present invention.

Referring to FIG. 10, in the method 200 of processing personal information encryption sequentially on a single node, the next encryption request is not issued until the response to the previous request has been received from the encryption server, so waiting time accumulates. In contrast, in the method 300 of distributed processing using the Hadoop system, a plurality of encryption requests are issued concurrently from the slave servers, so processing can be regarded as unconstrained by the waiting time of individual encryption requests. Table 1 below shows the results of a speed test comparing sequential processing of personal information encryption on one node with distributed processing on a Hadoop system comprising seven nodes.

[Table 1]

As described above, the method of operating the large-capacity personal information encryption system according to embodiments of the present invention distributes large volumes of personal information across a plurality of computer clusters for encryption, thereby spreading the encryption load and increasing the speed of personal information encryption. In addition, the system can be built from a small computer cluster and can easily be expanded as the volume of personal information to be processed grows, so the system can be configured flexibly. This is all the more effective because the encryption processing speed can increase in proportion to the number of machines.
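The proportionality mentioned above can be written as a rough idealized model. It assumes that each of n encryption requests costs the same round-trip time t, that the requests are spread evenly over m slave servers, and that distribution and merge overhead is negligible. These are simplifying assumptions for illustration, not figures taken from Table 1.

```latex
T_{\text{seq}} = n\,t, \qquad
T_{\text{dist}} \approx \left\lceil \frac{n}{m} \right\rceil t, \qquad
\frac{T_{\text{seq}}}{T_{\text{dist}}} \approx m \quad (n \gg m)
```

In practice the speed-up would be expected to fall short of m because of task scheduling, data transfer, and the final merge on a single slave server.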

The large-capacity personal information encryption system and its operating method according to embodiments of the present invention have been described above with reference to the drawings. The description is illustrative, and it may be modified and changed by those of ordinary skill in the art without departing from the technical idea of the present invention. For example, although personal information has been described as the identity information of individuals, personal information should be understood to encompass all information that needs to be protected by encryption. In addition, various encryption schemes may be used, and the encryption scheme is not limited to those of the embodiments.

The present invention can be applied to any system that manages information to be encrypted. For example, the present invention can be applied to customer information management systems, transaction history management systems, trade secret management systems, and the like.

Although the present invention has been described above with reference to embodiments thereof, those of ordinary skill in the art will understand that the invention may be variously modified and changed without departing from the spirit and scope of the invention as set forth in the claims below.

30: encryption server 120: master server
122: personal information SAM file management unit 124: policy setting unit
126: task distribution unit 140: slave server
141: personal information file block storage unit 142: encryption unit
144: encrypted record generation unit 146: encrypted SAM file generation unit

Claims

A method of operating a MapReduce-based large-capacity personal information encryption system, the method comprising:
Distributing a plurality of personal information file blocks, into which a personal information SAM (Sequential Access Method) file is divided, to a plurality of slave servers, and replicating and storing the personal information file blocks;
Setting a personal information encryption policy per personal information field of a personal information record included in the personal information SAM file;
Distributing encryption processing tasks that encrypt the personal information file blocks to the slave servers;
Dividing, in the slave servers, each personal information record included in the personal information file blocks into a plurality of personal information field values, and encrypting each of the personal information field values according to the personal information encryption policy to generate encrypted field values;
Reassembling the encrypted field values generated from the same personal information record to generate an encrypted record; and
Sorting and merging the encrypted records in one of the slave servers to generate an encrypted SAM file.

2. The method according to claim 1, wherein each of the slave servers arranges the encrypted records generated from the personal information records included in the distributed personal information file block in the same order as the personal information records, and provides the arranged encrypted records to the one slave server.

3. The method according to claim 1, wherein the encrypted field values are generated in at least one encryption server in response to an encryption request that uses the personal information encryption policy and the personal information field values.

4. The method of claim 3, wherein a predetermined number of the personal information records are provided to the at least one encryption server through a single encryption request.

5. The method of claim 1, wherein generating the encrypted record comprises:
removing duplicate encrypted field values from among the encrypted field values; and
arranging the encrypted field values corresponding to the same encrypted record in the same order as the personal information fields of the personal information record.

6. The method according to claim 1, wherein the large-capacity personal information encryption system uses a Hadoop system.

7. A map-reduce-based large-capacity personal information encryption system, comprising:
a master server for setting a policy for an encryption processing task that encrypts personal information and for distributing the encryption processing task so that it is processed in a distributed manner; and
a plurality of slave servers for performing the encryption processing task distributed from the master server,
wherein the master server comprises:
a personal information SAM file management unit for managing information on a plurality of personal information file blocks divided from a personal information SAM file;
a policy setting unit for setting a personal information encryption policy for each personal information field of the personal information records included in the personal information SAM file; and
a task distribution unit for distributing the encryption processing task for the personal information file blocks to the slave servers, and
wherein each slave server comprises:
a personal information file block storage unit for storing the personal information file blocks;
an encryption unit for dividing each personal information record included in the personal information file blocks into a plurality of personal information field values and encrypting the personal information field values according to the personal information encryption policy to generate encrypted field values;
an encrypted record generation unit for generating an encrypted record by reassembling the encrypted field values generated from the same personal information record; and
an encrypted SAM file generation unit for sorting and merging the encrypted records to generate an encrypted SAM file.

8. The large-capacity personal information encryption system according to claim 7, wherein the encrypted record generation unit arranges the encrypted records generated from the personal information records included in the personal information file block in the same order as the personal information records.

9. The large-capacity personal information encryption system according to claim 7, wherein the encryption unit sends an encryption request to at least one encryption server using the personal information encryption policy and the personal information field values, and receives the encrypted field values generated in the encryption server.

10. The system of claim 9, wherein the encryption unit provides a predetermined number of the personal information records to the encryption server through a single encryption request.

11. The system according to claim 7, wherein the personal information SAM file management unit and the personal information file block storage unit use the Hadoop Distributed File System (HDFS), and
wherein the policy setting unit, the task distribution unit, the encryption unit, the encrypted record generation unit, and the encrypted SAM file generation unit use Hadoop MapReduce.
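
For illustration only, the flow recited in the method claims (split each record into field values, encrypt each value according to a per-field policy, reassemble the values of the same record, then sort and merge) can be sketched in plain Python. This is a minimal single-process simulation, not Hadoop code and not the implementation disclosed here. The field names, the `|` delimiter, the use of a block's starting record number as its offset, and the XOR-plus-Base64 `toy_encrypt` function (a non-secure stand-in for the encryption server) are all assumptions made for the example.

```python
import base64
from itertools import groupby

# Hypothetical layout and policy: field order in a record, and whether
# each field is to be encrypted. None of these names come from the patent.
FIELDS = ["name", "ssn", "card"]
POLICY = {"name": False, "ssn": True, "card": True}
DELIM = "|"  # assumed field delimiter of the SAM file


def toy_encrypt(value: str, key: int = 0x5A) -> str:
    """Placeholder for the encryption server. XOR + Base64 is NOT secure."""
    xored = bytes(b ^ key for b in value.encode("utf-8"))
    return base64.b64encode(xored).decode("ascii")


def map_block(block_offset, lines):
    """Map step: split each record into field values, encrypt per policy.

    Emits ((record_no, field_index), value) pairs so that the reduce step
    can regroup values by record and restore the field order.
    """
    for i, line in enumerate(lines):
        record_no = block_offset + i
        for idx, value in enumerate(line.split(DELIM)):
            encrypted = toy_encrypt(value) if POLICY[FIELDS[idx]] else value
            yield (record_no, idx), encrypted


def reduce_records(pairs):
    """Reduce step: drop duplicates, regroup by record, restore field order."""
    unique = dict(pairs)            # duplicate emissions share the same key
    ordered = sorted(unique.items())  # by record number, then field index
    for record_no, group in groupby(ordered, key=lambda kv: kv[0][0]):
        yield record_no, DELIM.join(value for _, value in group)


def encrypt_sam(blocks):
    """Run the map step over every block, then sort and merge the records."""
    pairs = []
    for offset, lines in blocks:
        pairs.extend(map_block(offset, lines))
    return [record for _, record in sorted(reduce_records(pairs))]


if __name__ == "__main__":
    # Blocks arrive out of order, as they might from different slave servers.
    demo = [(2, ["Lee|333|cc"]), (0, ["Kim|111|aa", "Park|222|bb"])]
    for record in encrypt_sam(demo):
        print(record)
```

Keying each emitted value by record number and field index is one simple way to make both the duplicate removal and the field reordering fall out of a single sort. A real deployment would choose its own keys and would replace `toy_encrypt` with calls to an actual encryption service.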