KR100568369B1 - Method for classifying and searching data

Info

Publication number: KR100568369B1
Application number: KR1019990003437A
Authority: KR
Inventors: 구본곤; 김정중
Original assignee: 엘지전자 주식회사
Priority date: 1998-05-23
Filing date: 1999-02-02
Publication date: 2006-04-05
Also published as: JPH11353313A; JP3362125B2; KR19990087858A

Abstract

The present invention relates to a method for automatically classifying data and a method for searching and analyzing the classified data. The automatic classification method comprises: a first step of selecting a pre-designated portion of document data recorded on a recording medium; a second step of classifying the data contained in a specific item field of the selected data into two or more independent items according to its content; a third step of selecting an identification code corresponding to the content of each classified item and appending it to that item's data; and a fourth step of recording the classified and code-appended document data on another recording medium. This allows the document data closest to what a user needs for solving a problem to be searched and analyzed from among a vast body of document data, so that new technology for solving a new problem, or a new solution that does not infringe the rights of others, can be found easily. The search and analysis results for the automatically classified document data are presented with the problem-solving content separated out, and are displayed on screen in various forms such as a graph map, pie map, or tree map, so that the user can quickly find the desired material from a problem-solving point of view and can readily grasp the search and analysis results.

Description

Method for automatically classifying data and method for searching/analyzing classified data {Method for classifying and searching data}

FIG. 1 is a flowchart of a conventional method for manually classifying document data;

FIG. 2 is a flowchart of a conventional method for outputting document data to a screen;

FIG. 3 schematically shows the overall network configuration of an integrated information management system to which an embodiment of the automatic data classification method and the classified-data search/analysis method according to the present invention is applied;

FIG. 4 shows the flow of a preferred embodiment of the automatic data classification method according to the present invention;

FIG. 5 shows, by way of example, how representative words for problems and solution means corresponding to the International Patent Classification (IPC) codes of each technical field are stored in association with their identification codes;

FIG. 6 is a schematic representation of the content of one record among the document data accumulated in the hard disk drive (HDD) of the apparatus of FIG. 3;

FIG. 7 shows the flow of a preferred embodiment of the method for searching/analyzing classified data according to the present invention;

FIG. 8 shows a connection to the integrated information management system through a web browser;

FIG. 9 shows the initial menu screen that the integrated information management system provides for a user's document data search;

FIG. 10 shows an example of how summary information for a document data search result is displayed on screen;

FIG. 11 shows document data search results displayed in the form of a tree map;

FIG. 12 shows document data search results displayed in the form of a pie map; and

FIG. 13 shows document data search results displayed in the form of a graph map.

※ Explanation of reference numerals for the main parts of the drawings

PC₁, PC₂, PC₃, ...: search terminals
10: network interface
20: ROM
21: memory bank
30: hard disk drive (HDD)
40: central processing unit
41: classification program
42: search engine
43: DB management program
50: CD-ROM drive
100: integrated information management search server

The present invention relates to a method for automatically classifying data and a method for searching/analyzing the classified data, and more particularly to a method that classifies document data such as patents, papers, or technical reports according to designated classification criteria concerning the problem solving presented in each document, and that allows the document data so classified to be searched and analyzed by the problem to be solved or the solution means that the user requires.

FIG. 1 is a flowchart of a conventional method for manually classifying document data, and FIG. 2 is a flowchart of a conventional method for outputting document data to a screen. The conventional classification and screen output methods are described below with reference to these figures.

Here, document data means patents, papers, technical reports, parts information, and the like; the description below uses patent information as an example.

First, in the manual classification method of FIG. 1, the user selects the material to be searched from the patent information (S1) and determines whether it is in document form, CD-ROM form (hereinafter 'disc'), or some other form (S2).

Next, the user checks whether the document data is recorded on a recording medium such as a disc or tape (S3); if it is not, the data is stored on a recording medium such as a disc so that it can be built into a database (DB) (S4).

When this series of preparatory steps is complete, each document is read sequentially from the disc and checked for the field to be searched, that is, the data of the required item (for example, the object of the invention or the abstract) (S5). If the field exists, the user searches the material directly and classifies it manually (S7); if it does not exist, a search target field is selected arbitrarily (S6).

The reliability of the data classified through the above process is then checked (S8), that is, whether it was classified correctly; if the condition is satisfied, the classified data is stored in a hard disk drive (HDD) or the like and output to a screen or printer, and the process ends (S9).

The method of FIG. 2 for outputting the patent information classified as above to a screen is as follows.

With the patent information classified and stored as described above, the user selects the material to be searched (S11), enters a keyword related to it (S12), and checks whether the material was found (S13).

When the material is found, a list output screen for the found material is selected (S14) and the material is displayed on screen in a predetermined search result format (S15), and the user checks whether the displayed data is appropriate (S16).

If the displayed data is not appropriate, the search process is repeated until appropriate data is obtained; once the data is judged appropriate, it is stored and output (S17).

However, the conventional methods of classifying and displaying document data described above have several problems. A vast amount of information is classified manually, and general information is compiled without regard to the characteristics of each business area of a company, so the database is used inefficiently. In particular, because the database is searched by keyword and the results are displayed only as a list, the various forms of search results that users require cannot be provided. Moreover, because the input elements for a database search are limited to keywords and the like, the amount of information retrieved is vast, and the user must search repeatedly and check the voluminous results on screen one by one to find the desired material, which is cumbersome.

Accordingly, the present invention was devised to solve the above problems. Its object is to have various document data classified automatically according to classification criteria designated in advance to suit the purpose of the search, and to allow the document data so classified to be searched and analyzed easily by the problem to be solved or the solution means that the user requires, thereby making document data easy to search, analyze, and obtain.

To achieve the above object, the method for automatically classifying data according to the present invention comprises: a first step of selecting a pre-designated portion of document data recorded on a recording medium; a second step of classifying the data contained in a specific item field of the selected data into two or more independent items according to its content; a third step of selecting an identification code corresponding to the content of each classified item and appending it to that item's data; and a fourth step of recording the classified and code-appended document data on another recording medium.

Further, the method for searching automatically classified data according to the present invention comprises: a first step of searching a pre-designated database (DB) for a problem-solving classification code input from outside; and a second step of displaying on screen summary information of the document data stored in association with the retrieved classification code.

An embodiment of the automatic data classification method and the classified-data search/analysis method according to the present invention is described in detail below with reference to the accompanying drawings.

FIG. 3 shows the overall network configuration of an integrated information management system to which an embodiment of the automatic data classification method and the classified-data search/analysis method according to the present invention is applied. The system comprises: a network interface (10) that exchanges data with a plurality of search terminals (PC₁, PC₂, ...) over a pre-designated network (LAN); a ROM (20) in which representative words for the problems and solution means assigned to the International Patent Classification (IPC) codes of each technical field are stored in association with their identification codes; a memory bank (21) for program loading; a central processing unit (40) that executes the programs loaded in the memory bank (21) to classify, store, search, and analyze document data; a hard disk drive (HDD) (30) that accumulates the classified document data and holds the programs for classification and search; and a CD-ROM drive (50) into which a CD-ROM containing patent document data is inserted.

FIG. 4 shows the flow of a preferred embodiment of the automatic data classification method according to the present invention. The automatic classification method of FIG. 4 is described in detail below with reference to the configuration of FIG. 3.

First, as in the conventional method described above, assume that the document data is patent information and that, as shown in FIG. 5, the ROM (20) stores the IPC codes of each technical field in association with the corresponding representative words for problems and solution means and their identification codes. After a series of preparatory steps is completed, such as selecting the material to be analyzed, identifying its form, and storing the document data on a disc accordingly (S21 to S24), a CD-ROM (disc) containing the patent information (document data) is inserted into the CD-ROM drive (50). The classification program (41), which is loaded from the hard disk drive (30) into the memory bank (21) and executed by the central processing unit (40), then requests the controller (not shown) in the CD-ROM drive (50) to send the document data on the inserted disc, and the controller causes that data to be read out sequentially.

The classification program (41) then classifies the document data read from the disc. As described above, it first checks the incoming document data for a specific analysis target field and selects the analysis target field accordingly (S25 to S26). Once the data of the analysis target field, that is, the required item (such as the abstract of the patent document), has been selected, the classification program (41) detects words or sentences that follow a regular pattern within that field and, on that basis, divides the text into the problem to be solved and the solution means, which are the basic data for analysis (S27).

For example, in the abstract of Japanese patent document data, the object-of-the-invention field has a fixed form consisting of a TO sentence and a BY sentence, so the field is divided by classifying the TO sentence as the problem to be solved and the BY sentence as the solution means.

For US patent document data, the first sentence of the abstract is classified as the problem to be solved and the remaining sentences as the solution means. For domestic (Korean) patent document data, if the document has an abstract, the abstract is classified as the problem to be solved and the claims as the solution means; if it has no abstract, the first sentence of the claims is classified as the problem to be solved and the remaining sentences as the solution means. The results serve as the basic data for analysis.
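By way of illustration only, the per-country splitting rules described above might be sketched as follows. The function name, the country labels, the naive sentence splitter, and the assumption that a Japanese object field reads "To ... by ..." in a single string are all hypothetical; the description does not specify how the classification program (41) parses text.

```python
import re


def _sentences(text):
    # Naive sentence splitter (an assumption): break after '.', '!' or '?'
    # when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]


def _first_and_rest(text):
    # First sentence -> problem, remaining sentences -> solution means.
    parts = _sentences(text)
    if not parts:
        return "", ""
    return parts[0], " ".join(parts[1:])


def split_problem_and_means(country, abstract="", claims=""):
    """Return (problem, means) for one patent record."""
    if country == "JP":
        # Assumed layout of the object field: "To <problem> by <means>".
        m = re.match(r"\s*to\s+(.*?)\s+by\s+(.*)", abstract, flags=re.I | re.S)
        if m:
            return m.group(1).strip(), m.group(2).strip()
        return abstract.strip(), ""  # pattern not found: leave the means empty
    if country == "US":
        return _first_and_rest(abstract)
    if country == "KR":
        if abstract.strip():
            # Abstract present: abstract -> problem, claims -> means.
            return abstract.strip(), claims.strip()
        # No abstract: split the claims text instead.
        return _first_and_rest(claims)
    raise ValueError(f"unsupported country: {country!r}")
```

A real implementation would need a more robust sentence splitter and a fallback for records that do not follow the expected pattern, which is presumably what the reliability check in step S28 is for.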

The classification program (41) then checks the reliability of the data classified in this way, that is, whether it was classified correctly (S28). If the condition is satisfied, an identification code corresponding to the content of the classified data is appended to it. To do this, the central processing unit (40) detects the IPC code assigned to the classified patent document from the document data and selects the higher-level code of the detected classification code. For example, if the detected classification code is G11B7/00, its higher-level code is G11B.
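The derivation of the higher-level code can be sketched as below. This is a minimal illustration that assumes the higher-level code is always the four-character IPC subclass (section letter, two-digit class, subclass letter), as in the G11B7/00 example; the function name is hypothetical.

```python
import re


def higher_level_ipc(ipc_code):
    """Return the subclass part of an IPC symbol, e.g. 'G11B7/00' -> 'G11B'."""
    # Section letter (A-H), two-digit class, subclass letter.
    m = re.match(r"\s*([A-H]\d{2}[A-Z])", ipc_code.upper())
    if m is None:
        raise ValueError(f"not an IPC symbol: {ipc_code!r}")
    return m.group(1)
```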

The classification program (41) looks up the selected higher-level code, G11B, in the ROM (20) to identify the representative words for problems and solution means stored in association with it (S29), and then searches the data of the classified problem and solution-means items for those representative words (S30). The representative words stored in association with an IPC code may be chosen from the core technical terms commonly used in the relevant technical field or from words based on the wording of the IPC definitions.

If the search shows that the problem text contains the pre-designated representative words 'simplification' and 'stabilization' and the solution-means text contains the word 'integration', the classification program (41) reads the identification codes stored in association with those representative words from the ROM (20) and, following the code assignments of FIG. 5, appends codes C and E to the problem text and code E to the solution-means text (S31).
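The lookup and code-appending steps (S29 to S31) might be sketched as follows. The table below stands in for the ROM (20) contents of FIG. 5, which are not reproduced in the text: the pairing of 'simplification' with C and 'stabilization' with E is an assumption inferred from the example, the English words are stand-ins for the Korean representative words, and plain substring matching is likewise an assumption.

```python
# Hypothetical stand-in for the FIG. 5 table: IPC subclass -> representative
# word -> identification code, kept separately for problems and means.
CODE_TABLE = {
    "G11B": {
        "problem": {"error prevention": "A", "simplification": "C", "stabilization": "E"},
        "means": {"substitution": "B", "integration": "E"},
    },
}


def assign_codes(ipc_subclass, problem_text, means_text, table=CODE_TABLE):
    """Return (problem_codes, means_codes) for one classified record."""
    entry = table.get(ipc_subclass, {"problem": {}, "means": {}})

    def match(text, words):
        lowered = text.lower()
        # Collect the code of every representative word found in the text.
        return sorted({code for word, code in words.items() if word in lowered})

    return match(problem_text, entry["problem"]), match(means_text, entry["means"])


codes = assign_codes(
    "G11B",
    "Simplification and stabilization of the servo loop",
    "Integration of the driver circuits",
)
# codes == (["C", "E"], ["E"])
```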

In this way, the problem and solution-means items are extracted and classified, identification codes representing their content are appended to the patent document, and the patent document data restructured into the form of FIG. 6 is stored in the hard disk drive (30) (S32).

The classification program (41) performs the classification and code-appending processes described above on all document data read from the inserted disc, so that all patent document data on the disc is reprocessed into the form of FIG. 6 and stored in the corresponding storage areas of the hard disk drive (30). This builds the database for the integrated information management system.

The representative words for problems and solution means differ from one technical field to another (for example, agriculture may have a representative word such as 'increased yield', but that word is not used as a typical technical term in the electrical field). The same code assigned to a problem or solution means may therefore stand for a different representative word in each field. For instance, in G11B the code A corresponds to the representative word 'error prevention', whereas in A01B it may correspond to 'increased yield'.
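Because a code is only meaningful together with its technical field, a lookup keyed on both values is one natural representation. The sketch below is illustrative only; the dictionary and function names are hypothetical, and the two entries are the ones given in the example above.

```python
# Problem codes are resolved per IPC subclass: the same letter can stand
# for different representative words in different fields.
PROBLEM_WORDS = {
    ("G11B", "A"): "error prevention",
    ("A01B", "A"): "increased yield",
}


def representative_word(ipc_subclass, code, words=PROBLEM_WORDS):
    """Return the representative word for a code in a given field, or None."""
    return words.get((ipc_subclass, code))
```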

FIG. 7 shows the flow of a preferred embodiment of the method for searching/analyzing automatically classified data according to the present invention. The search/analysis method of FIG. 7 is described in detail below with reference to the configuration of FIG. 3.

To search or analyze patent documents in the database of the integrated information management system built as described above (S41), the user first runs a web browser installed on a search terminal (PC) and uses it to connect to the search server (100) on which the integrated information management database is built (S42), whereupon the home page shown in FIG. 8 can be received. For this purpose, the search server (100) loads the search engine, which is the program for document search and analysis by the search terminals, and the DB management program from the hard disk drive (30) into the memory bank (21) and runs them.

When a search terminal connects, the search engine (42) running on the search server (100) downloads the Java applet that forms part of the integrated information management system to that terminal through the network interface (10).

Once the connection to the integrated information management search server (100) is complete and searching is possible, the user sets the search and analysis conditions for the desired document data on the initial menu screen of FIG. 9 provided by the search server (100) (S43). That is, the user selectively enters values for input elements such as the search and analysis target (for example, US, Japanese, or Korean patents), the IPC code (for example, G01B*), a keyword (for example, 'video'), a problem code (for example, A), and a solution-means code (for example, B).

The Java applet downloaded to the search terminal converts the condition values set by the user into a query for the desired search condition or analysis type and passes it to the search engine (42), which serves as the user and Java interface. The search engine (42) forwards the query to the DB management program (43), which has already been loaded into the memory bank (21) and is running, so that the DB management program (43) searches the hard disk drive (30) for document data matching the query.

The DB management program (43) then searches the hard disk drive (30) for patent document data matching the received query (S44). If the query specifies US, Japanese, and Korean patent documents as the search target, 'G01B*' as the IPC code, 'A' as the problem identification code (corresponding to the representative word 'error prevention' in FIG. 5), and 'B' as the solution-means identification code (corresponding to the representative word 'substitution' in FIG. 5), the DB management program (43) checks whether the vast patent document data stored in the hard disk drive (30) in the form of FIG. 6 includes any document data that satisfies all of these conditions.

If the search finds patent document data in the hard disk drive (30) that satisfies all of the user's conditions, that is, US, Japanese, or Korean patent document data whose IPC code begins with G01B, whose problem identification code is 'A', and whose solution-means identification code is 'B' (S45), the DB management program (43) extracts from the retrieved document data only the data that makes up the summary information, such as the title of the invention, the applicant, and the patent number, and passes it to the search engine (42). The search engine (42) uses this to compose the summary information to be displayed in the form of FIG. 10.
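The filtering and summary extraction of steps S44 and S45 might be sketched as follows. The record layout (dictionaries with `country`, `ipc`, `problem_codes`, `means_codes`, and the three summary fields) is a hypothetical stand-in for the FIG. 6 structure, which the text does not spell out, and the wildcard is handled with Python's `fnmatch` module rather than whatever the DB management program (43) actually uses.

```python
from fnmatch import fnmatchcase

# Fields assumed to make up the summary information of FIG. 10.
SUMMARY_FIELDS = ("title", "applicant", "patent_number")


def search(records, countries=None, ipc_pattern=None, problem_code=None, means_code=None):
    """Return summary rows for records that satisfy every condition that is set."""
    hits = []
    for rec in records:
        if countries and rec["country"] not in countries:
            continue
        # A pattern such as 'G01B*' matches any IPC code starting with G01B.
        if ipc_pattern and not fnmatchcase(rec["ipc"], ipc_pattern):
            continue
        if problem_code and problem_code not in rec["problem_codes"]:
            continue
        if means_code and means_code not in rec["means_codes"]:
            continue
        # Keep only the summary fields, not the full record.
        hits.append({field: rec[field] for field in SUMMARY_FIELDS})
    return hits
```

Conditions left unset are simply skipped, which mirrors the description's statement that the user enters the input elements selectively.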

The summary information is sent through the network interface (10) to the search terminal (PC), where the summary of the patent document data the user is looking for is displayed on screen in the form of FIG. 10 (S46).

From the displayed summary information, the user can select the patent document data of interest and search and analyze it more closely. When the user selects a particular patent document from the summary information and requests to view it (S47), the search engine (42) recognizes the request from the query data received through the network interface (10), has the requested patent document data retrieved and read from the hard disk drive (30) (S48), and sends it back through the network interface (10) to the search terminal (PC) for display. The data is presented with the technical problem and the solution means of the patent document shown separately (S49), so that, before reading the whole of the detailed patent document data, the user can easily tell from the problem to be solved whether it is the document being sought.

In addition, when the retrieved data is output, the codes for the problem-solving items (problem and solution) and the words that those codes represent are output together. If the retrieved data has multiple identification codes for a problem-solving item, all of them are provided to the search terminal.

When multiple problem-solving codes are provided in this way, searching users may judge which of the representative words is appropriate for the technical document in question and feed this back to the operator of the search server (100). In that case, the order in which the representative words are provided is changed accordingly. At search time, if necessary, results retrieved by the first representative word may also be presented to the user separately from results retrieved by the second and later representative words.
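A minimal sketch of the reordering and of the first-word versus later-word split follows. The function names, the dictionary layout, and the sample words are hypothetical; the patent does not describe how feedback is stored or applied.

```python
def promote(representative_words, preferred):
    """Move the word judged most appropriate to the front, keeping the rest in order."""
    if preferred not in representative_words:
        return list(representative_words)
    return [preferred] + [w for w in representative_words if w != preferred]


def split_by_rank(documents, query_word):
    """Separate hits where query_word is the first representative word from the others."""
    first, later = [], []
    for doc_id, words in documents.items():
        if query_word not in words:
            continue
        (first if words[0] == query_word else later).append(doc_id)
    return first, later


# Made-up documents, each with an ordered list of representative words.
documents = {
    "doc1": ["noise reduction", "cost reduction"],
    "doc2": ["cost reduction", "noise reduction"],
    "doc3": ["miniaturization"],
}

before = split_by_rank(documents, "noise reduction")

# Feedback says "noise reduction" describes doc2 best, so it moves to the front.
documents["doc2"] = promote(documents["doc2"], "noise reduction")
after = split_by_rank(documents, "noise reduction")

print(before)
print(after)
```

Before the feedback, doc2 matches only through its second representative word; after promotion it is grouped with the first-word hits.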

In the above embodiment, the integrated information management search server (100) lets the user select the screen output form for the IPC classification of the retrieved data or for the problem-oriented analysis results. Depending on this selection, graphic data is provided in various forms: a tree map as in FIG. 11, a pie map as in FIG. 12 (for data retrieved when no problem is specified), or a graph map as in FIG. 13 (in FIG. 13 as well, when only the IPC code and/or a keyword is specified, the results can be graphed by problem code or solution code).
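The aggregation behind such displays can be sketched as follows, assuming the retrieved data is available as dictionaries with `ipc` and `problem_codes` keys. These names, the four-character IPC subclass cut, and the sample data are illustrative assumptions; rendering the maps themselves is out of scope here.

```python
from collections import Counter, defaultdict

# Made-up search results.
hits = [
    {"ipc": "G01B 11/02", "problem_codes": ["A"]},
    {"ipc": "G01B 9/00", "problem_codes": ["A", "B"]},
    {"ipc": "G01C 3/00", "problem_codes": ["B"]},
    {"ipc": "G01B 7/00", "problem_codes": ["C"]},
]


def pie_shares(hits):
    """Share of each problem code among all code assignments (pie map data)."""
    counts = Counter(code for h in hits for code in h["problem_codes"])
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()} if total else {}


def tree_counts(hits):
    """Counts nested as IPC subclass -> problem code (tree map data)."""
    tree = defaultdict(Counter)
    for h in hits:
        subclass = h["ipc"][:4]  # e.g. "G01B"
        for code in h["problem_codes"]:
            tree[subclass][code] += 1
    return {subclass: dict(codes) for subclass, codes in tree.items()}


shares = pie_shares(hits)
tree = tree_counts(hits)
print(shares)
print(tree)
```

The same counts can feed a bar or line graph by problem or solution code when only the IPC code or a keyword was specified.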

In addition, the above embodiment takes as an example the technical classification or search classification of data using the International Patent Classification code as the classification criterion, but the data may also be classified according to an independent classification scheme. In that case, the scheme may be chosen from classification codes based on structural properties (Structure Classification Code) and classification codes based on functional properties (Function Classification Code). The functional classification code may be further subdivided into effect item codes (Effect Code) and subject item codes (Subject Code), which may then be used in combination.

The preferred embodiments of the present invention described above are disclosed for illustrative purposes, and those skilled in the art may improve, change, substitute, or add to them in various other embodiments within the technical spirit and scope of the invention disclosed in the appended claims.

The automatic data classification method and the classified-data search/analysis method according to the present invention, configured as above, automatically classify data on the basis of the problem and the solution that the data presents, and store the classified data represented by problem-solving words that can be searched and analyzed. As a result, from a vast body of data, the data closest to what the user needs to solve a problem can be searched and analyzed, making it easy to find new technology for solving a new problem or a new solution that does not infringe the rights of others.
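The classification flow summarized here can be sketched under the first-sentence rule stated in the claims (first sentence of the abstract as the problem, the remaining sentences as the solution) together with a word-to-code lookup. The code tables, the sentence splitter, and the sample abstract are hypothetical; the patent does not disclose the actual word lists or code values.

```python
import re

# Hypothetical word-to-identification-code tables.
PROBLEM_CODES = {"noise": "A", "cost": "B"}
SOLUTION_CODES = {"filter": "A", "sensor": "B"}


def split_abstract(abstract):
    """First sentence -> problem item; remaining sentences -> solution item."""
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", abstract.strip())
        if s.strip()
    ]
    if not sentences:
        return "", ""
    return sentences[0], " ".join(sentences[1:])


def assign_codes(text, table):
    """Collect the code of every predetermined word found in the text."""
    lowered = text.lower()
    return sorted({code for word, code in table.items() if word in lowered})


def classify(abstract):
    """Split the selected portion into items and attach identification codes."""
    problem, solution = split_abstract(abstract)
    return {
        "problem": problem,
        "solution": solution,
        "problem_codes": assign_codes(problem, PROBLEM_CODES),
        "solution_codes": assign_codes(solution, SOLUTION_CODES),
    }


result = classify(
    "The device reduces noise in length measurement. "
    "An optical sensor reads the scale. A digital filter smooths the signal."
)
print(result)
```

The returned dictionary stands in for the classified record that would be written to the second recording medium; a document may receive several codes per item, as in the solution item of this example.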

In addition, the search results for the automatically classified data are presented with the problem-solving content separated out, and the searched and analyzed data is shown on screen in various forms such as a graph map, pie map, or tree map. This lets the user quickly find the desired data among the searched and analyzed results from a problem-solving standpoint and makes the search and analysis results easier to grasp, making this a highly useful invention.

Claims

A method for automatically classifying data, comprising:

a first step of selecting a predetermined specific portion from data recorded on a recording medium;

a second step of classifying data included in a specific item field among the selected data into two or more independent items based on the contents of the data;

a third step of selecting identification codes corresponding to the contents of the classified data and adding them to the data of the corresponding classified items; and

a fourth step of recording the classified data with the added codes on another recording medium.

The method of claim 1,

wherein the two or more items into which the data is classified include a problem item and a solution item.

The method of claim 2,

wherein the problem item is the data that appears after the target phrase among the data included in the specific item field.

The method of claim 2,

wherein the solution item is the data that appears after the means phrase among the data included in the specific item field.

The method of claim 2,

wherein the problem item is the data corresponding to the first sentence of the abstract portion of the data included in the specific item field.

The method of claim 2,

wherein the solution item is the data corresponding to the sentences other than the first sentence of the abstract portion of the data included in the specific item field.

The method of claim 2,

wherein the problem item is the data corresponding to the summary portion of the data included in the specific item field.

The method of claim 2,

wherein the solution item is the data corresponding to the claim portion of the data included in the specific item field.

delete

The method of claim 1,

wherein, in the third step, a predetermined word is searched for in the classified item data, and an identification code corresponding to the word found is selected and added to the classified item data.

The method of claim 10,

In the third step, a plurality of corresponding identification codes are added.

The method of claim 10,

wherein the same identification code value corresponds to words having different meanings in different technical fields.

The method of claim 1,

wherein, based on the correlation between the classified data, classification information of the data is displayed on a screen in any one of tree, pie, and graph form.

The method of claim 1,

The classification into two or more independent items is performed based on an International Patent Classification Code (IPC Code).

A method for searching and analyzing a database, comprising:

a first step of searching a predetermined database (DB) for a classification code related to problem solving that is input from outside; and

a second step of outputting and displaying, on a screen, summary information of the data stored in association with the retrieved classification code.

The method of claim 15,

wherein the classification code related to problem solving is a code for one of the predetermined words included in the content of the problem item or the solution item.

The method of claim 15,

wherein the search in the first step is performed on the basis of technical field.