
CN117493906A - City event allocation method, system and storage medium - Google Patents

City event allocation method, system and storage medium

Info

Publication number
CN117493906A
Authority
CN
China
Prior art keywords
event
data
keywords
word
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311533241.8A
Other languages
Chinese (zh)
Inventor
余雁
苏如春
岑道岸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Hantele Communication Co ltd
Original Assignee
Guangzhou Hantele Communication Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Hantele Communication Co ltd
Priority to CN202311533241.8A
Publication of CN117493906A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 Scheduling, planning or task assignment for a person or group
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Development Economics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Educational Administration (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of data processing, and particularly relates to an urban event allocation method, system and storage medium. The method comprises the following steps: aggregating historical urban event data and extracting the structured data in the event data; constructing a preset event list library; acquiring reported event data and extracting event keywords; identifying the event from the extracted keywords and determining the event type; matching the determined event type against the preset event list library to determine the event's business type and responsible department; and allocating the event according to the matching result. The invention reduces the number of keywords and selects suitable keywords accurately, which improves matching efficiency and precision and, in turn, allocation efficiency and accuracy.

Description

City event allocation method, system and storage medium
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to an urban event allocation method, system and storage medium.
Background
Driven by a new technological revolution, society is accelerating toward a digital society, whose development is inseparable from the construction of smart cities. Smart cities use information and communication technology to integrate the various city management systems effectively, realizing information resource sharing and business coordination among them. However, as smart cities are built out and operated more comprehensively, the types and volume of urban event data keep growing, and as urban informatization improves, the sources of event data become more complex, which makes urban event allocation inefficient.
Urban event allocation currently has two problems. First, events are assigned by allocation staff based on subjective judgment, so the business process is inefficient and allocation accuracy is low. Second, event data arrives through many channels, a single event may be reported several times through several channels, and the event data is generally unstructured, so the full set of event data cannot be deduplicated and the same event is assigned multiple times.
The prior art CN114446287A discloses a city event allocation method and system based on NLP and GIS. The city is divided into a plurality of grid areas in advance; based on GIS spatial analysis, the region division data of the business departments and of the supervision departments are combined to determine the business department and supervision department corresponding to each grid area. The event allocation method comprises: acquiring urban event data, which includes a comprehensive description of the event and its position information; determining the business type of the event and the grid area it belongs to from the description and the position information; and determining the corresponding business department and supervision department from the business type and the grid area. In this prior art, the event type is determined by the number of matched keywords. For event descriptions containing a lot of information, however, word segmentation yields many words, and using all of them for matching increases the amount of computation and reduces matching efficiency.
Disclosure of Invention
To solve the technical problems in the prior art, the invention provides an urban event allocation method, system and storage medium that structure the event data to allocate events intelligently, improving the efficiency and accuracy of urban event allocation.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
An urban event allocation method, comprising:
S1, aggregating historical urban event data and extracting the structured data in the event data;
S2, constructing a preset event list library from the structured data of the historical urban event data in step S1;
S3, acquiring reported event data, segmenting it into words according to its summary and position information, and extracting event keywords;
S4, identifying the event from the event keywords extracted in step S3 and determining the event type;
S5, matching the event type determined in step S4 against the preset event list library to determine the event's business type and the responsible department;
S6, allocating the event according to the matching result of step S5.
Further, the historical urban event data in step S1 includes structured data and unstructured data; the unstructured data is cleaned, the structured data within it is identified, and the identified structured data is marked.
Still further, the structured data includes time, place, event type, handling department, category level, and the like.
Further, in step S3, HanLP is used for word segmentation.
Further, the keywords in step S3 are selected as follows: the term frequency-inverse document frequency (TF-IDF) value of the jth word in the ith text data is calculated, the TF-IDF values of all words are sorted in descending order, and the top-ranked words are taken as keywords.
Further, the term frequency-inverse document frequency (TF-IDF) value is calculated as follows:
$$\mathrm{TFIDF}_{ij} = \mathrm{TF}_{ij} \times \mathrm{IDF}_{k}, \quad \text{where } \mathrm{WORD}_{ij} = \mathrm{gWORD}_{k}$$
where $\mathrm{TFIDF}_{ij}$ represents the TF-IDF value of the jth word in the ith text data, $\mathrm{TF}_{ij}$ represents the frequency of occurrence of the jth word in the ith text, $\mathrm{IDF}_{k}$ represents the inverse document frequency of the kth global word, $\mathrm{WORD}_{ij}$ represents the actual characters of the jth word in the ith text, and $\mathrm{gWORD}_{k}$ represents the actual characters of the kth global word.
Further, the frequency of occurrence $\mathrm{TF}_{ij}$ of the jth word in the ith text is calculated as follows:
where $\mathrm{NUM}_{ij}$ represents the number of occurrences of the jth word in the ith text, and r represents the number of different words in the ith text.
Further, the inverse document frequency $\mathrm{IDF}_{k}$ of the kth global word is calculated as follows:
where $\mathrm{TOTAL}_{k}$ represents the total number of text data entries containing the kth global word, $\mathrm{TOTAL}_{r}$ represents the total number of text data entries containing the rth global word, and r represents the number of different words in the text.
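The formula images for the term frequency and the inverse document frequency are not reproduced in this text. As a sketch only, the conventional forms that fit the symbol definitions above are given below. They are an assumption, not the patent's exact expressions: N (the total number of text data entries) is a symbol introduced here, and because the source also lists $\mathrm{TOTAL}_{r}$ among the IDF symbols, the original numerator may instead be a sum over $\mathrm{TOTAL}_{r}$.

```latex
% Assumed conventional forms; N is a hypothetical symbol not defined in the source.
\mathrm{TF}_{ij} = \frac{\mathrm{NUM}_{ij}}{\sum_{m=1}^{r} \mathrm{NUM}_{im}},
\qquad
\mathrm{IDF}_{k} = \log\frac{N}{\mathrm{TOTAL}_{k}}
```

Under these forms, a word that occurs twice in a four-word text has a term frequency of 0.5, and a word that appears in every entry has an inverse document frequency of 0.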
Further, the number of keywords is limited to 10 or less.
Further, the conditions for selecting a keyword include: $\mathrm{TFIDF}_{ij} \ge 0.025$.
The invention also provides an urban event allocation system comprising an acquisition module, a preprocessing module, and an extraction and identification module. The acquisition module acquires historical urban event data. The preprocessing module structures the historical urban event data and constructs a preset event list library from the structured event data. The extraction and identification module extracts keywords from the reported event data, identifies the reported event from the extracted keywords, and displays the business type and responsible department of the reported event.
The present invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the urban event distribution method described above.
Compared with the prior art, the invention has the following beneficial effects:
In the urban event allocation method provided by the invention, historical events are first structured and a preset event list library is constructed from them. Reported events are then segmented into words and keywords are extracted: the keywords are ranked by term frequency-inverse document frequency value and filtered by the set conditions. This reduces the number of keywords and selects suitable keywords accurately, which improves matching efficiency and precision and, in turn, allocation efficiency and accuracy.
Drawings
Fig. 1 is a flowchart of a city event allocation method provided by the present invention.
FIG. 2 is a flow chart of a method for structuring in the present invention.
Fig. 3 is a framework diagram of the urban event distribution system provided by the invention.
Detailed Description
The technical solutions of the present invention are described clearly below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person skilled in the art without inventive effort fall within the scope of protection of the present invention.
It is noted that the relative arrangement of the components and steps, numerical expressions, set forth in these embodiments should not be construed as limiting the scope of the present invention unless it is specifically stated otherwise.
The following description of the exemplary embodiment(s) is merely illustrative, and is in no way intended to limit the invention, its application, or uses. Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail herein, but where applicable, should be considered part of the present specification.
The invention provides an urban event allocation method, as shown in Fig. 1, comprising the following steps:
S1, aggregating historical urban event data and extracting the structured data in the event data;
Historical urban event data is aggregated: for the different business events handled in past city governance, such as city management, traffic and municipal administration, the unstructured data of the historical event cases is structured to obtain structured data. Unstructured data related to urban events is collected from various channels, such as news reports, social media and public databases. As shown in Fig. 2, the text data is cleaned and preprocessed, which includes removing useless punctuation marks and special characters, normalizing letter case, removing stop words and correcting spelling. Key entities in the text, such as places, persons and organizations, are then identified and marked by named entity recognition.
The events are then classified: the text data is classified according to the content and characteristics of each event. A classifier may be trained with a supervised learning method such as naive Bayes or a support vector machine, or classification may be done by rule matching or keyword matching. The time at which the event occurred is extracted from the text; dates and times can be captured with regular expressions or natural language processing techniques. The geographic location of the event can be extracted from the text by place name recognition or geocoding, and location resolution and encoding may be performed with existing map services or geographic information systems.
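The regular-expression route to time extraction mentioned above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the pattern, the accepted date formats and the function name `extract_event_time` are all assumptions.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical pattern: a YYYY-MM-DD or YYYY/MM/DD date, optionally followed by HH:MM.
TIMESTAMP_PATTERN = re.compile(
    r"(\d{4})[-/](\d{1,2})[-/](\d{1,2})(?:\s+(\d{1,2}):(\d{2}))?"
)


def extract_event_time(text: str) -> Optional[datetime]:
    """Return the first timestamp found in the text, or None if there is none."""
    match = TIMESTAMP_PATTERN.search(text)
    if match is None:
        return None
    year, month, day = (int(match.group(i)) for i in (1, 2, 3))
    # The time part is optional; default to midnight when it is absent.
    hour = int(match.group(4) or 0)
    minute = int(match.group(5) or 0)
    try:
        return datetime(year, month, day, hour, minute)
    except ValueError:
        # The digits matched the pattern but do not form a valid date or time.
        return None
```

Real event reports would likely need more formats (for example Chinese date expressions), which is where a natural language processing library would take over from a single pattern.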
The processed structured data is converted to a format suitable for storage and analysis, as shown in Table 1:
Table 1. Example storage format for structured data
S2, constructing a preset event list library from the structured data of the historical urban event data in step S1;
The event list contains, for each kind of business event, its type classification, a set of keywords, the department that handles it, and so on. Take a street lamp inclination event as an example: the business type is construction management, the primary category is public facilities, the secondary category is street lamp components, the problem types include inclination, flickering, extinction and the like, and the keywords include related terms such as street lamp inclination, street lamp flickering and street lamp extinction. The preset event list library also contains keywords of the solution and the handling department.
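One way to picture an entry of the event list library is as a small record type. The sketch below uses the street lamp example from the text; the field names, the `entries_for_keyword` helper and the department name are hypothetical, since the source does not specify a storage schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EventListEntry:
    """One row of the preset event list library (field names are assumptions)."""
    business_type: str
    primary_category: str
    secondary_category: str
    problem_types: List[str]
    keywords: List[str]
    department: str  # handling department


STREET_LAMP = EventListEntry(
    business_type="construction management",
    primary_category="public facilities",
    secondary_category="street lamp components",
    problem_types=["inclination", "flickering", "extinction"],
    keywords=[
        "street lamp inclination",
        "street lamp flickering",
        "street lamp extinction",
    ],
    department="street lighting office",  # hypothetical; not named in the source
)


def entries_for_keyword(library: List[EventListEntry], keyword: str) -> List[EventListEntry]:
    """Return every library entry whose keyword list contains the given keyword."""
    return [entry for entry in library if keyword in entry.keywords]
```

For example, `entries_for_keyword([STREET_LAMP], "street lamp flickering")` returns the street lamp entry, from which the business type and handling department can be read.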
S3, acquiring reported event data, segmenting it into words according to its summary and position information, and extracting event keywords;
Further, in step S3, HanLP is used for word segmentation. The segmentation result of HanLP consists of words and part-of-speech tags; each word is separated from its tag by "/", and adjacent words are separated by a space. When traversing the segmentation results of all texts, all punctuation is ignored according to the part-of-speech information and the number of occurrences of each word in the text is recorded. An ordered map keeps the words of each text sorted in dictionary order, each mapped to its number of occurrences.
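The counting pass described above can be sketched without calling HanLP itself, by parsing a string already in the "word/tag word/tag" form. The sketch assumes punctuation carries a tag beginning with "w", which should be checked against the tag set of the HanLP version in use; `count_words` is a hypothetical name.

```python
from collections import Counter
from typing import Dict

# Assumption: punctuation tokens carry a part-of-speech tag starting with "w".
PUNCTUATION_TAG_PREFIX = "w"


def count_words(segmented: str) -> Dict[str, int]:
    """Count words in a 'word/tag word/tag ...' string, skipping punctuation.

    The result is ordered by word in dictionary (code point) order, which
    mirrors the ordered map described in the text.
    """
    counts: Counter = Counter()
    for token in segmented.split():
        # Split on the last "/" so that words containing "/" stay intact.
        word, separator, tag = token.rpartition("/")
        if not separator or not word:
            continue  # malformed token without a tag
        if tag.startswith(PUNCTUATION_TAG_PREFIX):
            continue  # ignore punctuation
        counts[word] += 1
    return dict(sorted(counts.items()))
```

A plain `dict` preserves insertion order in Python 3.7 and later, so sorting the items once is enough to emulate an ordered map for read-only use.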
$\mathrm{NUM}_{ij}$ denotes the number of occurrences of the jth word in the ith text, and $\mathrm{WORD}_{ij}$ denotes the actual characters of the jth word in the ith text. In addition, the unique words that appear anywhere in the text data are collected, that is, all words are recorded globally, together with the number of text data entries in which each word appears. $\mathrm{TOTAL}_{k}$ denotes the total number of text data entries containing the kth global word, and $\mathrm{gWORD}_{k}$ denotes the actual characters of the kth global word. $\mathrm{TF}_{ij}$ is defined as the frequency of occurrence of the jth word in the ith text, and $\mathrm{IDF}_{k}$ as the inverse document frequency of the kth global word. $\mathrm{TF}_{ij}$ and $\mathrm{IDF}_{k}$ are obtained by the following calculations.
where $\mathrm{NUM}_{ij}$ represents the number of occurrences of the jth word in the ith text, r represents the number of different words in the text, $\mathrm{TOTAL}_{k}$ represents the total number of text data entries containing the kth global word, and $\mathrm{TOTAL}_{r}$ represents the total number of text data entries containing the rth global word.
Based on the above formulas, $\mathrm{TF}_{ij}$ and $\mathrm{IDF}_{k}$ are calculated for each word in the urban event data, with $\mathrm{WORD}_{ij}$ and $\mathrm{gWORD}_{k}$ required to be identical. The result is recorded as $\mathrm{TFIDF}_{ij}$, the term frequency-inverse document frequency (TF-IDF) value of the jth word in the ith text data. Finally, the main keywords of the urban event data are obtained by sorting these values in descending order and taking the top-ranked words. The calculation formula is:
$$\mathrm{TFIDF}_{ij} = \mathrm{TF}_{ij} \times \mathrm{IDF}_{k}, \quad \text{where } \mathrm{WORD}_{ij} = \mathrm{gWORD}_{k}$$
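The product formula can be illustrated with a short sketch. Because the TF and IDF formula images are missing from this text, the code assumes the conventional forms: term frequency as a word's count divided by the total count in its text, and inverse document frequency as the natural logarithm of the number of texts divided by the number of texts containing the word. The function name `tf_idf` and the input shape are also assumptions.

```python
import math
from typing import Dict, List


def tf_idf(docs: List[Dict[str, int]]) -> List[Dict[str, float]]:
    """Compute TF-IDF per word for each text.

    docs[i] maps each word of the i-th text to its count (NUM_ij).
    Assumed forms: TF = NUM_ij / sum of counts in text i,
                   IDF = ln(number of texts / TOTAL_k).
    """
    n_docs = len(docs)
    # TOTAL_k: number of texts containing each global word.
    total: Dict[str, int] = {}
    for counts in docs:
        for word in counts:
            total[word] = total.get(word, 0) + 1

    scores: List[Dict[str, float]] = []
    for counts in docs:
        text_length = sum(counts.values())
        # Looking IDF up by the word string enforces WORD_ij == gWORD_k.
        scores.append({
            word: (num / text_length) * math.log(n_docs / total[word])
            for word, num in counts.items()
        })
    return scores
```

With two texts `{"lamp": 2, "off": 1, "road": 1}` and `{"road": 1, "pothole": 1}`, the word "road" appears in both texts and scores 0, while "lamp" scores 0.5 × ln 2 in the first text.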
The method sets two thresholds to limit the number of keywords and to screen them. The first threshold, KEYNUM, is the number of keywords kept for each piece of urban event data; the method sets it to the integer 10. The second threshold, MINTFIDF, is the minimum TF-IDF value a word must have to be accepted as a keyword; the method sets it to the floating-point value 0.025.
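Applying the two thresholds amounts to a filter followed by a truncated sort. In the sketch below, KEYNUM and MINTFIDF take the values stated in the text; the function name `select_keywords` and the alphabetical tie-breaking rule are assumptions.

```python
from typing import Dict, List

KEYNUM = 10       # maximum number of keywords per event, as stated in the text
MINTFIDF = 0.025  # minimum TF-IDF value for a keyword, as stated in the text


def select_keywords(
    scores: Dict[str, float],
    keynum: int = KEYNUM,
    min_tfidf: float = MINTFIDF,
) -> List[str]:
    """Return up to `keynum` words with TF-IDF >= `min_tfidf`, highest first."""
    # Sort by descending score; ties are broken alphabetically (an assumption)
    # so that the result is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, score in ranked if score >= min_tfidf][:keynum]
```

For example, `select_keywords({"lamp": 0.35, "off": 0.17, "road": 0.0, "the": 0.01})` keeps only "lamp" and "off", because the other two words fall below 0.025.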
The keywords of each piece of urban event data are obtained in this way. When the keywords of two pieces of urban event data are identical and the absolute difference between the timestamps in their time fields is smaller than a certain critical value, the two pieces of text data are considered to belong to the same event.
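The duplicate test above can be sketched as a comparison of keyword sets plus a timestamp gap. The source does not give the critical value, so the one-hour `MAX_GAP_SECONDS` below is purely hypothetical, as are the `Report` type and the `deduplicate` helper.

```python
from typing import FrozenSet, List, NamedTuple


class Report(NamedTuple):
    keywords: FrozenSet[str]
    timestamp: float  # seconds since the epoch


MAX_GAP_SECONDS = 3600.0  # hypothetical critical value; not given in the source


def is_same_event(a: Report, b: Report, max_gap: float = MAX_GAP_SECONDS) -> bool:
    """Two reports describe one event if keywords match and times are close."""
    return a.keywords == b.keywords and abs(a.timestamp - b.timestamp) < max_gap


def deduplicate(reports: List[Report]) -> List[Report]:
    """Keep the earliest report of each event and drop later duplicates."""
    kept: List[Report] = []
    for report in sorted(reports, key=lambda r: r.timestamp):
        if not any(is_same_event(report, earlier) for earlier in kept):
            kept.append(report)
    return kept
```

Comparing each report against every kept report is quadratic in the worst case; grouping by keyword set first would scale better, but the simple loop keeps the rule easy to read.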
For example, take the event description "The street lamp on the middle road of Shen Hailu is always in an off state, which causes pedestrians and vehicles to be unable to see road surface obstacles at night". Using the method in S3, word segmentation is performed first, giving "Shen Hailu/medium road/on/street/always/on/off/state/, which causes/pedestrians/on/vehicle/on/night/unable to see/road surface/obstacle/"; the keywords of the event information are then extracted according to step S3.
S4, identifying the event from the event keywords extracted in step S3 and determining information such as the event type, area, event and degree of loss;
The invention uses an event identification method based on rule matching, matching against the information in Table 2.
Table 2. Match data format
STR is defined as the character string of the name to be matched. By string comparison, the number of Chinese characters that STR shares with each event type string recorded in the database is calculated; $\mathrm{nSTR}_{i}$ denotes the number of Chinese characters shared by the type to be matched and the ith type name in the database. The similarity $\mathrm{SIM}_{i}$ denotes the similarity between the type to be matched and the ith type, and len(STR) denotes the length of the name string to be matched. The similarity $\mathrm{SIM}_{i}$ can be represented by:
Based on this formula, the similarity between the event to be matched and every event type can be obtained, and sorting the similarities gives the event type name with the highest similarity.
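The similarity formula itself is not reproduced in this text. The sketch below assumes the natural reading of the surrounding definitions, namely the shared character count divided by the length of the string to be matched (nSTR_i / len(STR)), and counts a character of STR as shared if it occurs anywhere in the type name. Both choices, and the function names, are assumptions.

```python
from typing import List, Optional


def similarity(name: str, type_name: str) -> float:
    """Assumed form: SIM_i = nSTR_i / len(STR).

    nSTR_i is taken as the number of characters of `name` that also occur
    anywhere in `type_name`.
    """
    if not name:
        return 0.0
    type_chars = set(type_name)
    n_shared = sum(1 for ch in name if ch in type_chars)
    return n_shared / len(name)


def best_match(name: str, type_names: List[str]) -> Optional[str]:
    """Return the recorded type name with the highest similarity, if any."""
    if not type_names:
        return None
    return max(type_names, key=lambda t: similarity(name, t))
```

For instance, matching "路灯倾斜" against "路灯熄灭" shares the two characters "路" and "灯" out of four, giving a similarity of 0.5, while an identical type name scores 1.0 and is selected.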
S5, matching the event type determined in step S4 against the preset event list library to determine the event's business type and the responsible department;
S6, allocating the event according to the matching result of step S5.
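Steps S5 and S6 reduce to a lookup from the identified event type to a business type and department. The sketch below shows one possible shape for that lookup; the table contents, the department names and the manual-review fallback are all hypothetical, since the source does not describe what happens when no entry matches.

```python
from typing import Dict, Tuple

# Hypothetical routing table: event type -> (business type, responsible department).
ROUTING_TABLE: Dict[str, Tuple[str, str]] = {
    "street lamp extinction": ("construction management", "street lighting office"),
    "road surface damage": ("municipal administration", "road maintenance office"),
}

# Hypothetical fallback for event types with no entry in the library.
FALLBACK: Tuple[str, str] = ("unclassified", "manual review desk")


def allocate(event_type: str, table: Dict[str, Tuple[str, str]] = ROUTING_TABLE) -> Tuple[str, str]:
    """Return the (business type, department) pair an event is allocated to."""
    return table.get(event_type, FALLBACK)
```

Routing unmatched types to a human reviewer is a design choice made here for illustration; it keeps unknown events from being dropped silently.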
The invention also provides an urban event allocation system, as shown in Fig. 3, comprising an acquisition module, a preprocessing module, and an extraction and identification module. The acquisition module acquires historical urban event data. The preprocessing module structures the historical urban event data and constructs a preset event list library from the structured event data. The extraction and identification module extracts keywords from the reported event data, identifies the reported event from the extracted keywords, and displays the business type and responsible department of the reported event.
The present invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the urban event distribution method described above.
The above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to examples, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the scope of the technical solution of the present invention, which is intended to be covered by the claims of the present invention.

Claims (10)

1. A method for distributing urban events, comprising:
S1, aggregating historical urban event data and extracting the structured data in the event data;
S2, constructing a preset event list library from the structured data of the historical urban event data in step S1;
S3, acquiring reported event data, segmenting it into words according to its summary and position information, and extracting event keywords;
S4, identifying the event from the event keywords extracted in step S3 and determining the event type;
S5, matching the event type determined in step S4 against the preset event list library to determine the event's business type and the responsible department;
S6, allocating the event according to the matching result of step S5.
2. The method according to claim 1, wherein the city history event data in step S1 comprises structured data and unstructured data, the unstructured data is cleaned, the structured data is identified, and the identified structured data is marked.
3. The method of claim 1, wherein the word segmentation in step S3 is performed using HanLP, and the keywords in step S3 are selected as follows: the term frequency-inverse document frequency (TF-IDF) value of the jth word in the ith text data is calculated, the TF-IDF values of all words are sorted in descending order, and the top-ranked words are taken as keywords.
4. A method according to claim 3, wherein the number of keywords is limited to 10 or less.
5. A method according to claim 3, wherein the term frequency-inverse document frequency (TF-IDF) value is calculated by:
$$\mathrm{TFIDF}_{ij} = \mathrm{TF}_{ij} \times \mathrm{IDF}_{k}, \quad \text{where } \mathrm{WORD}_{ij} = \mathrm{gWORD}_{k}$$
where $\mathrm{TFIDF}_{ij}$ represents the TF-IDF value of the jth word in the ith text data, $\mathrm{TF}_{ij}$ represents the frequency of occurrence of the jth word in the ith text, $\mathrm{IDF}_{k}$ represents the inverse document frequency of the kth global word, $\mathrm{WORD}_{ij}$ represents the actual characters of the jth word in the ith text, and $\mathrm{gWORD}_{k}$ represents the actual characters of the kth global word.
6. The method of claim 5, wherein the frequency of occurrence $\mathrm{TF}_{ij}$ of the jth word in the ith text is calculated as follows:
where $\mathrm{NUM}_{ij}$ represents the number of occurrences of the jth word in the ith text, and r represents the number of different words in the ith text.
7. The method of claim 5, wherein the inverse document frequency $\mathrm{IDF}_{k}$ of the kth global word is calculated as follows:
where $\mathrm{TOTAL}_{k}$ represents the total number of text data entries containing the kth global word, $\mathrm{TOTAL}_{r}$ represents the total number of text data entries containing the rth global word, and r represents the number of different words in the text.
8. The method of claim 5, wherein the conditions for selecting a keyword include: $\mathrm{TFIDF}_{ij} \ge 0.025$.
9. An urban event allocation system, characterized by comprising an acquisition module, a preprocessing module, and an extraction and identification module, wherein the acquisition module acquires historical urban event data; the preprocessing module structures the historical urban event data and constructs a preset event list library from the structured event data; and the extraction and identification module extracts keywords from the reported event data, identifies the reported event from the extracted keywords, and displays the business type and responsible department of the reported event.
10. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the urban event distribution method according to any of claims 1-8.
CN202311533241.8A 2023-11-16 2023-11-16 City event allocation method, system and storage medium Pending CN117493906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311533241.8A CN117493906A (en) 2023-11-16 2023-11-16 City event allocation method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311533241.8A CN117493906A (en) 2023-11-16 2023-11-16 City event allocation method, system and storage medium

Publications (1)

Publication Number Publication Date
CN117493906A true CN117493906A (en) 2024-02-02

Family

ID=89668937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311533241.8A Pending CN117493906A (en) 2023-11-16 2023-11-16 City event allocation method, system and storage medium

Country Status (1)

Country Link
CN (1) CN117493906A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118568267A (en) * 2024-08-05 2024-08-30 中电科新型智慧城市研究院有限公司 Event distribution method, device, equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination