CN112214579B - Machine intelligent review method and system for short answer questions - Google Patents
Machine intelligent review method and system for short answer questions
- Publication number
- CN112214579B CN112214579B CN202011078190.0A CN202011078190A CN112214579B CN 112214579 B CN112214579 B CN 112214579B CN 202011078190 A CN202011078190 A CN 202011078190A CN 112214579 B CN112214579 B CN 112214579B
- Authority
- CN
- China
- Prior art keywords
- subject
- keyword
- sentence
- word
- score
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 39
- 238000012552 review Methods 0.000 title claims description 19
- 238000012360 testing method Methods 0.000 claims abstract description 81
- 230000004044 response Effects 0.000 claims abstract description 52
- 239000013598 vector Substances 0.000 claims description 34
- 230000003828 downregulation Effects 0.000 claims description 25
- 230000003827 upregulation Effects 0.000 claims description 24
- 238000004364 calculation method Methods 0.000 claims description 20
- 238000010276 construction Methods 0.000 claims description 6
- 238000013136 deep learning model Methods 0.000 claims description 4
- 238000011156 evaluation Methods 0.000 abstract description 7
- 230000006399 behavior Effects 0.000 abstract description 3
- 230000011218 segmentation Effects 0.000 description 9
- 230000009471 action Effects 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 7
- 230000008569 process Effects 0.000 description 4
- 238000012549 training Methods 0.000 description 4
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 238000013528 artificial neural network Methods 0.000 description 2
- 230000003796 beauty Effects 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 239000000284 extract Substances 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000006116 polymerization reaction Methods 0.000 description 2
- 206010040007 Sense of oppression Diseases 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000009323 psychological health Effects 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Educational Administration (AREA)
- Educational Technology (AREA)
- Strategic Management (AREA)
- Databases & Information Systems (AREA)
- Tourism & Hospitality (AREA)
- Pure & Applied Mathematics (AREA)
- Algebra (AREA)
- Computational Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Evolutionary Computation (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- General Business, Economics & Management (AREA)
- Electrically Operated Instructional Devices (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to examination paper evaluation and discloses a machine intelligent review method and system for short answer questions, which avoid misjudging answers that merely pile up keywords as high-scoring and produce more objective and accurate evaluation results. The method comprises the following steps: acquiring subject keywords and common keywords of a subject in advance based on a subject corpus, acquiring a related word set of each subject keyword, and constructing a keyword library of the subject; acquiring the answering information and standard answer of a target test question; extracting the subject keyword set and common keyword set in the standard answer based on the keyword library, and determining the related word set of each subject keyword to expand the subject keyword set; identifying subject keywords and associated words in the answering information based on the expanded subject keyword set, and identifying common keywords in the answering information based on the common keyword set; calculating the sentence reasonableness of the answering information; and calculating the score of the answering information according to a scoring formula.
Description
Technical Field
The application relates to examination paper review, and in particular to a machine intelligent review technology for short answer questions.
Background
Examinations are an indispensable part of teaching activities; they are used to check students' day-to-day learning and to assess teachers' teaching quality.
With the development of computer technology, examination papers are now generally evaluated with computer assistance. The automatic review of objective questions is relatively mature, but the automatic review of subjective questions (for example, short answer questions) remains a difficulty in computer-assisted review technology. Although some automatic review methods for subjective questions have been proposed, many problems remain. For example, the Chinese application with publication number CN108959261A discloses a natural-language-based device and method for judging subjective questions in test papers: the device first extracts keywords from the examinee's answer, then calculates the word similarity between the extracted keywords and the score-point keywords, calculates the sentence similarity between the sentences of the examinee's answer and the sentences of the reference answer, and finally obtains the judged score of the examinee's answer from the word similarity and the sentence similarity. The problem with this device is that the word similarity of keywords is calculated only from word-meaning similarity, and the semantic similarity of two sentences is calculated only from the word similarity of the words they contain. It therefore judges only the information expressed by a sentence and not the logic of the sentence, which easily leads to the misjudgment of awarding high scores to answers that merely pile up keywords.
Disclosure of Invention
The application aims to provide a machine intelligent review method and system for short answer questions that avoid the misjudgment of awarding high scores to answers that merely pile up keywords, so that scoring results are objective and accurate.
The application discloses a machine intelligent review method of short answer questions, which comprises the following steps:
acquiring subject keywords and common keywords of a subject in advance based on a subject corpus, generating a word vector table of the keywords, clustering the keywords based on the word vector table, acquiring a related word set of each subject keyword, and constructing a keyword library of the subject;
acquiring answering information and standard answers of the target test questions;
on the basis of the keyword library, extracting a subject keyword set and a common keyword set in the standard answers, and determining a relevant word set of each subject keyword to expand the subject keyword set;
identifying subject keywords and associated words in the answering information based on the expanded subject keyword set, and identifying common keywords in the answering information based on the common keyword set;
calculating the sentence reasonableness of the answering information, wherein the sentence reasonableness refers to the reasonable degree of the logical sequence and relationship between words in the sentence;
according to the formula Calculating a score F of the response information, wherein s 1 、s 2 、s 3 、s 4 Weight coefficients s representing subject keyword information, associated word information, general keyword information, and sentence reasonableness in the answer information 1 >s 2 >s 3 ,F 0 And summarizing the target test questions.
In a preferred embodiment, the calculating the sentence reasonability of the response information, where the sentence reasonability refers to the reasonability of the logical sequence and relationship between words in a sentence, further includes:
respectively extracting word sequences of each sentence in the answering information and the standard answers;
calculating the probability value of the position of each word in each word sequence in the sentence by adopting an N-gram language model according to the Markov assumption;
calculating a word reasonable probability value of each sentence according to the probability value of the position of each word in the sentence based on a Bayesian conditional probability model;
and calculating the sentence reasonableness of the answering information according to the answering information and the word reasonable probability value of each sentence in the standard answers.
In a preferred embodiment, the calculating the sentence reasonability of the response information according to the response information and the word reasonability probability value of each sentence in the standard answers further includes:
respectively calculating word reasonable probability mean values of sentences of the answering information and the standard answers;
if the mean value of the reasonable probabilities of the words of the sentence of the answering information is smaller than the mean value of the reasonable probabilities of the words of the sentence of the standard answer, the sentence reasonability of the answering information is the quotient of the mean value of the reasonable probabilities of the words of the sentence of the answering information and the mean value of the reasonable probabilities of the words of the sentence of the standard answer;
and if the mean value of the reasonable probability of the words of the sentence of the response information is greater than or equal to the mean value of the reasonable probability of the words of the sentence of the standard answer, the sentence reasonableness of the response information is 1.
In a preferred embodiment, the obtaining, based on the subject corpus, subject keywords and common keywords of the subject, generating a word vector table of each keyword, clustering each keyword based on the word vector table, and obtaining an associated word set of each subject keyword, further includes:
acquiring subject keywords and common keywords of the subject based on the subject corpus;
generating word vectors of the keywords by using a text deep language model to obtain a word vector table;
calculating the distance between the keywords in the word vector table;
and acquiring, for each subject keyword, the words whose distance to that subject keyword is less than a preset threshold, to form the related word set of that subject keyword.
In a preferred example, the text deep language model is a deep learning model based on word2vec;
and calculating the distance between the keywords in the word vector table by adopting a cosine similarity calculation method.
In a preferred embodiment, after the calculating the score of the response information, the method further includes:
calculating the average value of the scores of the answering information of all examinees in a preset examination group;
obtaining expected scores of the answering information of all examinees in the preset examination group, and calculating an average value of the expected scores;
and adjusting the scores of the answering information of each examinee according to the average value of the scores and the average value of the expected scores.
In a preferred embodiment, the adjusting the score of the answering information of each test taker according to the average value of the scores and the average value of the expected scores further comprises:
determining an expected average score range according to the average value of the expected scores, determining to adjust the scores upwards when the average value of the scores is smaller than the lower limit value of the expected average score range, and determining to adjust the scores downwards when the average value of the scores is larger than or equal to the upper limit value of the expected average score range;
after the determining to up-regulate the score or the determining to down-regulate the score, further comprising:
calculating a down-regulation or up-regulation base score as an absolute value of a difference between the average of the scores and the average of the expected scores;
according to the ranking of all the examinees in the preset examination group from high to low according to the scores of the answering information, dividing the examinees and the scores thereof into an excellent examinee set, a common examinee set and a poor examinee set according to the ranking result;
calculating an up-regulation or down-regulation score for each examinee in the common examinee set, equal to the up-regulation or down-regulation base score;
calculating, according to the formula, the up-regulation or down-regulation score for each examinee in the excellent examinee set, wherein TR represents the up-regulation or down-regulation base score, F0 represents the total score of the target test question, Fi represents the score of the i-th examinee in the excellent examinee set, and Si represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set;
calculating, according to the formula, the up-regulation or down-regulation score for each examinee in the poor examinee set, wherein f represents the number of examinees in the excellent examinee set, h represents the number of examinees in the poor examinee set, S represents the up-regulation or down-regulation score of each examinee in the poor examinee set, TR represents the up-regulation or down-regulation base score, and Si represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set;
and adjusting the scores of the answering information of each examinee up or down according to the calculated up-adjustment or down-adjustment values of the scores of each examinee.
The application also discloses a machine intelligent review system for short answer questions, comprising:
the keyword library construction module is used for acquiring subject keywords and common keywords of a subject in advance based on a subject corpus, generating a word vector table of the keywords, clustering the keywords based on the word vector table, and acquiring a related word set of each subject keyword, so as to construct a keyword library of the subject;
the acquisition module is used for acquiring answering information and standard answers of the target test questions;
the keyword identification module is used for extracting a subject keyword set and a common keyword set in the standard answers based on the keyword library, determining a related word set of each subject keyword to expand the subject keyword set, identifying the subject keywords and related words in the answering information based on the expanded subject keyword set, and identifying the common keywords in the answering information based on the common keyword set;
the reasonableness calculation module is used for calculating the sentence reasonableness of the answering information, wherein the sentence reasonableness refers to the reasonable degree of the logical sequence and relationship among the words in the sentence;
a scoring module for calculating a score F of the answering information according to the scoring formula, wherein s1, s2, s3 and s4 are the weight coefficients of the subject keyword information, associated word information, common keyword information and sentence reasonableness in the answering information respectively, s1 > s2 > s3, and F0 is the total score of the target test question.
The application also discloses a machine intelligent review system for short answer questions, comprising:
a memory for storing computer-executable instructions; and
a processor for implementing the steps in the method as described hereinbefore when executing the computer-executable instructions.
The present application also discloses a computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, implement the steps in the method as described hereinbefore.
Compared with the prior art, the embodiment of the application at least comprises the following advantages:
the keyword information contained in the answering information is identified by taking the standard answer as a reference, and the score of the answering information of the examinee is calculated by combining the keyword information with the sentence reasonability (namely the reasonability degree of the logical sequence and the relation between words in the sentence), so that the misjudgment of the behavior of obtaining high score by stacking the words is avoided, and the scoring result is more objective and accurate.
In addition, when the keywords are expanded, only the subject keyword set is expanded and the common keyword set is not, so words with low relevance are effectively filtered out (that is, common keywords are not expanded), which helps speed up subsequent keyword identification. Moreover, the subject keyword set is expanded with words that are identical or similar in meaning to the subject keywords or highly associated with them, so the expanded set covers a wider range of keywords. Answering information with a low proportion of subject keywords but a high proportion of associated words can therefore still be scored reasonably, which improves the flexibility and accuracy of the scoring results to a certain extent.
In addition, when the score is calculated, the subject keywords carry the highest scoring weight, the associated words a lower weight, and the common keywords the lowest weight, so answering information with a low proportion of common keywords but a high proportion of subject keywords and associated words can still obtain a reasonable score, which improves the accuracy of the scoring results to a certain extent.
Furthermore, the sentence reasonableness of the answering information is calculated from the word sequences of each sentence in the answering information and the standard answer, based on the Markov assumption and a Bayesian conditional probability model, so that the reasonableness result reflects how closely the answer matches natural language expression habits. Using the sentence reasonableness of the standard answer as the judgment reference unifies the judgment standard, avoids scoring discrepancies, and further improves the objectivity and accuracy of the scoring results.
In addition, taking the examinees' expected scores as the adjustment standard, the score of each examinee is adjusted according to the principle of "adjust high-scoring examinees less and low-scoring examinees more", whether adjusting up or down, so that the purpose of the examination is served and psychological problems caused by too large a gap between an examinee's actual score and expected score are avoided.
A large number of technical features are described in the specification of the present application, and are distributed in various technical solutions, so that the specification is too long if all possible combinations of the technical features (i.e., the technical solutions) in the present application are listed. In order to avoid this problem, the respective technical features disclosed in the above summary of the invention of the present application, the respective technical features disclosed in the following embodiments and examples, and the respective technical features disclosed in the drawings may be freely combined with each other to constitute various new technical solutions (which are considered to have been described in the present specification) unless such a combination of the technical features is technically infeasible. For example, in one example, the feature a + B + C is disclosed, in another example, the feature a + B + D + E is disclosed, and the features C and D are equivalent technical means for the same purpose, and technically only one feature is used, but not simultaneously employed, and the feature E can be technically combined with the feature C, then the solution of a + B + C + D should not be considered as being described because the technology is not feasible, and the solution of a + B + C + E should be considered as being described.
Drawings
Fig. 1 is a flow chart of a method for machine-intelligent review of short-response questions according to a first embodiment of the present application.
FIG. 2 is a schematic structural diagram of a machine intelligent review system for short-response questions according to a second embodiment of the present application.
Detailed Description
In the following description, numerous technical details are set forth in order to provide a better understanding of the present application. However, it will be understood by those skilled in the art that the technical solutions claimed in the present application may be implemented without these technical details and with various changes and modifications based on the following embodiments.
Description of partial concepts:
Subject keywords: subject-specific nouns or phrases appearing in the subject corpus, that is, all words that reflect the characteristics of the subject and carry meaningful information within the subject field.
Common keywords: words other than the subject keywords that appear frequently in the subject corpus and carry meaningful information.
Associated words: words with the same or similar meaning as the corresponding subject keyword, and words highly associated with the corresponding subject keyword within the subject. Each subject keyword may have 0, 1 or more associated words, and the associated words of a subject keyword are not limited to other subject keywords and/or common keywords.
Semantic related words: words with the same or similar meaning as the corresponding subject keyword.
Subject associated word: words with a high degree of subject association with the corresponding subject keyword.
Some innovations of the application are described below using a short answer question from the history subject:
Short answer question: briefly describe the role of the people in the War of Independence.
Standard answer: the American people, drawn ever more closely together by the formation of the American nation, eventually formed a tremendous revolutionary force, and their role in the American War of Independence cannot be ignored, as can be seen from the following aspects: (1) after the Seven Years' War ended, Britain began to tighten its oppression and exploitation of its colonies.
According to the embodiment of the application, the keywords are divided and expanded:
1) Identifying the subject keyword set and common keyword set in the standard answer based on the keyword library corresponding to the history subject:
subject keyword set: { [1-American nation] [1-revolutionary force] [1-War of Independence] … };
common keyword set: { [3-gather] [3-whole] [3-formation] [3-huge] … }.
2) Expanding the subject keyword set based on the keyword library of the history subject:
determining the associated word set corresponding to [1-American nation]: { [2-group] [2-people] [2-America] [2-union] … };
determining the associated word set corresponding to [1-War of Independence]: { [2-United States] [2-oppression] [2-exploitation] [2-liberty] [2-resistance] [2-revolution] [2-army] [2-armed forces] [2-New York] … }; and so on.
The expanded subject keyword set: { [1-American people] [1-American nation] [1-revolutionary force] [1-War of Independence] [2-group] [2-people] [2-America] [2-union] [2-huge] [2-United States] [2-oppression] [2-exploitation] [2-liberty] [2-resistance] [2-revolution] [2-army] [2-armed forces] [2-New York] … };
common keyword set: { [3-gather] [3-whole] [3-formation] … }.
It can be seen that:
1) when the keywords are expanded, only the subject keywords are expanded, words with low relevance are effectively filtered (namely common keywords are not expanded), and the identification speed of subsequent keyword identification is improved.
2) The expanded subject keywords cover wider keywords, so that answer information containing more relevant words can be reasonably scored during subsequent scoring calculation.
3) Because the keywords are reasonably categorized and selectively expanded, and because the scoring weights of the subject keywords and associated words are set higher than that of the common keywords in the subsequent score calculation, the scoring result is more reasonable and accurate. For example, consider the answer "the formation of the American nation made the people group together into a revolutionary force". Although it lacks the common keywords "gather" and "whole" and the associated word "huge", it contains the subject keywords "American nation" and "revolutionary force" and the associated words "people" and "group", so it can still obtain a relatively high score; the embodiment of the present application achieves exactly this effect.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The first embodiment of the present application relates to a machine intelligent review method for short answer questions, the flow of which is shown in fig. 1, and the method comprises the following steps:
A corresponding keyword library is constructed in advance for each subject. Specifically, the keyword library of each subject is constructed as follows: subject keywords and common keywords of the subject are acquired based on a large-scale subject corpus (including, for example, subject textbooks, test questions, and documents), a word vector table of the keywords is generated, the keywords are clustered based on the word vector table, a related word set of each subject keyword is acquired, and the keyword library of the subject is constructed, the keyword library containing the subject keywords and common keywords of the subject and the related word set of each subject keyword.
Optionally, the "acquiring subject keywords and common keywords of the subject based on the subject corpus, generating a word vector table of each keyword, clustering each keyword based on the word vector table, and acquiring a related word set of each subject keyword" may further include the following steps:
(1) acquiring subject keywords and common keywords of the subject based on the subject corpus;
(2) generating word vectors of the keywords by using a text deep language model to obtain a word vector table;
(3) calculating the distance between the keywords in the word vector table;
(4) acquiring, for each subject keyword, the words whose distance to that subject keyword is less than a preset threshold, to form the related word set of that subject keyword.
The text deep language model in step (2) may be, for example, a deep learning model based on word2vec, an NNLM neural network language model, an ELMo model, a BERT model, or the like, without limitation. In step (3), the distance between the keywords in the word vector table may be calculated by, for example, a cosine similarity calculation method, the Pearson correlation coefficient, or a Euclidean-distance-based similarity calculation method, without limitation.
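By way of illustration only, the following sketch shows how the related word sets could be built with a word2vec model and cosine similarity using the open-source gensim library; the vector size, the 0.75 similarity threshold and all names are illustrative assumptions rather than values specified by this embodiment.

```python
from gensim.models import Word2Vec

def build_related_word_sets(tokenized_corpus, subject_keywords, threshold=0.75):
    # Step (2): train word vectors over the subject corpus to obtain the word vector table.
    model = Word2Vec(sentences=tokenized_corpus, vector_size=200, window=5,
                     min_count=2, workers=4)
    related = {}
    for kw in subject_keywords:
        if kw not in model.wv:
            related[kw] = set()
            continue
        # Steps (3)-(4): cosine similarity between word vectors; keep words whose
        # distance to the subject keyword is below the preset threshold
        # (equivalently, whose cosine similarity is above it).
        related[kw] = {w for w in model.wv.index_to_key
                       if w != kw and model.wv.similarity(kw, w) >= threshold}
    return related
```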
In step 101, response information and standard answers of the target test questions are obtained.
Then, step 102 is entered, based on the keyword library, a subject keyword set and a common keyword set in the standard answer are extracted, and a related word set of each subject keyword is determined to expand the subject keyword set.
Optionally, step 102 is preceded by a step of performing sentence segmentation and word segmentation on the standard answer, and step 102 is then further implemented as: extracting the subject keyword set and common keyword set in the standard answer based on the keyword library and the word segmentation result of the standard answer, and determining the related word set of each subject keyword to expand the subject keyword set.
Optionally, after step 102, the following steps are further included: and removing the repeated associated words in the expanded subject keyword set from the subject keyword set, and removing the repeated common keywords in the common keyword set from the associated words in the expanded subject keyword set.
Optionally, after step 102, the following steps are further included: and labeling the category identification of each keyword in the expanded subject keyword set, and providing a basis for the subsequent step 103 of identifying different types of keywords and step 105 of determining the number of the different types of keywords. For example and without limitation, the following notations apply: the subject keyword may be labeled 1 and the associated word may be labeled 2. Optionally, the related words include semantic related words and disciplinary related words. For example and without limitation, the following notations apply: the semantic related word may be denoted by 21 and the disciplinary related word may be denoted by 22.
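The de-duplication and category labelling described above could be represented, for example, as follows; the tag values follow the 1/2/3 notation used in this description (the finer 21/22 split of associated words is omitted for brevity), while the function name and sample words are purely illustrative.

```python
def expand_and_label(subject_keywords, associated_words, common_keywords):
    # Remove associated words that duplicate subject keywords, and common keywords
    # that duplicate associated words, then label each keyword with its category id
    # (1 = subject keyword, 2 = associated word, 3 = common keyword).
    associated = set(associated_words) - set(subject_keywords)
    common = {w: 3 for w in set(common_keywords) - associated}
    labelled = {w: 1 for w in subject_keywords}
    labelled.update({w: 2 for w in associated})
    return labelled, common

labelled, common = expand_and_label(
    {"American nation", "revolutionary force"},
    {"people", "huge", "revolutionary force"},   # "revolutionary force" dropped as a duplicate
    {"gather", "whole", "huge"},                 # "huge" dropped as a duplicate
)
```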
Then, step 103 is performed, subject keywords and associated words in the response information are identified based on the expanded subject keyword set, and common keywords in the response information are identified based on the common keyword set.
Optionally, step 103 further includes a step of performing sentence segmentation and word segmentation on the answering information, and step 103 is then further implemented as: identifying the subject keywords and associated words in the answering information based on the expanded subject keyword set and the word segmentation result of the answering information, and identifying the common keywords in the answering information based on the common keyword set.
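A minimal sketch of this identification step is given below; it assumes the answer text is Chinese and is segmented with the open-source jieba tokenizer, and that the expanded subject keyword set has been labelled as shown above. All variable names are illustrative.

```python
import jieba

def identify_keywords(answer_text, expanded_subject_keywords, common_keywords):
    """expanded_subject_keywords: dict word -> tag (1 = subject keyword, 2 = associated word);
    common_keywords: dict word -> tag (3 = common keyword)."""
    tokens = jieba.lcut(answer_text)           # word segmentation of the answering information
    counts = {"subject": 0, "associated": 0, "common": 0}
    for tok in tokens:
        tag = expanded_subject_keywords.get(tok)
        if tag == 1:
            counts["subject"] += 1
        elif tag == 2:
            counts["associated"] += 1
        elif tok in common_keywords:
            counts["common"] += 1
    return counts
```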
Then, step 104 is entered: calculating the sentence reasonableness of the answering information, where the sentence reasonableness refers to how reasonable the logical order of, and relationships between, the words in a sentence are. Specifically, the sentence reasonableness measures how reasonable the word order of a sentence is (the logical order of and relationships between its words), and is calculated with a language model trained on a large amount of language data.
Optionally, step 104 may further comprise the following sub-steps 104 a-104 d:
step 104 a: respectively extracting the word sequence of each sentence in the answering information and the standard answer;
step 104 b: calculating the probability value of the position of each word in each word sequence in the sentence by adopting an N-gram language model according to the Markov assumption;
step 104 c: calculating a word reasonable probability value of each sentence according to the probability value of the position of each word in the sentence based on a Bayesian conditional probability model;
step 104 d: and calculating the sentence reasonability of the response information according to the word reasonable probability value of each sentence in the response information and the standard answers.
In one embodiment, the N-gram language model is a trigram language model. In other embodiments, the N-gram language model may also be a bigram, 4-gram or 5-gram language model, or the like. Taking the trigram model as an example: according to the Markov assumption, the probability of any word appearing at a given position is related only to one or a limited number of the words appearing before it; with a trigram model, the word at the current position is related only to the two words before it, i.e. P(wn | w1, w2, ..., wn-1) ≈ P(wn | wn-2, wn-1), and the trigram language model is obtained by training on the large-scale subject corpus. Based on the sentence word sequences of the standard answer and of the examinee's answering information, the probability of each word appearing at its position in the current sentence is obtained, and the word reasonable probability of each sentence in the standard answer and the examinee's answering information is then calculated according to the Bayesian conditional probability model. The specific calculation is as follows: let S denote the word sequence of a sentence; then the sentence probability is P(S) = P(w1, w2, ..., wn) = P(w1) × P(w2 | w1) × ... × P(wn | w1, w2, ..., wn-1), where P(w1) is the probability of the first word appearing, P(w2 | w1) is the probability of w2 appearing immediately after w1 has appeared, and so on; P(w2 | w1), P(w3 | w1, w2), ..., P(wn | w1, w2, ..., wn-1) are obtained from the constructed trigram language model.
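The trigram calculation above can be sketched as follows; maximum-likelihood counts with a small probability floor are used here because the text does not specify the training or smoothing details, so those choices are assumptions.

```python
from collections import Counter

def train_trigram(tokenized_sentences):
    # Count bigram contexts and trigrams over the subject corpus.
    bi, tri = Counter(), Counter()
    for sent in tokenized_sentences:
        padded = ["<s>", "<s>"] + sent
        bi.update(zip(padded, padded[1:]))
        tri.update(zip(padded, padded[1:], padded[2:]))
    return bi, tri

def sentence_probability(sent, bi, tri, floor=1e-6):
    # P(S) = P(w1) x P(w2 | w1) x ... x P(wn | wn-2, wn-1); the two <s> pads stand in
    # for the empty context of the first two words.
    padded = ["<s>", "<s>"] + sent
    p = 1.0
    for i in range(2, len(padded)):
        context = (padded[i - 2], padded[i - 1])
        cond = tri[(*context, padded[i])] / bi[context] if bi[context] else 0.0
        p *= max(cond, floor)
    return p
```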
Optionally, step 104d may be further implemented as: respectively calculating the word reasonable probability mean values of the sentences of the answering information and the standard answers; if the mean value of the reasonable probabilities of the words of the sentence of the answering information is smaller than the mean value of the reasonable probabilities of the words of the sentence of the standard answer, the sentence reasonability of the answering information is the quotient of the mean value of the reasonable probabilities of the words of the sentence of the answering information and the mean value of the reasonable probabilities of the words of the sentence of the standard answer; if the mean value of the reasonable probabilities of the words of the sentence of the response information is greater than or equal to the mean value of the reasonable probabilities of the words of the sentence of the standard answer, the sentence reasonableness of the response information is 1.
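Sub-step 104d then reduces to the capped ratio below (a sketch; the function and argument names are illustrative).

```python
def sentence_reasonableness(answer_probs, standard_probs):
    # Mean word-reasonable probability of the answer sentences versus the standard answer
    # sentences; the result is capped at 1 when the answer is at least as fluent as the
    # standard answer.
    answer_mean = sum(answer_probs) / len(answer_probs)
    standard_mean = sum(standard_probs) / len(standard_probs)
    return min(answer_mean / standard_mean, 1.0)
```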
Then, step 105 is entered: calculating a score F of the answering information according to the scoring formula, where s1, s2, s3 and s4 are the weight coefficients of the subject keyword information, associated word information, common keyword information and sentence reasonableness in the answering information respectively, s1 > s2 > s3, and F0 is the total score of the target test question.
Note that, in step 105, the number of associated words in the answering information means the number of associated words after excluding those identical to subject keywords, and the number of common keywords in the answering information means the number of common keywords after excluding those identical to associated words.
Optionally, the associated words comprise semantic associated words and subject associated words, and step 105 is further implemented as: calculating the score F of the answering information according to the corresponding scoring formula, where s1, s2, s3 and s4 are the weight coefficients of the subject keyword information, associated word information, common keyword information and sentence reasonableness in the answering information respectively, a is the weight coefficient of the semantic associated word information within the associated word information, b is the weight coefficient of the subject associated word information within the associated word information, s1 > s2 > s3, a > b, and F0 is the total score of the target test question.
Here, s4 may lie between s1 and s3, preferably s1 > s4 > s2 > s3.
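The scoring formula itself is given only as an image in the original publication and is not reproduced in this text, so the sketch below is just one plausible reading of step 105: each keyword category contributes its coverage of the standard answer, weighted by s1, s2 (split a/b over semantic and subject associated words), s3 and s4, and scaled by the total score F0. The default weight values are illustrative assumptions chosen so that the four terms sum to 1, meaning a perfect answer scores exactly F0.

```python
def score(answer_counts, standard_counts, reasonableness, F0,
          s1=0.45, s2=0.2, s3=0.1, s4=0.25, a=0.6, b=0.4):
    # Coverage of each keyword category relative to the standard answer, capped at 1
    # so that repeating a keyword many times cannot raise the score.
    def coverage(key):
        return min(answer_counts[key] / standard_counts[key], 1.0) if standard_counts[key] else 0.0

    associated = a * coverage("semantic_associated") + b * coverage("subject_associated")
    return F0 * (s1 * coverage("subject")
                 + s2 * associated
                 + s3 * coverage("common")
                 + s4 * reasonableness)
```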
Optionally, after step 105, the following steps a to C may be further included:
a: calculating the average value of the scores of the answering information of all examinees in the preset examination group;
b: obtaining expected scores of the answering information of all examinees in the preset examination group, and calculating an average value of the expected scores;
c: and adjusting the scores of the answering information of each examinee according to the average value of the scores and the average value of the expected scores.
Optionally, the step C may be further implemented as: and determining an expected average score range according to the average value of the expected scores, determining to adjust the score upwards when the average value of the score is smaller than the lower limit value of the expected average score range, and determining to adjust the score downwards when the average value of the score is larger than or equal to the upper limit value of the expected average score range.
Optionally, after the determining to adjust the score up or the determining to adjust the score down, the following steps a to f may be further included:
a: calculating a down-regulation or up-regulation base score as an absolute value of a difference between the average of the scores and the average of the expected scores;
b: according to the ranking from high to low of the scores of all the examinees in the preset examination group according to the answering information, dividing the examinees and the scores thereof into an excellent examinee set, a common examinee set and a poor examinee set according to the ranking result;
c: calculating the up-regulation or down-regulation score of each examinee in the common examinee set to be equal to the up-regulation or down-regulation base score;
d: calculating, according to the formula, the up-regulation or down-regulation score of each examinee in the excellent examinee set, where TR represents the up-regulation or down-regulation base score, F0 represents the total score of the target test question, Fi represents the score of the i-th examinee in the excellent examinee set, and Si represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set;
e: calculating, according to the formula, the up-regulation or down-regulation score of each examinee in the poor examinee set, where f represents the number of examinees in the excellent examinee set, h represents the number of examinees in the poor examinee set, S represents the up-regulation or down-regulation score of each examinee in the poor examinee set, TR represents the up-regulation or down-regulation base score, and Si represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set;
f: and adjusting the score of the answering information of each examinee up or down according to the calculated up-adjustment or down-adjustment value of the score of each examinee.
Alternatively, the above-mentioned "determining the expected average score range according to the average of the expected scores" may be set by manual input of the reviewer or by default by the system to the average of the expected scores ± a preset score, which may be, for example, but not limited to, 2, 3, 4, 5, etc.
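The adjustment formulas of steps d and e likewise appear only as images in the original publication. The sketch below therefore assumes one consistent reading of the surrounding description: the common set receives exactly the base score TR, an excellent examinee's adjustment shrinks as the score approaches the full score F0, and the amount saved on the excellent set is redistributed to the poor set on top of TR. The three-way split into equal thirds and the formula forms are assumptions, not the patent's definitions.

```python
def adjust_scores(scores, expected_scores, F0, band=3):   # band: the ± preset score (e.g. 2-5)
    avg = sum(scores) / len(scores)
    expected_avg = sum(expected_scores) / len(expected_scores)
    if expected_avg - band <= avg < expected_avg + band:
        return scores                                  # within the expected average range: no adjustment
    sign = 1 if avg < expected_avg - band else -1      # up-regulate, otherwise down-regulate
    TR = abs(avg - expected_avg)                       # base adjustment score (step a)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    third = max(1, len(ranked) // 3)
    excellent, poor = ranked[:third], ranked[-third:]  # the middle third forms the common set
    adjust = [TR] * len(scores)                        # common examinees get exactly TR (step c)
    for i in excellent:
        adjust[i] = TR * (F0 - scores[i]) / F0         # assumed form of the step-d formula
    surplus = len(excellent) * TR - sum(adjust[i] for i in excellent)
    for i in poor:
        adjust[i] = TR + surplus / len(poor)           # assumed form of the step-e formula
    return [s + sign * adjust[i] for i, s in enumerate(scores)]
```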
The second embodiment of the present application relates to a machine intelligent review system for a short-response question, which has a structure shown in fig. 2 and includes a keyword library construction module, a keyword library, an acquisition module, a keyword recognition module, a reasonableness calculation module, and a scoring module.
Specifically, the keyword library construction module is configured to obtain subject keywords and common keywords of the subject based on the subject corpus, generate a word vector table of each keyword, perform clustering on each keyword based on the word vector table, and obtain a related word set of each subject keyword, so as to construct the keyword library of the subject. The keyword library construction module is used for constructing a keyword library for each subject, and the keyword library of each subject comprises subject keywords of the subject, common keywords and associated word sets corresponding to the subject keywords.
Optionally, the keyword library construction module is further configured to: acquire subject keywords and common keywords of the subject based on the subject corpus; generate word vectors of the keywords by using a text deep language model to obtain a word vector table; calculate the distance between the keywords in the word vector table; and acquire, for each subject keyword, the words whose distance to that subject keyword is less than a preset threshold, to form the related word set of that subject keyword. The text deep language model may be, for example, a deep learning model based on word2vec, an NNLM neural network language model, an ELMo model, a BERT model, or the like, without limitation. The distance between the keywords in the word vector table may be calculated by, for example, a cosine similarity calculation method, the Pearson correlation coefficient, or a Euclidean-distance-based similarity calculation method, without limitation.
The acquisition module is used for acquiring the answering information and the standard answers of the target test questions.
The keyword recognition module is used for extracting a subject keyword set and a common keyword set in the standard answer based on the keyword library and determining a related word set of each subject keyword to expand the subject keyword set; and identifying subject keywords and associated words in the response information based on the expanded subject keyword set, and identifying common keywords in the response information based on the common keyword set.
Optionally, the keyword recognition module is further configured to perform sentence segmentation and word segmentation on the standard answer, extract a subject keyword set and a common keyword set in the standard answer based on the keyword library and the word segmentation result of the standard answer, and determine a related word set of each subject keyword to expand the subject keyword set. Optionally, the keyword recognition module is further configured to perform word segmentation and sentence segmentation on the response information, recognize subject keywords and associated words in the response information based on the expanded subject keyword set and the word segmentation result of the response information, and recognize common keywords in the response information based on the common keyword set.
Optionally, the keyword recognition module is further configured to perform deduplication on associated words in the expanded subject keyword set, the associated words being repeated with the subject keywords, and eliminate common keywords in the common keyword set, the common keywords being repeated with the associated words in the expanded subject keyword set.
Optionally, the keyword recognition module is further configured to label each keyword in the expanded subject keyword set with a category identifier thereof, so as to provide a basis for subsequent recognition of different types of keywords and determination of the number of different types of keywords, for example, but not limited to, the following labeling manner is adopted: the subject keyword may be labeled 1 and the associated word may be labeled 2. Optionally, the relevant words include semantic relevant words and disciplinary relevant words, for example and without limitation, the following notation is adopted: the semantic related word may be denoted by 21 and the disciplinary related word may be denoted by 22.
The reasonability calculation module is used for calculating the sentence reasonability of the answering information, and the sentence reasonability refers to the reasonability of the logical sequence and relationship between words in the sentence. Specifically, the sentence reasonableness is a calculation of the reasonable degree of the word order (logical sequence and relation between words) of the sentence, and is calculated based on a language model of a large amount of language training.
Optionally, the reasonableness calculation module is further configured to extract word sequences of each sentence in the answer information and the standard answer respectively; calculating the probability value of the position of each word in each word sequence in the sentence by adopting an N-gram language model according to the Markov hypothesis; calculating word reasonable probability value of each sentence according to the probability value of the position of each word in the sentence based on a Bayesian conditional probability model; and calculating the sentence reasonability of the response information according to the word reasonable probability value of each sentence in the response information and the standard answers.
In one embodiment, the N-gram language model is a trigram language model. In other embodiments, the N-gram language model may also be a bigram, 4-gram or 5-gram language model, or the like. Taking the trigram model as an example: according to the Markov assumption, the probability of any word appearing at a given position is related only to one or a limited number of the words appearing before it; with a trigram model, the word at the current position is related only to the two words before it, i.e. P(wn | w1, w2, ..., wn-1) ≈ P(wn | wn-2, wn-1), and the trigram language model is obtained by training on the large-scale subject corpus. Based on the sentence word sequences of the standard answer and of the examinee's answering information, the probability of each word appearing at its position in the current sentence is obtained, and the word reasonable probability of each sentence in the standard answer and the examinee's answering information is then calculated according to the Bayesian conditional probability model. The specific calculation is as follows: let S denote the word sequence of a sentence; then the sentence probability is P(S) = P(w1, w2, ..., wn) = P(w1) × P(w2 | w1) × ... × P(wn | w1, w2, ..., wn-1), where P(w1) is the probability of the first word appearing, P(w2 | w1) is the probability of w2 appearing immediately after w1 has appeared, and so on; P(w2 | w1), P(w3 | w1, w2), ..., P(wn | w1, w2, ..., wn-1) are obtained from the constructed trigram language model.
Optionally, the reasonableness calculation module is further configured to calculate word reasonable probability averages of sentences of the answer information and the standard answers, respectively; if the mean value of the reasonable probabilities of the words of the sentence of the answering information is smaller than the mean value of the reasonable probabilities of the words of the sentence of the standard answer, the sentence reasonability of the answering information is the quotient of the mean value of the reasonable probabilities of the words of the sentence of the answering information and the mean value of the reasonable probabilities of the words of the sentence of the standard answer; if the mean value of the reasonable probabilities of the words of the sentence of the response information is greater than or equal to the mean value of the reasonable probabilities of the words of the sentence of the standard answer, the sentence reasonableness of the response information is 1.
The scoring module is used for calculating a score F of the answering information according to the scoring formula, where s1, s2, s3 and s4 are the weight coefficients of the subject keyword information, associated word information, common keyword information and sentence reasonableness in the answering information respectively, s1 > s2 > s3, and F0 is the total score of the target test question.
Optionally, the associated words comprise semantic associated words and subject associated words, and the scoring module is further configured to calculate the score F of the answering information according to the corresponding scoring formula, where s1, s2, s3 and s4 are the weight coefficients of the subject keyword information, associated word information, common keyword information and sentence reasonableness in the answering information respectively, a is the weight coefficient of the semantic associated word information within the associated word information, b is the weight coefficient of the subject associated word information within the associated word information, s1 > s2 > s3, a > b, and F0 is the total score of the target test question.
Here, s4 may lie between s1 and s3, preferably s1 > s4 > s2 > s3.
Optionally, the scoring module is further configured to calculate an average value of the scores of the answering information of all the examinees in the preset examination group; obtaining expected scores of the answering information of all examinees in the preset examination group, and calculating the average value of the expected scores; and adjusting the scores of the answering information of each examinee according to the average value of the scores and the average value of the expected scores.
Optionally, the scoring module is further configured to determine an expected average score range according to the average value of the expected scores, determine to adjust the score upward when the average value of the score is smaller than a lower limit value of the expected average score range, and determine to adjust the score downward when the average value of the score is greater than or equal to an upper limit value of the expected average score range.
Optionally, the scoring module is further configured to: calculate the down-regulation or up-regulation base score as the absolute value of the difference between the average of the scores and the average of the expected scores; rank all examinees in the preset examination group from high to low according to the scores of their answering information, and divide the examinees and their scores into an excellent examinee set, a common examinee set and a poor examinee set according to the ranking result; calculate the up-regulation or down-regulation score of each examinee in the common examinee set to be equal to the up-regulation or down-regulation base score; calculate, according to the formula, the up-regulation or down-regulation score of each examinee in the excellent examinee set, where TR represents the up-regulation or down-regulation base score, F0 represents the total score of the target test question, Fi represents the score of the i-th examinee in the excellent examinee set, and Si represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set; calculate, according to the formula, the up-regulation or down-regulation score of each examinee in the poor examinee set, where f represents the number of examinees in the excellent examinee set, h represents the number of examinees in the poor examinee set, S represents the up-regulation or down-regulation score of each examinee in the poor examinee set, TR represents the up-regulation or down-regulation base score, and Si represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set; and adjust the score of the answering information of each examinee up or down according to the calculated up-regulation or down-regulation score of each examinee.
Alternatively, the above-mentioned "determining the expected average score range according to the average of the expected scores" may be set by manual input of the reviewer or by default by the system to the average of the expected scores ± a preset score, which may be, for example, but not limited to, 2, 3, 4, 5, etc.
The first embodiment is a method embodiment corresponding to the present embodiment, and the technical details in the first embodiment may be applied to the present embodiment, and the technical details in the present embodiment may also be applied to the first embodiment.
It should be noted that, those skilled in the art should understand that the implementation functions of the modules shown in the embodiment of the machine intelligent review system for the short-answer question can be understood by referring to the related description of the machine intelligent review method for the short-answer question. The functions of the modules shown in the embodiment of the machine intelligent review system for short answer questions can be implemented by a program (executable instructions) running on a processor, and can also be implemented by specific logic circuits. The machine intelligent review system for the short answer questions in the embodiment of the application can be stored in a computer readable storage medium if the system is realized in the form of a software functional module and is sold or used as an independent product. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof contributing to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Accordingly, the present application also provides a computer-readable storage medium in which computer-executable instructions are stored; when executed by a processor, the computer-executable instructions implement the method embodiments of the present application. Computer-readable storage media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, a computer-readable storage medium does not include transitory computer-readable media such as modulated data signals and carrier waves.
In addition, the embodiment of the application also provides a machine intelligent review system for the short answer question, which comprises a memory for storing computer-executable instructions and a processor; the processor is configured to implement the steps of the method embodiments described above when executing the computer-executable instructions in the memory. The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. The memory may be a read-only memory (ROM), a random access memory (RAM), a flash memory (Flash), a hard disk, or a solid-state disk. The steps of the method disclosed in the embodiments of the present invention may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
It is noted that, in the present patent application, relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprises a" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element. In the present patent application, if it is mentioned that an action is performed according to a certain element, it means that the action is performed according to at least that element, which covers two cases: performing the action based only on that element, and performing the action based on that element together with other elements. Expressions such as "multiple" and "a plurality" mean two or more.
All documents mentioned in this application are deemed to be incorporated in their entirety into the disclosure of this application, so that they may be referred to as necessary. It should be understood that the above description covers only preferred embodiments of the present disclosure and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of one or more embodiments of the present disclosure shall be included in the scope of protection of one or more embodiments of the present disclosure.
Claims (9)
1. A machine intelligent review method for short answer questions, characterized by comprising the following steps:
acquiring, in advance, subject keywords and common keywords of the subject based on a subject corpus, generating a word vector table of each keyword, clustering the keywords based on the word vector table to acquire the related word set of each subject keyword, and constructing a keyword library of the subject;
acquiring answering information and standard answers of the target test questions;
based on the keyword library, extracting a subject keyword set and a common keyword set from the standard answer, and determining the related word set of each subject keyword to expand the subject keyword set;
identifying subject keywords and related words in the answering information based on the expanded subject keyword set, and identifying common keywords in the answering information based on the common keyword set;
calculating the sentence reasonableness of the answering information, wherein the sentence reasonableness refers to the degree of reasonableness of the logical order of and relationships between the words in a sentence, and this step further comprises the following substeps: extracting the word sequence of each sentence in the answering information and the standard answer respectively, calculating, by adopting an N-gram language model and according to the Markov assumption, a probability value of each word in each word sequence appearing at its position in the sentence, calculating a word reasonable probability value of each sentence from the probability values of the positions of its words on the basis of a Bayesian conditional probability model, and calculating the sentence reasonableness of the answering information from the word reasonable probability values of the sentences in the answering information and the standard answer;
calculating a score F of the answering information according to the formula, wherein s_1, s_2, s_3 and s_4 represent the weight coefficients of the subject keyword information, the related word information, the common keyword information and the sentence reasonableness in the answering information, respectively, s_1 > s_2 > s_3, and F_0 represents the total score of the target test question.
2. The machine intelligent review method for short answer questions according to claim 1, wherein the calculating of the sentence reasonableness of the answering information from the word reasonable probability values of each sentence of the answering information and the standard answer further comprises:
respectively calculating word reasonable probability mean values of sentences of the answering information and the standard answers;
if the mean value of the word reasonable probabilities of the sentences of the answering information is smaller than the mean value of the word reasonable probabilities of the sentences of the standard answer, the sentence reasonableness of the answering information is the quotient of the former divided by the latter;
and if the mean value of the reasonable probability of the words of the sentence of the answering information is greater than or equal to the mean value of the reasonable probability of the words of the sentence of the standard answer, the sentence reasonableness of the answering information is 1.
3. The machine intelligent review method for short answer questions according to claim 1, wherein the acquiring of subject keywords and common keywords of the subject based on the subject corpus, generating a word vector table of each keyword, and clustering each keyword based on the word vector table to acquire the related word set of each subject keyword further comprises:
acquiring subject keywords and common keywords of the subject based on the subject corpus;
generating a word vector of each keyword by using a text depth language model to obtain a word vector table;
calculating the distance between each keyword in the word vector table;
and acquiring words with the distance from each subject keyword being less than a preset threshold value to form a related word set of each subject keyword.
4. The machine intelligent review method for short answer questions according to claim 3, wherein the text depth language model is a word2vec-based deep learning model;
and calculating the distance between the keywords in the word vector table by adopting a cosine similarity calculation method.
5. The machine intelligent review method for short answer questions according to any one of claims 1 to 4, further comprising, after the calculating of the score of the answering information:
calculating the average value of the scores of the answering information of all examinees in a preset examination group;
obtaining expected scores of the answering information of all examinees in the preset examination group, and calculating an average value of the expected scores;
and adjusting the scores of the answering information of each examinee according to the average value of the scores and the average value of the expected scores.
6. The machine intelligent review method for short answer questions according to claim 5, wherein the adjusting of the score of the answering information of each examinee according to the average of the scores and the average of the expected scores further comprises:
determining an expected average score range according to the average value of the expected scores, determining to adjust the scores upwards when the average value of the scores is smaller than the lower limit value of the expected average score range, and determining to adjust the scores downwards when the average value of the scores is larger than or equal to the upper limit value of the expected average score range;
after the determining to adjust the score upward or the determining to adjust the score downward, further comprising:
calculating a down-regulation or up-regulation base score as an absolute value of a difference between the average of the scores and the average of the expected scores;
ranking all examinees in the preset examination group from high to low according to the scores of their answering information, and dividing the examinees and their scores into an excellent examinee set, a common examinee set and a poor examinee set according to the ranking result;
setting the up-regulation or down-regulation score of each examinee in the common examinee set equal to the up-regulation or down-regulation base score;
calculating, according to the formula, the up-regulation or down-regulation score of each examinee in the excellent examinee set, wherein TR represents the up-regulation or down-regulation base score, F_0 represents the total score of the target test question, F_i represents the score of the i-th examinee in the excellent examinee set, and S_i represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set;
calculating, according to the formula, the up-regulation or down-regulation score of each examinee in the poor examinee set, wherein f represents the number of examinees in the excellent examinee set, h represents the number of examinees in the poor examinee set, S represents the up-regulation or down-regulation score of each examinee in the poor examinee set, TR represents the up-regulation or down-regulation base score, and S_i represents the up-regulation or down-regulation score of the i-th examinee in the excellent examinee set;
and adjusting the score of the answering information of each examinee up or down according to the calculated up-regulation or down-regulation score of that examinee.
7. A machine intelligent review system for short answer questions, characterized by comprising:
a keyword library construction module, configured to acquire, in advance, subject keywords and common keywords of the subject based on the subject corpus, generate a word vector table of each keyword, and cluster each keyword based on the word vector table to acquire the related word set of each subject keyword, so as to construct the keyword library of the subject;
an acquisition module, configured to acquire the answering information and the standard answer of the target test question;
a keyword identification module, configured to extract a subject keyword set and a common keyword set from the standard answer based on the keyword library, determine the related word set of each subject keyword to expand the subject keyword set, identify subject keywords and related words in the answering information based on the expanded subject keyword set, and identify common keywords in the answering information based on the common keyword set;
a reasonableness calculation module, configured to calculate the sentence reasonableness of the answering information, wherein the sentence reasonableness refers to the degree of reasonableness of the logical order of and relationships between the words in a sentence; the reasonableness calculation module is further configured to extract the word sequence of each sentence in the answering information and the standard answer respectively, calculate, by adopting an N-gram language model and according to the Markov assumption, a probability value of each word in each word sequence appearing at its position in the sentence, calculate a word reasonable probability value of each sentence from the probability values of the positions of its words on the basis of a Bayesian conditional probability model, and calculate the sentence reasonableness of the answering information from the word reasonable probability values of the sentences in the answering information and the standard answer;
a scoring module, configured to calculate a score F of the answering information according to the formula, wherein s_1, s_2, s_3 and s_4 represent the weight coefficients of the subject keyword information, the related word information, the common keyword information and the sentence reasonableness in the answering information, respectively, s_1 > s_2 > s_3, and F_0 represents the total score of the target test question.
8. A machine intelligent review system for short answer questions, characterized by comprising:
a memory for storing computer-executable instructions; and
a processor for implementing the steps in the method of any one of claims 1 to 6 when executing the computer-executable instructions.
9. A computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor implement the steps in the method of any one of claims 1 to 6.
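Claims 1 and 2 describe the sentence-reasonableness calculation: an N-gram language model under the Markov assumption gives each word a positional probability, a Bayesian chain-rule product gives each sentence a word reasonable probability value, and the mean value over the answer's sentences is divided by the mean value over the standard answer's sentences (capped at 1). The Python sketch below is an illustration under stated assumptions: the bigram order, the add-one smoothing, the geometric-mean normalisation and the `<s>` start token are not mandated by the claims, and the function names are hypothetical.

```python
from collections import Counter
from math import exp, log

def train_bigram(corpus_sentences):
    """Count-based bigram model; add-one smoothing is an assumed choice."""
    unigrams, bigrams = Counter(), Counter()
    for words in corpus_sentences:
        padded = ["<s>"] + list(words)
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    vocab = len(unigrams)

    def prob(prev, word):
        # Probability of `word` following `prev`, smoothed so unseen pairs stay positive.
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)

    return prob

def sentence_word_probability(words, prob):
    """Per-sentence word reasonable probability value: chain-rule product of
    positional probabilities, length-normalised as a geometric mean (assumption)."""
    padded = ["<s>"] + list(words)
    log_p = sum(log(prob(prev, w)) for prev, w in zip(padded, padded[1:]))
    return exp(log_p / len(words))

def sentence_reasonableness(answer_sentences, standard_sentences, prob):
    # Quotient rule from claim 2: capped at 1 when the answer is at least as fluent.
    ans_mean = sum(sentence_word_probability(s, prob) for s in answer_sentences) / len(answer_sentences)
    std_mean = sum(sentence_word_probability(s, prob) for s in standard_sentences) / len(standard_sentences)
    return 1.0 if ans_mean >= std_mean else ans_mean / std_mean
```

In use, train_bigram would be fed a word-segmented subject corpus, and the resulting prob function passed to sentence_reasonableness together with the segmented answer and standard-answer sentences.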
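Claims 3 and 4 describe building the related word set of each subject keyword from a word2vec word-vector table and a cosine-similarity distance with a preset threshold. The sketch below assumes the vector table has already been produced by some word2vec-style model (for example gensim's Word2Vec) and treats the 0.35 threshold, the use of 1 − cosine similarity as the distance, and the function names as assumptions rather than values fixed by the claims.

```python
import numpy as np

def cosine_distance(u, v):
    # Distance taken as 1 - cosine similarity, per the cosine-similarity wording in claim 4.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def related_word_sets(word_vectors, subject_keywords, threshold=0.35):
    """Collect, for each subject keyword, the words whose distance to it is below the preset threshold."""
    related = {}
    for kw in subject_keywords:
        kw_vec = word_vectors[kw]
        related[kw] = {
            w for w, vec in word_vectors.items()
            if w != kw and cosine_distance(kw_vec, vec) < threshold
        }
    return related
```

The returned sets would then be used, per claim 1, to expand the subject keyword set extracted from the standard answer.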
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011078190.0A CN112214579B (en) | 2020-10-10 | 2020-10-10 | Machine intelligent review method and system for short answer questions |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011078190.0A CN112214579B (en) | 2020-10-10 | 2020-10-10 | Machine intelligent review method and system for short answer questions |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112214579A CN112214579A (en) | 2021-01-12 |
CN112214579B (en) | 2022-08-23 |
Family
ID=74053028
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011078190.0A Active CN112214579B (en) | 2020-10-10 | 2020-10-10 | Machine intelligent review method and system for short answer questions |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112214579B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113254618B (en) * | 2021-06-15 | 2021-11-19 | 明品云(北京)数据科技有限公司 | Data acquisition processing method, system, electronic equipment and medium |
CN114743421B (en) * | 2022-04-27 | 2023-05-16 | 广东亚外国际文化产业有限公司 | Comprehensive assessment system and method for foreign language learning intelligent teaching |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108959261A (en) * | 2018-07-06 | 2018-12-07 | 京工博创(北京)科技有限公司 | Paper subjective item based on natural language sentences topic device and method |
CN110175585A (en) * | 2019-05-30 | 2019-08-27 | 北京林业大学 | It is a kind of letter answer correct system and method automatically |
CN110196893A (en) * | 2019-05-05 | 2019-09-03 | 平安科技(深圳)有限公司 | Non- subjective item method to go over files, device and storage medium based on text similarity |
CN110705278A (en) * | 2018-07-09 | 2020-01-17 | 北大方正集团有限公司 | Subjective question marking method and subjective question marking device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8788260B2 (en) * | 2010-05-11 | 2014-07-22 | Microsoft Corporation | Generating snippets based on content features |
- 2020-10-10: CN application CN202011078190.0A — patent CN112214579B (en), status: Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108959261A (en) * | 2018-07-06 | 2018-12-07 | 京工博创(北京)科技有限公司 | Paper subjective item based on natural language sentences topic device and method |
CN110705278A (en) * | 2018-07-09 | 2020-01-17 | 北大方正集团有限公司 | Subjective question marking method and subjective question marking device |
CN110196893A (en) * | 2019-05-05 | 2019-09-03 | 平安科技(深圳)有限公司 | Non- subjective item method to go over files, device and storage medium based on text similarity |
CN110175585A (en) * | 2019-05-30 | 2019-08-27 | 北京林业大学 | It is a kind of letter answer correct system and method automatically |
Non-Patent Citations (1)
Title |
---|
Research on an automatic scoring model for subjective test questions based on sentence similarity; Chen Xianwu et al.; Journal of Wuhan University (Engineering Edition); 2018-07-31; Vol. 51, No. 7; pp. 654-658 *
Also Published As
Publication number | Publication date |
---|---|
CN112214579A (en) | 2021-01-12 |
Similar Documents
Publication | Title |
---|---|
CN106570708B (en) | Management method and system of intelligent customer service knowledge base |
US20210056571A1 (en) | Determining of summary of user-generated content and recommendation of user-generated content |
CN101599071B (en) | Automatic extraction method of dialog text theme |
CN110795543A (en) | Unstructured data extraction method and device based on deep learning and storage medium |
WO2019165678A1 (en) | Keyword extraction method for mooc |
CN110096567A (en) | Selection method, system are replied in more wheels dialogue based on QA Analysis of Knowledge Bases Reasoning |
CN105975454A (en) | Chinese word segmentation method and device of webpage text |
CN112214579B (en) | Machine intelligent review method and system for short answer questions |
CN106960001A (en) | A kind of entity link method and system of term |
CN111078943A (en) | Video text abstract generation method and device |
CN109949799B (en) | Semantic parsing method and system |
CN109508460B (en) | Unsupervised composition running question detection method and unsupervised composition running question detection system based on topic clustering |
CN108090099B (en) | Text processing method and device |
CN113342958B (en) | Question-answer matching method, text matching model training method and related equipment |
CN112580351B (en) | Machine-generated text detection method based on self-information loss compensation |
CN110781681A (en) | Translation model-based elementary mathematic application problem automatic solving method and system |
CN110852071B (en) | Knowledge point detection method, device, equipment and readable storage medium |
CN111767743B (en) | Machine intelligent evaluation method and system for translation test questions |
CN107562907B (en) | Intelligent lawyer expert case response device |
CN110969005A (en) | Method and device for determining similarity between entity corpora |
CN116993549A (en) | Review resource recommendation method for online learning system |
CN111813919B (en) | MOOC course evaluation method based on syntactic analysis and keyword detection |
CN110223206A (en) | Text major field determines method and system and parsing courseware matching process and system |
Murugan et al. | Affix-based Distractor Generation for Tamil Multiple Choice Questions using Neural Word Embedding |
CN114239587B (en) | Digest generation method and device, electronic equipment and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |