CN117852550A - Method, medium and system for electronic examination paper - Google Patents

Method, medium and system for electronic examination paper

Info

Publication number
CN117852550A
Authority
CN
China
Prior art keywords
topic
question
knowledge point
vector
questions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410035078.0A
Other languages
Chinese (zh)
Inventor
高阳
何淑媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Jinke Education Technology Co ltd
Original Assignee
Yunnan Jinke Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Jinke Education Technology Co ltd
Priority to CN202410035078.0A
Publication of CN117852550A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0895 Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a medium and a system for electronic examination paper assembly, belonging to the technical field of electronic examination papers. The method comprises the following steps: acquiring the knowledge points to be examined and the key index of each knowledge point, and establishing a paper-assembly knowledge point vector; screening from a question bank all test questions that match the knowledge point vector to serve as a first question set; extracting topic semantic features from each topic in the first topic set and clustering them to obtain a plurality of cluster centers and the topics corresponding to each cluster center, so that the topics are divided into a plurality of groups, denoted topic semantic groups; generating an intra-group topic difficulty distribution map and a topic similarity matrix for each topic semantic group; using the paper-assembly knowledge point vector, selecting the topic corresponding to each knowledge point with an optimization algorithm according to the topic difficulty distribution maps and topic similarity matrices of the question bank; and combining the selected topics into a complete test paper, which is finally evaluated and optimized by a self-feedback neural network.

Description

Method, medium and system for electronic examination paper
Technical Field
The invention belongs to the technical field of electronic examination paper, and particularly relates to an electronic examination paper making method, medium and system.
Background
Currently, computer technology is increasingly used in educational examinations, and electronic examinations are gradually replacing traditional paper examinations. In electronic examinations, the construction of a test question bank and the automatic assembly of examination papers are key technologies.
The traditional manual paper assembly mode depends mainly on the experience of teachers, is inefficient, and cannot meet the needs of large-scale examinations. Some existing automatic paper assembly systems based on artificial intelligence can match test questions along the knowledge point dimension, but they rely on knowledge points labeled manually in advance and cannot achieve a deep understanding of semantic content, so the knowledge points covered by the selected test questions are too narrow. In addition, existing systems mainly select questions through manually preset logic rules, so they cannot intelligently evaluate question difficulty and discrimination, and the quality of the assembled test paper cannot be guaranteed.
In summary, existing automatic paper assembly systems have obvious defects in knowledge point labeling, semantic understanding and paper assembly optimization, and cannot effectively balance the breadth of knowledge point coverage against the difficulty of the examination questions. A new automatic paper assembly scheme is therefore needed to achieve intelligent and personalized assembly of examination papers.
Disclosure of Invention
In view of the above, the invention provides an electronic examination paper assembling method, medium and system, which can solve the technical problems that the existing automatic paper assembling system has obvious defects in the aspects of knowledge point marking, semantic understanding, paper assembling optimization and the like, and cannot effectively realize the balance between the coverage breadth of the knowledge points of the examination paper and the examination paper difficulty.
The invention is realized in the following way:
The first aspect of the invention provides an electronic examination paper assembly method, which comprises the following steps:
S10, acquiring the knowledge points to be examined and the key index of each knowledge point, and establishing a paper-assembly knowledge point vector comprising the knowledge points and their key indexes, wherein the key indexes are expressed as positive integers;
S20, screening from a question bank, according to the knowledge point vector, all test questions that match the knowledge point vector to serve as a first question set;
S30, performing semantic analysis on each topic in the first topic set and extracting topic semantic features;
S40, clustering all topics in the first topic set according to the topic semantic features to obtain a plurality of cluster centers and the topics corresponding to each cluster center, so that the topics are divided into a plurality of groups, denoted topic semantic groups;
S50, generating an intra-group topic difficulty distribution map and a topic similarity matrix for each topic semantic group by a self-supervised learning method, to ensure diversity and balance of topic difficulty and knowledge point coverage;
S60, using the paper-assembly knowledge point vector, selecting the topics corresponding to each knowledge point with an optimization algorithm according to the topic difficulty distribution maps and topic similarity matrices of the question bank, so that the test paper covers all knowledge points while maintaining difficulty balance and content diversity;
S70, combining the selected topics into a complete test paper, and performing final evaluation and optimization of the test paper through a self-feedback neural network, completing the electronic examination paper assembly process.
Based on the above technical scheme, the electronic examination paper assembly method can be further improved as follows:
The step of acquiring the knowledge points to be examined and the key indexes of the knowledge points and establishing the paper-assembly knowledge point vector specifically comprises:
first, determining each knowledge point to be examined in the current examination according to the teaching outline and the course targets;
then, determining the key index of each knowledge point according to its importance in the teaching outline and course targets and its examination frequency, wherein the key index is expressed as a positive integer;
finally, building the acquired knowledge points and their key indexes into a knowledge point vector, each element of which contains the name of a knowledge point and its key index.
Further, the step of screening all questions matched with the knowledge point vector from the question bank as the first question set according to the knowledge point vector specifically includes:
firstly, a test question vector space needs to be constructed;
then, the paper-assembly knowledge point vector is matched against each test question vector in the test question vector space, and the similarity is calculated;
and finally, selecting all test questions meeting a certain similarity threshold to form a first question set.
Further, the step of performing semantic analysis on each topic in the first topic set and extracting semantic features of the topics specifically includes: performing topic semantic analysis by using a word vector representation method, performing word segmentation on topic texts of each topic, searching word vectors of each word by using a pre-training word vector model, and performing weighted average on the word vectors to obtain topic word vectors representing topic semantic features.
Further, the step of clustering all the topics in the first topic set according to topic semantic features specifically includes: clustering the first topic set with a K-means clustering algorithm, calculating the semantic similarity between topics, assigning each topic to the class most similar to it, and iteratively updating the class centers to finally obtain a plurality of clusters containing different topics.
Further, the step of generating the intra-group topic difficulty distribution map and the topic similarity matrix for each topic semantic group by using the self-supervision learning method specifically includes:
firstly, training a question difficulty estimator by using a self-supervision learning method;
then, outputting the topic difficulty scores of the topics in each topic semantic group by using a topic difficulty evaluator, and counting the topic numbers of different difficulty intervals to obtain topic difficulty distribution;
finally, a similarity matrix for the topics within the set is calculated.
Further, the step of selecting the topic corresponding to each knowledge point specifically includes: defining a knowledge point set, a question set and a weight coefficient, establishing an objective function containing knowledge point coverage and question difficulty balance constraint, and solving by using an optimization algorithm to obtain an optimal question combination with the maximum knowledge point examination coverage.
Further, the step of combining the selected questions into a complete test paper and performing final evaluation and optimization through a self-feedback neural network specifically comprises: pre-training on assembled papers with an autoencoder, constructing an evaluation model and a generation model, training them jointly to form an adaptive feedback structure, and adjusting the generation model with feedback from the evaluation result to optimize the selected question combination.
A second aspect of the present invention provides a computer readable storage medium, where the computer readable storage medium stores program instructions, where the program instructions are executed to perform an electronic test paper assembling method as described above.
A third aspect of the present invention provides an electronic test paper system, comprising the computer-readable storage medium described above.
Compared with the prior art, the method, medium and system for electronic examination paper assembly provided by the invention have the following beneficial effects: deep semantic analysis of test question text is achieved through natural language processing, so that the knowledge points contained in the test questions are obtained accurately without relying on manual knowledge point labeling; word vectors are used to represent test question semantics and to match knowledge points with test questions accurately, which broadens knowledge point coverage; test questions are grouped by an unsupervised clustering algorithm on semantic features rather than by simple random division, which helps balance the examination of knowledge points. This solves the technical problem that existing automatic paper assembly systems have obvious defects in knowledge point labeling, semantic understanding and paper assembly optimization, and cannot effectively balance the breadth of knowledge point coverage against the difficulty of the examination paper.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method provided by the present invention;
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Referring to FIG. 1, which is a flowchart of the electronic examination paper assembly method provided by the first aspect of the present invention, the method comprises the following steps:
S10, acquiring the knowledge points to be examined and the key index of each knowledge point, and establishing a paper-assembly knowledge point vector comprising the knowledge points and their key indexes, wherein the key indexes are expressed as positive integers;
S20, screening from a question bank, according to the knowledge point vector, all test questions that match the knowledge point vector to serve as a first question set;
S30, performing semantic analysis on each topic in the first topic set and extracting topic semantic features;
S40, clustering all topics in the first topic set according to the topic semantic features to obtain a plurality of cluster centers and the topics corresponding to each cluster center, so that the topics are divided into a plurality of groups, denoted topic semantic groups;
S50, generating an intra-group topic difficulty distribution map and a topic similarity matrix for each topic semantic group by a self-supervised learning method, to ensure diversity and balance of topic difficulty and knowledge point coverage;
S60, using the paper-assembly knowledge point vector, selecting the topics corresponding to each knowledge point with an optimization algorithm according to the topic difficulty distribution maps and topic similarity matrices of the question bank, so that the test paper covers all knowledge points while maintaining difficulty balance and content diversity;
S70, combining the selected topics into a complete test paper, and performing final evaluation and optimization of the test paper through a self-feedback neural network, completing the electronic examination paper assembly process.
The following is a description of specific embodiments of the steps described above:
specific embodiment of step S10:
The purpose of this step is to acquire the knowledge points to be examined and their key indexes and to establish the paper-assembly knowledge point vector, which provides the knowledge point basis for subsequent automatic paper assembly.
First, the teacher assembling the paper determines each knowledge point to be examined in the examination according to the teaching outline and the course targets; a knowledge point can be a teaching chapter, a lesson, a knowledge unit or the like. Then, the teacher determines the key index of each knowledge point according to factors such as its importance in the teaching outline and course targets and its examination frequency. The key index is expressed as a positive integer, and a larger value indicates a more important knowledge point.
Then, the system builds a knowledge point vector from the acquired knowledge points and their key indexes, where each element of the vector contains the name of a knowledge point and its key index. This is the paper-assembly knowledge point vector, which reflects each knowledge point to be examined and its importance weight.
The knowledge point vector can be established with a vector space model, which abstracts words in a text into coordinates in a vector space and is widely used in natural language processing. The specific implementation steps are as follows:
1) Define the knowledge point set as $K = \{k_1, k_2, \ldots, k_n\}$, where $k_i$ represents the $i$-th knowledge point;
2) Define the key index set as $W = \{w_1, w_2, \ldots, w_n\}$, where $w_i$ represents the key index of the $i$-th knowledge point $k_i$;
3) Establish a knowledge point vector set from the knowledge point set $K$ and the key index set $W$: $V = \{v_1, v_2, \ldots, v_n\}$, where $v_i = (k_i, w_i)$ represents a vector containing the knowledge point $k_i$ and its key index $w_i$;
4) Concatenate all vectors in the vector set $V$ in a fixed order to obtain the final paper-assembly knowledge point vector: $V' = (v_1, v_2, \ldots, v_n)$.
With this vector space model, the paper-assembly knowledge point vector can be built from the extracted knowledge points and their key indexes, providing the knowledge point basis for subsequent automatic paper assembly. The vector takes into account both the names of the knowledge points and their importance in the examination: it covers all knowledge points to be examined and reflects the weight of each.
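The construction in steps 1) to 4) can be illustrated with a minimal Python sketch. The function name `build_knowledge_point_vector` and the sample knowledge points are hypothetical and are not taken from the patent; the sketch only assumes that the vector is an ordered list of (knowledge point, key index) pairs with positive integer key indexes.

```python
from typing import List, Tuple

def build_knowledge_point_vector(points: List[str],
                                 key_indexes: List[int]) -> List[Tuple[str, int]]:
    """Pair each knowledge point k_i with its key index w_i, preserving order."""
    if len(points) != len(key_indexes):
        raise ValueError("each knowledge point needs exactly one key index")
    for w in key_indexes:
        # The text requires key indexes to be positive integers.
        if not isinstance(w, int) or w <= 0:
            raise ValueError("key indexes must be positive integers")
    # v_i = (k_i, w_i); the list order plays the role of V' = (v_1, ..., v_n).
    return list(zip(points, key_indexes))

# Hypothetical example data, for illustration only.
V_prime = build_knowledge_point_vector(
    ["fractions", "linear equations", "geometry"], [3, 5, 2]
)
print(V_prime)
```

Keeping the pairs in a list rather than a dictionary preserves the fixed ordering that step 4) asks for.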
Specific embodiment of step S20:
The purpose of this step is to select from the question bank the test questions that match the paper-assembly knowledge point vector, forming a preliminary test question set. This step matches knowledge points to test questions by applying the vector space model.
First, a test question vector space needs to be constructed. The specific steps are as follows:
1) Define the set of all test questions in the question bank as $Q = \{q_1, q_2, \ldots, q_m\}$, where $q_j$ represents the $j$-th test question;
2) Perform semantic analysis on each test question using natural language processing, extract all knowledge points related to the test question, and obtain a knowledge point set $K_{q_j} = \{k_1, k_2, \ldots, k_n\}$;
3) For each test question $q_j$, calculate the importance weight $w_{ij}$ of knowledge point $k_i$ in the test question using the term frequency-inverse document frequency (TF-IDF) algorithm;
4) From the knowledge point set $K_{q_j}$ and the weight set $W_{q_j}$, establish a test question vector for each test question: $V_{q_j} = (v_1, v_2, \ldots, v_n)$, where $v_i = (k_i, w_{ij})$ represents knowledge point $k_i$ and its weight $w_{ij}$ in test question $q_j$;
5) All test question vectors form the test question vector space: $Q' = (V_{q_1}, V_{q_2}, \ldots, V_{q_m})$.
After the test question vector space is built, the paper-assembly knowledge point vector $V'$ can be matched against each test question vector $V_{q_j}$ in the test question vector space $Q'$ and the similarity calculated. The specific steps are as follows:
1) For each test question $q_j$ in the question bank, calculate the cosine similarity of the test question vector $V_{q_j}$ and the paper-assembly knowledge point vector $V'$:
$$\mathrm{sim}(V_{q_j}, V') = \frac{V_{q_j} \cdot V'}{|V_{q_j}|\,|V'|}$$
2) Set a similarity threshold $\theta$ and select all test questions satisfying $\mathrm{sim}(V_{q_j}, V') \ge \theta$ to form the first question set. The larger the value of $\theta$, the more stringent the screening.
With the vector space model, knowledge points can be matched to test questions accurately, and the test questions highly relevant to the knowledge points required by the paper can be screened automatically from the question bank. Compared with manual screening, this is more efficient and accurate and greatly improves paper assembly efficiency.
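The threshold screening can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vectors are represented as dictionaries mapping a knowledge point name to its weight, the per-question weights are taken as already computed (for example by TF-IDF), and the names `cosine`, `screen_questions`, `bank` and `paper_vec` as well as all sample values are hypothetical.

```python
import math
from typing import Dict, List

def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors keyed by knowledge point."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def screen_questions(bank: Dict[str, Dict[str, float]],
                     paper_vec: Dict[str, float],
                     theta: float) -> List[str]:
    """Keep every question whose similarity to the paper vector is >= theta."""
    return [qid for qid, vec in bank.items() if cosine(vec, paper_vec) >= theta]

# Hypothetical data: key indexes for the paper, precomputed weights per question.
paper_vec = {"fractions": 3, "linear equations": 5}
bank = {
    "q1": {"fractions": 0.8, "linear equations": 0.1},
    "q2": {"geometry": 0.9},
    "q3": {"linear equations": 0.7, "fractions": 0.2},
}
first_set = screen_questions(bank, paper_vec, theta=0.5)
print(first_set)
```

In this toy bank, `q2` shares no knowledge point with the paper vector, so its similarity is 0 and it is filtered out at any positive threshold.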
Specific embodiment of step S30:
the purpose of this step is to analyze the first topic set semantically, so as to extract the semantic features contained in each topic, and provide basis for subsequent semantic clustering and difficulty assessment.
The topic semantic analysis can be realized by natural language processing technology, and common methods comprise: word vector representation, semantic role labeling, dependency syntax analysis, and the like.
The word vector representation is used here for topic semantic analysis. This method learns distributed, continuous vector representations of words with a neural network model; the word vectors reflect the semantics of each word and its relations to other words, and can be used to represent word semantic features. The specific implementation steps are as follows:
1) Segment the question text into words using a word segmentation technique and extract all words in the question text;
2) Look up and replace proper nouns in the question text, and look up the word vector corresponding to each word with a pre-trained word vector model (such as Word2Vec or GloVe);
3) Compute the weighted average of the word vectors (weighted by the TF-IDF weight of each word in the text) to obtain the topic word vector representing the topic semantic features;
4) Repeat steps 1 to 3 for each topic to obtain the set of topic word vectors of all topics, namely the topic semantic feature set.
Through the word vector representation, each topic is mapped to a vector of fixed dimension whose content reflects the semantic features of the topic text. This provides the semantic representation for subsequent semantic clustering and difficulty analysis and is one of the keys to automatic paper assembly.
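The TF-IDF weighted average in step 3) can be sketched as follows. This is a toy illustration rather than the patent's implementation: the two-dimensional `embeddings` table stands in for a pre-trained model such as Word2Vec or GloVe, the smoothed IDF formula is one common variant chosen here as an assumption, and all names and data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

def tfidf_weights(doc: List[str], corpus: List[List[str]]) -> Dict[str, float]:
    """TF-IDF weight of each distinct word of `doc` relative to `corpus`."""
    tf = Counter(doc)
    n_docs = len(corpus)
    weights = {}
    for word, count in tf.items():
        df = sum(1 for d in corpus if word in d)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF (assumed variant)
        weights[word] = (count / len(doc)) * idf
    return weights

def question_vector(doc: List[str], corpus: List[List[str]],
                    embeddings: Dict[str, List[float]]) -> List[float]:
    """Weighted average of word vectors: sum(w_i * v_i) / sum(w_i)."""
    weights = tfidf_weights(doc, corpus)
    dim = len(next(iter(embeddings.values())))
    acc, total = [0.0] * dim, 0.0
    for word, w in weights.items():
        vec = embeddings.get(word)
        if vec is None:  # skip words missing from the embedding table
            continue
        total += w
        for i, x in enumerate(vec):
            acc[i] += w * x
    return [a / total for a in acc] if total else acc

# Hypothetical tokenized questions and toy 2-d embeddings.
corpus = [["solve", "equation"], ["solve", "triangle"]]
embeddings = {"solve": [1.0, 0.0], "equation": [0.0, 1.0], "triangle": [1.0, 1.0]}
vec = question_vector(corpus[0], corpus, embeddings)
print(vec)
```

Because "solve" appears in both toy questions while "equation" appears in only one, the rarer word receives the larger weight and pulls the question vector toward its embedding.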
Specific embodiment of step S40:
the method aims at carrying out semantic clustering on the first question set, dividing the questions into different classes according to semantic features, and providing support for realizing the equalization of knowledge point coverage subsequently.
The clustering algorithm is one of the unsupervised machine learning methods, grouping sample data by their similarity. The first topic set is clustered by adopting a K-means clustering algorithm according to the semantic features of the topics. The method comprises the following specific steps:
1) Randomly selecting k topics as initial clustering centers;
2) Calculating the semantic similarity between each topic and the clustering center, and distributing each topic into the class most similar to the topic;
3) Recalculating the class center for each class, and repeating the step 2 until the class center is not changed;
4) Finally, k classes and a question set in each class are obtained.
The semantic similarity between topics is calculated as the cosine similarity of their word vectors, as follows:
Let the topic word vectors of topic A and topic B be $V_A$ and $V_B$;
the semantic similarity of A and B is:
$$\mathrm{sim}(A, B) = \frac{V_A \cdot V_B}{|V_A|\,|V_B|}$$
Through the K-means clustering algorithm, topic subsets that are well separated at the semantic level can be obtained, which is more conducive to balanced knowledge point coverage than random division. The choice of the number of clusters k affects the result; it is typically set to 2-3 times the number of knowledge points.
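The clustering loop described above can be sketched in plain Python. This is a minimal illustration, assuming that each topic is assigned to the center with the highest cosine similarity and that centers are updated as the arithmetic mean of their members; the function names, the seed parameter and the sample vectors are hypothetical.

```python
import math
import random
from typing import List, Tuple

def cos_sim(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def kmeans_cosine(vectors: List[List[float]], k: int, seed: int = 0,
                  max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """K-means that assigns each vector to its most cosine-similar center."""
    rng = random.Random(seed)
    # Step 1: pick k topics at random as the initial cluster centers.
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every topic to the most similar center.
        new_labels = [max(range(k), key=lambda c: cos_sim(v, centers[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable, so centers are stable
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical topic word vectors forming two obvious semantic groups.
vectors = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]]
labels, centers = kmeans_cosine(vectors, k=2)
print(labels)
```

The loop stops when assignments no longer change, which is equivalent to the class centers no longer changing in step 3).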
Specific embodiment of step S50:
the method aims at generating a difficulty distribution and similarity matrix in each topic semantic group, providing references of topic difficulty levels for subsequent group volumes, and ensuring balanced selection of different types of topics.
First, a topic difficulty model is built and a difficulty evaluator is trained. Common methods are as follows:
1) Training a difficulty regression model by supervised learning on semantic features such as word vectors;
2) Using transfer learning: Domain-Net is trained to distinguish simple questions from difficult ones, and the difficulty evaluator is then obtained by fine-tuning.
Secondly, inputting the semantic features of the topics in each topic semantic group, and outputting the topic difficulty scores by using a trained difficulty evaluator; and counting the number of the topics in each subdivision difficulty interval to obtain the topic difficulty distribution.
And thirdly, calculating a similarity matrix of the topic word vectors aiming at the topics in the same topic semantic group to represent semantic similarity relations among the topics.
The similarity matrix is calculated as follows:
Let the topic semantic group contain n topics with topic word vectors $V_1, V_2, \ldots, V_n$;
the similarity matrix is:
$$A = \begin{bmatrix} \mathrm{sim}(V_1, V_1) & \cdots & \mathrm{sim}(V_1, V_n) \\ \vdots & \ddots & \vdots \\ \mathrm{sim}(V_n, V_1) & \cdots & \mathrm{sim}(V_n, V_n) \end{bmatrix}$$
With the topic difficulty distribution and the similarity matrix, topics with different semantic expressions can be selected while difficulty balance is maintained, so that knowledge points are examined in diverse ways.
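The two per-group outputs can be sketched as follows. The difficulty scores are assumed to come from an already trained difficulty evaluator and to lie in [0, 1]; both of these, along with the equal-width binning, the function names and the sample data, are assumptions made for illustration.

```python
import math
from typing import List

def difficulty_distribution(scores: List[float], n_bins: int = 5) -> List[int]:
    """Count topics per equal-width difficulty interval; scores assumed in [0, 1]."""
    counts = [0] * n_bins
    for s in scores:
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        counts[idx] += 1
    return counts

def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """A[i][j] = cosine similarity of topic word vectors V_i and V_j."""
    def cos(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    return [[cos(a, b) for b in vectors] for a in vectors]

# Hypothetical evaluator outputs and topic word vectors for one semantic group.
scores = [0.1, 0.15, 0.5, 0.95, 1.0]
dist = difficulty_distribution(scores, n_bins=5)
A = similarity_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(dist)
print(A)
```

The matrix is symmetric with ones on the diagonal, so in practice only the upper triangle needs to be stored.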
Specific embodiment of step S60:
The purpose of this step is to use an optimization algorithm to select the optimal topics corresponding to the knowledge points while satisfying the difficulty distribution constraints, forming a test paper with balanced knowledge points.
1) The following variables are defined:
$T$: topic set, where $t_i$ represents each topic
$K$: knowledge point set, where $k_j$ represents each knowledge point
$W$: knowledge point weight coefficient set
$D$: topic difficulty set, where $d_i$ represents the difficulty of each topic
$Q$: topic semantic similarity matrix
2) The objective function and its constraints are defined:
$$\max \sum_j W_j \, l_{i k_j}$$
$$\sum_{i \in T} (d_i - d_j)^2 \le \varepsilon$$ (difficulty variance bounded by threshold $\varepsilon$)
$$\sum_{i,j \in T} (Q_{i,j} - \mu)^2 \le \delta$$ (similarity variance bounded by threshold $\delta$)
where $l_{i k_j}$ is a logical value indicating whether topic $i$ corresponds to knowledge point $k_j$.
3) The objective function is solved with an iterative optimization algorithm (such as a genetic algorithm or simulated annealing) to obtain the optimal topic combination.
The optimization problem thus consists of a knowledge point coverage objective, a topic difficulty balance constraint and a topic similarity constraint. The optimization algorithm selects the topic combination with the largest knowledge point coverage subject to the difficulty and similarity limits.
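The structure of this constrained selection can be illustrated on a toy instance. The sketch below deliberately swaps in exhaustive enumeration for the genetic algorithm or simulated annealing named in the text, which is only practical for very small question sets. It also interprets the two constraints as sums of squared deviations from the mean difficulty and from the mean pairwise similarity of the selected topics; that interpretation, the fixed paper size, and all names and data are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Optional, Sequence, Tuple

def coverage(sel: Sequence[int], kp_of: Dict[int, str],
             weights: Dict[str, int]) -> int:
    """Sum of the weights W_j of the knowledge points covered by the selection."""
    return sum(weights[k] for k in {kp_of[q] for q in sel})

def difficulty_spread(sel: Sequence[int], difficulty: Dict[int, float]) -> float:
    """Sum of squared deviations from the mean difficulty (assumed reading)."""
    mean = sum(difficulty[q] for q in sel) / len(sel)
    return sum((difficulty[q] - mean) ** 2 for q in sel)

def similarity_spread(sel: Sequence[int], Q: List[List[float]]) -> float:
    """Sum of squared deviations of pairwise similarities from their mean mu."""
    pairs = [Q[i][j] for i, j in combinations(sel, 2)]
    if not pairs:
        return 0.0
    mu = sum(pairs) / len(pairs)
    return sum((s - mu) ** 2 for s in pairs)

def select_questions(n: int, size: int, kp_of, weights, difficulty, Q,
                     eps: float, delta: float) -> Tuple[Optional[tuple], int]:
    """Brute-force stand-in for GA / simulated annealing on a tiny instance."""
    best, best_score = None, -1
    for combo in combinations(range(n), size):
        if difficulty_spread(combo, difficulty) > eps:
            continue  # violates the difficulty balance constraint
        if similarity_spread(combo, Q) > delta:
            continue  # violates the similarity constraint
        score = coverage(combo, kp_of, weights)
        if score > best_score:
            best, best_score = combo, score
    return best, best_score

# Hypothetical instance: 4 topics, 3 knowledge points, papers of 3 topics.
kp_of = {0: "a", 1: "a", 2: "b", 3: "c"}
weights = {"a": 3, "b": 2, "c": 1}
difficulty = {0: 0.4, 1: 0.5, 2: 0.6, 3: 0.9}
Q = [[1.0, 0.9, 0.2, 0.1],
     [0.9, 1.0, 0.3, 0.2],
     [0.2, 0.3, 1.0, 0.4],
     [0.1, 0.2, 0.4, 1.0]]
best, score = select_questions(4, 3, kp_of, weights, difficulty, Q,
                               eps=0.1, delta=0.05)
print(best, score)
```

For realistic question banks the number of combinations grows too fast for enumeration, which is why a heuristic search such as a genetic algorithm or simulated annealing is used instead; the feasibility checks and the coverage score would carry over unchanged as its fitness evaluation.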
Specific embodiment of step S70:
The purpose of this step is to evaluate and optimize the paper assembly result with an adaptive feedback network and to output the final examination paper.
1) Pre-train on assembled papers with an autoencoder or similar to obtain a feature extractor for test paper representations.
2) Construct an evaluation model: using a recurrent neural network such as an LSTM, input the test question features time step by time step and output difficulty and discrimination scores.
3) Construct a generation model: using an encoder-decoder structure such as Seq2Seq, take the knowledge point vector as input and generate the corresponding test paper.
4) Jointly train the evaluation model and the generation model to form the adaptive feedback structure. The evaluation result is fed back to adjust the generation model so that it outputs a better test paper.
A specific embodiment of the present invention is provided below. The symbols used in its formulas are explained as follows:
$K$: set of knowledge points to be examined
$k_i$: the $i$-th knowledge point
$W$: set of key indexes of the knowledge points
$w_i$: key index of the $i$-th knowledge point
$V_K$: knowledge point vector comprising the knowledge points and their key indexes
$Q$: set of all test questions in the question bank
$q_j$: the $j$-th test question
$w_{ij}$: weight of knowledge point $k_i$ in test question $q_j$
[…]: question vector of test question $q_j$
[…]: similarity between knowledge point vector $V_K$ and the test question vector
$\theta$: similarity threshold
[…]: word vector representation of test question $q_j$
$v_i$: word vector of the $i$-th word in the question text
[…]: semantic feature vector of test question $q_j$
$w_i$: TF-IDF weight of the $i$-th word in the test question
$k$: number of classes into which the topics are divided
[…]: class to which topic $q_j$ belongs in iteration $t$
[…]: class center of class $i$ in iteration $t$
$f(\cdot)$: topic difficulty assessment function
$y_q$: difficulty of topic $q$
[…]: similarity of topics $q_i$ and $q_j$
$x_j$: binary variable representing whether topic $j$ is selected
$w_j$: weight of knowledge point $j$
$d_i$: difficulty value of topic $i$
$d_{avg}$: average difficulty
$\varepsilon, \delta$: variance thresholds in the constraints
$x^*$: optimal topic selection scheme
$E$: test paper representation extractor
$R$: evaluation model
$G$: generation model
$L$: loss function
$\lambda$: hyperparameter
$\theta_E, \theta_R, \theta_G$: parameters of the corresponding models
The steps of the specific embodiment are as follows:
Step S10
Let $K=\{k_1,k_2,\ldots,k_n\}$ denote the n knowledge points to be examined and $W=\{w_1,w_2,\ldots,w_n\}$ the emphasis index of each knowledge point. A knowledge point vector $V_K=\{(k_1,w_1),(k_2,w_2),\ldots,(k_n,w_n)\}$ can then be established, whose i-th element contains knowledge point $k_i$ and its emphasis index $w_i$.
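As an illustration of step S10, the pairing of knowledge points with positive-integer emphasis indices might be sketched as follows. The function name `build_knowledge_point_vector` and the sample values are assumptions made for this example, not part of the described method.

```python
from typing import List, Tuple


def build_knowledge_point_vector(
    points: List[str], indices: List[int]
) -> List[Tuple[str, int]]:
    """Pair each knowledge point k_i with its emphasis index w_i.

    Hypothetical helper: the name and the validation rules are assumptions.
    """
    if len(points) != len(indices):
        raise ValueError("each knowledge point needs exactly one emphasis index")
    for w in indices:
        # The emphasis index is stated to be a positive integer.
        if not isinstance(w, int) or w <= 0:
            raise ValueError("emphasis indices must be positive integers")
    return list(zip(points, indices))


# Illustrative values only.
V_K = build_knowledge_point_vector(
    ["Newton's three laws of motion", "work and energy", "circular motion"],
    [10, 8, 7],
)
```

Keeping the vector as a list of (name, index) pairs mirrors the element structure described for $V_K$.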
Step S20
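The question bank comprises m questions $Q=\{q_1,q_2,\ldots,q_m\}$. For each test question $q_j$, the weight $w_{ij}$ of each knowledge point $k_i$ it contains is calculated by the term frequency-inverse document frequency (TF-IDF) algorithm, and the question vector $V_{q_j}=(w_{1j},w_{2j},\ldots,w_{nj})$ is built.
The similarity of knowledge point vector $V_K$ and question vector $V_{q_j}$ can be calculated using cosine similarity, taking the emphasis indices $w_i$ as the components of $V_K$:
$$\mathrm{sim}(V_K,V_{q_j})=\frac{\sum_{i=1}^{n} w_i\,w_{ij}}{\sqrt{\sum_{i=1}^{n} w_i^2}\,\sqrt{\sum_{i=1}^{n} w_{ij}^2}}$$
Test questions with similarity greater than the threshold $\theta$ are selected to form the first question set.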
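The screening in step S20 can be sketched as below. This is a minimal sketch that assumes the knowledge point vector is represented numerically by its emphasis indices and that each question vector holds per-knowledge-point weights in the same order. The names `cosine_similarity`, `screen_questions` and the sample data are illustrative assumptions.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_questions(
    emphasis: List[float], question_vectors: Dict[str, List[float]], theta: float
) -> List[str]:
    """Return the ids of questions whose similarity to the emphasis vector exceeds theta."""
    return [
        qid
        for qid, vec in question_vectors.items()
        if cosine_similarity(emphasis, vec) > theta
    ]


# Illustrative data: three knowledge points, three questions.
emphasis = [10.0, 8.0, 7.0]
question_vectors = {
    "q1": [0.8, 0.1, 0.1],
    "q2": [0.0, 0.0, 1.0],
    "q3": [0.5, 0.4, 0.3],
}
first_set = screen_questions(emphasis, question_vectors, theta=0.75)
```

With these sample values, `q2` falls below the threshold because it touches only the lowest-weighted knowledge point.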
Step S30
For each question $q_j$, its word vector representation $X_j=(v_1,v_2,\ldots,v_l)$ is obtained using word vector techniques, where $v_i$ is the word vector of the i-th word in the question text and $l$ is the number of words. The semantic feature vector $s_j$ of question $q_j$ is then the weighted average of all its word vectors:
$$s_j=\frac{\sum_{i=1}^{l} w_i\,v_i}{\sum_{i=1}^{l} w_i}$$
where $w_i$ is the TF-IDF weight of the i-th word in question $q_j$. Finally, the semantic feature vectors of all questions form the semantic feature set $S=\{s_j\}$.
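The weighted average in step S30 might look like the following sketch. It assumes that word vectors and TF-IDF weights are already available as plain dictionaries; in practice they would come from a pre-trained word vector model and a TF-IDF computation. The function name `semantic_vector`, the normalization by the weight sum, and the toy two-dimensional vectors are assumptions for illustration.

```python
from typing import Dict, List


def semantic_vector(
    tokens: List[str],
    word_vectors: Dict[str, List[float]],
    tfidf: Dict[str, float],
) -> List[float]:
    """TF-IDF-weighted average of the word vectors of the given tokens.

    Tokens without a word vector are skipped; a zero vector is returned
    when no token carries any weight.
    """
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in word_vectors:
            continue
        w = tfidf.get(tok, 0.0)
        weight_sum += w
        for d in range(dim):
            total[d] += w * word_vectors[tok][d]
    if weight_sum == 0:
        return total
    return [x / weight_sum for x in total]


# Toy example with 2-dimensional word vectors.
word_vectors = {"force": [1.0, 0.0], "mass": [0.0, 1.0]}
tfidf = {"force": 3.0, "mass": 1.0}
s = semantic_vector(["force", "mass", "the"], word_vectors, tfidf)
```

Here "force" carries three times the weight of "mass", so the result lies three quarters of the way toward the "force" vector.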
Step S40
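The K-means clustering algorithm is adopted. Assuming the questions are divided into k classes, in the t-th iteration the class $c_j^{(t)}$ to which question $q_j$ belongs is calculated as follows:
$$c_j^{(t)}=\arg\min_{i\in\{1,\ldots,k\}}\left\lVert s_j-\mu_i^{(t)}\right\rVert^2$$
where $\mu_i^{(t)}$ is the class center of the i-th class in the t-th iteration and $s_j$ is the semantic feature vector of $q_j$. The class center $\mu_i$ of each class is then updated to the mean of the semantic feature vectors assigned to it:
$$\mu_i^{(t+1)}=\frac{1}{\lvert\{j:\,c_j^{(t)}=i\}\rvert}\sum_{j:\,c_j^{(t)}=i}s_j$$
The iteration is repeated until the class centers stabilize.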
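The assignment and update steps of step S40 can be sketched in plain Python as follows. This sketch assumes squared Euclidean distance and, to stay deterministic, initializes the centers with the first k points; a production system would more likely use a library implementation with a better initialization. The names `k_means` and `squared_distance` are illustrative.

```python
from typing import List, Tuple


def squared_distance(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(
    points: List[List[float]], k: int, max_iter: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Plain K-means: assign each point to its nearest center, then recompute centers."""
    # Assumption: the first k points serve as initial centers.
    centers = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centers are stable
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for i in range(k):
            members = [p for p, c in zip(points, labels) if c == i]
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


points = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]
labels, centers = k_means(points, k=2)
```

Stopping when the assignments no longer change is equivalent to the class centers having stabilized, since the centers depend only on the assignments.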
Step S50
First, a question difficulty evaluation function $f(\cdot)$ is constructed such that $y_q=f(s_q)$, where $s_q$ is the semantic feature vector of question q and $y_q$ is the predicted difficulty. Then $y_q=f(s_q)$ is calculated for every question to obtain the question difficulty distribution.
At the same time, for every two questions $q_i$, $q_j$ in the same question semantic group, the cosine similarity of the two semantic vectors, $\mathrm{sim}(q_i,q_j)=\frac{s_i\cdot s_j}{\lVert s_i\rVert\,\lVert s_j\rVert}$, is calculated and a similarity matrix is constructed.
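The two outputs of step S50 can be illustrated with the sketch below. It does not train a difficulty evaluator. Instead it assumes that predicted difficulty scores in the range 0 to 1 are already available, and bins them into five levels. The function names, the score range and the number of levels are assumptions for this example.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarity of the semantic vectors within one group."""
    return [[cosine(a, b) for b in vectors] for a in vectors]


def difficulty_distribution(scores: List[float], levels: int = 5) -> Dict[int, int]:
    """Count questions per difficulty level 1..levels (scores assumed in [0, 1])."""
    counts = Counter(min(int(y * levels) + 1, levels) for y in scores)
    return {level: counts.get(level, 0) for level in range(1, levels + 1)}


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
S = similarity_matrix(vectors)
dist = difficulty_distribution([0.1, 0.5, 0.55, 0.95])
```

The histogram plays the role of the difficulty distribution, and `S[i][j]` gives the similarity between the i-th and j-th questions of the group.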
Step S60
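The objective function maximizes the knowledge point coverage of the selected questions, expressed through the binary selection variables $x_j$ and the weights $w_j$, subject to the difficulty balance constraint over the selected questions $i\in T$:
$$\text{s.t.}\quad \sum_{i\in T}(d_i-d_{avg})^2\le\epsilon$$
The optimal question selection scheme $x^*$ is obtained by solving the mixed integer programming problem.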
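For a very small question pool, the selection in step S60 can be illustrated by exhaustive search rather than a mixed integer programming solver. The sketch below is built on assumptions: coverage is scored as the sum of emphasis indices of the knowledge points covered, $d_{avg}$ is taken as the mean difficulty of the selected questions, and `delta` is used as an upper bound on pairwise similarity. None of these choices is confirmed by the text, and all names are illustrative.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple


def select_questions(
    questions: List[dict],
    emphasis: Dict[str, int],
    similarity: List[List[float]],
    size: int,
    eps: float,
    delta: float,
) -> Tuple[Optional[Tuple[int, ...]], int]:
    """Pick `size` questions maximizing weighted coverage under both constraints."""
    best, best_cov = None, -1
    for combo in combinations(range(len(questions)), size):
        d = [questions[i]["difficulty"] for i in combo]
        d_avg = sum(d) / len(d)
        # Difficulty balance: sum of squared deviations must not exceed eps.
        if sum((x - d_avg) ** 2 for x in d) > eps:
            continue
        # Diversity (assumed form): no selected pair may be more similar than delta.
        if any(similarity[i][j] > delta for i, j in combinations(combo, 2)):
            continue
        covered = set().union(*(questions[i]["points"] for i in combo))
        cov = sum(emphasis[p] for p in covered)
        if cov > best_cov:
            best, best_cov = combo, cov
    return best, best_cov


emphasis = {"newton": 10, "work": 8, "circular": 7}
questions = [
    {"points": {"newton"}, "difficulty": 3},
    {"points": {"newton"}, "difficulty": 3},  # near-duplicate of question 0
    {"points": {"work"}, "difficulty": 3},
    {"points": {"circular"}, "difficulty": 5},
]
similarity = [
    [1.0, 0.95, 0.1, 0.1],
    [0.95, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
best, coverage = select_questions(questions, emphasis, similarity, size=2, eps=0.5, delta=0.8)
```

Exhaustive search grows combinatorially with the pool size, which is why a solver-based formulation is the practical choice for pools of hundreds of questions.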
Step S70
A test paper representation extractor E is obtained using an autoencoder. An evaluation LSTM R is constructed, which takes the question features as input one time step at a time and outputs scores. A generation LSTM G is constructed, which takes the knowledge points as input and generates a test paper. The models are trained jointly under a loss function L with hyperparameter $\lambda$. The parameters $\theta_G$ of G are adjusted adaptively so that a better test paper is generated.
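The control flow of the feedback structure in step S70 (evaluate, adjust, regenerate) can be illustrated without neural networks. In the sketch below both models are replaced by simple stand-ins: `evaluate` scores a paper by how close its mean difficulty is to a target, and `regenerate` tries single-question swaps. These stand-ins, the scoring rule and all names are assumptions that only show the loop, not the LSTM-based models described.

```python
from typing import Dict, List


def evaluate(paper: List[str], difficulty: Dict[str, float], target: float) -> float:
    """Stand-in for evaluation model R: 1.0 means the mean difficulty hits the target."""
    mean = sum(difficulty[q] for q in paper) / len(paper)
    return 1.0 / (1.0 + abs(mean - target))


def regenerate(
    paper: List[str], pool: List[str], difficulty: Dict[str, float], target: float
) -> List[str]:
    """Stand-in for generation model G: best paper reachable by swapping one question."""
    best, best_score = paper, evaluate(paper, difficulty, target)
    for i in range(len(paper)):
        for q in pool:
            if q in paper:
                continue
            candidate = paper[:i] + [q] + paper[i + 1:]
            score = evaluate(candidate, difficulty, target)
            if score > best_score:
                best, best_score = candidate, score
    return best


def feedback_loop(paper, pool, difficulty, target, threshold, max_rounds=10):
    """Feed the evaluation back into generation until the score reaches the threshold."""
    for _ in range(max_rounds):
        if evaluate(paper, difficulty, target) >= threshold:
            break
        improved = regenerate(paper, pool, difficulty, target)
        if improved == paper:
            break  # no single swap improves the score
        paper = improved
    return paper


difficulty = {"a": 1.0, "b": 1.0, "c": 3.0, "d": 5.0}
final = feedback_loop(["a", "b"], ["a", "b", "c", "d"], difficulty, target=3.0, threshold=0.99)
```

In the described system the adjustment would act on the parameters $\theta_G$ of the generation model rather than on the paper directly.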
A second aspect of the present invention provides a computer-readable storage medium storing program instructions which, when executed, perform the electronic test paper assembly method described above.
A third aspect of the present invention provides an electronic test paper assembly system comprising the computer-readable storage medium described above.
The following describes the practical use of the invention in specific paper assembly scenarios:
Scenario case one
A physics teacher at a high school plans to use the electronic test paper assembly system to automatically generate the students' physics final examination paper.
First, the teacher determines from the first-year physics syllabus the knowledge points that need to be examined closely in this examination, 12 in total, including Newton's three laws of motion, work and energy, and circular motion.
Then, drawing on experience, the teacher assigns each knowledge point an emphasis index of 1 to 10 points to indicate how important it is in the examination. For example, 'Newton's three laws of motion' receives 10 points and 'circular motion' receives 7 points.
The system formats the 12 knowledge points and their emphasis indices into a knowledge point vector, [('Newton's three laws of motion', 10), ('work and energy', 8), ('circular motion', 7), ...], which is stored as the basis for paper assembly.
For the question bank, the school's final examination questions from the past 5 years are stored, 1000 questions in total. The system analyzes the knowledge points examined by each question through natural language processing.
Take as an example the question 'A trolley moves in uniformly accelerated linear motion, starting from rest with an acceleration of 2 m/s²; find the speed v and the displacement s of the trolley after accelerating for 4 s'. The system judges that it mainly examines 'Newton's three laws of motion'.
The TF-IDF algorithm is used to determine the knowledge point weights of each question, and a question vector is constructed and stored.
Then, the system calculates the cosine similarity between each of the 1000 questions in the question bank and the paper assembly knowledge point vector, and screens out the 240 questions with similarity greater than 0.75 to form a first question set.
Word2Vec is used to extract word vectors for the questions in the first question set; these vectors represent the semantic information of the questions.
The 240 questions are divided into 15 classes using the K-means clustering algorithm, each class containing semantically similar questions.
For each class of questions, a transfer learning model is used to judge question difficulty, and the number of questions at each difficulty level is counted to obtain the question difficulty distribution. A semantic similarity matrix among the questions is calculated at the same time.
Taking Newton's laws of motion as an example, the system selects 3 questions with low mutual similarity within the limits of the difficulty distribution, and repeats the process until all knowledge points are covered.
Finally, the system selects 50 questions to generate a complete test paper. The evaluation model, based on the Seq2Seq structure, scores the quality of the test paper; the generation model modifies the knowledge point distribution and regenerates the paper; and the optimal test paper is output after multiple rounds of iteration.
This embodiment shows that the electronic test paper assembly system can automatically analyze question semantics, achieve broad knowledge point coverage and balanced difficulty, make paper assembly more intelligent and adaptive, and be widely applied to electronic test paper assembly in various subjects.
Scenario case two
A history teacher at a high school plans to use the electronic test paper assembly system of the invention to automatically generate the first-year history final examination paper.
First, the teacher determines from the first-year history syllabus the knowledge points that need to be examined closely, 15 in total, including Confucius and his doctrines, the politics of the Qin dynasty, and Genghis Khan's unification of Mongolia.
Then, drawing on experience, the teacher assigns each knowledge point an emphasis index of 1 to 10 points to indicate its importance in the examination. For example, 'Confucius and his doctrines' receives 10 points and 'politics of the Qin dynasty' receives 8 points.
The system formats the 15 knowledge points and their emphasis indices into a knowledge point vector and stores it: [('Confucius and his doctrines', 10), ('politics of the Qin dynasty', 8), ('Genghis Khan's unification of Mongolia', 7), ...].
For the question bank, the school's final examination questions from the past 3 years are entered, 1000 history questions in total. The system analyzes the knowledge points of each question using natural language processing.
Taking as an example a question asking what the idea of 'crafted' reflects, the system judges that it mainly examines the 'Confucius and his doctrines' knowledge point.
The TF-IDF method is used to determine the knowledge point weights of each question, and a question vector is constructed.
Then, the system calculates the similarity between each of the 1000 questions and the paper assembly knowledge point vector, and screens out the 200 questions with similarity exceeding 0.8 to form a first question set.
Word2Vec is used to extract word vectors for the questions of the first question set to represent question semantics.
The 200 questions are divided into 10 classes using the K-means clustering algorithm, each class containing semantically similar questions.
For each class of questions, a transfer learning model is used to predict question difficulty, and the number of questions at each difficulty level is counted to obtain the question difficulty distribution. A semantic similarity matrix among the questions is calculated at the same time.
Taking the 'Confucius and his doctrines' knowledge point as an example, the system selects 3 questions with low mutual similarity within the limits of the difficulty distribution, and continues until the coverage requirement of every knowledge point is met.
Finally, the system selects 50 test questions to generate a complete test paper. The evaluation model, based on Seq2Seq, scores the quality of the test paper; the generation model modifies the knowledge point distribution and regenerates the paper; and the optimal test paper is obtained through iteration.
This embodiment shows that the invention can automatically analyze the semantics of history test questions, achieve broad knowledge point coverage and difficulty control, make paper assembly for history more intelligent and personalized, and be widely applied to electronic test paper assembly in every subject.
Scenario case three
A middle school plans to use the electronic test paper assembly system to assemble the first-year mathematics final examination paper. First, the teacher determines from the syllabus the knowledge points that need to be examined closely in this examination, selecting 10 knowledge points such as the quadratic equation in one variable, function graphs and linear equations.
Then, drawing on experience, the teacher assigns each knowledge point an integer between 1 and 10 as its emphasis index, representing how important the knowledge point is in the examination. The emphasis index of 'quadratic equation in one variable' is 10, that of 'function graph' is 8, and that of 'linear equation' is 7.
The system formats the 10 knowledge points and the corresponding emphasis indices into a knowledge point vector, [('quadratic equation in one variable', 10), ('function graph', 8), ('linear equation', 7), ...], which is stored as the knowledge point basis for the subsequent paper assembly.
For the question bank, the school's final examination questions from past years are stored, 500 questions in total. The system uses the word vector model Word2Vec to perform semantic analysis on each question, extracts keywords and judges the knowledge points the question involves. For example, for a question described as 'Given that the two roots of the quadratic equation in one variable $x^2+5x+6=0$ differ by 5, solve the equation', the system determines that the question mainly involves the 'quadratic equation in one variable' knowledge point.
The TF-IDF algorithm is used to calculate the knowledge point weights of each question and to construct a question vector, for example [('quadratic equation in one variable', 0.8), ('function graph', 0.1), ...], which is stored in the test question vector space.
Then, the system calculates the cosine similarity between each of the 500 stored question vectors and the paper assembly knowledge point vector, and selects the questions with similarity greater than 0.7 to form a first question set of 180 questions in total.
The Word2Vec word vector model is used to extract word vectors from all questions of the first question set; after average pooling, a vector of dimension 100 representing the question semantics is obtained for each question.
The semantic vectors of the 180 questions are clustered with the K-means algorithm with the number of classes set to k=20; after iteration the questions are divided into 20 classes, each containing semantically similar questions.
For the questions in each class, the system grades difficulty using a difficulty evaluator obtained by transfer learning, for example with levels 1 to 5, and counts the number of questions at each level to obtain the question difficulty distribution. The semantic similarity matrix of the questions within the class is calculated at the same time.
Taking the 'quadratic equation in one variable' knowledge point as an example, the system uses the optimization algorithm to select 3 related questions with low mutual similarity within the range allowed by the difficulty distribution, and repeats the process until the coverage requirement of all knowledge points is met.
Finally, the system combines the 50 selected questions into a complete test paper. The evaluation model, based on the Seq2Seq structure, scores the quality of the test paper; the generation model modifies the knowledge point distribution and regenerates the test paper; and the process iterates until the score reaches a threshold.
The system thus automatically generates a test paper that represents the knowledge points well and has moderate difficulty. It avoids the blindness of manual paper assembly, improves the accuracy of test question screening, optimizes paper assembly quality, and achieves intelligent paper assembly.
The foregoing is merely illustrative of the present invention, and the scope of the present invention is not limited thereto; any variation or substitution readily conceivable by a person skilled in the art falls within the scope of the present invention.

Claims (10)

1. A method for assembling an electronic examination paper, characterized by comprising the following steps:
S10, acquiring the knowledge points to be examined and the emphasis index of each knowledge point, and establishing a paper assembly knowledge point vector comprising the knowledge points and their emphasis indices, wherein the emphasis indices are expressed as positive integers;
S20, screening from a question bank, according to the knowledge point vector, all test questions that match the knowledge point vector, as a first question set;
S30, performing semantic analysis on each question in the first question set and extracting question semantic features;
S40, clustering all questions in the first question set according to the question semantic features to obtain a plurality of cluster centers and the questions corresponding to each cluster center, thereby dividing the questions into a plurality of groups, denoted question semantic groups;
S50, generating, for each question semantic group, an intra-group question difficulty distribution map and a question similarity matrix using a self-supervised learning method, to ensure diversity and balance of question difficulty and knowledge point coverage;
S60, selecting from the question bank the questions corresponding to each knowledge point, using the paper assembly knowledge point vector and an optimization algorithm and according to the question difficulty distribution map and the question similarity matrix, so that the test paper covers all knowledge points while maintaining balanced difficulty and diverse content;
S70, combining the selected questions into a complete test paper, and performing final evaluation and optimization of the test paper through a self-feedback neural network, to complete the electronic test paper assembly process.
2. The method for assembling an electronic examination paper according to claim 1, wherein the step of acquiring the knowledge points to be examined and the emphasis index of each knowledge point and establishing a paper assembly knowledge point vector comprises:
first, determining each knowledge point to be examined in the current examination according to the syllabus and the course objectives;
then, determining the emphasis index of each knowledge point according to its importance in the syllabus and course objectives and the frequency with which it is examined, wherein the emphasis index is expressed as a positive integer;
then, building the acquired knowledge points and their emphasis indices into a knowledge point vector, each element of which contains the name of a knowledge point and its emphasis index.
3. The method for assembling an electronic examination paper according to claim 2, wherein the step of screening from a question bank, according to the knowledge point vector, all test questions that match the knowledge point vector as a first question set comprises:
first, constructing a test question vector space;
then, matching the paper assembly knowledge point vector against each question vector in the test question vector space and calculating the similarity;
finally, selecting all test questions that meet a given similarity threshold to form the first question set.
4. The method for assembling an electronic examination paper according to claim 3, wherein the step of performing semantic analysis on each question in the first question set and extracting question semantic features comprises: performing question semantic analysis using a word vector representation method, segmenting the text of each question into words, looking up the word vector of each word using a pre-trained word vector model, and taking the weighted average of the word vectors to obtain a question vector representing the question semantic features.
5. The method for assembling an electronic examination paper according to claim 4, wherein the step of clustering all questions in the first question set according to the question semantic features comprises: clustering the first question set using the K-means clustering algorithm, calculating the semantic similarity among the questions, assigning each question to the class most similar to it, and iteratively updating the class centers, finally obtaining a plurality of clusters containing different questions.
6. The method for assembling an electronic examination paper according to claim 5, wherein the step of generating, for each question semantic group, an intra-group question difficulty distribution map and a question similarity matrix using a self-supervised learning method comprises:
first, training a question difficulty evaluator using a self-supervised learning method;
then, outputting a difficulty score for each question in each question semantic group using the question difficulty evaluator, and counting the number of questions in each difficulty interval to obtain the question difficulty distribution;
finally, calculating a similarity matrix for the questions within the group.
7. The method for assembling an electronic examination paper according to claim 6, wherein the step of selecting the questions corresponding to each knowledge point comprises: defining a knowledge point set, a question set and weight coefficients, establishing an objective function containing knowledge point coverage and a question difficulty balance constraint, and solving it with an optimization algorithm to obtain the optimal question combination with the maximum knowledge point examination coverage.
8. The method for assembling an electronic examination paper according to claim 7, wherein the step of combining the selected questions into a complete test paper and performing final evaluation and optimization through a self-feedback neural network comprises: pre-training on assembled papers using an autoencoder, constructing an evaluation model and a generation model, forming a self-adaptive feedback structure through joint training, and adjusting the generation model by feeding back the evaluation result, so as to optimize the selected question combination.
9. A computer-readable storage medium storing program instructions which, when executed, carry out the method for assembling an electronic examination paper according to any one of claims 1 to 8.
10. An electronic test paper assembly system comprising the computer-readable storage medium of claim 9.
CN202410035078.0A 2024-01-09 2024-01-09 Method, medium and system for electronic examination paper Pending CN117852550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410035078.0A CN117852550A (en) 2024-01-09 2024-01-09 Method, medium and system for electronic examination paper

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410035078.0A CN117852550A (en) 2024-01-09 2024-01-09 Method, medium and system for electronic examination paper

Publications (1)

Publication Number Publication Date
CN117852550A true CN117852550A (en) 2024-04-09

Family

ID=90529992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410035078.0A Pending CN117852550A (en) 2024-01-09 2024-01-09 Method, medium and system for electronic examination paper

Country Status (1)

Country Link
CN (1) CN117852550A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118690731A (en) * 2024-08-28 2024-09-24 福建师范大学协和学院 Language teaching automatic group volume evaluation method, medium and equipment based on Internet

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination