CN111081103A

CN111081103A - A method for obtaining a dictation answer, a family teaching device and a storage medium

Info

Publication number: CN111081103A
Application number: CN201910409633.0A
Authority: CN
Inventors: 崔颖
Original assignee: Guangdong Genius Technology Co Ltd
Current assignee: Guangdong Genius Technology Co Ltd
Priority date: 2019-05-17
Filing date: 2019-05-17
Publication date: 2020-04-28
Anticipated expiration: 2039-05-17
Also published as: CN111081103B

Abstract

The embodiments of the invention relate to the technical field of education and disclose a dictation answer obtaining method, a family education device and a storage medium. The method comprises: shooting to obtain a first page before the user writes; broadcasting the pronunciation of the dictation content; shooting to obtain a second page after the user writes; identifying the first page and the second page respectively to obtain first page information and second page information; and comparing the second page information with the first page information to obtain the dictation answer written by the user. By implementing the embodiments of the invention, the dictation answer can be identified more accurately, which improves the accuracy of checking dictation answers.

Description

Dictation answer obtaining method, family education equipment and storage medium

Technical Field

The invention relates to the technical field of education, in particular to a dictation answer obtaining method, family education equipment and a storage medium.

Background

At present, student users often use family education equipment such as a point-reading machine or a learning tablet to practice dictation after class. After the user finishes the dictation practice, the family education equipment captures the page the user wrote on, identifies the dictation answers written by the user from the content of that page, and checks whether the dictation answers are correct. In practice, however, the page often already carries earlier handwriting before the dictation practice begins, or the user writes directly on paper that carries other printed text (e.g., a newspaper or an exercise book). When the family education equipment identifies the dictation answers, it therefore recognizes all of the text on the page, so that a lot of redundant non-answer information is mistaken for dictation answers and the accuracy with which the family education equipment checks the dictation answers is low.

Disclosure of Invention

In view of the above-mentioned drawbacks, embodiments of the present invention disclose a dictation answer obtaining method, a family education device, and a storage medium, which can identify more accurate dictation answers, thereby improving the accuracy of detecting the dictation answers.

The first aspect of the embodiments of the present invention discloses a dictation answer obtaining method, including:

shooting to obtain a first page before writing of a user;

broadcasting the pronunciation of the dictation content;

shooting to obtain a second page written by the user, wherein the second page is formed after the user writes on the first page according to the pronunciation of the dictation content;

identifying the first page to obtain first page information;

identifying the second page to obtain second page information;

and comparing the second page information with the first page information to obtain a dictation answer written by the user.

As an optional implementation manner, in the first aspect of the embodiment of the present invention, after shooting to obtain the first page before the user writes, the method further includes:

judging whether the first page is a blank page or not;

if the first page is not a blank page, extracting any characteristic region in the first page;

and after broadcasting the pronunciation of the dictation content, the method further comprises:

shooting in a preset time period to obtain a plurality of frames of user images;

judging whether the plurality of frames of user images all contain the characteristic region;

if the plurality of frames of user images all contain the characteristic region, judging whether the plurality of frames of user images show that the user has finished writing according to the pronunciation of the dictation content;

and if the writing is finished, executing the step of shooting to obtain a second page written by the user.

As an optional implementation manner, in the first aspect of the embodiment of the present invention, before the identifying the first page to obtain the first page information, the method further includes:

identifying a first region in the first page where the user's hand is located; removing the first region from the first page to obtain a target first page;

the identifying the first page to obtain first page information includes:

identifying the target first page to obtain first page information;

before the identifying the second page to obtain second page information, the method further includes:

identifying a second region in the second page where the user's hand is located; removing the second region from the second page to obtain a target second page;

the identifying the second page to obtain second page information includes:

and identifying the target second page to obtain second page information.

As an optional implementation manner, in the first aspect of the embodiment of the present invention, the dictation content includes several dictation words; the method further comprises the following steps:

while broadcasting the pronunciation of the dictation content, recording the writing position at which the user writes on the first page according to the pronunciation of each dictation word in the dictation content;

and after comparing the second page information with the first page information to obtain a dictation answer written by the user, the method further comprises:

performing word segmentation processing on the dictation answers to obtain a plurality of answer words;

acquiring the image position of each answer word in the second page;

determining dictation words corresponding to each answer word according to the image positions and the writing positions;

and for each answer word, correcting the answer word through a standard answer of the dictation word corresponding to the answer word.
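
The matching and correction steps above can be illustrated with a minimal sketch. It assumes that a writing position and an image position are both simple (x, y) points and that an answer word belongs to the dictation word whose recorded writing position is nearest; the disclosure does not specify the matching rule, and the names `DictationWord`, `AnswerWord` and `correct_answers` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class DictationWord:
    text: str          # standard answer for this dictation word
    position: tuple    # (x, y) recorded while the user was writing it


@dataclass
class AnswerWord:
    text: str          # handwriting recognized in the second page
    position: tuple    # (x, y) image position in the second page


def correct_answers(answer_words, dictation_words):
    """Pair each answer word with the dictation word written nearest to it
    (an assumed rule), then compare it with that word's standard answer."""
    results = []
    for answer in answer_words:
        nearest = min(
            dictation_words,
            key=lambda d: hypot(d.position[0] - answer.position[0],
                                d.position[1] - answer.position[1]),
        )
        results.append((answer.text, nearest.text, answer.text == nearest.text))
    return results


dictation = [DictationWord("apple", (10, 10)), DictationWord("banana", (10, 50))]
answers = [AnswerWord("banana", (12, 48)), AnswerWord("aple", (11, 9))]
for written, standard, ok in correct_answers(answers, dictation):
    print(written, standard, ok)
```

Matching by position rather than by order of appearance in the recognized text means that an answer is still checked against the right dictation word when the user writes the answers out of order on the page.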

As an optional implementation manner, in the first aspect of this embodiment of the present invention, the method further includes:

while broadcasting the pronunciation of the dictation content, recording the writing start time at which the user begins writing on the first page according to the pronunciation of each dictation word in the dictation content;

and after correcting each answer word through the standard answer of the dictation word corresponding to the answer word, the method further comprises:

sorting the answer words found correct in order of writing start time, from earliest to latest, to obtain a correct answer word sequence, and sorting the answer words found wrong in the same order to obtain a wrong answer word sequence; a wrong answer word does not match its corresponding standard answer, and a correct answer word matches its corresponding standard answer;

outputting a list containing the correct answer word sequence and/or the incorrect answer word sequence.
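
As an illustration only, the sorting step can be sketched as follows. The sketch assumes that each answer word has already been marked correct or wrong and that writing start times are available as seconds keyed by answer word; the function name `build_result_lists` and the data layout are hypothetical.

```python
def build_result_lists(corrected, start_times):
    """corrected: list of (answer_word, is_correct) pairs.
    start_times: dict mapping answer_word -> writing start time in seconds
    (an assumed representation of the recorded writing start time)."""
    # Order every answer word by when the user started writing it.
    ordered = sorted(corrected, key=lambda item: start_times[item[0]])
    # Split into the two sequences while keeping the time order.
    correct_seq = [word for word, ok in ordered if ok]
    wrong_seq = [word for word, ok in ordered if not ok]
    return correct_seq, wrong_seq


corrected = [("cat", True), ("dgo", False), ("bird", True), ("fsh", False)]
start_times = {"cat": 12.0, "dgo": 3.5, "bird": 1.0, "fsh": 20.0}
correct_seq, wrong_seq = build_result_lists(corrected, start_times)
print(correct_seq)
print(wrong_seq)
```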

A second aspect of an embodiment of the present invention discloses a family education device, including:

the shooting unit is used for shooting and obtaining a first page before writing of a user;

the broadcasting unit is used for broadcasting the pronunciation of the dictation content after the shooting unit shoots and obtains a first page before the user writes;

the shooting unit is further used for shooting a second page written by the user, wherein the second page is formed after the user writes on the first page according to the pronunciation of the dictation content;

an identifying unit configured to identify the first page to obtain first page information; and identifying the second page to obtain second page information;

and the comparison unit is used for comparing the second page information with the first page information to obtain the dictation answer written by the user.

As an optional implementation manner, in the second aspect of the embodiment of the present invention, the family education device further includes:

the judging unit is used for judging whether the first page is a blank page or not after the first page before the writing of the user is shot and obtained by the shooting unit;

an extracting unit, configured to extract any feature region in the first page when the determining unit determines that the first page is not the blank page;

the shooting unit is also used for shooting in a preset time period to obtain a plurality of frames of user images after the broadcast unit broadcasts the pronunciation of the dictation content;

the judging unit is further configured to judge whether the plurality of frames of user images all contain the characteristic region; when the plurality of frames of user images are judged to all contain the characteristic region, judge whether the plurality of frames of user images show that the user has finished writing according to the pronunciation of the dictation content; and when it is judged that the user has finished writing, trigger the shooting unit to execute the operation of shooting to obtain a second page written by the user.

As an optional implementation manner, in the second aspect of the embodiment of the present invention, the family education device further includes:

a cutting unit, configured to, before the identification unit identifies the first page to obtain first page information, identify a first region in the first page where a user's hand is located, and cut the first region from the first page to obtain a target first page; before the identification unit identifies the second page to obtain second page information, identifying a second area of the second page where the user hand is located, and removing the second area from the second page to obtain a target second page;

the identification unit is specifically used for identifying the target first page to obtain first page information; and identifying the target second page to obtain second page information.

As an optional implementation manner, in the second aspect of the embodiment of the present invention, the dictation content includes several dictation words; the family education device further includes:

the recording unit is used for recording the writing position of a user writing according to the pronunciation of each dictation word in the dictation contents on the first page in the process of broadcasting the pronunciation of the dictation contents by the broadcasting unit;

the word segmentation unit is used for performing word segmentation processing on the dictation answer to obtain a plurality of answer words after the comparison unit compares the second page information with the first page information to obtain the dictation answer written by the user;

the obtaining unit is used for obtaining the image position of each answer word in the second page;

the determining unit is used for determining dictation words corresponding to the answer words according to the image positions and the writing positions;

and the correcting unit is used for correcting each answer word through the standard answer of the dictation word corresponding to the answer word.

As an optional implementation manner, in the second aspect of the embodiment of the present invention, the recording unit is further configured to record, while the broadcasting unit broadcasts the pronunciation of the dictation content, the writing start time at which the user begins writing on the first page according to the pronunciation of each dictation word in the dictation content;

and, the family education device further comprises:

the ranking unit is used for, after the correcting unit corrects each answer word through the standard answer of the dictation word corresponding to the answer word, sorting the answer words found correct in order of writing start time, from earliest to latest, to obtain a correct answer word sequence, and sorting the answer words found wrong in the same order to obtain a wrong answer word sequence; a wrong answer word does not match its corresponding standard answer, and a correct answer word matches its corresponding standard answer;

and the output unit is used for outputting a list containing the correct answer word sequence and/or the wrong answer word sequence.

A third aspect of an embodiment of the present invention discloses a family education apparatus, including:

a memory storing executable program code;

a processor coupled with the memory;

the processor calls the executable program code stored in the memory to execute the dictation answer obtaining method disclosed by the first aspect of the embodiment of the invention.

A fourth aspect of the present invention discloses a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute a dictation answer acquisition method disclosed in the first aspect of the present invention.

A fifth aspect of embodiments of the present invention discloses a computer program product, which, when run on a computer, causes the computer to perform some or all of the steps of any one of the methods of the first aspect.

A sixth aspect of the present embodiment discloses an application publishing platform, where the application publishing platform is configured to publish a computer program product, where the computer program product is configured to, when running on a computer, cause the computer to perform part or all of the steps of any one of the methods in the first aspect.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

in the embodiment of the invention, the first page before the user writes is obtained by shooting, the pronunciation of the dictation content is reported, the second page after the user writes is obtained by shooting, then the first page and the second page are respectively identified, the first page information and the second page information are obtained, the second page information and the first page information are compared to obtain the dictation answer written by the user, the more accurate dictation answer can be identified, and the detection accuracy of the dictation answer is improved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flow chart of a dictation answer obtaining method disclosed in an embodiment of the present invention;

Fig. 2 is a schematic flow chart of another dictation answer obtaining method disclosed in the embodiment of the present invention;

Fig. 3 is a schematic flow chart of another dictation answer obtaining method disclosed in the embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a family education device disclosed in the embodiment of the present invention;

Fig. 5 is a schematic structural diagram of another family education device disclosed in the embodiment of the present invention;

Fig. 6 is a schematic structural diagram of another family education device disclosed in the embodiment of the present invention;

Fig. 7 is a diagram illustrating an example of a process in which a family education device shoots to obtain a first page or a second page according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It is to be noted that the terms "first", "second", and the like in the description and claims of the present invention are used for distinguishing different objects, and are not used for describing a specific order. The terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The embodiment of the invention discloses a dictation answer obtaining method, a family education device and a storage medium, which can identify dictation answers more accurately and thereby improve the accuracy of checking the dictation answers. The method is suitable for family education equipment such as family education machines, learning machines, point-reading machines, learning tablets or learning computers. The operating systems of the various types of family education devices include, but are not limited to, the Android operating system, the iOS operating system, the Symbian operating system, the BlackBerry operating system, the Windows Phone 8 operating system, and the like. The embodiment of the invention is described with the family education equipment as the execution subject, which does not limit the invention in any way. The following detailed description is made with reference to the accompanying drawings.

Example one

Referring to Fig. 1, Fig. 1 is a schematic flow chart of a dictation answer obtaining method according to an embodiment of the present invention. As shown in Fig. 1, the dictation answer obtaining method may include the following steps:

101. The family education equipment shoots and obtains a first page before the user writes.

It should be noted that the first page includes, but is not limited to, a blank page, a page with existing content, a newspaper page, a textbook page, an exercise book page, or any other page that can be written on.

In the embodiment of the invention, if the first page is a blank page, then after the family education equipment shoots and obtains the second page written by the user, it can directly identify the second page to obtain the second page information and use the second page information as the dictation answer written by the user.

In the embodiment of the invention, the shooting module used to capture images may be arranged on the face of the family education equipment that carries the display screen. A light reflecting device is also arranged on that face, and the mirror surface of the light reflecting device forms a preset angle with the lens surface of the shooting module. Referring to Fig. 7, Fig. 7 is a diagram illustrating an example of a process in which a family education device shoots to obtain a first page or a second page according to an embodiment of the present invention. As shown in Fig. 7, the family education equipment may control the shooting module to shoot the mirror image in the light reflecting device, as the first page before the user writes or the second page after the user writes, in the following way: the family education device 10 may be provided with a shooting module 20, and the shooting module 20 is used for shooting to obtain the first page or the second page; a light reflecting device 30 (e.g., a mirror, a prism, a convex lens, or the like) may further be arranged directly in front of the shooting module 20, where the light reflecting device 30 is configured to change the light path of the shooting module, so that the shooting module 20 shoots the first page before the user writes, or the second page after the user writes, on the carrier 40. Because the shooting module 20 of the family education device 10 shoots the image of the carrier 40 formed in the light reflecting device 30, the placement of the family education device 10 does not need to be changed manually, which simplifies the shooting process and improves the shooting efficiency. The carrier 40 may be a book, an exercise book, a picture book, a test paper, etc. placed on a desktop, and the embodiment of the present invention is not limited in particular.

102. The family education equipment broadcasts the pronunciation of the dictation content.

In the embodiment of the present invention, the dictation content includes, but is not limited to, dictation single words, dictation word groups, dictation text segments or dictation articles, and the like. Meanwhile, the subject to which the dictation content belongs includes, but is not limited to, Chinese, English, music, chemistry, and the like, which is not specifically limited in the embodiment of the present invention.

103. The family education equipment shoots and obtains a second page written by the user, and the second page is formed after the user writes on the first page according to the pronunciation of the dictation content.

In the embodiment of the present invention, the family education device may shoot periodically to obtain the second page written by the user. For example, each time the family education device broadcasts the pronunciation of one item of dictation content and detects that the user has written one dictation answer according to that item, the family education device shoots once, so that a plurality of second pages written by the user are obtained over the whole dictation process. In this way, each dictation answer written by the user can be identified in real time, which improves the efficiency of acquiring the dictation answers.

As another example, each time the family education device detects that the user has written a specified number of dictation answers, the family education device shoots once, so that a plurality of second pages written by the user are obtained over the whole dictation process; the specified number may be preset, and its value may be an integer such as 2, 3, or 4, which is not particularly limited in the present invention. In this way, the power consumption of the equipment can be reduced while the dictation answers are still identified in near real time, which improves battery life.

In some other possible embodiments, the specific implementation of the family education device capturing the second page written by the user may be that the capturing is performed only when the dictation quitting instruction input by the user is received or the family education device is detected to quit the dictation mode, so as to obtain the second page written by the user. In this case, the family education device only captures a first page before writing by the user and a second page after writing by the user in the whole dictation process, and the first page and the second page are respectively obtained when the family education device enters and exits the dictation mode.

As an optional implementation manner, if the family education device periodically shoots and obtains the second page written by the user, the method may specifically include: after broadcasting the pronunciation of the dictation content and when the waiting time reaches the preset time, the family education equipment shoots and obtains a second page written by the user, wherein the waiting time is obtained by starting timing at the broadcasting finishing moment; and taking the second page written by the user as the first page before the user writes in the next period, and entering the next period.

By implementing the implementation mode, the pages before and after the user writes are obtained through periodic shooting, one dictation answer written by the user each time can be identified in real time, and therefore the acquisition efficiency of the dictation answer is improved.
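
The periodic scheme, in which the second page of one period serves as the first page of the next, can be sketched as follows. The sketch assumes that page information is a list of recognized text items and that repeated items are counted, so that an answer identical to text already on the page is still detected; the function name `answers_per_period` is hypothetical.

```python
from collections import Counter


def answers_per_period(first_page_info, later_page_infos):
    """Yield the items newly written in each period.

    first_page_info: recognized items on the page before any writing.
    later_page_infos: recognized items from the capture at the end of each period.
    """
    baseline = Counter(first_page_info)
    for page_info in later_page_infos:
        current = Counter(page_info)
        # Items present now but not in the previous capture (multiset difference).
        yield sorted((current - baseline).elements())
        # This capture becomes the "first page" of the next period.
        baseline = current


first = ["Unit 3", "apple"]
captures = [
    ["Unit 3", "apple", "river"],
    ["Unit 3", "apple", "river", "apple"],
]
for period, new_items in enumerate(answers_per_period(first, captures), start=1):
    print(period, new_items)
```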

As another optional implementation, step 103 may specifically include the following steps: when the family education equipment is in a dictation mode, the family education equipment controls the camera module arranged on it to shoot the mirror image in the light reflecting device at a preset frequency as the page being written by the user; the family education equipment detects in real time whether each writing area on the page being written by the user contains written content; and if every writing area contains written content, the family education equipment shoots to obtain a second page written by the user.

It can be understood that if every writing area contains written content, it can be determined that the page being written by the user is full or nearly full. At this time, the family education device can perform the operation of shooting the second page written by the user; otherwise, it continues to control the camera module to shoot the mirror image in the light reflecting device at the preset frequency as the page being written by the user. The writing area may be part or all of the area of the page being written by the user.

By implementing this implementation, the second page written by the user is shot only when the page being written is full or nearly full, and the dictation answer of the user is obtained and checked in combination with the first page before the user writes. This avoids the high power consumption caused by periodically shooting the pages before and after the user writes, further reduces the power consumption of the equipment, and improves battery life.
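
A minimal sketch of the "every writing area contains written content" check is given below. It assumes that the captured page has already been binarized into a grid of 0 (background) and 1 (ink), that writing areas are rectangles given as (top, left, bottom, right), and that an area counts as written once it holds a minimum number of ink pixels; these representations and the names `area_has_writing` and `page_is_full` are illustrative assumptions.

```python
def area_has_writing(page, area, min_ink_pixels=1):
    """Return True if the rectangular area holds at least min_ink_pixels ink pixels."""
    top, left, bottom, right = area
    ink = sum(page[r][c] for r in range(top, bottom) for c in range(left, right))
    return ink >= min_ink_pixels


def page_is_full(page, areas, min_ink_pixels=1):
    """Return True only when every writing area contains written content."""
    return all(area_has_writing(page, area, min_ink_pixels) for area in areas)


page = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
areas = [(0, 0, 2, 2), (2, 2, 4, 4)]
print(page_is_full(page, areas))   # second area is still empty
page[3][3] = 1                     # the user writes in the second area
print(page_is_full(page, areas))
```

Only when `page_is_full` returns true would the device shoot the second page; until then it keeps sampling at the preset frequency.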

104. The family education device identifies the first page to obtain first page information.

It should be noted that, in the embodiment of the present invention, character recognition may be performed through Optical Character Recognition (OCR). OCR generally comprises operations such as image preprocessing, character recognition, and recognition result optimization; image preprocessing generally includes the following steps: graying, binarization, noise reduction, tilt correction, character segmentation, and the like.
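
The first two preprocessing steps, graying and binarization, can be illustrated with a small sketch on a nested list of RGB pixels. It uses the common luminance weights 0.299, 0.587 and 0.114 and a fixed threshold of 128; both choices are assumptions for illustration, since the disclosure does not specify them, and a practical system would normally rely on an image library and an adaptive threshold.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) pixels to gray levels in 0..255."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def binarize(gray_image, threshold=128):
    """Mark dark pixels (ink) as 1 and light pixels (paper) as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray_image]


image = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 200, 200), (30, 30, 30)],
]
print(binarize(to_gray(image)))
```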

105. The family education device identifies the second page to obtain second page information.

106. The family education device compares the second page information with the first page information to obtain a dictation answer written by the user.

It can be understood that the information existing before and after the user writes, that is, the common information appearing in the second page information and the first page information does not belong to the dictation answer written by the user according to the dictation content broadcasted in the dictation process, and belongs to redundant invalid information. Therefore, common information appearing in both the second page information and the first page information is filtered from the second page information, and new information existing in the second page information but not existing in the first page information is obtained, so that the new information is used as a dictation answer written by a user.
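
The filtering described above can be sketched as follows, assuming that page information is an ordered list of recognized text items. Each item of the first page cancels at most one matching item of the second page, so an answer that happens to repeat pre-existing text is not discarded; the function name `extract_dictation_answer` is hypothetical.

```python
def extract_dictation_answer(first_page_info, second_page_info):
    """Return the items of the second page that were not on the first page,
    in the order in which they appear on the second page."""
    remaining = list(first_page_info)   # pre-existing items not yet matched
    answer = []
    for item in second_page_info:
        if item in remaining:
            remaining.remove(item)      # common information: filter it out
        else:
            answer.append(item)         # new information: part of the answer
    return answer


first = ["Homework", "1.", "2."]
second = ["Homework", "1.", "sunshine", "2.", "mountain"]
print(extract_dictation_answer(first, second))
```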

It can be seen that, by implementing the method described in fig. 1, a first page before the user writes is obtained by shooting, the pronunciation of the dictation content is reported, a second page after the user writes is obtained by shooting, then the first page and the second page are respectively identified, the first page information and the second page information are obtained, the second page information and the first page information are compared to obtain the dictation answer written by the user, a more accurate dictation answer can be identified, and thus the accuracy rate of detecting the dictation answer is improved.

Example two

Referring to fig. 2, fig. 2 is a schematic flow chart of another dictation answer obtaining method disclosed in the embodiment of the present invention. As shown in fig. 2, the dictation answer obtaining method may include the following steps:

201. The family education equipment shoots and obtains a first page before the user writes.

202. The family education device judges whether the first page is a blank page. If not, go to step 203; otherwise, the flow is ended.

203. The family education equipment extracts any characteristic region in the first page.

The characteristic region may be a shape region or a character region.

For example, assume that the characteristic region is a character region, namely the first phrase "central idea" located at the upper left corner of the first page. Then, in step 206, the family education device determines whether the user has turned the page by continuously detecting whether the phrase "central idea" is still present at the upper left corner of each captured user image. It can be understood that if the phrase is still present, it is judged that the user has not turned the page; if it is no longer present, it is judged that the user has turned the page.
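
The page-turn check in this example can be sketched as follows. The sketch assumes that each user image has already been recognized into (text, (x, y)) items and that the characteristic region is described by its text and approximate position, matched within a pixel tolerance; these representations and the names `frame_contains_feature` and `no_page_turn` are illustrative assumptions.

```python
def frame_contains_feature(frame_items, feature, tolerance=20):
    """Return True if the frame holds the feature text near its original position."""
    text, (fx, fy) = feature
    return any(
        item_text == text and abs(x - fx) <= tolerance and abs(y - fy) <= tolerance
        for item_text, (x, y) in frame_items
    )


def no_page_turn(frames, feature, tolerance=20):
    """Return True only if every frame still contains the characteristic region."""
    return all(frame_contains_feature(frame, feature, tolerance) for frame in frames)


feature = ("central idea", (15, 12))
frames = [
    [("central idea", (16, 13)), ("river", (40, 80))],
    [("central idea", (14, 10))],
]
print(no_page_turn(frames, feature))
frames.append([("chapter 2", (15, 12))])   # a different page is now in view
print(no_page_turn(frames, feature))
```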

204. The family education equipment broadcasts the pronunciation of the dictation content.

205. The family education equipment shoots and obtains a plurality of frames of user images in a preset time period.

The preset time period can be set manually according to actual conditions. In practical applications, the preset time period may be the waiting time described in Example one, which is timed from the moment the broadcast of the dictation content ends.

206. The family education device judges whether the plurality of frames of user images all contain the characteristic region. If they all contain the characteristic region, go to step 207; otherwise, the flow is ended.

207. The family education equipment judges whether the plurality of frames of user images show that the user has finished writing according to the pronunciation of the dictation content. If the writing is finished, go to step 208; otherwise, the flow is ended.

As an optional implementation manner, the family education device may identify a user action region (i.e., a region containing a gesture that indicates a user action) in the plurality of frames of user images through deep learning or similar techniques, and determine whether the current gesture of the user in the user action region matches a preset gesture indicating that writing is completed; if they match, it is judged that the user has finished writing according to the pronunciation of the dictation content.

Further optionally, the family education device may specifically obtain an action sequence corresponding to the user action target by tracking the user action target in each frame of the user image, pre-process the action sequence, input the pre-processed action sequence into a pre-trained action classification model, extract a deep action feature of the action sequence in the action classification model, and identify, according to the deep action feature, whether the action sequence is used for describing that the user has finished writing according to the pronunciation of the dictation content in the action classification model.

The preprocessing comprises one or more of screening of action sequences, equalization of images, normalization of images, action correction and image scaling. The pre-trained action classification model can be constructed and trained by taking a deep convolutional neural network as a baseline network.

Through this implementation, the accuracy of judging whether the user images describe that the user has finished writing according to the pronunciation of the dictation content can be improved, which reduces false triggering of the family education device to shoot the second page written by the user and thus improves the accuracy of obtaining the dictation answer.
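The decision flow of steps 205 to 207 can be sketched as follows. This is only a minimal illustration: the function name `should_capture_second_page` and the two callables passed in (a feature-region detector and a writing-finished classifier) are hypothetical placeholders for whatever detection and classification models an implementation actually uses, and returning False for an empty frame list is an assumption.

```python
from typing import Callable, Sequence, TypeVar

Frame = TypeVar("Frame")


def should_capture_second_page(
    frames: Sequence[Frame],
    contains_feature_region: Callable[[Frame], bool],
    shows_writing_finished: Callable[[Sequence[Frame]], bool],
) -> bool:
    """Return True only when the checks of steps 206 and 207 both pass."""
    # Assumption: with no frames there is no evidence, so do not shoot.
    if not frames:
        return False
    # Step 206: every frame must still contain the feature region that was
    # extracted from the first page (i.e. the user has not turned the page).
    if not all(contains_feature_region(frame) for frame in frames):
        return False
    # Step 207: the frames together must describe that writing is finished.
    return shows_writing_finished(frames)
```

Passing the detectors in as parameters keeps the flow itself independent of any particular model.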

208. The family education device shoots to obtain a second page written by the user.

209. The family education device identifies the first page to obtain first page information.

As an alternative embodiment, before performing step 209, the family education device may further identify a first region of the first page where the user's hand is located, and remove the first region from the first page to obtain the target first page. Therefore, the specific implementation of step 209 is: the family education device identifies the target first page to obtain first page information. By implementing the embodiment, the recognition interference caused by the area where the hand of the user is located can be eliminated, so that the recognition accuracy of the page information is improved.

210. The family education device identifies the second page to obtain second page information.

As an alternative embodiment, before performing step 210, the family education device may further identify a second region of the second page where the user's hand is located, and remove the second region from the second page to obtain the target second page. Thus, the specific implementation of step 210 is: the family education device identifies the target second page to obtain second page information. By implementing the embodiment, the recognition interference caused by the area where the hand of the user is located can be eliminated, so that the recognition accuracy of the page information is improved.
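Removing the hand region before recognition can be illustrated with a small sketch. It assumes the page is a grayscale image stored as a list of pixel rows and that the hand region has already been located as a bounding box; how the box is found is not specified here, and the names `remove_region`, `Image` and `Box` are illustrative only.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels, row-major
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def remove_region(page: Image, box: Box, fill: int = 255) -> Image:
    """Return a copy of the page with the boxed region painted over."""
    top, left, bottom, right = box
    result = [row[:] for row in page]  # leave the input page untouched
    for r in range(max(top, 0), min(bottom, len(result))):
        for c in range(max(left, 0), min(right, len(result[r]))):
            result[r][c] = fill  # assumed background value (white)
    return result
```

The same helper would be applied to the first page and to the second page to obtain the target first page and the target second page.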

211. The family education device compares the second page information with the first page information to obtain a dictation answer written by the user.
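The comparison in step 211 can be sketched as a difference between two sets of recognized items. The representation below is an assumption made for illustration: page information is modeled as a list of recognized text items with pixel coordinates, and an item of the second page is treated as pre-existing if the first page holds the same text at roughly the same position. `TextItem`, `extract_dictation_answer` and the `tolerance` parameter are hypothetical names.

```python
from typing import List, NamedTuple


class TextItem(NamedTuple):
    text: str
    x: int
    y: int


def extract_dictation_answer(
    first_info: List[TextItem],
    second_info: List[TextItem],
    tolerance: int = 5,
) -> List[TextItem]:
    """Keep the items of the second page that were not on the first page."""

    def already_present(item: TextItem) -> bool:
        return any(
            item.text == old.text
            and abs(item.x - old.x) <= tolerance
            and abs(item.y - old.y) <= tolerance
            for old in first_info
        )

    return [item for item in second_info if not already_present(item)]
```

The tolerance absorbs small shifts between the two shots; a real device would likely need image registration before such a comparison.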

It can be seen that, compared with the method described in fig. 1, the method described in fig. 2 extracts a feature region from the first page when the first page shot before the user writes is not a blank page, and then detects that feature region in the user images to judge whether the user has turned the page. The second page is shot only if the user has not turned the page and is detected to have finished writing. This overcomes the problem that the second page and the first page are not the same page because the user turned the page, and thus improves the accuracy of obtaining the dictation answer.

EXAMPLE III

Referring to fig. 3, fig. 3 is a schematic flow chart of another dictation answer obtaining method disclosed in the embodiment of the present invention. As shown in fig. 3, the dictation answer obtaining method may include the following steps:

301 to 302. For the description of steps 301 to 302, please refer to the detailed description of steps 101 to 102 in the first embodiment, which is not repeated herein. The dictation content comprises a plurality of dictation words, and the dictation answer comprises a plurality of answer words.

303. In the process of broadcasting the pronunciation of the dictation content, the family education equipment records the writing position of the user writing on the first page according to the pronunciation of each dictation word in the dictation content.

It should be noted that, in some possible scenarios, the user does not necessarily write in a fixed order such as from left to right or from top to bottom, but may write at arbitrary positions on the first page.

For example, assuming that the first page is a newspaper, the newspaper obviously has no dedicated lines for the user to write on, so the user may write in any blank space on the first page, and the answer words in the dictation answer recognized by the image recognition technology are then not arranged in the order in which the pronunciations of the dictation words were broadcast. In such a scenario, if the standard answers are obtained in the broadcast order of the dictation words and used to correct the answer words, the correction becomes disordered and the correction accuracy is too low.

In summary, by executing step 303 and steps 308 to 311, the writing position at which the user writes on the first page according to the pronunciation of each dictation word can be recorded in real time. After the dictation answer is recognized, the dictation word corresponding to each answer word is determined from the image position of that answer word in the second page, which yields the standard answer for each answer word, and the answer word is then corrected against that standard answer. This overcomes the problem of disordered correction that arises when the standard answers are obtained in the broadcast order of the dictation words, improves the correction accuracy, and makes the family education device more intelligent.

As another optional implementation manner, in the process of broadcasting the pronunciation of the dictation content, while recording the writing position at which the user writes on the first page according to the pronunciation of each dictation word, the family education device may also record the writing start time at which the user starts writing on the first page according to the pronunciation of each dictation word.

Based on this, after step 311 is performed, the following steps may also be performed: the family education device sorts the corrected correct answer words in order of writing start time from earliest to latest to obtain a correct answer word sequence, and sorts the corrected wrong answer words in the same order to obtain a wrong answer word sequence, where a wrong answer word does not match its corresponding standard answer and a correct answer word matches its corresponding standard answer; the family education device then outputs a list containing the correct answer word sequence and/or the wrong answer word sequence.

By implementing the embodiment, the list containing the correct answer word sequence and/or the wrong answer word sequence is output according to the broadcasting sequence, so that the impression of the user on the answer words can be enhanced, the memory of the answer words is enhanced, and the dictation effect is improved.

Further optionally, the way for the family education device to output the list containing the wrong answer word sequence may specifically be: the family education equipment outputs a list containing a wrong answer word sequence and a corresponding standard answer sequence, wherein the wrong answer word sequence comprises a plurality of wrong answer words, and the standard answer sequence comprises a plurality of standard answers corresponding to the wrong answer words one by one.

For example, assuming that the wrong answer word sequence is "sand-sea-like-star solution", the corresponding standard answer sequence is "shark-ocean-planet", wherein the wrong answer word "sand" corresponds to the standard answer "shark", the wrong answer word "sea-like" corresponds to the standard answer "ocean", and the wrong answer word "star solution" corresponds to the standard answer "planet". The standard answers "shark", "ocean", and "planet" are in the same order as the broadcast of the pronunciations of the corresponding dictation words; specifically, the broadcast order is "sha (first tone) yu (second tone)", "hai (third tone) yang (second tone)", and "xing (first tone) qiu (second tone)".
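The sorting and pairing described above can be sketched as follows. The data model is assumed for illustration: each corrected answer word carries the recognized answer, its standard answer and the recorded writing start time, and a word counts as correct when the two strings are equal. `CorrectedWord` and `build_result_lists` are hypothetical names.

```python
from typing import List, NamedTuple, Tuple


class CorrectedWord(NamedTuple):
    answer: str        # what the user wrote
    standard: str      # standard answer of the matched dictation word
    start_time: float  # recorded writing start time, in seconds


def build_result_lists(
    words: List[CorrectedWord],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split corrected words into correct and wrong lists, earliest first."""
    ordered = sorted(words, key=lambda w: w.start_time)
    correct = [w.answer for w in ordered if w.answer == w.standard]
    # Each wrong answer word is paired one-to-one with its standard answer.
    wrong = [(w.answer, w.standard) for w in ordered if w.answer != w.standard]
    return correct, wrong
```

Keeping the wrong answer and its standard answer in one tuple preserves the one-to-one correspondence when the list is displayed.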

304 to 307. For the description of steps 304-307, please refer to the detailed description of steps 103-106 in the first embodiment, which is not repeated herein.

308. The family education equipment carries out word segmentation processing on the dictation answers to obtain a plurality of answer words.

In the embodiment of the invention, the family education device can specifically recombine the character sequence in the dictation answer into a word sequence according to a certain rule. Assuming that the subject to which the dictation content belongs is Chinese, the dictation answer is also a Chinese text, and the family education device can perform the word segmentation by methods such as word segmentation based on character string matching, word segmentation based on understanding, or word segmentation based on statistics.
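As one illustration of word segmentation based on character string matching, the sketch below uses forward maximum matching against a vocabulary. The source does not say which matching strategy is used, so this choice, the function name `forward_max_match`, and the sample vocabulary are all assumptions.

```python
from typing import List, Set


def forward_max_match(text: str, vocabulary: Set[str]) -> List[str]:
    """Segment text by repeatedly taking the longest vocabulary word."""
    max_len = max([len(word) for word in vocabulary] + [1])
    words: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character
        # at a time; a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words
```

For a dictation exercise, the vocabulary could simply be the set of dictation words that were broadcast, which keeps the matching small and fast.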

309. The family education device acquires the image position of each answer word in the second page.

310. The family education device determines the dictation word corresponding to each answer word according to the image position and the writing position.

311. For each answer word, the family education device corrects the answer word against the standard answer of the dictation word corresponding to that answer word.
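Steps 309 to 311 can be sketched as a nearest-position match followed by a comparison. The sketch assumes that writing positions and image positions are expressed in the same coordinate system, that the standard answer of a dictation word is the word itself, and that the closest recorded writing position identifies the dictation word; none of these details is fixed by the description, and `Correction` and `correct_by_position` are hypothetical names.

```python
import math
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Correction(NamedTuple):
    answer: str
    dictation_word: str
    is_correct: bool


def correct_by_position(
    answers: List[Tuple[str, Point]],
    writing_positions: List[Tuple[str, Point]],
) -> List[Correction]:
    """Match each answer word to the nearest recorded writing position."""
    results: List[Correction] = []
    for answer, image_pos in answers:
        # Step 310: pick the dictation word whose recorded writing position
        # lies closest to the image position of this answer word.
        word, _ = min(
            writing_positions,
            key=lambda entry: math.dist(entry[1], image_pos),
        )
        # Step 311: correct the answer word against the standard answer.
        results.append(Correction(answer, word, answer == word))
    return results
```

Because the match depends only on position, the result is the same regardless of the order in which the answer words were recognized.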

It can be seen that, compared with the method described in the embodiment of fig. 1, the method described in fig. 3 records in real time the writing position at which the user writes on the first page according to the pronunciation of each dictation word. After the dictation answer is recognized, the dictation word corresponding to each answer word is determined from the image position of that answer word in the second page, which yields the standard answer for each answer word, and the answer word is then corrected against that standard answer. This overcomes the problem of disordered correction that arises when the standard answers are obtained in the broadcast order of the dictation words, improves the correction accuracy, and makes the family education device more intelligent.

In addition, a list containing a correct answer word sequence and/or an incorrect answer word sequence can be output according to the broadcasting sequence, so that the impression of the user on the answer words can be deepened, the memory of the answer words is deepened, and the dictation effect is improved.

EXAMPLE four

Referring to fig. 4, fig. 4 is a schematic structural diagram of a family education device according to an embodiment of the present invention. As shown in fig. 4, the family education device may include:

the shooting unit 401 is configured to shoot and obtain a first page before writing by the user.

The broadcasting unit 402 is configured to broadcast the pronunciation of the dictation content after the shooting unit 401 shoots and obtains the first page before the user writes.

The shooting unit 401 is further configured to shoot a second page written by the user, where the second page is formed after the user writes on the first page according to the pronunciation of the dictation content.

The identifying unit 403 is configured to identify the first page to obtain first page information, and to identify the second page to obtain second page information.

The comparing unit 404 is configured to compare the second page information with the first page information to obtain a dictation answer written by the user.

As an alternative embodiment, the comparing unit 404 is further configured to, when the first page before the user writes, shot by the shooting unit 401, is a blank homework page, take the second page information obtained by the identifying unit 403 identifying the second page as the dictation answer written by the user.

As an alternative embodiment, the manner of capturing by the capturing unit 401 to obtain the second page written by the user may specifically be to capture periodically.

Further optionally, the manner of the above-mentioned shooting unit 401 for periodically shooting to obtain the second page written by the user may specifically be:

the shooting unit 401 is configured to, after the broadcasting unit 402 finishes broadcasting the pronunciation of the dictation content and the waiting time reaches the preset duration, shoot the second page written by the user, where the waiting time is timed from the moment the broadcast is completed; and to take the second page written by the user as the first page before the user writes in the next period, and enter the next period. By implementing this implementation, the pages before and after the user writes are obtained through periodic shooting, so that the dictation answer written by the user in each period can be identified in real time, which improves the efficiency of obtaining the dictation answer.
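The periodic shooting can be sketched as a generator that yields one (first page, second page) pair per period. The camera, speaker and timer are passed in as callables because the description does not define their interfaces; `capture_periods` and its parameter names are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

Page = TypeVar("Page")


def capture_periods(
    capture_page: Callable[[], Page],
    broadcast: Callable[[str], None],
    items: Iterable[str],
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[Page, Page]]:
    """Yield the pages before and after writing for each dictation item."""
    first_page = capture_page()  # first page before the user writes
    for item in items:
        broadcast(item)          # broadcast the pronunciation
        sleep(wait_seconds)      # waiting time, timed from broadcast end
        second_page = capture_page()
        yield first_page, second_page
        # The second page becomes the first page of the next period.
        first_page = second_page
```

Each yielded pair can then be recognized and compared to obtain the answer written in that period.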

As another optional implementation, the manner in which the shooting unit 401 is used to shoot and obtain the second page written by the user may specifically be:

the shooting unit 401 is configured to, when the family education device is in the dictation mode, control the camera module installed in the family education device to shoot, at a preset frequency, the mirror image in the light reflecting device as the page being written by the user; detect in real time whether writing content has been written in every writing area on the page being written by the user; and if so, shoot to obtain the second page written by the user.

It can be understood that if writing content has been written in every writing area, it can be determined that the page being written by the user is full or nearly full, and the shooting unit 401 can perform the operation of shooting to obtain the second page written by the user; otherwise, it continues to control the camera module installed in the family education device to shoot the mirror image in the light reflecting device at the preset frequency as the page being written by the user. The writing area may be part or all of the area of the page being written by the user.
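The check that every writing area contains writing can be sketched with a simple ink-pixel count. This assumes a grayscale image stored as rows of pixel values, writing areas given as bounding boxes, and dark pixels counted as ink; the thresholds and the name `all_areas_written` are illustrative assumptions, not values from the description.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels, row-major
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def all_areas_written(
    page: Image,
    areas: List[Box],
    ink_threshold: int = 128,
    min_ink_pixels: int = 1,
) -> bool:
    """Return True when every writing area holds enough dark pixels."""

    def has_writing(box: Box) -> bool:
        top, left, bottom, right = box
        ink = sum(
            1
            for row in page[top:bottom]
            for pixel in row[left:right]
            if pixel < ink_threshold
        )
        return ink >= min_ink_pixels

    # With no writing areas defined there is nothing to confirm.
    return bool(areas) and all(has_writing(box) for box in areas)
```

The device would call such a check on each frame shot at the preset frequency and shoot the second page once it returns True.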

It can be seen that the family education device described in fig. 4 shoots the first page before the user writes, broadcasts the pronunciation of the dictation content, and shoots the second page after the user writes; it then identifies the first page and the second page respectively to obtain the first page information and the second page information, and compares the second page information with the first page information to obtain the dictation answer written by the user. A more accurate dictation answer can thus be identified, which improves the detection accuracy of the dictation answer.

EXAMPLE five

Referring to fig. 5, fig. 5 is a schematic structural diagram of another family education device disclosed in the embodiment of the present invention. The family education device shown in fig. 5 is obtained by optimizing the family education device shown in fig. 4. Compared with fig. 4, the family education device shown in fig. 5 may further include:

a determination unit 405 configured to determine whether the first page is a blank page after the photographing unit 401 photographs the first page before the user writes.

An extracting unit 406, configured to extract any feature region in the first page when the determining unit 405 determines that the first page is not a blank page.

The shooting unit 401 is further configured to shoot a plurality of frames of user images within a preset time period after the broadcasting unit 402 broadcasts the pronunciation of the dictation content.

The determining unit 405 is further configured to determine whether each of the plurality of frames of user images includes a feature area; when the plurality of frames of user images are judged to contain the characteristic area, judging whether the plurality of frames of user images are used for describing that the user finishes writing according to the pronunciation of the dictation content; and when the user writing is judged to be finished, triggering the shooting unit 401 to execute the operation of shooting to obtain the second page written by the user.

By implementing this implementation, when the first page shot before the user writes is not a blank page, a feature region is extracted from the first page and then detected in the user images to judge whether the user has turned the page. The second page is shot only if the user has not turned the page and is detected to have finished writing. This overcomes the problem that the second page and the first page are not the same page because the user turned the page, and thus improves the accuracy of obtaining the dictation answer.

As an alternative implementation, the manner that the above-mentioned determining unit 405 is used to determine whether several frames of user images are used for describing that the user has finished writing according to the pronunciation of the dictation content may specifically be:

the above-mentioned determining unit 405 is configured to identify a user action region (that is, a region containing a gesture that indicates a user action) in the plurality of frames of user images through deep learning or similar techniques, and determine whether the current gesture of the user in the user action region matches a preset gesture that indicates that writing is completed; if they match, it is judged that the user has finished writing according to the pronunciation of the dictation content.

Further optionally, the manner that the aforementioned determining unit 405 is configured to determine whether the frames of user images are used for describing that the user has finished writing according to the pronunciation of the dictation content may specifically be:

the above-mentioned determining unit 405 is configured to track the user action target in each frame of user image to obtain an action sequence corresponding to the user action target, pre-process the action sequence, and input the pre-processed action sequence into a pre-trained action classification model, so that the action classification model extracts a deep action feature of the action sequence and identifies, according to the deep action feature, whether the action sequence describes that the user has finished writing according to the pronunciation of the dictation content.

As an alternative embodiment, the family education device shown in fig. 5 may further include:

a matting unit 407 configured to, before the identifying unit 403 identifies the first page to obtain the first page information, identify a first region in the first page where the user's hand is located, and remove the first region from the first page to obtain a target first page; and, before the identifying unit 403 identifies the second page to obtain the second page information, identify a second region in the second page where the user's hand is located, and remove the second region from the second page to obtain a target second page.

The identifying unit 403 is specifically configured to identify a target first page to obtain first page information; and identifying the target second page to obtain second page information.

By implementing the embodiment, the recognition interference caused by the area where the hand of the user is located can be eliminated, so that the recognition accuracy of the page information is improved.

As an alternative embodiment, the dictation content comprises a plurality of dictation words; the family education device shown in fig. 5 may further include:

the recording unit 408 is configured to record, in the process of the broadcasting unit 402 broadcasting the pronunciation of the dictation content, the writing position at which the user writes on the first page according to the pronunciation of each dictation word in the dictation content.

A word segmentation unit 409, configured to perform word segmentation processing on the dictation answer after the comparison unit 404 compares the second page information with the first page information to obtain a dictation answer written by the user, so as to obtain a plurality of answer words.

An obtaining unit 410, configured to obtain an image position of each answer word in the second page.

The determining unit 411 is configured to determine a dictation word corresponding to each answer word according to the image position and the writing position.

And a correcting unit 412, configured to correct, for each answer word, the answer word with a standard answer of the dictation word corresponding to the answer word.

By implementing this implementation, the writing position at which the user writes on the first page according to the pronunciation of each dictation word is recorded in real time. After the dictation answer is recognized, the dictation word corresponding to each answer word is determined from the image position of that answer word in the second page, which yields the standard answer for each answer word, and the answer word is then corrected against that standard answer. This overcomes the problem of disordered correction that arises when the standard answers are obtained in the broadcast order of the dictation words, improves the correction accuracy, and makes the family education device more intelligent.

As another alternative embodiment, in the family education device shown in fig. 5, the recording unit 408 is further configured to record, in the process of the broadcasting unit 402 broadcasting the pronunciation of the dictation content, the writing start time at which the user starts writing on the first page according to the pronunciation of each dictation word in the dictation content.

And, the family education device may further include:

a sorting unit 413, configured to, after the correcting unit 412 corrects each answer word against the standard answer of the dictation word corresponding to the answer word, sort the corrected correct answer words in order of writing start time from earliest to latest to obtain a correct answer word sequence, and sort the corrected wrong answer words in the same order to obtain a wrong answer word sequence; a wrong answer word does not match its corresponding standard answer, and a correct answer word matches its corresponding standard answer.

The output unit 414 is configured to output a list including the correct answer word sequence and/or the incorrect answer word sequence.

As an alternative implementation, the manner of outputting the list containing the wrong answer word sequence by the output unit 414 may specifically be:

an output unit 414, configured to output a list including a word sequence of wrong answers and a corresponding sequence of standard answers; the standard answer sequence comprises a plurality of standard answers corresponding to the wrong answer words one by one.

By implementing the above embodiment, the list including the correct answer word sequence and/or the wrong answer word sequence is output according to the broadcasting sequence, so that the user can be helped to deepen the impression of the answer words, the memory of the answer words is deepened, and the dictation effect is improved.

Therefore, compared with the family education device described in the embodiment of fig. 4, the family education device described in the embodiment of fig. 5 can overcome the problem that the second page and the first page which are shot are not the same page due to the fact that the user performs the page turning action, and therefore the accuracy rate of obtaining the dictation answer is improved.

In addition, the recognition interference caused by the area where the user hand is located can be eliminated, and therefore the recognition accuracy of the page information is improved.

In addition, the problem that the answer words are corrected by acquiring the corresponding standard answers according to the pronunciation broadcasting sequence of the dictation words and the correction is disordered can be solved, so that the correction accuracy is improved, and the family education equipment is more intelligent.

Furthermore, the device can help the user deepen the impression of the answer words, thereby deepening the memory of the answer words and improving the dictation effect.

EXAMPLE six

Referring to fig. 6, fig. 6 is a schematic structural diagram of another family education device according to an embodiment of the present invention. As shown in fig. 6, the family education device may include:

a memory 601 in which executable program code is stored;

a processor 602 coupled to a memory 601;

the processor 602 calls the executable program code stored in the memory 601 to execute any one of the dictation answer obtaining methods shown in fig. 1 to 3.

It should be noted that the family education device shown in fig. 6 may further include components, not shown, such as a power supply, an input key, a speaker, a microphone, a screen, an RF circuit, a Wi-Fi module, a bluetooth module, and a sensor, which are not described in detail in this embodiment.

The embodiment of the invention discloses a computer-readable storage medium which stores a computer program, wherein the computer program enables a computer to execute any one dictation answer acquisition method shown in figures 1-3.

Embodiments of the present invention also disclose a computer program product, wherein, when the computer program product is run on a computer, the computer is caused to execute part or all of the steps of the method as in the above method embodiments.

The embodiment of the present invention also discloses an application publishing platform, wherein the application publishing platform is used for publishing a computer program product, and when the computer program product runs on a computer, the computer is caused to execute part or all of the steps of the method in the above method embodiments.

It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Those skilled in the art should also appreciate that the embodiments described in this specification are exemplary and alternative embodiments, and that the acts and modules illustrated are not required in order to practice the invention.

In various embodiments of the present invention, it should be understood that the sequence numbers of the above-mentioned processes do not imply an inevitable order of execution, and the execution order of the processes should be determined by their functions and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated units, if implemented as software functional units and sold or used as a stand-alone product, may be stored in a computer-accessible memory. Based on such understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute part or all of the steps of the methods of the above embodiments of the present invention.

In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood, however, that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.

Those skilled in the art will appreciate that some or all of the steps in the methods of the above embodiments may be implemented by a program instructing associated hardware, and the program may be stored in a computer-readable storage medium, where the storage medium includes Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other memory, magnetic disk, magnetic tape, or any other medium which can be used to carry or store data and which can be read by a computer.

The dictation answer obtaining method, the family education device and the storage medium disclosed by the embodiments of the invention have been introduced in detail above. Specific examples are used herein to explain the principle and the implementation of the invention, and the description of the embodiments is only intended to help understand the method and the core idea of the invention. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as a limitation of the present invention.

Claims

1. A method for obtaining a dictation answer, comprising:

Capture the first page before the user writes; broadcast the pronunciation of the dictation content;

photographing to obtain a second page written by the user, where the second page is a page formed by the user writing on the first page according to the pronunciation of the dictation content;

identifying the first page to obtain first page information;

identifying the second page to obtain second page information;

comparing the second page information with the first page information to obtain a dictation answer written by the user.
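The comparison step of claim 1 can be illustrated by a minimal sketch. It assumes, purely for illustration, that each page's recognized information is a list of text items with a centre position; the names `TextItem`, `same_item`, `extract_dictation_answer` and the `tolerance` parameter are hypothetical and do not appear in the specification. Items present on the second page that have no counterpart on the first page are treated as the dictation answer.

```python
from typing import List, NamedTuple, Tuple


class TextItem(NamedTuple):
    """One recognized piece of text on a page (hypothetical structure)."""
    text: str
    center: Tuple[float, float]  # (x, y) centre of the text in page coordinates


def same_item(a: TextItem, b: TextItem, tolerance: float) -> bool:
    """Treat two items as the same if the text matches and the centres are close."""
    return (a.text == b.text
            and abs(a.center[0] - b.center[0]) <= tolerance
            and abs(a.center[1] - b.center[1]) <= tolerance)


def extract_dictation_answer(first_page: List[TextItem],
                             second_page: List[TextItem],
                             tolerance: float = 5.0) -> List[TextItem]:
    """Return the items that appear on the second page but not on the first."""
    unmatched_first = list(first_page)
    answer = []
    for item in second_page:
        match = next((old for old in unmatched_first
                      if same_item(old, item, tolerance)), None)
        if match is None:
            answer.append(item)            # newly written: part of the answer
        else:
            unmatched_first.remove(match)  # already on the page before writing
    return answer


# Example: "Unit 3" was printed on the page beforehand; "apple" is newly written.
first = [TextItem("Unit 3", (50.0, 20.0))]
second = [TextItem("Unit 3", (51.0, 19.0)), TextItem("apple", (40.0, 60.0))]
answer = extract_dictation_answer(first, second)
```

In this example only the "apple" item is returned, because the slightly shifted "Unit 3" item is matched to its counterpart on the first page within the assumed tolerance.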

2. The method according to claim 1, wherein after photographing to obtain the first page before the user writes, the method further comprises:

determining whether the first page is a blank page;

if the first page is not a blank page, extracting any feature area in the first page;

and, after broadcasting the pronunciation of the dictation content, the method further comprises:

capturing several frames of user images within a preset time period;

judging whether the several frames of user images all include the feature area;

if all of them include the feature area, determining whether the several frames of user images indicate that the user has finished writing according to the pronunciation of the dictation content;

if the writing is finished, performing the step of photographing to obtain the second page written by the user.
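The decision flow of claim 2 can be sketched as follows. The claim does not say how a frame is represented or how "finished writing" is detected, so the `Frame` structure, the region labels and the `pen_lifted_throughout` test below are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence


@dataclass(frozen=True)
class Frame:
    """One captured user image (hypothetical structure)."""
    visible_regions: FrozenSet[str]  # labels of the regions detected in the frame
    pen_touching_page: bool          # assumed signal that writing is in progress


def should_capture_second_page(
        frames: Sequence[Frame],
        feature_region: str,
        writing_finished: Callable[[Sequence[Frame]], bool]) -> bool:
    """Return True when the second page should be photographed."""
    if not frames:
        return False
    # Every frame must still show the feature area taken from the first page,
    # otherwise the user may have switched to a different page.
    if not all(feature_region in f.visible_regions for f in frames):
        return False
    # Only then decide whether the frames show that writing has finished.
    return writing_finished(frames)


def pen_lifted_throughout(frames: Sequence[Frame]) -> bool:
    """Assumed completion test: the pen never touches the page in any frame."""
    return not any(f.pen_touching_page for f in frames)


frames = [Frame(frozenset({"title"}), False),
          Frame(frozenset({"title", "hand"}), False)]
capture = should_capture_second_page(frames, "title", pen_lifted_throughout)
```

Passing the completion test in as a function keeps the sketch neutral about how completion is actually detected.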

3. The method according to claim 1 or 2, wherein before the identifying the first page to obtain the first page information, the method further comprises:

identifying a first area where the user's hand is located in the first page;

cutting out the first area from the first page to obtain a target first page;

wherein the identifying the first page to obtain the first page information comprises:

identifying the target first page to obtain the first page information;

before the identifying the second page to obtain the second page information, the method further comprises:

identifying a second area where the user's hand is located in the second page;

cutting out the second area from the second page to obtain a target second page;

wherein the identifying the second page to obtain the second page information comprises:

identifying the target second page to obtain the second page information.
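The cutting-out step of claim 3 can be sketched as follows, assuming that a page is a grayscale image stored as a list of pixel rows and that the hand area has already been located as a bounding box. The function name `cut_out_region`, the `(left, top, width, height)` box format and the white background value are hypothetical choices, not taken from the specification.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values, 0 (black) to 255 (white)


def cut_out_region(page: Image,
                   box: Tuple[int, int, int, int],
                   background: int = 255) -> Image:
    """Return a copy of the page with the given box replaced by background.

    box is (left, top, width, height) in pixels; any part of the box that
    falls outside the page is ignored.
    """
    left, top, width, height = box
    target = [row[:] for row in page]  # leave the original page untouched
    for y in range(max(top, 0), min(top + height, len(target))):
        row = target[y]
        for x in range(max(left, 0), min(left + width, len(row))):
            row[x] = background
    return target


# A 3x4 all-black page with the hand assumed to cover the top-right 2x2 pixels.
page = [[0, 0, 0, 0] for _ in range(3)]
target_page = cut_out_region(page, (2, 0, 2, 2))
```

After the hand area is blanked, recognition of the target page no longer sees pixels belonging to the hand.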

4. The method according to any one of claims 1 to 3, wherein the dictation content comprises several dictation words; the method further comprises:

in the process of broadcasting the pronunciation of the dictation content, recording the writing position at which the user writes on the first page according to the pronunciation of each of the dictation words in the dictation content;

and, after the comparing the second page information with the first page information to obtain the dictation answer written by the user, the method further comprises:

performing word segmentation processing on the dictation answer to obtain several answer words;

obtaining the image position of each of the answer words in the second page;

determining, according to the image position and the writing position, the dictation word corresponding to each of the answer words;

for each of the answer words, correcting the answer word against the standard answer of the dictation word corresponding to the answer word.
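The matching and correcting steps of claim 4 can be sketched as follows. The claim only states that the dictation word is determined "according to the image position and the writing position"; using the nearest recorded writing position is an assumption of this sketch, and the name `match_and_correct` and the data layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def match_and_correct(answer_words: List[Tuple[str, Point]],
                      writing_positions: Dict[str, Point],
                      standard_answers: Dict[str, str]
                      ) -> List[Tuple[str, str, bool]]:
    """Pair each answer word with a dictation word and check it.

    answer_words      -- (text, image position) for each segmented answer word
    writing_positions -- dictation word -> position recorded during broadcasting
    standard_answers  -- dictation word -> standard answer text
    Returns (answer text, dictation word, is_correct) for each answer word.
    """
    results = []
    for text, image_position in answer_words:
        # Assumed rule: pick the dictation word whose recorded writing
        # position is closest to where the answer word appears in the image.
        dictation_word = min(
            writing_positions,
            key=lambda word: math.dist(image_position, writing_positions[word]))
        is_correct = text == standard_answers[dictation_word]
        results.append((text, dictation_word, is_correct))
    return results


answer_words = [("aple", (12.0, 48.0)), ("banana", (11.0, 99.0))]
writing_positions = {"apple": (10.0, 50.0), "banana": (10.0, 100.0)}
standard_answers = {"apple": "apple", "banana": "banana"}
results = match_and_correct(answer_words, writing_positions, standard_answers)
```

Here "aple" is paired with the dictation word "apple" and marked wrong, while "banana" is marked correct.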

5. The method according to claim 4, wherein the method further comprises:

in the process of broadcasting the pronunciation of the dictation content, recording the moment at which the user starts writing on the first page according to the pronunciation of each of the dictation words in the dictation content;

and, after correcting each of the answer words against the standard answer of the dictation word corresponding to the answer word, the method further comprises:

sorting the corrected correct answer words from earliest to latest according to the moment at which writing started, to obtain a correct answer word sequence, and sorting the corrected wrong answer words in the same order to obtain a wrong answer word sequence; wherein the wrong answer words do not match their corresponding standard answers, and the correct answer words match their corresponding standard answers;

outputting a list containing the correct answer word sequence and/or the wrong answer word sequence.
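The sorting step of claim 5 can be sketched as follows, assuming that each corrected answer word carries the dictation word it was matched to and that the start-of-writing moments are stored as seconds since the broadcast began. The names `CorrectedWord` and `build_result_sequences` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class CorrectedWord(NamedTuple):
    """Result of correcting one answer word (hypothetical structure)."""
    answer_word: str
    dictation_word: str
    is_correct: bool


def build_result_sequences(corrected: List[CorrectedWord],
                           start_times: Dict[str, float]
                           ) -> Tuple[List[str], List[str]]:
    """Split corrected words into correct and wrong sequences.

    Both sequences are ordered from earliest to latest by the moment at
    which the user started writing the corresponding dictation word.
    """
    ordered = sorted(corrected, key=lambda w: start_times[w.dictation_word])
    correct_sequence = [w.answer_word for w in ordered if w.is_correct]
    wrong_sequence = [w.answer_word for w in ordered if not w.is_correct]
    return correct_sequence, wrong_sequence


corrected = [CorrectedWord("banana", "banana", True),
             CorrectedWord("aple", "apple", False),
             CorrectedWord("cherry", "cherry", True)]
start_times = {"apple": 2.0, "banana": 9.5, "cherry": 5.0}
correct_sequence, wrong_sequence = build_result_sequences(corrected, start_times)
```

With these inputs the correct sequence is "cherry" followed by "banana", and the wrong sequence contains only "aple".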

6. A tutoring device, comprising:

a photographing unit, used for photographing and obtaining the first page before the user writes;

a broadcasting unit, used for broadcasting the pronunciation of the dictation content after the photographing unit photographs and obtains the first page before the user writes;

the photographing unit being further configured to photograph and obtain a second page written by the user, where the second page is a page formed by the user writing on the first page according to the pronunciation of the dictation content;

an identification unit, configured to identify the first page to obtain first page information; and identify the second page to obtain second page information;

a comparison unit, configured to compare the second page information with the first page information to obtain a dictation answer written by the user.

7. The tutoring device according to claim 6, further comprising:

a judging unit, configured to judge whether the first page is a blank page after the photographing unit obtains the first page before the user writes;

an extraction unit, configured to extract any feature area in the first page when the judging unit judges that the first page is not a blank page;

the photographing unit being further configured to photograph and obtain several frames of user images within a preset time period after the broadcasting unit broadcasts the pronunciation of the dictation content;

the judging unit being further configured to judge whether the several frames of user images all include the feature area; when it is judged that the several frames of user images all include the feature area, to determine whether the several frames of user images indicate that the user has finished writing according to the pronunciation of the dictation content; and, when it is determined that the user has finished writing, to trigger the photographing unit to perform the operation of photographing to obtain the second page written by the user.

8. The tutoring device according to claim 6 or 7, further comprising:

a cutout unit, configured to identify, before the identification unit identifies the first page to obtain the first page information, a first area where the user's hand is located in the first page, and to cut out the first area from the first page to obtain a target first page; and to identify, before the identification unit identifies the second page to obtain the second page information, a second area where the user's hand is located in the second page, and to cut out the second area from the second page to obtain a target second page;

wherein the identification unit is specifically configured to identify the target first page to obtain the first page information, and to identify the target second page to obtain the second page information.

9. The tutoring device according to any one of claims 6 to 8, wherein the dictation content comprises several dictation words; the tutoring device further comprises:

a recording unit, configured to record, while the broadcasting unit broadcasts the pronunciation of the dictation content, the writing position at which the user writes on the first page according to the pronunciation of each of the dictation words in the dictation content;

a word segmentation unit, configured to perform word segmentation processing on the dictation answer to obtain several answer words, after the comparison unit compares the second page information with the first page information to obtain the dictation answer written by the user;

an obtaining unit, configured to obtain the image position of each of the answer words in the second page;

a determining unit, configured to determine the dictation word corresponding to each of the answer words according to the image position and the writing position;

a correcting unit, configured to correct each of the answer words against the standard answer of the dictation word corresponding to the answer word.

10. The tutoring device according to claim 9, wherein:

the recording unit is further configured to record, while the broadcasting unit broadcasts the pronunciation of the dictation content, the moment at which the user starts writing on the first page according to the pronunciation of each of the dictation words in the dictation content;

and the tutoring device further comprises:

a sorting unit, configured to, after the correcting unit corrects each of the answer words against the standard answer of the dictation word corresponding to the answer word, sort the corrected correct answer words from earliest to latest according to the moment at which writing started to obtain a correct answer word sequence, and sort the corrected wrong answer words in the same order to obtain a wrong answer word sequence; wherein the wrong answer words do not match their corresponding standard answers, and the correct answer words match their corresponding standard answers;

an output unit, configured to output a list containing the correct answer word sequence and/or the wrong answer word sequence.

11. A tutoring device, comprising:

a memory in which executable program code is stored;

a processor coupled to the memory;

wherein the processor invokes the executable program code stored in the memory to execute the method for obtaining a dictation answer according to any one of claims 1 to 5.

12. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program causes a computer to execute the method for obtaining a dictation answer according to any one of claims 1 to 5.