CN115620402A - Human-cargo interaction behavior identification method, system and related device - Google Patents
- Publication number
- CN115620402A, CN202211498078.1A
- Authority
- CN
- China
- Prior art keywords
- target
- target customer
- hand
- shelf
- customer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/62—Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Psychiatry (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
Abstract
One or more embodiments of the present specification disclose a human-cargo interaction behavior identification method, system and related device. The method comprises: determining, from image information, whether a touch behavior occurs on a target shelf, and selecting an identification scheme according to the number of target customers exhibiting the touch behavior. If only one target customer touches the shelf, the weighing information of the target shelf is used to identify the interaction behavior between that customer and the target shelf; if a plurality of target customers touch the shelf, a goods holding detection model makes predictions on images taken before and after each customer's touch, and the interaction behavior of each target customer with the target shelf is identified from the prediction results. Combining the touch behavior determined from the image information with either the weighing information or the goods holding detection model in this way allows the interaction behavior to be identified accurately, improving both identification accuracy and identification efficiency.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, a system, and a related device for identifying human-cargo interaction behavior.
Background
Digital stores that integrate Internet applications, Internet of Things technology for physical stores, artificial intelligence and automation technology have come into operation. Theft is common in shopping places such as shopping malls and supermarkets. The existing solution is to supervise and deter theft by means of monitoring. However, because customers and commodities are numerous and vision-based schemes have inherent limitations, such as occlusion by human bodies, commodity complexity and background complexity, the interaction between customers and shelves cannot be identified accurately (for example, whether a customer takes a commodity or puts it back), which in turn increases the workload of investigating suspects.
Disclosure of Invention
One or more embodiments of the present disclosure are directed to a method, a system and a related device for identifying human-cargo interaction behavior, so as to accurately identify interaction behavior between a target customer and a target shelf.
To solve the above technical problem, one or more embodiments of the present specification are implemented as follows:
in a first aspect, a human-cargo interaction behavior identification method is provided, including:
receiving image information collected based on a shooting place where a target shelf is located;
detecting, based on the image information, whether there is a touch behavior of a target customer touching the target shelf;
if the touch behavior of one target customer is detected, querying weighing information of the target shelf from the starting time to the ending time of the touch behavior, and identifying the interaction behavior of the target customer and the target shelf based on the weighing information;
if the touch behaviors of a plurality of target customers are detected, predicting the handheld states of the target customers at the starting time of the touch behaviors and the handheld states of the target customers at the ending time of the touch behaviors according to a goods holding detection model, and identifying the interaction behaviors of each target customer and the target shelf based on the prediction results;
the goods holding detection model is obtained by training based on historical hand images before and after historical customers respectively interact with the plurality of shelves.
In a second aspect, a human-cargo interaction behavior recognition apparatus is provided, including:
the receiving module is used for receiving image information collected based on a shooting place where the target shelf is located;
the detection module is used for detecting, based on the image information, whether there is a touch behavior of a target customer touching the target shelf;
the identification module is used for querying weighing information of the target shelf from the starting time to the ending time of the touch behavior if the detection module detects the touch behavior of one target customer, and identifying the interaction behavior of the target customer and the target shelf based on the weighing information;
the identification module is further used for predicting, according to the goods holding detection model, the handheld states of the target customers at the starting time of the touch behaviors and at the ending time of the touch behaviors if the detection module detects the touch behaviors of a plurality of target customers, and identifying the interaction behavior of each target customer and the target shelf based on the prediction results;
the goods holding detection model is obtained by training historical hand images before and after historical customers interact with the plurality of shelves respectively.
In a third aspect, a human-cargo interaction behavior recognition system is provided, including: at least one shelf, each shelf being provided with a weighing device for weighing the shelf; at least one upper computer for receiving the weighing information sent by the weighing devices; at least one camera for collecting image information of the shelf from a top view angle or a side view angle; and a central control server connected to the at least one upper computer and the at least one camera respectively, the central control server being used for receiving the weighing information uploaded by the at least one upper computer and the image information acquired by the at least one camera, and for executing the human-cargo interaction behavior identification method of the first aspect.
In a fourth aspect, an electronic device is provided, including:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform the method of human-cargo interaction behavior recognition as described in the first aspect.
In a fifth aspect, a computer-readable storage medium is provided, which stores one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform the human-cargo interaction behavior recognition method of the first aspect.
In the technical solution provided by one or more embodiments of this specification, a human-cargo interaction behavior recognition system is formed from low-cost shelves, weighing devices, an upper computer, cameras and a central control server. The weighing device collects weighing information of the target shelf, which is transmitted to the central control server through the upper computer, and the camera collects image information containing the target customer. The central control server first determines from the image information whether a touch behavior occurs on the target shelf, and then selects an identification scheme according to the number of target customers exhibiting the touch behavior: if only one target customer touches the shelf, the weighing information of the target shelf is used to identify the interaction behavior between that customer and the target shelf; if a plurality of target customers touch the shelf, the goods holding detection model makes predictions on images taken before and after each customer's touch, and the interaction behavior of each target customer with the target shelf is identified from the prediction results. Combining the touch behavior determined from the image information with either the weighing information or the goods holding detection model in this way allows the interaction behavior to be identified accurately, improving both identification accuracy and identification efficiency.
Drawings
In order to illustrate one or more embodiments of the present specification or prior art solutions more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments described in the specification, and a person skilled in the art may derive other drawings from them without inventive effort.
Fig. 1 is a schematic structural diagram of a human-cargo interaction behavior recognition system provided in an embodiment of the present specification.
Figs. 2a-2b are schematic views of installation positions of a camera and a shelf provided in the embodiment of the present disclosure.
Fig. 3 is a schematic step diagram of a human-cargo interaction behavior identification method provided in an embodiment of the present specification.
Fig. 4a is a schematic diagram of correcting a keypoint trajectory based on a hand trajectory according to an embodiment of the present specification.
Fig. 4b is a schematic diagram of correcting a hand trajectory based on a keypoint trajectory according to an embodiment of the present disclosure.
Figs. 5a-5f are schematic diagrams of weight change curves of a target shelf from a start time to an end time of a touch behavior according to an embodiment of the present disclosure.
Fig. 6 is a schematic flow chart of a human-cargo interaction behavior identification method provided in an embodiment of the present specification.
Fig. 7 is a schematic structural diagram of a human-cargo interaction behavior recognition apparatus provided in an embodiment of the present specification.
Fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present specification.
Detailed Description
In order to make the technical solutions in the present specification better understood, the technical solutions in one or more embodiments of the present specification will be clearly and completely described below with reference to the accompanying drawings in one or more embodiments of the present specification, and it is obvious that the one or more embodiments described are only a part of the embodiments of the present specification, and not all embodiments. All other embodiments obtained by a person skilled in the art on the basis of one or more embodiments in the present specification without making any inventive step shall fall within the scope of protection of this document.
Identifying whether a customer takes goods by a visual algorithm alone is not accurate. Meanwhile, the intelligent shelves deployed in unmanned supermarkets require a weighing pressure sensor for each item of goods, so their purchase and operating costs are high and they cannot be used in more interaction scenarios, especially in large shopping malls, supermarkets and similar places.
Therefore, in the embodiment of the present specification, a human-cargo interaction behavior recognition system is formed from low-cost shelves, weighing devices, an upper computer, cameras and a central control server. The weighing device collects weighing information of the target shelf, which is transmitted to the central control server through the upper computer, and the camera collects image information containing the target customers. Specifically, the central control server determines from the image information whether a touch behavior occurs on the target shelf, and selects an identification scheme according to the number of target customers exhibiting the touch behavior: if only one target customer touches the shelf, the weighing information of the target shelf is used to identify the interaction behavior between that customer and the target shelf; if a plurality of target customers touch the shelf, the goods holding detection model makes predictions on images taken before and after each customer's touch, and the interaction behavior of each target customer with the target shelf is identified from the prediction results. Combining the touch behavior determined from the image information with either the weighing information or the goods holding detection model in this way allows the interaction behavior to be identified accurately, improving both identification accuracy and identification efficiency.
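The choice between the two identification schemes can be sketched as a small dispatch function. This is only an illustrative sketch: the function name `recognize_interactions` and the two callables passed to it are hypothetical placeholders for the weighing-based and model-based recognizers, which the specification describes separately.

```python
from typing import Callable, Dict, List


def recognize_interactions(
    touching_customers: List[str],
    recognize_by_weight: Callable[[str], str],
    recognize_by_hand_model: Callable[[str], str],
) -> Dict[str, str]:
    """Pick the identification scheme from the number of touching customers.

    Both recognizer callables are hypothetical stand-ins; each takes a
    customer identifier and returns a label for the interaction behavior.
    """
    if not touching_customers:
        # No touch behavior detected: nothing to process.
        return {}
    if len(touching_customers) == 1:
        # Exactly one customer: the shelf's weight change is unambiguous.
        customer = touching_customers[0]
        return {customer: recognize_by_weight(customer)}
    # Several customers: weight alone cannot tell who took what, so each
    # customer is judged from hand images before and after the touch.
    return {c: recognize_by_hand_model(c) for c in touching_customers}
```

With stub recognizers, a single customer is routed to the weighing branch and two or more customers are each routed to the goods holding detection branch.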
It should be understood that the human-cargo interaction behavior recognition scheme in this specification can be applied to shopping places provided with shelves, such as ordinary supermarkets, unmanned supermarkets and shopping malls. It can also be applied to public places that provide rental or free-use services, such as libraries, bookstores and service stations (umbrella rental, raincoat rental, free borrowing, and the like). Visual image technology thus supports the digitization of offline retail such as supermarkets, stores and department stores: the interaction behavior between customers and shelves can be identified accurately in these places, so that suspected customers can be checked accurately and theft and loss can be prevented.
Referring to Fig. 1, a schematic structural diagram of a human-cargo interaction behavior recognition system provided in an embodiment of the present specification is shown. The human-cargo interaction behavior recognition system may comprise: at least one shelf 102, each shelf 102 being fitted with a weighing device (constituted by pressure sensors 1042 and a signal processing circuit 1044) for weighing the shelf; at least one upper computer 106 for receiving the weighing information sent by the weighing devices; at least one camera 108 (two are shown in the figure) for acquiring image information of the shelves 102 from a top or side view; and a central control server 110 connected to the at least one upper computer 106 and the at least one camera 108 respectively. The central control server 110 is configured to receive the weighing information uploaded by the at least one upper computer 106 and the image information acquired by the at least one camera 108, and to execute the human-cargo interaction behavior identification method of this specification, which is described in detail below.
The shelves 102 may be placed on level ground and reinforced with reinforcing members, so that a shelf 102 does not rock when a customer touches it or takes goods from it, or when pedestrians walk past. The weighing device mounted on each shelf 102 may consist of a plurality of pressure sensors 1042 and a signal processing circuit 1044 connected to them. Specifically, one pressure sensor 1042 is installed at each of the four corners of the bottom of each shelf 102; that is, the four pressure sensors 1042 sit between the shelf 102 and the ground, and the pressure generated by the entire weight of the shelf 102 is applied to them. The four pressure sensors 1042 are connected to the signal processing circuit 1044, which converts the analog signals read from the pressure sensors 1042 into digital signals and thereby determines the weighing information of the shelf 102.
Unlike an intelligent shelf in an unmanned supermarket, the shelf 102 is an ordinary shelf with pressure sensors 1042 installed only at its bottom. The pressure sensor 1042 may be a half-bridge pressure sensor or another type of pressure sensor; this specification does not limit the type, as long as the sensor can acquire a pressure value of the shelf 102 and transmit it to the signal processing circuit 1044 to obtain the weighing information of the shelf 102.
The upper computer 106 may be a low-cost computer (such as a single-chip microcomputer) that can connect to the signal processing circuit 1044 (for example through a general-purpose input/output (GPIO) interface, a transistor-transistor logic (TTL) interface or a serial port) and to the central control server 110 (for example through Wi-Fi, Bluetooth or a wired network). One upper computer 106 may process the weighing information from a plurality of shelves 102 and transmit it to the central control server 110 as needed.
The upper computer 106 may provide an HTTP service: when the central control server 110 queries the weighing information of a shelf 102, it supplies the serial number of that shelf, and the upper computer 106 returns the weighing information of the corresponding shelf 102 through the HTTP service. The weighing information need not be output over HTTP; other communication modes such as telnet or SSH may also be used. Taking HTTP as an example embodiment, the work flow of the upper computer 106 is as follows: the upper computer 106 waits for a query request from the central control server 110; the query request specifies which shelf's weight is to be queried; on receiving it, the upper computer 106 reads the weighing information from the signal processing circuit 1044 connected to the pressure sensors 1042 and returns it to the central control server 110.
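The query work flow of the upper computer can be illustrated with Python's standard `http.server` module. This is a minimal sketch under stated assumptions: the URL shape `/weight?shelf=<serial number>`, the JSON reply format, and the `FAKE_READINGS_KG` table standing in for the signal processing circuit are all hypothetical and are not taken from the specification.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical stand-in for values read from the signal processing circuit.
FAKE_READINGS_KG = {"1": 52.3, "2": 47.8}


def handle_query(path: str):
    """Map a request path to (HTTP status, JSON-serializable payload)."""
    url = urlparse(path)
    shelf_id = parse_qs(url.query).get("shelf", [None])[0]
    if url.path != "/weight" or shelf_id is None:
        return 400, {"error": "expected /weight?shelf=<serial number>"}
    if shelf_id not in FAKE_READINGS_KG:
        return 404, {"error": "unknown shelf"}
    return 200, {"shelf": shelf_id, "weight_kg": FAKE_READINGS_KG[shelf_id]}


class WeightHandler(BaseHTTPRequestHandler):
    """Waits for queries from the central control server and answers them."""

    def do_GET(self):
        status, payload = handle_query(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the upper computer's query service until interrupted."""
    HTTPServer(("0.0.0.0", port), WeightHandler).serve_forever()
```

Keeping the request logic in `handle_query` separates it from the network layer, so the same lookup could sit behind another transport if HTTP is not used.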
The camera 108 is connected to the central control server 110 through a network. The camera 108 may be installed directly above the shelf 102: as shown in Fig. 2a, in this top-view installation the camera 108 points vertically downward and the central axis of its lens is perpendicular to the plane of the shelf 102. Alternatively, the camera 108 may be installed at the side of the shelf 102: as shown in Fig. 2b, in this side-view installation the camera 108 points obliquely downward from the side, at the middle of the aisle of the shelf 102. Two cameras 108 may be installed, and further cameras 108 may be added at other positions to acquire image information from all angles. The camera 108 may be an ordinary camera, an infrared camera, or a camera with other capturing and processing functions. It should be understood that the cameras 108 provided for each shelf 102 may be assigned corresponding numbers, which makes it easier to identify the captured image data or image information and to distinguish image information belonging to different shelves 102.
The central control server 110 obtains the weighing information of the shelves 102 from the upper computer 106 and the image information from the cameras 108, and evaluates them together to determine whether a target customer touches the target shelf and, when a touch occurs, whether goods are taken from the target shelf and which customer takes them.
Referring to Fig. 3, which is a schematic step diagram of a human-cargo interaction behavior identification method provided in an embodiment of the present specification, the method may include the following steps:
Step 302: receiving image information collected at the shooting place where the target shelf is located.
Specifically, image information acquired by a camera provided at a shooting site where the target shelf is located may be periodically acquired, and the image information may be composed of a plurality of image data, and the image data may be a human body image including one target customer or a plurality of target customers.
Step 304: detecting, based on the image information, whether there is a touch behavior of a target customer touching the target shelf; if no touch behavior is detected, no processing is performed; otherwise, step 306 or step 308 below is performed.
Optionally, in step 304, when detecting whether there is a touch behavior of the target customer touching the target shelf based on the image information, the key part of the target customer included in the image information may be tracked and located based on a hand detection algorithm, a hand tracking algorithm, a key point detection algorithm, and a key point tracking algorithm; and if the distance between the key part of the target customer and the target shelf is smaller than a first threshold value, determining that the target customer touches the target shelf.
The key point detection algorithm and the key point tracking algorithm take a human body image as input, output the positions of key parts of the human body such as the hands, shoulders, feet and head (each key point position being determined by the central point of the key part), and connect the key parts that belong to the same person. However, these algorithms are easily affected by the surrounding environment (for example, the arm is occluded, or the appearance of the arm is similar to the background), which makes the time of the touch difficult to judge. The hand detection algorithm takes a human body image (video) as input and detects the position of a hand in it. However, the image may contain the hands of other people, so a hand may be associated with the wrong human body. Therefore, in the embodiments of the present disclosure, the key part of the target customer included in the image information is tracked and located by combining the hand detection algorithm and the hand tracking algorithm with the key point detection algorithm and the key point tracking algorithm. If the distance between the key part of the target customer and the target shelf in some image frame of the image information is smaller than a first threshold, it is determined that the target customer touches the target shelf. The first threshold may be determined from repeated touch tests, for example 2 cm, so that a distance in the range [0, 2) cm counts as a touch. Conversely, if the distance between the key part of the target customer and the target shelf is greater than or equal to the first threshold, it may be determined that the target customer does not touch the target shelf. It should be understood that this value is merely exemplary, and the specific value should be adjusted flexibly according to the touch conditions set in different applicable places.
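The threshold test can be sketched as follows. This is a hedged illustration only: it assumes the hand-to-shelf distance has already been estimated per frame in centimetres, and it assumes the starting and ending times of the touch behavior are the first and last frames of a contiguous run below the first threshold. The name `touch_interval` and the constant `FIRST_THRESHOLD_CM` are hypothetical.

```python
from typing import List, Optional, Tuple

# Example value from the text; real deployments would tune it.
FIRST_THRESHOLD_CM = 2.0


def touch_interval(
    frames: List[Tuple[float, float]],
    threshold_cm: float = FIRST_THRESHOLD_CM,
) -> Optional[Tuple[float, float]]:
    """Return (start_time, end_time) of the first touch, or None.

    `frames` holds (timestamp in seconds, hand-to-shelf distance in cm)
    for consecutive image frames. A frame counts as a touch when its
    distance is strictly smaller than the threshold.
    """
    start = end = None
    for timestamp, distance_cm in frames:
        if distance_cm < threshold_cm:
            if start is None:
                start = timestamp  # first frame of the touch
            end = timestamp        # latest frame still touching
        elif start is not None:
            break                  # the hand has left the shelf
    return None if start is None else (start, end)
```

The returned interval is what a later step would use to query the weighing information or to select the before and after hand images.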
Further, when the key parts of the target customer included in the image information are tracked and positioned based on the hand detection algorithm and the hand tracking algorithm and the key point detection algorithm and the key point tracking algorithm, the hand track and the key point track can be respectively determined, and then the hand of the target customer is tracked and positioned based on the hand track and the key point track. Specifically, the method comprises the following steps: inputting each image frame of the image information into a key point detection model to obtain a key part set of each target customer, wherein the key part set is associated with identification information and a hand of the target customer; gathering key parts which belong to the same target customer and are obtained from each image frame in the image information into tracks to obtain a key point track of each target customer; inputting each image frame of the image information into a hand detection model to obtain a positioning frame of each hand; tracking and positioning based on a positioning frame obtained from each image frame in the image information to obtain a hand track of each hand; and tracking and positioning the hand of the target customer based on the key point track and the hand track. The identification information of the target customer concerned may be face information.
The key point detection model may be obtained by taking historical human body images as training samples, labeling the positions of the key parts in each historical human body image (for example, 18 points such as the left hand, right hand, left foot, right foot, left shoulder and right shoulder), and feeding them into a preset model for repeated training. The hand detection model may be obtained by taking historical human body images as training samples, labeling the position of the hand in each historical human body image, and feeding them into a preset model for repeated training. In this way, each image frame can be input into the key point detection model and the hand detection model respectively to obtain a key part set and hand positioning frames; accordingly, the key point track and the hand track of the corresponding target customer can be obtained from the video images in the image information.
Further, when the hand of the target customer is tracked and located based on the key point track and the hand track, if it is detected that, in N consecutive image frames, the distance between the key point corresponding to the hand in the key point track of the target customer and the center of the positioning frame in the hand track is not greater than a second threshold, it is determined that the key point track of the target customer is bound to that hand track. Then, after key points other than the hand in the key point track of the target customer are lost, the key point track before the loss and the hand track after the loss are spliced to track and locate the hand of the target customer. Here N is a positive integer greater than or equal to 2, and the second threshold is half of the average length of the long edges of the positioning frames in the hand track. FIG. 4a is a diagram of correcting a key point track based on a hand track. The rectangles in the upper part of FIG. 4a are the positioning frames of the hand, and these positioning frames are connected by arrows to form the hand track; each broken line in the lower part of FIG. 4a can be regarded as an arm sketch formed by connecting several key points of an arm, and these arm sketches form the key point track. By default the key point track is associated with the corresponding target customer, but when a key point drifts, or an arm is occluded so that key points are lost, the hand track can replace the key point track during the period of loss. Suppose FIG. 4a shows a key point track and a hand track made up of 5 frames: in frames 2 to 4 some key points disappear, so the hand track can be used in their place during this period. In this way the hand track is used to correct the key point track, so that the hand of the target customer is tracked and located accurately.
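The binding test and the splicing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `(x1, y1, x2, y2)` box layout, and the use of `None` to mark a lost hand key point are all assumptions made for the example.

```python
import math
from statistics import mean

def second_threshold(hand_boxes):
    """Half of the mean long-edge length of the hand positioning frames.

    Boxes are assumed to be (x1, y1, x2, y2) tuples.
    """
    long_edges = [max(x2 - x1, y2 - y1) for x1, y1, x2, y2 in hand_boxes]
    return mean(long_edges) / 2

def box_center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def is_bound(hand_keypoints, hand_boxes, n):
    """True if, for n consecutive frames, the hand key point stays within
    the second threshold of the positioning-frame center."""
    threshold = second_threshold(hand_boxes)
    run = 0
    for (px, py), box in zip(hand_keypoints, hand_boxes):
        cx, cy = box_center(box)
        run = run + 1 if math.hypot(px - cx, py - cy) <= threshold else 0
        if run >= n:
            return True
    return False

def splice(hand_keypoints, hand_boxes):
    """Use the key point where available; fall back to the positioning-frame
    center in frames where the key point is lost (None)."""
    return [p if p is not None else box_center(b)
            for p, b in zip(hand_keypoints, hand_boxes)]
```

With boxes of size 10 x 20 the threshold is 10, so a key point sitting at the box center for two consecutive frames satisfies the binding condition for N = 2.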
After the key point track before the loss and the hand track after the loss have been spliced to track and locate the hand of the target customer, if it is detected that, in M consecutive image frames, the distance between the key point corresponding to the hand in the key point track of the target customer and the center of the positioning frame in the hand track is greater than the second threshold, tracking reverts to the key point track of the target customer, or the key point track of the target customer is bound to another hand track, where M is a positive integer greater than or equal to 2 and M is greater than N. In other words, a distance greater than the second threshold indicates that the bound key point track and hand track may not belong to the same target customer; the key point track of the target customer can then be unbound and tracked on its own, or bound to another hand track for which the distance is not greater than the second threshold.
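Because M is greater than N, binding is quick and unbinding is slow, which behaves like hysteresis. A minimal sketch of that state logic is given below; the class name `TrackBinding` and its interface are hypothetical, and only the N/M consecutive-frame rule is taken from the description.

```python
class TrackBinding:
    """Tracks whether a key point track is bound to a hand track.

    Binds after n consecutive near frames and unbinds after m consecutive
    far frames, with m > n >= 2 as stated in the description.
    """

    def __init__(self, threshold, n, m):
        if not (m > n >= 2):
            raise ValueError("require m > n >= 2")
        self.threshold = threshold
        self.n = n
        self.m = m
        self.bound = False
        self._near = 0  # consecutive frames with distance <= threshold
        self._far = 0   # consecutive frames with distance > threshold

    def update(self, distance):
        """Feed one frame's key-point-to-box-center distance; return the state."""
        if distance <= self.threshold:
            self._near += 1
            self._far = 0
        else:
            self._far += 1
            self._near = 0
        if not self.bound and self._near >= self.n:
            self.bound = True
        elif self.bound and self._far >= self.m:
            self.bound = False
        return self.bound
```

A short excursion beyond the threshold (fewer than M frames) leaves the binding in place, so a single drifting frame does not break the association.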
When the hand of the target customer is tracked and located based on the key point track and the hand track, if a break in the hand track is detected, the position of the key point corresponding to the hand is estimated based on the key point track of the target customer associated with the hand of that hand track, and the broken hand track segments are spliced based on the estimation result to track and locate the hand of the target customer. Referring to FIG. 4b, occlusion and similar causes, in particular the hand becoming invisible once it reaches into the shelf, may break the hand track of a single hand; in FIG. 4b the break occurs at frame 3. The key point detection algorithm can estimate the position of the hand from the direction of the arm and use it in place of the hand detection result, so the key points in the key point track can be used to estimate the hand position and connect the two broken hand track segments. The hand track is thus corrected by the key point track, so that the hand of the target customer is tracked and located accurately based on a correct and complete hand track.
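One way to picture "estimating the hand from the direction of the arm" is to extend the elbow-to-wrist vector slightly beyond the wrist. The sketch below does this and fills the broken frames of a hand track with the estimate. The extrapolation rule, the `ratio` parameter, and the function names are assumptions for illustration; the description does not specify how the estimate is computed.

```python
def estimate_hand(elbow, wrist, ratio=0.25):
    """Extrapolate the hand position along the forearm direction.

    ratio is a hypothetical fraction of the forearm length by which the
    hand center lies beyond the wrist.
    """
    ex, ey = elbow
    wx, wy = wrist
    return (wx + ratio * (wx - ex), wy + ratio * (wy - ey))

def bridge_hand_track(hand_centers, elbows, wrists, ratio=0.25):
    """Fill frames where the hand detection is missing (None) with the
    key-point-based estimate, joining the broken track segments."""
    bridged = []
    for center, elbow, wrist in zip(hand_centers, elbows, wrists):
        if center is None:
            center = estimate_hand(elbow, wrist, ratio)
        bridged.append(center)
    return bridged
```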
Step 306: if a touch behavior by one target customer is detected, query the weighing information of the target shelf from the start time to the end time of the touch behavior, and identify the interaction behavior of the target customer with the target shelf based on the weighing information.
FIG. 5a to FIG. 5f respectively show curves of the weight value of the target shelf from the start time to the end time of the touch behavior. The interaction behavior of the target customer with the target shelf can be identified from these curves: if the first weight value at the start time is greater than the second weight value at the end time, and the weight value falls within the time period, it is determined that the target customer took goods from the target shelf, as in FIG. 5a; if the first weight value is equal to the second weight value and the weight value does not change within the time period, it is determined that the target customer only touched the target shelf, as in FIG. 5b; if the first weight value is less than the second weight value and the weight value rises within the time period, it is determined that the target customer placed goods on the target shelf, as in FIG. 5c; if the first weight value is greater than the second weight value and the weight value first rises and then falls within the time period, it is determined that the target customer replaced goods with goods of greater mass from the target shelf, as in FIG. 5d; if the first weight value is equal to the second weight value and the weight value first rises and then falls within the time period, it is determined that the target customer replaced goods with goods of equal mass from the target shelf, as in FIG. 5e; if the first weight value is less than the second weight value and the weight value first rises and then falls within the time period, it is determined that the target customer replaced goods with goods of smaller mass from the target shelf, as in FIG. 5f.
Therefore, once a touch behavior between the target customer and the target shelf has been determined, the weight change curve in the weighing information of the target shelf is used to identify whether the target customer took goods from the target shelf, exchanged goods, placed goods, or only touched the shelf. With the specific interaction behavior accurately identified, it is then straightforward to determine whether the target customer is holding goods, which improves the efficiency and speed of checkout verification.
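The six cases of FIG. 5a to FIG. 5f can be sketched as a small classifier over the weight samples recorded between the start and end of the touch. The tolerance `eps`, the label strings, and the peak test used to detect "rises and then falls" are assumptions for the example; a real weighing device would also need noise filtering, which the description does not detail.

```python
def classify_by_weight(weights, eps=1.0):
    """Classify a touch from shelf weight samples (start to end).

    eps is a hypothetical tolerance, in the same unit as the samples,
    below which two weight values are treated as equal.
    """
    first, last = weights[0], weights[-1]
    peak = max(weights)
    # "Rises and then falls": the curve peaks above both endpoints.
    if peak > max(first, last) + eps:
        if first - last > eps:
            return "exchange_greater_mass"   # FIG. 5d
        if last - first > eps:
            return "exchange_smaller_mass"   # FIG. 5f
        return "exchange_equal_mass"         # FIG. 5e
    if first - last > eps:
        return "pick"                        # FIG. 5a
    if last - first > eps:
        return "put"                         # FIG. 5c
    return "touch_only"                      # FIG. 5b
```

For example, `classify_by_weight([100, 130, 90])` falls under the FIG. 5d case: the shelf first gains 30 units and ends 10 units lighter than it started.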
Step 308: if touch behaviors by a plurality of target customers are detected, predict, according to a goods-holding detection model, the hand-held states of the plurality of target customers at the start time and at the end time of the touch behavior, and identify the interaction behavior of each target customer with the target shelf based on the prediction results; the goods-holding detection model is obtained by training on historical hand images captured before and after historical customers interacted with a plurality of shelves.
When the target shelf is touched by a plurality of target customers, hands may cross, several customers may take goods at the same time, or one may take goods while another places goods, so the interaction behaviors cannot be identified accurately from the weighing information alone. For this reason, the hand-held states of the plurality of target customers at the start time and at the end time of the touch behavior can be predicted according to the goods-holding detection model. Specifically, a first image and a second image of each of the plurality of target customers are acquired, and the first image and the second image are respectively input into the goods-holding detection model to obtain a prediction result for each target customer, where the first image is a hand image captured for the target customer at the start time of the touch behavior, and the second image is a hand image captured for the target customer at the end time of the touch behavior.
Furthermore, after the hand-held state of each target customer before and after the touch has been determined through the goods-holding detection model, the following applies to any target customer: from the prediction results of the target customer, determine a first prediction result predicted from the first image and a second prediction result predicted from the second image; if the first prediction result is that the target customer does not hold goods and the second prediction result is that the target customer holds goods, determine that the target customer took goods from the target shelf; if the first prediction result is that the target customer holds goods and the second prediction result is that the target customer does not hold goods, determine that the target customer placed goods on the target shelf; if both the first and the second prediction results are that the target customer does not hold goods, determine that the target customer only touched the target shelf; and if both the first and the second prediction results are that the target customer holds goods, determine that the target customer exchanged goods on the target shelf.
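The four combinations of before and after hand-held states map directly onto a lookup table. In the sketch below, the booleans stand in for the outputs of the goods-holding detection model, and the function name and label strings are illustrative assumptions.

```python
# (holding before touch, holding after touch) -> interaction behavior
_HOLDING_TABLE = {
    (False, True): "pick",
    (True, False): "put",
    (False, False): "touch_only",
    (True, True): "exchange",
}

def classify_by_holding(held_before: bool, held_after: bool) -> str:
    """Map the first and second prediction results to an interaction."""
    return _HOLDING_TABLE[(held_before, held_after)]
```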
Referring to fig. 6, a schematic flow chart of human-cargo interaction behavior recognition provided in the embodiments of the present specification is shown.
Step 602: receive image information collected at the shooting location where the target shelf is located.
Step 604: track and locate the key parts of the target customers contained in the image information based on a hand detection algorithm, a hand tracking algorithm, a key point detection algorithm and a key point tracking algorithm.
Step 606: if the distance between a key part of a target customer and the target shelf is smaller than a first threshold, determine that the target customer touches the target shelf.
Step 608: if a touch behavior by one target customer is detected, identify the interaction behavior of the target customer with the target shelf based on the weighing information.
Step 610: if touch behaviors by a plurality of target customers are detected, predict the hand-held states of the plurality of target customers at the start time and at the end time of the touch behavior according to the goods-holding detection model.
Step 612: identify the interaction behavior of each target customer with the target shelf based on the prediction results.
For the specific implementation of steps 602 to 612 and the technical effects achieved, refer to steps 302 to 308.
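The touch test of step 606 can be sketched as a point-to-rectangle distance compared against the first threshold. The rectangular shelf region, the image-coordinate layout, and the rule that the touch lasts for the first run of frames below the threshold are assumptions for this example; the description only states the distance comparison.

```python
import math

def distance_to_shelf(point, shelf_box):
    """Distance from a point to an axis-aligned shelf region
    (x1, y1, x2, y2); 0 when the point lies inside the region."""
    px, py = point
    x1, y1, x2, y2 = shelf_box
    dx = max(x1 - px, 0, px - x2)
    dy = max(y1 - py, 0, py - y2)
    return math.hypot(dx, dy)

def touch_interval(hand_track, shelf_box, first_threshold):
    """Return (start_frame, end_frame) of the first run of frames in which
    the hand is closer to the shelf than first_threshold, or None."""
    start = None
    for i, point in enumerate(hand_track):
        near = distance_to_shelf(point, shelf_box) < first_threshold
        if near and start is None:
            start = i
        elif not near and start is not None:
            return (start, i - 1)
    if start is not None:
        return (start, len(hand_track) - 1)
    return None
```

The returned frame indices would then be converted to timestamps to query the weighing information for the same interval.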
Through the above technical solution, a human-cargo interaction behavior recognition system is built from low-cost shelves, weighing devices, a host computer, cameras and a central control server. The weighing information of the target shelf collected by the weighing device and the image information containing the target customer collected by the camera are transmitted through the host computer to the central control server for processing. The server first determines from the image information whether a touch behavior on the target shelf has occurred, and then selects an identification scheme according to the number of target customers involved in the touch behavior: if only one target customer touches the shelf, the weighing information of the target shelf can be used to identify the interaction behavior of that target customer with the target shelf; if a plurality of target customers touch the shelf, the goods-holding detection model can be applied to the images of each target customer before and after the touch, and the interaction behavior of each target customer with the target shelf is identified from the prediction results. In this way, starting from the touch behavior determined from the image information, the interaction behavior of the target customer with the target shelf is identified by combining either the weighing information or the goods-holding detection model, which improves identification accuracy and efficiency.
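The choice between the two identification schemes depends only on how many target customers are touching the shelf. A minimal dispatch sketch is shown below; the function signature, the customer identifiers, and the injected `by_weight` and `by_holding` callables are hypothetical and stand in for the weighing-based and model-based classifiers.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def identify_interactions(
    customers: List[str],
    weights: Sequence[float],
    holding: Dict[str, Tuple[bool, bool]],
    by_weight: Callable[[Sequence[float]], str],
    by_holding: Callable[[bool, bool], str],
) -> Dict[str, str]:
    """Pick the identification scheme from the number of touching customers.

    One customer: classify from the shelf weight curve.
    Several customers: classify each from the before/after hand-held states.
    """
    if not customers:
        return {}
    if len(customers) == 1:
        return {customers[0]: by_weight(weights)}
    return {c: by_holding(*holding[c]) for c in customers}
```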
Referring to FIG. 7, a human-cargo interaction behavior recognition apparatus provided in an embodiment of the present specification may include:
a receiving module 702, configured to receive image information acquired based on a shooting location where a target shelf is located, where the image information includes image data of at least one target customer;
a detection module 704, configured to detect whether there is a touch behavior of a target customer touching the target shelf based on the image information;
an identification module 706, configured to, if the detection module 704 detects a touch behavior by one target customer, query weighing information of the target shelf from the start time to the end time of the touch behavior, and identify the interaction behavior of the target customer with the target shelf based on the weighing information;
the identification module 706 being further configured to, if the detection module 704 detects touch behaviors by a plurality of target customers, predict, according to a goods-holding detection model, the hand-held states of the plurality of target customers at the start time and at the end time of the touch behavior, and identify the interaction behavior of each target customer with the target shelf based on the prediction results;
the goods holding detection model is obtained by training historical hand images before and after historical customers interact with the plurality of shelves respectively.
Optionally, as an embodiment, when detecting whether there is a touch behavior of a target customer touching the target shelf based on the image information, the detecting module 704 is specifically configured to:
track and locate the key parts of the target customer contained in the image information based on a hand detection algorithm, a hand tracking algorithm, a key point detection algorithm and a key point tracking algorithm; and if the distance between a key part of the target customer and the target shelf is smaller than a first threshold, determine that the target customer touches the target shelf.
In a specific implementation manner of the embodiment of the present specification, when the detection module 704 tracks and locates the key parts of the target customer included in the image information based on the hand detection algorithm, the hand tracking algorithm, the key point detection algorithm, and the key point tracking algorithm, the detection module is specifically configured to:
inputting each image frame of the image information into a key point detection model to obtain a key part set of each target customer, wherein the key part set is associated with identification information and a hand of the target customer; gathering key parts which belong to the same target customer and are obtained from each image frame in the image information into a track to obtain a key point track of each target customer; inputting each image frame of the image information into a hand detection model to obtain a positioning frame of each hand; tracking and positioning based on a positioning frame obtained from each image frame in the image information to obtain a hand track of each hand; and tracking and positioning the hand of the target customer based on the key point track and the hand track.
In a further specific implementation manner of the embodiment of the present specification, the detecting module 704, when performing tracking and positioning on the hand of the target customer based on the key point trajectory and the hand trajectory, is specifically configured to:
if it is detected that, in N consecutive image frames, the distance between the key point corresponding to the hand in the key point track of the target customer and the center of the positioning frame in the hand track is not greater than a second threshold, determine that the key point track of the target customer is bound to the hand track, so that after key points other than the hand in the key point track of the target customer are lost, the key point track before the loss and the hand track after the loss are spliced to track and locate the hand of the target customer; where N is a positive integer greater than or equal to 2, and the second threshold is half of the average length of the long edges of the positioning frames in the hand track.
In yet another specific implementation manner of the embodiment of the present specification, after splicing the keypoint trajectory before the loss and the hand trajectory after the loss to track and locate the hand of the target customer, the detecting module 704 is further configured to:
if it is detected that, in M consecutive image frames, the distance between the key point corresponding to the hand in the key point track of the target customer and the center of the positioning frame in the hand track is greater than the second threshold, revert to tracking the key point track of the target customer, or determine that the key point track of the target customer is bound to another hand track, where M is a positive integer greater than or equal to 2 and M is greater than N.
In yet another specific implementation manner of the embodiment of the present specification, the detecting module 704, when performing tracking and positioning on the hand of the target customer based on the keypoint trajectory and the hand trajectory, is specifically configured to:
when a break in the hand track is detected, estimate the position of the key point corresponding to the hand based on the key point track of the target customer associated with the hand of that hand track; and splice the broken hand track segments based on the estimation result to track and locate the hand of the target customer.
In another specific implementation manner of the embodiment of the present specification, the identifying module 706, when identifying the interaction behavior of the target customer with the target shelf based on the weighing information, is specifically configured to:
determine, from the weighing information, the weight values of the target shelf from the start time to the end time of the touch behavior; if the first weight value at the start time is greater than the second weight value at the end time and the weight value falls within the time period, determine that the target customer took goods from the target shelf; if the first weight value is equal to the second weight value and the weight value does not change within the time period, determine that the target customer only touched the target shelf; if the first weight value is less than the second weight value and the weight value rises within the time period, determine that the target customer placed goods on the target shelf; if the first weight value is greater than the second weight value and the weight value first rises and then falls within the time period, determine that the target customer replaced goods with goods of greater mass from the target shelf; if the first weight value is equal to the second weight value and the weight value first rises and then falls within the time period, determine that the target customer replaced goods with goods of equal mass from the target shelf; and if the first weight value is less than the second weight value and the weight value first rises and then falls within the time period, determine that the target customer replaced goods with goods of smaller mass from the target shelf.
In another specific implementation manner of the embodiment of the present specification, the identifying module 706, when predicting the handheld states of the target customers at the starting time of the occurrence of the touching behavior and at the ending time of the occurrence of the touching behavior according to the holding detection model, is specifically configured to:
obtaining a first image and a second image of each of the plurality of targeted customers; inputting the first image and the second image into a goods-holding detection model respectively to obtain a prediction result of each target customer; wherein the first image is a hand image acquired at a start time of occurrence of the touching act for each target customer, and the second image is a hand image acquired at an end time of occurrence of the touching act for each target customer.
In yet another specific implementation manner of the embodiment of the present specification, the identifying module 706, when identifying the interaction behavior of each target customer with the target shelf based on the prediction result, is specifically configured to:
for any target customer: determine, from the prediction results of the target customer, a first prediction result predicted from the first image and a second prediction result predicted from the second image; if the first prediction result is that the target customer does not hold goods and the second prediction result is that the target customer holds goods, determine that the target customer took goods from the target shelf; if the first prediction result is that the target customer holds goods and the second prediction result is that the target customer does not hold goods, determine that the target customer placed goods on the target shelf; if both the first and the second prediction results are that the target customer does not hold goods, determine that the target customer only touched the target shelf; and if both the first and the second prediction results are that the target customer holds goods, determine that the target customer exchanged goods on the target shelf.
Through the above technical solution, a human-cargo interaction behavior recognition system is built from low-cost shelves, weighing devices, a host computer, cameras and a central control server. The weighing information of the target shelf collected by the weighing device and the image information containing the target customer collected by the camera are transmitted through the host computer to the central control server for processing. The server first determines from the image information whether a touch behavior on the target shelf has occurred, and then selects an identification scheme according to the number of target customers involved in the touch behavior: if only one target customer touches the shelf, the weighing information of the target shelf can be used to identify the interaction behavior of that target customer with the target shelf; if a plurality of target customers touch the shelf, the goods-holding detection model can be applied to the images of each target customer before and after the touch, and the interaction behavior of each target customer with the target shelf is identified from the prediction results. In this way, starting from the touch behavior determined from the image information, the interaction behavior of the target customer with the target shelf is identified by combining either the weighing information or the goods-holding detection model, which improves identification accuracy and efficiency.
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present specification. Referring to FIG. 8, at the hardware level the electronic device includes a processor, and optionally further includes an internal bus, a network interface and a memory. The memory may include internal memory, such as Random-Access Memory (RAM), and may further include non-volatile memory, such as at least one disk storage. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other by an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 8, but that does not indicate only one bus or one type of bus.
The memory is used for storing programs. Specifically, a program may include program code, and the program code includes computer operating instructions. The memory may include internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the nonvolatile memory to the memory and then runs the computer program to form the human-cargo interaction behavior recognition device on the logic level. The processor is used for executing the program stored in the memory and is specifically used for executing the following operations:
receiving image information collected at the shooting location where a target shelf is located, wherein the image information comprises image data of at least one target customer; detecting, based on the image information, whether there is a touch behavior of a target customer touching the target shelf; if a touch behavior by one target customer is detected, querying weighing information of the target shelf from the start time to the end time of the touch behavior, and identifying the interaction behavior of the target customer with the target shelf based on the weighing information; if touch behaviors by a plurality of target customers are detected, predicting, according to a goods-holding detection model, the hand-held states of the plurality of target customers at the start time and at the end time of the touch behavior, and identifying the interaction behavior of each target customer with the target shelf based on the prediction results; the goods-holding detection model is obtained by training on historical hand images captured before and after historical customers interacted with a plurality of shelves.
The method performed by the apparatus disclosed in the embodiments shown in FIG. 3 or FIG. 6 of this specification may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or perform the methods, steps and logic blocks disclosed in one or more embodiments of the present specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of a method disclosed in connection with one or more embodiments of the present specification may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM or EPROM, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device may further execute the method in fig. 3 or fig. 6, and implement the functions of the corresponding apparatus in the embodiment shown in fig. 3 or fig. 6, which are not described herein again in this specification.
Of course, besides the software implementation, the electronic device of the embodiment of the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or logic devices.
Embodiments of the present specification also propose a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a portable electronic device comprising a plurality of application programs, cause the portable electronic device to perform the method of the embodiment shown in FIG. 3 or FIG. 6, and in particular to perform the following operations:
receiving image information collected at the shooting location where a target shelf is located, wherein the image information comprises image data of at least one target customer; detecting, based on the image information, whether there is a touch behavior of a target customer touching the target shelf; if a touch behavior by one target customer is detected, querying weighing information of the target shelf from the start time to the end time of the touch behavior, and identifying the interaction behavior of the target customer with the target shelf based on the weighing information; if touch behaviors by a plurality of target customers are detected, predicting, according to a goods-holding detection model, the hand-held states of the plurality of target customers at the start time and at the end time of the touch behavior, and identifying the interaction behavior of each target customer with the target shelf based on the prediction results; the goods-holding detection model is obtained by training on historical hand images captured before and after historical customers interacted with a plurality of shelves.
In short, the above description is only a preferred embodiment of the present disclosure, and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present specification shall be included in the protection scope of the present specification.
The system, apparatus, module or unit illustrated in one or more of the above embodiments may be implemented by a computer chip or an entity, or by an article of manufacture with a certain functionality. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus comprising the element.
All the embodiments in the present specification are described in a progressive manner; for parts that are the same or similar among the embodiments, reference may be made between them, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, it is described only briefly, and for the relevant points reference may be made to the corresponding description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Claims (14)
1. A human-cargo interaction behavior identification method, comprising:
receiving image information collected at a shooting location where a target shelf is located;
detecting, based on the image information, whether there is a touch behavior of a target customer touching the target shelf;
if the touch behavior of one target customer is detected, querying weighing information of the target shelf from the start time to the end time of the touch behavior, and identifying the interaction behavior between the target customer and the target shelf based on the weighing information;
if the touch behaviors of a plurality of target customers are detected, predicting, according to a goods-holding detection model, the hand-held states of the plurality of target customers at the start time and at the end time of the touch behaviors, and identifying the interaction behavior between each target customer and the target shelf based on the prediction results;
wherein the goods-holding detection model is trained on historical hand images captured before and after historical customers respectively interacted with a plurality of shelves.
2. The human-cargo interaction behavior recognition method according to claim 1, wherein the step of detecting whether a target customer touches the target shelf or not based on the image information comprises the following steps:
tracking and positioning the key parts of the target customers contained in the image information based on a hand detection algorithm, a hand tracking algorithm, a key point detection algorithm and a key point tracking algorithm;
and if the distance between the key part of the target customer and the target shelf is smaller than a first threshold value, determining that the target customer touches the target shelf.
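As an illustration of the first-threshold test in claim 2, the following Python sketch checks whether a tracked key part lies close enough to the shelf. The claim does not say how the distance is measured; the sketch assumes 2D image coordinates and models the shelf as an axis-aligned rectangle, and the names `distance_to_box` and `touches_shelf` are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def distance_to_box(p: Point, box: Box) -> float:
    """Euclidean distance from a point to a rectangle (0 if inside)."""
    x, y = p
    x_min, y_min, x_max, y_max = box
    dx = max(x_min - x, 0.0, x - x_max)
    dy = max(y_min - y, 0.0, y - y_max)
    return math.hypot(dx, dy)


def touches_shelf(key_part: Point, shelf: Box, first_threshold: float) -> bool:
    """Assumed reading of the claim: touch when distance < first threshold."""
    return distance_to_box(key_part, shelf) < first_threshold
```

Note that the comparison is strict, mirroring the wording "smaller than a first threshold value".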
3. The human-cargo interaction behavior recognition method according to claim 2, wherein the tracking and positioning of the key parts of the target customers included in the image information based on a hand detection algorithm and a hand tracking algorithm and a key point detection algorithm and a key point tracking algorithm comprises:
inputting each image frame of the image information into a key point detection model to obtain a key part set of each target customer, wherein the key part set is associated with identification information and a hand of the target customer; gathering key parts which belong to the same target customer and are obtained from each image frame in the image information into a track to obtain a key point track of each target customer;
inputting each image frame of the image information into a hand detection model to obtain a positioning frame of each hand; tracking and positioning based on a positioning frame obtained from each image frame in the image information to obtain a hand track of each hand;
and tracking and positioning the hand of the target customer based on the key point track and the hand track.
4. The human-cargo interaction behavior recognition method as claimed in claim 3, wherein the tracking and positioning of the hand of the target customer based on the key point track and the hand track comprises:
if, in N consecutive detected image frames, the distance between the key point corresponding to a hand in the key point track of the target customer and the center of the positioning frame in a hand track is not greater than a second threshold value, determining that the key point track of the target customer is bound to that hand track, so that, after the key points other than the hand in the key point track of the target customer are lost, the key point track before the loss is spliced with the hand track after the loss to track and locate the hand of the target customer; wherein N is a positive integer greater than or equal to 2, and the second threshold value is half of the average length of the long edges of the positioning frames in the hand track.
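The binding condition of claim 4 can be sketched in Python as follows. This is a simplified illustration, not the claimed implementation: it assumes the hand key points and the positioning frames are already aligned frame by frame, it checks the most recent N frames, and the names `second_threshold` and `should_bind` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def second_threshold(hand_track: Sequence[Box]) -> float:
    """Half of the mean long-edge length of the boxes in the hand track."""
    long_edges = [max(x1 - x0, y1 - y0) for x0, y0, x1, y1 in hand_track]
    return 0.5 * sum(long_edges) / len(long_edges)


def should_bind(hand_keypoints: Sequence[Point],
                hand_track: Sequence[Box],
                n: int) -> bool:
    """True if, in each of the last n frames, the hand key point lies within
    the second threshold of the center of the hand's positioning frame."""
    if n < 2 or len(hand_keypoints) < n or len(hand_track) < n:
        return False
    threshold = second_threshold(hand_track)
    for (kx, ky), (x0, y0, x1, y1) in zip(hand_keypoints[-n:], hand_track[-n:]):
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        if math.hypot(kx - cx, ky - cy) > threshold:
            return False
    return True
```

Scaling the threshold with the box size keeps the test roughly invariant to how large the hand appears in the image, which is presumably why the claim ties it to the long edge of the positioning frame.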
5. The human-cargo interaction behavior recognition method according to claim 4, after splicing the pre-loss key point trajectory and the post-loss hand trajectory to track and locate the hand of the target customer, the method further comprising:
if, in M consecutive image frames, the distance between the key point corresponding to a hand in the key point track of the target customer and the center of the positioning frame in the hand track is greater than the second threshold value, resuming tracking of the key point track of the target customer, or determining that the key point track of the target customer is bound to another hand track, wherein M is a positive integer greater than or equal to 2, and M is greater than N.
6. The human-cargo interaction behavior recognition method as claimed in claim 3, wherein the tracking and positioning of the hand of the target customer based on the key point track and the hand track comprises:
when a break in the hand track is detected, estimating the positions of the key points corresponding to the hand based on the key point track of the target customer associated with the hand corresponding to the hand track;
and splicing the broken hand tracks based on the estimation result to track and locate the hand of the target customer.
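One possible way to splice a broken hand track, as outlined in claim 6, is sketched below in Python. The claim does not fix a data structure or a matching rule, so everything here is an assumption: tracks are dictionaries from frame index to hand center, the customer's hand key point serves as the position estimate during the gap, and the hypothetical parameter `max_dist` decides whether the later fragment is close enough to the estimate to be the same hand.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def splice_hand_track(before: Dict[int, Point],
                      after: Dict[int, Point],
                      hand_keypoints: Dict[int, Point],
                      max_dist: float) -> Optional[Dict[int, Point]]:
    """Join two hand-track fragments using key-point estimates for the gap.

    Returns the spliced track, or None when the later fragment does not
    match the key-point estimate at the frame where it resumes.
    """
    if not before or not after:
        raise ValueError("both fragments must be non-empty")
    resume = min(after)
    estimate = hand_keypoints.get(resume)
    if estimate is None:
        return None
    ax, ay = after[resume]
    if math.hypot(ax - estimate[0], ay - estimate[1]) > max_dist:
        return None  # later fragment likely belongs to a different hand
    spliced = dict(before)
    # Fill the missing frames with the key-point position estimates.
    for frame in range(max(before) + 1, resume):
        if frame in hand_keypoints:
            spliced[frame] = hand_keypoints[frame]
    spliced.update(after)
    return spliced
```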
7. The human-cargo interaction behavior recognition method as claimed in any one of claims 1 to 6, wherein recognizing the interaction behavior of the target customer with the target shelf based on the weighing information comprises:
determining, from the weighing information, the weight values of the target shelf from the start time to the end time of the touch behavior;
determining that the target customer picked goods from the target shelf if the first weight value at the start time is greater than the second weight value at the end time and the weight value falls within the time period;
determining that the target customer only touched the target shelf if the first weight value at the start time is equal to the second weight value at the end time and the weight value does not change within the time period;
determining that the target customer placed goods on the target shelf if the first weight value at the start time is less than the second weight value at the end time and the weight value rises within the time period;
determining that the target customer replaced goods of greater mass from the target shelf if the first weight value at the start time is greater than the second weight value at the end time and the weight value both rises and falls within the time period;
determining that the target customer replaced goods of equal mass from the target shelf if the first weight value at the start time is equal to the second weight value at the end time and the weight value both rises and falls within the time period;
determining that the target customer replaced goods of lesser mass from the target shelf if the first weight value at the start time is less than the second weight value at the end time and the weight value both rises and falls within the time period.
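The six weight-based cases of claim 7 can be expressed as a small decision table. The Python sketch below is an illustration under assumptions: the weighing information is taken to be a list of samples covering the touch interval, a noise tolerance `tol` (not mentioned in the claim) decides when two values count as equal, a pick or put is read as a change in one direction only, and the function name and string labels are hypothetical.

```python
from typing import Sequence


def classify_by_weight(samples: Sequence[float], tol: float = 1.0) -> str:
    """Classify one customer's interaction from shelf weight samples.

    samples[0] is the weight at the start of the touch behavior and
    samples[-1] the weight at its end; tol is an assumed noise tolerance.
    """
    if len(samples) < 2:
        raise ValueError("need at least the start and end weight")
    steps = list(zip(samples, samples[1:]))
    rose = any(b - a > tol for a, b in steps)   # weight went up at some point
    fell = any(a - b > tol for a, b in steps)   # weight went down at some point
    delta = samples[-1] - samples[0]
    net = 0 if abs(delta) <= tol else (1 if delta > 0 else -1)
    if rose and fell:
        # Goods were both removed and added: a replacement.
        return {-1: "replace_shelf_lighter",   # start weight > end weight
                0: "replace_equal",
                1: "replace_shelf_heavier"}[net]
    # Change in one direction only (or none): pick, touch only, or put.
    return {-1: "pick", 0: "touch_only", 1: "put"}[net]
```

For example, a sequence that drops from 500 to 300 and then climbs back to 400 both falls and rises with a net decrease, so it lands in the replacement case where the start weight exceeds the end weight.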
8. The human-cargo interaction behavior recognition method as claimed in any one of claims 1 to 6, wherein the predicting of the handheld states of the plurality of target customers at the starting time of the occurrence of the touching behavior and at the ending time of the occurrence of the touching behavior according to the goods-holding detection model comprises:
obtaining a first image and a second image of each of the plurality of target customers;
inputting the first image and the second image into the goods-holding detection model respectively to obtain a prediction result for each target customer;
wherein the first image is a hand image of each target customer acquired at the start time of the touch behavior, and the second image is a hand image of each target customer acquired at the end time of the touch behavior.
9. The human-cargo interaction behavior recognition method as claimed in claim 8, wherein recognizing the interaction behavior of each target customer with the target shelf based on the prediction result comprises:
for any target customer:
determining, from the prediction results of the target customer, a first prediction result predicted from the first image and a second prediction result predicted from the second image;
determining that the target customer picked goods from the target shelf if the first prediction result is that the target customer does not hold goods and the second prediction result is that the target customer holds goods;
determining that the target customer placed goods on the target shelf if the first prediction result is that the target customer holds goods and the second prediction result is that the target customer does not hold goods;
determining that the target customer only touched the target shelf if the first prediction result is that the target customer does not hold goods and the second prediction result is that the target customer does not hold goods;
and determining that the target customer replaced goods on the target shelf if the first prediction result is that the target customer holds goods and the second prediction result is that the target customer holds goods.
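The four cases of claim 9 amount to a lookup on the pair of hand-held states. A minimal Python sketch, assuming the goods-holding detection model's output has already been reduced to a boolean per image (the function name and labels are hypothetical):

```python
def classify_by_hands(holding_at_start: bool, holding_at_end: bool) -> str:
    """Map (hand-held state before, hand-held state after) to a behavior."""
    table = {
        (False, True): "pick",         # empty hand, then holding goods
        (True, False): "put",          # holding goods, then empty hand
        (False, False): "touch_only",  # empty hand both times
        (True, True): "replace",       # holding goods both times
    }
    return table[(bool(holding_at_start), bool(holding_at_end))]
```

Because each customer's own hand images are classified, this mapping can be applied independently to every customer who touched the shelf in the same interval.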
10. A human-cargo interaction behavior recognition apparatus, comprising:
the receiving module is used for receiving image information collected based on a shooting place where the target goods shelf is located;
the detection module is used for detecting, based on the image information, whether there is a touch behavior of a target customer touching the target shelf;
the identification module is used for querying weighing information of the target shelf from the start time to the end time of the touch behavior if the detection module detects the touch behavior of one target customer, and identifying the interaction behavior between the target customer and the target shelf based on the weighing information;
the identification module is further used for predicting, according to a goods-holding detection model, the hand-held states of a plurality of target customers at the start time and at the end time of the touch behaviors if the detection module detects the touch behaviors of the plurality of target customers, and identifying the interaction behavior between each target customer and the target shelf based on the prediction results;
the goods holding detection model is obtained by training based on historical hand images before and after historical customers respectively interact with the plurality of shelves.
11. The human-cargo interaction behavior recognition apparatus according to claim 10, wherein the detection module, when detecting whether there is a touch behavior of a target customer touching the target shelf based on the image information, is specifically configured to:
tracking and positioning the key part of the target customer contained in the image information based on a hand detection algorithm, a hand tracking algorithm, a key point detection algorithm and a key point tracking algorithm;
and if the distance between the key part of the target customer and the target shelf is smaller than a first threshold value, determining that the target customer touches the target shelf.
12. A human-cargo interaction behavior recognition system, comprising: at least one shelf, each shelf being provided with a weighing device for weighing the shelf; at least one upper computer, configured to receive weighing information sent by the weighing devices; at least one camera, configured to collect image information of the shelf from a top view or a side view; and a central control server, connected to the at least one upper computer and the at least one camera respectively, and configured to receive the weighing information uploaded by the at least one upper computer and the image information collected by the at least one camera, and to execute the human-cargo interaction behavior identification method according to any one of claims 1 to 9.
13. An electronic device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform the human-cargo interaction behavior recognition method of any of claims 1-9.
14. A computer-readable storage medium storing one or more programs which, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform the human-cargo interaction behavior recognition method of any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211498078.1A CN115620402B (en) | 2022-11-28 | 2022-11-28 | Human-cargo interaction behavior identification method, system and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115620402A (en) | 2023-01-17 |
CN115620402B (en) | 2023-03-31 |
Family
ID=84878189
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211498078.1A Active CN115620402B (en) | 2022-11-28 | 2022-11-28 | Human-cargo interaction behavior identification method, system and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115620402B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108492451A (en) * | 2018-03-12 | 2018-09-04 | 远瞳(上海)智能技术有限公司 | Automatic vending method |
CN109064630A (en) * | 2018-07-02 | 2018-12-21 | 高堆 | Nobody weigh automatically valuation container system |
JP2019211891A (en) * | 2018-06-01 | 2019-12-12 | コニカミノルタ株式会社 | Behavior analysis device, behavior analysis system, behavior analysis method, program and recording medium |
WO2020171635A1 (en) * | 2019-02-21 | 2020-08-27 | 이한승 | Smart vending machine allowing product to be selected by opening door |
CN113706227A (en) * | 2021-11-01 | 2021-11-26 | 微晟(武汉)技术有限公司 | Goods shelf commodity recommendation method and device |
CN113807915A (en) * | 2021-08-31 | 2021-12-17 | 恩梯梯数据(中国)信息技术有限公司 | Unmanned supermarket person and goods matching method and system based on deep learning construction |
CN114360057A (en) * | 2021-12-27 | 2022-04-15 | 广州图普网络科技有限公司 | Data processing method and related device |
CN114529847A (en) * | 2021-12-29 | 2022-05-24 | 西安理工大学 | Goods shelf dynamic commodity identification and customer shopping matching method based on deep learning |
US20220230216A1 (en) * | 2018-07-16 | 2022-07-21 | Accel Robotics Corporation | Smart shelf that combines weight sensors and cameras to identify events |
CN217338023U (en) * | 2021-08-31 | 2022-09-02 | 恩梯梯数据(中国)信息技术有限公司 | Unmanned supermarket people and goods matching device |
CN115083016A (en) * | 2022-06-09 | 2022-09-20 | 广州紫为云科技有限公司 | Monocular camera-based small-target-oriented hand space interaction method and device |
Non-Patent Citations (2)
Title |
---|
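李春华: "基于RFID技术的智能超市构架方案" (Architecture scheme for a smart supermarket based on RFID technology) * |
赵政: "基于嵌入式系统自动售货机的控制及应用" (Control and application of vending machines based on embedded systems) * |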
Also Published As
Publication number | Publication date |
---|---|
CN115620402B (en) | 2023-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12056932B2 (en) | Multifactor checkout application | |
CN108985199B (en) | Detection method and device for commodity taking and placing operation and storage medium | |
US20210056498A1 (en) | Method and device for identifying product purchased by user and intelligent shelf system | |
US9165279B2 (en) | System and method for calibration and mapping of real-time location data | |
US20180247361A1 (en) | Information processing apparatus, information processing method, wearable terminal, and program | |
US20190371134A1 (en) | Self-checkout system, method thereof and device therefor | |
CN109353397B (en) | Commodity management method, device and system, storage medium and shopping cart | |
CN110033293B (en) | Method, device and system for acquiring user information | |
US20170068945A1 (en) | Pos terminal apparatus, pos system, commodity recognition method, and non-transitory computer readable medium storing program | |
EP3510571A1 (en) | Order information determination method and apparatus | |
US20180268224A1 (en) | Information processing device, determination device, notification system, information transmission method, and program | |
JP6707940B2 (en) | Information processing device and program | |
JP7379677B2 (en) | Electronic device for automatic user identification | |
KR20140114832A (en) | Method and apparatus for user recognition | |
JPWO2015147333A1 (en) | Sales registration device, program and sales registration method | |
CN112307864A (en) | Method and device for determining target object and man-machine interaction system | |
US10037510B2 (en) | System and method for calibration and mapping of real-time location data | |
US20230101001A1 (en) | Computer-readable recording medium for information processing program, information processing method, and information processing device | |
CN111428743A (en) | Commodity identification method, commodity processing device and electronic equipment | |
JP7318753B2 (en) | Information processing program, information processing method, and information processing apparatus | |
JP7108553B2 (en) | Planogram information generation device and planogram information generation program | |
CN115620402B (en) | Human-cargo interaction behavior identification method, system and related device | |
CN110677448A (en) | Associated information pushing method, device and system | |
CN113378601A (en) | Method for preventing goods loss, self-service equipment and storage medium | |
CN112950329A (en) | Commodity dynamic information generation method, device, equipment and computer readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| TR01 | Transfer of patent right | Effective date of registration: 20231219. Address after: Room 801-6, No. 528 Yan'an Road, Gongshu District, Hangzhou City, Zhejiang Province, 310000. Patentee after: Zhejiang Shenxiang Intelligent Technology Co.,Ltd. Address before: Room 5034, building 3, 820 wenerxi Road, Xihu District, Hangzhou, Zhejiang 310000. Patentee before: ZHEJIANG LIANHE TECHNOLOGY Co.,Ltd. |