
WO2003001450A1 - Processing of digital images - Google Patents

Processing of digital images

Info

Publication number
WO2003001450A1
Authority
WO
WIPO (PCT)
Prior art keywords
subareas
threshold
image
subarea
values
Prior art date
Application number
PCT/SE2002/001244
Other languages
French (fr)
Inventor
Andreas Olsson
Original Assignee
Anoto Ab
Priority date
Filing date
Publication date
Application filed by Anoto Ab filed Critical Anoto Ab
Priority to EP02741593A priority Critical patent/EP1421555A1/en
Publication of WO2003001450A1 publication Critical patent/WO2003001450A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/142 Image acquisition using hand-held instruments; Constructional details of the instruments
    • G06V30/1423 Image acquisition using hand-held instruments; Constructional details of the instruments the instrument generating sequences of position coordinates corresponding to handwriting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10048 Infrared image

Definitions

  • the present invention relates in general to processing of digital images, in particular thresholding or binarization thereof.
  • the invention is particularly, but not exclusively, aimed at preliminary image processing prior to calculation of position information on the basis of the shape and/or location of an object in a digital image.
  • In digital image processing, it is sometimes desirable to separate some form of structure from a background in a digital image. This can be achieved by so-called thresholding or binarization, in which the luminance values of the pixels of the digital image are compared to a threshold value. More particularly, luminance values above the threshold value are set to 1, while luminance values below the threshold value are set to 0, or vice versa. With a well-selected threshold value, the thresholding results in a binary image with defined, real structures.
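For illustration only, the binarization described above can be written in a few lines. The following NumPy sketch uses an assumed 8-bit greyscale patch and an arbitrary global threshold of 128; neither value is prescribed by the text.

```python
import numpy as np

def binarize_global(grey: np.ndarray, threshold: int) -> np.ndarray:
    """Set luminance values above the threshold to 1 and the rest to 0."""
    return (grey > threshold).astype(np.uint8)

# A synthetic 4 x 4 greyscale patch with dark "marks" on a bright background.
grey = np.array([[200, 198, 60, 201],
                 [199, 55, 58, 197],
                 [202, 200, 199, 62],
                 [198, 201, 203, 200]], dtype=np.uint8)
binary = binarize_global(grey, threshold=128)
print(binary)  # bright background -> 1, dark marks -> 0
```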
  • a sequence of images is processed in a number of steps.
  • One of the introductory steps can be the above-mentioned thresholding, which aims on the one hand to locate relevant structures and, on the other, to reduce the amount of data that is processed in subsequent steps.
  • It is desirable for the thresholding to be carried out with high precision, as errors will otherwise propagate in the subsequent processing steps.
  • In its simplest form, the thresholding means that the luminance value of each pixel in a current image is compared with a global threshold value.
  • global thresholding requires, however, extensive control of the image recording so as to avoid luminance variations and noise.
  • In practical applications there are often variations within each image, for example with regard to the background luminance, signal-to-noise ratio and sharpness.
  • With global thresholding such variations can lead to structures being missed or to fictitious structures being identified, particularly at the periphery of the images.
  • An example where the above considerations arise is in calculating a position based on images of a pattern on a base.
  • the pattern contains individual symbols, the shape and/or relative location of which code said position.
  • the images can, for example, be recorded optically by a sensor in a hand-held apparatus, for example in the form of a pen.
  • a pen for position determination is described, for example, in US-A-5 051 736, US-A-5 477 012 and WO 00/73983 and can be used to record handwritten information digitally.
  • the above-mentioned images can be processed in a data processing unit, such as a suitably programmed microprocessor, an ASIC, an FPGA, etc, which receives a sequence of digital greyscale images, converts these to binary for identification of the above-mentioned symbols, and calculates a position on the basis of each binarised image.
  • a threshold matrix is used that contains a threshold value for each pixel in the greyscale image.
  • a derivative matrix is calculated by convolution of the greyscale image with a suitable derivative mask, whereupon the pixel values of the derivative matrix are multiplied by the pixel values of the greyscale image to create a product matrix. Thereafter the derivative matrix and the product matrix are divided into subareas, within which a respective sum of the pixel values is calculated.
  • US-A-5 764 611 describes a thresholding method to be applied to greyscale images containing a pattern of dark dots against a bright background.
  • the greyscale image is divided into subareas, within which the pixel values are summed to create a sum matrix.
  • a low-pass filter is then applied to this sum matrix to create a background matrix, which after multiplication by a suitable fraction value is considered to form a matrix of local threshold values.
  • this thresholding method is hampered by being sensitive to lack of sharpness in the greyscale image. Such lack of sharpness must be eliminated by extensive pre-processing of the greyscale image.
  • the contrast matrix is then used to produce coefficients for a subarea-specific contrast function, which is finally allowed to operate on the greyscale image in order to eliminate the lack of sharpness in the same.
  • This pre-processing comprises several time-consuming operations and is, in addition, memory-intensive in that it requires the intermediate storage of both the greyscale image and the result of the high-pass filtering.
  • Prior-art technique also includes US-A-4 593 325, which describes a method for adaptive thresholding of a greyscale image prior to binary duplication of the same.
  • An object of the present invention is thus to demonstrate a technique which permits the identification of individual objects in a digital image in a quick and memory-efficient way.
  • a further object is to demonstrate a technique that is relatively unaffected by variations in luminance and/or sharpness within an image.
  • According to the invention, a reference image is used which is representative of the digital image which is to be processed, for the calculation of the threshold matrix.
  • This reference image is given two predetermined, overlapping divisions, into first and second subareas. For each first subarea, a background luminance value is estimated, and for each second subarea, an object luminance value is estimated. Based on these estimated values, a threshold value is calculated for each overlapping first and second subarea.
  • the threshold values form a threshold matrix which is used for binarization of the digital image.
  • the invention enables rapid extraction of a low resolution background image and a low resolution object image from the reference image, after which the threshold matrix is created by threshold values being determined, according to some criterion, to be between associated values in the background and object images.
  • the method according to the invention is a direct method which can be carried out quickly and with relatively low demands as to available memory capacity, since the threshold values can be calculated directly, subarea by subarea, based on the estimated background and object luminance values.
  • the method can be carried out without any calculation-intensive operations, such as convolutions and divisions, even if the method, when necessary, can be supplemented with such an operation, for instance for filtering.
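As a rough sketch of this direct, subarea-by-subarea calculation, the following Python example uses one common division for the first and second subareas (the case where they coincide), estimates the background as the per-subarea maximum and the object as the per-subarea minimum, and places each threshold a fraction k of the contrast below the background estimate. The function names, the subarea size of 16 pixels and k = 0.5 are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def threshold_matrix(reference: np.ndarray, sub: int = 16, k: float = 0.5) -> np.ndarray:
    """Direct, subarea-by-subarea threshold calculation.

    For each sub x sub subarea the background luminance is estimated as the
    greatest pixel value and the object luminance as the least pixel value;
    the threshold is then placed a fraction k of the contrast below the
    background estimate. No convolutions or divisions over the image are used."""
    h, w = reference.shape
    rows, cols = h // sub, w // sub
    # View the image as a grid of subareas: (rows, cols, sub, sub).
    blocks = reference[:rows * sub, :cols * sub].reshape(rows, sub, cols, sub).swapaxes(1, 2)
    background = blocks.max(axis=(2, 3)).astype(np.float32)   # per-subarea background estimate
    objects = blocks.min(axis=(2, 3)).astype(np.float32)      # per-subarea object estimate
    return background - k * (background - objects)            # one threshold per subarea

def binarize(image: np.ndarray, t_matrix: np.ndarray, sub: int = 16) -> np.ndarray:
    """Apply the per-subarea thresholds to the image (1 = background, 0 = object)."""
    # Expand each threshold to cover its subarea, then compare pixel-wise.
    t_full = np.kron(t_matrix, np.ones((sub, sub), dtype=np.float32))
    h, w = t_full.shape
    return (image[:h, :w] > t_full).astype(np.uint8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(170, 230, size=(96, 96)).astype(np.float32)  # bright, noisy background
    img[40:44, 40:44] = 50                                          # one dark mark
    T = threshold_matrix(img, sub=16, k=0.5)
    B = binarize(img, T, sub=16)
    print(T.shape, B[40:44, 40:44])  # 6 x 6 threshold matrix; the mark binarizes to 0
```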
  • individual objects can be identified in a digital image even when there are variations in luminance and/or sharpness within it.
  • the estimated background luminance values represent the variation in background luminance over the reference image, or at least a relevant portion thereof, which means that the threshold values can be determined in adequate relation to the background.
  • the estimated object luminance values represent, in combination with the estimated background luminance values, the variation in contrast over the reference image or said portion, which means the threshold values can also be determined in adequate relation to the sharpness.
  • the division of the reference image, or a portion thereof, into subareas does not have to be a physical division.
  • the subareas are as a rule intended and used for extraction of data from the reference image.
  • the first and second subareas conveniently extend over one and the same continuous portion of the reference image, and in particular so that each second subarea at least partly comprises a first subarea. This guarantees that a threshold value can be calculated for each subarea.
  • the first subareas are mutually exclusive and the second subareas are mutually exclusive.
  • Such first and second subareas do not overlap, and are suitably arranged side by side within the relevant portion of the reference image.
  • the first and second subareas may be identical, as regards size as well as relative position, i.e. the first and second subareas coincide.
  • the first subareas partly overlap each other, and/or the second subareas partly overlap each other.
  • Such an embodiment makes it possible to calculate the threshold matrix with greater accuracy and higher resolution. Moreover, the appearance of artefacts in the joint between subareas is minimised.
  • the size of the first and second subareas may be adjusted for optimal estimation of the background luminance values and object luminance values respectively, as will be discussed in more detail below.
  • threshold values are generated in a threshold matrix.
  • the term threshold matrix is to be interpreted figuratively to relate to an assembly of threshold values which are related to one part each of the reference image.
  • Such a threshold matrix can be stored in a memory unit as a matrix, a plurality of vectors, or an assembly of individual threshold values.
  • the calculated threshold values may be written one upon the other in one and the same memory space in the memory unit, or be given as a sequence of values which is directly used for binarisation of the digital image.
  • the above-mentioned reference image may be any image which is representative of the digital image which is to be binarised.
  • the reference image may consist of an image in this sequence of digital images.
  • the reference image consists of the digital image which is to be binarised.
  • the digital image thus is received, after which the background and object luminance values are estimated for the subareas and the threshold matrix is calculated. Then the threshold matrix is used for binarisation of the digital image.
  • the threshold matrix is calculated intermittently on the basis of the luminance values of a current image in the sequence of digital images, which threshold matrix can then be applied for the binarisation of one or more subsequent images in the sequence of digital images.
  • the calculation of the threshold matrix can be carried out in parallel with the actual thresholding, whereby more images can be processed per unit of time.
  • the need for intermediate storage of the digital images is avoided, as these can be processed by direct comparison with an already-calculated threshold matrix. This is due to the fact that the algorithms according to the invention are sufficiently robust to permit calculation of the threshold values from a reference image which is similar but not identical to the image to which the thresholding is to be applied.
  • the method according to the invention can be based on a simple and calculation-efficient estimate of the background and object luminance values for each subarea.
  • the background luminance value is estimated on the basis of first order statistics of the luminance values of the pixels within the first subarea.
  • First order statistics, for example the least value, the greatest value, the median value, the mean value and the sum of the pixels' luminance values within a subarea, can be extracted from a greyscale image in a calculation-efficient way.
  • the object luminance value can be estimated on the basis of first order statistics of the luminance values of the pixels within the second subarea.
  • the background luminance value is estimated on the basis of the greatest luminance value of the pixels within the first subarea.
  • a luminance value can be extracted from a greyscale image quickly and in a calculation-efficient way.
  • the background luminance value is estimated on the basis of the mean value of the luminance values of the pixels within the first subarea.
  • the background luminance value may alternatively be estimated on the basis of a percentile value. In a corresponding way, the object luminance value can be estimated on the basis of the least luminance value of the pixels within the second subarea.
  • the subareas can be designed so that each of the second subareas comprises a whole number of the first subareas, whereby the threshold matrix is minimised since this only needs to contain one threshold value for each of the first subareas.
  • the second subareas are designed in such a way that they each contain at least a part of at least one of the objects that are to be identified. Accordingly, each second subarea contains with certainty at least one value of the luminance within an object. All threshold values in the resultant threshold matrix will thereby be related both to the objects and to the background against which the objects appear, and this is achieved without a separate selection for the elimination of threshold values belonging to subareas that do not contain any part of an object. It is preferable that the second subareas are designed in such a way that they each contain at least one object in its entirety, which guarantees that each second subarea contains the object's extreme value in luminance. This simplifies the calculation of an adequate threshold value.
  • the image is preferably divided into first subareas that are larger than the objects that are to be identified, whereby each first subarea contains with certainty at least one value of the background luminance between the objects, in relation to which the threshold value can be set.
  • the objects are positioned relative to an invisible raster or grid of known dimensions.
  • This application can serve for the calculation of positions described by way of introduction, for digital recording of handwritten information.
  • a position is coded by the shape and/or location of one or more individual objects in an image.
  • the position-coding pattern can be mentioned that is described in Applicant's patent publications WO 01/16691, WO 01/26032 and WO 01/26033, which are herewith incorporated by reference.
  • This position-coding pattern is constructed of marks, for example dots, which are located at a fixed distance from raster dots belonging to an invisible raster. The value of each mark is provided by its position relative to the associated raster dot.
  • a plurality of such marks together code a position that is given by two or more coordinates.
  • All of the position-coding patterns stated above contain objects that are a known distance apart, which is given by the distance between the raster dots and the distance between the objects and the raster dots.
  • the subareas can thereby be designed in relation to the known dimensions of the raster, for example in such a way that each subarea contains at least a part of at least one object or at least one object in its entirety. If the current image is recorded with a certain inclination of the sensor relative to the position-coding pattern, the distance between the objects will vary across the image, for which reason this type of perspective distortion must be taken into account when dimensioning the subareas.
  • WO 00/73983 should also be mentioned, which describes a position-coding pattern containing marks of two different sizes, where each mark is centred at a respective raster dot of an invisible raster.
  • US-A-5 221 833 describes a coding pattern containing marks in the form of lines with different angles of rotation around their point of symmetry, where each such mark is centred at a respective raster dot of an invisible raster. Also in these cases, it is possible to size the subareas taking into account the dimensions of the raster, so that each subarea contains with certainty at least a part of an object or an object in its entirety.
  • The subareas may further be classified into at least a first category with a high signal-to-noise ratio and a second category with a low signal-to-noise ratio.
  • the classification is used for calculating the threshold value for each subarea, by setting the threshold value at a larger relative distance or contrast depth from the background luminance value in subareas belonging to the second category than in subareas belonging to the first category.
  • Thereby the danger is reduced of the threshold value being at the level of the noise in the proximity of the object.
  • Such a threshold value could generate a number of fictitious structures around the actual object.
  • the threshold value for subareas belonging to the first and second categories is set at a relative distance to the background luminance value of approximately 40-60% and approximately 60-80% of the contrast, respectively.
  • the subareas can be further classified into a third category with a high signal-to-noise ratio and a low contrast.
  • the subareas belonging to the third category are overexposed, for which reason the luminance depth of the object is given by only one or a few pixels.
  • the threshold value for a subarea belonging to the third category is set at a relative distance to the background luminance value of approximately 30-50% of the contrast.
  • the classification of the subareas with regard to their signal-to-noise ratio can be carried out based on a statistical variation measure, for example normalised standard deviation. For images comprising dark objects against a bright background, however, the signal-to-noise ratio is related to the background luminance value. Therefore in a calculation-efficient embodiment, the classification can be carried out by comparison of the greatest luminance value within the subarea and a limit level.
  • This limit level can, for example, correspond to the mean value of the greatest luminance values of all the subareas.
  • the limit level can, for example, consist of a predetermined fixed value.
  • a characteristic level is estimated for the luminance values within each subarea, where the relative distance between the threshold value and the background luminance value is set depending upon this characteristic level, which suitably is indicative of the signal-to-noise ratio in the subarea.
  • the characteristic level can also indicate any overexposure of a subarea, with the ensuing danger of reduced contrast.
  • the threshold value can be adapted to the conditions in the subarea, as indicated by the characteristic level.
  • the characteristic level can be produced on the basis of the mean value, the median value or the sum of the luminance values of the pixels within a subarea. In certain cases, the least or the greatest luminance value within a subarea can represent the characteristic level of the subarea.
  • the relative distance between the threshold value and the background luminance value is a monotonically decreasing function of the characteristic level.
  • the calculated threshold value can be more or less representative of the pixels within the current subarea.
  • the threshold value can be less representative at the edges of the subarea.
  • a subsequent smoothing step is preferably implemented, in which each calculated threshold value is updated on the basis of adjacent calculated threshold values in the threshold matrix.
  • the threshold matrix is given further threshold values in the smoothing step, by interpolation of adjacent calculated threshold values in the threshold matrix. In the interpolation, the threshold matrix is thus given further threshold values which are used for the thresholding of an associated part of the greyscale image.
  • the interpolation can be of any kind, for example linear, and can be implemented in one or more steps.
  • the smoothing step comprises, alternatively or additionally, a low-pass filtering of the threshold matrix.
  • the invention also relates to a device, a computer program product and a hardware circuit for the identification of individual objects in a digital image, and a hand-held apparatus for position determination.
  • Fig. 1A shows schematically an example of 4 x 4 marks that are used to code a position,
  • Fig. 1B shows schematically a digital pen that is used to detect the marks in Fig. 1A and to calculate a position on the basis of these,
  • Fig. 2 shows greyscale images, recorded by the pen in Fig. 1B, of a position-coding pattern of the type shown in Fig. 1A,
  • Fig. 3 shows schematically the construction of a data processor in the pen in Fig. 1B,
  • Figs 4A - 4B illustrate schematically the division of an image into subareas for the calculation of a threshold matrix according to a first and a second embodiment, respectively,
  • Fig. 5A illustrates schematically the classification of objects in the image,
  • Fig. 5B illustrates schematically a generalisation of the embodiment in Fig. 5A,
  • Fig. 6 shows luminance values along a line in a greyscale image, in which associated threshold values in the threshold matrix are indicated by broken lines, and
  • Fig. 7 shows the greyscale images in Fig. 2 after the thresholding thereof according to the invention.
  • the description below concerns position determination based on images of a position-coding pattern.
  • the position-coding pattern can be of any type, for example any one of the patterns mentioned by way of introduction. In the following, however, the invention is exemplified in connection with the pattern that is described in Applicant's Patent Publications WO 01/16691, WO 01/26032 and WO 01/26033. This pattern will be described briefly below with reference to Fig. 1A.
  • the position-coding pattern comprises a virtual raster or grid 1, which is thus neither visible to the human eye nor can be detected directly by a device which is to determine positions on the surface, and a plurality of marks 2, each of which, depending upon its position, represents one of four values "1" to "4" .
  • the value of the mark 2 depends upon where it is placed in relation to its nominal position 3.
  • the nominal position 3, which can also be called a raster dot, is represented by the intersection of the raster lines. In one embodiment, the distance between the raster lines is 300 μm and the angle between the raster lines is 90 degrees.
  • Other raster intervals are possible, for example 254 μm to suit printers and scanners which often have a resolution which is a multiple of 100 dpi, which corresponds to a distance between dots of 25.4 mm/100, that is 254 μm.
  • each mark 2 is, at its centre of gravity, displaced relative to its nominal position 3, that is no mark is located at the nominal position. In addition, there is only one mark 2 per nominal position 3.
  • the marks 2 are displaced relative to the nominal positions 3 by 50 μm along the raster lines.
  • the displacement is preferably 1/6 of the raster interval, as it is then relatively easy to determine to which nominal position a particular mark belongs.
  • the displacement should be at least approximately 1/8 of the raster interval, otherwise it becomes difficult to determine a displacement, that is the requirements for resolution become great.
  • the displacement should be less than approximately 1/4 of the raster interval, in order for it to be possible to determine to which nominal position a mark belongs.
  • Each mark 2 consists of a more or less circular dot with a radius which is approximately the same size as the displacement or somewhat less.
  • the radius can be 25% to 120% of the displacement. If the radius is much larger than the displacement, it can be difficult to determine the raster lines. If the radius is too small, a greater resolution is required to record the marks.
  • the marks do not, however, need to be circular or round, but any suitable shape can be used, such as square, triangular, elliptical, open or closed, etc.
  • the pattern described above can be designed to code a very large number of absolute positions. For example, the pattern can be such that 6 x 6 adjacent marks together code a position, in the form of an x-coordinate and a y-coordinate.
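An illustrative sketch of how mark positions could be generated for the pattern just described is given below: each mark is displaced 1/6 of the raster interval from its raster dot along the raster lines, in one of four directions coding the values 1 to 4. The value-to-direction mapping used here is an assumption made for the example only; the actual coding is defined in the cited publications.

```python
# Raster interval and displacement taken from the description above
# (300 um grid, marks offset by 1/6 of the interval along the raster lines).
RASTER = 300.0          # um between raster lines
OFFSET = RASTER / 6.0   # 50 um displacement of each mark

# Assumed value-to-direction mapping (illustrative only).
DIRECTIONS = {1: (OFFSET, 0.0), 2: (0.0, OFFSET), 3: (-OFFSET, 0.0), 4: (0.0, -OFFSET)}

def mark_position(col: int, row: int, value: int) -> tuple[float, float]:
    """Return the (x, y) centre of the mark belonging to raster dot (col, row)."""
    dx, dy = DIRECTIONS[value]
    return col * RASTER + dx, row * RASTER + dy

# A 2 x 2 patch of marks coding the values 1..4.
for row in range(2):
    for col in range(2):
        value = 1 + (row * 2 + col) % 4
        print((col, row), value, mark_position(col, row, value))
```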
  • Fig. 1B shows a hand-held apparatus 10, below called a pen, that is used for optical detection of the position-coding pattern in Fig. 1A.
  • the pen 10 has a casing 11 in the shape of a pen, which has an opening 12 at one end. This end is intended to abut against or to be held a short distance from the surface on which the position determination is to be carried out.
  • One or more infrared light-emitting diodes 13 are arranged in the opening 12 for illuminating the surface area which is to be imaged, and an area sensor 14, sensitive to infrared light, for example a CCD or CMOS sensor, is arranged for recording a two-dimensional image of the surface area.
  • the area sensor 14 is connected to a data processor 15 which is arranged to determine a position on the basis of the image recorded by the sensor 14.
  • the data processor 15 can contain one or more processors (not shown), programmed to record images from the sensor 14 or from a buffer memory (not shown) associated with the sensor, and to carry out position determination on the basis of these images.
  • the pen 10 also has a pen point 16 which deposits pigment ink on the product. Using this, the user can write physically on the product, while at the same time what is being written is recorded digitally via optical detection of the position-coding pattern.
  • the pigment ink is suitably transparent to infrared light, while the marks 2 of the position-coding pattern (Fig. 1A) absorb infrared light. This means that the pigment ink does not interfere with the detection of the pattern.
  • When the pen 10 is passed over the position-coding pattern, the area sensor 14 thus records a sequence of digital greyscale images which are transmitted to the data processor 15 for position determination.
  • Fig. 2 shows examples of such greyscale images I. These contain 96 x 96 pixels, the luminance values of which are given with 8-bit resolution. To achieve an adequate temporal resolution for the digitally recorded information, the images are read off from the area sensor 14 at a frequency of approximately 100 Hz, that is approximately 10 ms is available per image I for calculating a position. In the images in Fig. 2, the marks 2 appear as dark dots against a bright background. Normally each mark or object covers several pixels in the image.
  • the sharpness varies within the image as a result of the pen and thus the area sensor being angled against the base when writing down information.
  • the contrast can also vary within the image as a result of uneven scattering properties of the base.
  • the illumination of the base is uneven.
  • the images are well illuminated at their central parts, with, however, varying image sharpness, while the peripheral parts have low signal-to-noise ratios, due to insufficient illumination.
  • Fig. 3 shows the data processor in the pen in greater detail.
  • the data processor 15 comprises a preprocessing unit 20, a threshold calculation unit 21 and a position determination unit 22.
  • the pre-processing unit 20 comprises in this case a hardware circuit (ASIC) which records a current greyscale image I from the area sensor 14, obtains a threshold matrix T from the threshold calculation unit 21, and generates a binary image B.
  • the luminance value of each pixel in the current image is compared with an associated threshold value in the threshold matrix T. If the luminance value is greater than the threshold value, the corresponding luminance value in the binary image is set to one (1), otherwise to zero (0).
  • the output binary image B thus contains dark objects (value 0) which ideally constitute the marks, against a bright background (value 1) .
  • the pre-processing unit 20 also contains a statistics module which generates image statistical data S for given subareas or partitions in the current greyscale image I.
  • This image statistical data S is stored in a memory 23, from which the threshold calculation unit 21 can obtain the relevant image statistical data S when it is to commence the calculation of a new threshold matrix T.
  • the threshold calculation unit 21 thus generates the threshold matrix T based on this image statistical data S, as will be described in greater detail below.
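A minimal software model of this pre-processing pass might look as follows: each pixel of the stream is compared with the threshold of its subarea, while per-subarea maximum and minimum values are accumulated as the image statistical data S. This is only a sketch of the behaviour described for the pre-processing unit 20, the memory 23 and the statistics module; the actual hardware circuit is not specified here, and 8-bit luminance and image dimensions divisible by the subarea size are assumed.

```python
import numpy as np

def binarize_and_collect(grey: np.ndarray, t_matrix: np.ndarray, sub: int):
    """One pass over the pixel stream: each pixel is compared with the threshold
    of its subarea (greater -> 1 = background, otherwise 0 = object), while the
    greatest and least luminance value of every subarea are accumulated as the
    image statistical data S for the next threshold matrix."""
    h, w = grey.shape
    rows, cols = h // sub, w // sub
    binary = np.zeros((h, w), dtype=np.uint8)
    s_max = np.zeros((rows, cols), dtype=np.int32)
    s_min = np.full((rows, cols), 255, dtype=np.int32)   # assumes 8-bit luminance
    for y in range(rows * sub):
        for x in range(cols * sub):
            r, c = y // sub, x // sub
            v = int(grey[y, x])
            binary[y, x] = 1 if v > t_matrix[r, c] else 0   # thresholding
            if v > s_max[r, c]:
                s_max[r, c] = v                             # statistics module
            if v < s_min[r, c]:
                s_min[r, c] = v
    return binary, s_max, s_min
```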
  • the position determination unit 22 receives the binary image B from the pre-processing unit 20, identifies the marks in the binary image and calculates position coordinates (x,y) on the basis of the positions of the marks in relation to the virtual raster.
  • the threshold calculation unit 21 and the position determination unit 22 consist of software which is executed in a microprocessor (not shown).
  • the decoding in the position determination unit will not be described here in greater detail, as the present invention relates to the preliminary processing step, more particularly the binarisation of the greyscale images I.
  • a first embodiment is based on a partitioning of greyscale images, each of which contains 98 x 126 pixels, into 63 (7 x 9) square subareas I_s, each of which contains 14 x 14 pixels, as indicated by thin lines in Fig. 4A.
  • This division is used on the one hand by the statistics module for the generation of statistical data S, and, on the other, by the threshold calculation unit 21 for the calculation of the threshold matrix T.
  • the size of the subareas is set with knowledge of the images that are to be binarised. In the greyscale images the distance between the raster lines is known, in the present case approximately 7.5 pixels. On the basis of this information, the size of the subareas I_s can be selected in such a way that each subarea I_s with great certainty contains at least a part of a mark 2, as also shown in Fig. 4A. Since the images are taken using the hand-held apparatus 10, which is used like a pen, it must also be taken into account that the angle between the pen and the base, that is the perspective in the images, can vary depending upon the writing posture of the user.
  • a raster is used with a distance between the raster lines of approximately 300 μm, together with circular dots with a displacement and a radius of approximately 50 μm.
  • Square subareas I_s with a side length of approximately 120% of the raster interval should then be sufficiently large to guarantee that each subarea I_s in each image I contains at least a part of a dot. Also taking into account the varying perspective, side lengths in the range 150% - 300% of the raster interval have been found to give satisfactory results.
  • the subareas I_s are thereby so large that they essentially always contain at least one dot in its entirety, which simplifies the production of an adequate threshold value for each subarea.
  • the upper limit for the size of the subareas I_s is given by the lowest acceptable resolution of the threshold matrix, which depends among other things upon the spatial size of the luminance variations in the images. It is also possible to make the subareas I_s smaller, in order to increase the resolution of the threshold matrix. In this case, certain subareas will only contain background and thus will not contain part of any mark, for which reason a correct threshold value cannot be calculated. These subareas should therefore be identified and allocated a correction value which, for example, is calculated on the basis of the threshold values for the surrounding subareas.
  • the statistics module derives statistical data S in the form of the greatest luminance value (max) and the least luminance value (min) within each subarea I_s.
  • a threshold value Ti is calculated for each subarea. This threshold value is stored in the threshold matrix T, as illustrated in Fig. 4A. The threshold value Ti is calculated as a function of the contrast (max - min) within the subarea: Ti = max - ki (max - min), i.e. it is set at a relative distance or contrast depth ki from the background luminance value (max).
  • each subarea should advantageously contain at least one mark in its entirety, so that the actual luminance depth of the mark can be used in the calculation of the threshold value.
  • the factor ki determines at which contrast depth the threshold value is to be set. In order to reduce the effect of noise and lack of sharpness, the contrast depth factor ki is set on the basis of a classification of each subarea into the classes "sharp", "lacking in contrast" and "noisy".
  • The noise limit value is defined as the mean value of the greatest luminance values in all the subareas, which can be calculated in a simple way from said image statistical data S.
  • the contrast limit value is defined as the mean value of the contrast in all the subareas in the current image, which can also be calculated in a simple way from said image statistical data S.
  • Fig. 5A shows examples of the classification of the subareas in a diagram of the luminance as a function of pixels.
  • the subarea I is "sharp", the subarea II is "lacking in contrast", and the subarea III is "noisy".
  • the threshold values Ti for the respective subareas are also indicated by broken lines. It has been found advantageous to set the factor ki to a value in the range 0.6-0.8 in noisy subareas, 0.3-0.5 in subareas lacking in contrast and 0.4-0.6 in sharp subareas. According to one embodiment, the central value in each of the above ranges is used.
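The classification and threshold calculation of this first embodiment can be sketched as follows. The central values of the stated ranges are used for ki, the noise and contrast limit values are the means described above, and the order in which the two limit tests are applied (noisy taking precedence) is an assumption made only for this example.

```python
import numpy as np

def thresholds_embodiment1(grey: np.ndarray, sub: int = 14) -> np.ndarray:
    """Per-subarea thresholds Ti = max - ki * (max - min), with ki chosen from
    the classification of each subarea as "noisy", "lacking in contrast" or
    "sharp" (central values of the ranges given in the text)."""
    h, w = grey.shape
    rows, cols = h // sub, w // sub
    blocks = grey[:rows * sub, :cols * sub].reshape(rows, sub, cols, sub).swapaxes(1, 2)
    s_max = blocks.max(axis=(2, 3)).astype(np.float32)
    s_min = blocks.min(axis=(2, 3)).astype(np.float32)
    contrast = s_max - s_min

    noise_limit = s_max.mean()        # mean of the greatest luminance values
    contrast_limit = contrast.mean()  # mean of the contrasts

    k = np.full((rows, cols), 0.5, dtype=np.float32)   # "sharp": ki in 0.4-0.6
    k[contrast < contrast_limit] = 0.4                 # "lacking in contrast": 0.3-0.5
    k[s_max < noise_limit] = 0.7                       # "noisy": 0.6-0.8 (assumed precedence)
    return s_max - k * contrast

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    img = rng.integers(100, 230, size=(98, 126)).astype(np.float32)
    print(thresholds_embodiment1(img).shape)   # (7, 9): one threshold per 14 x 14 subarea
```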
  • the threshold value is set relatively far from the background luminance (max) in noisy subareas, which are typically to be found at the periphery of the image. This reduces the risk of the threshold value being set at the level of the background noise, which could result in the thresholding generating a binary image with many small fictitious structures.
  • the noisy subareas can undergo a supplementary contrast control, for example by the threshold value being set to zero in the noisy subareas that have a contrast similar to typical noise levels.
  • In subareas lacking in contrast, the threshold value is set relatively close to the background luminance (max).
  • This increases the probability of associated marks being identified as structures with a plurality of connected pixels, which in turn provides a better estimation of the position of the mark in the subsequent decoding in the position determination unit 22.
  • this type of subarea is in fact overexposed, for which reason the luminance depth of the marks is only given by one or a few pixels, as indicated in Fig. 5A.
  • the threshold matrix T contains a threshold value Ti per subarea I_s (cf. Fig. 4A).
  • the classification of the subareas can be made more sophisticated. For example, histograms or standard deviations can be used to identify noise, mean values can be used to identify background luminance, etc.
  • An advantage of the use of the minimum and the maximum for each subarea is, however, that these values can be extracted from a greyscale image in a calculation-efficient way. In addition, the number of calculation steps is minimised in the production of the threshold matrix.
  • the mean value of the luminance values of the pixels is calculated within the respective subarea, after which the threshold value is calculated on the basis of this mean value and of the contrast in the subarea in question, in accordance with:
  • f(m) is a function of the luminance mean value mi in the subarea in question and has a value in the range 0 - 1.
  • Fig. 5B shows an example of the appearance of the function f(m).
  • f(m) is a continuous and monotonically decreasing function of the luminance mean value of the subarea.
  • Low luminance mean values give both low contrast and low signal-to-noise ratio, as indicated by the luminance histogram (i) , for which reason the value of the function is set close to 1 (corresponding to the above class "noisy").
  • the signal-to-noise ratio and the contrast tend to increase gradually, as indicated by the luminance histograms (ii) and (iii) , for which reason the value of the function is allowed to decrease to a corresponding extent.
  • f(m) becomes approximately 0.5 (corresponding to the above class "sharp").
  • the contrast is again reduced (see the luminance histogram (iv)) as a result of overexposure of the subarea (corresponding to the above class "lacking in contrast").
  • the function f(m) is here set to a value close to 0, that is the threshold value becomes close to the estimated background luminance.
  • The function shown in Fig. 5B can be modified.
  • For example, it can consist of more segments (classes), with other breakpoints and slopes.
  • the function can be a curve, as given by a second degree equation or the like.
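One possible realisation of such a function is a piecewise-linear f(m); the breakpoints below are purely illustrative, and the way f(m) is combined with the per-subarea maximum and minimum (mirroring the first embodiment) is likewise an assumption, since the exact expression is not reproduced here.

```python
import numpy as np

# Illustrative breakpoints only: the text requires f(m) to be continuous and
# monotonically decreasing, close to 1 for dark (noisy) subareas, about 0.5 in
# the well-exposed middle range and close to 0 for overexposed subareas.
M_POINTS = np.array([0.0, 60.0, 150.0, 220.0, 255.0])
F_POINTS = np.array([1.0, 0.9, 0.5, 0.1, 0.05])

def f_of_mean(mean_luminance: np.ndarray) -> np.ndarray:
    """Piecewise-linear contrast depth factor f(m) in the range 0-1."""
    return np.interp(mean_luminance, M_POINTS, F_POINTS)

# Used in place of the class-based factor ki, e.g. Ti = max - f(mi) * (max - min)
# (an assumed combination, carried over from the first embodiment).
print(f_of_mean(np.array([40.0, 150.0, 240.0])))  # ~0.93, 0.5, ~0.07
```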
  • the same subareas are used for estimating both background luminance and object luminance in the greyscale image.
  • subareas of different sizes are instead used for estimating the background luminance and object luminance of the greyscale image.
  • the threshold matrix is thus calculated based on image statistical data for two different sets of subareas, object subareas and background subareas.
  • the object subareas and the background subareas overlap each other and cover all that part of the image that is to be binarised.
  • the object subareas correspond in size to the subareas that are used in the first embodiment, that is they are so large that they contain with certainty at least a part of a mark.
  • the background subareas can, however, be made smaller, as they only need to be large enough to contain with certainty pixels that are representative of the image's local background luminance, that is they should be larger than each mark in the image. Any enlargement as a result of the effects of perspective should be taken into account.
  • Fig. 4B shows an example of the partitioning into object subareas and background subareas.
  • This partitioning is suited to the same coding pattern as the first embodiment, but for greyscale images consisting of 96 x 96 pixels.
  • Each greyscale image is divided into 64 (8 x 8) object subareas I_s,o, each of which contains 12 x 12 pixels, and into 256 (16 x 16) background subareas I_s,b, each of which contains 6 x 6 pixels.
  • the object subareas I_s,o are dimensioned to comprise a whole number of background subareas I_s,b (in this case four), whereby the calculation of the threshold matrix is simplified.
  • a threshold value can now be calculated for each background subarea, in accordance with:
  • the background luminance is estimated as the greatest luminance value within the background subarea and the object luminance as the least luminance value within the object subarea.
  • the threshold value can be calculated in alternative ways, as described in connection with the first embodiment above .
  • the threshold matrix is calculated based on a background matrix, which contains the background luminances estimated for the background subareas I_s,b, and an object matrix which contains the object luminances estimated for the object subareas I_s,o.
  • the object subareas I_s,o overlap a whole number of background subareas I_s,b, as the data that requires intermediate storage in the background matrix and the object matrix is thereby minimised.
  • the statistics module in the pre-processing unit 20 can be designed to generate separate image statistical data for the background subareas I_s,b and the object subareas I_s,o.
  • the image statistical data for the object subareas I_s,o can be calculated from the image statistical data for the background subareas I_s,b. For instance, this is the case in the example above, with estimation based on minimum and maximum values and with adjustment of the relative sizes of the background subareas and the object subareas.
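A sketch of this second embodiment is given below. The contrast depth factor k = 0.5 and the form of the threshold expression (background estimate minus k times the contrast) are carried over from the first embodiment as assumptions, and the example also shows how the object-subarea minima can be derived directly from the background-subarea minima, as noted above.

```python
import numpy as np

def thresholds_embodiment2(grey: np.ndarray, bg: int = 6, ratio: int = 2, k: float = 0.5) -> np.ndarray:
    """One threshold per background subarea (bg x bg pixels). The background
    luminance is the greatest value in the background subarea; the object
    luminance is the least value in the enclosing object subarea, which spans
    ratio x ratio background subareas. The combination T = max_b - k * (max_b -
    min_o) and the value of k are assumptions for this sketch."""
    h, w = grey.shape
    rows, cols = h // bg, w // bg
    blocks = grey[:rows * bg, :cols * bg].reshape(rows, bg, cols, bg).swapaxes(1, 2)
    bg_max = blocks.max(axis=(2, 3)).astype(np.float32)
    bg_min = blocks.min(axis=(2, 3)).astype(np.float32)

    # The object-subarea minima need no second pass over the image: they are the
    # minima of ratio x ratio blocks of background-subarea minima.
    orows, ocols = rows // ratio, cols // ratio
    obj_min = bg_min[:orows * ratio, :ocols * ratio] \
        .reshape(orows, ratio, ocols, ratio).swapaxes(1, 2).min(axis=(2, 3))
    # Expand each object minimum back over its ratio x ratio background subareas.
    obj_min_full = np.kron(obj_min, np.ones((ratio, ratio), dtype=np.float32))

    bgm = bg_max[:orows * ratio, :ocols * ratio]
    return bgm - k * (bgm - obj_min_full)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    img = rng.integers(180, 230, size=(96, 96)).astype(np.float32)
    img[30:33, 30:33] = 40                     # a dark mark
    T = thresholds_embodiment2(img)
    print(T.shape)                             # (16, 16): one value per background subarea
```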
  • Both embodiments described above result in a threshold matrix T containing a threshold value Ti per subarea I_s and I_s,b, respectively. It has, however, been found that the precision of the thresholding is improved if the threshold matrix is given additional threshold values by interpolation, between the threshold values calculated as defined above. Such additional threshold values can be created by linear interpolation of adjacent values in the threshold matrix. The linear interpolation is carried out in two steps, interpolation by rows and interpolation by columns. The threshold matrix interpolated in this way can then, if required, undergo a further interpolation. It should be noted that the relationship between the threshold values and the subareas is changed when the threshold matrix is given additional threshold values by means of interpolation.
  • each threshold value is now applicable to pixels within smaller thresholding areas of each image.
  • Each first calculated threshold value is suitably allocated to a thresholding area in the centre of its subarea, whereupon the new threshold values can be allocated to thresholding areas in between.
  • each such thresholding area has a size that is 1/4 or 1/16, respectively, of the size of the subarea.
  • An alternative method for improving the precision of the thresholding is to have the threshold matrix calculated in accordance with the embodiment above undergo a low-pass filtering, for example by convolution of the threshold matrix with a suitable 3 x 3 matrix.
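Both refinement options can be sketched compactly: a two-step linear interpolation (by rows, then by columns) that inserts new thresholds between the calculated ones, and a 3 x 3 mean-mask low-pass filter. The handling of the matrix border by replication is an assumption made for the example.

```python
import numpy as np

def interpolate_rows(t: np.ndarray) -> np.ndarray:
    """Insert a new threshold between every pair of horizontally adjacent
    values (linear interpolation by rows)."""
    rows, cols = t.shape
    out = np.zeros((rows, 2 * cols - 1), dtype=t.dtype)
    out[:, 0::2] = t
    out[:, 1::2] = (t[:, :-1] + t[:, 1:]) / 2.0
    return out

def enlarge(t: np.ndarray) -> np.ndarray:
    """One enlargement step: interpolation by rows, then by columns."""
    return interpolate_rows(interpolate_rows(t).T).T

def smooth_3x3(t: np.ndarray) -> np.ndarray:
    """Alternative refinement: low-pass filtering with a 3 x 3 mean mask
    (border values are replicated)."""
    padded = np.pad(t, 1, mode="edge")
    out = np.zeros_like(t, dtype=np.float64)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy: 1 + dy + t.shape[0], 1 + dx: 1 + dx + t.shape[1]]
    return out / 9.0

T = np.array([[100.0, 120.0], [140.0, 160.0]])
print(enlarge(T))      # 3 x 3 matrix with interpolated thresholds in between
print(smooth_3x3(T))   # low-pass filtered version of the original matrix
```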
  • Fig. 6 shows the luminance distribution along a line in a greyscale image, together with calculated threshold values Ti along this line.
  • the threshold values in Fig. 6 are produced according to the second embodiment with a subsequent enlargement of the threshold matrix by linear interpolation. In spite of large variations in background luminance, signal-to-noise ratio and contrast, the calculated threshold values Ti accord well with the luminance values along the line.
  • Fig. 7 shows a number of binary images B that have been generated by the thresholding of the greyscale images shown in Fig. 2.
  • the thresholding is carried out according to the second embodiment with a subsequent enlargement of the threshold matrix by linear interpolation.
  • a comparison of Figs 2 and 7 shows that the thresholding results in a satisfactory identification of the marks 2, even with variations in illumination, base properties and imaging conditions within the greyscale images.
  • a threshold matrix can obviously be calculated for one particular greyscale image and then used for the thresholding of the same with high precision.
  • the calculation of the threshold matrix can be carried out quickly, based on given image statistical data. It is estimated that it takes approximately 8000 clock cycles for the calculation of the threshold matrix according to the second embodiment, that is with a background matrix estimated for 16 x 16 subareas, an object matrix estimated for 8 x 8 subareas and a mean value matrix estimated for 8 x 8 subareas. For an 80 MHz processor, this corresponds to a calculation time of 100 μs.
  • the algorithms described above can permit a further increase of the throughput of images I through the data processor 15, by carrying out the calculation of the threshold matrix in parallel with the actual thresholding.
  • the threshold matrix is thus calculated on the basis of a preceding image, which is similar to the subsequent image or images to be thresholded by use of this calculated threshold matrix. This can be regarded as if the threshold matrix is periodically updated and then used in thresholding one or more subsequent images. It has been found fully possible to use one and the same threshold matrix for the thresholding of a plurality of, for example 5 - 10, consecutive greyscale images. According to this embodiment, a given greyscale image can be thresholded at the same time as it is being read in from the sensor by the data processor 15.
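A pipeline of this kind could look roughly as follows: the threshold matrix is recalculated every few frames (here every 8 frames, a figure chosen only for illustration within the 5 - 10 range mentioned above) and reused for the frames in between, so every incoming frame is binarized at once against an already calculated matrix. The subarea size and the factor k are the same assumed values as in the earlier sketches.

```python
import numpy as np

def block_stats(img: np.ndarray, sub: int):
    """Per-subarea maximum and minimum (the image statistical data S)."""
    r, c = img.shape[0] // sub, img.shape[1] // sub
    blocks = img[:r * sub, :c * sub].reshape(r, sub, c, sub).swapaxes(1, 2)
    return blocks.max(axis=(2, 3)).astype(np.float32), blocks.min(axis=(2, 3)).astype(np.float32)

def process_sequence(frames, sub=16, k=0.5, update_every=8):
    """Intermittent update: recalculate the threshold matrix every
    update_every frames and reuse it for the frames in between."""
    t_full = None
    binaries = []
    for i, frame in enumerate(frames):
        if t_full is None or i % update_every == 0:
            s_max, s_min = block_stats(frame, sub)
            t = s_max - k * (s_max - s_min)                       # one threshold per subarea
            t_full = np.kron(t, np.ones((sub, sub), np.float32))  # expand to pixel resolution
        h, w = t_full.shape
        binaries.append((frame[:h, :w] > t_full).astype(np.uint8))
    return binaries

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    seq = [rng.integers(150, 230, size=(96, 96)).astype(np.float32) for _ in range(20)]
    out = process_sequence(seq)
    print(len(out), out[0].shape)   # 20 binary images of 96 x 96 pixels
```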
  • This thresholding can thus be implemented in hardware and in this way relieve the processor (not shown) which carries out the calculations in the threshold calculation unit 21 and the position determination unit 22.
  • This hardware can at the same time also generate the above-mentioned image statistical data S in order to relieve the processor still further.
  • the need for intermediate storage of the greyscale image is avoided as this can be processed by direct comparison with an already-calculated threshold matrix.
  • the algorithms according to the invention have a sufficient tolerance to variations in luminance and/or sharpness from image to image.
  • the threshold matrix T is calculated on the basis of image statistical data for given subareas in the greyscale images and thereby contains threshold values that are related to the overall luminance distribution in the images, both with regard to the background and to the object.
  • the threshold matrix contains both global information which is relevant for several consecutive images, and local information, which allows for the thresholding of each object in relation to its local surroundings.
  • As each subarea, which contains a plurality of pixels, is allocated a threshold value in the threshold matrix, the effect of local variations is limited.
  • the size of the subareas is selected in such a way that the calculated threshold value is sufficiently insensitive to local variations in order to achieve a desired tolerance to variations in luminance and/or sharpness from image to image.
  • the preprocessing unit 20 is designed to buffer, before, after or during the generation of said image statistical data S, one or more associated greyscale images, for instance in the memory 23.
  • the threshold matrix T which is calculated by the threshold calculation unit 21 for a current greyscale image can thus be used by the preprocessing unit 20 for binarisation of the same current image, and optionally a subsequent image in the incoming sequence of images.
  • Before the binary images are analysed further for position determination, they can undergo an area check, with the aim of eliminating fictitious marks on the basis of the number of connected pixels within each mark. Accordingly, marks consisting of one or a few pixels can be assumed to originate from noise and can therefore be removed. As the maximal size of the marks is known, an upper area threshold can also be set.
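An area check of this kind can be sketched with a connected-component labelling, here using SciPy as one convenient option; the lower and upper area limits are illustrative assumptions, since the text only states that such limits exist.

```python
import numpy as np
from scipy import ndimage   # used only for connected-component labelling

def area_check(binary: np.ndarray, min_area: int = 3, max_area: int = 60) -> np.ndarray:
    """Remove fictitious marks from a binary image (1 = background, 0 = mark).
    Marks whose number of connected pixels lies outside [min_area, max_area]
    are set back to background."""
    marks = binary == 0                                   # dark objects
    labels, n = ndimage.label(marks)                      # label connected groups of mark pixels
    sizes = np.bincount(labels.ravel())                   # pixel count per label (label 0 = background)
    keep = (sizes >= min_area) & (sizes <= max_area)
    keep[0] = False                                       # never keep the background label
    cleaned = np.ones_like(binary)
    cleaned[keep[labels]] = 0                             # re-draw only the accepted marks
    return cleaned
```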
  • the above-mentioned contrast depth factor can, instead of being set at a predetermined value or being calculated on the basis of the classification of associated subareas, be given by an external process, such as a control loop.
  • the calculation of image statistical data can be carried out in the threshold calculation unit instead of in the pre-processing unit.
  • the data processor can be realised completely in hardware or completely in software.
  • subareas can be of any shape, such as square, rectangular, triangular, rhombic, hexagonal, etc.
  • the invention is in no way restricted to the described position-coding pattern, but can also be used for the identification and decoding of other position-coding patterns.
  • the raster described above can have other shapes than orthogonal, such as a rhombic grid, for example with 60 degree angles, a triangular or hexagonal grid, etc.
  • the marks can be displaced in other directions than along the raster lines.
  • the pattern is optically readable and the sensor is thus optical. It is recognised, however, that the images that are processed according to the invention can be generated in another way, for example by detection of chemical, acoustic, electromagnetic, capacitive or inductive parameters. Similarly, it is recognised that the invention can also be used for identification of bright marks against a dark background.
  • the invention can be used in general for identification of individual objects in a digital image in a quick and memory-efficient way, particularly when there are variations in luminance and/or sharpness within an image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

In a method for identifying individual objects in a digital image (I), a device is used that comprises a thresholding unit (20) and a threshold calculation unit (21). The thresholding unit (20) is designed to receive the digital image (I) and compare the luminance values of the digital image with threshold values (Ti) of a threshold matrix (T) in order to create a binary image on the basis of the comparison. The threshold calculation unit (21) is designed to estimate, on the basis of a first and second division of a reference image which corresponds to the digital image (I) into first and second subareas, a background luminance value for each first subarea and an object luminance value for each second subarea. For each overlapping first and second subarea, the threshold calculation unit (21) calculates a threshold value (Ti) on the basis of the associated background and object luminance values, and creates the threshold matrix (T) from said threshold values (Ti). The device and the method permit quick and memory-efficient identification of the individual objects, even if the digital image (I) contains variations in luminance and/or sharpness. A computer program product, a hardware circuit and a hand-held apparatus for position determination are also described.

Description

PROCESSING OF DIGITAL IMAGES
Field of the Invention
The present invention relates in general to processing of digital images, in particular thresholding or binarization thereof. The invention is particularly, but not exclusively, aimed at preliminary image processing prior to calculation of position information on the basis of the shape and/or location of an object in a digital image. Background Art
In digital image processing, it is sometimes desirable to separate some form of structure from a background in a digital image. This can be achieved by so-called thresholding or binarization, in which the luminance values of the pixels of the digital image are compared to a threshold value. More particularly, luminance values above the threshold value are set to 1, while luminance values below the threshold value are set to 0, or vice versa. With a well-selected threshold value, the thresholding results in a binary image with defined, real structures.
In many cases, a sequence of images is processed in a number of steps. One of the introductory steps can be the above-mentioned thresholding, which aims on the one hand to locate relevant structures and, on the other, to reduce the amount of data that is processed in subsequent steps. Of course, it is desirable for the thresholding to be carried out with high precision, as errors will otherwise propagate in the subsequent processing steps.
In its simplest form, the thresholding means that the luminance value of each pixel in a current image is compared with a global threshold value. Such so-called global thresholding requires, however, extensive control of the image recording so as to avoid luminance variations and noise. In practical applications there are often variations within each image, for example with regard to the background luminance, signal-to-noise ratio and sharpness. With global thresholding such variations can lead to structures being missed or to fictitious structures being identified, particularly at the periphery of the images.
In order to solve these problems, so-called local thresholding is used, in which a threshold value is calculated for each pixel in an image on the basis of the luminance values of the surrounding pixels. This is, however, a time-consuming operation that, in addition, requires intermediate storage of the digital image.
An example where the above considerations arise is in calculating a position based on images of a pattern on a base. The pattern contains individual symbols, the shape and/or relative location of which code said position. The images can, for example, be recorded optically by a sensor in a hand-held apparatus, for example in the form of a pen. Such a pen for position determination is described, for example, in US-A-5 051 736, US-A-5 477 012 and WO 00/73983 and can be used to record handwritten information digitally.
The above-mentioned images can be processed in a data processing unit, such as a suitably programmed microprocessor, an ASIC, an FPGA, etc, which receives a sequence of digital greyscale images, converts these to binary for identification of the above-mentioned symbols, and calculates a position on the basis of each binarised image. During the binarization, a threshold matrix is used that contains a threshold value for each pixel in the greyscale image. The recording of handwritten information should be carried out at high temporal resolution, typically approximately 50-100 images per second, for which reason it is difficult to combine the requirements for high precision in the thresholding with the requirements for rapid processing and small memory requirement, even in a specially-adapted data processing unit. In the article "Threshold Selection Based on a Simple Image Statistic", published in 1985 in the periodical Computer Vision, Graphics and Image Processing, No. 30, pp 125-147, examples are given of various local thresholding methods. One such method is based on the fact that an adequate threshold value can be calculated based on a derivative value for each pixel in the original greyscale image. More particularly, a derivative matrix is calculated by convolution of the greyscale image with a suitable derivative mask, whereupon the pixel values of the derivative matrix are multiplied by the pixel values of the greyscale image to create a product matrix. Thereafter the derivative matrix and the product matrix are divided into subareas, within which a respective sum of the pixel values is calculated. Finally, local threshold values are obtained for the various subareas from the quotient between the subarea sums for the product matrix and the derivative matrix. The method is, however, time-consuming, among other things as a result of the calculation of the derivative matrix by convolution and the calculation of the threshold matrix by division. The method is also undesirably memory-intensive, as it requires the intermediate storage of the greyscale image, the derivative matrix and the product matrix.
US-A-5 764 611 describes a thresholding method to be applied to greyscale images containing a pattern of dark dots against a bright background. In this method, the greyscale image is divided into subareas, within which the pixel values are summed to create a sum matrix. A low-pass filter is then applied to this sum matrix to create a background matrix, which after multiplication by a suitable fraction value is considered to form a matrix of local threshold values. In addition to a time-consuming low-pass filtering, this thresholding method is hampered by being sensitive to lack of sharpness in the greyscale image. Such lack of sharpness must be eliminated by extensive pre-processing of the greyscale image. This is achieved by high-pass filtering of the incoming greyscale image, whereupon the resulting luminance values are summed within the above-mentioned subareas to create a contrast matrix. The contrast matrix is then used to produce coefficients for a subarea-specific contrast function, which is finally allowed to operate on the greyscale image in order to eliminate the lack of sharpness in the same. This pre-processing comprises several time-consuming operations and is, in addition, memory-intensive in that it requires the intermediate storage of both the greyscale image and the result of the high-pass filtering.
Prior-art technique also includes US-A-4 593 325, which describes a method for adaptive thresholding of a greyscale image prior to binary duplication of the same.

Summary of the Invention
An object of the present invention is thus to demonstrate a technique which permits the identification of individual objects in a digital image in a quick and memory-efficient way.
A further object is to demonstrate a technique that is relatively unaffected by variations in luminance and/or sharpness within an image.

These and other objects, which will be apparent from the following description, are achieved completely or partially by means of a method according to claim 1, a device according to claim 22, a computer program product according to claim 39, a hardware circuit according to claim 40 and a hand-held apparatus for position determination according to claim 41. Preferred embodiments are defined in the dependent claims.
According to the invention, a reference image is used which is representative of the digital image which is to be processed, for the calculation of the threshold matrix. This reference image is given two predetermined, overlapping divisions, into first and second subareas. For each first subarea, a background luminance value is estimated, and for each second subarea, an object luminance value is estimated. Based on these estimated values, a threshold value is calculated for each overlapping first and second subarea. The threshold values form a threshold matrix which is used for binarization of the digital image.
In one way of looking at the matter, the invention enables rapid extraction of a low resolution background image and a low resolution object image from the reference image, after which the threshold matrix is created by threshold values being determined, according to some criterion, to be between associated values in the background and object images. The method according to the invention is a direct method which can be carried out quickly and with relatively low demands as to available memory capacity, since the threshold values can be calculated directly, subarea by subarea, based on the estimated background and object luminance values. The method can be carried out without any calculation-intensive operations, such as convolutions and divisions, even if the method, when necessary, can be supplemented with such an operation, for instance for filtering.
By means of a method according to the invention, individual objects can be identified in a digital image also with variations in luminance and/or sharpness within the same. The estimated background luminance values represent the variation in background luminance over the reference image, or at least a relevant portion thereof, which means that the threshold values can be determined in adequate relation to the background. The estimated object luminance values represent, in combination with the estimated background luminance values, the variation in contrast over the reference image or said portion, which means the threshold values can also be determined in adequate relation to the sharpness.

The division into first and second subareas makes it possible to easily adjust the method to achieve the desired calculation speed, robustness or precision, by selecting a suitable size of the first and second subareas. However, it should be pointed out that the division of the reference image, or a portion thereof, into subareas does not have to be a physical division. The subareas are as a rule intended and used for extraction of data from the reference image.

The first and second subareas conveniently extend over one and the same continuous portion of the reference image, and in particular so that each second subarea at least partly comprises a first subarea. This guarantees that a threshold value can be calculated for each subarea.
According to one embodiment, the first subareas are mutually exclusive and the second subareas are mutually exclusive. Such first and second subareas do not overlap, and are suitably arranged side by side within the relevant portion of the reference image. This minimises the number of subareas of a given size which is necessary within a given portion of the reference image, which may be favourable for both calculation speed and memory capacity. In its simplest form, the first and second subareas may be identical, as regards size as well as relative position, i.e. the first and second subareas coincide.

According to an alternative embodiment, the first subareas partly overlap each other, and/or the second subareas partly overlap each other. Such an embodiment makes it possible to calculate the threshold matrix with greater accuracy and higher resolution. Moreover, the appearance of artefacts in the joint between subareas is minimised.
In the above embodiments, the size of the first and second subareas may be adjusted to optimal estimation of the background luminance values and object luminance values respectively, as will be discussed in more detail below.
During the method according to the invention, threshold values are generated in a threshold matrix. The term threshold matrix is to be interpreted figuratively to relate to an assembly of threshold values which are related to one part each of the reference image. Such a threshold matrix can be stored in a memory unit as a matrix, a plurality of vectors, or an assembly of individual threshold values. Alternatively, the calculated threshold values may be written one upon the other in one and the same memory space in the memory unit, or be given as a sequence of values which is directly used for binarisation of the digital image.

The above-mentioned reference image may be any image which is representative of the digital image which is to be binarised. When the digital image is included in a sequence of digital images, the reference image may consist of an image in this sequence of digital images. According to one embodiment, the reference image consists of the digital image which is to be binarised. In this case the digital image thus is received, after which the background and object luminance values are estimated for the subareas and the threshold matrix is calculated. Then the threshold matrix is used for binarisation of the digital image. Such an embodiment, with intermediate storage of the digital image, is robust and allows accurate binarisation.
According to an alternative embodiment, the threshold matrix is calculated intermittently on the basis of the luminance values of a current image in the sequence of digital images, which threshold matrix can then be applied for the binarisation of one or more subsequent images in the sequence of digital images. In this way, the calculation of the threshold matrix can be carried out in parallel with the actual thresholding, whereby more images can be processed per unit of time. In addition, the need for intermediate storage of the digital images is avoided, as these can be processed by direct comparison with an already-calculated threshold matrix. This is due to the fact that the algorithms according to the invention are sufficiently robust to permit calculation of the threshold values from a reference image which is similar but not identical to the image to which the thresholding is to be applied.
The method according to the invention can be based on a simple and calculation-efficient estimate of the background and object luminance values for each subarea.
According to an alternative, the background luminance value is estimated on the basis of first order statistics of the luminance values of the pixels within the first subarea. First order statistics, for example comprising the least value, the greatest value, the median value, the mean value and the sum of the pixels' luminance values within a subarea, can be extracted from a greyscale image in a calculation-efficient way. Similarly, the object luminance value can be estimated on the basis of first order statistics of the luminance values of the pixels within the second subarea.
According to one embodiment, which is based on greyscale images with dark objects against a bright background, the background luminance value is estimated on the basis of the greatest luminance value of the pixels within the first subarea. Such a luminance value can be extracted from a greyscale image quickly and in a calculation-efficient way. According to one alternative, the background luminance value is estimated on the basis of the mean value of the luminance values of the pixels within the first subarea. According to another alternative, the background luminance value is estimated on the basis of a percentile value. In a corresponding way, the object luminance value can be estimated on the basis of the least luminance value of the pixels within the second subarea.
According to the invention, the subareas can be designed so that each of the second subareas comprises a whole number of the first subareas, whereby the threshold matrix is minimised since this only needs to contain one threshold value for each of the first subareas.
According to one embodiment, the second subareas are designed in such a way that they each contain at least a part of at least one of the objects that are to be identified. Accordingly, each second subarea contains with certainty at least one value of the luminance within an object. All threshold values in the resultant threshold matrix will thereby be related both to the objects and to the background against which the objects appear, and this is achieved without a separate selection for the elimination of threshold values belonging to subareas that do not contain any part of an object. It is preferable that the second subareas are designed in such a way that they each contain at least one object in its entirety, which guarantees that each second subarea contains the object's extreme value in luminance. This simplifies the calculation of an adequate threshold value.
The image is preferably divided into first subareas that are larger than the objects that are to be identified, whereby each first subarea contains with certainty at least one value of the background luminance between the objects, in relation to which the threshold value can be set.
In one application, the objects are positioned relative to an invisible raster or grid of known dimensions. This application can serve for the calculation of positions described by way of introduction, for digital recording of handwritten information. As mentioned, a position is coded by the shape and/or location of one or more individual objects in an image. In addition to the references given above, the position-coding pattern can be mentioned that is described in Applicant's patent publications WO 01/16691, WO 01/26032 and WO 01/26033, which are herewith incorporated by reference. This position-coding pattern is constructed of marks, for example dots, which are located at a fixed distance from raster dots belonging to an invisible raster. The value of each mark is provided by its position relative to the associated raster dot. A plurality of such marks together code a position that is given by two or more coordinates.

All of the position-coding patterns stated above contain objects that are a known distance apart, which is given by the distance between the raster dots and the distance between the objects and the raster dots. The subareas can thereby be designed in relation to the known dimensions of the raster, for example in such a way that each subarea contains at least a part of at least one object or at least one object in its entirety. If the current image is recorded with a certain inclination of the sensor relative to the position-coding pattern, the distance between the objects will vary across the image, for which reason this type of perspective distortion must be taken into account when dimensioning the subareas.

In this connection, WO 00/73983 should also be mentioned, which describes a position-coding pattern containing marks of two different sizes, where each mark is centred at a respective raster dot of an invisible raster. In addition, US-A-5 221 833 describes a coding pattern containing marks in the form of lines with different angles of rotation around their dot of symmetry, where each such mark is centred at a respective raster dot of an invisible raster. Also in these cases, it is possible to size the subareas taking into account the dimensions of the raster, so that each subarea contains with certainty at least a part of an object or an object in its entirety.

According to the invention, there can also be a classification of the subareas into at least a first category with a high signal-to-noise ratio and a second category with a low signal-to-noise ratio. The classification is used for calculating the threshold value for each subarea, by setting the threshold value at a larger relative distance or contrast depth from the background luminance value in subareas belonging to the second category than in subareas belonging to the first category. As a result, the danger of the threshold value being at the level of noise in the proximity of the object is reduced. Such a threshold value could generate a number of fictitious structures around the actual object. In one exemplary embodiment, the threshold value for subareas belonging to the first and second categories is set at a relative distance to the background luminance value of approximately 40-60% and approximately 60-80% of the contrast, respectively.
In addition, the subareas can be further classified into a third category with a high signal-to-noise ratio and a low contrast. In many cases, the subareas belonging to the third category are overexposed, for which reason the luminance depth of the object is given by only one or a few pixels. In order to increase the probability of objects in these subareas being identified as a plurality of connected pixels, it has been found suitable to set the threshold value at a smaller relative distance or contrast depth from the background luminance value in subareas belonging to the third category than in subareas belonging to the first category. In an exemplary embodiment, the threshold value for a subarea belonging to the third category is set at a relative distance to the background luminance value of approximately 30-50% of the contrast.

The classification of the subareas with regard to their signal-to-noise ratio can be carried out based on a statistical variation measure, for example normalised standard deviation. For images comprising dark objects against a bright background, however, the signal-to-noise ratio is related to the background luminance value. Therefore, in a calculation-efficient embodiment, the classification can be carried out by comparison of the greatest luminance value within the subarea and a limit level. This limit level can, for example, correspond to the mean value of the greatest luminance values of all the subareas. Alternatively, the limit level can, for example, consist of a predetermined fixed value.
According to one embodiment, a characteristic level is estimated for the luminance values within each subarea, where the relative distance between the threshold value and the background luminance value is set depending upon this characteristic level, which suitably is indicative of the signal-to-noise ratio in the subarea. The characteristic level can also indicate any overexposure of a subarea, with the ensuing danger of reduced contrast. Thus the threshold value can be adapted to the conditions in the subarea, as indicated by the characteristic level. For example, the characteristic level can be produced on the basis of the mean value, the median value or the sum of the luminance values of the pixels within a subarea. In certain cases, the least or the greatest luminance value within a subarea can represent the characteristic level of the subarea. For greyscale images of dark objects against a bright background, it is preferable for the relative distance between the threshold value and the background luminance value to be a monotonically decreasing function of the characteristic level.

The calculated threshold value can be more or less representative of the pixels within the current subarea. In particular, the threshold value can be less representative at the edges of the subarea. As the threshold value changes in steps at these edges, unwanted artefacts can arise in the thresholding of the digital image. In order to minimise this potential problem, a subsequent smoothing step is preferably implemented, in which each calculated threshold value is updated on the basis of adjacent calculated threshold values in the threshold matrix. In one embodiment, the threshold matrix is given further threshold values in the smoothing step, by interpolation of adjacent calculated threshold values in the threshold matrix. In the interpolation, the threshold matrix is thus given further threshold values which are used for the thresholding of an associated part of the greyscale image. The interpolation can be of any kind, for example linear, and can be implemented in one or more steps.
According to one embodiment, the smoothing step comprises, alternatively or additionally, a low-pass filtering of the threshold matrix.
The invention also relates to a device, a computer program product and a hardware circuit for the identification of individual objects in a digital image, and a hand-held apparatus for position determination.
The advantages of the device, the computer program product, the hardware circuit and the hand-held apparatus will be apparent from the above description. Features that are described in connection with the method for identifying individual objects are of course also applicable to the device, the computer program product, the hardware circuit and the hand-held apparatus.

Brief Description of the Drawings
For the purpose of exemplification, the invention will be described below with reference to the accompanying drawings, which illustrate a currently preferred embodiment and in which
Fig. 1A shows schematically an example of 4 x 4 marks that are used to code a position, and Fig. 1B shows schematically a digital pen that is used to detect the marks in Fig. 1A and to calculate a position on the basis of these,

Fig. 2 shows greyscale images, recorded by the pen in Fig. 1B, of a position-coding pattern of the type shown in Fig. 1A,
Fig. 3 shows schematically the construction of a data processor in the pen in Fig. 1B,
Figs 4A - 4B illustrate schematically the division of an image into subareas for the calculation of a threshold matrix according to a first and a second embodiment, respectively,

Fig. 5A illustrates schematically the classification of objects in the image, and Fig. 5B illustrates schematically a generalisation of the embodiment in Fig. 5A,
Fig. 6 shows luminance values along a line in a greyscale image, in which associated threshold values in the threshold matrix are indicated by broken lines, and
Fig. 7 shows the greyscale images in Fig. 2 after the thresholding thereof according to the invention.

Description of Preferred Embodiments
The description below concerns position determination based on images of a position-coding pattern. The position-coding pattern can be of any type, for example any one of the patterns mentioned by way of introduction. In the following, however, the invention is exemplified in connection with the pattern that is described in Applicant's Patent Publications WO 01/16691, WO 01/26032 and WO 01/26033. This pattern will be described briefly below with reference to Fig. 1A.
The position-coding pattern comprises a virtual raster or grid 1, which is thus neither visible to the human eye nor can be detected directly by a device which is to determine positions on the surface, and a plurality of marks 2, each of which, depending upon its position, represents one of four values "1" to "4". The value of the mark 2 depends upon where it is placed in relation to its nominal position 3. The nominal position 3, which can also be called a raster dot, is represented by the intersection of the raster lines. In one embodiment, the distance between the raster lines is 300 μm and the angle between the raster lines is 90 degrees. Other raster intervals are possible, for example 254 μm to suit printers and scanners which often have a resolution which is a multiple of 100 dpi, which corresponds to a distance between dots of 25.4 mm/100, that is 254 μm.
In the example in Fig. 1A, there are four possible locations, one on each of the raster lines extending from the nominal position 3. The displacement from the nominal position 3 is the same size for all values. Each mark 2 is, at its centre of gravity, displaced relative to its nominal position 3, that is no mark is located at the nominal position. In addition, there is only one mark 2 per nominal position 3.
In one embodiment, the marks 2 are displaced relative to the nominal positions 3 by 50 μm along the raster lines. The displacement is preferably 1/6 of the raster interval, as it is then relatively easy to determine to which nominal position a particular mark belongs. The displacement should be at least approximately 1/8 of the raster interval, otherwise it becomes difficult to determine a displacement, that is the requirements for resolution become great. On the other hand, the displacement should be less than approximately 1/4 of the raster interval, in order for it to be possible to determine to which nominal position a mark belongs.
Each mark 2 consists of a more or less circular dot with a radius which is approximately the same size as the displacement or somewhat less. The radius can be 25% to 120% of the displacement. If the radius is much larger than the displacement, it can be difficult to determine the raster lines. If the radius is too small, a greater resolution is required to record the marks. The marks do not, however, need to be circular or round, but any suitable shape can be used, such as square, triangular, elliptical, open or closed, etc.

The pattern described above can be designed to code a very large number of absolute positions. For example, the pattern can be such that 6 x 6 adjacent marks together code a position, in the form of an x-coordinate and a y-coordinate. If a subset of the pattern is applied to a product, an electronic representation can be obtained of what is written or drawn on the product using a pen, by continually determining the position of the pen on the product by reading off the local combination of marks. This reading off can be carried out by optical detection.

Fig. 1B shows a hand-held apparatus 10, below called a pen, that is used for optical detection of the position-coding pattern in Fig. 1A. In the following the pen's main components will be described briefly. For a more detailed description, reference is made to the above-mentioned WO 01/16691, WO 01/26032 and WO 01/26033.
The pen 10 has a casing 11 in the shape of a pen, which has an opening 12 at one end. This end is intended to abut against or to be held a short distance from the surface on which the position determination is to be carried out.
One or more infrared light-emitting diodes 13 are arranged in the opening 12 for illuminating the surface area which is to be imaged, and an area sensor 14, sensitive to infrared light, for example a CCD or CMOS sensor, is arranged for recording a two-dimensional image of the surface area.
The area sensor 14 is connected to a data processor 15 which is arranged to determine a position on the basis of the image recorded by the sensor 14. The data processor 15 can contain one or more processors (not shown), programmed to record images from the sensor 14 or from a buffer memory (not shown) associated with the sensor, and to carry out position determination on the basis of these images.
The pen 10 also has a pen point 16 which deposits pigment ink on the product. Using this, the user can write physically on the product, while at the same time what is being written is recorded digitally via optical detection of the position-coding pattern. The pigment ink is suitably transparent to infrared light, while the marks 2 of the position-coding pattern (Fig. 1A) absorb infrared light. This means that the pigment ink does not interfere with the detection of the pattern.
When the pen 10 is passed over the position-coding pattern, the area sensor 14 thus records a sequence of digital greyscale images which are transmitted to the data processor 15 for position determination. Fig. 2 shows examples of such greyscale images I. These contain 96 x 96 pixels, the luminance values of which are given with 8-bit resolution. To achieve an adequate temporal resolution for the digitally recorded information, the images are read off from the area sensor 14 at a frequency of approximately 100 Hz, that is approximately 10 ms is available per image I for calculating a position.

In the images in Fig. 2, the marks 2 appear as dark dots against a bright background. Normally each mark or object covers several pixels in the image. The sharpness varies within the image as a result of the pen and thus the area sensor being angled against the base when writing down information. The contrast can also vary within the image as a result of uneven scattering properties of the base. In addition, the illumination of the base is uneven. In general, the images are well illuminated at their central parts, with, however, varying image sharpness, while the peripheral parts have low signal-to-noise ratios, due to insufficient illumination. In addition to said variations in sharpness, contrast, signal-to-noise ratio and illumination within each image, there are corresponding variations between different images, as the angle of inclination of the pen varies with time while information is being written down, and also between different users and different bases.

Fig. 3 shows the data processor in the pen in greater detail. The data processor 15 comprises a pre-processing unit 20, a threshold calculation unit 21 and a position determination unit 22. The pre-processing unit 20 comprises in this case a hardware circuit (ASIC) which records a current greyscale image I from the area sensor 14, obtains a threshold matrix T from the threshold calculation unit 21, and generates a binary image B. In this thresholding or binarisation, the luminance value of each pixel in the current image is compared with an associated threshold value in the threshold matrix T. If the luminance value is greater than the threshold value, the corresponding luminance value in the binary image is set to one (1), otherwise to zero (0). The output binary image B thus contains dark objects (value 0) which ideally constitute the marks, against a bright background (value 1).
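The pixel-by-pixel comparison described above is straightforward to express in software. The following is a minimal sketch in Python, not the actual hardware implementation; the function name and the assumption that each threshold value governs a square cell of the image are illustrative only:

import numpy as np

def binarize(grey: np.ndarray, thresholds: np.ndarray, cell: int) -> np.ndarray:
    # grey: greyscale image (e.g. 96 x 96, 8-bit luminance values)
    # thresholds: threshold matrix with one value per cell x cell subarea
    # Returns a binary image: 1 for background, 0 for the dark objects.
    rows, cols = grey.shape
    binary = np.empty((rows, cols), dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            t = thresholds[r // cell, c // cell]  # threshold of the subarea the pixel falls in
            binary[r, c] = 1 if grey[r, c] > t else 0
    return binary

In a vectorised implementation the explicit loops would of course be replaced by a single array comparison against an expanded threshold matrix, but the element-wise formulation above corresponds most directly to the comparison carried out in the pre-processing unit.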
In the embodiment described, the pre-processing unit 20 also contains a statistics module which generates image statistical data S for given subareas or partitions in the current greyscale image I. This image statistical data S is stored in a memory 23, from which the threshold calculation unit 21 can obtain the relevant image statistical data S when it is to commence the calculation of a new threshold matrix T. The threshold calculation unit 21 thus generates the threshold matrix T based on this image statistical data S, as will be described in greater detail below.
The position determination unit 22 receives the binary image B from the pre-processing unit 20, identifies the marks in the binary image and calculates position coordinates (x,y) on the basis of the positions of the marks in relation to the virtual raster. The threshold calculation unit 21 and the position determination unit 22 consist of software which is executed in a microprocessor (not shown). The decoding in the position determination unit will not be described here in greater detail, as the present invention relates to the preliminary processing step, more particularly the binarisation of the greyscale images I.

Embodiment 1
A first embodiment is based on a partitioning of greyscale images, each of which contains 96 x 126 pixels, into 63 (7 x 9) square subareas Is, each of which contains 14 x 14 pixels, as indicated by thin lines in Fig. 4A. This division is used on the one hand by the statistics module for the generation of statistical data S, and, on the other, by the threshold calculation unit 21 for the calculation of the threshold matrix T.
The size of the subareas is set with knowledge of the images that are to be binarised. In the greyscale images the distance between the raster lines is known, in the present case approximately 7.5 pixels. On the basis of this information, the size of the subareas Is can be selected in such a way that each subarea Is with great certainty contains at least a part of a mark 2, as also shown in Fig. 4A. Since the images are taken using the hand-held apparatus 10, which is used like a pen, it must also be taken into account that the angle between the pen and the base, that is the perspective in the images, can vary depending upon the writing posture of the user.
In the present case, a raster is used with a distance between the raster lines of approximately 300 μm, together with circular dots with a displacement and a radius of approximately 50 μm. Square subareas Is with a side length of approximately 120% of the raster interval should then be sufficiently large to guarantee that each subarea Is in each image I contains at least a part of a dot. Also taking into account the varying perspective, side lengths in the range 150% - 300% of the raster interval have been found to give satisfactory results. The subareas Is are thereby so large that they essentially always contain at least one dot in its entirety, which simplifies the production of an adequate threshold value for each subarea. The upper limit for the size of the subareas Is is given by the lowest acceptable resolution of the threshold matrix, which depends among other things upon the spatial size of the luminance variations in the images.

It is also possible to make the subareas Is smaller, in order to increase the resolution of the threshold matrix. In this case, certain subareas will only contain background and thus will not contain part of any mark, for which reason a correct threshold value cannot be calculated. These subareas should therefore be identified and allocated a correction value which, for example, is calculated on the basis of the threshold values for the surrounding subareas.
In the present exemplary embodiment, the statistics module derives statistical data S in the form of the greatest luminance value (max) and the least luminance value (min) within each subarea Is.
In the threshold calculation unit 21 (Fig. 3), a threshold value Ti is calculated for each subarea. This threshold value is stored in the threshold matrix T, as illustrated in Fig. 4A. The threshold value Ti is calculated as a function of the contrast (max - min) within the subarea:
Ti = max - ki * (max - min).

The greatest value (max) within the subarea is assumed to give an adequate indication of the background luminance within the subarea, and the least value within the subarea is assumed to give an adequate indication of the object luminance of the mark or marks within the subarea. It has been found that each subarea should advantageously contain at least one mark in its entirety, so that the actual luminance depth of the mark can be used in the calculation of the threshold value.
The factor ki determines at which contrast depth the threshold value is to be set. In order to reduce the effect of noise and lack of sharpness, the contrast depth factor ki is set on the basis of a classification of each subarea into the classes "sharp", "lacking in contrast" and "noisy".
Each subarea that has a greatest luminance value (max) below a noise limit value is classified as "noisy". According to one embodiment, the noise limit value is defined as the mean value of the greatest luminance values in all the subareas, which can be calculated in a simple way from said image statistical data S.
Other subareas are classified as either sharp or lacking in contrast, depending upon whether the contrast in the image is above or below a contrast limit value. According to one embodiment, the contrast limit value is defined as the mean value of the contrast in all the subareas in the current image, which can also be calculated in a simple way from said image statistical data S.
Fig. 5A shows examples of the classification of the subareas in a diagram of the luminance as a function of pixels. The subarea I is "sharp", the subarea II "lacking in contrast" and the subarea III is "noisy". In Fig. 5A the threshold values Ti for the respective subareas are also indicated by broken lines. It has been found advantageous to set the factor ki to a value in the range 0.6-0.8 in noisy subareas, 0.3-0.5 in subareas lacking in contrast and 0.4-0.6 in sharp subareas. According to one embodiment, the central value in each of the above ranges is used.
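Gathered into one routine, the calculation of the threshold matrix according to this first embodiment may look as in the following Python sketch. It is an illustration only, not the patent's implementation: the subarea side of 14 pixels, the mean values used as noise and contrast limit values, and the central ki values 0.7, 0.4 and 0.5 are taken from the description above, while the function and variable names are invented for the example:

import numpy as np

def threshold_matrix(grey: np.ndarray, cell: int = 14) -> np.ndarray:
    rows, cols = grey.shape[0] // cell, grey.shape[1] // cell
    # Per-subarea image statistics: greatest (max) and least (min) luminance value.
    blocks = grey[:rows * cell, :cols * cell].reshape(rows, cell, cols, cell)
    smax = blocks.max(axis=(1, 3)).astype(np.float64)   # background luminance estimate
    smin = blocks.min(axis=(1, 3)).astype(np.float64)   # object luminance estimate
    contrast = smax - smin

    noise_limit = smax.mean()         # subareas darker than this are classified as "noisy"
    contrast_limit = contrast.mean()  # bright subareas below this are "lacking in contrast"

    # Contrast depth factor ki: central value of the range given for each class.
    ki = np.where(smax < noise_limit, 0.7,
                  np.where(contrast < contrast_limit, 0.4, 0.5))

    return smax - ki * contrast       # Ti = max - ki * (max - min), one value per subarea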
Thus the threshold value is set relatively far from the background luminance (max) in noisy subareas, which are typically to be found at the periphery of the image. This reduces the risk of the threshold value being set at the level of the background noise, which could result in the thresholding generating a binary image with many small fictitious structures. In order to reduce still further the number of fictitious structures, the noisy subareas can undergo a supplementary contrast control, for example by the threshold value being set to zero in the noisy subareas that have a contrast similar to typical noise levels.
In subareas lacking in contrast, that is subareas with high signal-to-noise ratio and low contrast, however, the threshold value is set relatively close to the background luminance (max). This increases the probability of associated marks being identified as structures with a plurality of connected pixels, which in turn provides a better estimation of the position of the mark in the subsequent decoding in the position determination unit 22. In many cases, this type of subarea is in fact overexposed, for which reason the luminance depth of the marks is only given by one or a few pixels, as indicated in Fig. 5A.

After the above calculation, the threshold matrix T contains a threshold value Ti per subarea Is (cf. Fig. 4A).
Of course, the classification of the subareas can be made more sophisticated. For example, histograms or standard deviations can be used to identify noise, mean values can be used to identify background luminance, etc. An advantage of the use of the minimum and the maximum for each subarea is, however, that these values can be extracted from a greyscale image in a calculation-effective way. In addition, the number of calculation steps is minimised in the production of the threshold matrix.
According to a further variant, the mean value of the luminance values of the pixels is calculated within the respective subarea, after which the threshold value is calculated on the basis of this mean value and of the contrast in the subarea in question, in accordance with:
Ti = max - f(mi) * (max - min),

where f(m) is a function of the luminance mean value mi in the subarea in question and has a value in the range 0 - 1. Fig. 5B shows an example of the appearance of the function f(m). Unlike the factor ki described above, which varies in steps depending upon the classification of the subarea, f(m) is a continuous and monotonically decreasing function of the luminance mean value of the subarea. Low luminance mean values give both low contrast and low signal-to-noise ratio, as indicated by the luminance histogram (i), for which reason the value of the function is set close to 1 (corresponding to the above class "noisy"). With increasing luminance mean values, the signal-to-noise ratio and the contrast tend to increase gradually, as indicated by the luminance histograms (ii) and (iii), for which reason the value of the function is allowed to decrease to a corresponding extent. Finally, f(m) becomes approximately 0.5 (corresponding to the above class "sharp"). With sufficiently high luminance mean values, the contrast is again reduced (see the luminance histogram (iv)) as a result of overexposure of the subarea (corresponding to the above class "lacking in contrast"). In order to maximise the probability that the mark or marks that are included in the subarea are actually identified by the thresholding, the function f(m) is here set to a value close to 0, that is the threshold value becomes close to the estimated background luminance.
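The shape of f(m) in Fig. 5B is only described qualitatively above, so any concrete realisation must choose its own breakpoints. The piecewise-linear Python sketch below is therefore purely illustrative; the luminance breakpoints are assumptions and not values given in the description:

def contrast_depth(mean_luminance: float) -> float:
    # Illustrative, monotonically decreasing f(m) for 8-bit luminance values.
    # Breakpoints m1, m2, m3 are assumptions made for the sake of the example.
    m1, m2, m3, m_max = 60.0, 140.0, 200.0, 255.0
    if mean_luminance < m1:        # dark subarea: low signal-to-noise ratio, treat as "noisy"
        return 1.0
    if mean_luminance < m2:        # signal-to-noise ratio and contrast improve gradually
        return 1.0 - 0.5 * (mean_luminance - m1) / (m2 - m1)
    if mean_luminance < m3:        # well-exposed subarea, "sharp"
        return 0.5
    # Overexposed subarea: contrast drops, so the threshold is placed
    # close to the estimated background luminance (f close to 0).
    return 0.5 - 0.4 * (mean_luminance - m3) / (m_max - m3)

# Applied per subarea: Ti = max - contrast_depth(mean) * (max - min)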
Those skilled in the art will recognise that the function in Fig. 5B can be modified. For example, it can consist of more segments (classes) , with other breakpoints and inclinations. Alternatively, the function can be a curve, as given by a second degree equation or the like.
It should also be pointed out that the above function can depend on other variables than the luminance mean value, for example the median value, the sum, the greatest or the smallest of all the luminance values within the subarea.

Embodiment 2
In the first embodiment, the same subareas are used for estimating both background luminance and object luminance in the greyscale image. In the following exemplifying embodiment, subareas of different sizes are instead used for estimating the background luminance and object luminance of the greyscale image.
In this example, the threshold matrix is thus calculated based on image statistical data for two different sets of subareas, object subareas and background subareas. The object subareas and the background subareas overlap each other and cover all that part of the image that is to be binarised. The object subareas correspond in size to the subareas that are used in the first embodiment, that is they are so large that they contain with certainty at least a part of a mark. The background subareas can, however, be made smaller, as they only need to be large enough to contain with certainty pixels that are representative of the image's local background luminance, that is they should be larger than each mark in the image. Any enlargement as a result of the effects of perspective should be taken into account.
Fig. 4B shows an example of the partitioning into object subareas and background subareas. This partitioning is suited to the same coding pattern as the first embodiment, however for greyscale images consisting of 96 x 96 pixels. Each greyscale image is divided into 64 (8 x 8) object subareas Is,o, each of which contains 12 x 12 pixels, and into 256 (16 x 16) background subareas Is,b, each of which contains 6 x 6 pixels. In this example, the object subareas Is,o are dimensioned to comprise a whole number of background subareas Is,b (in this case four), whereby the calculation of the threshold matrix is simplified.
As in the first embodiment, a threshold value can now be calculated for each background subarea, in accordance with:
Ti = bi - ki * (bi - oi),

where bi is the estimation of the background luminance within the background subarea Is,b and oi is the estimation of the object luminance within the larger object subarea Is,o which overlaps the current background subarea Is,b. As in the first example, the background luminance is estimated as the greatest luminance value within the background subarea and the object luminance as the least luminance value within the object subarea. Of course, the threshold value can be calculated in alternative ways, as described in connection with the first embodiment above.
In practice, the threshold matrix is calculated based on a background matrix, which contains the background luminances estimated for the background subareas Is,b, and an object matrix which contains the object luminances estimated for the object subareas Is,o. In general, it is preferable that the object subareas Is,o overlap a whole number of background subareas Is,b, as the data that requires intermediate storage in the background matrix and the object matrix is thereby minimised.
It should be pointed out that the statistics module in the pre-processing unit 20 (Fig. 3) can be designed to generate separate image statistical data for the background subareas Is,b and the object subareas Is,o. In certain cases, however, the image statistical data for the object subareas Is,o can be calculated from the image statistical data for the background subareas Is,b. For instance, this is the case in the example above, with estimation based on minimum and maximum values and with adjustment of the relative sizes of the background subareas and the object subareas.
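As an illustration of this, the following Python sketch calculates one threshold value per background subarea for a 96 x 96 pixel image, deriving the object matrix directly from the background-subarea statistics. The fixed contrast depth factor k = 0.5 is a simplifying assumption; in the embodiments above the factor also depends on the classification of the subarea:

import numpy as np

def threshold_matrix_2(grey: np.ndarray, k: float = 0.5) -> np.ndarray:
    # 16 x 16 background subareas of 6 x 6 pixels, 8 x 8 object subareas of
    # 12 x 12 pixels; each object subarea covers 2 x 2 background subareas.
    blocks = grey.reshape(16, 6, 16, 6)
    bg_max = blocks.max(axis=(1, 3)).astype(np.float64)   # background matrix, bi
    bg_min = blocks.min(axis=(1, 3)).astype(np.float64)

    # Object matrix, oi: least value within each object subarea, obtained from
    # the background-subarea minima without a second pass over the image.
    obj_min = bg_min.reshape(8, 2, 8, 2).min(axis=(1, 3))

    # Expand oi to one value per background subarea and apply Ti = bi - k * (bi - oi).
    obj_min = np.repeat(np.repeat(obj_min, 2, axis=0), 2, axis=1)
    return bg_max - k * (bg_max - obj_min)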
Both embodiments described above result in a threshold matrix T containing a threshold value Ti per subarea Is and Is,b, respectively. It has, however, been found that the precision of the thresholding is improved if the threshold matrix is given additional threshold values by interpolation between the threshold values calculated as defined above. Such additional threshold values can be created by linear interpolation of adjacent values in the threshold matrix. The linear interpolation is carried out in two steps, interpolation by rows and interpolation by columns. The threshold matrix interpolated in this way can then, if required, undergo a further interpolation.

It should be noted that the relationship between the threshold values and the subareas is changed when the threshold matrix is given additional threshold values by means of interpolation. From having been applicable to all pixels within a subarea, each threshold value is now applicable to pixels within smaller thresholding areas of each image. Each first calculated threshold value is suitably allocated to a thresholding area in the centre of its subarea, whereupon the new threshold values can be allocated to thresholding areas in between. By one or two interpolations, each such thresholding area has a size that is 1/4 or 1/16, respectively, of the size of the subarea.
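The two-step linear interpolation, by rows and then by columns, can be sketched as follows; the function name is illustrative and the sketch simply inserts one new value halfway between each pair of calculated threshold values:

import numpy as np

def refine(thresholds: np.ndarray) -> np.ndarray:
    t = thresholds.astype(np.float64)

    def interp_rows(m: np.ndarray) -> np.ndarray:
        out = np.empty((2 * m.shape[0] - 1, m.shape[1]))
        out[0::2] = m                       # keep the calculated threshold values
        out[1::2] = (m[:-1] + m[1:]) / 2    # new values halfway between vertical neighbours
        return out

    # Interpolation by rows, then by columns (via transposition).
    return interp_rows(interp_rows(t).T).T

Applying the sketch once turns, for example, a 16 x 16 threshold matrix into a 31 x 31 matrix, so that each threshold value governs a correspondingly smaller thresholding area of the image; applying it a second time corresponds to the 1/16 case mentioned above.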
An alternative method for improving the precision of the thresholding is to have the threshold matrix calculated in accordance with the embodiments above undergo a low-pass filtering, for example by convolution of the threshold matrix with a suitable 3 x 3 matrix.
Fig. 6 shows the luminance distribution along a line in a greyscale image, together with calculated threshold values Ti along this line. The threshold values in Fig. 6 are produced according to the second embodiment with a subsequent enlargement of the threshold matrix by linear interpolation. In spite of large variations in background luminance, signal-to-noise ratio and contrast, the calculated threshold values Ti accord well with the luminance values along the line.
Fig. 7 shows a number of binary images B that have been generated by the thresholding of the greyscale images shown in Fig. 2. The thresholding is carried out according to the second embodiment with a subsequent enlargement of the threshold matrix by linear interpolation. A comparison of Figs 2 and 7 shows that the thresholding results in a satisfactory identification of the marks 2, even with variations in illumination, base properties and imaging conditions within the greyscale images.
A threshold matrix can obviously be calculated for one particular greyscale image and then used for the thresholding of the same with high precision. The calculation of the threshold matrix can be carried out quickly, based on given image statistical data. It is estimated that it takes approximately 8000 clock cycles for the calculation of the threshold matrix according to the second embodiment, that is with a background matrix estimated for 16 x 16 subareas, an object matrix estimated for 8 x 8 subareas and a mean value matrix estimated for 8 x 8 subareas. For an 80 MHz processor, this corresponds to a calculation time of 100 μs.
Now returning to Fig. 3, it has been found that the algorithms described above can permit a further increase of the throughput of images I through the data processor 15, by carrying out the calculation of the threshold matrix in parallel with the actual thresholding. The threshold matrix is thus calculated on the basis of a preceding image, which is similar to the subsequent image or images to be thresholded by use of this calculated threshold matrix. This can be regarded as if the threshold matrix is periodically updated and then used in thresholding one or more subsequent images. It has been found fully possible to use one and the same threshold matrix for the thresholding of a plurality of, for example 5 - 10, consecutive greyscale images.

According to this embodiment, a given greyscale image can be thresholded at the same time as it is being read in from the sensor by the data processor 15. This thresholding can thus be implemented in hardware and in this way relieve the processor (not shown) which carries out the calculations in the threshold calculation unit 21 and the position determination unit 22. This hardware can at the same time also generate the above-mentioned image statistical data S in order to relieve the processor still further. In addition, the need for intermediate storage of the greyscale image is avoided as this can be processed by direct comparison with an already-calculated threshold matrix.
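A minimal sketch of this pipelined mode of operation is given below, reusing the illustrative binarize and threshold_matrix_2 functions from the sketches above. Here image_sequence is a hypothetical generator of incoming 96 x 96 greyscale frames, and the update period of 8 frames is an assumption within the 5 - 10 image range mentioned:

UPDATE_PERIOD = 8            # reuse one threshold matrix for several consecutive images

thresholds = None
for n, grey in enumerate(image_sequence()):
    if thresholds is None:
        thresholds = threshold_matrix_2(grey)     # bootstrap on the very first frame
    binary = binarize(grey, thresholds, cell=6)   # matrix calculated from a preceding frame
    if n % UPDATE_PERIOD == 0:
        thresholds = threshold_matrix_2(grey)     # refreshed matrix for subsequent frames
    # ... identify marks in the binary image and decode a position ...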
This embodiment is made possible by the fact that the algorithms according to the invention have a sufficient tolerance to variations in luminance and/or sharpness from image to image. Among other things, this is because the threshold matrix T is calculated on the basis of image statistical data for given subareas in the greyscale images and thereby contains threshold values that are related to the overall luminance distribution in the images, both with regard to the background and to the object. This can be regarded as if the threshold matrix contains both global information which is relevant for several consecutive images, and local information, which allows for the thresholding of each object in relation to its local surroundings. As each subarea, which contains a plurality of pixels, is allocated a threshold value in the threshold matrix, the effect of local variations is limited. In other words, the size of the subareas is selected in such a way that the calculated threshold value is sufficiently insensitive to local variations in order to achieve a desired tolerance to variations in luminance and/or sharpness from image to image.
According to an alternative embodiment of the data processor 15 in Fig. 3, the pre-processing unit 20 is designed to buffer, before, after or during the generation of said image statistical data S, one or more associated greyscale images, for instance in the memory 23. The threshold matrix T which is calculated by the threshold calculation unit 21 for a current greyscale image can thus be used by the pre-processing unit 20 for binarisation of the same current image, and optionally a subsequent image in the incoming sequence of images.

It should be pointed out that the above description is only intended to provide an example of how the invention can be realised within the scope of the protection that is defined by the appended claims. For example, before the binary images are analysed further for position determination, they can undergo an area check, with the aim of eliminating fictitious marks on the basis of the number of connected pixels within each mark. Accordingly, marks consisting of one or a few pixels can be assumed to originate from noise and can therefore be removed. As the maximal size of the marks is known, an upper area threshold can also be set.
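Such an area check amounts to labelling the connected dark regions of the binary image and discarding those whose pixel count falls outside an accepted range. The Python sketch below uses scipy.ndimage for the labelling; the lower and upper area limits are illustrative assumptions, not values given in the description:

import numpy as np
from scipy import ndimage

def area_check(binary: np.ndarray, min_area: int = 3, max_area: int = 40) -> np.ndarray:
    objects = (binary == 0)              # the marks are the dark (0) pixels
    labels, n = ndimage.label(objects)   # connected-component labelling
    sizes = np.bincount(labels.ravel())  # pixel count per label; label 0 is the background
    keep = (sizes >= min_area) & (sizes <= max_area)
    keep[0] = False                      # never treat the background as a mark
    cleaned = np.ones_like(binary)       # start from an all-background image (value 1)
    cleaned[keep[labels]] = 0            # restore only the marks that pass the area check
    return cleaned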
The above-mentioned contrast depth factor can, instead of being set at a predetermined value or calculated on the basis of the classification of associated subareas, be given by an external process, such as a control loop. Such an embodiment is described in Applicant's Swedish Patent Application SE 0103845-4 filed on 20 November 2001. According to a further alternative, the calculation of image statistical data can be carried out in the threshold calculation unit instead of in the pre-processing unit.
It should also be pointed out that, as an alternative to the described combination of hardware circuits and software-controlled processor, the data processor can be realised completely in hardware or completely in software.
In addition, it should be emphasised that the subareas can be of any shape, such as square, rectangular, triangular, rhombic, hexagonal, etc.
The invention is in no way restricted to the described position-coding pattern, but can also be used for the identification and decoding of other position-coding patterns. It should also be pointed out that the raster described above can have other shapes than orthogonal, such as a rhombic grid, for example with 60 degree angles, a triangular or hexagonal grid, etc. In addition, the marks can be displaced in other directions than along the raster lines.
In the exemplary embodiment above, the pattern is optically readable and the sensor is thus optical. It is recognised, however, that the images that are processed according to the invention can be generated in another way, for example by detection of chemical, acoustic, electromagnetic, capacitive or inductive parameters. Similarly, it is recognised that the invention can also be used for identification of bright marks against a dark background.
Finally, it should be noted that the invention can be used in general for identification of individual objects in a digital image in a quick and memory-efficient way, particularly when there are variations in luminance and/or sharpness within an image.

Claims

1. A method for identifying individual objects (2) in a digital image (I) that is constructed of pixels with a respective luminance value, comprising the steps of estimating, based on a first and a second division of a reference image which corresponds to the digital image (I) into a plurality of first and second subareas, a background luminance value for each first subarea and an object luminance value for each second subarea, calculating, for each overlapping first and second subarea, a threshold value (Ti) based on the associated background and object luminance values, creating a threshold matrix (T) from said threshold values (Ti) , and comparing the luminance values of the digital image (I) with the threshold values (Ti) of the threshold matrix (T) in order to create a binary image (B) on the basis of the comparison.
2. A method according to claim 1, in which the reference image is divided so that the first subareas are mutually exclusive and the second subareas are mutually exclusive .
3. A method according to claim 1 or 2 , comprising the step of estimating the background luminance value on the basis of first order statistics of the luminance values of the pixels within the first subarea.
4. A method according to claim 1, 2 or 3 , comprising the step of estimating the object luminance value on the basis of first order statistics of the luminance values of the pixels within the second subarea.
5. A method according to any one of the preceding claims, comprising the step of estimating the background luminance value on the basis of the greatest luminance value of the pixels within the first subarea.
6. A method according to any one of the preceding claims, comprising the step of estimating the object luminance value on the basis of the least luminance value of the pixels within the second subarea.
7. A method according to any one of the preceding claims, comprising the step of designing the subareas so that each second subarea at least partly comprises a first subarea.
8. A method according to any one of the preceding claims, comprising the step of designing the subareas so that the first and second subareas coincide.
9. A method according to any one of the preceding claims, comprising the step of designing the subareas so that each second subarea comprises a whole number of the first subareas.
10. A method according to any one of the preceding claims, comprising the step of designing the second subareas in such a way that each one contains at least a part of at least one of said objects (2) .
11. A method according to any one of the preceding claims, comprising the step of designing the second subareas in such a way that each one contains at least one object (2) in its entirety.
12. A method according to any one of the preceding claims, comprising the step of designing the first subareas in such a way that they are larger than said objects (2) .
13. A method according to any one of the preceding claims, in which said objects (2) are positioned relative to an invisible raster (1) of known dimensions, the second subareas being dimensioned with regard to the known dimensions of the raster (1) .
14. A method according to any one of the preceding claims, comprising the steps of classifying the subareas into at least a first category with a high signal-to-noise ratio and a second category with a low signal-to-noise ratio, and setting the threshold value (Ti) at a greater relative distance from the background luminance value in subareas belonging to the second category than in subareas belonging to the first category.
15. A method according to claim 14, comprising the steps of further classifying the subareas into a third category with a high signal-to-noise ratio and a low contrast, and setting the threshold value (Ti) at a smaller relative distance from the background luminance value in subareas belonging to the third category than in subareas belonging to the first category.
16. A method according to any one of the preceding claims, comprising a subsequent smoothing step in which the respective calculated threshold values (Ti) of the threshold matrix (T) are updated on the basis of adjacent calculated threshold values (Ti) in said threshold matrix (T) .
17. A method according to claim 16, in which during the smoothing step the threshold matrix (T) is provided with additional threshold values (Ti) , by interpolation of adjacent calculated threshold values (Ti) in the threshold matrix (T) .
18. A method according to claim 16 or 17, in which the smoothing step comprises a low-pass filtering of the threshold matrix (T) .
19. A method according to any one of the preceding claims, in which the reference image consists of the digital image.
20. A method according to any one of the preceding claims, in which the digital image (I) is part of a sequence of digital images, and in which the reference image consists of an image in the sequence of digital images .
21. A method according to claim 20, in which the threshold matrix (T) is calculated intermittently on the basis of the luminance values of a current image in the sequence of digital images and is used for the binary conversion of a subsequent image in the sequence of digital images.
22. A device for identification of individual objects (2) in a digital image (I) consisting of pixels with a respective luminance value, comprising a thresholding unit (20) which is designed to receive the digital image (I) and compare the luminance values of the digital image with the threshold values (Ti) of a threshold matrix (T) in order to create a binary image on the basis of the comparison, and a threshold calculation unit (21) which is designed to estimate, on the basis of a first and a second division of a reference image which corresponds to the digital image (I) into first and second subareas respectively, a background luminance value for each first subarea and an object luminance value for each second subarea, to calculate a threshold value (Ti) for each overlapping first and second subarea on the basis of the associated background and object luminance values and to create the threshold matrix (T) from said threshold values (Ti) .
23. A device according to claim 22, in which the threshold calculation unit (21) is designed to divide the reference image so that the first subareas are mutually exclusive and the second subareas are mutually exclusive.
24. A device according to claim 22 or 23, in which the threshold calculation unit (21) is designed to estimate the background luminance value on the basis of first order statistics of the luminance values of the pixels within the first subarea.
25. A device according to claim 22, 23 or 24, in which the threshold calculation unit (21) is designed to estimate the object luminance value on the basis of first order statistics of the luminance values of the pixels within the second subarea.
26. A device according to any one of claims 22-25, in which each second subarea at least partly comprises a first subarea.
27. A device according to any one of claims 22-26, in which the first and the second subareas coincide.
28. A device according to any one of claims 22-27, in which each second subarea comprises a whole number of the first subareas.
29. A device according to any one of claims 22-28, in which each second subarea contains at least a part of at least one of said objects (2).
30. A device according to any one of claims 22-29, in which each second subarea contains at least one object in its entirety.
31. A device according to any one of claims 22-30, in which each first subarea is larger than said objects (2).
32. A device according to any one of claims 22-31, in which said objects (2) are positioned relative to an invisible raster (1) of known dimensions, and in which the second subareas are dimensioned with regard to the known dimensions of the raster (1).
33. A device according to any one of claims 22-32, in which the threshold calculation unit (21) is arranged to use the digital image (I) as reference image.
34. A device according to any one of claims 22-33, in which the digital image (I) is included in a sequence of digital images, the threshold calculation unit (21) being arranged to use an image in the sequence of digital images as reference image.
35. A device according to claim 34, in which the thresholding unit (20) is designed to calculate image statistical data (S) related to said subareas in the digital image (I) in connection with the receipt of the digital image (I), and compare, to create the binary image (B), in connection with the receipt of the digital image (I), the luminance values of the digital image (I) with the threshold values (Ti) of a previously calculated threshold matrix (T).
36. A device according to claim 34, in which the thresholding unit (20) is designed to calculate image statistical data (S) related to said subareas in the digital image (I) in connection with the receipt of the digital image (I), receive from the threshold calculation unit (21) a threshold matrix (T) calculated on the basis of said image statistical data (S), and compare the luminance values of the digital image (I) with the threshold values (Ti) of the received threshold matrix (T) to create the binary image (B).
37. A device according to claim 35 or 36, in which the threshold calculation unit (21) is designed to intermittently read said image statistical data (S) and on the basis thereof calculate said threshold matrix (T).
38. A device according to any one of claims 22-37, comprising a hardware circuit for realisation of the thresholding unit (20) and a processor unit with executable software for realisation of the threshold calculation unit (21).
39. A computer-readable computer program product that comprises a computer program with instructions for causing a computer to implement a method according to any one of claims 1-21.
40. A hardware circuit, such as an ASIC, that is designed to implement a method according to any one of claims 1-21.
41. A hand-held apparatus for position determination, comprising a sensor (14) for production of a sequence of images of a surface with a position-coding pattern, and a processing unit (15) which is arranged to receive an image in said sequence and calculate a position based on the position-coding pattern in the image, the processing unit (15) comprising a device according to any one of claims 22-38.
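To make the claimed thresholding scheme more concrete, here is a minimal sketch in Python/NumPy of a threshold-matrix calculation and binarization of the kind recited above. It is an editor's illustration, not code from the application: the mean as the first-order background statistic, the minimum as the object statistic, square first subareas of cell x cell pixels, square second subareas of block x block pixels (a whole multiple of the first subareas, cf. claim 28), a fixed weighting between the two estimates (cf. claims 14-15), a 3x3 box filter as the smoothing step (cf. claims 16-18) and dark objects on a lighter background are all assumptions.

```python
import numpy as np

def compute_threshold_matrix(ref, cell=8, block=24, weight=0.5):
    """Threshold matrix T with one value per first subarea (cell x cell).

    Background luminance is estimated per first subarea, object luminance
    per second subarea (block x block, a whole multiple of cell), and the
    threshold is placed between the two estimates.  Sizes, statistics and
    weighting are illustrative assumptions.
    """
    h, w = ref.shape
    gh, gw = h // cell, w // cell                    # threshold-matrix size

    # Background estimate: mean luminance of each first subarea.
    cells = ref[:gh * cell, :gw * cell].reshape(gh, cell, gw, cell)
    bg = cells.mean(axis=(1, 3))

    # Object estimate: darkest pixel of the enclosing second subarea,
    # assuming every second subarea contains at least part of an object.
    k = block // cell
    obj = np.empty_like(bg)
    for i in range(gh):
        for j in range(gw):
            r0 = (i // k) * k * cell
            c0 = (j // k) * k * cell
            obj[i, j] = ref[r0:r0 + block, c0:c0 + block].min()

    # Threshold between object and background estimates; `weight` sets its
    # relative distance from the background value (cf. claims 14-15).
    T = obj + weight * (bg - obj)

    # Smoothing step (cf. claims 16-18): 3x3 box filter over the matrix.
    padded = np.pad(T, 1, mode="edge")
    T = sum(padded[di:di + gh, dj:dj + gw]
            for di in range(3) for dj in range(3)) / 9.0
    return T

def binarize(image, T, cell=8):
    """Compare each pixel with the threshold of its first subarea;
    pixels darker than the threshold become object pixels (1)."""
    gh, gw = T.shape
    per_pixel = np.kron(T, np.ones((cell, cell)))    # expand to pixel grid
    crop = image[:gh * cell, :gw * cell]
    return (crop < per_pixel).astype(np.uint8)

# Example use on one 8-bit grey-scale frame:
#   B = binarize(frame, compute_threshold_matrix(frame))
```

In this sketch the weight parameter plays the role of the "relative distance from the background luminance value" in claims 14-15: values near 0 place the threshold far from the background estimate (conservative, suited to low signal-to-noise subareas), while values near 1 place it close to the background estimate and pick up low-contrast objects at the cost of more noise.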
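Claims 21 and 35-37 describe a pipelined arrangement in which the thresholding unit gathers per-subarea statistics (S) from the current frame while binarizing it with a threshold matrix derived from an earlier frame, and the threshold calculation unit only intermittently turns those statistics into a new matrix. A rough sketch of that flow, again with illustrative names, statistics and update period:

```python
import numpy as np

class PipelinedThresholder:
    """Binarize each frame with a threshold matrix computed from the
    statistics of an earlier frame, updating the matrix only every
    `period` frames.  Names, statistics and period are illustrative."""

    def __init__(self, cell=8, weight=0.5, period=4):
        self.cell = cell
        self.weight = weight
        self.period = period
        self.frame = 0
        self.T = None        # threshold matrix from an earlier frame
        self.stats = None    # latest per-subarea statistics S

    def _collect_stats(self, image):
        c = self.cell
        gh, gw = image.shape[0] // c, image.shape[1] // c
        cells = image[:gh * c, :gw * c].reshape(gh, c, gw, c)
        # First-order statistics per subarea: mean as background estimate,
        # minimum as object estimate (dark objects assumed).
        return cells.mean(axis=(1, 3)), cells.min(axis=(1, 3))

    def _matrix_from_stats(self, stats):
        bg, obj = stats
        return obj + self.weight * (bg - obj)

    def process(self, image):
        """Return the binary image for one frame of the sequence."""
        # Threshold-calculation step: intermittently build a new matrix
        # from the statistics of an earlier frame (cf. claims 21, 37).
        if self.stats is not None and self.frame % self.period == 0:
            self.T = self._matrix_from_stats(self.stats)

        # Thresholding step: gather statistics for the current frame ...
        self.stats = self._collect_stats(image)
        if self.T is None:                 # bootstrap on the very first frame
            self.T = self._matrix_from_stats(self.stats)

        # ... and binarize it with the previously calculated matrix.
        per_pixel = np.kron(self.T, np.ones((self.cell, self.cell)))
        crop = image[:per_pixel.shape[0], :per_pixel.shape[1]]
        self.frame += 1
        return (crop < per_pixel).astype(np.uint8)
```

Splitting the work this way fits the division suggested by claim 38: the per-pixel comparison and the simple per-subarea statistics are cheap and regular enough for a hardware thresholding stage, while the intermittent matrix update can run in software on a processor unit.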
PCT/SE2002/001244 2001-06-26 2002-06-25 Processing of digital images WO2003001450A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP02741593A EP1421555A1 (en) 2001-06-26 2002-06-25 Processing of digital images

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE0102254-0 2001-06-26
SE0102254A SE0102254L (en) 2001-06-26 2001-06-26 Digital image processing

Publications (1)

Publication Number Publication Date
WO2003001450A1 (en) 2003-01-03

Family

ID=20284605

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2002/001244 WO2003001450A1 (en) 2001-06-26 2002-06-25 Processing of digital images

Country Status (3)

Country Link
EP (1) EP1421555A1 (en)
SE (1) SE0102254L (en)
WO (1) WO2003001450A1 (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5852434A (en) 1992-04-03 1998-12-22 Sekendur; Oral F. Absolute optical position determination

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4593325A (en) 1984-08-20 1986-06-03 The Mead Corporation Adaptive threshold document duplication
US5051736A (en) 1989-06-28 1991-09-24 International Business Machines Corporation Optical stylus and passive digitizing tablet data input system
US5477012A (en) 1992-04-03 1995-12-19 Sekendur; Oral F. Optical position determination
US5661506A (en) 1994-11-10 1997-08-26 Sia Technology Corporation Pen and paper information recording system using an imaging pen
US5629780A (en) * 1994-12-19 1997-05-13 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Image data compression having minimum perceptual error
US5963676A (en) * 1997-02-07 1999-10-05 Siemens Corporate Research, Inc. Multiscale adaptive system for enhancement of an image in X-ray angiography
FR2786011A1 (en) * 1998-11-13 2000-05-19 Centre Nat Etd Spatiales METHOD OF COMPARISON OF RECORDED IMAGES SHAPED OF PIXELS REPRESENTING EQUIPOTENTIALS OF AT LEAST ONE INTEGRATED CIRCUIT CHIP
WO2000073983A1 (en) 1999-05-28 2000-12-07 Anoto Ab Position determination

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
METZLER V. ET AL.: "Morphological multiscale shape analysis of light micrographs", IS&T/SPIE'S 12TH ANNUAL SYMPOSIUM ON ELECTRONIC IMAGING 2000, 23 January 2000 (2000-01-23) - 28 January 2000 (2000-01-28), SAN JOSE, CA, XP002956207, Retrieved from the Internet <URL:http://www.isip.mu-luebeck.de/~metzler/papers/spieei2000b.html.> *
SAVAKIS A. ET AL.: "Adaptive document image thresholding using foreground and background clustering", PROCEEDINGS OF INTERNATIONAL CONF. ON IMAGE PROCESSING ICIP'98, 1998, XP001044499, Retrieved from the Internet <URL:http://www.rit.edu/~axseec/papers/icip98adt.pdf> *
See also references of EP1421555A1

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7457476B2 (en) 2001-10-03 2008-11-25 Anoto Ab Optical sensor device and a method of controlling its exposure time
DE212007000046U1 (en) 2006-06-28 2009-03-05 Anoto Ab Operation control and data processing in an electronic pen
WO2008030104A1 (en) * 2006-09-07 2008-03-13 Lumex As Relative threshold and use of edges in optical character recognition process
US8311329B2 (en) 2006-09-07 2012-11-13 Lumex As Relative threshold and use of edges in optical character recognition process
CN116629289A (en) * 2023-05-23 2023-08-22 深圳市牛加技术有限公司 Optical lattice two-dimensional coordinate recognition method and device based on convolutional neural network

Also Published As

Publication number Publication date
EP1421555A1 (en) 2004-05-26
SE0102254L (en) 2002-12-27
SE0102254D0 (en) 2001-06-26

Similar Documents

Publication Publication Date Title
US7110604B2 (en) Processing of digital images
US7283676B2 (en) Method and device for identifying objects in digital images
US11962875B2 (en) Recycling methods and systems, and related plastic containers
KR101399709B1 (en) Model-based dewarping method and apparatus
CN104781833B (en) Quick Response Code
US6044165A (en) Apparatus and method for tracking handwriting from visual input
US20090067743A1 (en) Preprocessing for information pattern analysis
JP2003526841A (en) Face extraction system and method based on biometrics
US11962876B2 (en) Recycling methods and systems, and related plastic containers
KR20030010530A (en) Image processing method, apparatus and system
KR101272448B1 (en) Apparatus and method for detecting region of interest, and the recording media storing the program performing the said method
KR20070046946A (en) Photographic document imaging system
EP2014082A1 (en) Generating a bitonal image from a scanned colour image
WO2006127608A2 (en) Efficient finder patterns and methods for application to 2d machine vision problems
WO2013044983A1 (en) Feedback to user for indicating augmentability of an image
US7729534B2 (en) Image-processing device and image-processing method for extracting a recognition-target area including a character from a target image
EP1725975A2 (en) Method, apparatus and program for detecting an object
CN112699704A (en) Method, device, equipment and storage device for detecting bar code
Bukhari et al. Adaptive binarization of unconstrained hand-held camera-captured document images.
JP2002199206A (en) Method and device for imbedding and extracting data for document, and medium
WO2003001450A1 (en) Processing of digital images
US20060279800A1 (en) Image processing apparatus, image processing method, and image processing program
US20080247649A1 (en) Methods For Silhouette Extraction
CN113076952B (en) Text automatic recognition and enhancement method and device
CN108255298A (en) Infrared gesture identification method and equipment in a kind of projection interactive system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ CZ DE DE DK DK DM DZ EC EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW GH GM KE LS MW

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Country of ref document: RU

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 2002741593

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 2002741593

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP