1226189

IX. Description of the Invention
(The description shall state: the technical field to which the invention belongs, the prior art, the content, the embodiments, and a brief description of the drawings.)

[Technical Field of the Invention]

The present invention relates to a method for automatically detecting a region of interest (ROI) in an image, and in particular to a method for automatically generating an ROI mask. The method adapts to both the image content and the required compression bit rate, so that after decompression the user obtains better visual quality and image quality inside the ROI.

[Prior Art]

With the rapid progress of information technology, we have entered a rich and varied information age. Large volumes of images, text, and audio/video data are transmitted over wired or wireless channels, including personal mobile communication systems and the Internet, so that this multimedia information can be sent and received in real time. Storing such large volumes of multimedia data is a further challenge.
For the transmission or storage of images and audio/video data, uncompressed data is too large to be transmitted in real time or stored efficiently, so compressing multimedia data becomes a necessary step.

The Joint Photographic Experts Group (JPEG), formed in 1986 under ISO and ITU-T, is mainly dedicated to defining compression standards for still images. The widely used JPEG image compression standard was the first one defined by this group, and JPEG 2000 is its more recent successor. Current image and video compression standards all use the discrete cosine transform (DCT) or the discrete wavelet transform (DWT) to reduce the spatial redundancy of image or video data. These include the image compression standards JPEG and JPEG 2000 and the video compression standards MPEG-1, MPEG-2, MPEG-4,
H.261, H.263, and H.263+. For the image compression standards JPEG and JPEG 2000, the prescribed transforms are the discrete cosine transform and the discrete wavelet transform, respectively. The former uses 8x8 pixel blocks as the DCT processing unit and encodes the result into a bit stream after the transform. The latter takes the whole image as the DWT input, and the transformed coefficients are divided into N … content undergoes EBCOT (Embedded Block Coding with Optimized Truncation) coding, which includes steps such as bit-plane coding and arithmetic coding and produces a more efficient embedded bit-stream. The overall JPEG 2000 compression flow is shown in Figure 2 and can be roughly divided into the following three steps:

1. Preprocessing of the input image, which includes tile dividing and color transform. The input image is cut into tiles of the required size, and the color transform is applied tile by tile.
2. The color-transformed image data then undergoes the discrete wavelet transform (Discrete
Wavelet Transform; DWT), which removes redundant signals in the frequency domain, followed by quantization.
3. Finally, the quantized DWT coefficients are processed bit plane by bit plane to remove redundancy in the bit domain (bit redundancy), and the data stream is output in units of packets. This is called EBCOT coding.

JPEG 2000 Part 1 also provides an ROI coding option that strengthens coding performance for specific regions of an image. Its principle is to sacrifice quality in the non-interest regions in order to raise image quality inside the region of interest, so that the content of the region of interest can be displayed first during image transmission. Because an image coded with the JPEG 2000 ROI option has its ROI placed first in the data stream, the ROI retains a certain quality even when the stream passes through a channel with insufficient bandwidth. This makes the technique very important for interactive multimedia display over the Internet or wireless communication. In general ROI applications, when an image is divided into a region of interest (ROI) and a non-interest region (background), information about the ROI position must be added during encoding so that the decoder can output the correct ROI position. However, JPEG 2000 Part 1 includes an ROI coding technique called
MaxShift, with which the ROI position information need not be transmitted in the bit stream and the decoder can still recover the ROI position.

When compressing image data with the ROI function, the aim is to apply different image processing or compression techniques to a specific object or region in the image in order to raise its image quality. This specific object or region is called the region of interest (ROI). Examples include representing pixel values inside the ROI with more bits, enhancing image sharpness inside the ROI with higher resolution, or quantizing its coefficients with a smaller quantization value. All of these methods improve the visual quality of the decompressed image. In general, however, the ROI is selected either by classifying and recognizing specific objects or regions with image processing, or by manually marking a fixed region of interest.
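As a side illustration of the MaxShift idea mentioned at the start of this passage, the following is a minimal sketch and not the normative JPEG 2000 procedure. It assumes integer (already quantized) coefficients in a flat list; the function names `maxshift_encode` and `maxshift_decode` are hypothetical. ROI coefficients are scaled up by a power of two larger than every background magnitude, so the decoder can tell ROI from background by magnitude alone.

```python
def maxshift_encode(coeffs, roi_flags):
    """Scale ROI coefficients so every nonzero one exceeds all background magnitudes."""
    background = [abs(c) for c, in_roi in zip(coeffs, roi_flags) if not in_roi]
    # Smallest shift s such that 2**s is greater than every background magnitude.
    s = max(background, default=0).bit_length()
    shifted = [c * (1 << s) if in_roi else c for c, in_roi in zip(coeffs, roi_flags)]
    return shifted, s


def maxshift_decode(shifted, s):
    """Recover coefficients and ROI flags from magnitudes alone (no mask is sent)."""
    threshold = 1 << s
    coeffs, roi_flags = [], []
    for c in shifted:
        in_roi = abs(c) >= threshold          # only scaled ROI values reach 2**s
        magnitude = abs(c) >> s if in_roi else abs(c)
        coeffs.append(magnitude if c >= 0 else -magnitude)
        roi_flags.append(in_roi)
    return coeffs, roi_flags
```

Only the shift value `s` has to accompany the data. An ROI coefficient that is exactly zero is indistinguishable from background in this sketch, which does not affect the reconstructed values.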
Depending on how the ROI is produced, these two approaches each leave a problem to be solved:

(1) Manually marking a fixed region of interest in the image. Because the regions selected this way all have the same shape and size, the ROI size cannot be adjusted to the bit-rate requirement in image compression applications.
(2) Obtaining the region of interest through prior image recognition and segmentation. The image is segmented or recognized before processing to obtain an ROI. Image recognition usually requires a considerable amount of computation, and this approach likewise does not adapt the ROI size to the compression bit-rate requirement.

To solve these problems, the inventors propose a method and device that analyze the content of the transformed coefficients of the image data and automatically detect the region of interest in the image. Based on the discrete transform coefficients obtained during image compression, which reflect frequency variation, and on the compression bit-rate requirement, the invention determines the size and position of the ROI for the image content, so that after decompression the user obtains better visual quality and image quality inside the ROI.

[Content]

The main object of the present invention is to provide a method for automatically detecting a region of interest in an image. Based on the transformed coefficients obtained during image compression, which reflect frequency variation, and on the compression bit-rate requirement, the invention determines the size and position of the ROI for the image content, so that after decompression the user obtains better visual quality and image quality inside the ROI.
A secondary object of the present invention is to provide a method for automatically detecting a region of interest in an image in which the proposed automatic ROI mask generation adapts to both the image content and the compression bit-rate requirement. Because the whole ROI mask generation mechanism is embedded in the compression process, the invention also has a low computational cost.

Another object of the present invention is to provide a method for automatically detecting a region of interest in an image in which the ROI generation mechanism is embedded in the image compression process and requires no additional manual work or image recognition techniques. It can be used with various kinds of image input or compression, such as medical images and infrared images.

The present invention proposes a method of analyzing the transformed coefficients of an image. It examines the position and frequency variation of the coefficient code blocks and, together with the compression bit-rate requirement, generates an ROI mask during compression. The mask covers the region of interest in the image. During compression coding, the transformed coefficients inside the ROI mask can be processed as the application requires and are finally encoded into bit data with embedded ROI information.

[Embodiments]

To give the examiners a fuller understanding of the structural features of the present invention and the effects it achieves, a preferred embodiment is described in detail below.

Producing the ROI through image recognition is over-designed, and producing it manually costs labor and does not adapt to the image content. To avoid both, the present invention analyzes the frequency characteristics of the transformed image coefficients during image compression.
The ROI used for image coding and compression is thereby obtained automatically. Please refer to Figure 3, which is a flowchart of a preferred embodiment of the present invention. The proposed processing steps are as follows:

Step 10: Classify small unit blocks by analyzing the frequency and position of the transformed coefficients. Based on how the frequency and position of the transformed coefficients appear in the code block, the image is analyzed in basic units of p x p pixel small blocks, and each small unit is judged as belonging to the ROI or not. Here p is the basic unit size used when judging blocks of interest, and it may depend on what the transformed coefficients represent or on the compression process.

Step 20: Find the center point of the set of small units in the image that have been judged to be of interest. To keep the small unit blocks judged to be of interest from being scattered across the image, their distribution in the image must be considered. The center point of the set's distribution is found so that the set of small blocks is as clustered as possible, which is what makes ROI coding effective.

Step 30: Generate an ROI mask. After the center point of the small unit blocks of interest is obtained, morphological processing from image processing, namely dilation and erosion, is applied, giving several sets of small blocks that are clustered but not connected. So that the ROI mask concentrates around the center point found in the previous step and is closed and continuous, the processed sets of small blocks are then searched and merged outward from the center point in accordance with the compression bit-rate requirement, which determines which sets become the ROI mask for that bit rate.
The three steps above are designed to address the shortcomings of manually marked or recognition-based ROIs described earlier. Step 10 analyzes and identifies, in real time and from the image content, the small unit blocks that have the characteristics of interest. Steps 20 and 30 automatically integrate the results of step 10 into an ROI mask in accordance with the transmission or compression conditions. The method therefore adapts to the bit-rate condition and keeps the ROI mask as closed and clustered as possible, which strengthens the improvement in visual quality.

The proposed invention is not tied to any particular discrete transform coefficients or to any particular image compression standard. It covers any method or device that similarly analyzes the frequency variation and position information of transformed image coefficients to produce a region of interest, and the proposed analysis can be used with future image compression standards or discrete transform coefficients.

To detect the ROI mask automatically, statistics must first be gathered on the high- and low-frequency variation of the transformed coefficients for the transform in use, and each block judged as a suitable region of interest or not, which gives the distribution of the image content. Then, based on the distribution and clustering of the small unit blocks and subject to the bit-rate requirement, the small unit blocks are merged into an ROI mask that is closed and as continuous as possible. The ROI mask generation method proposed by the present invention is not limited to any particular compression procedure. It applies to any compression process that examines the content of the transformed coefficients and takes into account …
… analyzing and classifying small unit blocks, finding the center point of the set of small units in the image judged to be of interest, and generating an ROI mask.

In step 10, because EBCOT coding uses a 4-pixel "stripe" as its coding unit, 4x4 small blocks are used as the unit of measurement in this first step for speed and compatibility. Let σ_k(i, j) denote the bit significance at coordinate (i, j) in the k-th bit plane, and let b_k(i, j) be the state indicating whether position (i, j) belongs to the significance coding pass: b_k(i, j) is recorded as "1" when the point belongs to the significance coding pass and as "0" otherwise. To examine whether the state given by each bit plane's significance coding pass reflects the variation in the image, the state is accumulated in order of bit-plane importance, from the most significant bit plane (MSB) toward the least significant bit plane (LSB). The result is called Sum_N, the total number of significant bits accumulated up to the N-th plane. From it one can decide, for an arbitrary input image, which bit plane should be used as the basis for the analysis.
$$\mathrm{Sum}_N = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} b_N(i, j) \tag{1}$$

In the above formula, N is the bit plane up to which the scan has been accumulated, and n is the size of the unit block used in bit-plane coding, usually a 64x64 pixel block (n = 64). Experiments show that once the accumulated number of positions belonging to the significance coding pass exceeds 1/8 of the unit block area, that bit plane can serve as a reliable basis for examining the content of the DWT coefficients. The computation of Sum_N stops there, and this plane N is taken as k. From then on, the bit-plane analysis computes and classifies the distribution of positions belonging to the significance coding pass in this k-th bit plane.
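The stopping rule around equation (1) can be sketched as follows. This is an illustration only, under stated assumptions: the per-plane states b(i, j) are given as n x n lists of 0/1 ordered from MSB to LSB, a position counts once it has been flagged in the current or any more significant plane, and the function name `select_bit_plane` and the `fraction` parameter are hypothetical.

```python
def select_bit_plane(planes, fraction=1 / 8):
    """Return (plane_index, accumulated_count) for the first plane, scanning from
    the MSB, at which the accumulated count exceeds `fraction` of the block area."""
    n = len(planes[0])
    limit = n * n * fraction                  # 1/8 of the unit block area by default
    seen = [[0] * n for _ in range(n)]        # positions already counted
    sum_n = 0
    for index, plane in enumerate(planes):    # index 0 is the MSB plane
        for i in range(n):
            for j in range(n):
                if plane[i][j] and not seen[i][j]:
                    seen[i][j] = 1
                    sum_n += 1
        if sum_n > limit:
            return index, sum_n
    # Threshold never exceeded: fall back to the last plane scanned.
    return len(planes) - 1, sum_n
```

For a 64x64 code block the limit is 512 positions, so scanning stops at the first plane whose accumulated count is 513 or more.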
This system uses EBCOT information indirectly as its input rather than classifying directly on the DWT coefficients. EBCOT does not in fact scan a whole block at once. It scans each 64x64 pixel code block stripe by stripe, in units of 4 pixels. Figure 5 illustrates the stripe scan within a block. Given this coding procedure, the mechanism of the invention does not simply consider the bit-plane data of the whole block. Instead it classifies the scan information produced by EBCOT in 4x4 pixel small unit blocks. With this classification the ROI mask is made up of small blocks rather than having an arbitrary geometric shape. In return, the bit-plane significance information of EBCOT can be captured quickly and in real time, and the resulting ROI mask is sufficient to cover the neighborhood of high-frequency variation, so that areas near high-frequency variation are displayed more clearly. A 4x4 pixel small unit block is judged by the number of significant bits it contains, that is, the number of positions in the block that belong to the significance coding pass, computed as follows:

$$B_k(\lfloor i/4 \rfloor, \lfloor j/4 \rfloor) = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} b_k(i, j) \tag{2}$$

That is, each b_k(i, j) is accumulated into the small block with index (⌊i/4⌋, ⌊j/4⌋). The maximum value of B_k is 16 and the minimum is 0. It is the number of positions belonging to the significance coding pass within one small block when a code block is partitioned into 4x4 pixel small blocks. In the formula, n is 64, the size of the EBCOT code block (64x64 pixels), and ⌊x⌋ denotes the integer part of x. Because the image carries different meanings in different frequency bands after the DWT, statistics are gathered separately on the bit planes of the low-frequency and high-frequency DWT coefficients.
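A minimal sketch of the block count in equation (2) follows. It assumes the state map b_k is an n x n list of 0/1 values with n divisible by the block size; the function name `block_counts` is hypothetical.

```python
def block_counts(b_k, block=4):
    """Count, for every block x block small unit, how many positions are flagged."""
    n = len(b_k)
    m = n // block                            # number of small blocks per side
    counts = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            # Each position contributes to the small block that contains it.
            counts[i // block][j // block] += b_k[i][j]
    return counts
```

With the default 4x4 unit every entry lies between 0 and 16, matching the range stated for B_k.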
The statistics for the low-frequency part are B_{k,LL}, representing the LL band, and those for the high-frequency part are B_{k,HL}, B_{k,LH}, and B_{k,HH}, representing the HL, LH, and HH bands, respectively. The judgment for these two kinds of band is described below.

(a) Low-frequency block statistics

To examine the statistical character of the low-frequency signal, first define B_k^f as the number of significance-coding-pass positions in an LL-band small block multiplied by a weighting value t_1:

$$B_k^{f} = t_1 \cdot B_{k,LL} \tag{3}$$

To decide whether B_k^f indicates a block of interest, a threshold is needed. It is computed from the maximum of B_{k,LL} over the whole LL band. Denoting this low-frequency threshold by T_f, it is obtained from the following formula:
$$T_f = \max(B_{k,LL}) / 2 \tag{4}$$

When the weighted low-frequency count B_k^f is greater than T_f, the small block is classified as a region with gentle variation and relatively high brightness.

(b) High-frequency block statistics

Regions of strong high-frequency variation are usually the edges of objects, areas of large texture variation, or sharp areas. In terms of visual perception, the region of interest in an image is usually the object itself including its edges. Therefore, once the smooth, bright regions have been found from the low-frequency signal, edge position information surrounding those regions is needed, and this information usually appears where the high-band DWT coefficients vary strongly. The next step is thus to examine the significance-coding-pass statistics of the small blocks in the HL, LH, and HH bands. Because these three bands represent texture variation in different directions, their significance-coding-pass counts are multiplied by weighting values t_2, t_3, and t_4, respectively, and summed into a single value representing the high-frequency variation of the small block:
$$B_k^{e} = t_2 \cdot B_{k,HL} + t_3 \cdot B_{k,LH} + t_4 \cdot B_{k,HH} \tag{5}$$

A threshold is again needed to decide whether the summed high-frequency variation of a small block belongs to the edge of a region of interest or to a sharp area of the image. This threshold T_e is likewise computed from the statistics of the three high-frequency bands:
$$T_e = \max(B_{k,HL} + B_{k,LH} + B_{k,HH}) / 2 \tag{6}$$

A small block with B_k^e > T_e is judged to be a high-frequency block of interest. Such blocks represent texture or brightness variation in the various directions of the image, and when they are arranged so as to enclose a region, they effectively define the position of the ROI mask. In addition, to adapt to both the image content and the compression bit-rate requirement, the weighting values and thresholds in equations (3) to (6) can be chosen according to the bit-rate requirement.

In step 20, if the image contains an object, or more specifically a pattern in which high-frequency content encloses low-frequency variation, the region enclosed by high frequencies tends to be the region of interest for a viewer, who also expects higher image quality there. Therefore, after the distribution of the high-frequency blocks has been gathered, the geometric center of that distribution is found from the resulting distribution of B_k^e. Let the vertical coordinates of the topmost and bottommost positions of the B_k^e distribution be T_Bk and B_Bk, and the horizontal positions of its leftmost and rightmost extent be L_Bk and R_Bk. The center point (m, n) can then be computed from these four coordinates:
$$(m, n) = \left( \frac{\mathrm{T\_Bk} + \mathrm{B\_Bk}}{h}, \; \frac{\mathrm{L\_Bk} + \mathrm{R\_Bk}}{v} \right) \tag{7}$$

Here (m, n) is the center point of the ROI mask derived from the high-frequency variation of the image, and h and v are the vertical and horizontal averaging coefficients, respectively. Their weights can be set according to the area of the high-frequency blocks in the image content, with the aim of quickly finding a center point close to the centroid of the distribution.
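The thresholding of equations (3) to (6) and the center computation of equation (7) can be sketched together as below. This is a hedged illustration: the per-band block counts are assumed to be equally sized 2-D lists, the weights default to 1 and the averaging coefficients to 2 (which gives the midpoint of the bounding extent), and the names `classify_low`, `classify_high`, and `mask_center` are hypothetical.

```python
def classify_low(b_ll, t1=1.0):
    """Flag LL-band small blocks whose weighted count exceeds T_f (eqs. 3 and 4)."""
    t_f = max(max(row) for row in b_ll) / 2
    return [[t1 * value > t_f for value in row] for row in b_ll]


def classify_high(b_hl, b_lh, b_hh, t2=1.0, t3=1.0, t4=1.0):
    """Flag small blocks whose weighted HL+LH+HH count exceeds T_e (eqs. 5 and 6)."""
    rows, cols = len(b_hl), len(b_hl[0])
    t_e = max(b_hl[i][j] + b_lh[i][j] + b_hh[i][j]
              for i in range(rows) for j in range(cols)) / 2
    return [[t2 * b_hl[i][j] + t3 * b_lh[i][j] + t4 * b_hh[i][j] > t_e
             for j in range(cols)] for i in range(rows)]


def mask_center(flags, h=2.0, v=2.0):
    """Center (m, n) from the top/bottom and left/right extent of flagged blocks (eq. 7)."""
    rows = [i for i, row in enumerate(flags) if any(row)]
    cols = [j for row in flags for j, flag in enumerate(row) if flag]
    if not rows:
        return None                           # no block of interest was found
    return ((min(rows) + max(rows)) / h, (min(cols) + max(cols)) / v)
```

Passing other values for the weights or for `h` and `v` is how a bit-rate-dependent tuning would enter this sketch; the source does not give concrete values.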
In step 30, after the center point of the ROI mask has been obtained from the extent of the high-frequency block distribution, the high-frequency and low-frequency blocks containing significant bits must be integrated so that the resulting ROI mask is closed and continuous. Morphological processing, a common image processing technique, is used here: a set of small blocks for the region of interest is obtained by the operation in equation (8).

$$\mathrm{ROI\_mask} = (\{B_k^{f}, B_k^{e}\} \oplus C) \ominus C \tag{8}$$
In equation (8), ROI_mask is the resulting ROI mask and C is the mask matrix used by the erosion and dilation operations. The first and second operators denote dilation and erosion, respectively. Dilation is the union of the set of small blocks with the operation mask matrix, and erosion is the intersection of the dilated result with the operation mask matrix. In addition, to obtain an ROI mask tied to the compression bit-rate requirement, the last part of this step takes the center point found in the previous step as the starting point and merges outward the sets of small blocks produced by the morphological processing, limiting the total area of the merged sets according to the bit-rate requirement. The result is a closed and continuous ROI mask matched to the bit rate.

The relationship between the proposed ROI mask generation mechanism and the overall JPEG 2000 processing block diagram is shown in Figure 6.
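Step 30 can be sketched as a dilation followed by an erosion, then an outward merge of the resulting block sets from the center point under an area budget. This is only an illustration under assumptions not stated in the source: the mask matrix C is taken to be a 3x3 square, block sets are 4-connected components, positions outside the grid are ignored, and `max_blocks` is a hypothetical stand-in for the area limit implied by the bit-rate requirement.

```python
from collections import deque


def _window(grid, i, j):
    """Values in the 3x3 neighborhood of (i, j), clipped to the grid."""
    rows, cols = len(grid), len(grid[0])
    return [grid[a][b]
            for a in range(max(0, i - 1), min(rows, i + 2))
            for b in range(max(0, j - 1), min(cols, j + 2))]


def dilate(grid):
    return [[any(_window(grid, i, j)) for j in range(len(grid[0]))]
            for i in range(len(grid))]


def erode(grid):
    return [[all(_window(grid, i, j)) for j in range(len(grid[0]))]
            for i in range(len(grid))]


def components(grid):
    """4-connected sets of flagged blocks, found by breadth-first search."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] and not seen[i][j]:
                comp, queue = [], deque([(i, j)])
                seen[i][j] = True
                while queue:
                    a, b = queue.popleft()
                    comp.append((a, b))
                    for x, y in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                        if 0 <= x < rows and 0 <= y < cols and grid[x][y] and not seen[x][y]:
                            seen[x][y] = True
                            queue.append((x, y))
                found.append(comp)
    return found


def build_roi_mask(flags, center, max_blocks):
    """Dilate then erode the flags, then merge block sets nearest the center first."""
    closed = erode(dilate(flags))

    def distance(comp):
        return min((a - center[0]) ** 2 + (b - center[1]) ** 2 for a, b in comp)

    mask = [[False] * len(flags[0]) for _ in flags]
    used = 0
    for comp in sorted(components(closed), key=distance):
        if used + len(comp) > max_blocks:     # area budget reached
            break
        for a, b in comp:
            mask[a][b] = True
        used += len(comp)
    return mask
```

The sketch does not by itself guarantee a single connected mask; it only shows how nearby block sets are preferred and how the area limit cuts off distant ones.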
Compared with the original JPEG 2000 compression process, the proposed mechanism is an ROI mask generator embedded in the image compression process. After the input image has undergone the discrete wavelet transform and quantization, the generator analyzes the positions of the significance coding passes at specific bit planes during EBCOT coding and, together with the compression condition (the bit rate requirement), generates in real time an ROI mask that corresponds to the image content. The mask is then fed back into EBCOT coding, and the compressed bit-stream is rearranged to form a compressed data stream with ROI coding, so that a user who decompresses the image under limited bandwidth obtains better image quality in the region of interest.

(4) Description of the embodiment

Figure 7 shows, for two different images at a required bit rate of 0.4 bpp (bits per pixel), the decompressed images obtained with a fixed rectangular ROI mask and with the automatically detected ROI mask of the present invention. Figure 7A shows the original images; Figure 7B shows the two images compressed with the original JPEG 2000; Figure 7C shows the decompressed result when a fixed rectangular ROI mask covering 1/3 of the image is used; the area enclosed by the white line in Figure 7D is the automatically detected ROI mask proposed by the present invention; and Figure 7E shows the images obtained by encoding and decoding with the ROI mask of Figure 7D. The results for the different ROI mask settings in Figure 7 show that the detected ROI mask covers the parts of the image with higher pixel values. This is because the method is based on the significance coding pass information in bit-plane coding.
When the referenced bit plane belongs to a low-frequency signal, the method tends to classify blocks with high pixel values as regions of interest; when the referenced bit plane belongs to a high-frequency signal, the edges of objects also tend to be classified as regions of interest. Under the low bit rate requirement of 0.4 bpp, images compressed with the automatically detected ROI mask have better visual quality after decompression than those compressed with the fixed ROI mask.

Figure 8 shows the decompressed results for the same image, compressed at different bit rate requirements with a fixed rectangular ROI mask and with the automatic ROI mask detection method proposed by the present invention. Figure 8A shows the images compressed with the fixed rectangular ROI mask, and Figure 8B shows the images compressed with the proposed automatic detection method; from top to bottom, the compression bit rates are 0.3 bpp, 0.5 bpp, 0.8 bpp, and 2.0 bpp. The comparison in Figure 8 shows that at the low bit rates of 0.3 bpp, 0.5 bpp, and 0.8 bpp, the face and the edges around the face are more distinct and sharper in the images compressed with the proposed method than in those compressed with the fixed ROI mask. This is because the automatically detected ROI mask extracts the region of interest effectively and does not spend compressed bits on other parts of the image. At the high bit rate of 2.0 bpp, nearly all of the bit data of both compressed images has been decoded, so the two look almost identical.
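The bit-plane statistic behind these results can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: coefficients are quantized integers, a coefficient is treated as becoming significant at the bit plane that holds its highest nonzero magnitude bit, and blocks are flagged with a simple count threshold. The block size, the threshold, and all function names are hypothetical and are not taken from the specification.

```python
def newly_significant(coeff, plane):
    """True if the highest nonzero magnitude bit of `coeff` lies in `plane`."""
    return (abs(coeff) >> plane) == 1


def significance_counts(coeffs, block, plane):
    """Count, per block x block tile, the coefficients that become significant at `plane`."""
    rows, cols = len(coeffs), len(coeffs[0])
    counts = [[0] * ((cols + block - 1) // block)
              for _ in range((rows + block - 1) // block)]
    for y in range(rows):
        for x in range(cols):
            if newly_significant(coeffs[y][x], plane):
                counts[y // block][x // block] += 1
    return counts


def candidate_blocks(counts, threshold):
    """Flag tiles whose accumulated count reaches the (assumed) threshold."""
    return [[int(c >= threshold) for c in row] for row in counts]


# Toy 4x4 coefficient array, 2x2 tiles, bit plane 3 (magnitudes 8..15).
coeffs = [
    [9, -12, 1, 0],
    [8, 3, 0, 2],
    [0, 1, 15, 40],
    [2, 0, -7, 10],
]
counts = significance_counts(coeffs, block=2, plane=3)  # [[3, 0], [0, 2]]
flags = candidate_blocks(counts, threshold=2)           # [[1, 0], [0, 1]]
```

The accumulation is a single pass of integer shifts and additions over the coefficients, which is consistent with the low computational cost the text attributes to the bit-plane statistics.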
The ROI coding option in JPEG 2000 is beneficial because, when the transmission bandwidth or bit rate is low, the decoder can still decompress the image inside the region of interest as far as possible; once the bandwidth recovers or a higher bit rate is allowed, the decoder can decompress the best image quality available under that condition according to how much compressed image data it has received. Used in JPEG 2000 compression, the automatic mask detection mechanism can therefore give the best image presentation for the given compression rate condition.

[Features and effects]

The present invention analyzes the transformed coefficients within the image coding procedure, automatically detects the region of interest in the image, and takes the compression bit rate requirement into account to generate an ROI mask. Compared with other conventional techniques for forming ROI masks, it has the following advantages:

1. The proposed mechanism is embedded in the image compression process. It requires neither manual creation of an ROI mask nor a prior image segmentation and recognition stage to obtain ROI mask information, so it has the advantage of detecting and generating the ROI mask automatically.

2. The proposed automatic ROI mask detection mechanism is designed to keep computation small: it uses only simple statistical accumulation on the bit planes. Compared with conventional methods that obtain the ROI mask through prior image segmentation and recognition, it therefore has a low computational cost.

3. The proposed mechanism analyzes the coding process of the transformed coefficients to obtain the high- and low-frequency information of the coefficients related to the image content, and classifies whether each area is a region of interest.
Therefore, compared with conventional techniques, it adapts more automatically to the content of the image data.

In summary, the present invention is novel, inventive, and industrially applicable, and should undoubtedly satisfy the patent application requirements of our country's Patent Law. An application for an invention patent is therefore filed in accordance with the law, and it is respectfully requested that the Office grant the patent at an early date.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit the scope of its implementation. All equivalent changes and modifications made in accordance with the shape, structure, features, and spirit described in the claims of the present invention shall fall within the scope of the claims.

[Brief description of the drawings]

Figure 1: a diagram of the basic steps of the conventional JPEG image compression standard;
Figure 2: a block diagram of the conventional JPEG 2000 image compression processing flow;
Figure 3: a flowchart of a preferred embodiment of the present invention;
Figure 4: a block diagram of the automatic ROI mask detection mechanism of a preferred embodiment of the present invention;
Figure 5: a schematic diagram, for a preferred embodiment of the present invention, of the EBCOT coding process scanning each 64x64-pixel code block one stripe at a time, with 4 pixels as the scanning unit;
Figure 6: a schematic diagram of the relationship between the ROI mask generation mechanism proposed by the present invention and the overall JPEG 2000 compression process;
Figure 7A: the original images, for a preferred embodiment of the present invention;
Figure 7B: the images compressed with the original JPEG 2000 and then decompressed, for a preferred embodiment of the present invention;
Figure 7C: the decompressed result when a fixed rectangular ROI mask covering 1/3 of the image is used, for a preferred embodiment of the present invention;
Figure 7D: the automatically detected ROI mask proposed by the present invention, shown as the area enclosed by the white line, for a preferred embodiment of the present invention;
Figure 7E: the images obtained by encoding and decoding with the ROI mask of Figure 7D, for a preferred embodiment of the present invention;
Figure 8A: the images compressed with the fixed rectangular ROI mask, for a preferred embodiment of the present invention; and
Figure 8B: the images compressed with the automatic ROI mask detection method, for a preferred embodiment of the present invention.
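The stripe-oriented scan named in the description of Figure 5 can be sketched as a short Python generator. The sketch assumes the usual EBCOT order, in which a code block is divided into stripes of 4 rows and each stripe is visited column by column, top to bottom within a column; the function name and the (row, column) tuple convention are illustrative only.

```python
def stripe_scan(width, height, stripe_height=4):
    """Yield (row, column) positions of a code block in stripe order."""
    for top in range(0, height, stripe_height):           # one stripe at a time
        for x in range(width):                             # columns left to right
            for y in range(top, min(top + stripe_height, height)):
                yield (y, x)                               # 4 pixels per column


order = list(stripe_scan(64, 64))
# order[:5] is [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1)];
# the second stripe starts at order[256], which is (4, 0).
```

A 64x64 code block therefore consists of 16 stripes of 256 positions each, and any per-position statistic gathered during coding can be accumulated in exactly this order.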