Adaptive zonal coder
The invention provides a method and apparatus for performing image data compression. Initially, a two-dimensional block of image data in a spatial domain is transformed by a transform coder, resulting in a two-dimensional array of activity coefficients in a frequency domain. The array is then serialized and quantized, yielding a one-dimensional array of coefficients with a leading coefficient and an original trailing coefficient. Next, a portion of the array is selected by choosing a new trailing coefficient based on a settable ratio of the energy of the new, shorter array to the energy of the original array. Lastly, an end-of-block symbol is appended after the new trailing coefficient, and the selected portion of the array is encoded using an entropy coder. The invention allows image data to enjoy enhanced compression with good image fidelity, and allows image fidelity to degrade gracefully in trade for higher degrees of image compression.
This invention relates to image data compression and more particularly to adaptive techniques for transform coding of image data for transmission or storage.
Many communication systems require the transmission and/or storage of image data. Any technique that minimizes the amount of information required to encode a given image is highly desirable. Transform coding represents a major class of such methods. Meaningful images usually contain a high degree of long-range structure, i.e., they have high inter-element correlations. In transform coding, an image with high inter-element correlations is converted into a collection of uncorrelated coefficients, thereby removing most of the redundant information in the original image. Early transform techniques employed a Fourier-type transform. Current approaches use a discrete cosine transform (DCT) or a Hadamard transform, both of which provide relatively higher coding efficiency.
Commonly, an image to be encoded is partitioned into a plurality of image blocks, and each block is divided into an 8x8 or 16x16 square array. The resulting blocks are encoded one after another. A typical image compression pipeline includes a transform coder, a quantizer, and an entropy coder. A quantizer is used to map a coefficient, falling within a range of continuous values, into a member of a set of discrete quantized values, and an entropy coder uses statistical properties of the information to compress it, i.e., to express the same message using fewer binary digits. For example, a Huffman coder relates the most frequent incoming data symbols to the shortest outgoing codes, and the least frequent incoming data symbols to the longest outgoing codes. After the transform coder transforms an image block into a collection of uncorrelated coefficients, the quantizer provides the corresponding series of discrete values to the entropy coder. The resulting stream of binary digits is then either transmitted or stored.
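The Huffman principle described above, in which frequent symbols receive short codes, can be illustrated with a minimal sketch. The function name `huffman_code` and the sample values are assumptions chosen for illustration and are not taken from the cited reference.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each symbol to a prefix-free bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # lightest subtree
        w2, _, c2 = heapq.heappop(heap)   # next lightest subtree
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical quantized coefficients: zero is by far the most frequent value.
quantized = [0, 0, 0, 0, 0, 0, 1, 1, 1, -1, -1, 5]
codes = huffman_code(quantized)
bitstream = "".join(codes[s] for s in quantized)
```

In this sample the most frequent value, 0, receives the shortest code and the rarest value, 5, one of the longest.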
Although it is possible to retain all the coefficients that result from transform coding an image, it is neither necessary nor desirable. Due to the nature of the human visual system, certain coefficients can be omitted without degrading perceived image quality. By retaining only visually important coefficients, an image with acceptable fidelity can be recovered using significantly less information than provided by the transform coding process.
For a typical two-dimensional source image, the transform operation results in a redistribution of image energy into a set of relatively few low-order coefficients that can be arranged in a two-dimensional array. Visually significant coefficients are usually found in the upper left-hand corner of the array.
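The redistribution of energy toward the low-order coefficients can be checked numerically. The following is a minimal sketch, assuming an orthonormal two-dimensional DCT-II and a hypothetical smooth 8x8 test block; the names `dct_2d` and `energy` are illustrative only.

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of lists."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    basis = [[scale(k) * math.cos((2 * i + 1) * k * math.pi / (2 * n))
              for i in range(n)] for k in range(n)]
    # Transform along each row, then along each column.
    rows = [[sum(basis[v][j] * block[i][j] for j in range(n)) for v in range(n)]
            for i in range(n)]
    return [[sum(basis[u][i] * rows[i][v] for i in range(n)) for v in range(n)]
            for u in range(n)]

def energy(array2d):
    """Sum of squared elements."""
    return sum(c * c for row in array2d for c in row)

# Hypothetical smooth block: a gentle brightness ramp.
block = [[100 + 5 * i + 3 * j for j in range(8)] for i in range(8)]
coeffs = dct_2d(block)

# Fraction of the energy that lies in the upper left-hand 4x4 corner.
corner = [row[:4] for row in coeffs[:4]]
fraction = energy(corner) / energy(coeffs)
```

For this smooth block nearly all of the energy falls in the upper left-hand corner, while the orthonormal transform leaves the total energy unchanged.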
According to a known method, referred to as 'zonal coding', only those coefficients lying within a specified zone of the array, e.g., the upper left quarter of the array, are retained. Although significant data compression can be achieved using this method, it is inadequate because picture fidelity is reduced for scenes that contain significant high frequency components.
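The limitation of fixed zonal coding can be seen in a small sketch. The two 4x4 coefficient arrays below are hypothetical, and `zonal_mask` is an illustrative name for retaining only the upper left zone.

```python
def zonal_mask(coeffs, zone):
    """Keep coefficients in the upper-left zone x zone corner; zero the rest."""
    return [[c if (u < zone and v < zone) else 0 for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]

def energy(array2d):
    """Sum of squared elements."""
    return sum(c * c for row in array2d for c in row)

def retained_fraction(coeffs, zone):
    """Fraction of block energy that survives fixed zonal coding."""
    return energy(zonal_mask(coeffs, zone)) / energy(coeffs)

# Hypothetical block with energy concentrated at low frequencies.
smooth = [[90, 10, 0, 0],
          [8,  2,  0, 0],
          [0,  0,  0, 0],
          [0,  0,  0, 0]]

# Hypothetical block with significant high-frequency components.
busy = [[50, 0, 0, 30],
        [0,  0, 0, 0],
        [0,  0, 0, 0],
        [30, 0, 0, 40]]

smooth_kept = retained_fraction(smooth, 2)
busy_kept = retained_fraction(busy, 2)
```

The fixed zone preserves the smooth block completely but discards more than half of the energy of the busy block, which is the loss of fidelity noted above.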
In an alternative approach, referred to as 'adaptive transform coding', a block activity measure is used to choose an optimum quantizer. Although this method achieves gains in efficiency by providing a degree of quantization precision appropriate to the activity level of a given block, the additional information needed to specify the state of the quantizer must be transmitted along with the usual coefficient data.
The invention provides a method for image data compression that includes the following steps. A two-dimensional block of image data in a spatial domain is transformed to data in a frequency domain. The transformed image data is represented as a two-dimensional array of activity coefficients. The two-dimensional array is then serialized, resulting in a one-dimensional array of coefficients with a leading coefficient and an original trailing coefficient. The array is then quantized. Next, a portion of the array is selected beginning with the leading coefficient and ending with a new trailing coefficient that is closer to the leading coefficient than the original trailing coefficient. Lastly, an end-of-block symbol is appended after said new trailing coefficient, and the selected portion of the array is encoded using an entropy coder.
In a preferred embodiment, the portion is selected by first measuring the total activity of the array. A measure of activity of successive sub-portions of the array is determined on the fly and added to a running total. Each new value of the running total is compared to the total activity of the array. Based on this comparison and, in a further preferred embodiment, on a settable level of image quality, a new trailing coefficient is designated, and only the portion of the array bounded by the leading coefficient and the new trailing coefficient is presented for encoding.
Another general feature of the invention is apparatus for inclusion in an image data compression pipeline, the pipeline including a transform coder, a serializer, a quantizer and an entropy coder. The apparatus includes a memory connected to said quantizer and said entropy coder. The memory has an input and an output, is adapted to store and delay at least a block of transformed, serialized and quantized image data, and is adapted to provide an entropy coder with successive sub-portions of said block of image data. The apparatus also includes a first measurer connected to the quantizer for measuring the total activity of the block of image data, and a second measurer connected to the output of the memory for measuring the activity of successive sub-portions of the image data and computing a running total of the activity. A control law unit is connected to the first and second measurers, and to an entropy coder, and is adapted to compare the running total provided by the second measurer to a level based in part on the total activity provided by the first measurer, and to send an end-of-block symbol to the entropy coder. The entropy coder is adapted to encode only a portion of the block of image data based on when it receives an end-of-block symbol.
Thus, it is not necessary to transmit additional information along with each block for indicating, e.g., its activity level, or run length. The end-of-block symbol provides an image data receiver with sufficient information for parsing an incoming bit stream back into blocks of image data.
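How a receiver might use the end-of-block symbol to recover block boundaries can be sketched as follows. The sketch works on already entropy-decoded symbols, not on bits. The marker `EOB`, the function `split_blocks`, and the choice to fill omitted coefficients with zeros are assumptions made for illustration.

```python
EOB = "EOB"  # hypothetical stand-in for the end-of-block symbol

def split_blocks(symbols, block_len):
    """Split a decoded symbol stream into fixed-length coefficient arrays.

    Each block ends at an EOB marker; coefficients omitted by the encoder
    are assumed here to be restored as zeros.
    """
    blocks, current = [], []
    for s in symbols:
        if s == EOB:
            blocks.append(current + [0] * (block_len - len(current)))
            current = []
        else:
            current.append(s)
    return blocks

# Three blocks of different transmitted lengths, the last one empty.
stream = [12, -6, 4, EOB, 5, 4, EOB, EOB]
blocks = split_blocks(stream, block_len=4)
```

No per-block length or activity field is needed in this sketch; the position of each marker alone determines where one block stops and the next begins.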
The invention will be more fully understood by reading the following detailed description in conjunction with the accompanying drawings.
With reference to Fig. 1, a typical image compression pipeline includes a transform coder 10, such as a discrete cosine transform coder (DCT), as disclosed in Ahmed et al., "Discrete Cosine Transform", IEEE Trans on Computers, Jan 1974, pp 90-93; a quantizer 12, such as a linear quantizer, as described in Wintz, "Transform Picture Coding", section III, Proc. IEEE, vol. 60, pp 809-820; and an entropy coder 14, such as a Huffman coder, as discussed in Huffman, "A Method for the Construction of Minimum-Redundancy Codes", Proc. IRE 40, No. 9, 1098-1101. After the transform coder 10 transforms image data into a collection of uncorrelated coefficients, the quantizer 12 maps each coefficient, selected from a range of continuous values, into a member of a set of discrete quantized values. These quantized values are then encoded by the entropy coder 14. The resulting stream of binary digits is then either transmitted or stored.
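The mapping performed by a linear quantizer can be sketched as uniform rounding to a multiple of a step size. The step size, the function names, and the sample coefficients are assumptions for illustration and are not drawn from the cited Wintz reference.

```python
import math

def quantize(coeffs, step):
    """Map each continuous coefficient to the index of the nearest multiple of step."""
    return [math.floor(c / step + 0.5) for c in coeffs]

def dequantize(indices, step):
    """Reconstruct approximate coefficient values from quantizer indices."""
    return [q * step for q in indices]

# Hypothetical continuous transform coefficients and step size.
coeffs = [10.4, -3.2, 0.49, 7.5]
indices = quantize(coeffs, step=2)
restored = dequantize(indices, step=2)
```

In this sketch the reconstruction error of each coefficient is at most half the step size, so a larger step trades fidelity for fewer distinct values to encode.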
Typically, image data is two-dimensional. Accordingly, the information provided by the transform coder 10 is presented in the form of a two-dimensional array. In a preferred embodiment, a serializer, such as a zigzag serializer, is used to convert the two-dimensional array of continuous values into a one-dimensional array of continuous values. The one-dimensional array is then quantized to yield a one-dimensional array of discrete values. For example, a zigzag serializer operating on a 4x4 array of integers as shown in Fig. 2A would produce a 1x16 array of integers as shown in Fig. 2B.
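A zigzag serializer can be sketched as a walk along the anti-diagonals of the array in alternating directions. The figures are not reproduced here, so the scan direction below follows the common convention (first step to the right) and is an assumption, as are the function names and the sample array.

```python
def zigzag_order(n):
    """Return the (row, column) visiting order of a zigzag scan of an n x n array."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal where row + column == s.
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()  # even diagonals are walked from bottom-left to top-right
        order.extend(diag)
    return order

def zigzag_serialize(array2d):
    """Flatten a square two-dimensional array into a one-dimensional zigzag list."""
    return [array2d[r][c] for r, c in zigzag_order(len(array2d))]

# Hypothetical 4x4 array whose entries equal their zigzag position.
block = [[0, 1, 5, 6],
         [2, 4, 7, 12],
         [3, 8, 11, 13],
         [9, 10, 14, 15]]
serial = zigzag_serialize(block)
```

Because the sample array is numbered in scan order, the serialized result is simply the integers 0 through 15, which makes the ordering easy to verify by eye.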
In a preferred embodiment, a zigzag serializer 16 is included in the image compression pipeline between the transform coder 10 and the quantizer 12, as shown in Fig. 3. After serializing the two-dimensional array provided by the transform coder 10, the quantizer maps the resulting one-dimensional array of continuous values into a one-dimensional array of quantized values. The array is then held for one block processing interval in a one-block delay memory unit 18, while an identical copy of the array is measured by a total block activity measure module 20. The module 20 computes the summation of the square (or the absolute value) of each element in the one-dimensional array, providing a value that represents the 'activity' or 'energy' of the image represented by the array to a control law module 22.
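The computation attributed to the total block activity measure module 20 can be sketched in a few lines. The function name and the `measure` parameter are hypothetical; the two options correspond to the sum of squares and the sum of absolute values mentioned above.

```python
def block_activity(coeffs, measure="square"):
    """Total activity ('energy') of a one-dimensional coefficient array."""
    if measure == "square":
        return sum(c * c for c in coeffs)
    if measure == "abs":
        return sum(abs(c) for c in coeffs)
    raise ValueError("measure must be 'square' or 'abs'")

# Hypothetical quantized array.
array = [3, -4, 0, 1]
energy_sq = block_activity(array)           # 9 + 16 + 0 + 1
energy_abs = block_activity(array, "abs")   # 3 + 4 + 0 + 1
```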
The zigzag serializer 16 places the low-order coefficients that result from, for example, a discrete cosine transform, at the beginning of an array, and the higher-order terms at the end. If there are sufficient low-order terms, the higher-order terms are of less importance and, due to the nature of the human visual response, may be omitted without perceived image degradation. For example, Fig. 2C shows the one-dimensional array of Fig. 2B after truncation of its six highest order coefficients. Since it is common for the activity of a block to be found mostly in its lower-order coefficients, transmitting only these coefficients results in a substantially greater compression of image data. However, unlike the known case of non-adaptive zonal coding, a block with significant activity in its higher-order coefficients will be transmitted in a manner that allows most of these coefficients to be included in the transmitted block, resulting in a received image of superior fidelity. Furthermore, it is not necessary to transmit additional information along with each block for indicating, e.g., its activity level. Instead, the end-of-block symbol 15 allows an image data receiver to know how to parse an incoming bit stream back into blocks of image data.
After dwelling in the memory unit 18 for one processing interval, the one-dimensional array is again measured for activity. A running block activity measure module 24 computes a running total of the squares or the absolute values of the elements in the array as it enters the module 24. A new value of the running total is provided to the control law module 22 as each additional element of the array enters the module 24. Also, as each element of the array enters the module 24, an identical element enters the entropy coder 14. The newly arriving array elements are held in a memory register within the entropy coder 14 until a terminate-block signal, representing an end-of-block character, is provided by the control law module 22. A terminate-block signal may be generated after an entire array has entered the entropy coder 14. Alternatively, the control law module may generate a terminate-block signal after a predetermined amount of activity has been measured by module 24. In this case, even if the entire array has not yet entered the entropy coder 14, the resulting partial array is transmitted, with an end-of-block symbol appended at the end of the partial array.
The control law module 22 compares the total block activity of the current array, provided by module 20, with the running block activity measure, such as the running sum of squares or absolute values that is progressively generated by the module 24. In a preferred embodiment, a ratio or difference circuit is included in the control law module 22 to generate a measure of the percentage of current block activity. The comparison of the total block activity measure with the running block activity measure indicates how much 'activity' has entered the entropy coder 14. A desired level of image fidelity can be set using an additional input 26.
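The interplay of modules 20, 22 and 24 can be sketched in software. This is one possible reading of the control law, assuming a sum-of-squares measure and a fidelity level expressed as a fraction between 0 and 1; the names `select_portion`, `level` and `EOB` are hypothetical.

```python
EOB = "EOB"  # hypothetical stand-in for the end-of-block symbol

def select_portion(coeffs, level):
    """Forward coefficients until the running activity reaches level * total.

    Returns the retained leading portion followed by the EOB marker.
    """
    total = sum(c * c for c in coeffs)   # total block activity (module 20)
    threshold = level * total            # target set through the fidelity input
    running = 0                          # running block activity (module 24)
    out = []
    for c in coeffs:
        if running >= threshold:         # control law (module 22): terminate block
            break
        out.append(c)                    # element passes on to the entropy coder
        running += c * c
    out.append(EOB)
    return out

# Hypothetical serialized and quantized block of 16 coefficients.
block = [12, -6, 4, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
at_90 = select_portion(block, 0.9)
at_100 = select_portion(block, 1.0)
```

At a level of 0.9 only the first three coefficients are sent. At a level of 1.0 the block is still shortened, because the trailing zeros contribute no activity and are cut off by the end-of-block marker.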
For a given level, the number of coefficients transmitted will vary, depending on the distribution of activity among the low, middle, and high order coefficients. If the activity of the block is concentrated in its lower order terms, it will be necessary to send fewer coefficients than if a large fraction of the block's activity is found in the higher order coefficients. If the channel used for transmission becomes overloaded, image fidelity can be decreased in trade for increased data compression.
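The dependence of the transmitted length on the distribution of activity can be illustrated with two hypothetical blocks. The sketch assumes a sum-of-squares measure and a level given as a fraction of total activity; `kept_count` is an illustrative name.

```python
from itertools import accumulate

def kept_count(coeffs, level):
    """Number of leading coefficients needed to reach level * total activity."""
    running = list(accumulate(c * c for c in coeffs))
    total = running[-1]
    if total == 0:
        return 0
    return next(i + 1 for i, r in enumerate(running) if r >= level * total)

# Activity concentrated in the low-order terms.
low_heavy = [12, -6, 4, 0, 2, 0, 0, 1]
# Activity spread into the higher-order terms.
spread = [5, 4, 4, 4, 3, 3, 3, 3]

low_at_90 = kept_count(low_heavy, 0.9)
spread_at_90 = kept_count(spread, 0.9)
spread_at_50 = kept_count(spread, 0.5)   # lower fidelity setting, e.g. on an overloaded channel
```

At the same level the spread block needs more than twice as many coefficients as the low-heavy block, and lowering the level shortens it again at the cost of fidelity.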
The image data compression method of the invention can be used to encode any grayscale image or representation of the difference between two images.
Other modifications and implementations will occur to those skilled in the art without departing from the spirit and the scope of the invention as claimed. Accordingly, the above description is not intended to limit the invention except as indicated in the following claims.