RECURSIVE ADAPTIVE INTRA SMOOTHING FOR VIDEO CODING
A recursive adaptive intra smoothing filter for intra-mode video coding is executed using one or more approaches including, but not limited to matrix multiplication, spatial filtering and frequency domain filtering. Matrix multiplication includes initially computing a prediction matrix Pm using training data. After coding a macroblock, Pm is updated for future macroblocks. In the case of applying spatial filtering, the shift invariance problem is reduced by imposing certain constraints on the matrix to be solved. In frequency domain filtering, a transform residual is minimized using DCT-domain filtering.
The present invention relates to the field of image/video processing. More specifically, the present invention relates to recursive adaptive intra smoothing (RAIS) for video coding.
H.264/AVC is a relatively new international video coding standard. It reduces the bit rate by approximately 30 to 70 percent when compared with previous video coding standards such as MPEG-4 Part 2 and H.263, while providing similar or better image quality.
The intra coding algorithm of H.264 exploits the spatial and spectral correlation present in an image. Intra prediction removes spatial redundancy between adjacent blocks by predicting one block from its spatially adjacent causal neighbors. A choice of coarse and fine intra prediction is allowed on a block-by-block basis. There are two types of prediction modes for the luminance samples. The 4×4 Intra mode predicts each 4×4 block independently within a macroblock, and the 16×16 Intra mode predicts a 16×16 macroblock as a whole unit. For 4×4 Intra mode, nine prediction modes are available for the encoding procedure, among which one represents a plain DC prediction, and the remaining ones operate as directional predictors distributed along eight different angles. Intra mode 16×16 is suitable for smooth image areas, where four directional prediction modes are provided as well as the separate intra prediction mode for the chrominance samples of a macroblock. In H.264 high profile, 8×8 intra prediction is introduced in addition to 4×4 and 16×16 intra prediction.
H.264 achieves excellent compression performance and complexity characteristics in the intra mode even when compared against the standard image codecs (JPEG and JPEG2000). In recent years, extended works have been developed to further improve the performance of intra prediction. Some authors introduced intra motion-compensated prediction of macroblocks. Block size and accuracy adaptation are able to be brought into the intra block-matching scheme to further improve the prediction results. In such a scheme, the position of the reference block is coded into the bit stream, and the resulting extra side information significantly affects performance. To reduce this overhead information, special processing techniques have been developed, but they result in a big change to the intra coding structure of the H.264/AVC standard. In some references, a block-matching algorithm (BMA) is substituted for the H.264 DC intra prediction mode with no need to code side information. However, prediction performance would be degraded if previously reconstructed pixels are used for the matching procedure. Also, improved lossless intra coding methods have been proposed to substitute for the horizontal, vertical, diagonal-down-left (mode 3) and diagonal-down-right (mode 4) modes of H.264/AVC. They employ a samplewise differential pulse code modulation (DPCM) method to conduct prediction of pixels in a target block. Yet these kinds of methods are only able to be used in lossless mode.
From the above-mentioned analysis, current enhanced intra coding methods still have problems: they either change the coding structure significantly, have limited usage or provide little gain.
A recursive adaptive intra smoothing filter for intra-mode video coding is executed using one or more approaches including, but not limited to matrix multiplication, spatial filtering and frequency domain filtering. Matrix multiplication includes initially computing a prediction matrix Pm (derived using offline training data). After coding a macroblock, Pm is updated for future macroblocks. In the case of applying spatial filtering, the shift invariance problem is reduced by imposing certain constraints on the matrix to be solved. In frequency domain filtering, a transform residual is minimized using DCT-domain filtering.
In one aspect, a method of filtering a video programmed in a memory in a device comprises calculating a prediction matrix using a training data set and recursively re-calculating the prediction matrix using a previous prediction matrix and prediction data of a current macroblock using neighboring pixels. The training data set is an offline training data set. The prediction matrix is computed using a cross-correlation matrix and an auto-correlation matrix. The filtering is applied to video coding. The coding comprises intra coding. The method further comprises implementing spatial filtering. Spatial filtering comprises restricting allowable values of the prediction matrix. A filter is restricted to have a unity DC gain, and/or a linear phase response. The filter is shift-invariant, and coefficients are chosen so that the L2-norm prediction residual is minimized based on past statistics. Filtering is not implemented if the neighboring pixels are across an edge. The method further comprises implementing Discrete Cosine Transform-domain filtering. Implementing discrete cosine transform-domain filtering comprises taking a discrete cosine transform of a block using a set of predictors resulting in transform coefficients, applying a weighting to the transform coefficients and taking an inverse discrete cosine transform to generate new predictors. The method further comprises taking the discrete cosine transform of neighboring pixels of the block for prediction. Taking the discrete cosine transform utilizes a line of pixels from an above neighboring block and a same line of pixels from a left neighboring block. Applying the weighting includes weighting factors initially derived from offline training and updating based on previous reconstructed pixels.
The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPhone, an iPod®, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
In another aspect, a method of filtering a video programmed in a memory in a device comprises implementing a first filter for filtering a first row/column of a block of the video and implementing one or more additional filters for filtering additional rows/columns of the block of the video. The first row/column is nearest to predictor pixels and the additional rows/columns are further from the predictor pixels. The first filter is weaker than the one or more additional filters. The one or more additional filters are each as strong or are progressively stronger in low-pass as a distance from predictor pixels increases. The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPhone, an iPod®, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
In another aspect, a system for filtering a video programmed in a memory in a device comprises a matrix multiplication module for implementing matrix multiplication on a block of the video, a spatial filtering module for applying spatial filtering to the matrix multiplication and a discrete cosine transform-domain filtering module for implementing discrete cosine transform-domain filtering to the block of the video, wherein an encoded video results from the filtering. Implementing matrix multiplication further comprises calculating a prediction matrix using a training data set and recursively re-calculating the prediction matrix using a previous prediction matrix and prediction data of a current macroblock using neighboring pixels. The training data set is an offline training data set. The prediction matrix is computed using a cross-correlation matrix and an auto-correlation matrix. The filtering is applied to video coding. The coding comprises intra coding. The system further comprises implementing spatial filtering. Spatial filtering comprises restricting allowable values of the prediction matrix. A filter is restricted to have a unity DC gain, and/or a linear phase response. The filter is shift-invariant, and coefficients are chosen so that the L2-norm prediction residual is minimized based on past statistics. Filtering is not implemented if the neighboring pixels are across an edge. The system further comprises implementing Discrete Cosine Transform-domain filtering. Implementing Discrete Cosine Transform-domain filtering comprises taking a discrete cosine transform of a block using a set of predictors resulting in transform coefficients, applying a weighting to the transform coefficients and taking an inverse discrete cosine transform to generate new predictors. The system further comprises taking the discrete cosine transform of neighboring pixels of the block for prediction. Taking the discrete cosine transform utilizes a line of pixels from an above neighboring block and a same line of pixels from a left neighboring block. Applying the weighting includes weighting factors initially derived from offline training and updating based on previous reconstructed pixels. The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPhone, an iPod®, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
In another aspect, a camera device comprises an image acquisition component for acquiring an image, a processing component for processing the image by calculating a prediction matrix using a training data set and recursively re-calculating the prediction matrix using a previous prediction matrix and prediction data of a current macroblock using neighboring pixels to filter the image, resulting in a processed image, and a memory for storing the processed image. The training data set is an offline training data set. The prediction matrix is computed using a cross-correlation matrix and an auto-correlation matrix. The filtering is applied to video coding. The coding comprises intra coding. The camera device further comprises implementing spatial filtering. Spatial filtering comprises restricting allowable values of the prediction matrix. A filter is restricted to have a unity DC gain, and/or a linear phase response. The filter is shift-invariant, and coefficients are chosen so that the L2-norm prediction residual is minimized based on past statistics. Filtering is not implemented if the neighboring pixels are across an edge. The camera device further comprises implementing Discrete Cosine Transform-domain filtering. Implementing discrete cosine transform-domain filtering comprises taking a discrete cosine transform of a block using a set of predictors resulting in transform coefficients, applying a weighting to the transform coefficients and taking an inverse discrete cosine transform to generate new predictors. The camera device further comprises taking the discrete cosine transform of neighboring pixels of the block for prediction. Taking the discrete cosine transform utilizes a line of pixels from an above neighboring block and a same line of pixels from a left neighboring block. Applying the weighting includes weighting factors initially derived from offline training and updating based on previous reconstructed pixels.
In yet another aspect, an encoder comprises an intra coding module for encoding an image by calculating a prediction matrix using a training data set and recursively re-calculating the prediction matrix using a previous prediction matrix and prediction data of a current macroblock using neighboring pixels to filter the image, resulting in a processed image, and an inter coding module for encoding the image using motion compensation. The training data set is an offline training data set. The prediction matrix is computed using a cross-correlation matrix and an auto-correlation matrix. The filtering is applied to video coding. The coding comprises intra coding. The encoder further comprises implementing spatial filtering. Spatial filtering comprises restricting allowable values of the prediction matrix. A filter is restricted to have a unity DC gain, and/or a linear phase response. The filter is shift-invariant, and coefficients are chosen so that the L2-norm prediction residual is minimized based on past statistics. Filtering is not implemented if the neighboring pixels are across an edge. The encoder further comprises implementing Discrete Cosine Transform-domain filtering. Implementing discrete cosine transform-domain filtering comprises taking a discrete cosine transform of a block using a set of predictors resulting in transform coefficients, applying a weighting to the transform coefficients and taking an inverse discrete cosine transform to generate new predictors. The encoder further comprises taking the discrete cosine transform of neighboring pixels of the block for prediction. Taking the discrete cosine transform utilizes a line of pixels from an above neighboring block and a same line of pixels from a left neighboring block. Applying the weighting includes weighting factors initially derived from offline training and updating based on previous reconstructed pixels.
A recursive adaptive intra smoothing (RAIS) filter for intra-mode video coding is described herein. The filter is able to be executed using one or more approaches including, but not limited to matrix multiplication, spatial filtering and frequency domain filtering. Matrix multiplication includes initially computing a prediction matrix Pm using offline training data. After coding a macroblock, Pm is updated for future macroblocks. In the case of applying spatial filtering, the shift invariance problem is reduced by imposing certain constraints on the matrix to be solved. In frequency domain filtering, a transform residual is minimized using DCT-domain filtering.
In the inter-frame Recursive Adaptive Interpolation Filter (RAIF), for example, as described in U.S. Patent Application Ser. No. 61/301,430, filed Feb. 4, 2011 and entitled, “RECURSIVE ADAPTIVE INTERPOLATION FILTERS (RAIF),” which is hereby incorporated by reference in its entirety for all purposes, a current block of an image is denoted y and its motion compensated prediction is denoted x. A set of filters Ak is tested, and the one that minimizes the prediction residual, ∥y−Akx∥1, is chosen. The filter index k is then transmitted. Both the encoder and decoder update Rxx (auto-correlation) and Rxy (cross-correlation) for the kth filter, and use the new filter for the future blocks.
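By way of illustration only, the filter selection step described above is able to be sketched in Python as follows. The sketch assumes that the subscript in ∥y−Akx∥1 denotes the L1 norm (sum of absolute differences) and that blocks are handled as flat vectors; the function names and the two candidate filters are hypothetical and are not taken from the referenced application.

```python
def apply_filter(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def select_filter(filters, x, y):
    """Return (k, cost) for the filter A_k that minimizes sum(|y - A_k x|)."""
    best_k, best_cost = None, float("inf")
    for k, A in enumerate(filters):
        pred = apply_filter(A, x)
        cost = sum(abs(yi - pi) for yi, pi in zip(y, pred))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Hypothetical candidates: an identity filter and a 2-tap averaging filter.
candidate_filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
x = [10.0, 20.0]  # motion compensated prediction
y = [15.0, 15.0]  # current block
k, cost = select_filter(candidate_filters, x, y)  # k is the index to transmit
```

Only the index k needs to be signaled, because the encoder and decoder hold the same candidate set and apply the same statistics update afterwards.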
Recursive Adaptive Intra Smoothing (RAIS) using Matrix Multiplication
The inter prediction RAIF is extended to intra prediction, which is referred to as RAIS. A 4×4 block intra prediction is used as an example. In RAIS, y is the current block being predicted, which is vectorized to 16×1, and x is the L-shape neighbors, a 13×1 vector for a 4×4 block. For each intra prediction mode m (e.g., m could be one of the 9 modes defined in AVC), a prediction matrix Pm is employed: Pred(y)=Pmx, where the size of Pm is 16×13 for prediction of a 4×4 block. Thus, y is able to be predicted using x. The prediction matrix Pm is the optimal prediction matrix based on x and y. Pm is determined by recursively letting the encoder and decoder learn about the statistics related to the predictor and the signal to be predicted. The previous statistics are used to improve the prediction during the encoding process. For each mode, there is an auto-correlation matrix Rxx and a cross-correlation matrix Rxy. Initially, the cross-correlation matrix, the auto-correlation matrix and Pm are computed based on training data. After each macroblock is coded, Rxx, Rxy and Pm are updated for future macroblocks by taking the previous values and combining them with new values including neighboring pixel prediction values. The update of the prediction matrix of the nth macroblock is shown as follows:
$$R_{xx}^{m}(n+1) = (1-\lambda)\,R_{xx}^{m}(n) + \lambda\,E(\hat{x}\hat{x}^{T})$$
$$R_{xy}^{m}(n+1) = (1-\lambda)\,R_{xy}^{m}(n) + \lambda\,E(\hat{x}\hat{y}^{T})$$
$$P^{m}(n) = [R_{xx}^{m}(n)]^{-1}\,R_{xy}^{m}(n)$$
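The recursive update above is able to be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the described method: λ is read as a forgetting factor between 0 and 1, the expectation E(·) is estimated by averaging over the neighbor/block vector pairs of the macroblock just coded, the cross-correlation is taken between the neighbors and the block, and the solved matrix is transposed so that the result has the stated 16×13 shape for Pred(y)=Pmx. All function names are hypothetical.

```python
def mean_outer(us, vs):
    """Sample estimate of E(u v^T) over paired lists of vectors."""
    n = len(us)
    acc = [[0.0] * len(vs[0]) for _ in us[0]]
    for u, v in zip(us, vs):
        for i, ui in enumerate(u):
            for j, vj in enumerate(v):
                acc[i][j] += ui * vj / n
    return acc


def blend(R_old, R_new, lam):
    """Compute (1 - lam) * R_old + lam * R_new element by element."""
    return [[(1 - lam) * a + lam * b for a, b in zip(ra, rb)]
            for ra, rb in zip(R_old, R_new)]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(a) + list(b) for a, b in zip(A, B)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("auto-correlation matrix is singular")
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def rais_update(Rxx, Rxy, xs, ys, lam):
    """Update Rxx and Rxy for one mode and recompute the prediction matrix.

    xs: reconstructed neighbor vectors; ys: reconstructed block vectors.
    """
    Rxx_new = blend(Rxx, mean_outer(xs, xs), lam)
    Rxy_new = blend(Rxy, mean_outer(xs, ys), lam)
    W = solve(Rxx_new, Rxy_new)          # [Rxx]^-1 Rxy
    P = [list(col) for col in zip(*W)]   # transposed (assumption) so Pred(y) = P x
    return Rxx_new, Rxy_new, P


def predict(P, x):
    """Pred(y) = P x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]
```

For a 4×4 block each x would have 13 entries and each y 16 entries, giving a 13×13 Rxx, a 13×16 Rxy and a 16×13 prediction matrix. Because the update uses only reconstructed pixels, a decoder running the same function stays synchronized with the encoder without side information.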
RAIS Using Spatial Filtering
RAIS using spatial filtering is a variation of the previous approach using matrix multiplication in the sense that certain constraints are imposed on the matrix to be solved. In spatial filtering, the constraint is shift invariance. One example is shown in
Avoid Filtering Across an Edge
The derived RAIS filters usually have low-pass characteristics. Therefore, filtering the neighborhood should be avoided if the corresponding pixels are across an edge. A 1D Laplacian operator [−1, 2, −1] is used to detect if there is a strong gradient at each neighborhood pixel. If the gradient is greater than a threshold, RAIS is not applied to that pixel, to preserve the edge, and the auto- and cross-correlation matrices are not updated based on that pixel either.
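The edge test described above is able to be sketched in Python as follows. The sketch assumes that the magnitude of the Laplacian response is compared against the threshold and that the first and last neighbor pixels are handled by replicating the border sample. The smoothing taps (0.25, 0.5, 0.25) are a hypothetical stand-in for a derived RAIS filter, chosen only because they are symmetric and sum to one; the threshold value is likewise an assumption.

```python
def edge_mask(neighbors, threshold):
    """Flag each neighbor pixel whose |[-1, 2, -1] response| exceeds threshold."""
    n = len(neighbors)
    mask = []
    for i in range(n):
        left = neighbors[max(i - 1, 0)]        # replicate the border sample
        right = neighbors[min(i + 1, n - 1)]
        laplacian = -left + 2 * neighbors[i] - right
        mask.append(abs(laplacian) > threshold)
    return mask


def smooth_unless_edge(neighbors, threshold, taps=(0.25, 0.5, 0.25)):
    """Low-pass filter the neighbor pixels, leaving flagged pixels untouched."""
    n = len(neighbors)
    mask = edge_mask(neighbors, threshold)
    out = []
    for i in range(n):
        if mask[i]:
            out.append(neighbors[i])           # preserve the edge
        else:
            left = neighbors[max(i - 1, 0)]
            right = neighbors[min(i + 1, n - 1)]
            out.append(taps[0] * left + taps[1] * neighbors[i] + taps[2] * right)
    return out


row = [10, 12, 10, 80, 80, 80]                 # a step edge between index 2 and 3
mask = edge_mask(row, 30)
filtered = smooth_unless_edge(row, 30)
```

The same mask is able to be reused to skip the flagged pixels when the auto- and cross-correlation matrices are updated.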
DCT-Domain Filtering
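As stated in the summary, DCT-domain filtering comprises taking a discrete cosine transform of a set of predictors, applying a weighting to the transform coefficients and taking an inverse discrete cosine transform to generate new predictors. The following Python sketch illustrates that sequence on a single line of predictor pixels. The orthonormal DCT-II, the function names and the example weights are assumptions made for illustration; in the described method the weighting factors are initially derived from offline training and updated based on previous reconstructed pixels.

```python
import math


def dct(v):
    """Orthonormal 1D DCT-II of a list of samples."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct(): orthonormal 1D DCT-III."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n)
                 * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out


def dct_domain_filter(predictors, weights):
    """Weight the DCT coefficients of the predictors and transform back."""
    coeffs = dct(predictors)
    weighted = [w * c for w, c in zip(weights, coeffs)]
    return idct(weighted)


predictors = [10.0, 20.0, 30.0, 40.0]
weights = [1.0, 0.8, 0.5, 0.2]   # hypothetical: attenuate higher frequencies
new_predictors = dct_domain_filter(predictors, weights)
```

Keeping the first weight at 1.0 leaves the mean of the predictors unchanged, while weights below 1.0 on the remaining coefficients act as a low-pass filter.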
Using Different Filters within a Block
In some embodiments, the recursive adaptive intra-smoothing application(s) 730 include several applications and/or modules. In some embodiments, the recursive adaptive intra-smoothing application(s) 730 include modules such as a matrix multiplication module for implementing RAIS using matrix multiplication, a spatial filtering module for implementing spatial filtering and a DCT-domain filtering module for implementing DCT-domain filtering. In some embodiments, fewer or additional modules and/or sub-modules are able to be included.
Examples of suitable computing devices include a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone, a video player, a DVD writer/player, a Blu-ray® writer/player, a television, a home entertainment system or any other suitable computing device.
The difference between the original and the predicted block is referred to as the residual of the prediction. The residual is transformed, and the transform coefficients are scaled and quantized at the transform and scaling quantization module 804. Each block is transformed using an integer transform, and the transform coefficients are quantized and transmitted using entropy-coding methods. An entropy encoder 816 uses a codeword set for all elements except the quantized transform coefficients. For the quantized transform coefficients, Context Adaptive Variable Length Coding (CAVLC) or Context Adaptive Binary Arithmetic Coding (CABAC) is utilized. The deblocking filter 808 is implemented to control the strength of the filtering to reduce the blockiness of the image.
The encoder 800 also contains the local decoder 818 to generate prediction reference for the next blocks. The quantized transform coefficients are inverse scaled and inverse transformed 806 in the same way as the encoder side which gives a decoded prediction residual. The decoded prediction residual is added to the prediction, and the combination is directed to the deblocking filter 808 which provides decoded video as output. Ultimately, the entropy coder 816 produces compressed video bits 820 of the originally input video 802.
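The residual and integer transform steps described above are able to be sketched in Python as follows. The sketch assumes a 4×4 block and uses the 4×4 forward core transform matrix of H.264; the scaling and quantization stage is omitted, and the function names are hypothetical.

```python
# Forward 4x4 core transform matrix of H.264 (scaling is applied separately).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def residual(original, predicted):
    """Difference between the original block and its prediction."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]


def core_transform(block):
    """Compute CF * block * CF^T using integer arithmetic only."""
    cf_t = [list(col) for col in zip(*CF)]
    return matmul(matmul(CF, block), cf_t)


original = [[5] * 4 for _ in range(4)]
predicted = [[4] * 4 for _ in range(4)]
coeffs = core_transform(residual(original, predicted))
```

A better prediction leaves a smaller residual, so fewer and smaller coefficients remain to be quantized and entropy coded, which is where the intra smoothing contributes to compression efficiency.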
To utilize recursive adaptive intra-smoothing, a device such as a digital camera or camcorder is used to acquire an image or video of the scene. The recursive adaptive intra-smoothing is automatically performed. The recursive adaptive intra-smoothing is also able to be implemented after the image is acquired to perform post-acquisition processing.
In operation, recursive adaptive intra-smoothing is for block-based transforms. The compression method involves one or more of matrix multiplication, spatial filtering and frequency domain filtering. By implementing recursive adaptive intra-smoothing, compression efficiency is improved.
The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be readily apparent to one skilled in the art that other various modifications may be made in the embodiment chosen for illustration without departing from the spirit and scope of the invention as defined by the claims.