Method and apparatus for bit rate control in video signal encoding using a rate-distortion model
In many video encoding bit rate control models such as TMN8 and RHO domain, the distortion D has been described as a uniform weighted distortion, and therefore the distortion in each macroblock is introduced by uniformly quantising its DCT transform coefficients with a quantiser of step size Qstep. The TMN8 rate control uses a frame-layer rate control to select a target number of bits for the current frame, and a macroblock-layer rate control to select the values of the quantisation step sizes for the macroblocks. However, the distribution of the residual coding error does not conform to a uniform distribution but approximately to a Laplacian distribution. According to the invention an approximated but still easy-to-use Laplacian distribution model is calculated for the quantisation step size adaptation in the macroblock layer bit rate control.
The invention relates to a method and to an apparatus for bit rate control in video signal encoding using a Rate-Distortion model, the pictures of which video signal are divided into a grid of pixel blocks, wherein the pixels in a block are transformed, quantised, inverse quantised and inverse transformed, and wherein the picture sequence encoding types include intra coding and non-intra coding.
In many video encoding bit rate control models such as TMN8 and RHO domain, the distortion D (i.e. the encoding/decoding error for a pixel block caused by quantisation) has been described as a uniform weighted distortion, the distribution of which is shown in Fig. 2 as a function of the standard deviation of the prediction residue σ. The distortion in each macroblock is therefore taken to be introduced by uniformly quantising its DCT transform coefficients with a quantiser of step size Qstep. A typically used distortion measure D for the encoded macroblocks, described in the literature, is
$$D = \frac{1}{N}\sum_{i=1}^{N} \alpha_i^2\,\frac{Q_{\mathrm{step},i}^2}{12} \qquad (1)$$
where N is the number of macroblocks in a frame, factor αi is the distortion or importance weight of the i-th macroblock and Qstepi is the step size used for quantising the i-th macroblock.
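As a minimal numerical sketch, assuming the measure takes the usual TMN8 form D = (1/N) Σ αi² Qstepi² / 12, the uniform weighted distortion can be computed as follows. The function name and the example values are illustrative and not part of the described method.

```python
def uniform_weighted_distortion(qsteps, alphas):
    """Average of alpha_i^2 * Qstep_i^2 / 12 over all macroblocks (assumed TMN8 form)."""
    if len(qsteps) != len(alphas) or not qsteps:
        raise ValueError("need one weight per macroblock and at least one macroblock")
    n = len(qsteps)
    return sum(a * a * q * q / 12.0 for q, a in zip(qsteps, alphas)) / n


if __name__ == "__main__":
    # One macroblock, unit weight, step size 26 -> 26^2 / 12
    print(uniform_weighted_distortion([26.0], [1.0]))
```

Under this model the distortion depends only on the step sizes and weights, not on the statistics of the residual.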
The TMN8 rate control uses a frame-layer rate control to select a target number of bits for the current frame, and a macroblock-layer rate control to select the values of the quantisation step sizes for the macroblocks.
In the frame-layer rate control a target number B of bits for the current frame is determined by:
$$B = \frac{R}{F} - \Delta, \qquad \Delta = \begin{cases} W/F, & W > Z \cdot M \\ W - Z \cdot M, & \text{otherwise} \end{cases}$$
$$W = \max\left(W_{\mathrm{prev}} + B_{\mathrm{prev}} - \frac{R}{F},\; 0\right)$$
wherein B is the target number of bits for a frame (including the bits for quantised transform coefficients, motion vectors and header information), R is the channel rate in bits per second, F is the frame rate in frames per second, W is the number of bits in the encoder buffer, Z = 0.1 is set as a default value to achieve a low delay, M is a maximum value indicating buffer fullness that is set by default to R/F, Wprev is the previous number of bits in the encoder buffer, and Bprev is the actual number of bits used for encoding the previous frame.
The frame target bit rate B varies depending on the type of video frame, e.g. I, P or B, on the buffer fullness and on the channel throughput. If W is greater than 10% of M, B is slightly decreased. Otherwise, B is slightly increased.
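The frame-layer behaviour can be illustrated with a short sketch. It assumes the usual TMN8 formulation, B = R/F − Δ with Δ = W/F if W > Z·M and Δ = W − Z·M otherwise, and W = max(Wprev + Bprev − R/F, 0). The function names and the example channel parameters are illustrative only.

```python
def buffer_level(w_prev, b_prev, r, f):
    """Encoder buffer content after the previous frame (assumed TMN8 update)."""
    return max(w_prev + b_prev - r / f, 0.0)


def frame_target_bits(w, r, f, z=0.1, m=None):
    """Target bits for the current frame given buffer content w."""
    if m is None:
        m = r / f                      # default maximum buffer fullness
    delta = w / f if w > z * m else w - z * m
    return r / f - delta


if __name__ == "__main__":
    r, f = 64000.0, 10.0               # 64 kbit/s at 10 frames per second
    print(frame_target_bits(1000.0, r, f))   # buffer above 10 % of M: target reduced
    print(frame_target_bits(0.0, r, f))      # empty buffer: target increased
```

With these example values R/F is 6400 bits, so a buffer content above 640 bits lowers the target and a content below it raises the target.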
The macroblock-layer rate control selects the values of the quantisation step sizes Qstep for all the macroblocks in a current frame so that the sum of the bits used for these macroblocks is close to the frame target bit rate B.
However, equation (1) is not a good model for estimating or calculating the residual coding error, because the distribution of the residual coding error (for P frames or for non-intra frames) after the transform does not conform to a uniform distribution but approximately to a Laplacian distribution, as shown in Fig. 1 as a function of the standard deviation of the prediction residue σ. This has been confirmed by experiments. Because using the Laplace distribution directly for controlling the bit rate is very complex in practice, most rate control methods such as TMN8 and RHO domain still use the uniform distribution, which however introduces a bit rate control error and degrades the visual quality.
A problem to be solved by the invention is to provide an optimised but still simple distortion model for video encoding bit rate control, which model approximates a Laplacian distribution and fits for any bit rate control that has a relation with the distortion. This problem is solved by the method disclosed in claim 1. A corresponding video signal and storage medium are disclosed in claims 8 and 9, respectively. An apparatus that utilises this method is disclosed in claim 2.
The invention uses a distortion model processing based on an approximated Laplacian distortion distribution, instead of a uniform distortion distribution, for the block or macroblock layer bit rate control by quantisation step size adaptation. This kind of processing can be used in any rate control which is based on a Rate-Distortion model, such as MPEG2 Video (ISO/IEC 13818-2) rate control, MPEG4 Video (ISO/IEC 14496-2) rate control, MPEG4 AVC (ISO/IEC 14496-10, Advanced Video Coding) rate control, and MPEG4-AVC SVC (Scalable Video Coding) rate control.
The inventive bit rate control is more accurate than other Rate-Distortion bit rate controls. It is very simple and can be realised easily in practical applications.
In principle, the inventive method is suited for bit rate control in video signal encoding using a Rate-Distortion model for said bit rate control, the pictures of which video signal are divided into a grid of pixel blocks or macroblocks, wherein the pixels in a block or macroblock are transformed, quantised, inverse quantised and inverse transformed, and wherein the picture sequence encoding types include intra coding and non-intra coding, said method including the steps:
In principle the inventive apparatus is suited for bit rate control in video signal encoding using a Rate-Distortion model for said bit rate control, the pictures of which video signal are divided into a grid of pixel blocks or macroblocks, wherein the pixels in a block or macroblock are transformed, quantised, inverse quantised and inverse transformed, and wherein the picture sequence encoding types include intra coding and non-intra coding, said apparatus including:
Advantageously the coding of the current non-intra picture includes:
Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.
Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:
The Laplace distortion distribution bit rate control model processing is carried out as described below. The probability density function (PDF) of the Laplace distribution (cf. Fig. 1) can be described as
$$p(y) = \frac{1}{\sqrt{2}\,\sigma}\, e^{-\sqrt{2}\,|y|/\sigma} \qquad (2)$$
where σ is the standard deviation of the prediction residue within a block of transform coefficients and y is the transform coefficient. Practical experiments with different source sequences show that the distribution of transform (e.g. DCT transform) coefficient magnitudes (including intra and inter transform coefficients) in any codec like MPEG2-2, MPEG4-2, and MPEG4 AVC approximates the Laplace distribution. Generally, most codecs, including MPEG2, MPEG4 and MPEG4 AVC, use a scalar quantiser. The basic forward quantiser operation is
$$Z_{ij} = \mathrm{round}\!\left(\frac{Y_{ij}}{Q}\right) \qquad (3)$$
where i and j are the horizontal and vertical positions in a transform block, Yij is an original transform coefficient, Q is the quantiser step size and Zij is the quantised version of that transform coefficient.
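The two building blocks just described can be sketched in a few lines. The sketch assumes a zero-mean Laplace density with standard deviation σ and a round-to-nearest scalar quantiser; the helper names are hypothetical, and real codecs typically add rounding offsets and integer scaling that are not modelled here.

```python
import math


def laplace_pdf(y, sigma):
    """Zero-mean Laplace density whose standard deviation is sigma."""
    return math.exp(-math.sqrt(2.0) * abs(y) / sigma) / (math.sqrt(2.0) * sigma)


def quantise(y, q):
    """Forward quantiser: nearest integer level (ties round upwards)."""
    return math.floor(y / q + 0.5)


def dequantise(z, q):
    """Inverse quantiser: reconstruct the coefficient from its level."""
    return z * q


if __name__ == "__main__":
    q = 26.0
    for y in (12.0, 27.0, -14.0):
        z = quantise(y, q)
        print(y, z, dequantise(z, q))
```

The difference between a coefficient and its reconstructed value is the quantisation error whose mean square the distortion model tries to predict.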
MPEG2 and MPEG4 support 31 different values of Q. In MPEG4 AVC a total of 52 Q values are supported by the standard, and these are indexed by a quantisation parameter value QP. Q doubles in size for every increment of 6 in QP, which corresponds to an increase of roughly 12% on average for each increment of 1 in QP. The different step sizes nevertheless have the same properties as far as the quantisation is concerned.
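The QP-to-step-size relation for MPEG4 AVC can be sketched as a small lookup. The six base values used for QP 0 to 5 are the nominal step sizes commonly tabulated for this standard and are stated here as an assumption; the function name is illustrative.

```python
# Nominal step sizes for QP = 0..5 (assumed table); every further 6 QP steps double Q.
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep_from_qp(qp):
    """Quantiser step size for an MPEG4 AVC quantisation parameter in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


if __name__ == "__main__":
    for qp in (0, 6, 12, 28, 51):
        print(qp, qstep_from_qp(qp))
```

The doubling every 6 steps means that the 52 QP values cover a step size range of more than two orders of magnitude.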
The mean squared error can be used as a measure for the theoretical distortion D(Q) which can be described as follows:
wherein Q(i) is the reconstructed y value derived after quantisation and inverse quantisation.
Following insertion of equation (2) into equation (4):
Because $(y - Q(i))^2 = y^2 - 2\,y\,Q(i) + Q(i)^2$ and
The sinh function is defined as $\sinh(x) = \frac{e^{x} - e^{-x}}{2}$.
When assuming that
Then, equation (9) can be expressed as
A corresponding comparison between the Laplacian distortion, the Laplace simulation distortion and the uniform distortion is depicted in Fig. 3 in which, for quantiser step size Q = 26, the distortion D is shown as a function of σ. The 'x' signs represent the uniform distortion, the 'o' signs represent the real distortion and the '+' signs represent the inventive distortion simulation. The D value for the uniform distortion results from Q = 26: Q²/12 ≈ 56.3.
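The gap between the uniform model and a Laplace source can be checked numerically. The following sketch is not the inventive approximation; it assumes a zero-mean Laplace density with standard deviation σ, a round-to-nearest quantiser reconstructing at integer multiples of Q, and integrates the squared error bin by bin with the midpoint rule. All names are hypothetical.

```python
import math


def laplace_pdf(y, sigma):
    """Zero-mean Laplace density whose standard deviation is sigma."""
    return math.exp(-math.sqrt(2.0) * abs(y) / sigma) / (math.sqrt(2.0) * sigma)


def laplace_mse_distortion(q, sigma, tail_sigmas=20.0):
    """Numerically integrate (y - i*q)^2 * p(y) over every quantiser bin."""
    n_bins = math.ceil(tail_sigmas * sigma / q)     # bins on each side of zero
    m = max(50, math.ceil(50.0 * q / sigma))        # sub-steps per bin, width <= sigma/50
    m += m % 2                                      # even count: kink at y = 0 lies on a cell edge
    h = q / m
    total = 0.0
    for i in range(-n_bins, n_bins + 1):
        left = (i - 0.5) * q                        # lower edge of bin i
        for k in range(m):
            y = left + (k + 0.5) * h                # midpoint of the sub-step
            err = y - i * q                         # reconstruction value is i*q
            total += err * err * laplace_pdf(y, sigma) * h
    return total


def uniform_distortion(q):
    """Distortion predicted by the uniform model."""
    return q * q / 12.0


if __name__ == "__main__":
    q = 26.0
    for sigma in (1.0, 5.0, 10.0, 20.0, 50.0):
        print(sigma, laplace_mse_distortion(q, sigma), uniform_distortion(q))
```

For small σ nearly all coefficients quantise to zero, so the integrated distortion approaches σ², whereas the uniform model stays at Q²/12 regardless of σ. Only when σ is large compared with Q do the two values converge.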
In TMN8, the original macroblock-layer rate control model is described by
$$Q_i = \sqrt{\frac{A\,K}{B - A\,N\,C}\;\frac{\sigma_i}{\alpha_i}\;\sum_{k=1}^{N} \alpha_k\,\sigma_k} \qquad (12)$$
where A is the number of pixels in a macroblock, K is a distortion model parameter, B is the total number of target bits for encoding the current frame, N is the number of macroblocks in the frame, C is the overhead rate, σi is the standard deviation of the residuum of the transform coefficient coding/decoding errors in the i-th block or macroblock, and αi is the distortion weight of the i-th macroblock.
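A minimal sketch of the original TMN8 macroblock-layer computation follows. It assumes the commonly published form Qi = sqrt(A·K / (B − A·N·C) · (σi/αi) · Σ αk·σk) evaluated once for the whole frame; an actual TMN8 encoder recomputes the remaining bits, the remaining sum and the model parameters K and C after every macroblock, which is omitted here. Names and example values are illustrative.

```python
import math


def tmn8_step_sizes(sigmas, alphas, a, k, b, c):
    """Step size per macroblock from the assumed TMN8 formula (static form)."""
    n = len(sigmas)
    budget = b - a * n * c                 # bits left for coefficients after overhead
    if budget <= 0:
        raise ValueError("target bits do not cover the overhead")
    weighted_sum = sum(al * s for al, s in zip(alphas, sigmas))
    return [
        math.sqrt(a * k / budget * (s / al) * weighted_sum)
        for s, al in zip(sigmas, alphas)
    ]


if __name__ == "__main__":
    # Four macroblocks of 256 pixels with equal statistics and unit weights
    print(tmn8_step_sizes([2.0] * 4, [1.0] * 4, a=256, k=0.5, b=512.0, c=0.0))
```

With equal weights the step size grows with the square root of σi, so macroblocks with a larger residual are quantised more coarsely.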
The inventive Laplace distortion model approximation can be applied in any (block or macroblock layer) bit rate control such as TMN8, RHO-domain, and so on.
When using the inventive processing for a TMN8 rate control, the TMN8 macroblock-layer rate control can be modified according to:
wherein the parameters have the same meaning as in formula (12). Formula (13) requires about the same computational complexity as formula (12). However, its bit rate control accuracy or quality is better than when applying formula (12).
Formula (13) is derived using formula (11) as follows. According to Rate-Distortion theory:
In Fig. 5 the video data input signal IE includes block or macroblock data to be encoded. In case intraframe (I) data are to be encoded, subtractor S passes the video input data via a transformer TR (computing e.g. a DCT transform) and a quantiser Q to an entropy encoder ENTCOD that delivers the encoder video data output signal OE.
In case interframe (P, B), i.e. non-intra frame, data are to be encoded, subtractor S subtracts predicted block or macroblock data PRD from the input signal IE and passes the difference data via transformer TR and quantiser Q to entropy encoder ENTCOD. The switching between the intraframe and interframe modes is carried out by a corresponding switch SW that may be implemented by an XOR function.
The output signal of Q is also fed to an inverse quantiser IQU, the output signal of which passes through an inverse transformer ITR to an adder A and represents reconstructed block or macroblock difference data. The output signal of adder A is intermediately stored in frame store and motion compensation means FSMC which also perform motion compensation on reconstructed block or macroblock data and which deliver such predicted block or macroblock data PRD to subtractor S and to the other input of adder A.
Quantiser Q and inverse quantiser IQU are controlled by the current quantiser step size Qstep (corresponding to Qi in formula (13) in P frames, as described above). It is also possible that FSMC does not perform motion compensation. Inverse quantiser IQU, inverse transformer ITR and frame store and motion compensation means FSMC represent a decoder DEC.
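The signal path through subtractor S, transformer TR, quantiser Q, inverse quantiser IQU, inverse transformer ITR and adder A can be sketched on a one-dimensional block. The sketch assumes an orthonormal DCT-II on a row of samples and a round-to-nearest quantiser; entropy coding and motion compensation are left out, and all function names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n)
        * sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
        for k in range(n)
    ]


def idct(c):
    """Inverse of dct() above."""
    n = len(c)
    return [
        sum(
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * c[k] * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
            for k in range(n)
        )
        for t in range(n)
    ]


def encode_block(block, prediction, qstep):
    """Return (levels, reconstruction); use an all-zero prediction for intra mode."""
    residual = [b - p for b, p in zip(block, prediction)]         # subtractor S
    coeffs = dct(residual)                                        # transformer TR
    levels = [math.floor(c / qstep + 0.5) for c in coeffs]        # quantiser Q
    recon_coeffs = [lv * qstep for lv in levels]                  # inverse quantiser IQU
    recon_residual = idct(recon_coeffs)                           # inverse transformer ITR
    recon = [p + r for p, r in zip(prediction, recon_residual)]   # adder A
    return levels, recon


if __name__ == "__main__":
    block = [52, 55, 61, 66, 70, 61, 64, 73]
    levels, recon = encode_block(block, [0] * 8, qstep=4.0)
    print(levels)
    print([round(v, 2) for v in recon])
```

The levels would go to the entropy encoder, while the reconstruction is what the frame store keeps, so that encoder and decoder predict from the same data.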
In the I frame and macroblock layer bitrate control processing and P frame layer bitrate control processing depicted in Fig. 4, for an I frame an initialisation step INI is carried out which can use a TMN8 model (if e.g. a bit rate control error under 5% is required) to compute the quantisation parameter QP and thereby the quantisation step size for each macroblock in that I frame. However, the QP for this I frame can also be derived from the adjacent prior P frame, which means that for all macroblocks in the first I frame a fixed QP is used. Because the Laplace distortion model fits only the residuum in P frames, and because experiments have shown that in I frames the prediction residue follows a generalised Gaussian distribution, the bitrate control for I frames can select either a fixed QP or an adaptive QP derived from the TMN8 model.
Advantageously, by using a fixed QP for each I frame macroblock one gets a smaller 'intra refresh' effect as compared to using an adaptive QP according to the TMN8 model. However, using a fixed QP for an I frame may introduce additional bits for that frame. Therefore an initialisation is carried out only for I frames, but not for P frames because the inventive rate control is accurate enough to adjust the QP for P frames as necessary.
After the I frame has been encoded in step ICOD, the encoding provides the remaining-bits information required for applying the parameters of the Laplace distortion model to the following P frame. The depicted skipping control step SKCTRL can be omitted following this I frame encoding. As an alternative, if there was buffer overflow, steps INI, ICOD, UPDRD and SKCTRL are carried out again, whereby the quantiser step size or sizes are adapted accordingly.
The target bit number for the following P frame is calculated in an estimate target bitrate step ETB and the corresponding encoder buffer fullness value is adjusted in a buffer control step BCTRL. Step BCTRL determines the buffer filling level, which is used for the macroblock layer bit rate control processing according to Fig. 5 to prevent buffer underflow and buffer overflow. For example, if the buffer fullness level is greater than 80% of the buffer size, a delta value is added to the QP value for preventing overflow. If the buffer fullness level is smaller than e.g. 20% of the buffer size, a delta value is subtracted from the QP value for preventing underflow.
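The buffer-based QP correction can be sketched as follows. The 80% and 20% thresholds are the example values from the text; the size of the delta value and the clamping range 0..51 (the MPEG4 AVC QP range) are assumptions made for illustration, as is the function name.

```python
def adjust_qp_for_buffer(qp, fullness, buffer_size, delta=2, qp_min=0, qp_max=51):
    """Raise QP when the buffer is nearly full, lower it when nearly empty."""
    if fullness > 0.8 * buffer_size:
        qp += delta                    # coarser quantisation, fewer bits: avoids overflow
    elif fullness < 0.2 * buffer_size:
        qp -= delta                    # finer quantisation, more bits: avoids underflow
    return max(qp_min, min(qp_max, qp))


if __name__ == "__main__":
    for level in (10, 50, 90):
        print(level, adjust_qp_for_buffer(30, level, 100))
```

Between the two thresholds the QP computed by the macroblock-layer control is left unchanged.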
Thereafter the current P frame is encoded in step PCOD, followed by step UPDRD.
In skipping control step SKCTRL it is checked whether or not the encoder buffer (total buffer) has overflowed. If true, the current frame is skipped, the target bit number for the next P frame coding is updated again in the estimate target bitrate step ETB and the corresponding encoder buffer fullness value is re-adjusted in the buffer control step BCTRL.
The P frame encoding in step PCOD is explained in connection with Fig. 5. The processing for a current P frame starts with a Laplace distortion model parameters initialising step INILD. Thereafter, for the first macroblock in the current P frame the QP, i.e. the quantiser step size Q1 according to formula (13), is calculated in step QPCLC and applied in encoding the macroblock in step MBCOD. The macroblock encoding is followed by a Laplace distortion updating step UPLD. Thereafter it is checked in step EOF whether or not the end of the current P frame has been reached. If not true, the processing continues for the second or next, respectively, macroblock with step QPCLC. If true, the processing continues with step UPDRD in Fig. 4.
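The control flow of step PCOD can be sketched as a loop over the macroblocks. Because the concrete QP computation and model update are not reproduced in this sketch, they are passed in as hypothetical callables; only the order of the steps INILD, QPCLC, MBCOD, UPLD and EOF is taken from the description.

```python
def encode_p_frame(macroblocks, init_model, compute_qp, encode_mb, update_model):
    """Run the per-macroblock loop and return the number of bits spent."""
    model = init_model()                            # INILD: initialise model parameters
    total_bits = 0
    for mb in macroblocks:                          # loop ends at end of frame (EOF)
        qp = compute_qp(model, mb)                  # QPCLC: quantiser for this macroblock
        used = encode_mb(mb, qp)                    # MBCOD: encode, returns bits used
        model = update_model(model, mb, qp, used)   # UPLD: update the distortion model
        total_bits += used
    return total_bits                               # caller continues with UPDRD


if __name__ == "__main__":
    # Stub callables: QP rises by one per macroblock, each macroblock costs 100 bits
    bits = encode_p_frame(
        ["mb0", "mb1", "mb2"],
        init_model=lambda: {"count": 0},
        compute_qp=lambda model, mb: 26 + model["count"],
        encode_mb=lambda mb, qp: 100,
        update_model=lambda model, mb, qp, used: {"count": model["count"] + 1},
    )
    print(bits)
```

Keeping the model as an explicit value that is threaded through the loop makes it clear that each macroblock's QP depends on the outcome of all previously encoded macroblocks in the frame.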
"P frame" may include any form of predicted frame, e.g. B frame.
The invention can also be applied to field-based encoding.