
Method and apparatus for bit rate control in video signal encoding using a rate-distortion model

Published: 2021-09-19

In many video encoding bit rate control models such as TMN8 and RHO domain, the distortion D has been described as a uniform weighted distortion, and therefore the distortion in each macroblock is introduced by uniformly quantising its DCT transform coefficients with a quantiser of step size Qstep. The TMN8 rate control uses a frame-layer rate control to select a target number of bits for the current frame, and a macroblock-layer rate control to select the values of the quantisation step sizes for the macroblocks. However, the distribution of the residual coding error does not conform to a uniform distribution but approximately to a Laplacian distribution. According to the invention an approximated but still easy-to-use Laplacian distribution model is calculated for the quantisation step size adaptation in the macroblock layer bit rate control.

1. Method for bit rate control in video signal (IE) encoding using a Rate-Distortion model for said bit rate control, the pictures of which video signal are divided into a grid of pixel blocks or macroblocks, wherein the pixels in a block or macroblock are transformed (TR), quantised (QU), inverse quantised (IQU) and inverse transformed (ITR), and wherein the picture sequence encoding types include intra coding and non-intra coding, characterised by the steps:

  • initialising (INI) the coding of an intra picture for determining the quantiser step size to be used for each block or macroblock in said intra picture;
  • coding (ICOD) the blocks or macroblocks of said intra picture thereby using said pre-determined quantiser step size;
  • after said intra picture has been encoded, updating (UPDRD) the parameters of a Laplace distortion model for the following non-intra picture;
  • calculating (ETB) a target bit number (B) for said following non-intra picture;
  • adjusting (BCTRL) a corresponding encoder buffer fullness value;
  • coding (PCOD) the current non-intra picture;
  • updating (UPDRD) the parameters of said Laplace distortion model for the following non-intra picture;
  • checking (SKCTRL) whether or not the encoder buffer is overflow, and if true, skipping the current non-intra picture and re-updating correspondingly the target bit number (B) for the next non-intra picture coding and re-adjusting correspondingly said buffer fullness value;
  • correspondingly continuing the coding (PCOD) of the remaining non-intra pictures.

2. Apparatus for bit rate control in video signal (IE) encoding using a Rate-Distortion model for said bit rate control, the pictures of which video signal are divided into a grid of pixel blocks or macroblocks, wherein the pixels in a block or macroblock are transformed (TR), quantised (QU), inverse quantised (IQU) and inverse transformed (ITR), and wherein the picture sequence encoding types include intra coding and non-intra coding, said apparatus including:

  • means (INI) being adapted for initialising the coding of an intra picture for determining the quantiser step size to be used for each block or macroblock in said intra picture;
  • means (ICOD) being adapted for coding the blocks or macroblocks of said intra picture thereby using said pre-determined quantiser step size;
  • means (UPDRD) being adapted for updating, after said intra picture has been encoded, the parameters of a Laplace distortion model for the following non-intra picture;
  • means (ETB) being adapted for calculating a target bit number (B) for said following non-intra picture;
  • means (BCTRL) being adapted for adjusting a corresponding encoder buffer fullness value;
  • means (PCOD) being adapted for coding the current non-intra picture;
  • means (UPDRD) being adapted for updating the parameters of said Laplace distortion model for the following non-intra picture;
  • means (SKCTRL) being adapted for checking whether or not the encoder buffer is overflow, and if true, for skipping the current non-intra picture and re-updating correspondingly the target bit number (B) for the next non-intra picture coding and re-adjusting correspondingly said buffer fullness value, whereby the coding (PCOD) of the remaining non-intra pictures is correspondingly continued.

3. Method according to claim 1, or apparatus according to claim 2, wherein said coding (PCOD) of the current non-intra picture includes:

  • initialising (INILD) the Laplace distortion model parameters for said current non-intra picture;
  • computing (QPCLC) for the first block or macroblock a quantiser step size (Q1) according to a distortion model derived from said Laplace distortion model;
  • coding (MBCOD) said first block or macroblock, thereby applying said computed quantiser step size;
  • updating (UPLD) the Laplace distortion model parameters for the following block or macroblock;
  • checking (EOF) whether or not the end of the current non-intra picture has been reached, and if not true, continuing the processing for the next block or macroblock with the next quantiser step size computation for the next block or macroblock, and if true, continuing the processing with said updating (UPDRD) of the parameters of said Laplace distortion model for the following non-intra picture.

4. Method according to claim 1 or 3, or apparatus according to claim 2 or 3, wherein the Laplacian distribution D of the distortion of transform coefficients coding/decoding errors for a pixel block is approximated by

$$D \approx \frac{\sigma^2 Q^2}{12\sigma^2 + Q^2},$$

wherein Q is the quantiser step size and σ is the variance of that distribution.

5. Method according to one of claims 1, 3 and 4, or apparatus according to one of claims 2 to 4, wherein the quantiser step size Qi for the i-th block or macroblock in a predicted picture is

$$Q_i = \sqrt{\frac{12\,\sigma_i^2\,AK\sum_{k=1}^{N}\alpha_k\sigma_k}{\alpha_i\sigma_i\bigl(ANK + 12(B - ANC)\bigr) - AK\sum_{k=1}^{N}\alpha_k\sigma_k}},$$

wherein A is the number of pixels in a block or macroblock, K is a distortion model parameter, βi is the number of bits left for encoding the current picture and βi = B is set at the beginning of the picture whereby B is the target number of bits for the picture as explained above, Ni is the number of block or macroblocks that remain to be encoded in the picture, C is the overhead rate, σi is the standard deviation of the residuum of the transform coefficient coding/decoding errors in the i-th block or macroblock, and αi is the distortion weight of the i-th block or macroblock, respectively.

6. Method according to one of claims 1 and 3 to 5, or apparatus according to one of claims 2 to 5, wherein in said coding (ICOD) of an intra picture block or macroblock a quantiser step size is used that corresponds to the step size used in the corresponding block or macroblock in the non-intra picture arranged before said intra picture.

7. Method according to one of claims 1 and 3 to 6, or apparatus according to one of claims 2 to 6, wherein said encoded video signal is an MPEG2, MPEG4, MPEG4 AVC or SVC video signal.

8. A digital video signal encoded using the method of one of claims 1 and 3 to 7.

9. Storage medium, for example an optical disc, that contains or stores, or has recorded on it, a digital video signal that was encoded according to the method of one of claims 1 and 3 to 7.
Full description

The invention relates to a method and to an apparatus for bit rate control in video signal encoding using a Rate-Distortion model, the pictures of which video signal are divided into a grid of pixel blocks, wherein the pixels in a block are transformed, quantised, inverse quantised and inverse transformed, and wherein the picture sequence encoding types include intra coding and non-intra coding.

Background

In many video encoding bit rate control models such as TMN8 and RHO domain, the distortion D (i.e. the encoding/decoding error for a pixel block caused by quantisation) has been described as a uniform weighted distortion, the distribution of which is shown in Fig. 2 as a function of the standard deviation σ of the prediction residue. The distortion in each macroblock is therefore assumed to be introduced by uniformly quantising its DCT transform coefficients with a quantiser of step size Qstep. A typically used distortion measure D for the encoded macroblocks, described for example in J. Ribas-Corbera, S. Lei, "Rate Control in DCT Video Coding for Low-Delay Communications", IEEE Trans. CSVT, Feb. 1999, is

$$D = \frac{1}{N}\sum_{i=1}^{N} \alpha_i^2\,\frac{Q_{step,i}^2}{12}, \tag{1}$$

where N is the number of macroblocks in a frame, factor αi is the distortion or importance weight of the i-th macroblock and Qstepi is the step size used for quantising the i-th macroblock.

The TMN8 rate control uses a frame-layer rate control to select a target number of bits for the current frame, and a macroblock-layer rate control to select the values of the quantisation step sizes for the macroblocks.

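In the frame-layer rate control a target number B of bits for the current frame is determined by:

$$B = \frac{R}{F} - \Delta, \qquad \Delta = \begin{cases} W/F, & W > Z \cdot M \\ W - Z \cdot M, & \text{otherwise} \end{cases}, \qquad W = \max\left(W_{prev} + B_{prev} - \frac{R}{F},\ 0\right),$$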

wherein B is the target number of bits for a frame (including the bits for quantised transform coefficients, motion vectors and header information), R is the channel rate in bits per second, F is the frame rate in frames per second, W is the number of bits in the encoder buffer, Z = 0.1 is set as a default value to achieve a low delay, M is a maximum value indicating buffer fullness that is set by default to R/F, Wprev is the previous number of bits in the encoder buffer, and Bprev is the actual number of bits used for encoding the previous frame.

The frame target bit rate B varies depending on the type of video frame, e.g. I, P or B, on the buffer fullness and on the channel throughput. If W is greater than 10% of M, B is slightly decreased. Otherwise, B is slightly increased.
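The frame-layer rule above can be sketched in a few lines of Python. This is only an illustration of the stated formulas; the function name `frame_target_bits` and its argument names are assumptions made for this sketch and do not appear in TMN8 or in this document.

```python
def frame_target_bits(R, F, W_prev, B_prev, Z=0.1, M=None):
    """Return (B, W): target bits for the current frame and buffer occupancy.

    R: channel rate in bit/s, F: frame rate in frames/s,
    W_prev: previous number of bits in the encoder buffer,
    B_prev: bits actually used for the previous frame.
    """
    if M is None:
        M = R / F                      # default maximum buffer fullness
    # Buffer occupancy after the previous frame was sent at rate R/F.
    W = max(W_prev + B_prev - R / F, 0.0)
    # Correction term: reduce the target when the buffer is above Z*M,
    # otherwise raise it by the distance to Z*M.
    delta = W / F if W > Z * M else W - Z * M
    return R / F - delta, W


if __name__ == "__main__":
    for w_prev, b_prev in ((0.0, 6400.0), (1000.0, 7400.0)):
        B, W = frame_target_bits(64000.0, 10.0, w_prev, b_prev)
        print(f"W={W:.0f} bits -> target B={B:.0f} bits")
```

With an empty buffer the target lies above R/F, and with a buffer filled beyond Z·M it lies below R/F, which matches the behaviour described in the text.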

The macroblock-layer rate control selects the values of the quantisation step sizes Qstep for all the macroblocks in a current frame so that the sum of the bits used for these macroblocks is close to the frame target bit rate B.

Invention

However, equation (1) is not a good model for estimating or calculating the residual coding error, because the distribution of the residual coding error (for P frames or for non-intra frames) after the transform does not conform to a uniform distribution but approximately to a Laplacian distribution, as shown in Fig. 1 as a function of the standard deviation σ of the prediction residue. This has been confirmed by experiments. Because using a Laplace distribution for controlling the bit rate is very complex in practice, most rate control methods such as TMN8 and RHO domain still use the uniform distribution, which however induces a bit rate control error and visual quality degradation.

A problem to be solved by the invention is to provide an optimised but still simple distortion model for video encoding bit rate control, which model approximates a Laplacian distribution and fits for any bit rate control that has a relation with the distortion. This problem is solved by the method disclosed in claim 1. A corresponding video signal and storage medium are disclosed in claims 8 and 9, respectively. An apparatus that utilises this method is disclosed in claim 2.

The invention uses a distortion model processing based on an approximated Laplacian distortion distribution, instead of a uniform distortion distribution, for the block or macroblock layer bit rate control by quantisation step size adaptation. This kind of processing can be used in any rate control which is based on a Rate-Distortion model, such as MPEG2 Video (ISO/IEC 13818-2) rate control, MPEG4 Video (ISO/IEC 14496-2) rate control, MPEG4 AVC (ISO/IEC 14496-10, Advanced Video Coding) rate control, and MPEG4-AVC SVC (Scalable Video Coding) rate control.

The inventive bit rate control is more accurate than other Rate-Distortion bit rate controls. It is very simple and can be realised easily in practical applications.

In principle, the inventive method is suited for bit rate control in video signal encoding using a Rate-Distortion model for said bit rate control, the pictures of which video signal are divided into a grid of pixel blocks or macroblocks, wherein the pixels in a block or macroblock are transformed, quantised, inverse quantised and inverse transformed, and wherein the picture sequence encoding types include intra coding and non-intra coding, said method including the steps:

  • initialising the coding of an intra picture for determining the quantiser step size to be used for each block or macroblock in said intra picture;
  • coding the blocks or macroblocks of said intra picture thereby using said pre-determined quantiser step size;
  • after said intra picture has been encoded, updating the parameters of a Laplace distortion model for the following non-intra picture;
  • calculating a target bit number for said following non-intra picture;
  • adjusting a corresponding encoder buffer fullness value;
  • coding the current non-intra picture;
  • updating the parameters of said Laplace distortion model for the following non-intra picture;
  • checking whether or not the encoder buffer has overflowed, and if true, skipping the current non-intra picture and re-updating correspondingly the target bit number for the next non-intra picture coding and re-adjusting correspondingly said buffer fullness value;
  • correspondingly continuing the coding of the remaining non-intra pictures.

In principle the inventive apparatus is suited for bit rate control in video signal encoding using a Rate-Distortion model for said bit rate control, the pictures of which video signal are divided into a grid of pixel blocks or macroblocks, wherein the pixels in a block or macroblock are transformed, quantised, inverse quantised and inverse transformed, and wherein the picture sequence encoding types include intra coding and non-intra coding, said apparatus including:

  • means being adapted for initialising the coding of an intra picture for determining the quantiser step size to be used for each block or macroblock in said intra picture;
  • means being adapted for coding the blocks or macroblocks of said intra picture thereby using said pre-determined quantiser step size;
  • means being adapted for updating, after said intra picture has been encoded, the parameters of a Laplace distortion model for the following non-intra picture;
  • means being adapted for calculating a target bit number for said following non-intra picture;
  • means being adapted for adjusting a corresponding encoder buffer fullness value;
  • means being adapted for coding the current non-intra picture;
  • means being adapted for updating the parameters of said Laplace distortion model for the following non-intra picture;
  • means being adapted for checking whether or not the encoder buffer has overflowed, and if true, for skipping the current non-intra picture and re-updating correspondingly the target bit number for the next non-intra picture coding and re-adjusting correspondingly said buffer fullness value, whereby the coding of the remaining non-intra pictures is correspondingly continued.

Advantageously the coding of the current non-intra picture includes:

  • initialising the Laplace distortion model parameters for said current non-intra picture;
  • computing for the first block or macroblock a quantiser step size according to a distortion model derived from said Laplace distortion model;
  • coding said first block or macroblock, thereby applying said computed quantiser step size;
  • updating the Laplace distortion model parameters for the following block or macroblock;
  • checking whether or not the end of the current non-intra picture has been reached, and if not true, continuing the processing for the next block or macroblock with the next quantiser step size computation for the next block or macroblock, and if true, continuing the processing with said updating of the parameters of said Laplace distortion model for the following non-intra picture.

Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.

Drawings

Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:

  • Fig. 1 Laplacian distortion distribution;
  • Fig. 2 uniform distortion distribution;
  • Fig. 3 distortion as a function of σ for uniform, real and proposed simulated distortion;
  • Fig. 4 initialisation and updating processing for frame layer bit rate control;
  • Fig. 5 initialisation and updating processing for macroblock layer bit rate control in P frames;
  • Fig. 6 encoder block diagram.

Exemplary embodiments

The Laplace distortion distribution bit rate control model processing is carried out as described below. The probability density function (PDF) of Laplace (cf. Fig. 1) can be described as

$$f_Y(y) = \frac{1}{\sqrt{2}\,\sigma}\, e^{-\frac{\sqrt{2}}{\sigma}|y|}, \tag{2}$$

where σ is the standard deviation of the prediction residue within a block of transform coefficients and y is the transform coefficient. Practical experiments with different source sequences show that the distribution of transform (e.g. DCT transform) coefficient magnitudes (including intra and inter transform coefficients) in any codec like MPEG2-2, MPEG4-2, and MPEG4 AVC approximates the Laplace distribution. Generally, most codecs which include MPEG2, MPEG4 or MPEG4 AVC use a scalar quantiser. The basic forward quantiser operation is

$$Z_{ij} = \mathrm{round}\left(\frac{Y_{ij}}{Q}\right), \tag{3}$$

where i and j are the horizontal and vertical positions in a transform block, Yij is an original transform coefficient, Q is the quantiser step size and Zij is the quantised version of that transform coefficient.
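As a small illustration of this scalar quantiser, the sketch below quantises a single coefficient and reconstructs it again. The function names `quantise` and `dequantise` are hypothetical, and rounding halves away from zero is an assumption of this sketch, since the text does not specify the rounding rule.

```python
def quantise(y, q):
    """Forward quantiser: Z = round(Y / Q), halves rounded away from zero."""
    level = int(abs(y) / q + 0.5)
    return level if y >= 0 else -level


def dequantise(z, q):
    """Inverse quantiser: the reconstructed coefficient is Z * Q."""
    return z * q


if __name__ == "__main__":
    q = 10.0
    for y in (-37.0, 4.9, 12.5, 26.0):
        z = quantise(y, q)
        rec = dequantise(z, q)
        # The reconstruction error never exceeds half a step size.
        print(f"Y={y:6.1f}  Z={z:3d}  reconstructed={rec:6.1f}  error={y - rec:5.1f}")
```

The reconstruction error per coefficient is bounded by Q/2, which is the quantity whose mean squared value the distortion model describes.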

MPEG2 and MPEG4 support 31 different values of Q. In MPEG4 AVC a total of 52 Q values are supported by the standard and these are indexed by a quantisation parameter value QP. Q doubles in size for every increment of 6 in QP, so that Q increases on average by about 12% (a factor of 2^(1/6) ≈ 1.12) for each increment of 1 in QP. In all cases, the different Q steps have the same properties for the quantisation.
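The relation between QP and step size in MPEG4 AVC can be sketched as follows. The six base step sizes for QP 0 to 5 are the values commonly tabulated for H.264 and are not given in this document, so they should be treated as an assumption of the sketch; the function name `qstep_from_qp` is likewise hypothetical.

```python
# Step sizes for QP = 0..5 as commonly tabulated for H.264/AVC
# (an assumption of this sketch, not taken from the text above).
_BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep_from_qp(qp):
    """Map an AVC quantisation parameter (0..51) to a quantiser step size."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    # The step size doubles for every increment of 6 in QP.
    return _BASE_QSTEP[qp % 6] * (1 << (qp // 6))


if __name__ == "__main__":
    for qp in (0, 6, 12, 26, 51):
        print(f"QP={qp:2d} -> Qstep={qstep_from_qp(qp)}")
```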

The mean squared error can be used as a measure for the theoretical distortion D(Q) which can be described as follows:

$$D(Q) = \sum_{i=-\infty}^{\infty} \int_{(i-0.5)Q}^{(i+0.5)Q} \bigl(y - Q(i)\bigr)^2 f_Y(y)\, dy, \tag{4}$$

wherein Q(i) is the reconstructed y value derived after quantisation and inverse quantisation.

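Following insertion of equation (2) into equation (4):

$$D(Q) = \sum_{i=-\infty}^{\infty} \int_{(i-0.5)Q}^{(i+0.5)Q} \bigl(y - Q(i)\bigr)^2 \frac{1}{\sqrt{2}\,\sigma}\, e^{-\frac{\sqrt{2}}{\sigma}|y|}\, dy \tag{5}$$

The integrand of equation (5) is an even function of y. Therefore equation (5) can be re-written as:

$$D(Q) = 2\sum_{i=1}^{\infty} \int_{(i-0.5)Q}^{(i+0.5)Q} \bigl(y - Q(i)\bigr)^2 \frac{1}{\sqrt{2}\,\sigma}\, e^{-\frac{\sqrt{2}}{\sigma}y}\, dy + 2\int_{0}^{0.5Q} \bigl(y - Q(0)\bigr)^2 \frac{1}{\sqrt{2}\,\sigma}\, e^{-\frac{\sqrt{2}}{\sigma}y}\, dy \tag{6}$$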

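Because (y − Q(i))² = y² − 2yQ(i) + Q(i)² and

$$\int y\, e^{-ay}\, dy = -\frac{e^{-ay}}{a^2}(ay + 1) + c, \qquad \int y^n e^{-ay}\, dy = -\frac{1}{a}\, e^{-ay} y^n + \frac{n}{a}\int y^{n-1} e^{-ay}\, dy, \tag{7}$$

D(Q) can be derived from equation (5) using equations (7) as:

$$D(Q) = \sigma^2 - \frac{\sqrt{2}\,Q\sigma}{e^{\frac{Q}{\sqrt{2}\sigma}} - e^{-\frac{Q}{\sqrt{2}\sigma}}} = \sigma^2 - \frac{Q\sigma}{\sqrt{2}\,\sinh\frac{Q}{\sqrt{2}\sigma}} = \sigma^2\left(1 - \frac{\frac{Q}{\sqrt{2}\sigma}}{\sinh\frac{Q}{\sqrt{2}\sigma}}\right). \tag{8}$$

Further, D can be described as

$$D = \sigma^2\left(1 - \frac{\beta}{\sinh\beta}\right), \qquad \text{wherein } \beta = \frac{Q}{\sqrt{2}\,\sigma}. \tag{9}$$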
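The closed form D = σ²(1 − β/sinh β) with β = Q/(√2·σ) can be checked numerically against the defining sum of integrals. The sketch below does this with a plain midpoint rule; it assumes that the reconstruction value of bin i is Q(i) = i·Q, and all function names and the integration settings are choices made for this illustration only.

```python
import math


def laplace_pdf(y, sigma):
    """Zero-mean Laplace density with standard deviation sigma."""
    return math.exp(-math.sqrt(2.0) * abs(y) / sigma) / (math.sqrt(2.0) * sigma)


def distortion_closed_form(Q, sigma):
    """D = sigma^2 * (1 - beta / sinh(beta)), beta = Q / (sqrt(2) * sigma)."""
    beta = Q / (math.sqrt(2.0) * sigma)
    return sigma ** 2 * (1.0 - beta / math.sinh(beta))


def distortion_numeric(Q, sigma, steps_per_bin=2000, tail_sigmas=40.0):
    """Midpoint-rule evaluation of sum_i int (y - i*Q)^2 f(y) dy over all bins."""
    max_i = int(math.ceil(tail_sigmas * sigma / Q)) + 1  # truncate far tails
    h = Q / steps_per_bin
    total = 0.0
    for i in range(-max_i, max_i + 1):
        lower = (i - 0.5) * Q
        for s in range(steps_per_bin):
            y = lower + (s + 0.5) * h
            total += (y - i * Q) ** 2 * laplace_pdf(y, sigma) * h
    return total


if __name__ == "__main__":
    Q, sigma = 26.0, 10.0
    print("closed form:", distortion_closed_form(Q, sigma))
    print("numeric    :", distortion_numeric(Q, sigma))
```

Both values should agree up to the discretisation error of the midpoint rule, which supports the closed-form result before the sinh approximation is applied.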

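The SINH function is defined as

$$\sinh x = \frac{x}{1!} + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots$$

When assuming that Q/(√2·σ) = β < 1, sinh(β) can be approximated by sinh(β) ≈ β + β³/6.

Then, equation (9) can be expressed as

$$D \approx \frac{6\sigma^2 + \sigma^2\beta^2 - 6\sigma^2}{6 + \beta^2}, \quad \text{i.e.} \quad D \approx \frac{\sigma^2\beta^2}{6 + \beta^2}. \tag{10}$$

After replacing β, the theoretical value of D is

$$D_{Theory} \approx \frac{\sigma^2 Q^2}{12\sigma^2 + Q^2} \tag{11}$$

for the approximated Laplacian distribution of the distortion of transform coefficients coding/decoding errors for a pixel block.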

A corresponding comparison between Laplacian distortion, Laplace simulation distortion and uniform distortion is depicted in Fig. 3 in which, for quantiser step size Q = 26, the distortion D is shown as a function of σ. The 'x' signs represent the uniform distortion, the 'o' signs represent the real distortion and the '+' signs represent the inventive distortion simulation. The D value for the uniform distortion results from Q = 26 as Q²/12 ≈ 56.3.
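The comparison can be reproduced numerically. The following sketch evaluates, for Q = 26 and a few values of σ, the uniform distortion Q²/12, the exact Laplacian distortion σ²(1 − β/sinh β) and the approximation σ²Q²/(12σ² + Q²). The function names and the chosen σ values are assumptions made for this illustration and are not the data behind Fig. 3.

```python
import math


def d_uniform(Q):
    """Uniform-distribution distortion, independent of sigma."""
    return Q * Q / 12.0


def d_laplace(Q, sigma):
    """Exact Laplacian distortion sigma^2 * (1 - beta / sinh(beta))."""
    beta = Q / (math.sqrt(2.0) * sigma)
    return sigma ** 2 * (1.0 - beta / math.sinh(beta))


def d_approx(Q, sigma):
    """Approximated Laplacian distortion sigma^2 Q^2 / (12 sigma^2 + Q^2)."""
    return sigma ** 2 * Q ** 2 / (12.0 * sigma ** 2 + Q ** 2)


if __name__ == "__main__":
    Q = 26.0
    print(" sigma   uniform   laplace    approx")
    for sigma in (2.0, 5.0, 10.0, 20.0, 50.0):
        print(f"{sigma:6.1f} {d_uniform(Q):9.2f} {d_laplace(Q, sigma):9.2f} {d_approx(Q, sigma):9.2f}")
```

For small σ the uniform model strongly overestimates the distortion, whereas the approximation stays close to the exact Laplacian value; for large σ all three converge towards Q²/12.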

In TMN8, the original macroblock-layer rate control model is described by

$$Q_i = \sqrt{\frac{AK}{B - ANC}\,\frac{\sigma_i}{\alpha_i}\sum_{k=1}^{N}\alpha_k\sigma_k}, \tag{12}$$

where A is the number of pixels in a macroblock, K is a distortion model parameter, B is the total number of target bits for encoding the current frame, N is the number of macroblocks in the frame, C is the overhead rate, σi is the standard deviation of the residuum of the transform coefficient coding/decoding errors in the i-th block or macroblock, and αi is the distortion weight of the i-th macroblock.

The inventive Laplace distortion model approximation can be applied in any (block or macroblock layer) bit rate control such as TMN8, RHO-domain, and so on.

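When using the inventive processing for a TMN8 rate control, the TMN8 macroblock-layer rate control can be modified according to:

$$Q_i = \sqrt{\frac{12\,\sigma_i^2\,AK\sum_{k=1}^{N}\alpha_k\sigma_k}{\alpha_i\sigma_i\bigl(AKN + 12(B - ANC)\bigr) - AK\sum_{k=1}^{N}\alpha_k\sigma_k}}, \tag{13}$$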

wherein the parameters have the same meaning as in formula (12). Formula (13) requires about the same computational complexity as formula (12). However, its bit rate control accuracy or quality is better than when applying formula (12).
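To make the two step-size formulas concrete, the sketch below evaluates formula (12) and formula (13) side by side for one macroblock. The function names, the guard against a non-positive denominator and the numeric example values are assumptions of this sketch; they are not prescribed by the text.

```python
import math


def q_tmn8(i, sigma, alpha, A, K, B, C):
    """Formula (12): TMN8 step size for macroblock i (uniform distortion model)."""
    N = len(sigma)
    S = sum(a * s for a, s in zip(alpha, sigma))
    return math.sqrt(A * K / (B - A * N * C) * sigma[i] / alpha[i] * S)


def q_laplace(i, sigma, alpha, A, K, B, C):
    """Formula (13): step size for macroblock i (approximated Laplace model)."""
    N = len(sigma)
    S = sum(a * s for a, s in zip(alpha, sigma))
    T = A * N * K + 12.0 * (B - A * N * C)
    denom = alpha[i] * sigma[i] * T - A * K * S
    if denom <= 0.0:
        # Assumption of this sketch: treat an infeasible budget as an error.
        raise ValueError("bit budget too small for formula (13)")
    return math.sqrt(12.0 * sigma[i] ** 2 * A * K * S / denom)


if __name__ == "__main__":
    sigma = [4.0, 8.0, 12.0]   # hypothetical per-macroblock standard deviations
    alpha = [1.0, 1.0, 1.0]    # hypothetical distortion weights
    A, K, C, B = 256, 0.5, 0.02, 3000.0
    for i in range(len(sigma)):
        print(i, q_tmn8(i, sigma, alpha, A, K, B, C), q_laplace(i, sigma, alpha, A, K, B, C))
```

Both functions need one sum over the macroblocks plus a handful of multiplications, a division and a square root, which illustrates why the computational complexity is about the same.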

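Formula (13) is derived using formula (11) as follows. According to Rate-Distortion theory:

$$(Q_1^*, Q_2^*, \ldots, Q_N^*) = \arg\min \frac{1}{N}\sum_{k=1}^{N} D_k = \arg\min \frac{1}{N}\sum_{k=1}^{N} \alpha_k^2\, \frac{\sigma_k^2 Q_k^2}{12\sigma_k^2 + Q_k^2}$$

at the bit constraint $\sum_{k=1}^{N} B_k = B$. According to the Lagrange algorithm:

$$f(Q) = (Q_1^*, Q_2^*, \ldots, Q_N^*, \lambda^*) = \arg\min \left[\frac{1}{N}\sum_{k=1}^{N} \alpha_k^2\, \frac{\sigma_k^2 Q_k^2}{12\sigma_k^2 + Q_k^2} + \lambda\left(\sum_{k=1}^{N} B_k - B\right)\right]$$

According to publication [1]:

$$B_k = A\left(K\frac{\sigma_k^2}{Q_k^2} + C\right)$$

$$f(Q) = (Q_1^*, Q_2^*, \ldots, Q_N^*, \lambda^*) = \arg\min \left[\frac{1}{N}\sum_{k=1}^{N} \alpha_k^2\, \frac{\sigma_k^2 Q_k^2}{12\sigma_k^2 + Q_k^2} + \lambda\left(\sum_{k=1}^{N} A\left(K\frac{\sigma_k^2}{Q_k^2} + C\right) - B\right)\right]$$

Forming the derivative of f(Q):

$$\begin{aligned}
\frac{\partial f(Q)}{\partial Q_i} = \frac{1}{N}\,\frac{2 Q_i \alpha_i^2 \sigma_i^2 \cdot 12\sigma_i^2}{(12\sigma_i^2 + Q_i^2)^2} - \frac{2\lambda AK\sigma_i^2}{Q_i^3} &= 0 \\
\frac{1}{N}\,\frac{12 Q_i \alpha_i^2 \sigma_i^2}{(12\sigma_i^2 + Q_i^2)^2} - \frac{\lambda AK}{Q_i^3} &= 0 \\
\frac{12 Q_i \alpha_i^2 \sigma_i^2}{(12\sigma_i^2 + Q_i^2)^2} &= \frac{\lambda AKN}{Q_i^3} \\
\frac{12 Q_i^4}{(12\sigma_i^2 + Q_i^2)^2} &= \frac{\lambda AKN}{\alpha_i^2 \sigma_i^2}
\end{aligned}$$

$$Q_i^2 = \frac{12\sigma_i^2\sqrt{\lambda AKN}}{\sqrt{12}\,\alpha_i\sigma_i - \sqrt{\lambda AKN}} \tag{14.5}$$

From the bit constraint:

$$\begin{aligned}
\sum_{k=1}^{N} A\left(K\frac{\sigma_k^2}{Q_k^2} + C\right) &= B \\
\sum_{k=1}^{N} \frac{AK\sigma_k^2}{Q_k^2} &= B - ANC \\
\sum_{k=1}^{N} AK\sigma_k^2\, \frac{\sqrt{12}\,\alpha_k\sigma_k - \sqrt{\lambda AKN}}{12\sigma_k^2\sqrt{\lambda AKN}} &= B - ANC \quad \text{(by using (14.5))} \\
\sum_{k=1}^{N} \frac{AK\left(\sqrt{12}\,\alpha_k\sigma_k - \sqrt{\lambda AKN}\right)}{12\sqrt{\lambda AKN}} &= B - ANC \\
\sum_{k=1}^{N} AK\left(\sqrt{12}\,\alpha_k\sigma_k - \sqrt{\lambda AKN}\right) &= 12\sqrt{\lambda AKN}\,(B - ANC) \\
\sqrt{12}\,AK\sum_{k=1}^{N}\alpha_k\sigma_k - AKN\sqrt{\lambda AKN} &= 12\sqrt{\lambda AKN}\,(B - ANC) \\
\sqrt{\lambda AKN}\,\bigl(12(B - ANC) + AKN\bigr) &= \sqrt{12}\,AK\sum_{k=1}^{N}\alpha_k\sigma_k
\end{aligned}$$

$$\sqrt{\lambda AKN} = \frac{\sqrt{12}\,AK\sum_{k=1}^{N}\alpha_k\sigma_k}{AKN + 12(B - ANC)} \tag{14.6}$$

Formula (14.5) by application of formula (14.6):

$$\begin{aligned}
Q_i^2 &= \frac{12\sigma_i^2\, \dfrac{\sqrt{12}\,AK\sum_{k=1}^{N}\alpha_k\sigma_k}{ANK + 12(B - ANC)}}{\sqrt{12}\,\alpha_i\sigma_i - \dfrac{\sqrt{12}\,AK\sum_{k=1}^{N}\alpha_k\sigma_k}{ANK + 12(B - ANC)}} \\
Q_i^2 &= \frac{12\,\sigma_i^2\,AK\sum_{k=1}^{N}\alpha_k\sigma_k}{\alpha_i\sigma_i\bigl(ANK + 12(B - ANC)\bigr) - AK\sum_{k=1}^{N}\alpha_k\sigma_k} \\
Q_i &= \sqrt{\frac{12\,\sigma_i^2\,AK\sum_{k=1}^{N}\alpha_k\sigma_k}{\alpha_i\sigma_i\bigl(ANK + 12(B - ANC)\bigr) - AK\sum_{k=1}^{N}\alpha_k\sigma_k}}
\end{aligned}$$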
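One way to sanity-check this derivation is to compute all step sizes with formula (13) and confirm that the modelled bit counts B_k = A(K·σ_k²/Q_k² + C) add up to the frame target B. The sketch below does so; the helper names and the example values are assumptions made for this illustration.

```python
import math


def allocate_steps(sigma, alpha, A, K, B, C):
    """Step sizes Q_1..Q_N according to formula (13)."""
    N = len(sigma)
    S = sum(a * s for a, s in zip(alpha, sigma))
    T = A * N * K + 12.0 * (B - A * N * C)
    steps = []
    for a, s in zip(alpha, sigma):
        denom = a * s * T - A * K * S
        if denom <= 0.0:
            # Assumption of this sketch: no valid step size for this budget.
            raise ValueError("bit budget too small for formula (13)")
        steps.append(math.sqrt(12.0 * s * s * A * K * S / denom))
    return steps


def modelled_bits(sigma, steps, A, K, C):
    """Sum of the rate model B_k = A * (K * sigma_k^2 / Q_k^2 + C)."""
    return sum(A * (K * s * s / (q * q) + C) for s, q in zip(sigma, steps))


if __name__ == "__main__":
    sigma = [4.0, 8.0, 12.0, 6.0]   # hypothetical values
    alpha = [1.0, 0.8, 1.2, 1.0]    # hypothetical weights
    A, K, C, B = 256, 0.5, 0.02, 4000.0
    steps = allocate_steps(sigma, alpha, A, K, B, C)
    print("step sizes   :", steps)
    print("modelled bits:", modelled_bits(sigma, steps, A, K, C), "target:", B)
```

The modelled total matches B up to floating-point rounding, which is what the Lagrange constraint requires.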

In Fig. 6 the video data input signal IE includes block or macroblock data to be encoded. In case intraframe (I) data are to be encoded, subtractor S passes the video input data via a transformer TR (computing e.g. a DCT transform) and a quantiser Q to an entropy encoder ENTCOD that delivers the encoder video data output signal OE.

In case interframe (P, B), i.e. non-intra frame, data are to be encoded, subtractor S subtracts predicted block or macroblock data PRD from the input signal IE and passes the difference data via transformer TR and quantiser Q to entropy encoder ENTCOD. The switching between the intraframe and interframe modes is carried out by a corresponding switch SW that may be implemented by an XOR function.

The output signal of Q is also fed to an inverse quantiser IQU, the output signal of which passes through an inverse transformer ITR to an adder A and represents reconstructed block or macroblock difference data. The output signal of adder A is intermediately stored in frame store and motion compensation means FSMC which also perform motion compensation on reconstructed block or macroblock data and which deliver such predicted block or macroblock data PRD to subtractor S and to the other input of adder A.

Quantiser Q and inverse quantiser IQU are controlled by the current quantiser step size Qstep (corresponding to Qi in formula (13) in P frames, as described above). It is also possible that FSMC does not perform motion compensation. Inverse quantiser IQU, inverse transformer ITR and frame store and motion compensation means FSMC represent a decoder DEC.

In the I frame and macroblock layer bitrate control processing and P frame layer bitrate control processing depicted in Fig. 4, for an I frame an initialisation step INI is carried out which can use a TMN8 model (if e.g. a bit control error under 5% is required) to compute the quantisation parameter QP and thereby the quantisation step size for each macroblock in that I frame. However, the QP for this I frame can also be derived from the adjacent prior P frame, which means that for all macroblocks in the first I frame a fixed QP is used. Because the Laplace distortion model fits only the residuum in P frames, and because experiments have shown that in I frames the prediction residue follows a generalised Gaussian distribution, the bitrate control for I frames can select either a fixed QP or an adaptive QP derived from the TMN8 model.

Advantageously, by using a fixed QP for each I frame macroblock one gets a smaller 'intra refresh' effect as compared to using an adaptive QP according to the TMN8 model. However, using a fixed QP for an I frame may introduce additional bits for that frame. Therefore an initialisation is carried out only for I frames, but not for P frames because the inventive rate control is accurate enough to adjust the QP for P frames as necessary.

Correspondingly encoding the I frame in step ICOD provides the information on the remaining bits that is required for applying the parameters of the Laplace distortion model for the following P frame. The depicted skipping control step SKCTRL can be omitted following this I frame encoding. As an alternative, if there was a buffer overflow, steps INI, ICOD, UPDRD and SKCTRL are carried out again whereby the quantiser step size or sizes are adapted accordingly.

The target bit number for the following P frame is calculated in an estimate target bitrate step ETB and the corresponding encoder buffer fullness value is adjusted in a buffer control step BCTRL. Step BCTRL determines the buffer filling level, which is used for the macroblock layer bit rate control processing according to Fig. 5 to prevent buffer underflow and buffer overflow. For example, if the buffer fullness level is greater than 80% of the buffer size, a delta value is added to the QP value for preventing overflow. If the buffer fullness level is smaller than e.g. 20% of the buffer size, a delta value is subtracted from the QP value for preventing underflow.
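The threshold rule in the example above can be sketched as follows. The thresholds of 80% and 20% come from the text; the size of the delta value, the clamping range 0..51 and the function name `adjust_qp` are assumptions of this sketch.

```python
def adjust_qp(qp, fullness, buffer_size, delta=2, high=0.8, low=0.2,
              qp_min=0, qp_max=51):
    """Nudge QP depending on the encoder buffer filling level.

    delta, qp_min and qp_max are hypothetical values chosen for illustration.
    """
    level = fullness / buffer_size
    if level > high:
        qp += delta      # coarser quantisation -> fewer bits -> avoids overflow
    elif level < low:
        qp -= delta      # finer quantisation -> more bits -> avoids underflow
    return max(qp_min, min(qp_max, qp))


if __name__ == "__main__":
    for fullness in (100, 500, 900):
        print(f"fullness={fullness}/1000 -> QP={adjust_qp(30, fullness, 1000)}")
```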

Thereafter the current P frame is encoded in step PCOD, followed by step UPDRD.

In skipping control step SKCTRL it is checked whether or not the encoder buffer (total buffer) has overflowed. If true, the current frame is skipped, and the target bit number for the next P frame coding is re-updated in the estimate target bitrate step ETB and the corresponding encoder buffer fullness value is re-adjusted in the buffer control step BCTRL.

The P frame encoding in step PCOD is explained in connection with Fig. 5. The processing for a current P frame starts with a Laplace distortion model parameters initialising step INILD. Thereafter, for the first macroblock in the current P frame the QP, i.e. the quantiser step size Q1 according to formula (13), is calculated in step QPCLC and applied in encoding the macroblock in step MBCOD. The macroblock encoding is followed by a Laplace distortion updating step UPLD. Thereafter it is checked in step EOF whether or not the end of the current P frame has been reached. If not true, the processing continues for the second or next, respectively, macroblock with step QPCLC. If true, the processing continues with step UPDRD in Fig. 4.
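The loop of Fig. 5 can be sketched as below. This is a minimal sketch under several assumptions: the update step UPLD is modelled by reducing the remaining bit budget, the remaining macroblock count and the remaining weighted sum of standard deviations after each macroblock (K and C are kept fixed); the callback `encode_mb`, the clamping limits `q_min` and `q_max`, and all names are hypothetical and not taken from the text.

```python
import math


def encode_p_frame(sigma, alpha, A, K, B, C, encode_mb, q_min=1.0, q_max=62.0):
    """Macroblock-layer loop: QPCLC -> MBCOD -> UPLD until end of frame (EOF).

    encode_mb(i, Q) is a hypothetical callback that encodes macroblock i with
    step size Q and returns the number of bits actually produced.
    """
    remaining_bits = float(B)                         # INILD: bits left
    S = sum(a * s for a, s in zip(alpha, sigma))      # remaining weighted sum
    total = len(sigma)
    steps = []
    for i in range(total):
        n_left = total - i                            # macroblocks still to code
        T = A * n_left * K + 12.0 * (remaining_bits - A * n_left * C)
        denom = alpha[i] * sigma[i] * T - A * K * S
        if denom > 0.0:
            Q = math.sqrt(12.0 * sigma[i] ** 2 * A * K * S / denom)  # QPCLC
        else:
            Q = q_max                                 # assumed fallback
        Q = min(max(Q, q_min), q_max)
        bits = encode_mb(i, Q)                        # MBCOD
        steps.append(Q)
        remaining_bits -= bits                        # UPLD (assumed update)
        S -= alpha[i] * sigma[i]
    return steps


if __name__ == "__main__":
    sigma, alpha = [4.0, 8.0, 12.0], [1.0, 1.0, 1.0]
    A, K, C, B = 256, 0.5, 0.02, 3000.0
    spent = []

    def model_encoder(i, Q):
        # Stand-in encoder that produces exactly the modelled number of bits.
        bits = A * (K * sigma[i] ** 2 / Q ** 2 + C)
        spent.append(bits)
        return bits

    print(encode_p_frame(sigma, alpha, A, K, B, C, model_encoder), sum(spent))
```

With an encoder that behaves exactly like the rate model, the bits spent add up to the frame target; a real encoder deviates from the model, and the per-macroblock update is what compensates for that.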

"P frame" may include any form of predicted frame, e.g. B frame.

The invention can also be applied to field-based encoding.
