CS6114 Assignment
Assignment 03 Video Coding

Video coding is the process of preparing digital video content for storage or transmission, such as in a data file or bitstream. The H.265 standard (High Efficiency Video Coding, HEVC; implemented in ffmpeg by the libx265 encoder) was designed to achieve 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate, compared with the previous H.264 standard (MPEG-4 Advanced Video Coding, AVC; implemented by the libx264 encoder), across a variety of applications. In turn, H.264 was designed to achieve good video quality at substantially lower bit rates than previous MPEG standards.

1 https://www.vcodex.com/hevc-an-introduction-to-high-efficiency-coding/

One application area where H.264 and H.265 are currently in use is the delivery of streamed content over data networks using HTTP. In this application, the MPEG-DASH standard describes the media content and H.264/H.265 encodes it, so that it can be delivered using standard web infrastructure (web server/CDN/HTTPS/TCP/IP). MPEG-DASH partitions content into a sequence of short, usually fixed-duration segments. These segments are encoded at a variety of different bitrates, giving alternative versions of the content. The alternative segments are time-aligned, so that while the content is being played back by an MPEG-DASH client, the client can use an adaptive bitrate (ABR) algorithm to select, as the next segment to present, the one with the highest bitrate (quality) that can be downloaded in time for playback without causing stalls or rebuffering.

Adaptive streaming requires that each segment of these alternative versions does not exceed (or fall significantly short of) the nominal bitrate for that version of the content. To achieve this, the encoding must employ constrained bitrate encoding techniques.
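The segment selection step described above can be illustrated with a minimal Python sketch. MPEG-DASH does not mandate a particular ABR algorithm, so the throughput-based rule, the function name `select_bitrate` and the `safety` margin below are assumptions chosen purely for illustration; real players typically also take buffer occupancy into account.

```python
def select_bitrate(available_bps, throughput_bps, safety=0.8):
    """Pick the highest representation bitrate that fits the throughput budget.

    available_bps  : bitrates (bits/s) of the alternative representations
    throughput_bps : the client's current throughput estimate (bits/s)
    safety         : hypothetical margin so the download finishes before playback
    """
    budget = throughput_bps * safety
    candidates = [b for b in sorted(available_bps) if b <= budget]
    # If nothing fits, fall back to the lowest bitrate rather than stalling.
    return candidates[-1] if candidates else min(available_bps)


# The two nominal bitrates used in this assignment.
ladder = [500_000, 2_000_000]
choice_fast = select_bitrate(ladder, 3_000_000)   # budget 2.4 Mbps
choice_slow = select_bitrate(ladder, 1_000_000)   # budget 0.8 Mbps
print(choice_fast, choice_slow)
```

A rule like this only works if each segment really stays close to its nominal bitrate, which is why the encodings must be constrained.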

Rate Control

Rate control is the process the encoder uses to decide how many bits to allocate to each picture. The goal of (lossy) video coding is to reduce the bitrate while retaining as much quality as possible, and rate control is a crucial step in determining that tradeoff between size and quality.

Constant bitrate (CBR) and variable bitrate (VBR) encoding set a target data rate, and the encoding application applies a bitrate control technique to achieve it. It can be difficult to choose an appropriate data rate for constrained connections, and the quality of experience (QoE) for viewers can suffer if the range of VBR is too wide or, in the case of CBR, if the nature of the content varies greatly. Constrained VBR, with the maximum bitrate capped at 110% to 150% of the target, is often used. However, this assumes that the target bitrate needed for an acceptable level of quality is known before the content is encoded.

Not all video content is equally compressible. Low motion and smooth gradients compress well (few bits for high perceived quality), whereas high motion and fine spatial detail are less compressible (more bits are needed to preserve quality). It is often easier to specify a target quality and let the encoding application vary the data rate to achieve it.

Constant Rate Factor (CRF) encoding specifies a quality level, and the encoding application adjusts the data rate to achieve that quality. The result is content with a fixed quality level, but the data rate is unknown in advance. If quality is the only objective this is not a concern, but if the data rate varies significantly over the duration of the content, it may affect deliverability. Capped CRF applies the data rate necessary to achieve the target quality, together with a maximum data rate to ensure deliverability.
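As a rough illustration, the rate control modes above can be expressed as ffmpeg argument lists built in Python. The options `-b:v`, `-crf`, `-maxrate` and `-bufsize` are standard ffmpeg options; the helper names, the default cap ratio and the rule of sizing the buffer as `buffer_seconds` worth of data at the maximum rate are assumptions made for this sketch, not recommended settings.

```python
def constrained_vbr_args(target_kbps, cap_ratio=1.5, buffer_seconds=1.0):
    """Constrained VBR: a target average rate plus a cap of 110%-150% of it."""
    maxrate = round(target_kbps * cap_ratio)
    # Assumption: buffer sized as a number of seconds at the maximum rate.
    bufsize = round(maxrate * buffer_seconds)
    return ["-b:v", f"{target_kbps}k",
            "-maxrate", f"{maxrate}k",
            "-bufsize", f"{bufsize}k"]


def crf_args(crf):
    """Plain CRF: fixed quality target, data rate unknown in advance."""
    return ["-crf", str(crf)]


def capped_crf_args(crf, maxrate_kbps, bufsize_kbps):
    """Capped CRF: quality target plus a maximum data rate for deliverability."""
    return crf_args(crf) + ["-maxrate", f"{maxrate_kbps}k",
                            "-bufsize", f"{bufsize_kbps}k"]


# A 2 Mbps target capped at 110% gives a 2200 kbps maximum rate.
vbr_example = constrained_vbr_args(2000, cap_ratio=1.1)
capped_example = capped_crf_args(23, 500, 256)
print(" ".join(vbr_example))
print(" ".join(capped_example))
```

The lists can be spliced into a full command, for example between `-c:v libx264` and the output file name.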

In this assignment, you will encode source material using both the libx264 (H.264) and libx265 (H.265) codecs and compare the resulting bitstreams in terms of coding efficiency and quality. The expected use of the encoded material is delivery via MPEG-DASH. However, you do not have to create an MPD (Media Presentation Description) file or fragment the representations.

Encoding

In ffmpeg, the Video Buffering Verifier (VBV) constrains the bitrate to a specified maximum. This is essential for content that will be streamed, as it ensures that each segment will not exceed (or fall substantially short of) the nominal bitrate for that version of the content. VBV can be used with 2-pass VBR (apply it in both passes) or with CRF encoding (capped CRF). For example, using libx264 (H.264):

    ffmpeg -i source.mov -c:v libx264 -crf 23 -maxrate 500K -bufsize 256K source-at-0.5M.mp4

These parameters will probably not make full use of the available bitrate (i.e. the resulting bitrate will not be 500 kbps). It can be advantageous to specify a target average bitrate and allow the maximum rate to exceed it by a small amount.

Applying VBV to CRF encoding requires finding the CRF value that, on average, results in the maximum bitrate without exceeding it. If the encode always hits the maximum bitrate, the CRF is too low. Conversely, if the bitrate does not always reach the maximum, extra quality may be gained by lowering the CRF value. As a rule of thumb, increasing the CRF by 6 halves the bitrate, although this is content dependent.

The buffer size (bufsize) parameter determines how strictly ffmpeg limits the variability of the bitrate. For streaming, you will want to ensure that the bitrate of each segment conforms closely to the nominal bitrate.

Steps

The source content is a QuickTime movie containing a video stream. The video stream is encoded with libx264 (H.264) using intra-frame compression only (i.e. all I-pictures).
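When searching for a suitable CRF, the rule of thumb from the Encoding section (a CRF increase of 6 roughly halves the bitrate) can be turned into a first guess. The following Python sketch assumes the bitrate of a trial encode at a known CRF has already been measured; the function name `estimate_crf` is hypothetical, the clamp to 0..51 assumes 8-bit libx264/libx265, and because the rule is content dependent the result is only a starting point for further trial encodes.

```python
import math


def estimate_crf(trial_crf, measured_kbps, target_kbps, lo=0.0, hi=51.0):
    """Estimate the CRF needed to move from measured_kbps to target_kbps.

    Uses the approximation bitrate ~ 2 ** (-crf / 6), i.e. +6 CRF halves
    the bitrate. The estimate is clamped to the assumed valid CRF range.
    """
    delta = 6.0 * math.log2(measured_kbps / target_kbps)
    return round(min(hi, max(lo, trial_crf + delta)), 1)


# A trial at CRF 23 produced 1000 kbps; halving to 500 kbps suggests CRF 29.
first_guess = estimate_crf(23, 1000, 500)
print(first_guess)
```

Each codec and GOP length should be treated separately, since the same CRF value is not guaranteed to give the same bitrate in libx264 and libx265.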
The source is a mixture of different styles of content.
• Use ffmpeg and the libx264 codec to encode H.264 video bitstream representations at two nominal bitrates, 0.5 Mbps and 2 Mbps, each using GOP lengths of 50 and 100 (four encodings).
• Use ffmpeg and the libx265 codec to encode H.265 video bitstream representations at two nominal bitrates, 0.5 Mbps and 2 Mbps, each using GOP lengths of 50 and 100 (four encodings).

You will generate eight encodings:
• H.264 0.5 Mbps using a GOP length of 50
• H.264 0.5 Mbps using a GOP length of 100
• H.264 2 Mbps using a GOP length of 50
• H.264 2 Mbps using a GOP length of 100
• H.265 0.5 Mbps using a GOP length of 50
• H.265 0.5 Mbps using a GOP length of 100
• H.265 2 Mbps using a GOP length of 50
• H.265 2 Mbps using a GOP length of 100

You must investigate:
• What CRF value and constrained bitrate parameters are needed to achieve the target nominal bitrate for each combination of codec and GOP length?
• Are these values the same for H.264 and H.265?
• Which codec is better able to achieve the nominal bitrate?
• What is the difference in encoding times?

Use the supplied Jupyter notebook to investigate:
• The bitrate of each GOP in each encoding. Does each conform to the target nominal bitrate? Which encoding is the most consistent? Explain your findings.
• The quality of the encodings. Are there differences in quality between the encodings and within the encodings? Are these differences of practical significance (i.e. noticeable)?

You may choose to augment the Jupyter notebook to generate additional visualisations or analyses of the results.

Questions

Using these results, write a report (maximum 2000 words) addressing the following questions:
• What differences are there between H.264 and H.265 encoding?
• Is the quality of the content consistent for the different codecs and encodings?
• Does changing the GOP length affect the quality of the coded content?
• What GOP length would you recommend to give the best balance between quality (coding efficiency) and adaptability (shorter GOP lengths) for each of these codecs?
• Which metric (PSNR or SSIM) is a better measure of quality? Do you notice any visual difference?
• Is there a relationship between picture type, size and quality? Does this relationship hold across different bitrates and codecs?

Be sure to include any other observations that you noted, and the results and rationale for any additional analysis you performed.

Deliverables

An archive (zip file) containing the encoded bitstreams, any data files (e.g. CSV files), Jupyter notebooks and your report. Do not include the source material provided to you.

Resources

The following resources are available on Canvas or externally:
• A Jupyter notebook that compares videos and produces a CSV file containing a measure (the PSNR and SSIM) of the frame-by-frame difference between the reference movie (the original) and an encoded video
• This Jupyter notebook also demonstrates how frame-level information from ffprobe can be combined with the quality measures
• Encoding for MPEG-DASH tutorials https://blog.streamroot.io/encode-multi-bitrate-videos-mpeg-dash-mse-based-media-players/
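The per-GOP bitrate check asked for in the investigation could be sketched in Python as follows. This is not the supplied notebook's code: the input layout (a list of `(is_keyframe, size_bytes)` pairs in stream order), the function names and the 10% tolerance are assumptions for illustration, and the frame sizes would in practice come from frame-level ffprobe output, whose field names vary between versions.

```python
def gop_bitrates(frames, fps):
    """Return the bitrate (bits/s) of each GOP.

    frames : iterable of (is_keyframe, size_bytes) pairs in stream order
    fps    : frame rate, assumed constant
    A new GOP is assumed to start at every keyframe.
    """
    gops, current = [], []
    for is_key, size in frames:
        if is_key and current:
            gops.append(current)
            current = []
        current.append(size)
    if current:
        gops.append(current)
    # bits in the GOP divided by its duration (frame count / fps)
    return [sum(g) * 8 * fps / len(g) for g in gops]


def conforms(rate_bps, nominal_bps, tolerance=0.10):
    """True if a GOP's bitrate is within a (hypothetical) tolerance of nominal."""
    return abs(rate_bps - nominal_bps) <= tolerance * nominal_bps


# Tiny made-up example: two GOPs of 4 and 2 frames at 25 frames per second.
frames = [(True, 1000), (False, 250), (False, 250), (False, 250),
          (True, 500), (False, 500)]
rates = gop_bitrates(frames, fps=25)
print(rates)
print([conforms(r, 100_000) for r in rates])
```

Plotting these per-GOP rates against the nominal bitrate for each of the eight encodings is one way to compare how consistently each codec and GOP length meets its target.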