C++, FFmpeg transcoding: time_base differs depending on the container
I transcode video (mkv and mp4). When mkv is transcoded to mkv, the output is fine (the output video's fps and duration match the input), but when mkv is transcoded to mp4, the output fps is half the input fps and the output duration is twice the input duration.
I transcode only the video; audio packets are written to the output as read from the input file, without re-encoding.
The video stream and encoder context are created like this:
out_stream = avformat_new_stream(ofmt_ctx, NULL);
avcodec_parameters_copy(out_stream->codecpar, in_codecpar);
out_stream->codecpar->codec_tag = 0;
codec_encode = avcodec_find_encoder(out_stream->codecpar->codec_id);
context_encode = avcodec_alloc_context3(codec_encode);
context_encode->width = width;
context_encode->height = height;
context_encode->pix_fmt = codec_encode->pix_fmts[0];
context_encode->time_base = av_inv_q(in_stream->r_frame_rate);
out_stream->time_base = context_encode->time_base;
out_stream->r_frame_rate = in_stream->r_frame_rate;
Transcoding (simplified):
int64_t i = 0;
while (true) {
    av_read_frame(ifmt_ctx, pkt);
    in_stream = ifmt_ctx->streams[pkt->stream_index];
    pkt->stream_index = stream_mapping[pkt->stream_index];
    pCodecCtx = ifmt_ctx->streams[pkt->stream_index]->codec;
    pCodec = avcodec_find_decoder(pCodecCtx->codec_id);
    error = avcodec_open2(pCodecCtx, pCodec, nullptr);
    if (pkt->stream_index == AVMEDIA_TYPE_VIDEO) {
        ....
        avcodec_decode_video2(pCodecCtx, frame, &frameFinished, pkt);
        ....
        // manipulate with frame
        ....
        frame->pts = i;
        avcodec_send_frame(context_encode, frame);
        while ((ret = avcodec_receive_packet(context_encode, pkt_encode)) >= 0) {
            if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
                break;
            av_packet_rescale_ts(pkt_encode, context_encode->time_base, out_stream->time_base);
            av_interleaved_write_frame(ofmt_ctx, pkt_encode);
            av_packet_unref(pkt_encode);
        }
        i++;
    }
    else {
        av_packet_rescale_ts(pkt, in_stream->time_base, out_stream->time_base);
        av_interleaved_write_frame(ofmt_ctx, pkt);
    }
    av_packet_unref(pkt);
}
Mediainfo of output mkv transcoded video (mkv -> mkv):
- Frame rate : 23.976 (24000/1001) FPS
Mediainfo of output mp4 transcoded video (mkv -> mp4):
- Frame rate : 11.988 (12000/1001) FPS
- Original frame rate : 23.976 (24000/1001) FPS
When the video context is created, the time_base values are (both mkv -> mp4 and mkv -> mkv):
FPS input: (24000/1001)
FPS output: (24000/1001)
context_decode->time_base (1001 / 48000)
context_encode->time_base (1001 / 24000)
in_stream->time_base (1 / 1000)
in_stream->codec->time_base (1001 / 48000)
out_stream->time_base (1001 / 24000)
out_stream->codec->time_base (0 / 1)
When a video frame is being written, the time_base values are (mkv -> mp4):
context_encode->time_base (1001 / 24000)
out_stream->time_base (1 / 48000)
But for mkv -> mkv:
context_encode->time_base (1001 / 24000)
out_stream->time_base (1 / 1000)
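To make these numbers concrete, here is a minimal sketch of what rescaling frame->pts = i from the encoder time base into each of the two reported output time bases yields. It does not use FFmpeg: the Rational struct, the rescale helper (modelled on how av_rescale_q rounds non-negative values to the nearest tick) and frame_pts are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for FFmpeg's AVRational; not the real type.
struct Rational {
    int64_t num;
    int64_t den;
};

// Illustrative helper modelled on av_rescale_q for non-negative
// timestamps: converts `ts` from time base `from` to time base `to`,
// rounding to the nearest tick.
int64_t rescale(int64_t ts, Rational from, Rational to) {
    assert(ts >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    const int64_t numerator = ts * from.num * to.den;
    const int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}

// Time bases reported in the question.
const Rational kEncoderTb{1001, 24000};  // context_encode->time_base
const Rational kMp4Tb{1, 48000};         // out_stream->time_base, mkv -> mp4
const Rational kMkvTb{1, 1000};          // out_stream->time_base, mkv -> mkv

// Timestamp of frame `i` (frame->pts = i) in the given output time base.
int64_t frame_pts(int64_t i, Rational out_tb) {
    return rescale(i, kEncoderTb, out_tb);
}
```

For example, frame_pts(24, kMp4Tb) is 48048 ticks of 1/48000 s and frame_pts(24, kMkvTb) is 1001 ticks of 1/1000 s. Both describe the same instant, 1.001 s, so the rescaling itself is consistent as long as the target time base really belongs to the stream the packet is written to.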
ffmpeg av_dump_format output:
Input #0, matroska,webm, from '24fps2.mkv':
- Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
- Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp (default)
Output #0, mp4, to 'temp_read.mp4':
- Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 23.98 tbr, 23.98 tbn
- Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp
But if I manually set time_base to about half the input frame duration (roughly 1 / (2 x FPS) of the input video):
AVRational temp;
temp.num = 500;
temp.den = 24001;
context_encode->time_base = temp;
out_stream->time_base = context_encode->time_base;
out_stream->r_frame_rate = in_stream->r_frame_rate;
When the video stream and context are created, the time_base values are (mkv -> mp4):
context_encode->time_base (500 / 24001)
out_stream->time_base (500 / 24001)
When a video frame is being written, the time_base values are (mkv -> mp4):
context_encode->time_base (500 / 24001)
out_stream->time_base (1 / 48000)
And the video FPS and duration are correct:
- Frame rate : 23.976 (24000/1001) FPS
What is wrong with time_base and av_packet_rescale_ts in this case, and how can it be fixed?
Solution 1:[1]
The problem was that the wrong time base was used as the rescale target. Audio packets were rescaled with:
av_packet_rescale_ts(pkt, in_stream->time_base, out_stream->time_base);
and encoded video packets with:
av_packet_rescale_ts(pkt_encode, context_encode->time_base, out_stream->time_base);
But the same out_stream variable was used in both calls, so packets of one stream were rescaled to the time base of another stream.
I replaced out_stream->time_base
with ofmt_ctx->streams[pkt->stream_index]->time_base
and now it works fine.
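The effect of rescaling to another stream's time base can be sketched with plain integer arithmetic. This is only an illustration under stated assumptions, not FFmpeg code: Rational, rescale and muxed_millis are hypothetical names, and the video stream time base of 1/24000 is an assumed value chosen because it is consistent with the observed halving of the frame rate. The real value is whatever the muxer assigned to that stream in avformat_write_header.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for FFmpeg's AVRational; not the real type.
struct Rational {
    int64_t num;
    int64_t den;
};

// Illustrative model of av_rescale_q for non-negative timestamps
// (rounds to the nearest tick).
int64_t rescale(int64_t ts, Rational from, Rational to) {
    assert(ts >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    const int64_t numerator = ts * from.num * to.den;
    const int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}

const Rational kEncoderTb{1001, 24000};   // context_encode->time_base (from the question)
const Rational kSharedOutTb{1, 48000};    // out_stream->time_base at write time (from the question)
const Rational kVideoStreamTb{1, 24000};  // ASSUMED time base of the video stream in the MP4 output
const Rational kMillis{1, 1000};

// Presentation time in milliseconds of frame `i` as the muxer reads it:
// the packet is rescaled to `target_tb`, but the resulting ticks are
// interpreted in the video stream's own time base.
int64_t muxed_millis(int64_t i, Rational target_tb) {
    const int64_t ticks = rescale(i, kEncoderTb, target_tb);
    return rescale(ticks, kVideoStreamTb, kMillis);
}
```

Under these assumptions, muxed_millis(24, kSharedOutTb) is 2002 ms, while muxed_millis(24, kVideoStreamTb) is 1001 ms. Targeting the wrong stream's time base doubles every timestamp, which matches the doubled duration and halved frame rate; targeting the time base of the stream the packet belongs to gives the expected timing.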
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | prostraction |