Memory growth when remuxing many segments with add_stream_from_template(..., opaque=False); opaque=True stays stable #2135

@ncheng89

Summary

When repeatedly remuxing an input into many short MP4 segments using PyAV, RSS grows significantly if I create output streams via:

out.add_stream_from_template(template_stream, opaque=False)

However, using opaque=True keeps RSS stable over hundreds of segments.

This is reproducible on PyAV 16.1.0 with a long H.264/AAC MP4 input and does not require decoding/encoding (packet remux only).

Environment

PyAV: 16.1.0

Python: 3.11

OS: Linux

Input: long MP4 (H.264 + AAC), segmented into 10s chunks

Reproduction
The script below is a minimal reproduction. It continuously demuxes packets from an input container and muxes them into a sequence of output MP4 files (10s each). Each segment creates a new output container and adds its streams via add_stream_from_template.

Key difference: opaque=False vs opaque=True.

Repro code

import av, os, math, psutil

SEGMENT_DURATION = 10
OUTPUT_DIR = "segments"
INPUT_URL = "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4"
os.makedirs(OUTPUT_DIR, exist_ok=True)

in_container = av.open(INPUT_URL)
video_stream = in_container.streams.video[0]
audio_stream = in_container.streams.audio[0] if in_container.streams.audio else None

segment_index = 0
video_pts = 0
audio_pts = 0
segment_start_time = None

def start_new_segment(opaque_flag: bool):
    global out_container, out_video_stream, out_audio_stream, video_pts, audio_pts

    path = os.path.join(OUTPUT_DIR, f"segment_{segment_index:06d}.mp4")
    out_container = av.open(path, mode="w")

    out_video_stream = out_container.add_stream_from_template(
        template=video_stream,
        rate=video_stream.average_rate,
        opaque=opaque_flag,
    )
    out_video_stream.time_base = video_stream.time_base

    out_audio_stream = None
    if audio_stream:
        out_audio_stream = out_container.add_stream_from_template(
            audio_stream,
            opaque=opaque_flag,
        )
        out_audio_stream.time_base = audio_stream.time_base

    video_pts = 0
    audio_pts = 0

    rss = psutil.Process(os.getpid()).memory_info().rss / 1024 / 1024
    print(f"seg={segment_index}, rss={rss:.2f}MB, opaque={opaque_flag}, av={av.__version__}")

# CHANGE THIS FLAG:
OPAQUE_FLAG = False  # True stays stable

start_new_segment(OPAQUE_FLAG)

for packet in in_container.demux((video_stream, audio_stream) if audio_stream else (video_stream,)):
    if packet.pts is None:
        continue

    if packet.stream.type == "video":
        if segment_start_time is None:
            segment_start_time = float(packet.pts * packet.time_base)
        current_time = float(packet.pts * packet.time_base)

        if current_time - segment_start_time >= SEGMENT_DURATION:
            out_container.close()
            segment_index += 1
            segment_start_time = current_time
            start_new_segment(OPAQUE_FLAG)

        packet.pts = video_pts
        packet.dts = video_pts
        packet.stream = out_video_stream
        out_container.mux(packet)
        try:
            packet.unref()
        except Exception:
            pass

        video_pts += 1

    elif packet.stream.type == "audio" and out_audio_stream is not None:
        # Read the duration before muxing; the muxer may reset packet fields.
        duration = packet.duration or 0
        packet.pts = audio_pts
        packet.dts = audio_pts
        packet.stream = out_audio_stream
        out_container.mux(packet)
        try:
            packet.unref()
        except Exception:
            pass

        audio_pts += duration

out_container.close()
in_container.close()

Observed behavior

With opaque=False, RSS grows quickly across segments (from ~103MB to ~209MB over the 31 segments logged below, with the steepest growth in the first few segments; in longer runs it can keep climbing).

With opaque=True, RSS remains almost flat over hundreds of segments (stable within ~1MB).

Expected behavior

For pure remuxing (no encode/decode), opaque=False and opaque=True should not exhibit such a large difference in memory behavior. At minimum, RSS should not keep growing segment after segment when containers are closed.

Additional findings

I tried patching add_stream_from_template:

Current PyAV code (opaque=False branch):

codec_obj = Codec(template.codec_context.codec.name, "w")

If I change it to:

codec_obj = Codec(template.codec_context.codec.name, "r")

then the remux RSS becomes stable even with opaque=False.

However, this breaks encoding use-cases: using add_stream_from_template(..., opaque=False) for actual encoding fails with:

ValueError(22, 'Invalid argument', 'avcodec_send_frame()')

So simply switching "w" → "r" is not a correct fix, but it strongly suggests the memory growth is related to the "w" (encoder) path / codec context creation in the opaque=False branch.

Hypothesis / suggestion

For remuxing, add_stream_from_template might not need to allocate a new AVCodecContext at all. A “copy codecpar only” fast path (e.g. create stream + avcodec_parameters_copy and avoid encoder/decoder context allocation) could avoid the memory growth and also keep opaque=False meaningful for encoding-related workflows.
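
Until something like that exists, a remux-only caller could sidestep the opaque=False branch entirely. The helper below is a minimal sketch of that workaround, not a fix for the underlying growth. The name add_remux_stream is hypothetical; it only wraps the add_stream_from_template(..., opaque=True) call that stayed stable in my runs and copies the template's time_base, as the repro script does.

```python
def add_remux_stream(out_container, template_stream):
    """Add an output stream intended for packet-level remuxing only.

    Hypothetical helper: it forces opaque=True, the setting that kept RSS
    flat in this report, so no encoder codec context is requested.
    """
    out_stream = out_container.add_stream_from_template(
        template_stream,
        opaque=True,
    )
    # Mirror the repro script: keep the input stream's time base.
    out_stream.time_base = template_stream.time_base
    return out_stream
```

In the repro script this would replace both add_stream_from_template calls inside start_new_segment. It is not suitable when the output stream is later used for actual encoding.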

Output with opaque=False:

root@1dee0f1ab738:/opt# python3 t.py
seg=0, rss=103.49MB, opaque=False, av=16.1.0
seg=1, rss=134.75MB, opaque=False, av=16.1.0
seg=2, rss=166.95MB, opaque=False, av=16.1.0
seg=3, rss=174.98MB, opaque=False, av=16.1.0
seg=4, rss=187.02MB, opaque=False, av=16.1.0
seg=5, rss=193.11MB, opaque=False, av=16.1.0
seg=6, rss=195.13MB, opaque=False, av=16.1.0
seg=7, rss=187.11MB, opaque=False, av=16.1.0
seg=8, rss=191.12MB, opaque=False, av=16.1.0
seg=9, rss=193.13MB, opaque=False, av=16.1.0
seg=10, rss=193.14MB, opaque=False, av=16.1.0
seg=11, rss=203.15MB, opaque=False, av=16.1.0
seg=12, rss=193.25MB, opaque=False, av=16.1.0
seg=13, rss=199.27MB, opaque=False, av=16.1.0
seg=14, rss=199.28MB, opaque=False, av=16.1.0
seg=15, rss=199.29MB, opaque=False, av=16.1.0
seg=16, rss=203.36MB, opaque=False, av=16.1.0
seg=17, rss=195.34MB, opaque=False, av=16.1.0
seg=18, rss=205.42MB, opaque=False, av=16.1.0
seg=19, rss=205.43MB, opaque=False, av=16.1.0
seg=20, rss=205.43MB, opaque=False, av=16.1.0
seg=21, rss=207.45MB, opaque=False, av=16.1.0
seg=22, rss=195.32MB, opaque=False, av=16.1.0
seg=23, rss=203.32MB, opaque=False, av=16.1.0
seg=24, rss=205.34MB, opaque=False, av=16.1.0
seg=25, rss=195.32MB, opaque=False, av=16.1.0
seg=26, rss=203.33MB, opaque=False, av=16.1.0
seg=27, rss=209.35MB, opaque=False, av=16.1.0
seg=28, rss=201.33MB, opaque=False, av=16.1.0
seg=29, rss=207.33MB, opaque=False, av=16.1.0
seg=30, rss=209.34MB, opaque=False, av=16.1.0

Output with opaque=True:

root@1dee0f1ab738:/opt# python3 t.py
seg=0, rss=104.32MB, opaque=True, av=16.1.0
seg=1, rss=104.55MB, opaque=True, av=16.1.0
seg=2, rss=104.55MB, opaque=True, av=16.1.0
seg=3, rss=104.55MB, opaque=True, av=16.1.0
seg=4, rss=104.55MB, opaque=True, av=16.1.0
seg=5, rss=104.55MB, opaque=True, av=16.1.0
seg=6, rss=104.55MB, opaque=True, av=16.1.0
seg=7, rss=104.55MB, opaque=True, av=16.1.0
seg=8, rss=104.56MB, opaque=True, av=16.1.0
seg=9, rss=104.56MB, opaque=True, av=16.1.0
seg=10, rss=104.56MB, opaque=True, av=16.1.0
seg=11, rss=104.56MB, opaque=True, av=16.1.0
seg=12, rss=104.56MB, opaque=True, av=16.1.0
seg=13, rss=104.56MB, opaque=True, av=16.1.0
seg=14, rss=104.56MB, opaque=True, av=16.1.0
seg=15, rss=104.56MB, opaque=True, av=16.1.0
seg=16, rss=104.56MB, opaque=True, av=16.1.0
seg=17, rss=104.56MB, opaque=True, av=16.1.0
seg=18, rss=104.56MB, opaque=True, av=16.1.0
seg=19, rss=104.56MB, opaque=True, av=16.1.0
seg=20, rss=104.56MB, opaque=True, av=16.1.0
seg=21, rss=104.56MB, opaque=True, av=16.1.0
seg=22, rss=104.56MB, opaque=True, av=16.1.0
seg=23, rss=104.56MB, opaque=True, av=16.1.0
seg=24, rss=104.56MB, opaque=True, av=16.1.0
seg=25, rss=104.56MB, opaque=True, av=16.1.0
seg=26, rss=104.56MB, opaque=True, av=16.1.0
seg=27, rss=104.57MB, opaque=True, av=16.1.0
seg=28, rss=104.57MB, opaque=True, av=16.1.0
seg=29, rss=104.57MB, opaque=True, av=16.1.0
seg=30, rss=104.57MB, opaque=True, av=16.1.0
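
To compare the two runs at a glance, the first-to-last RSS difference can be pulled out of the printed lines. The snippet below is a small sketch of that; rss_growth is a hypothetical helper, and the sample strings are just the seg=0 and seg=30 lines from the two logs above.

```python
import re

# Matches the "seg=N, rss=X.XXMB" prefix printed by the repro script.
LINE_RE = re.compile(r"seg=(\d+), rss=([\d.]+)MB")

def rss_growth(log_text: str) -> float:
    """Return last-minus-first RSS in MB across the seg=... lines."""
    values = [float(m.group(2)) for m in LINE_RE.finditer(log_text)]
    if len(values) < 2:
        raise ValueError("need at least two seg=... lines")
    return values[-1] - values[0]

opaque_false_log = """\
seg=0, rss=103.49MB, opaque=False, av=16.1.0
seg=30, rss=209.34MB, opaque=False, av=16.1.0
"""
opaque_true_log = """\
seg=0, rss=104.32MB, opaque=True, av=16.1.0
seg=30, rss=104.57MB, opaque=True, av=16.1.0
"""

print(f"opaque=False growth: {rss_growth(opaque_false_log):.2f} MB")  # 105.85 MB
print(f"opaque=True growth:  {rss_growth(opaque_true_log):.2f} MB")   # 0.25 MB
```

Over the same 31 segments that is roughly 106MB of growth with opaque=False against about 0.25MB with opaque=True.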
