
accurate tracking of bytes allocated #228

@shwestrick

MaPLe currently only tracks allocations in a very coarse-grained manner, specifically by tracking chunk allocations and deallocations. As a consequence, the functions MPL.GC.{bytesAllocated,bytesAllocatedOfProc} are fairly inaccurate. Our current scheduling of GCs might also be suffering from this lack of precision.

The issue goes a bit deep. But essentially, the lack of precision comes from the following line of code, which upper bounds the number of bytes allocated within each chunk simply by adding the whole size of the chunk to the counter:

s->cumulativeStatistics->bytesAllocated += HM_getChunkSize(chunk);

The same strategy is used to record bytes allocated for the purpose of scheduling GC:

HM_HH_addRecentBytesAllocated(thread, HM_getChunkSize(chunk));

Ideally, we would be able to accurately track the number of bytes allocated during execution, such that it corresponds exactly to the source-level allocations of the program.

In MLton, this is accomplished by updating a counter whenever a GC occurs. This works for MLton because MLton performs GC only when it runs out of heap space. The same strategy doesn't work for MaPLe, because MaPLe doesn't schedule GCs at all unless a sufficient number of bytes have been allocated! In other words, MaPLe needs a (somewhat accurate) count of the number of bytes allocated at every moment. MaPLe doesn't have a natural notion of "running out of heap" -- we are always free to allocate another heap chunk if we would like.

However, just like MLton, we do have an opportunity to update the value when we enter and exit the run-time system. See enter, which calls HM_HH_updateValues to update the frontier.

Idea: we should be able to get a precise count of allocations by tracking bytes allocated whenever the frontier of a heap chunk advances.
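To make the idea concrete, here is a minimal sketch of frontier-based accounting next to the current chunk-size accounting. Everything in it is hypothetical: `SketchChunk`, `SketchStats`, `sketch_recordChunkCoarse`, and `sketch_advanceFrontier` are simplified stand-ins invented for illustration, not MaPLe's actual types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a heap chunk (not MaPLe's HM_chunk). */
typedef struct {
  char *start;     /* first usable byte of the chunk */
  char *frontier;  /* next free byte; everything below it is allocated */
  char *limit;     /* one past the last usable byte */
} SketchChunk;

/* Hypothetical counters, kept side by side for comparison. */
typedef struct {
  uintmax_t bytesAllocatedCoarse;   /* charged a whole chunk at a time */
  uintmax_t bytesAllocatedPrecise;  /* charged per frontier advance */
} SketchStats;

/* Coarse accounting: charge the full chunk size up front (an upper bound). */
static void sketch_recordChunkCoarse(SketchStats *st, const SketchChunk *c) {
  st->bytesAllocatedCoarse += (uintmax_t)(c->limit - c->start);
}

/* Precise accounting: charge only the bytes the frontier actually moved.
 * Assumes the frontier only moves forward and stays inside the chunk. */
static void sketch_advanceFrontier(SketchStats *st, SketchChunk *c,
                                   char *newFrontier) {
  assert(newFrontier >= c->frontier && newFrontier <= c->limit);
  st->bytesAllocatedPrecise += (uintmax_t)(newFrontier - c->frontier);
  c->frontier = newFrontier;
}
```

Under these assumptions, a fresh 4096-byte chunk whose frontier advances by 24 and then by 40 bytes would be charged 4096 bytes by the coarse counter but only 64 bytes by the precise one.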

Implementing this strategy is going to be hairy. Currently, advancing chunk frontiers in the run-time system is a bit... undisciplined. The enter function is just one place where a frontier advance might happen. Other frontier advances in the run-time system are usually, but not always, marked by a call to HM_HH_updateValues. In a few places we directly call HM_updateChunkFrontierInList, but this function also allows the frontier to move backwards, and more generally, it is used on both non-heap chunks and heap chunks...
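One possible way to impose some discipline is to funnel every frontier change through a single helper that decides whether the change counts as an allocation. The sketch below is only an illustration of that shape. The `isHeapChunk` flag, the `GuardedChunk` type, and `sketch_setFrontier` are assumptions made for this example and do not exist in MaPLe. It also simply ignores backward moves; whether those should instead be subtracted from the counter is a question this sketch does not try to answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk with a flag distinguishing heap chunks from other uses. */
typedef struct {
  char *start;
  char *frontier;
  char *limit;
  bool isHeapChunk;  /* assumption: the run-time can tell these apart */
} GuardedChunk;

/* Set the frontier of a chunk, and count the move as allocation only when
 * (a) the chunk is a heap chunk and (b) the frontier moved forward.
 * Returns the number of bytes added to *bytesAllocated by this call. */
static uintmax_t sketch_setFrontier(uintmax_t *bytesAllocated,
                                    GuardedChunk *c, char *newFrontier) {
  assert(newFrontier >= c->start && newFrontier <= c->limit);
  uintmax_t counted = 0;
  if (c->isHeapChunk && newFrontier > c->frontier) {
    counted = (uintmax_t)(newFrontier - c->frontier);
    *bytesAllocated += counted;
  }
  /* Backward moves and non-heap chunks still update the frontier,
   * but leave the counter untouched. */
  c->frontier = newFrontier;
  return counted;
}
```

With a helper of this kind, call sites would no longer need to know whether a given frontier update is an allocation; that decision would live in one place.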
