
Conversation

@EthanMarx
Contributor

Adds a pipeline for running multiple training jobs using different seeds.
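A minimal sketch of the idea behind this PR (the names here are illustrative, not the actual aframe task classes): draw one random seed per training job and build a per-seed config that downstream tasks can consume.

```python
import random


def make_seed_configs(num_seeds, base_config, rng_seed=0):
    """Return one training config per randomly drawn seed."""
    rng = random.Random(rng_seed)
    seeds = [rng.randrange(0, 100_000) for _ in range(num_seeds)]
    # each job gets the shared base config plus its own seed
    return [dict(base_config, seed=s) for s in seeds]


configs = make_seed_configs(3, {"lr": 1e-3})
```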

@wbenoit26 (Contributor) left a comment


Not sure if you were planning on adding more to this, but I think it looks good. Just had a couple of questions.

# which has enough memory to write large temp
# files with luigi/law
env["TMPDIR"] = f"/local/{os.getenv('USER')}"
env["TMPDIR"] = os.getenv("AFRAME_TMPDIR")
Contributor

Should mention this variable in the README along with the others that you suggest putting into a .env.


def requires(self):
    seeds = np.random.randint(0, 1e5, size=self.num_seeds)
    for seed in seeds:
Contributor

Does this mean that the different seeds are done sequentially? Or does that scheduling happen sequentially and all the training/inference happens in parallel?

Contributor Author

So they'll still get scheduled in "parallel", which is actually sort of problematic at inference time when they'll be fighting for GPUs.

This is mostly why I'm holding off on merging this. Should find a cleaner solution.
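Not the aframe fix, just a sketch of one way to keep parallel inference jobs from fighting over GPUs: cap concurrency with a semaphore sized to the number of available GPUs, so extra jobs block until a slot frees up. (luigi's per-task `resources` dict with the central scheduler offers a similar built-in mechanism.) `N_GPUS` and `run_inference` are hypothetical stand-ins.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

N_GPUS = 2  # assumed number of GPUs on the node
gpu_slots = threading.Semaphore(N_GPUS)
lock = threading.Lock()
active = 0
max_active = 0


def run_inference(seed):
    """Stand-in for one inference job; acquires a GPU slot first."""
    global active, max_active
    with gpu_slots:  # block until a GPU slot frees up
        with lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)  # stand-in for the actual inference work
        with lock:
            active -= 1
    return seed


with ThreadPoolExecutor(max_workers=6) as ex:
    results = list(ex.map(run_inference, range(6)))
```

With six jobs launched at once, at most two ever run concurrently.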

environment += f'PATH={os.getenv("PATH")} '
environment += f"LAW_CONFIG_FILE={self.law_config} "
environment += f"USER={os.getenv('USER')} "
environment += f"TMPDIR={os.getenv('TMPDIR')} "
Contributor

Should this be AFRAME_TMPDIR?
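A sketch of what the suggested change would look like: forward the aframe-specific temp dir instead of the generic `TMPDIR` (the `AFRAME_TMPDIR` variable comes from earlier in this diff; the `/tmp` fallback here is illustrative, not from the PR).

```python
import os

# prefer the aframe-specific temp dir, falling back to TMPDIR
tmpdir = os.getenv("AFRAME_TMPDIR") or os.getenv("TMPDIR", "/tmp")
environment = ""
environment += f"TMPDIR={tmpdir} "
```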
