Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lightning: prevent conflict with internal step property. #490

Merged
merged 1 commit into from
Mar 7, 2023

Conversation

daavoo
Copy link
Contributor

@daavoo daavoo commented Mar 7, 2023

If a metric didn't contain split but only freq suffix, standardize_metric_name was causing a conflict in the summary. For example: grad_step was being stored as step/grade.

If a metric didn't contain `split` but only `freq` suffix, `standardize_metric_name` was causing a conflict in the summary. For example:
`grad_step` was being stored as `step/grade`.
@daavoo daavoo requested a review from dberenbaum March 7, 2023 17:24
@daavoo daavoo added A: frameworks Area: ML Framework integration bugfix labels Mar 7, 2023
@daavoo
Copy link
Contributor Author

daavoo commented Mar 7, 2023

This surfaces the issue of treating our internal property step as a metric, but that is a whole different story

@dberenbaum
Copy link
Collaborator

Actually, let me take another look here 🙏

@dberenbaum
Copy link
Collaborator

dberenbaum commented Mar 7, 2023

Okay, you can merge, but I think we need to follow up on this.

The root of the problem is here, right?

if isinstance(v, Mapping):
d[k] = nested_update(d.get(k, {}), v)

We try to override {"step": 0} with {"step": {"grad": ...}} but there is no key inside step. It seems like we should be checking that d.get(k, {}) returns a mapping. If not, we can either override or fail more gracefully. (edit: why aren't we just merging dictionaries here? what added value does nested_update provide?)

This could happen without treating step as a metric. For example, even without step, you could get the same error for epoch here if you include epoch-level metrics.

@dberenbaum
Copy link
Collaborator

Also: should we invert {split}/{freq}/{metric} to be {metric}/{freq}/{split}? This would keep all variations on a specific metric grouped together, making it easier to compare them. This would have to be a 3.0 change, but it would also reduce these conflicts.

@daavoo daavoo merged commit 5b3ffc2 into main Mar 7, 2023
@daavoo daavoo deleted the lightning-fix-step branch March 7, 2023 20:09
@daavoo
Copy link
Contributor Author

daavoo commented Mar 7, 2023

It seems like we should be checking that d.get(k, {}) returns a mapping. If not, we can either override or fail more gracefully.

Correct. We just never encountered the situation before of trying to convert an already logged metric to a dictionary

(edit: why aren't we just merging dictionaries here? what added value does nested_update provide?)

How would you merge a nested value? See the description of #197

This could happen without treating step as a metric. For example, even without step, you could get the same error for epoch here if you include epoch-level metrics.

We should fail in log_metric if a nested metric (foo/bar) is logged and any of the parent keys already exists in the summary.

@daavoo
Copy link
Contributor Author

daavoo commented Mar 7, 2023

This would keep all variations on a specific metric grouped together, making it easier to compare them.

I think it's arguable that there are use cases where the current behavior would be preferable and others where not. No real good answer.

@dberenbaum
Copy link
Collaborator

I think it's arguable that there are use cases where the current behavior would be preferable and others where not. No real good answer.

A specific metric will always have values on the same scale, and in my experience it's more natural to have these all next to each other rather than have potentially dissimilar metrics grouped together for an entire split. Would you generally prefer to have them grouped as they are now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: frameworks Area: ML Framework integration bugfix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants