Send live updates outside a git repo #5108

Merged · 2 commits · Feb 15, 2024

dberenbaum (Contributor)

Docs for iterative/dvclive#646. That PR is for DVCLive, but it's really about logging to Studio from platforms where code is not run inside a Git repo.

The docs PR covers:

  • Live updates in Studio when working outside a Git repo
  • Live updates in SageMaker jobs
  • Live updates in Databricks Repos

shcheklein temporarily deployed to dvc-org-live-no-git-a6fn2ivmiv, February 6, 2024 21:43 (inactive)
dberenbaum added the "⌛ status: wait-core-merge" label (Waiting for related product PR merge/release), Feb 6, 2024
dberenbaum marked this pull request as ready for review, February 6, 2024 21:52

github-actions bot commented Feb 6, 2024

Link Check Report

There were no links to check!

@@ -125,6 +121,50 @@ The end result of running the pipeline looks like this:

![Pipeline](/img/sagemaker-pipeline.png)

### Live experiment updates in SageMaker jobs

Review comment (Member):

It would be great to mention that this is the simplest but a limited approach. An alternative is to bundle the Git repo into the SageMaker job and bundle the results at the end of the job. I know users who take that approach, and it's not terribly complicated. We even have script examples, so we could potentially write a blog post or clean them up and share them.

I'm not sure about Databricks though ...

dberenbaum (Contributor Author)

It would be great to see an example so that we can add it. I hesitate to mention other approaches now, since we don't yet have clear instructions on how to implement them.

shcheklein (Member) left a comment

Good stuff. I think there are some alternatives, at least in the case of SageMaker, where people just bundle the repo. Also, it's not exactly clear what the end-to-end workflow is. Will those experiments be detached forever in Studio? Do we expect them to just stay there? How about data: does this workflow assume no DVC-tracked data or models?

dberenbaum (Contributor Author)

> Those experiments will be detached forever in Studio? do we expect them to just stay there? How about data - does this workflow assume no DVC-tracked data / models?

Yup, that's pretty much it. We can add more in the future, but the simplicity is the point here. It shows that you can get a very similar experience to other trackers with about the same amount of effort. Showing additional value by copying files to and from SageMaker can be a next step.

shcheklein temporarily deployed to dvc-org-live-no-git-a6fn2ivmiv, February 15, 2024 18:45 (inactive)
dberenbaum merged commit 23897b5 into main, Feb 15, 2024 (3 checks passed)
dberenbaum deleted the live-no-git branch, February 15, 2024 18:48