-
Notifications
You must be signed in to change notification settings - Fork 1k
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g.
|
I signed it! |
CLAs look good, thanks! |
}() | ||
|
||
select { | ||
case m := <-message: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it's necessary to print results from the main goroutine. Doing so ends up serializing all the network calls, because the loop can't move on to the next local dep to scan until the syncSourceFor call completes.
Maybe instead, make a closure at the beginning of getProjectData()
that we can invoke in a goroutine every time we encounter a new project root.
Of course, we'll need to take some care there to deal with printing from multiple goroutines - if nothing else, it might be better to print only either before or after SyncSourceFor
completes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok doing so has increased speed drastically. Though the way Ive written it not sure it needs to be a closure. Im still analyzing the code to make sure the new goroutine to call the closure always has a chance to complete.
go func() { | ||
goStart := time.Now() | ||
defer fmt.Fprintf(os.Stderr, "Go Duration: %s\n", time.Since(goStart)) | ||
err2 := sm.SyncSourceFor(gps.ProjectIdentifier{ProjectRoot: pr}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted in slack, I probably need to add an sm method that more accurately reports the existence of the repo - does it exist upstream? what about on disk?
@@ -265,7 +275,10 @@ func getProjectData(ctx *dep.Ctx, pkgT gps.PackageTree, cpr string, sm *gps.Sour | |||
} | |||
|
|||
continue | |||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
continue
means no need for an else branch
@@ -237,6 +238,15 @@ func getProjectData(ctx *dep.Ctx, pkgT gps.PackageTree, cpr string, sm *gps.Sour | |||
packages := make(map[string]bool) | |||
notondisk := make(map[gps.ProjectRoot]bool) | |||
ondisk := make(map[gps.ProjectRoot]gps.Version) | |||
|
|||
printNewDeP := func(pr gps.ProjectRoot, sm *gps.SourceMgr) { | |||
message := "Cloning" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better as "Cloned", past tense, because this will only print after the sm is done pulling the dep down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, might be better to switch out the word entirely - "cloning" applies to git, but git isn't all we're dealing with here. Maybe "Fetched latest of"? I think that's generic enough. It also glosses over whether, in git terms, it's actually doing a clone, or just ensuring our local clone is already up to date.
My reasoning behind making it a closure was reuse - there are a couple more places from which it needs to be called. Right now, only the part of the code dealing with direct dependencies is instrumented. Look down in the
They're all guaranteed to complete and print, but not by anything directly visible in the code here. All of those This approach doesn't guarantee we'll see everything printed - for example, transitive deps of deps that aren't in GOPATH won't ever get visited, so a printing goroutine will never be run for them. But I'm not even sure that's really solvable, so this is still a substantial improvement over what we have today. EDIT: note that i'm not saying it's a good practice to just rely on the underlying implementation for completion - just that it will happen. We could certainly add a mechanism to ensure these goroutines complete in the code here - just so long as it doesn't cause us to lose concurrency. |
@@ -237,6 +238,15 @@ func getProjectData(ctx *dep.Ctx, pkgT gps.PackageTree, cpr string, sm *gps.Sour | |||
packages := make(map[string]bool) | |||
notondisk := make(map[gps.ProjectRoot]bool) | |||
ondisk := make(map[gps.ProjectRoot]gps.Version) | |||
|
|||
printNewDeP := func(pr gps.ProjectRoot, sm *gps.SourceMgr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/DeP/Dep/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
@@ -266,6 +276,8 @@ func getProjectData(ctx *dep.Ctx, pkgT gps.PackageTree, cpr string, sm *gps.Sour | |||
|
|||
continue | |||
} | |||
go printNewDep(pr, sm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why no waiting finish of this goroutine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #194 (comment)
@@ -363,6 +375,7 @@ func getProjectData(ctx *dep.Ctx, pkgT gps.PackageTree, cpr string, sm *gps.Sour | |||
} | |||
} else { | |||
dependencies[pr] = []string{pkg} | |||
go printNewDep(pr, sm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
@sdboyer Is there anything else needed here for this issue to close? |
@xmattstrongx i think some feedback from @peterbourgon would be useful, as it was his note that motivated putting this up in the first place. i'm adding him as a reviewer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that it prints something for each dep. I'm confused by the "Fetched latest of dep" verbiage, both of those words signal something specific to me, and it's not what's actually going on. Init is just cloning each identified dep to GOPATH/pkg/dep/sources, right? What about just "Fetched dep" or "Cached dep"?
Also, I don't understand why each printNewDep call is invoked in a new goroutine rather than called synchronously, and the explanation you linked to @mattn doesn't help me understand. Can you explain?
@peterbourgon We decided to make each call happen in a new goroutine because the underlying call to As far as the output verbage @sdboyer and I went back and forth a couple times on what it should display. I do like the Cached dep |
ditto what @xmattstrongx said re: separate goroutines.
Ooh yeah, I liked "Cached dep" even more. I was trying to come up with something to avoid saying "cloned," as that's git/hg-specific, and also isn't the action it's performing if the dep is already present in the cache. So, definitely, this is better. |
@xmattstrongx Thanks for the explanation. Is |
per a previous slack conversation with @sdboyer
|
I'm sorry, that doesn't seem to answer my question. Is |
@peterbourgon I guess that depends on what "the work of
The same |
OK. Calling the closure |
Was violating single responsibility. The new name makes sense to me. Gonna make the change to the function name to
|
OK, I think we're good to merge this now. Thanks! |
The current rules for github usernames only allow for alphanumeric characters or single hyphens and they cannot begin or end with a hyphen. In the past however github username rules were less strict and we need to support these (issue golang#194). Using the Google BigQuery public github dataset, I've checked the usernames of all public commits against the currently defined regex in deduce.go. From these results I've concluded that usernames with multiple consecutive hyphens and usernames that end with a hyphen were allowed in the past and do exist. Fortunately these are the only exceptions I've found, there were no usernames that started with a hyphen or contained any other special characters. In addition, this change now also allows one-letter usernames.
#185