Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scheduler: Use the root task as a scheduler task #57465

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kpamnany
Copy link
Contributor

A Julia thread runs Julia's scheduler in the context of the switching task. If no task is found to switch to, the thread will sleep while holding onto the (possibly completed) task, preventing the task from being garbage collected. This recent Discourse post illustrates precisely this problem.

A solution to this would be for an idle Julia thread to switch to a "scheduler" task, thereby freeing the old task.

Other than thread 1, the root task that is created for every Julia-started (non-GC) thread essentially ends immediately -- we call jl_finish_task at the end of jl_threadfun.

This PR uses root tasks (on all but thread 1) as scheduler tasks. This solves the problem for all but thread 1. We could do the same for thread 1 also, but it would require special-casing as we cannot use thread 1's root task for this purpose.

@kpamnany kpamnany requested review from vtjnash and gbaraldi February 19, 2025 15:36
assert(jl_atomic_load_relaxed(&ct->_state) == JL_TASK_STATE_RUNNABLE);
return ct;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can/should delay this task switch even more?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think of it, we can refactor this slighly. The sleep atomics only make sense after we switched to the root task. And it can/should go to sleep as soon as it checks, theres no need for it to spin since we're guaranteed to have already spun before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're saying we should move this to the top of the if block, before switching sleep_check_state to sleeping?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, I moved this code block ahead of the atomics and added a condition to the if to avoid sleep checking twice.

@JeffBezanson
Copy link
Member

We could do the same for thread 1 also, but it would require special-casing as we cannot use thread 1's root task for this purpose.

Why is this?

@gbaraldi
Copy link
Member

Thread 1s root task actually does stuff like run the REPL/toplevel code. So it's a dead task

@kpamnany
Copy link
Contributor Author

I considered creating a new task in julia_init and making that the root task, but the thread calling julia_init also needs to be wrapped in a task. Alternatively, we could add a sched_task to the TLS alongside root_task and use that uniformly on all threads instead. I chose the simplest option, but am open to any suggestions for improvement.

@vchuravy
Copy link
Member

Could we make the root task on thread 1, do nothing but launch a task that does the work it used to do, and thus it becomes the scheduler task?

@kpamnany
Copy link
Contributor Author

kpamnany commented Feb 21, 2025

Could we make the root task on thread 1, do nothing but launch a task that does the work it used to do, and thus it becomes the scheduler task?

We can. It gets complicated though, because we have to return from julia_init, so we need the caller's stack, which is the standard process stack (i.e. larger than our typical task stack). Also, if we make the root task on thread 1 the scheduler task, then what's the task wrapping the julia_init caller? Do we root that somewhere?

Am I understanding your suggestion correctly?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants