Advertisement

explain "1 core was dedicated to do all the scheduling, now we don't do that anymore"

Started by March 02, 2020 01:34 AM
2 comments, last by Ingenu 4 years, 8 months ago

My understanding of good multithreading in games was that we have a job system that runs on 1 thread to which jobs are scheduled and get assigned to worker threads.

That's also the technique used by all free to use engines as far as I know (unity, unreal, cry, etc.).

Until I watched an interview with Id Software lead engine programmer Billy Kahn in which he says “1 core was dedicated to do all the scheduling, now we don't do that anymore” https://youtu.be/9S5ABf53rDo?t=123

Can someone explain how that works?

I don't know how their solution work but I implemented my ThreadPool and the Task System different. I don't have one scheduling thread but (nearly)-lockless queues that are concurrent for multi-read-multi-write. Each worker has it's own queue while there is also a queue for pending jobs.

When starting a thread/ a task, the system decides which worker to give a task by an incremental counter. If the counter exceeds the number of cores/ workers acquired, it will swap back. This way the system ensures that work is spread over all cores equally and also uses the whole bandwidth of the system.

Worker threads that don't have any work anymore are trying to find another worker with ‘too much’ work and grab from it's job backlog instead. This technic is called ‘work stealing’ and should also ensure the balance of all threads in the job system.

If a worker is still out of work, it will look for pending jobs. I use fibers that each have their own stack and save CPU registers to memory pages. So a job truely runs on it's own and dedicated from other jobs. Jobs can return to the Scheduler if needed and get placed into the pending jobs queue until some conditions are met. Every worker can access the pending job queue any time to test the next job for getting rescheduled.

Finally, worker threads can change into idle state. They won't consume CPU power anymore unless there are still pending jobs waiting to be rescheduled and can be reactivated if new jobs appear in their queue.

I watched this GDC talk and did some experiments with other mechanisms like setjmp/ longjmp before but am finally happy with the fiber solution

Advertisement

1 global task queue is known as Grand Central Dispatch scheduler. (Presented by Apple.)

1 task queue per worker thread with task stealing is Cilk scheduler. (University project then purchased by Intel and integrated in its compiler and other open source compilers before being considered deprecated by Intel promoting Threading Building Blocks + C++11 instead.) (That system is better than GCD because it keeps caches warm, reduces interactions between threads, stealing intelligently is difficult though, but you'll execute depth first and steal breadth first.)

But from the interview is not clear whether he implies a Cilk scheduler or a GCD one, or simply refers to not using a fork/join model (like OpenMP). (ie having tasks spawning tasks with dependency graph vs a single master thread with parallel for loops from time to time.)

-* So many things to do, so little time to spend. *-

This topic is closed to new replies.

Advertisement