Author's note: This was started some time ago but was discontinued because of scheduling and other constraints. I intend to bring the rest of this theory to fruition with code and examples. For now, the theory:
Consistent Thread Execution Manager.
Theory:
A context switch is when the scheduler halts one thread of execution and starts another. These context switches are often done based on what the kernel thinks is best for any given situation.
However, when developing a non-trivial multithreaded application, the engineer has a view of how things should work from a task-accomplishment perspective and codes the logic accordingly. The problem is that context switches often cause threads to execute in a different order than the engineer envisioned, resulting in unexpected behavior. A common symptom is sleep statements added to force context switches when the application behaves erratically. One of the many reasons this is bad is that the engineer is generally patching a situational problem (a different load, OS, machine, etc.).
For any given non-trivial application, there are many permutations of context switches. Of these, X permutations will allow the application to complete, and Y permutations will not, resulting in deadlock or worse. Moreover, within X some permutations will work better or more efficiently than others.
So step one is to distinguish the X group from the Y group.
This will be done using the thread context switch recording technique described in the book Modern Multithreaded Programming.
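To make the X/Y split concrete, here is a toy sketch. It is not the recording technique from the book; it simply brute-forces every schedule of a hypothetical two-thread program that takes two locks in opposite order. The program, the function name `classify_schedules`, and the (operation, lock) step format are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical program: each thread is a list of (operation, lock name) steps.
PROGRAM: Dict[int, List[Tuple[str, str]]] = {
    0: [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")],
    1: [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")],
}


def classify_schedules(program):
    """Return (completing, deadlocking) schedules as tuples of thread IDs."""
    completing, deadlocking = [], []

    def explore(pcs, held, schedule):
        # pcs: next step index per thread; held: lock name -> owning thread.
        if all(pcs[t] == len(program[t]) for t in program):
            completing.append(tuple(schedule))  # the "X" group
            return
        runnable = []
        for t in program:
            if pcs[t] == len(program[t]):
                continue  # this thread has finished
            op, lock = program[t][pcs[t]]
            if op == "acquire" and lock in held:
                continue  # blocked on a lock someone else holds
            runnable.append(t)
        if not runnable:
            deadlocking.append(tuple(schedule))  # the "Y" group
            return
        for t in runnable:
            op, lock = program[t][pcs[t]]
            new_held = dict(held)
            if op == "acquire":
                new_held[lock] = t
            else:
                del new_held[lock]
            new_pcs = dict(pcs)
            new_pcs[t] += 1
            explore(new_pcs, new_held, schedule + [t])

    explore({t: 0 for t in program}, {}, [])
    return completing, deadlocking


completing, deadlocking = classify_schedules(PROGRAM)
print(len(completing), "completing;", len(deadlocking), "deadlocking")
```

Exhaustive enumeration only works for tiny programs, which is one reason a recording and learning approach is attractive for anything non-trivial.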
Two key support routines for this will be the scoring module, which reports how well any given thread does its job and completes, and the switch learning module, which could use a genetic algorithm (GA) or time-based neural net to determine success. Context switch patterns that result in deadlock or race conditions, even temporarily, are discarded, and successful patterns are retained.
It is unrealistic to think that a single context switch pattern can be determined as “best” for any non-trivial application and load situation. Therefore it seems that a series of context switch subpatterns can be determined and applied to different problem domains based on load or another system stress indicator.
Further, these subpatterns must be repeatable; otherwise storing and enforcing a playback pattern could use an unacceptable amount of memory. Thus the subpatterns must be generalized and tagged. With a proper thread class, not only can the context switches be stored and played back, but so can thread status (“interruptible” vs “let me process” vs “critical section”), giving further guidance to the scheduler.
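One possible way to picture a generalized, tagged subpattern is a short repeating unit plus a repeat count, with each switch carrying the thread status. The sketch below is only an assumption about how that could look: the names `ThreadStatus`, `SubPattern`, and `generalize` are hypothetical, and finding the smallest repeating unit is just one simple form of generalization.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class ThreadStatus(Enum):
    INTERRUPTIBLE = "interruptible"
    LET_ME_PROCESS = "let me process"
    CRITICAL_SECTION = "critical section"


# One recorded switch: (thread ID switched to, status it reported).
Switch = Tuple[int, ThreadStatus]


@dataclass(frozen=True)
class SubPattern:
    tag: str                  # label such as "high-load" (hypothetical)
    unit: Tuple[Switch, ...]  # smallest repeating run of switches
    repeats: int              # how many times the unit repeats

    def expand(self) -> List[Switch]:
        """Rebuild the full switch sequence for playback."""
        return list(self.unit) * self.repeats


def generalize(tag: str, recorded: List[Switch]) -> SubPattern:
    """Store a recording as its smallest repeating unit plus a count."""
    n = len(recorded)
    for size in range(1, n + 1):
        if n % size == 0 and recorded[:size] * (n // size) == recorded:
            return SubPattern(tag, tuple(recorded[:size]), n // size)
    return SubPattern(tag, tuple(recorded), 1)  # reached only when empty


recording = [(0, ThreadStatus.INTERRUPTIBLE), (1, ThreadStatus.CRITICAL_SECTION)] * 3
pattern = generalize("high-load", recording)
print(len(pattern.unit), pattern.repeats)
```

Here six recorded switches are stored as a two-switch unit repeated three times, which is where the memory saving comes from.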
Each thread object (or struct in C) also has the concept of a default score. Let's say the score ranges from 0 to 100; if a thread runs to completion, a minimum score of 50 is achieved. The caller or creator of the thread can then augment the score with up to 50 additional points based on how well that thread accomplished its task. When in “score” mode, the scheduler collects scores from threads as they exit (C++ would make this easier but may not be available everywhere) and can feed these values into a GA or neural network set up to determine best-of-breed sets of switches. Logical scoring can include how quickly a routine accomplished its task, how well it used the time slice it was allotted, etc. For example, if there are two threads and the second spends an inordinate amount of time waiting for the first to do something, the excessive context switches count against the score of the second thread, which lowers the overall score of the subroutine or app.
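The scoring rule can be sketched in a few lines. Two things here are assumptions rather than part of the theory as stated: a thread that does not complete is scored 0, and the overall score is the plain average of the thread scores. The names `ThreadResult`, `thread_score`, and `overall_score` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

COMPLETION_SCORE = 50  # awarded when the thread runs to completion
MAX_BONUS = 50         # most the caller or creator can add on top


@dataclass
class ThreadResult:
    thread_id: int
    completed: bool
    bonus: int = 0  # application-level points set by the thread's creator


def thread_score(result: ThreadResult) -> int:
    """Score in 0-100: 50 for completing plus a clamped bonus."""
    if not result.completed:
        return 0  # assumption: the text does not define a non-completion score
    return COMPLETION_SCORE + max(0, min(MAX_BONUS, result.bonus))


def overall_score(results: List[ThreadResult]) -> float:
    """Assumed aggregate: the mean of the individual thread scores."""
    if not results:
        return 0.0
    return sum(thread_score(r) for r in results) / len(results)


results = [ThreadResult(0, True, 30), ThreadResult(1, True, 10)]
print(overall_score(results))
```

A learning module would treat the overall score as the fitness of the switch pattern that produced it.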
Once an optimal set of context switches has been determined (all scores must be at least 50, an arbitrary number meaning the task/thread ran to completion), these switches are archived and, when played back, assure predictable application performance and stability.
At first these patterns will be application-wide until the basic bugs can be worked out; from there, submanagers will handle subsystem patterns. Note: there may need to be some inter-subsystem communication to coordinate activities.
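As a rough sketch of the archiving step, the filter below keeps only patterns whose thread scores are all at least 50 and then picks the one with the highest average. Ranking by average is an assumption (a GA or neural net would replace it), and `is_acceptable` and `best_pattern` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

Pattern = Tuple[int, ...]  # a switch pattern as a sequence of thread IDs
COMPLETION_SCORE = 50


def is_acceptable(scores: List[int]) -> bool:
    """Every thread must have run to completion (score of at least 50)."""
    return len(scores) > 0 and all(s >= COMPLETION_SCORE for s in scores)


def best_pattern(candidates: Dict[Pattern, List[int]]) -> Optional[Pattern]:
    """Return the acceptable pattern with the highest mean score, if any."""
    acceptable = {p: s for p, s in candidates.items() if is_acceptable(s)}
    if not acceptable:
        return None
    return max(acceptable, key=lambda p: sum(acceptable[p]) / len(acceptable[p]))


candidates = {
    (0, 1, 0, 1): [80, 60],
    (0, 0, 1, 1): [100, 0],  # one thread never completed, so it is discarded
    (1, 0, 1, 0): [70, 90],
}
print(best_pattern(candidates))
```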
To demonstrate this idea, the following elements will require development:
The task scheduler:
* Issues new threads based on ID (0-256)
* In training mode: records context switches, stores patterns, collects scores from threads, stores final score.
* In playback mode: Manages all context switches.
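The training and playback modes can be approximated in user space with cooperative yield points. This is a minimal sketch under that assumption: it does not control real kernel context switches, it omits thread issuing and scoring, and the `TaskScheduler.step` API and `run_workers` helper are hypothetical. A pattern that names a thread which never calls `step` would block forever, so a real version would need a timeout.

```python
import threading
from typing import Callable, List, Optional


class TaskScheduler:
    """Records step order in training mode; enforces it in playback mode."""

    def __init__(self, pattern: Optional[List[int]] = None):
        self._cond = threading.Condition()
        self._pattern = list(pattern) if pattern is not None else None
        self._pos = 0
        self.recorded: List[int] = []  # order in which steps actually ran

    def step(self, thread_id: int, work: Callable[[], None]) -> None:
        """Run one unit of work for thread_id at a cooperative switch point."""
        with self._cond:
            if self._pattern is not None:
                # Playback: wait until the pattern says it is our turn.
                while (self._pos < len(self._pattern)
                       and self._pattern[self._pos] != thread_id):
                    self._cond.wait()
                if self._pos < len(self._pattern):
                    self._pos += 1
            self.recorded.append(thread_id)
            work()  # runs under the lock, so steps never overlap
            self._cond.notify_all()


def run_workers(scheduler: TaskScheduler, thread_ids: List[int], steps: int) -> List[int]:
    """Start one worker per ID, each doing `steps` units of work."""
    log: List[int] = []

    def worker(tid: int) -> None:
        for _ in range(steps):
            scheduler.step(tid, lambda: log.append(tid))

    threads = [threading.Thread(target=worker, args=(tid,)) for tid in thread_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


trainer = TaskScheduler()                   # training mode: just record
run_workers(trainer, [0, 1], steps=3)
player = TaskScheduler(trainer.recorded)    # playback mode: enforce the recording
replayed = run_workers(player, [0, 1], steps=3)
print(replayed == trainer.recorded)
```

The training run may produce a different order each time, but the playback run repeats whatever order was recorded.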
The thread class:
* Basic threading functionality (pthread)
* Has API for application-level score
* Reports to the scheduler, at a minimum, a complete/non-complete score, augmented by the logical application score.
The API for application-level scoring should include two components:
1. A score for “thrash”
2. A logical score for fitness to purpose (whatever the purpose the thread has)
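One way the two components might combine into the 0-50 application bonus is sketched below. Everything specific here is an assumption: thrash is measured as the fraction of wasted context switches, fitness is a 0-1 value supplied by the application, the two are blended with equal weight by default, and the `ApplicationScore` class and its method names are hypothetical.

```python
from dataclasses import dataclass

MAX_BONUS = 50  # application-level points available on top of completion


def clamp01(value: float) -> float:
    return max(0.0, min(1.0, value))


@dataclass
class ApplicationScore:
    thrash: float = 0.0         # 0.0 = no wasted switches, 1.0 = all wasted
    fitness: float = 0.0        # 0.0 = useless, 1.0 = fully fit for purpose
    thrash_weight: float = 0.5  # assumed equal weighting of both components

    def report_thrash(self, wasted_switches: int, total_switches: int) -> None:
        """Record thrash as the fraction of switches that did no useful work."""
        if total_switches > 0:
            self.thrash = clamp01(wasted_switches / total_switches)
        else:
            self.thrash = 0.0

    def report_fitness(self, value: float) -> None:
        """Record the application's own judgment of fitness for purpose."""
        self.fitness = clamp01(value)

    def bonus(self) -> int:
        """Blend low thrash and high fitness into 0..MAX_BONUS points."""
        blended = (self.thrash_weight * (1.0 - self.thrash)
                   + (1.0 - self.thrash_weight) * self.fitness)
        return round(MAX_BONUS * blended)


score = ApplicationScore()
score.report_thrash(wasted_switches=2, total_switches=10)
score.report_fitness(0.6)
print(score.bonus())
```

The thread would add this bonus to its completion score of 50 before reporting to the scheduler.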
(To be continued…)
