There are two things every programmer of multithreaded applications knows:
1. Logging is the simplest form of debugging for such systems.
2. The act of logging triggers context switches which changes the execution which in turn means the application behaves differently under debugging conditions than it does under normal conditions.
Often the application of #1 results in #2 meaning the “bug” can go away or change in nature to the point where the original bug is no longer obvious. You can assume that you have the bug fixed yet as soon as you remove/disable the debug code and the bug can return or worse, uncover a completely new one.
Part of the reason for this is that writing to a log, socket or other place to store the log will cause the kernel to execute a context switch. This means that different threads will execute in a different order under debug conditions…this is what I call “the observer effect”: a watched program executes differently than one running in release mode. This is even more noticeable when running it under GDB or Valgrind for example. There are two ways to deal with this:
1. Force the context switches to happen the same way under both conditions.
2. Reduce/eliminate the context switches incurred by the logging process itself.
While #1 is interesting and the subject of another project I have on the back-burner, #2 is the subject of this article. One way to achieve what we want is to defer all file IO until such a time that it no longer matters. This means storing the log-contents in memory which brings up another problem: if you have multiple threads of execution all attempting to write to the same object/data store, you have to gate access via synchronization tools such as the mutex or semaphore and that means blocking some threads while others work and that still brings about the dreaded artificially-induced context switches. One way around this is to give each thread its own area of memory to write to. However since 90% of logging messages are text this could quickly become a resource-hungry monster. This would be bad enough in a normal system but on an embedded system this would make such a method all but unusable.
One way around this is by tokenizing the log messages. If you look at most of your logs you will find most if not all of the messages follow a similar pattern, often with only a few bits of data being different. A few examples of typical log messages reveal:
[11:50:34] Test Process 1: Starting test begin
[11:50:34] Test Process 1: Locking resource mainQueue
[11:50:34] Test Process 1: Unlocking resource mainQueue
[11:50:34] Test Process 1: Sleeping for 1000
[11:50:34] Test Process 1: HWEvent Mouse event
[11:50:34] Test Process 1: Info Finishing task
[11:50:34] Test Process 1: Warning low resources
If we look more closely at this we find four basic kinds of information:
1. Time stamp
2. Originator of the message
3. General text of the message
4. The message “payload” or specific data being conveyed.
In the entry: “[11:50:34] Test Process 1: Locking resource mainQueue”
1. The time stamp is obvious, in brackets
2. “Test Process 1″ is the process originating the message
3. “Locking resource X” is the generalized text of the message and
4. “mainQueue” is the message payload
Total message consumption: ~56 bytes.
With the goals of reducing the chance of triggering a context switch and reducing the memory footprint of the message in mind I created a message token like so:
typedef struct TBasicMsg
{
time_t msgTime;// time of message
mType msgType;// enumeration of message type
payloadType plType;// enumeration of payload type
basePayload basePL;// 16-byte payload
}basicMsg;
The time_t time stamp is obvious. The message type is an enumeration of basic core messages:
typedef enum {
msgNone = 0,
msgStart,
msgStop,
msgLockResource,
msgUnlockResource,
msgSleep,
msgWake,
msgWait,
msgResume,
… etc. Each of these corresponds to a full textual message or more accurately a message format string suitable for sprintf(), stored in an STL map in the server. For example, msgSleep may correspond to the string “Sleeping for %d seconds”.
The payload is simply up to 16 bytes of data. It could be four integers, 16 characters, 15 characters + a NULL, two doubles, etc. Exactly how that data block is interpreted is dictated by the payload type field, enumerated as so:
typedef enum {plEmbeddedChar = 0, plLUChar, plDouble1, plDouble2, plInt1, plInt2, plInt3, plInt4, plLong1, plLong2, plNone} payloadType;
Each of these corresponds to a union to easily translate the 16 byte block defined as:
const short payloadSize = 16;
typedef struct TPayload
{
char base[payloadSize];
}basePayload;
The basePayload is then translated into a specific set of data through the use of a union. For example, plInt reflects:
union intPayload
{
basePayload core;
int n1, n2, n3, n4;
};
There are unions to translate this into various combinations of values such as doubles, integers, longs, short character arrays and so on.
Using this combination of elements, all messages can be contained in a single 28-byte block. The call to store such a message as “Sleeping for 10 seconds” would look like this at the client level:
Log.logMsg(msgSleep, plInt1, Log.makePayload(1000));
This call sends the message token about sleeping along with a payload type meaning a single integer. The makePayload() overloaded function converts the single integer into a basePayload type and returns it. The whole thing is then added to a BasicMsg type along with the current timestamp and pushed into a memory slot provided by the server.
The Bigger Picture:
The idea is that there is a single microlog “server” created and 1-n clients. The server is never instantiated directly but rather by the first instance of the first client in the constructor:
CMicroLog::CMicroLog(string strProcName, long lThreadID, string &strLogName)
{
// if server is not there, create
if(ptrServer == NULL)
{
ptrServer = new CMLogServer(strLogName);
nClients = 1;
}
else
{
nClients++;
}
ptrServer and nClients are both static members of the CMicroLog class. The first client instance created will build the server instance and the last one destroyed will delete the server:
CMicroLog::~CMicroLog()
{
// this should be wrapped in a lock
nClients--;
if( nClients == 0 )
{
cout << "Deleting server..." << endl;
delete ptrServer;
}
}
When the destructor for the server is called, the stored byte-ranges for all client-generated messages are translated into textual equivalents and written to the filename strLogName. Thus it makes a certain kind of sense to have a “control” client created so the log is not written before you are truly ready to exit.
Upon instantiation, after the server is created or the client count is updated, the client registers itself with the server through the registerLog() method:
nBufSize = ptrServer->registerLog(strProcName, lThreadID, &ptrMainBufStart);
strProcName is the textual name of this process (ie “Database manager” or “Socket service”), lThreadID is a long that can store the actual ID or handle for that thread/process or an arbitrary number more meaningful to the debug process. Finally &ptrMainBufStart is simply a reference to a pointer to a memory block managed by the server. On making this call the server sets aside a block of memory for direct use by this client that is the size of the basic message times MSG_BUF_SLOTS, currently 1024. This essentially sets up a circular queue managed by the server associated with this client. Each call by the client to logMsg() pushes the 28-byte block in the slot pointed to by ptrMainBuf (initialized to ptrMainBufStart, established by the server) and updates the value of ptrMainBuf. When ptrMainBuf reaches the end of the block it is reset to the beginning held by ptrMainBufStart. All pointer manipulation and memory writes are performed by the client. Since the client “owns” this zone, no synchronization is needed.
Finally when the last client is destroyed the server calls flushLog() which iterates through all allocated memory blocks, all valid (non-zero) messages are entered into an STL multimap, keyed on the time-stamp. This has the net effect of aggregating all messages from all registered clients in time-stamp order into a single list. For each message in the list, the payload is converted to a string and inserted into the message. That list is then translated into textual messages and written to the log file.
The Code:
The link at the bottom of the article has the source code for the library (CMicroLog (client), CMlogServer (server) and mlog_main.cpp (test/demo application). Please note that in its current version it is very much alpha code and will probably see a lot of refinement and extension in the area of message variety and construction. Consider this to be a proof of concept as much as anything. You will need the cross-platform make system CMake installed to build everything (apt-get install cmake).
Two notes on extending the system (and it is *meant* to be extended!):
1. If you want to extend the core message library, add the new message to the msgType enumeration in mlogmsgs.hpp and then add a string insertion into the msfDigest in mlog.cpp:CMlogserver::initMsgDigest().
2. There is an experimental extension type called auxMsgTypes in auxmsgs.hpp. It is triggered by specifying msgAuxMsg in the logMsg() call, plLUChar in the payload type and then the desired auxMsgType in the overloaded CMicroLog::makePayload() call.
The test app can be run in two different ways:
1. mlog_test -t basic will run a single-threaded test to demonstrate core message building. All messages are written to singletest.log.
2. mlog_test -t multiple -n <number of threads> to test a multithreaded model. This will spin up <number of threads> threads, each of which will create a CMicroLog client and go through a series of sleep and wake periods at random intervals which in turn emits a number of log messages to the server. Once all of the threads are done the server will collate the messages into a single log and write it to multitest.log.
There are a lot of logging systems that are more complete and easier to use. The value this brings to the table is if you need to constrain the number of artificially triggered context switches and/or constrain the amount of memory consumed by the logging process. If you want smaller circular queues, adjust the value of MSG_BUF_SLOTS in mlog.hpp. If you need to add more/different messages to the system or add custom payloads, edit mlogmsgs.hpp and mlog.cpp.
The main logging library is contained in libmlog.so. The main logging code is in mlog.cpp/hpp. The message enums and unions are stored in mlogmshs.hpp and auxmsgs.hpp. The entire codebase is released under the GPL 3.0. Sorce Code is here:
Experiment and share your findings…


Related Articles
2 users responded in this post
[...] Multithreaded app logging Greetings to all. This is my first contribution here. I have been doing multithreaded work for years on a variety of platforms, most recently on Playstation 3/CellBE and Linux for Sony, not that it matters other than to point out its not Windows. One of the trickiest things we have come across is being able to troubleshoot logic problems in heavily multithreaded applications. For those not immediately wincing, the problem is that often the only thing you really have to work with due to platform or other issues is the dreaded printf() or writing to a log file. While both approaches work to an extent the real issue is that doing so causes the kernel to generate unwanted and artificial context switches, resulting in the app working one way with debug statements and another way without. How many times has it happened that you had some nasty but that you finally trap with printf() or log messages, fix the but, build in release mode and suddenly the app works completely different? This is what I refer to as the "observer effect" for lack of anything original. I have spent a lot of time thinking about what could be done for this problem and I think I have come up with a simple germ of an idea to help with it. The system is not by any means complete, more of a proof of concept right now but I think that it has promise. The next real-world system I have to work on I will put it to the test. So how does it work? In short I had the idea of a passive logging server that provided each out-of-process client a unique memory block to play in, used as a circular queue. Since each client then just pushes data (log msgs) to a slot in memory, no synchronization is needed. More, the messages themselves are tokenized such that each one uses a predictable block of memory and the write can be accomplished with a simple memcpy(). How they are tokenized is beyond the scope of this post but is spelt out more in the article on my website (link to that and demo code at bottom of post). The server itself is never instantiated directly; rather the first client instantiated creates the server and as the last client goes out of scope (in theory at the close of the application) the server is destroyed. As each successive client is created the server allocates a block of memory for that particular client to use. As part of destruction the server iterates through the allocated blocks, converting the tokenized logs into textual messages and writes them all to a disk-file. I have a more long-winded article on my blog along with a link to the source code. FWIW it is GPL 3.0. You will need cmake to built the demo app which shows general message construction and how it works in a heavier multithreaded environment. Caveat: the exact message creation/tokenization is still very alpha; I think I will be able to make it more "real world" once I use it a few times in something other than a test-harness. In any event this code and article should explain the direction I am taking on this and would be interested in the thoughts of other folks used to dealing with these kinds of things… Cheers Jeff Microlog 1.0a [...]
Um…right….WTF?
Leave A Reply