"Chris Thomasson"
wrote in message
news:fsidnbwvkJFOl7jVnZ2dnUVZ_u-dnZ2d@comcast.com...
>> On May 8, 11:25 am, "Chris Thomasson" wrote:
>>> I am doing a general-purpose shared memory allocator and need to
>>> determine when a process has been unexpectedly terminated so that it
>>> can make a reclaim adjustment to the allocator state. In other words,
>>> if ProcessA allocates a block of shared memory and gets terminated, I
>>> want to be able to catch this and reclaim the block. What do you think
>>> is the most elegant solution? I want to avoid creating a "watchdog"
>>> process...
>> I think the most elegant solution would be one that uses operating
>> system notifications directly.
>
> [...]
>
> Humm... Well, it looks like I have to use a manager process. I don't think
> there is any other clean method for detecting and recovering from
> TerminateProcess. Okay, that's fine. Now, I need to think about an efficient
> means of maintaining coherency when a process gets terminated in the middle
> of a critical section. Windows process-level mutexes have the
> WAIT_ABANDONED state. The following threads deal with some of this:
>
>
http://groups.google.com/group/comp.programming.threads/msg/94f2a233bd65bf8a
>
>
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/b5775d...
> (read all...)
>
> Basically, you keep a version number in the critical-section in order to
> detect different "levels" of coherent data. Something like:
>
>
> lock();
> // see if we need to recover from a disaster!
> if (version != 3) {
>     if (version > -1 && version < 3) {
>         // perform recovery
>         switch (version) {
>         case 0:  // died during action 1
>         case 1:  // died during action 2
>         case 2:  // died during action 3
>             break;
>         }
>     } else {
>         // the version is trashed! Something very bad happened!
>         abort();
>     }
> }
>
>
> // zero version
> // membar
> --------------------------------------
> // perform action 1
> --------------------------------------
> // membar
> // inc version
> --------------------------------------
> // perform action 2
> --------------------------------------
> // membar
> // inc version
> --------------------------------------
> // perform action 3
> --------------------------------------
> // membar
> // inc version
>
>
> assert(version == 3);
> unlock();
>
>
>
> Any suggestions?
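
For reference, the version-stamp idea quoted above could be sketched in portable C roughly as follows. This is only an illustration under assumptions, not the original poster's code: struct shared_state, the pending field, action1..action3 and the die_at parameter are all hypothetical names, the three actions are made idempotent so that recovery can simply roll forward, the C11 atomic_thread_fence call is a stand-in for "membar", and the caller is assumed to already hold the lock.

```c
#include <stdatomic.h>
#include <stdlib.h>

enum { VERSION_DONE = 3 };

/* Hypothetical shared state. version == 3 means coherent; 0..2 means an
 * update was interrupted after that many actions had completed. */
struct shared_state {
    int version;
    int pending;   /* value being published by the current update */
    int a, b, c;   /* fields written by actions 1, 2 and 3 */
};

/* Stand-in for the "membar" in the pseudocode. */
static void membar(void) { atomic_thread_fence(memory_order_seq_cst); }

/* Idempotent actions: running one twice leaves the same result. */
static void action1(struct shared_state *s) { s->a = s->pending; }
static void action2(struct shared_state *s) { s->b = s->pending; }
static void action3(struct shared_state *s) { s->c = s->pending; }

/* Roll an interrupted update forward. Caller holds the lock. */
static void recover(struct shared_state *s)
{
    if (s->version == VERSION_DONE)
        return;                      /* nothing to repair */
    if (s->version < 0 || s->version > VERSION_DONE)
        abort();                     /* the version itself is trashed */
    switch (s->version) {
    case 0: action1(s); /* fall through */
    case 1: action2(s); /* fall through */
    case 2: action3(s); break;
    }
    membar();
    s->version = VERSION_DONE;
}

/* One critical section. die_at simulates the process being killed while
 * version == die_at; pass -1 for a normal run. Returns 0 if the update
 * completed, -1 if the simulated termination fired. */
static int update(struct shared_state *s, int value, int die_at)
{
    recover(s);                      /* repair a predecessor's mess first */
    s->pending = value;
    membar();
    s->version = 0;
    membar();
    if (die_at == 0) return -1;
    action1(s);
    membar();
    s->version = 1;
    if (die_at == 1) return -1;
    action2(s);
    membar();
    s->version = 2;
    if (die_at == 2) return -1;
    action3(s);
    membar();
    s->version = VERSION_DONE;
    return 0;
}
```

Rolling forward only works here because pending is written before the version is zeroed and each action is a plain assignment; if the real actions are not idempotent, each case would need its own undo or redo logic instead of the fall-through.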
Could 2 extra event objects be used (see below)?

hEventOkForClientToProceed - manual reset event;
hEventNotifyManager - auto reset event;

in client process:

while (1)
{
    ret = WaitForSingleObject(hMutex, INFINITE);
    if (ret == WAIT_ABANDONED)
    {
        // previous owner terminated without releasing the mutex
        CloseHandle(hMutex);
        // full memory fence
        if (pSharedMemory->mutexVersion != mutexVersion)
        {
            ResetEvent(hEventOkForClientToProceed);
            SetEvent(hEventNotifyManager);
            WaitForSingleObject(hEventOkForClientToProceed, INFINITE);
        }
        mutexVersion = pSharedMemory->mutexVersion;
        hMutex = OpenMutex(SYNCHRONIZE, FALSE, pSharedMemory->strMutexName);
        continue;
    }
    break;
}

in manager process:

WaitForSingleObject(hEventNotifyManager, INFINITE);
// Validate/repair shared memory state/data;
CloseHandle(hMutex);
while (1)
{
    // full memory fence
    strcpy(pSharedMemory->strMutexName, MyGenerateMutexName());
    pSharedMemory->mutexVersion++;
    hMutex = CreateMutex(NULL, TRUE, pSharedMemory->strMutexName);
    if (hMutex)
    {
        if (ERROR_ALREADY_EXISTS == GetLastError())
        {
            // name collision: drop the handle to the existing mutex and retry
            CloseHandle(hMutex);
            continue;
        }
        ReleaseMutex(hMutex);
        break;
    }
    else
    {
        // unrecoverable error
    }
}
// full memory fence
SetEvent(hEventOkForClientToProceed);

One thing missing: synchronising access to the control structures
(pSharedMemory->strMutexName, pSharedMemory->mutexVersion) - is it needed?

Boris