Re: Detecting when a process is abruptly terminated...         


Author: boris
Date: May 10, 2008 04:20

Oops: bug fix (see inline below).

"boris" nospam.net> wrote in message
news:48257a55$0$34507$742ec2ed@news.sonic.net...
>
> "Chris Thomasson" comcast.net> wrote in message
> news:fsidnbwvkJFOl7jVnZ2dnUVZ_u-dnZ2d@comcast.com...
>>> On May 8, 11:25 am, "Chris Thomasson" comcast.net> wrote:
>>>> I am doing a general-purpose shared memory allocator and need to
>>>> determine
>>>> when a process has been unexpectedly terminated so that it can make a
>>>> reclaim adjustment to the allocator state. In other words, if ProcessA
>>>> allocates a block of shared memory, and gets terminated, I want to be
>>>> able
>>>> to catch this and reclaim the block. What do you think is the most
>>>> elegant
>>>> solution? I want to avoid creating a "watchdog" process...
>>
>>
>> "foobar" gmail.com> wrote in message
>> news:e9d40eca-c317-4229-b1e6-923b8bb65440@y38g2000hsy.googlegroups.com...
>>> Elegant one I think would be one that uses Operating System
>>> notifications directly.
>>
>> [...]
>>
>> Humm... Well, it looks like I have to use a manager process. I don't
>> think there is any other clean method for detecting and recovering from
>> TerminateProcess. Okay, that's fine. Now, I need to think about efficient
>> means of maintaining coherency when processes get terminated in the middle
>> of a critical-section. Windows process-level mutexes have the
>> WAIT_ABANDONED state. The following threads deal with some of this:
>>
>> http://groups.google.com/group/comp.programming.threads/msg/94f2a233bd65bf8a
>>
>> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/b5775d...
>> (read all...)
>>
>> Basically, you keep a version number in the critical-section in order to
>> detect different "levels" of coherent data. Something like:
>>
>>
>> lock();
>> // see if we need to recover from a disaster!
>> if (version != 3) {
>>     if (version > -1 && version < 3) {
>>         // perform recovery
>>         switch (version) {
>>         case 0:
>>         case 1:
>>         case 2:
>>         }
>>     } else {
>>         // the version is trashed! Something very bad happened!
>>         abort();
>>     }
>> }
>>
>>
>> // zero version
>> // membar
>> --------------------------------------
>> // perform action 1
>> --------------------------------------
>> // membar
>> // inc version
>> --------------------------------------
>> // perform action 2
>> --------------------------------------
>> // membar
>> // inc version
>> --------------------------------------
>> // perform action 3
>> --------------------------------------
>> // membar
>> // inc version
>>
>>
>> assert(version == 3);
>> unlock();
>>
>>
>>
>> Any suggestions?
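For what it's worth, here is a rough, portable C11 sketch of the versioned
critical section described above. It is only an illustration under some
assumptions of mine: the names shared_state, run_actions, recover_if_needed,
update, pending and die_at are all made up, the three "actions" are plain field
writes, and recovery is done by rolling the interrupted update forward (the
pseudocode leaves the recovery cases empty, so that choice is a guess). The
lock itself is left out; the caller is assumed to hold it. The die_at
parameter just simulates the process being killed before a given action.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

enum { COHERENT = 3 };        /* version value when no update is in flight */

typedef struct {
    atomic_int version;       /* number of actions completed so far */
    int pending;              /* value the in-flight update is writing */
    int a, b, c;              /* three fields that must change together */
} shared_state;

void state_init(shared_state *s, int value) {
    atomic_init(&s->version, COHERENT);
    s->pending = value;
    s->a = s->b = s->c = value;
}

/* Run actions first..2. Returns -1 if "killed" before action die_at
 * (pass -1 for die_at to never die). */
int run_actions(shared_state *s, int first, int die_at) {
    for (int step = first; step < COHERENT; ++step) {
        if (step == die_at)
            return -1;                        /* simulated termination */
        switch (step) {
        case 0: s->a = s->pending; break;     /* action 1 */
        case 1: s->b = s->pending; break;     /* action 2 */
        case 2: s->c = s->pending; break;     /* action 3 */
        }
        /* seq_cst store stands in for "membar; inc version" */
        atomic_store(&s->version, step + 1);
    }
    return 0;
}

/* Call with the lock held, before touching the data. */
void recover_if_needed(shared_state *s) {
    int v = atomic_load(&s->version);
    if (v == COHERENT)
        return;
    if (v < 0 || v > COHERENT)
        abort();                              /* the version is trashed */
    run_actions(s, v, -1);                    /* finish the interrupted update */
}

/* Call with the lock held. Returns 0 on success, -1 if "killed". */
int update(shared_state *s, int value, int die_at) {
    recover_if_needed(s);
    s->pending = value;                       /* written before the version drops */
    atomic_store(&s->version, 0);             /* zero version + membar */
    if (run_actions(s, 0, die_at) != 0)
        return -1;
    assert(atomic_load(&s->version) == COHERENT);
    return 0;
}
```

Calling update(&s, 7, 2) leaves version at 2 with only a and b written; the
next caller's recover_if_needed(&s) writes c and brings version back to 3
before doing anything else. Note that pending is stored before the version is
zeroed, so a kill between those two stores still leaves a coherent state.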
> Could 2 extra event objects be used (see below)?
> hEventOkForClientToProceed - manual reset event;
> hEventNotifyManager - auto reset event;
>
> in client process:
>
> while (1)
> {
>     ret = WaitForSingleObject(hMutex, INFINITE);
>     if (ret == WAIT_ABANDONED)
>     {
>         CloseHandle(hMutex);
>         // full memory fence
>         if (pSharedMemory->mutexVersion != mutexVersion)
>         {
>             ResetEvent(hEventOkForClientToProceed);
>             SetEvent(hEventNotifyManager);
>             WaitForSingleObject(hEventOkForClientToProceed, INFINITE);
>         }
>         mutexVersion = pSharedMemory->mutexVersion;
>         hMutex = OpenMutex(SYNCHRONIZE, FALSE, pSharedMemory->strMutexName);
>         continue;
>     }
>     break;
> }
>
> in manager process:
>
> WaitForSingleObject(hEventNotifyManager,INFINITE);
> //Validate/repair shared memory state/data;
> CloseHandle(hMutex);
Remove the above line: move CloseHandle(hMutex); inside while(1) loop.
> while (1)
> {
>     // full memory fence
add line:
CloseHandle(hMutex);
>     strcpy(pSharedMemory->strMutexName, MyGenerateMutexName());
>     pSharedMemory->mutexVersion++;
>     hMutex = CreateMutex(NULL, TRUE, pSharedMemory->strMutexName);
>     if (hMutex)
>     {
>         if (ERROR_ALREADY_EXISTS == GetLastError())
>         {
>             continue;
>         }
>         ReleaseMutex(hMutex);
>         break;
>     }
>     else
>     {
>         // unrecoverable error
>     }
> }
> // full memory fence
> SetEvent(hEventOkForClientToProceed);
>
> One thing missing: synchronizing access to control structures
> (pSharedMemory->strMutexName, pSharedMemory->mutexVersion) - is it needed?
>
> Boris
>
>
>