Yuri wrote:
> Hi guys,
> Now my recursive mutex has stopped working. I guess I will have to
> patch the patch? :-)

Hi,

Adding "else break;" after line 206 in mutex.c should fix it. With that
change, if the mutex is of the RECURSIVE type and already owned by the
calling thread, the while loop is left immediately.
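
To illustrate the idea, here is a minimal toy model of such a lock loop. It is not the xilkernel source: the type rec_mutex_t, the function toy_lock() and the "would block" return value are all made up for this sketch, and a real kernel would block the caller instead of returning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, NOT the xilkernel code. */
typedef enum { REC_NORMAL, REC_RECURSIVE } rec_type_t;
typedef enum { TOY_ACQUIRED, TOY_WOULD_BLOCK } toy_result_t;

typedef struct {
    rec_type_t type;
    bool       locked;
    int        owner;   /* id of the holding thread, -1 if free */
    int        count;   /* nesting depth for recursive locking  */
} rec_mutex_t;

static toy_result_t toy_lock(rec_mutex_t *m, int tid)
{
    while (m->locked) {
        if (!(m->type == REC_RECURSIVE && m->owner == tid)) {
            /* A real kernel would block here and re-test the loop
             * condition after wake-up; the model just reports it. */
            return TOY_WOULD_BLOCK;
        } else {
            break;  /* owner re-locking its recursive mutex: leave the loop */
        }
    }
    m->locked = true;
    m->owner  = tid;
    m->count++;
    return TOY_ACQUIRED;
}

/* The same thread locks twice; returns the resulting nesting depth. */
static int relock_count(rec_type_t type)
{
    rec_mutex_t m = { type, false, -1, 0 };
    (void)toy_lock(&m, 1);
    (void)toy_lock(&m, 1);
    return m.count;
}
```

Without the else branch, the owner of a recursive mutex would never get out of the while loop in this model, because the loop condition stays true and nothing inside the loop changes it.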

Regards
Andreas

Hi guys,
Now my recursive mutex has stopped working. I guess I will have to
patch the patch? :-)

Yuri

Yuri,

The patch is against the file Andreas referred to in his previous posts. To
avoid touching the EDK installation area, you should ideally create and use
a local copy of the kernel. Here are the steps:

1. Copy <edk_install>/sw/lib/bsp/xilkernel_v3_00_a into <your edk
project>/bsp/xilkernel_v3_00_a. This becomes your local copy of the kernel.
2. In an EDK bash shell, cd to <your edk project>/bsp/xilkernel_v3_00_a
3. Run "patch -p0 < mutex.c.patch"
4. Regenerate the libraries from within XPS

Step 3 assumes you have the "patch" utility installed. If you don't, you can
make the change referred to in the patch file yourself. It is just a
one-word change in xilkernel_v3_00_a/src/src/ipc/mutex.c.

Vasanth

Vasanth,
Can you please tell me how to apply this patch in a Windows environment?

Thank you,
Yuri Shteinman

Vasanth Asokan wrote:
> Andreas,
>
> It is indeed a bug. Thanks for taking the time to track this. I have
> attached a naive fix (which will not avoid starvation).
> The fix should appear in EDK 9.1i.
> 
> thanks,
> Vasanth
>

To cut a long story short, Xilinx is the one to blame. Their
pthread_mutex_lock implementation in xilkernel_v3 is somewhat broken.

If a mutex is locked by thread 1 when pthread_mutex_lock() is called by
thread 2, thread 2 is suspended and added to the wait queue of the
mutex. On release of the mutex, the first thread in the wait queue is
unblocked. Depending on the time slice, thread 1 may keep running and
acquire the mutex again, since it is free. This happens in my case due to
the tight loop.

Now thread 2 comes into action. It has been unblocked and is thus set to
run when the time slice of thread 1 ends. Thread 2 was blocked inside
pthread_mutex_lock_basic(), defined in
"EDK\sw\lib\bsp\xilkernel_v3_00_a\src\src\ipc\mutex.c", by a call to
process_block(). When process_block() returns, thread 2 acquires the mutex
_without_ checking whether it is really free, which it is not in my case,
since thread 1 locked it again immediately after freeing it.
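
The sequence can be replayed in a small single-threaded model. This is only a sketch of the scenario described above, not the kernel code: toy_mutex_t and all function names are invented for illustration, and the "threads" are just plain function calls executed in the problematic order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the race, NOT the xilkernel code. */
typedef struct {
    bool locked;
    int  owner;   /* id of the holding thread, -1 if free */
} toy_mutex_t;

/* Wake-up path as described: take the mutex without looking at it. */
static void wake_and_take_unchecked(toy_mutex_t *m, int tid)
{
    m->locked = true;
    m->owner  = tid;
}

/* Wake-up path with a re-check: returns false if the caller would
 * have to block again because somebody else got there first. */
static bool wake_and_take_checked(toy_mutex_t *m, int tid)
{
    if (m->locked)
        return false;
    m->locked = true;
    m->owner  = tid;
    return true;
}

/* Replays: thread 1 unlocks, re-locks within its time slice, then
 * thread 2 finally runs. Returns how many threads believe they hold
 * the mutex afterwards. */
static int holders_after_race(bool recheck)
{
    toy_mutex_t m = { true, 1 };      /* thread 1 holds, thread 2 waits */
    bool t1_holds, t2_holds;

    m.locked = false; m.owner = -1;   /* thread 1 unlocks, thread 2 is made ready */
    m.locked = true;  m.owner = 1;    /* thread 1 locks again in its tight loop   */
    t1_holds = true;

    if (recheck) {
        t2_holds = wake_and_take_checked(&m, 2);
    } else {
        wake_and_take_unchecked(&m, 2);
        t2_holds = true;
    }
    return (int)t1_holds + (int)t2_holds;
}
```

With the unchecked path both threads end up "inside" the critical section at once; with the re-check only thread 1 holds the mutex and thread 2 would have to wait again.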

Best regards
Andreas

David Ashley wrote:
> Well something's broken. The code ought to behave
> differently...At this point there are no options but
> grasping at straws...

Check the behavior of sleep().  I take it sleep(500) is not meant to
sleep for 500 seconds.  POSIX sleep() has weird interactions with
setitimer(), ualarm(), usleep(), and SIGALRM.  It can also return early
due to signal delivery.  nanosleep() is easier to use.

Try the following:
	lock
	print "begin thread %i"
	sleep
	print "end thread %i"
	unlock
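
Spelled out in C, the test could look roughly like this. It is a sketch for a generic POSIX system, not for xilkernel specifically; the names sleep_ms(), worker() and run_demo(), the 5 ms delay and the event log are my own additions so that the result can be checked automatically.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define NTHREADS 2
#define ROUNDS   3

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static int event_log[2 * NTHREADS * ROUNDS];  /* only written while locked */
static int event_count;

/* Sleep for the given number of milliseconds using nanosleep(). */
static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec  = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

static void *worker(void *arg)
{
    int id = *(int *)arg;
    for (int r = 0; r < ROUNDS; r++) {
        pthread_mutex_lock(&demo_mutex);
        printf("begin thread %i\n", id);
        event_log[event_count++] = id;
        sleep_ms(5);                      /* sleep while holding the mutex */
        printf("end thread %i\n", id);
        event_log[event_count++] = id;
        pthread_mutex_unlock(&demo_mutex);
    }
    return NULL;
}

/* Returns 1 if every "begin" was directly followed by the "end" of
 * the same thread, i.e. the critical sections never interleaved. */
static int run_demo(void)
{
    pthread_t threads[NTHREADS];
    int ids[NTHREADS];

    event_count = 0;
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i + 1;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);

    for (int i = 0; i < event_count; i += 2)
        if (event_log[i] != event_log[i + 1])
            return 0;
    return event_count == 2 * NTHREADS * ROUNDS;
}
```

If the mutex works, a "begin thread N" line is always followed by "end thread N" for the same N, and run_demo() returns 1.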

ryanrsrsrs@yahoo.com wrote:
> David Ashley wrote:
> 
>>Did you init the mutex? Is the mutex residing on some cacheable
>>memory, and the cache hasn't been flushed?
> 
> 
> On a POSIX-conforming system (i.e. compiler + OS + hardware as a whole),
> the pthread API is sufficient to achieve synchronization.  No low-level
> tricks are necessary.   That's the whole point.
> 

Well something's broken. The code ought to behave
differently...At this point there are no options but
grasping at straws...

The init code wasn't included. You can't just declare
a mutex; you need to initialize it:

static pthread_mutex_t amutex=PTHREAD_MUTEX_INITIALIZER;

For example...
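
A slightly fuller sketch, using only standard POSIX calls, shows both static initialization and dynamic initialization of a recursive mutex. Whether a given kernel (xilkernel included) supports all of these calls is an assumption you would have to check; the helper names init_recursive_mutex() and check_mutex_init() are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Static initialization: default (non-recursive) mutex. */
static pthread_mutex_t static_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Dynamic initialization of a recursive mutex. Returns 0 on success,
 * otherwise the error code of the call that failed. */
static int init_recursive_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Returns 0 if both mutexes can be locked and unlocked as expected. */
static int check_mutex_init(void)
{
    pthread_mutex_t rec;
    int rc = pthread_mutex_lock(&static_mutex);
    if (rc == 0)
        rc = pthread_mutex_unlock(&static_mutex);
    if (rc != 0)
        return rc;

    rc = init_recursive_mutex(&rec);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&rec);
    if (rc == 0) rc = pthread_mutex_lock(&rec);    /* second lock by the owner */
    if (rc == 0) rc = pthread_mutex_unlock(&rec);
    if (rc == 0) rc = pthread_mutex_unlock(&rec);
    pthread_mutex_destroy(&rec);
    return rc;
}
```

Checking the return codes is worthwhile here: a failing init or lock call is one of the cheaper things to rule out.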

-Dave

-- 
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture

David Ashley wrote:
> Did you init the mutex? Is the mutex residing on some cacheable
> memory, and the cache hasn't been flushed?

On a POSIX-conforming system (i.e. compiler + OS + hardware as a whole),
the pthread API is sufficient to achieve synchronization.  No low-level
tricks are necessary.   That's the whole point.

Andreas Hofmann wrote:
> On the other hand, the Xilinx manual clearly states that when
> pthread_mutex_unlock() is called "the thread
> that is at the head of the mutex wait queue is unblocked and
> allowed to lock the mutex.". Thus calling or omitting yield() shouldn't
> make any difference.

Unblocking a thread is not the same as giving it processor time.
Assuming there are only two threads, the unblocked thread will not
actually execute unless it has a higher priority than the current
thread, or the current thread blocks.  If both threads have equal
priority, then the unblocked thread will also run if the current thread
calls sched_yield().  And if the priorities are equal and the current
thread uses the SCHED_RR policy, then the unblocked thread will also run
when the current thread's time quantum expires.  If either thread uses
the SCHED_OTHER policy, then all bets are off.  Likewise, if Xilinx's
implementation is not POSIX conforming, then who knows what will happen.
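
As a rough illustration of the sched_yield() case, here is a sketch for a generic POSIX system in which each thread gives up the processor right after unlocking. The names polite_worker() and run_two_workers() and the shared counter are invented for the example. Note that the yield is only a hint to the scheduler; it does not by itself guarantee fairness.

```c
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;   /* protected by counter_mutex */

/* Locks, works, unlocks, then yields so that a thread which was just
 * unblocked gets a chance to run before this thread locks again. */
static void *polite_worker(void *arg)
{
    long rounds = *(long *)arg;
    for (long i = 0; i < rounds; i++) {
        pthread_mutex_lock(&counter_mutex);
        shared_counter++;
        pthread_mutex_unlock(&counter_mutex);
        sched_yield();        /* give up the rest of the time slice */
    }
    return NULL;
}

/* Runs two workers to completion and returns the final counter value. */
static long run_two_workers(long rounds)
{
    pthread_t a, b;

    shared_counter = 0;
    pthread_create(&a, NULL, polite_worker, &rounds);
    pthread_create(&b, NULL, polite_worker, &rounds);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

With a correct mutex the counter ends up at exactly twice the number of rounds, with or without the yield. What the yield changes is only how soon the other thread gets to run.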

Andreas Hofmann wrote:
> The other threads should be blocked because sleep() is called while the
> mutex is held by the sleeping thread.
> 
> Admittedly my code immediately tries to lock the mutex after releasing
> it, so the other threads may have no chance to execute their lock
> request. This seems to cause the problems, because enabling
> yield() support and calling yield() after pthread_mutex_unlock() solves
> the problem. The program now behaves as expected.
> 
> However, I do not fully understand what the kernel is doing when yield()
> is missing. Shouldn't the other threads, which do not hold the lock,
> starve while one of the threads locks the mutex over and over again?
> 
> On the other hand, the Xilinx manual clearly states that when
> pthread_mutex_unlock() is called "the thread
> that is at the head of the mutex wait queue is unblocked and
> allowed to lock the mutex.". Thus calling or omitting yield() shouldn't
> make any difference.

Andreas,

Something is wrong. As your code is written, there is no way
the 2 threads can print out timestamps so close together.
It's as if your pthread locking is having no effect at all.

Did you init the mutex? Is the mutex residing on some cacheable
memory, and the cache hasn't been flushed? Check the details
of the mutex itself; there may be limitations. On the Blackfin DSP,
mutex locking was based on its test-and-set instruction, which
was flawed: you had to use external memory, otherwise it didn't
work. There may be something similar going on here.

-Dave

-- 
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture