Ticket #696 (new defect)
Agent threads are not joined safely
| Reported by: | hakan@… | Owned by: | |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | LEAP | Version: | 3.0.0-GA |
| Keywords: | threads | Cc: | |
| patch waiting for maintainer: | no | | |
Description
When saImmOmInitialize is invoked for the first time, a few threads are created. When saImmOmFinalize is invoked for the last handle, these threads exit after a while. Unfortunately, at least one of the threads is still running after saImmOmFinalize has returned. This is a problem because, as far as I can see, the API offers no other synchronization primitive that we can use to wait for the final OpenSAF thread to exit. We have encountered some nasty crashes when we unload the OpenSAF agent library code too soon after the final call to saImmOmFinalize, presumably because a remaining thread is still executing code from the unloaded library. saImmOmFinalize should join all threads before it returns.
Other places in the OpenSAF agent library code may suffer from the same bug. See the stack trace from gdb below for details about which threads are active before the final call to saImmOmFinalize.
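To illustrate the behaviour requested above, here is a minimal sketch of a reference-counted initialize/finalize pair in which the last finalize joins the worker thread before returning. It is not OpenSAF code: `agent_initialize`, `agent_finalize`, `agent_running_threads` and all variables are hypothetical names, and a single worker waiting on a condition variable stands in for the real agent threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* All names below are hypothetical; they do not exist in OpenSAF. */

/* Serializes initialize/finalize, including the join in the last finalize. */
static pthread_mutex_t lifecycle_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned open_handles = 0;
static pthread_t worker;

/* Protects the stop flag that the worker waits on. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_changed = PTHREAD_COND_INITIALIZER;
static bool stop_requested = false;

/* Number of library threads that have been created and not yet finished. */
static atomic_int running_threads = 0;

static void *worker_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&state_lock);
    while (!stop_requested)
        pthread_cond_wait(&state_changed, &state_lock);
    pthread_mutex_unlock(&state_lock);
    atomic_fetch_sub(&running_threads, 1);
    return NULL;
}

int agent_running_threads(void)
{
    return atomic_load(&running_threads);
}

/* The first call starts the worker; later calls only bump the handle count. */
int agent_initialize(void)
{
    int rc = 0;

    pthread_mutex_lock(&lifecycle_lock);
    if (open_handles == 0) {
        pthread_mutex_lock(&state_lock);
        stop_requested = false;
        pthread_mutex_unlock(&state_lock);

        atomic_fetch_add(&running_threads, 1);
        rc = pthread_create(&worker, NULL, worker_main, NULL);
        if (rc != 0)
            atomic_fetch_sub(&running_threads, 1);
    }
    if (rc == 0)
        open_handles++;
    pthread_mutex_unlock(&lifecycle_lock);
    return rc == 0 ? 0 : -1;
}

/* The last call asks the worker to stop and joins it before returning. */
int agent_finalize(void)
{
    int rc = 0;

    pthread_mutex_lock(&lifecycle_lock);
    if (open_handles == 0) {
        pthread_mutex_unlock(&lifecycle_lock);
        return -1; /* nothing to finalize */
    }
    if (--open_handles == 0) {
        pthread_mutex_lock(&state_lock);
        stop_requested = true;
        pthread_cond_signal(&state_changed);
        pthread_mutex_unlock(&state_lock);

        rc = pthread_join(worker, NULL);
        /* After a successful join the worker has fully exited. */
        assert(rc != 0 || atomic_load(&running_threads) == 0);
    }
    pthread_mutex_unlock(&lifecycle_lock);
    return rc == 0 ? 0 : -1;
}
```

With this shape, a caller can unload the library as soon as the last `agent_finalize()` has returned. The worker never takes `lifecycle_lock`, so holding that lock across `pthread_join` cannot deadlock, and it keeps a concurrent `agent_initialize()` from restarting the worker in the middle of a shutdown.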
/Håkan
--
Håkan Mattsson, Erlang/OTP, Ericsson AB
(gdb) thr 2
[Switching to thread 2 (process 15689)]#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
#1 0x00002aaaab174a05 in ncs_sel_obj_select (highest_sel_obj={raise_obj = 15, rmv_obj = 16}, rfds=0x420684b0, wfds=0x0, efds=0x0, timeout_in_10ms=0x0) at src/os_defs.c:2818
#2 0x00002aaaab149d23 in ncs_ipc_recv_common (mbx=0x2aaaab2e2580, block=1) at src/sysf_ipc.c:447
#3 0x00002aaaab149bf5 in ncs_ipc_recv (mbx=0x2aaaab2e2580) at src/sysf_ipc.c:394
#4 0x00002aaaab1b736b in dta_do_evts (mbx=0x2aaaab2e2580) at dta_api.c:1260
#5 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#6 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#7 0x0000000000000000 in ?? ()
(gdb) thr 3
[Switching to thread 3 (process 15688)]#0 0x00002b0d895a08b6 in poll () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a08b6 in poll () from /lib64/libc.so.6
#1 0x00002aaaab17d39e in mdtm_process_recv_events () at src/mds_dt_tipc.c:640
#2 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#3 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()
(gdb) thr 4
[Switching to thread 4 (process 15687)]#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
#1 0x00002aaaab14e4d6 in ncs_tmr_wait () at src/sysf_tmr.c:541
#2 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#3 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()
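The backtraces above show the remaining threads blocked in select() or poll(), so a join can only complete if something wakes them up first. One common way to do that, sketched below, is to let the thread also poll the read end of a pipe and to write a byte to that pipe at shutdown. This is only an illustration under assumptions: `struct receiver`, `receiver_start` and `receiver_stop` are hypothetical names, the thread polls nothing but the wake-up pipe, and I have not checked how the real threads (for example the one in mdtm_process_recv_events) could be signalled.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical receiver thread; a real one would also poll its data sockets. */
struct receiver {
    pthread_t thread;
    int wake_fd[2]; /* pipe: [0] is polled by the thread, [1] is written on stop */
    int exited;     /* set by the thread on exit; read only after the join */
};

static void *receiver_main(void *arg)
{
    struct receiver *r = arg;
    struct pollfd pfd = { .fd = r->wake_fd[0], .events = POLLIN };

    for (;;) {
        int n = poll(&pfd, 1, -1); /* block without a timeout */
        if (n < 0 && errno == EINTR)
            continue;              /* interrupted by a signal, retry */
        if (n < 0)
            break;                 /* unexpected error, give up */
        if (pfd.revents & (POLLIN | POLLHUP | POLLERR))
            break;                 /* stop was requested */
    }
    r->exited = 1;
    return NULL;
}

int receiver_start(struct receiver *r)
{
    r->exited = 0;
    if (pipe(r->wake_fd) != 0)
        return -1;
    if (pthread_create(&r->thread, NULL, receiver_main, r) != 0) {
        close(r->wake_fd[0]);
        close(r->wake_fd[1]);
        return -1;
    }
    return 0;
}

/* Wakes the thread out of poll() and returns only after it has exited. */
int receiver_stop(struct receiver *r)
{
    char byte = 0;
    ssize_t written;
    int rc;

    do {
        written = write(r->wake_fd[1], &byte, 1);
    } while (written < 0 && errno == EINTR);
    if (written != 1)
        return -1;

    rc = pthread_join(r->thread, NULL);
    /* After a successful join the thread function has returned. */
    assert(rc != 0 || r->exited);

    close(r->wake_fd[0]);
    close(r->wake_fd[1]);
    return rc == 0 ? 0 : -1;
}
```

The pipe is closed only after the join, so the thread never polls a descriptor that has already been closed or reused.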
