Ticket #692 (new defect)

Opened 3 months ago

crash in cpnd at opensaf restart

Reported by: chris.j.leary@…
Owned by:
Priority: critical
Milestone: PL 3.0.1
Component: CPSv
Version: 3.0.0-GA
Keywords:
Cc:
patch waiting for maintainer: no

Description

Our system had been running for 6-8 hours. It was then stopped and restarted via /etc/init.d/opensafd stop ; /etc/init.d/opensafd start, and the system rebooted shortly thereafter.

After the system came back up, we found that ncs_cpnd had core dumped 6 times in a row during startup before the failure escalated to a node reboot.

Below is the backtrace from one of the core dumps (all are identical). The req->info.read.i_addr value passed to ncs_os_posix_shm appears to be invalid: the dump of *req shows i_addr = 0x4.

Core was generated by `/usr/lib/opensaf/ncs_cpnd'.
Program terminated with signal 11, Segmentation fault.
[New process 28112]
[New process 28111]
[New process 28110]
[New process 28109]
[New process 28108]
#0 0x00002b17ffb09d90 in memcpy () from /lib/libc.so.6
(gdb) bt
#0 0x00002b17ffb09d90 in memcpy () from /lib/libc.so.6
#1 0x00002b17ff20e462 in ncs_os_posix_shm (req=0x40dd21c0)
    at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c:1326
#2 0x000000000040af10 in cpnd_find_free_loc (cb=0x631980, type=<value optimized out>)
    at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_res.c:640
#3 0x000000000040b168 in cpnd_restart_shm_client_update (cb=0x40dd2200, cl_node=0x639110)
    at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_res.c:1045
#4 0x000000000041780e in cpnd_process_evt (evt=0x637eb0)
    at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_evt.c:466
#5 0x0000000000403622 in cpnd_main_process (info=<value optimized out>)
    at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_init.c:628

#6 0x00002b17ffdf53f7 in start_thread () from /lib/libpthread.so.0
#7 0x00002b17ffb64b4d in clone () from /lib/libc.so.6
#8 0x0000000000000000 in ?? ()
(gdb) up
#1 0x00002b17ff20e462 in ncs_os_posix_shm (req=0x40dd21c0)
    at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c:1326
1326 /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c: No such file or directory.
    in /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c

(gdb) print req
$1 = (NCS_OS_POSIX_SHM_REQ_INFO *) 0x40dd21c0
(gdb) print *req
$2 = {type = NCS_OS_POSIX_SHM_REQ_READ, info = {open = {i_name = 0x0, i_flags = 4, i_map_flags = 0,
      i_size = 1088233984, i_offset = 40, o_addr = 0x0, o_fd = 0, o_hdl = 0}, close = {i_hdl = 0,
      i_addr = 0x4, i_fd = 1088233984, i_size = 40}, unlink = {i_name = 0x0}, read = {i_hdl = 0,
      i_addr = 0x4, i_to_buff = 0x40dd2200, i_read_size = 40, i_offset = 0}, write = {i_hdl = 0,
      i_addr = 0x4, i_from_buff = 0x40dd2200, i_write_size = 40, i_offset = 0}}}
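
One possible reading of the dump, offered as a sketch and not a confirmed diagnosis: the request type is NCS_OS_POSIX_SHM_REQ_READ, so only the read member of the union is meaningful, and its i_addr of 0x4 cannot be a mapped shared-memory address. If the read path copies i_read_size bytes from i_addr + i_offset into i_to_buff (an assumption, since os_defs.c:1326 is not visible in this session), the memcpy would fault exactly as shown in frame #0. The C below illustrates that failure mode and a defensive check. The struct shm_read_req, the function shm_read_checked and the SHM_MIN_VALID_ADDR threshold are hypothetical names for illustration only; they are not OpenSAF code, and the field types are guessed from the gdb output.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the `read` member of NCS_OS_POSIX_SHM_REQ_INFO.
 * Field names follow the gdb dump; the types are assumptions. */
typedef struct {
    void    *i_addr;      /* base address of the mapped shared-memory segment */
    void    *i_to_buff;   /* destination buffer supplied by the caller */
    uint32_t i_read_size; /* number of bytes to copy */
    uint32_t i_offset;    /* byte offset from i_addr */
} shm_read_req;

/* Assumed threshold: treat anything inside the lowest page as "NULL plus a
 * small offset", which is what a value such as 0x4 looks like. */
#define SHM_MIN_VALID_ADDR ((uintptr_t)4096)

/* Copies i_read_size bytes from i_addr + i_offset into i_to_buff.
 * Returns 0 on success, -1 if the request is rejected without copying. */
static int shm_read_checked(const shm_read_req *req)
{
    if (req == NULL || req->i_to_buff == NULL)
        return -1;

    /* Reject a source address that cannot belong to a real mapping. */
    if ((uintptr_t)req->i_addr < SHM_MIN_VALID_ADDR)
        return -1;

    memcpy(req->i_to_buff,
           (const char *)req->i_addr + req->i_offset,
           req->i_read_size);
    return 0;
}
```

With the values from the dump (i_addr = 0x4, i_read_size = 40, i_offset = 0) this check returns -1 instead of faulting. A guard of this kind would only mask the symptom; the open question is why the caller in cpnd_find_free_loc holds a near-NULL segment address after the restart.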
