OpenSAF: Ticket Query http://devel.opensaf.org/query?status=%21closed&row=description Open Source High Availability Middleware Generally Based on SA Forum Specifications en-US Trac 0.11.1 http://devel.opensaf.org/ticket/319 #319: Bus Error in CPND process Mon, 29 Dec 2008 05:18:05 GMT aditya.sinha@… <p> I am using OpenSAF 2.0.0. While using the SAF-compliant checkpoint service, I found that the application cannot create sections beyond a certain number of times. On further analysis, I found that the cpnd process dumps core because of a bus error and restarts. After restarting, I think the cpnd process is not able to handle requests to create checkpoint sections. </p> <p> Following is the stack trace of the cpnd process. ==================================================================== #0 0x0014d510 in ncs_os_posix_shm (req=0xb7f7a19c) at ./src/os_defs.c:1518 </p> <p> #1 0x08052f87 in cpnd_sec_hdr_update (sec_info=0xa833aa4, cp_node=0x9f8747c) </p> <blockquote> <p> at ./cpnd_proc.c:1992 </p> </blockquote> <p> #2 0x0805915b in cpnd_ckpt_sec_add (cp_node=0x9f8747c, id=0xa833c04, </p> <blockquote> <p> exp_time=10000000000, gen_flag=0) at ./cpnd_db.c:599 </p> </blockquote> <p> #3 0x0806023c in cpnd_evt_proc_ckpt_sect_create (cb=0x9f82ddc, evt=0x9f85e94, </p> <blockquote> <p> sinfo=0x9f85ff0) at ./cpnd_evt.c:2244 </p> </blockquote> <p> #4 0x08059f96 in cpnd_process_evt (evt=0x9f85e8c) at ./cpnd_evt.c:256 </p> <p> #5 0x0804bd49 in cpnd_main_process (info=0x9f82ddc) at ./cpnd_init.c:590 </p> <p> #6 0x00abf2db in start_thread () from /lib/libpthread.so.0 </p> <p> #7 0x0028514e in clone () from /lib/libc.so.6 </p> <p> ==================================================================== </p> Results http://devel.opensaf.org/ticket/319#changelog
http://devel.opensaf.org/ticket/429 #429: GLSV service reports Deadlock and CPU shoots up Wed, 25 Feb 2009 10:19:00 GMT umesh.patel@… <p> The GLSV service reports a deadlock when threads take locks in EX mode using the saLckResourceLock API. When two threads from the same process try to lock the same resource in EX mode, GLSV reports a deadlock and CPU usage for ncs_glnd shoots up to 80%. </p> <p> Please note that each thread takes a lock on only one resource. There is no execution path in which a thread holds a lock on one resource and tries to lock another resource (which could be a true deadlock condition). </p> Results http://devel.opensaf.org/ticket/429#changelog http://devel.opensaf.org/ticket/568 #568: Controller failover does not work when EntityLocations is not 1 or 2 Mon, 18 May 2009 16:56:09 GMT troyh <p> Controller failover does not work when the controllers are not physically in EntityLocations 1 and 2. 
</p> <p> On my c-class cluster I have 4 physical slots: 9, 10, 11 and 12. If I configure the cluster using the default slot_ids 1-4 and update the EntityLocations to 9-12 respectively, the cluster starts and mostly works as expected, except that active controller failover does not work. If I look at the log on the standby controller when the active controller fails I see: </p> <blockquote> <p> opensaf_immnd: Director Service in NOACTIVE state kernel: TIPC: Resetting link <1.1.47:eth0-1.1.31:eth0>, peer not responding kernel: TIPC: Lost link <1.1.47:eth0-1.1.31:eth0> on network plane A kernel: TIPC: Lost contact with <1.1.31> opensaf_immd: IMMND DOWN on active controller f1 detected at standby immd!! f2. Possible failover opensaf_immd: Resend of fevs message 20, will not mbcp to peer IMMD opensaf_immd: Resend of fevs message 21, will not mbcp to peer IMMD opensaf_immnd: DISCARD DUPLICATE FEVS message:20 opensaf_immnd: DISCARD DUPLICATE FEVS message:21 opensaf_immnd: Global discard node received for nodeId:2010f pid:22233 ncs_scap: AVD: Heart Beat missed with active director on 2010f opensaf_fmsd: Role: STANDBY, FM_EVT_HB_LOSS: for slot_id: 9, subslot_id: 15 </p> </blockquote> <p> The STANDBY clearly knows that the ACTIVE went away, yet it doesn't seem to think it needs to become ACTIVE. </p> <p> If I simply move the blades to physical slots 1-4 and update the EntityLocations, everything seems to work as expected. </p> <p> There seems to be a disconnect between the slot/subslot ID and the actual EntityLocation. 
</p> Results http://devel.opensaf.org/ticket/568#changelog http://devel.opensaf.org/ticket/632 #632: message queue server hangs after certain time Thu, 09 Jul 2009 11:46:56 GMT umesh.patel@… <p> Under high load, message queue library threads get stuck on a pthread mutex lock (deadlock). The message queue server gets stuck and stops taking further messages from the queue. </p> <p> We are using the message queue in synchronous mode, i.e. the client waits for the server to reply. At 100 messages per second the problem hits after only 10-15 minutes. This is highly repeatable. The problem is observed frequently for message sizes above 3500 bytes. </p> <p> I am attaching a pstack file for your reference. </p> Results http://devel.opensaf.org/ticket/632#changelog http://devel.opensaf.org/ticket/674 #674: Cleanup mds_papi.h before 4.0 GA Tue, 13 Oct 2009 10:49:53 GMT hafe <p> Reference: <a class="ext-link" href="http://list.opensaf.org/archives/devel/2009-October/005321.html"><span class="icon">http://list.opensaf.org/archives/devel/2009-October/005321.html</span></a> </p> Results http://devel.opensaf.org/ticket/674#changelog http://devel.opensaf.org/ticket/724 #724: CheckpointOpen causes node reload Tue, 01 Dec 2009 15:10:25 GMT stefan.k.berg@… <p> When creating a checkpoint with the enclosed test program, the node restarts. Output from the test program: </p> <pre class="wiki">Line 69, TX#: Call 1: saCkptInitialize (&ckptHandle, 0, &version)
Line 69, TX#: Return 1: saCkptInitialize (&ckptHandle, 0, &version) = 1
Line 72, TX#: Call 2: saCkptSelectionObjectGet( ckptHandle, &fd )
Line 72, TX#: Return 2: saCkptSelectionObjectGet( ckptHandle, &fd ) = 1
Line 109, TX#: Call 3: saCkptCheckpointOpen (ckptHandle, &ckptName, &ckptCreateAttr, ckptOpenFlags, timeout, &checkpointHandle)
</pre><p> (i.e. 
the call does not return, but causes a node restart) </p> <p> Output from syslog at that point in time: </p> <pre class="wiki">Dec 1 16:04:20 SC_2_1 local0.info opensaf_immnd: Create runtime object 'This_is_a_long_checkpoint_name' by Impl id: 5
Dec 1 16:04:20 SC_2_1 local0.notice opensaf_immnd: ERR_INVALID_PARAM: Can not tolerate ',' in RDN: 'safReplica=safAmfNode=SC_2_1,safAmfCluster=myAmfCluster'
Dec 1 16:04:20 SC_2_1 user.err ncs_cpd: saImmOiRtObjectCreate_2 FAILED, rc = 7
Dec 1 16:04:20 SC_2_1 user.err opensaf_scap: NCS_AvSv: Card going for reboot -safComp=CPD,safSu=SC_2_1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown(8) Recovery is:nodeFailover(5)
Dec 1 16:04:20 SC_2_1 user.info opensaf_scap: Component 'safComp=CPD,safSu=SC_2_1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown(8)' - rcvr=5
Dec 1 16:04:20 SC_2_1 local0.info opensaf_immnd: Implementer 5 disconnected. Marking it as doomed <Conn:238, Node:2010f>
Dec 1 16:04:20 SC_2_1 local0.info opensaf_immnd: DISCARDING IMPLEMENTER 5 <238, 2010f> (safCheckPointService)
Dec 1 16:04:20 SC_2_1 user.crit opensaf_scap: node rebooting, reason: A reset has been trigerred for this node
</pre><p> Test based on changeset 5eac71d09644 (4.0.M3 + UML changeset). 
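</p> <p> The syslog output for this ticket shows IMM rejecting the unescaped ',' in the RDN value 'safReplica=safAmfNode=SC_2_1,safAmfCluster=myAmfCluster'. A minimal sketch of one possible workaround follows. It assumes that IMM would accept a backslash-escaped comma (the LDAP-style DN escaping) inside an RDN value; this ticket does not confirm that. The helper name escape_rdn_value is hypothetical and is not an OpenSAF API. </p>

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of OpenSAF): copy `in` to `out`,
 * writing every ',' as a backslash followed by ','.
 * Returns 0 on success, -1 if `out` is too small for the result.
 */
int escape_rdn_value(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;

    for (; *in != '\0'; in++) {
        size_t need = (*in == ',') ? 2 : 1;

        /* Keep one byte in reserve for the terminating NUL. */
        if (used + need + 1 > out_size)
            return -1;
        if (*in == ',')
            out[used++] = '\\';
        out[used++] = *in;
    }
    out[used] = '\0';
    return 0;
}
```

<p> For the node name in the log, the helper would turn safAmfNode=SC_2_1,safAmfCluster=myAmfCluster into safAmfNode=SC_2_1\,safAmfCluster=myAmfCluster, which could then be appended to "safReplica=" before the runtime object is created. 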
</p> <p> BR, </p> <p> Stefan </p> Results http://devel.opensaf.org/ticket/724#changelog http://devel.opensaf.org/ticket/266 #266: AMF: Wrong behavior of N-way active redundancy model Mon, 29 Sep 2008 17:39:07 GMT marioa <p> See background information: <a class="ext-link" href="http://list.opensaf.org/pipermail/users/2008-September/001492.html"><span class="icon">http://list.opensaf.org/pipermail/users/2008-September/001492.html</span></a> </p> <p> According to the AMF spec, a characteristic of the N-way redundancy model is: "At any given time, the Availability Management Framework should make sure that the redundancy level (the preferred number of active assignments) for each SI is guaranteed, if possible, while the maximum number of service units is not exceeded." (AMF chapter 3.7.5.1, bullet 6). </p> <p> The goal is unquestionable; the remaining question is whether it is possible to assign HA-state=ACTIVE to the service units at nodes nine and ten (see background information in the mail thread). </p> <p> According to the AMF spec (chapter 3.3.2.3), the "readiness state" of a component indicates whether the component is ready to take service instance assignments. When a component's readiness state is In-service it is eligible for CSI assignments. The component's readiness state is in-service if its operational state is enabled and the readiness state of the SU containing it is in-service. 
</p> <p> According to the AMF spec (chapter 3.3.1.4), the readiness state of an SU is In-service if: its operational state and the operational state of its containing node are enabled; its administrative state and the administrative state of its containing service group, node and cluster are unlocked; and its presence state is either instantiated or restarting. </p> <p> The log records that we have show that the SUs at both node nine and node ten have readiness state In-service, which means nothing should hinder AMF from assigning CSIs with HA-state=Active to the components of these SUs. </p> <p> The problem has been explained (in the mail thread) as a consequence of the SG not being "in a stable state". We cannot find anything in the AMF specification stating that the SG has to be in a stable state before SUs can be assigned. </p> <p> A possible view of the problem is that AMF detected that a csiSet(QUIESCED) operation timed out on node 5 and tried to recover. During the recovery, CLEANUP was done successfully and then an attempt was made to INSTANTIATE the component. INSTANTIATION failed, however, leaving the component in the INSTANTIATION-FAILED state. (The mistake is perhaps that the successful CLEANUP was not reported internally to the SG so that the SI-state of the SU could be removed.) </p> Results http://devel.opensaf.org/ticket/266#changelog http://devel.opensaf.org/ticket/443 #443: IFSv related Thu, 12 Mar 2009 10:02:42 GMT gaurav.nangla@… <p> We have observed that when we use multiple virtual IP addresses that are close to each other for our processes (e.g. 10.124.25.1 and 10.124.25.2 installed for our active processes on the same node) and then perform a failover/switchover of one process, both IP addresses get removed. 
However, this behaviour is mostly absent when the virtual IP addresses are sufficiently far apart (e.g. 10.124.25.9 and 10.124.25.95). The /var/opt/opensaf/stdouts/ncs_ifnd.log shows ip addr del being called for the IP of the killed process and subsequently ip addr add for the same IP, which is correct, but the other virtual IP simply disappears without any ip addr del being logged in the ncs_ifnd.log file. </p> Results http://devel.opensaf.org/ticket/443#changelog http://devel.opensaf.org/ticket/540 #540: Deadlock in OpenSAF APIs Sat, 02 May 2009 06:55:59 GMT adityakar.jha@… <p> My application uses most of the OpenSAF services along with core services like MDS, timer and mailbox. I observed deadlocks (at irregular intervals ranging from 3 to 24 hours) in application threads (using OpenSAF APIs), resulting in the application not responding to the AMF-initiated HealthCheck and being restarted by OpenSAF. </p> <p> The attachment is the stack of some of the relevant threads of our application, showing the threads waiting on locks. </p> <p> Urgent insight into this problem would be very helpful. </p> Results http://devel.opensaf.org/ticket/540#changelog http://devel.opensaf.org/ticket/577 #577: SCAP crashes for N-way redundancy model. Tue, 26 May 2009 12:40:34 GMT nagendra <p> With an N-way redundancy model application, SCAP crashes with a stack overflow while coming up. This happens when avd_sg_nway_si_assign() calls m_AVD_SET_SG_FSM(cb, sg, AVD_SG_FSM_STABLE), which in turn calls avd_sg_nway_si_assign() again, so the two call each other in a loop. </p> Results http://devel.opensaf.org/ticket/577#changelog http://devel.opensaf.org/ticket/692 #692: crash in cpnd at opensaf restart Fri, 30 Oct 2009 15:38:15 GMT chris.j.leary@… <p> Our system had been running for 6-8 hours. 
The system was stopped and then restarted via /etc/init.d/opensafd stop ; /etc/init.d/opensafd start, and the system rebooted shortly thereafter. </p> <p> After the system came back up, it was found that ncs_cpnd had core dumped 6 times in a row during startup before escalation to node reboot occurred. </p> <p> Below is the backtrace of one of the core dumps (all are identical). There appears to be an error in req->info.read.i_addr passed to ncs_os_posix_shm. </p> <p> Core was generated by `/usr/lib/opensaf/ncs_cpnd'. Program terminated with signal 11, Segmentation fault. [New process 28112] [New process 28111] [New process 28110] [New process 28109] [New process 28108] #0 0x00002b17ffb09d90 in memcpy () from /lib/libc.so.6 (gdb) bt #0 0x00002b17ffb09d90 in memcpy () from /lib/libc.so.6 #1 0x00002b17ff20e462 in ncs_os_posix_shm (req=0x40dd21c0) </p> <blockquote> <p> at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c:1326 </p> </blockquote> <p> #2 0x000000000040af10 in cpnd_find_free_loc (cb=0x631980, type=<value optimized out>) </p> <blockquote> <p> at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_res.c:640 </p> </blockquote> <p> #3 0x000000000040b168 in cpnd_restart_shm_client_update (cb=0x40dd2200, cl_node=0x639110) </p> <blockquote> <p> at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_res.c:1045 </p> </blockquote> <p> #4 0x000000000041780e in cpnd_process_evt (evt=0x637eb0) </p> <blockquote> <p> at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_evt.c:466 </p> </blockquote> <p> #5 0x0000000000403622 in cpnd_main_process (info=<value optimized out>) </p> <blockquote> <p> at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/cpsv/cpnd/cpnd_init.c:628 </p> </blockquote> <p> #6 0x00002b17ffdf53f7 in start_thread () from /lib/libpthread.so.0 #7 0x00002b17ffb64b4d in clone () from /lib/libc.so.6 #8 0x0000000000000000 in ?? () (gdb) up #1 0x00002b17ff20e462 in ncs_os_posix_shm (req=0x40dd21c0) </p> <blockquote> <p> at /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c:1326 </p> </blockquote> <p> 1326 /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c: No such file or directory. 
</p> <blockquote> <p> in /home/cleary/build_opensaf3/cae/extern/opensaf-3.0.0/services/leap/src/os_defs.c </p> </blockquote> <p> (gdb) print req $1 = (NCS_OS_POSIX_SHM_REQ_INFO *) 0x40dd21c0 (gdb) print *req $2 = {type = NCS_OS_POSIX_SHM_REQ_READ, info = {open = {i_name = 0x0, i_flags = 4, i_map_flags = 0, </p> <blockquote> <p> i_size = 1088233984, i_offset = 40, o_addr = 0x0, o_fd = 0, o_hdl = 0}, close = {i_hdl = 0, i_addr = 0x4, i_fd = 1088233984, i_size = 40}, unlink = {i_name = 0x0}, read = {i_hdl = 0, i_addr = 0x4, i_to_buff = 0x40dd2200, i_read_size = 40, i_offset = 0}, write = {i_hdl = 0, i_addr = 0x4, i_from_buff = 0x40dd2200, i_write_size = 40, i_offset = 0}}} </p> </blockquote> Results http://devel.opensaf.org/ticket/692#changelog http://devel.opensaf.org/ticket/745 #745: SNMP Subagent library not recognized Thu, 14 Jan 2010 07:38:08 GMT anonymous <p> We are migrating from OpenSAF release 2.1.0 GA to 3.0.1 GA. We have placed the application's SNMP subagent library in /usr/lib/opnesaf and made an entry for it in the file /etc/opensaf/subagt_lib.conf. However, in the SUBAGENT log file in /var/lib/opensaf/log I find the error "unable to find libhsstm_subagt.so : file does not exist.". OpenSAF is not able to read the library file even though it is placed in the correct location with the correct name. </p> <p> The same library works fine with OpenSAF Release 3.0.0 GA with the same setup as for Release 3.0.1 GA. </p> <p> The same library also works fine with OpenSAF Release 2.1.0 GA. </p> <p> Early help would be much appreciated. </p> Results http://devel.opensaf.org/ticket/745#changelog http://devel.opensaf.org/ticket/134 #134: Define fault Injections use cases Wed, 26 Mar 2008 12:09:03 GMT marioa <p> Current test cases in the OpenSAF test infrastructure are focused on API testing on a single node. 
This task will define fault injection use cases (in the form of a text document) that can be used as input for implementing test cases covering the defined use cases within the OpenSAF test framework. </p> Results http://devel.opensaf.org/ticket/134#changelog http://devel.opensaf.org/ticket/136 #136: Return values of system calls not checked in LEAP file API Fri, 28 Mar 2008 10:40:18 GMT hans.feldt@… <p> In the LEAP function ncs_os_file(), some operations always return SUCCESS rather than the actual status of the underlying system call/command. The clients of this API must check the return value and implement proper error handling. Otherwise this could potentially lead to an inconsistent state of the middleware or crashes. </p> Results http://devel.opensaf.org/ticket/136#changelog http://devel.opensaf.org/ticket/165 #165: MDS over alternate transports Wed, 14 May 2008 11:06:15 GMT marioa Results http://devel.opensaf.org/ticket/165#changelog http://devel.opensaf.org/ticket/183 #183: OpenSAF4.0 Solaris Port with new build system Thu, 22 May 2008 18:45:33 GMT fherrm <p> The goal of this task is to make OpenSAF available on Solaris both from a build perspective and from an execution environment perspective. </p> <p> Sun Studio (<a class="ext-link" href="http://developers.sun.com/sunstudio/"><span class="icon">http://developers.sun.com/sunstudio/</span></a>) as well as gcc must be supported. 
</p> Results http://devel.opensaf.org/ticket/183#changelog http://devel.opensaf.org/ticket/184 #184: Implement SAI-AIS-PLM-A.01.01 Thu, 22 May 2008 18:52:18 GMT fherrm <p> Provide an implementation of the PLM Service A.01.01 </p> Results http://devel.opensaf.org/ticket/184#changelog http://devel.opensaf.org/ticket/208 #208: enhance test tools and build of test infrastructure Tue, 24 Jun 2008 01:48:46 GMT carol.wilhelmy@… <p> The OpenSAF test tools and the build of the OpenSAF test infrastructure require enhancement to accommodate support of new platforms like Solaris. To do this, I suggest </p> <ul><li>enhancing existing documentation </li><li>eliminating unused legacy stuff </li><li>consolidating many scripts into a few simpler scripts </li><li>separating tools for doing build of tests versus setup and running of tests </li><li>replacing the many bash/csh/sh build scripts and makefiles with a simple, clean recursive make structure </li><li>automating as many of the manual steps as possible </li></ul><p> This has the additional potential side effects of </p> <ul><li>enabling easy maintenance </li><li>enabling support for new platforms (like Solaris) </li><li>enabling parallel builds of tests in the future </li><li>enabling the build of tests for a select service (rather than having to re-build all tests every time) </li><li>reducing the number of questions on the users/devel lists about building and running tests </li><li>enabling easier addition of new tests </li><li>ease-of-use: doing things the same way for all platforms </li></ul> Results http://devel.opensaf.org/ticket/208#changelog http://devel.opensaf.org/ticket/210 #210: Controller starts heart beat timer for own AvND causing whole cluster to restart Tue, 24 Jun 2008 13:49:26 GMT bertil.engelholm@… <p> Sometimes when an active controller is rebooted the new active controller also reboots causing the 
whole cluster to restart. </p> <p> It reboots because it is mistakenly marked as a payload node in the cb->avnd_anchor data structure. Therefore the HB timer towards its own AvND is started when the controller becomes active. When that timer times out the controller reboots, causing all payload nodes to reboot since they have lost contact with both controllers. </p> <p> After debugging this for some time I have seen that the avnd->type variable is already wrong the first time the standby controller is updated with AvND data from the active controller, using ckpt. So when this standby controller later becomes active it starts the HB timer. However, I have not seen the active controller (which sends the data to the standby) start its timer. So my conclusion is that the avnd->type variable is mistakenly changed to payload sometime between when it is set to controller and when the avnd data is sent to the standby. </p> <p> This problem sometimes occurs about every 10th reboot, but when debug printouts are added it sometimes disappears completely. So it feels like it could be timing related or possibly some interference between different priority levels. </p> Results http://devel.opensaf.org/ticket/210#changelog http://devel.opensaf.org/ticket/250 #250: LGPL not a good license for documentation Fri, 19 Sep 2008 21:59:43 GMT scon <blockquote class="citation"> <p> Hi All, I have created a new tar ball for the R2 documentation. Please reference it at: (<a class="ext-link" href="http://download.opensaf.org/documentation/OpenSAF_R2_Documentation_091608_pdfs.tar.gz"><span class="icon">http://download.opensaf.org/documentation/OpenSAF_R2_Documentation_091608_pdfs.tar.gz</span></a>). See below for a description of the changes that I have made. </p> </blockquote> <p> I don't think that the LGPL [1] is the most appropriate license for the documentation. 
The "lesser" part of the LGPL is specifically intended to allow linking with non-free modules; this is obviously not applicable to documentation. </p> <p> I believe the straight GNU GPL[2] version 2.0 would be a much more appropriate license for this documentation. Barring the use of the GPL, I would choose the FreeBSD Documentation License[3], or something similar. </p> <p> I'm sure changes to licensing need to go through the OpenSAF board of directors. Anyone else have thoughts on this? </p> <p> Troy </p> <p> [1] <a class="ext-link" href="http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html"><span class="icon">http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html</span></a> [2] <a class="ext-link" href="http://www.gnu.org/licenses/old-licenses/gpl-2.0.html"><span class="icon">http://www.gnu.org/licenses/old-licenses/gpl-2.0.html</span></a> [3] <a class="ext-link" href="http://www.freebsd.org/copyright/freebsd-doc-license.html"><span class="icon">http://www.freebsd.org/copyright/freebsd-doc-license.html</span></a> </p> Results http://devel.opensaf.org/ticket/250#changelog http://devel.opensaf.org/ticket/260 #260: RDE reference implementation causing split brain Thu, 25 Sep 2008 07:14:02 GMT marioa <p> The reference implementation for RDE is not robust enough: it relies on timers/timeouts and often causes split-brain situations (both controllers active or both standby). 
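</p> <p> The tie-break suggested as a fix in this ticket (the controller with the larger epoch counter becomes active; on equal epochs, the one with the smaller OpenSAF node number becomes active) can be sketched as below. This is only an illustration under stated assumptions: the type and function names (rde_peer_t, rde_role_t, rde_elect) are hypothetical, and both controllers are assumed to have exchanged their epoch and node number over the established connection. </p>

```c
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the RDE sources. */
typedef enum { ROLE_ACTIVE, ROLE_STANDBY } rde_role_t;

typedef struct {
    uint32_t epoch;    /* incremented each time a peer connection is established */
    uint32_t node_id;  /* OpenSAF node number, assumed unique per controller */
} rde_peer_t;

/* Decide the role of `self` given the values reported by `peer`. */
rde_role_t rde_elect(const rde_peer_t *self, const rde_peer_t *peer)
{
    /* The larger epoch (longer history) wins. */
    if (self->epoch != peer->epoch)
        return self->epoch > peer->epoch ? ROLE_ACTIVE : ROLE_STANDBY;

    /* Equal epochs: the smaller node number becomes active. */
    return self->node_id < peer->node_id ? ROLE_ACTIVE : ROLE_STANDBY;
}
```

<p> Because each controller evaluates the same comparison with the arguments swapped, the two arrive at opposite roles as long as their node numbers differ and both see the same epoch values. 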
</p> <p> See mail discussion: <a class="ext-link" href="http://list.opensaf.org/pipermail/devel/2008-August/001150.html"><span class="icon">http://list.opensaf.org/pipermail/devel/2008-August/001150.html</span></a> </p> <p> A fix could look like this: </p> <ul><li>have an epoch counter that increments each time a TCP connection is established with the peer RDF </li><li>when the controllers establish a connection, the one with the larger epoch number (longer history) becomes active </li><li>if the two epoch numbers are the same, the one with the smaller OpenSAF node number becomes active </li></ul> Results http://devel.opensaf.org/ticket/260#changelog http://devel.opensaf.org/ticket/267 #267: LEAP timer memory leak Wed, 01 Oct 2008 07:02:10 GMT hafe <p> Services using timers, like CKPT, experience a slow but measurable increase in memory use over time (process ncs_cpnd). The rate depends on the application, of course. </p> <p> The LEAP timer control block is approximately 64 bytes and is not returned until the timer has expired. The function ncs_tmr_free() does not free the memory; it just changes the state of the timer. </p> Results http://devel.opensaf.org/ticket/267#changelog http://devel.opensaf.org/ticket/273 #273: SAI-AIS-SMF: Software Management Framework, Step 1 Sun, 12 Oct 2008 15:07:34 GMT marioa <p> First implementation step/increment of the Software Management Framework. The functional scope of the first step (which will be addressed by this ticket) is still to be defined. </p> Results http://devel.opensaf.org/ticket/273#changelog http://devel.opensaf.org/ticket/274 #274: Add support for 'Log Service Administration API' Mon, 13 Oct 2008 08:38:34 GMT hafe <p> Include support for SaLogFilterSetCallbackT in the API. 
</p> Results http://devel.opensaf.org/ticket/274#changelog http://devel.opensaf.org/ticket/275 #275: Add support for LOG 'Alarms and Notifications' Mon, 13 Oct 2008 08:39:19 GMT hafe Results http://devel.opensaf.org/ticket/275#changelog http://devel.opensaf.org/ticket/281 #281: MAS sometimes goes out of sync Mon, 27 Oct 2008 09:08:12 GMT bertil.engelholm@… <p> MAS synchronization of its data between active and standby does not work properly. Sometimes it goes out of sync, which will cause a cluster restart if a failover happens in this situation. A warm sync made once every minute will correct the problem, but for up to a minute you basically run without a synced standby. The problem is in the mas_init_register function, where the active node sometimes increases its data change counter (async_count) but the standby fails to do the same, which means it is out of sync. In more detail, it happens when the standby calls mab_fltrid_list_get and finds that o_fltr_id == reg_req->fltr_id. In this branch the async_count is never increased. The question is whether the active node should send a SYNC_DONE at all in this case. </p> Results http://devel.opensaf.org/ticket/281#changelog http://devel.opensaf.org/ticket/282 #282: buffer overflow detected in librda when GCC stack-protector is enabled Wed, 29 Oct 2008 17:43:27 GMT troyh <p> When compiling with "-D_FORTIFY_SOURCE=2 -fstack-protector" a buffer overflow is detected at runtime in librda. </p> <p> Wed Oct 29 11:38:53 MDT 2008 - Starting Node Initialization Daemon: /usr/bin/ncs_nid Starting TIPC service... Done. Starting RDF service... Done. 
*** buffer overflow detected ***: /usr/bin/ncs_nid terminated ======= Backtrace: ========= /lib64/libc.so.6(__chk_fail+0x2f)[0x361f2e54bf] /lib64/libc.so.6[0x361f2e47af] /usr/lib64/librda.so.3(rda_get_control_block+0x48)[0x2af40fc583c8] /usr/lib64/librda.so.3[0x2af40fc583ee] /usr/lib64/librda.so.3(pcs_rda_get_role+0x59)[0x2af40fc58cd9] /usr/bin/ncs_nid[0x405671] /usr/bin/ncs_nid[0x405aac] /lib64/libc.so.6(__libc_start_main+0xf4)[0x361f21d8b4] /usr/bin/ncs_nid[0x401b79] ======= Memory map: ======== 00400000-00408000 r-xp 00000000 68:11 3160578 /usr/bin/ncs_nid 00608000-00609000 rw-p 00008000 68:11 3160578 /usr/bin/ncs_nid 073a4000-073c5000 rw-p 073a4000 00:00 0 345c600000-345c60d000 r-xp 00000000 68:11 25002004 /lib64/libgcc_s-4.1.2-20080102.so.1 345c60d000-345c80d000 ---p 0000d000 68:11 25002004 /lib64/libgcc_s-4.1.2-20080102.so.1 345c80d000-345c80e000 rw-p 0000d000 68:11 25002004 /lib64/libgcc_s-4.1.2-20080102.so.1 361ee00000-361ee1a000 r-xp 00000000 68:11 25002267 /lib64/ld-2.5.so 361f01a000-361f01b000 r--p 0001a000 68:11 25002267 /lib64/ld-2.5.so 361f01b000-361f01c000 rw-p 0001b000 68:11 25002267 /lib64/ld-2.5.so 361f200000-361f34a000 r-xp 00000000 68:11 25002268 /lib64/libc-2.5.so 361f34a000-361f549000 ---p 0014a000 68:11 25002268 /lib64/libc-2.5.so 361f549000-361f54d000 r--p 00149000 68:11 25002268 /lib64/libc-2.5.so 361f54d000-361f54e000 rw-p 0014d000 68:11 25002268 /lib64/libc-2.5.so 361f54e000-361f553000 rw-p 361f54e000 00:00 0 361f600000-361f602000 r-xp 00000000 68:11 25002269 /lib64/libdl-2.5.so 361f602000-361f802000 ---p 00002000 68:11 25002269 /lib64/libdl-2.5.so 361f802000-361f803000 r--p 00002000 68:11 25002269 /lib64/libdl-2.5.so 361f803000-361f804000 rw-p 00003000 68:11 25002269 /lib64/libdl-2.5.so 361fa00000-361fa82000 r-xp 00000000 68:11 25002036 /lib64/libm-2.5.so 361fa82000-361fc81000 ---p 00082000 68:11 25002036 /lib64/libm-2.5.so 361fc81000-361fc82000 r--p 00081000 68:11 25002036 /lib64/libm-2.5.so 361fc82000-361fc83000 rw-p 
00082000 68:11 25002036 /lib64/libm-2.5.so 361fe00000-361fe15000 r-xp 00000000 68:11 25002273 /lib64/libpthread-2.5.so 361fe15000-3620014000 ---p 00015000 68:11 25002273 /lib64/libpthread-2.5.so 3620014000-3620015000 r--p 00014000 68:11 25002273 /lib64/libpthread-2.5.so 3620015000-3620016000 rw-p 00015000 68:11 25002273 /lib64/libpthread-2.5.so 3620016000-362001a000 rw-p 3620016000 00:00 0 3620e00000-3620e07000 r-xp 00000000 68:11 25002274 /lib64/librt-2.5.so 3620e07000-3621007000 ---p 00007000 68:11 25002274 /lib64/librt-2.5.so 3621007000-3621008000 r--p 00007000 68:11 25002274 /lib64/librt-2.5.so 3621008000-3621009000 rw-p 00008000 68:11 25002274 /lib64/librt-2.5.so 2af40f9a5000-2af40f9a6000 rw-p 2af40f9a5000 00:00 0 2af40f9b5000-2af40f9b6000 rw-p 2af40f9b5000 00:00 0 2af40f9b6000-2af40fa3f000 r-xp 00000000 68:11 3160457 /usr/lib64/libncs_core.so.3.0.0 2af40fa3f000-2af40fc3e000 ---p 00089000 68:11 3160457 /usr/lib64/libncs_core.so.3.0.0 2af40fc3e000-2af40fc4a000 rw-p 00088000 68:11 3160457 /usr/lib64/libncs_core.so.3.0.0 2af40fc4a000-2af40fc57000 rw-p 2af40fc4a000 00:00 0 2af40fc57000-2af40fc5a000 r-xp 00000000 68:11 3160459 /usr/lib64/librda.so.3.0.0 2af40fc5a000-2af40fe5a000 ---p 00003000 68:11 3160459 /usr/lib64/librda.so.3.0.0 2af40fe5a000-2af40fe5b000 rw-p 00003000 68:11 3160459 /usr/lib64/librda.so.3.0.0 2af40fe5b000-2af40fe5d000 rw-p 2af40fe5b000 00:00 0 7fff9b0f0000-7fff9b105000 rw-p 7fff9b0f0000 00:00 0 [stack] ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso] Wed Oct 29 11:39:06 MDT 2008 - SERVICE Initialization Failed: </p> Results http://devel.opensaf.org/ticket/282#changelog http://devel.opensaf.org/ticket/285 http://devel.opensaf.org/ticket/285 #285: AvND should sent heart beat messages using a real time thread Tue, 04 Nov 2008 07:06:49 GMT hafe <p> Background, see: <a class="ext-link" href="http://list.opensaf.org/pipermail/devel/2008-June/000964.html"><span class="icon">http://list.opensaf.org/pipermail/devel/2008-June/000964.html</span></a> <a 
href="http://devel.opensaf.org/ticket/241">http://devel.opensaf.org/ticket/241</a> </p> Results http://devel.opensaf.org/ticket/285#changelog http://devel.opensaf.org/ticket/286 http://devel.opensaf.org/ticket/286 #286: DTSv file content should be more syslog like Tue, 04 Nov 2008 07:08:27 GMT hafe <p> The format of DTSv files should change and become more "syslog like". This includes a mandatory syslog time stamp (still including milliseconds) in column one. Redundant information (such as node ID) should be removed. The change is meant to facilitate cross searching for information and even merging of DTSv files with syslog files for event correlation. </p> <p> Background: </p> <blockquote> <p> <a class="ext-link" href="http://list.opensaf.org/pipermail/devel/2008-October/001713.html"><span class="icon">http://list.opensaf.org/pipermail/devel/2008-October/001713.html</span></a> </p> </blockquote> Results http://devel.opensaf.org/ticket/286#changelog http://devel.opensaf.org/ticket/290 http://devel.opensaf.org/ticket/290 #290: LEAP scope should be reduced to a number of add-on services Tue, 04 Nov 2008 07:17:35 GMT hafe <p> The LEAP scope should be reduced to a number of add-on services (edu, mailbox, etc). When a corresponding function exists in standard C or POSIX, that function should be used and the LEAP equivalent removed. </p> <p> Background: <a class="ext-link" href="http://list.opensaf.org/pipermail/devel/2008-October/001713.html"><span class="icon">http://list.opensaf.org/pipermail/devel/2008-October/001713.html</span></a> </p> Results http://devel.opensaf.org/ticket/290#changelog http://devel.opensaf.org/ticket/291 http://devel.opensaf.org/ticket/291 #291: DTSv should write to node local disc partitions. Tue, 04 Nov 2008 07:19:46 GMT hafe <p> DTSv should write to node local disc partitions. DTSv clients on controllers always use the local DTSv server. DTSv should be started first and stopped last.
This solves the problem of traces being lost when they are needed most, e.g. during a failed controller fail-over. </p> <p> Background: <a class="ext-link" href="http://list.opensaf.org/pipermail/devel/2008-October/001713.html"><span class="icon">http://list.opensaf.org/pipermail/devel/2008-October/001713.html</span></a> </p> Results http://devel.opensaf.org/ticket/291#changelog http://devel.opensaf.org/ticket/301 http://devel.opensaf.org/ticket/301 #301: avm_rda_cb() fails when standby controller is shutdown Mon, 17 Nov 2008 19:44:55 GMT mibis <p> Running two controllers with HISv turned on. Shut down the standby controller (nis_scxb stop) and the active controller calls avm_rda_cb(). This code calls avm_find_ent_info(), which fails, resulting in an error message in the scap log: </p> <blockquote> <p> PCS_RDA_NODE_RESET_CMD recvd, slot 2, shelf 2 Error in avm_rda_cb : ent_info is NULL </p> </blockquote> <p> It appears that avm_rda_cb() cannot complete its job of resetting the standby controller because it fails to look up the node_name for the failed node using the avm_find_ent_info() call. </p> Results http://devel.opensaf.org/ticket/301#changelog http://devel.opensaf.org/ticket/305 http://devel.opensaf.org/ticket/305 #305: LOG service record write throughput is very bad Tue, 02 Dec 2008 13:05:24 GMT hafe <p> The reason is that the O_SYNC flag is set when opening a file. When the logsv directory is using drbd it gets even worse. </p> <p> The suggestion is to remove the O_SYNC flag. The consequence is that a controller power off could result in loss of acknowledged log records in the log file. </p> <p> Removing the O_SYNC flag results in a ~250 times improvement: from 20 to 5000 records per second!
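</p> <p> A minimal sketch of the trade-off described in this ticket, not the LOG service's actual code: the file is opened without O_SYNC, and fsync() is called once per batch of records, so the durability cost is paid per batch instead of per write. The helper names log_open_buffered and log_write_record and the sync_every parameter are assumptions made for this illustration. </p>

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not OpenSAF code: open a log file for appending
 * WITHOUT O_SYNC, so write() can return once the data is buffered. */
int log_open_buffered(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}

/* Append one record. After every `sync_every` records the file is flushed
 * with fsync(), which bounds how many acknowledged records a power-off
 * could lose. Returns 0 on success, -1 on error (short writes count as
 * errors in this sketch). */
int log_write_record(int fd, const char *rec, unsigned *pending,
                     unsigned sync_every)
{
    size_t len = strlen(rec);
    ssize_t n = write(fd, rec, len);

    if (n < 0 || (size_t)n != len)
        return -1;
    if (++*pending >= sync_every) {
        if (fsync(fd) != 0)
            return -1;
        *pending = 0; /* everything written so far is now on stable storage */
    }
    return 0;
}
```

<p> With sync_every set to 1 this behaves roughly like the O_SYNC case; larger values trade a bounded window of possibly lost records for throughput.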
</p> Results http://devel.opensaf.org/ticket/305#changelog http://devel.opensaf.org/ticket/330 http://devel.opensaf.org/ticket/330 #330: Report MDS_DOWN if no service exists Fri, 16 Jan 2009 08:48:15 GMT bertil.engelholm@… <p> If an m+n adest is withdrawn and there is no other adest available for the service, MDS_DOWN is sent only after AWAIT_ACTIVE_TMR expires. If a new adest is published within AWAIT_ACTIVE_TIME, the user will not be informed that there has been no service available at all for a moment. </p> <p> In AVD-AvND communication this means that if a new controller comes up within AWAIT_ACTIVE_TIME, AvND is not informed that it has lost contact with both AVDs for a while. This implies that the payload node will never enter the cluster again until someone reboots it manually. If MDS_DOWN is sent by MDS to AvND as soon as there is no AVD service available, it will reboot the payload node. </p> <p> I'm working on a patch solving this. </p> Results http://devel.opensaf.org/ticket/330#changelog http://devel.opensaf.org/ticket/332 http://devel.opensaf.org/ticket/332 #332: PSS store corrupted Fri, 16 Jan 2009 09:15:20 GMT hafe <p> We have had major problems with PSSV where the reformat was interrupted in the middle of copying _ISU back to current, which causes a corrupt PSSV. </p> <p> It is suggested to change the copying into renames and to include restoring in case you are interrupted in the middle of the renaming. </p> <p> A patch will be published shortly. </p> Results http://devel.opensaf.org/ticket/332#changelog http://devel.opensaf.org/ticket/334 http://devel.opensaf.org/ticket/334 #334: Seg fault in ncs_cpnd Mon, 19 Jan 2009 13:31:21 GMT hafe <p> Problem seen using 2.0.1 </p> <p> It seems that the shared memory info is not valid (i_offset is crazy) but used anyway. </p> <p> Any ideas? </p> <p> Program terminated with signal 11, Segmentation fault.
#0 0x00002b8adf1353d1 in memcpy () from /lib64/libc.so.6 (gdb) bt #0 0x00002b8adf1353d1 in memcpy () from /lib64/libc.so.6 #1 0x00002aabaaab4af8 in ?? () #2 0x000000000040f1c2 in cpnd_restart_shm_ckpt_free (cb=0x5386d0, cp_node=0x2aaaabadf510) at ./cpnd_res.c:1251 #3 0x000000000041caf9 in cpnd_evt_proc_ckpt_destroy (cb=0x5386d0, evt=0xdfd2f0, sinfo=0xdfd470) at ./cpnd_evt.c:4874 #4 0x00000000004121c4 in cpnd_process_evt (evt=0xdfd2e0) at ./cpnd_evt.c:337 #5 0x0000000000404b4d in cpnd_main_process (info=0x5386d0) at ./cpnd_init.c:590 #6 0x00002b8adefad143 in start_thread () from /lib64/libpthread.so.0 #7 0x00002b8adf180b8d in clone () from /lib64/libc.so.6 #8 0x0000000000000000 in ??
() (gdb) up #1 0x00002aabaaab4af8 in ?? () (gdb) #2 0x000000000040f1c2 in cpnd_restart_shm_ckpt_free (cb=0x5386d0, cp_node=0x2aaaabadf510) at ./cpnd_res.c:1251 1251 ./cpnd_res.c: No such file or directory. </p> <blockquote> <p> in ./cpnd_res.c </p> </blockquote> <p> (gdb) info locals ckpt_info = {ckpt_name = {length = 0, value = '\0' <repeats 255 times>}, ckpt_id = 0, maxSections = 0, maxSecSize = 0, node_id = 0, offset = 0, client_bitmap = 0, is_valid = 0, bm_offset = 0, </p> <blockquote> <p> is_unlink = 0, is_close = 0, cpnd_rep_create = 0, is_first = 0, close_time = </p> </blockquote> <p> 0, next = 0} ckpt_hdr = {num_ckpts = 1990} rc = 1 i_offset = 4294966952 no_ckpts = 1990 (gdb) p cp_node $1 = (CPND_CKPT_NODE *) 0x2aaaabadf510 (gdb) p *cp_node $2 = {patnode = {bit = 9, left = 0x5cb740, right = 0x2aaaabadf510, key_info = 0x2aaaabadf530 "[a\017"}, ckpt_id = 1007963, ckpt_name = {length = 0, </p> <blockquote> <p> value = "tappCkpt-399250", '\0' <repeats 240 times>}, create_attrib = </p> </blockquote> <p> {creationFlags = 1, checkpointSize = 700, retentionDuration = 60000000000, maxSections = 1, maxSectionSize = 700, </p> <blockquote> <p> maxSectionIdSize = 256}, open_flags = 0, ckpt_lcl_ref_cnt = 0, </p> </blockquote> <p> active_mds_dest = 570711172513473, is_active_exist = 1, replica_info = {n_secs = 0, mem_used = 0, open = { </p> <blockquote> <p> type = NCS_OS_POSIX_SHM_REQ_OPEN, info = {open = {i_name = 0xc84300 'ÿ' </p> </blockquote> <p> <repeats 32 times>, "5555", i_flags = 66, i_map_flags = 1, i_size = 1132, i_offset = 0, o_addr = 0x2aaaabc01000, </p> <blockquote> <p> o_fd = 1007, o_hdl = 0},
close = {i_hdl = 13124352, i_addr = </p> </blockquote> <p> 0x100000042, i_fd = 1132, i_size = 0}, unlink = {i_name = 0xc84300 'ÿ' <repeats 32 times>, "5555"}, read = { </p> <blockquote> <p> i_hdl = 13124352, i_addr = 0x100000042, i_to_buff = 0x46c, </p> </blockquote> <p> i_read_size = 0, i_offset = 0}, write = {i_hdl = 13124352, i_addr = 0x100000042, i_from_buff = 0x46c, i_write_size = 0, </p> <blockquote> <p> i_offset = 0}}}, shm_sec_mapping = 0x2aaaabadfa60, section_info = </p> </blockquote> <p> 0x0}, clist = 0x0, cpnd_dest_list = 0x2aaaabadfb10, cpnd_rep_create = 1, is_unlink = 1, is_close = 0, ret_tmr = { </p> <blockquote> <p> type = 0, tmr_id = 0x0, ckpt_id = 0, agent_dest = 0, lcl_sec_id = 0, uarg = </p> </blockquote> <p> 0, is_active = 0, write_type = 0}, is_restart = 0, is_ckpt_onscxb = 1, cur_state = 0, oth_state = 0, </p> <blockquote> <p> agent_dest_list = 0x0, close_time = 0, is_rdset = 0, offset = -1, node_name = </p> </blockquote> <p> {length = 0, value = '\0' <repeats 255 times>}, is_cpa_created_ckpt_replica = 0, evt_bckup_q = 0x0, cpa_sinfo = { </p> <blockquote> <p> to_svc = 0, dest = 0, stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', </p> </blockquote> <p> data = '\0' <repeats 11 times>}}, cpa_sinfo_flag = 0} (gdb) p cp_node->offset $3 = -1 (gdb) p cb->shm_addr.ckpt_addr $4 = (void *) 0x2aaaaaab4c4c (gdb) printf "%x\n",i_offset fffffea8 </p> Results http://devel.opensaf.org/ticket/334#changelog http://devel.opensaf.org/ticket/425 http://devel.opensaf.org/ticket/425 #425: Test Infrastructure Modification Wed, 18 Feb 2009 15:09:10 GMT zx5337 <p> Summary </p> <hr /> <p> The compilation of the test frame is not modified; only the test execution and results collection are modified. Some of the modifications are made according to Carol's ticket (Ticket #208). The AvSv test requires OpenSAF to be restarted, so it is not supported currently.
</p> <p> New files </p> <hr /> <p> run_opensaf_tests.sh setup_tet_env.sh opensaf_tet_env.sh cpsv/suites/cpsv_env.sh cpsv/suites/reg_cpsv.cfg cpsv/suites/reg_cpsv.dep edsv/suites/tet_edsv_util.c edsv/suites/edsv_env.sh edsv/suites/reg_edsv.cfg edsv/suites/reg_edsv.dep glsv/suites/glsv_env.sh glsv/suites/reg_glsv.cfg glsv/suites/reg_glsv.dep ifsv/src/tet_ifsv_util ifsv/suites/ifsv_env.sh ifsv/suites/reg_ifsv.cfg ifsv/suites/reg_ifsv.dep ifsv/vip/tet_ifsv_vip_util.c mbcsv/src/tet_mbcsv_util.c mbcsv/suites/mbcsv_env.sh mbcsv/suites/reg_mbcsv.cfg mbcsv/suites/reg_mbcsv.dep mds/src/tet_mds_util.c mds/suites/mds_env.sh mds/suites/reg_mds.cfg mds/suites/reg_mds.dep mqsv/suites/mqsv_env.sh mqsv/suites/reg_mqsv.cfg mqsv/suites/reg_mqsv.dep srmsv/src/tet_srmsv_util.c srmsv/suites/srmsv_env.sh srmsv/suites/reg_srmsv.cfg srmsv/suites/reg_srmsv.dep </p> <p> Changed files </p> <hr /> <p> common/src/tet_init.c cpsv/src/tet_cpa_test.c cpsv/src/tet_cpsv_util.c cpsv/suites/reg_cpsv.scen edsv/src/tet_eda.c edsv/suites/reg_edsv.scen glsv/src/tet_gla.c glsv/src/tet_glsv_util.c glsv/suites/reg_glsv.scen ifsv/src/tet_ifa.c ifsv/suites/reg_ifsv.scen ifsv/vip/vip_ifa.c mbcsv/src/mbcsv_purpose.c mbcsv/suites/reg_mbcsv.scen mds/src/tet_mdstipc_api.c mds/suites/reg_mds.scen mqsv/src/tet_mqa.c mqsv/src/tet_mqsv_util.c mqsv/suites/reg_mqsv.scen srmsv/src/tet_srma_test.c srmsv/suites/reg_srmsv.scen </p> <p> Deleted files </p> <hr /> <p> cpsv/suites/reg_cpsv.sh cpsv/suites/run cpsv/suites/tetexec.cfg edsv/suites/reg_edsv.sh edpsv/suites/run edsv/suites/tetexec.cfg glsv/suites/reg_glsv.sh glsv/suites/run glsv/suites/tetexec.cfg ifsv/suites/reg_ifsv.sh ifsv/suites/eg_ifsv_driver.sh ifsv/suites/reg_vip.sh ifsv/suites/run ifsv/suites/tetexec.cfg logsv maa_switch mbcsv/suites/reg_mbcsv.sh mbcsv/suites/run mbcsv/suites/tetexec.cfg mds/suites/reg_mds.sh mds/suites/run mds/suites/tetexec.cfg mqsv/suites/reg_mqsv.sh mqsv/suites/run mqsv/suites/tetexec.cfg srmsv/suites/reg_srmsv.sh srmsv/suites/run
srmsv/suites/tetexec.cfg alignEtcHosts.pl build_all.sh install_tccd.sh lib_path.sh make_env.csh make_tetinc run_test.sh makefile saf_tests_build_and_run setup.sh test_utils.sh tetware_patch </p> <p> Improvement </p> <hr /> <p> 1. Combine two files, run_test.sh and lib_path.sh, into a single shell script run_opensaf_test.sh. </p> <p> 2. Delete two functions build_reg_scen_file and build_ifsv_reg_scen_file, which create scenario files for TETware. This provides developers with the flexibility of modifying the test scenarios according to requirements. </p> <p> 3. Check the existence of the suites directory instead of specifying the test name in the script to ensure the availability of the test name. In this case, developers do not need to modify the script when adding a new test. </p> <p> 4. Provide some template files when adding a new test. </p> <p> 5. Support dependent tests. </p> <p> 6. Support concurrent execution of multiple tests. </p> <p> 7. Create test results in html format. </p> <p> 8. Collect log information to log file. </p> <p> 9. Run tests and generate results using only one command. </p> <p> 10. Delete "tetware_patch". </p> Results http://devel.opensaf.org/ticket/425#changelog http://devel.opensaf.org/ticket/445 http://devel.opensaf.org/ticket/445 #445: PSSv related Thu, 12 Mar 2009 10:06:56 GMT gaurav.nangla@… <p> The admswitch cli command is unstable with the component PSS always failing on the standby server with the following error code: safComp=CompT_PSS,safSu=SuT_NCS_CNTLR,safNode=SC_2_2 failed due to : errorReport(1). We are not using any replication mechanism other than the default mechanism. 
</p> Results http://devel.opensaf.org/ticket/445#changelog http://devel.opensaf.org/ticket/446 http://devel.opensaf.org/ticket/446 #446: AvSv related Thu, 12 Mar 2009 10:08:32 GMT gaurav.nangla@… <p> We have observed that sometimes when a process restarts, the first CsiSetCallback() it receives is wrong: it has ha_state = SA_AMF_HA_ACTIVE and csi_desc.csiStateDescriptor.activeDescriptor.transitionDescriptor = SA_AMF_CSI_NOT_QUIESCED/SA_AMF_CSI_QUIESCED instead of SA_AMF_CSI_NEW_ASSIGN. </p> Results http://devel.opensaf.org/ticket/446#changelog http://devel.opensaf.org/ticket/489 http://devel.opensaf.org/ticket/489 #489: OpenSAF Message Queue Open fails with no resource error Wed, 08 Apr 2009 10:03:49 GMT manish srivastava <manish.srivastava@…> <p> I am using the OpenSAF message queue service. Many times I have seen an error while opening/creating a message queue. The error is SA_AIS_ERR_NO_RESOURCES. This happens when no free queue is available in the OS. </p> <p> A workaround is to increase the number of OS queues to a larger number in the /etc/sysctl.conf file. But this just delays the problem. </p> <p> The basic issue is that the OpenSAF controller is not able to free the OS queues even when the queues are successfully closed. We do frequent failovers where the message queue close API is not called. In such cases, the cleanup should be done by the OpenSAF framework. My queues are non-persistent and have a retention timer of zero. There is no log except that the queue open fails with error 18 (SA_AIS_ERR_NO_RESOURCES). </p> Results http://devel.opensaf.org/ticket/489#changelog http://devel.opensaf.org/ticket/502 http://devel.opensaf.org/ticket/502 #502: Implement saImmOmAdminOperationContinue APIs.
Thu, 16 Apr 2009 09:55:38 GMT anders <p> The following IMM-OM API calls were added in the A.02.01 version of the standard: </p> <blockquote> <p> saImmOmAdminOperationContinue() saImmOmAdminOperationContinueAsync() saImmOmAdminOperationContinueClear() </p> </blockquote> <p> These calls should be implemented to get a complete implementation of the A.02.01 version of the IMM standard in OpenSAF. </p> Results http://devel.opensaf.org/ticket/502#changelog http://devel.opensaf.org/ticket/531 http://devel.opensaf.org/ticket/531 #531: Error with saLckResourceLock timeout expiration Tue, 28 Apr 2009 12:04:19 GMT j.krmelj@… <p> When the saLckResourceLock timeout runs out and a component requests the lock again (by calling saLckResourceLock), the lock is granted with lockId=0. The resource cannot be released afterwards. This error is followed by some MDS error. </p> <p> Description of test environment: Four components located on separate nodes. Every component initializes and registers with the Lock service, opens a resource and tries to get a lock on it, waiting a maximum of 15 seconds. When the lock is granted, the task sleeps 5 seconds and then releases the lock. </p> <p> When up to three tasks request a lock on the resource simultaneously, everything works fine. When a fourth task joins the cluster and starts to request a lock, MDS errors appear and the lock cannot be granted anymore. </p> <p> The testing environment consists of four nodes: node1, 2, 3 and 4. The test application runs on every node. Nodes node1 and node2 are system controllers, node1 is the active controller, node3 and node4 are payloads. </p> <p> In the app_gla.log files you can find the stdout of my application. </p> <p> The scenario is as follows: 1. 21:58:07: node1 joins the cluster, lock agent is initialized, resource is open and while loop is entered 2. 21:58:56: node2 joins the cluster, lock agent is initialized, resource is open and while loop is entered 3.
22:01:55: node3 joins the cluster, lock agent is initialized, resource is open and while loop is entered 4. 22:04:31: node4 joins the cluster, lock agent is initialized, resource is open and while loop is entered 5. 22:04:41: node4 gets a lock 6. 22:04:46: node4 releases a lock and requests another one 7. 22:04:47: node2 gets a lock 8. 22:04:51: node2 releases a lock and requests another one 9. 22:04:51: node3 gets a lock 10. 22:04:56: node3 releases a lock and requests another one 11. MDS errors appear 12. 22:04:56: node1 gets an error, timeout requesting lock 13. 22:04:59: node1 requests the lock again, request is granted but lock ID=0! 14. 22:05:04: node1 tries to unlock a resource, but fails due to lock ID=0 15. No other node is granted a lock, timeouts appear everywhere. </p> Results http://devel.opensaf.org/ticket/531#changelog http://devel.opensaf.org/ticket/538 http://devel.opensaf.org/ticket/538 #538: Add full support for Reader API Thu, 30 Apr 2009 17:15:57 GMT arne <p> The Reader API should recreate notifications from SAF log service records. Filter and search methods are missing. </p> <p> Only Alarm notifications can be read from a cache. </p> Results http://devel.opensaf.org/ticket/538#changelog http://devel.opensaf.org/ticket/543 http://devel.opensaf.org/ticket/543 #543: IMMSv: Persistent repository Tue, 05 May 2009 08:19:17 GMT anders <p> The IMM standard specifies that configuration attributes and persistent runtime attributes shall be *persistent* relative to cluster restarts. </p> <p> Such persistence shall hold for every committed CCB. This implies that the saImmOmCcbApply call will block and return SA_AIS_OK only when/if persistence for that CCB is secured by the IMMSv. </p> <p> The persistence guarantee also applies implicitly to every update of a persistent runtime attribute. This implies that invocations of saImmOiRtObjectCreate/Delete/Update will block and return SA_AIS_OK only when/if persistence is secured for that operation by the IMMSv.
</p> <p> The IMMSv as provided in OpenSAF3.0 does not comply with the persistence requirement. Persistence is only supported in the weaker sense of allowing dumps to the imm.xml format, which may then be used to replace the imm.xml file used at cluster restart. The OpenSAF IMM implementation only provides the immdump binary, which dumps the persistent part of the current IMM contents. </p> <p> Such dumps must be generated by the user of OpenSAF when that user wants to secure persistence in the face of cluster restarts. Care must also be taken to wrap the use of immdump in such a way that the dump atomically replaces the file used for loading. Specifically, the user must avoid direct overwrite of the current imm.xml with the new dump, since a failed dump would result in *no* valid imm.xml file being available at cluster restart. </p> <p> It has not yet been decided how far the OpenSAF implementation of IMMSv shall go to provide support for CCB level persistence. OpenSAF is used by several users with differing requirements on persistence and differing preferences for what technology to use for the persistence back-end. </p> <p> Any solution provided as part of OpenSAF should be light-weight (little or no configuration) and low on resource demands. The solution should probably also be open for "plug in" towards different persistence back-ends as decided by the OpenSAF user. </p> Results http://devel.opensaf.org/ticket/543#changelog http://devel.opensaf.org/ticket/545 http://devel.opensaf.org/ticket/545 #545: Cross Compilation for WindRiver Pne-2.0 Fails Wed, 06 May 2009 07:19:45 GMT Suryanarayana Garlapati <suryanarayana.garlapati@…> <p> Hi, </p> <p> I am cross compiling OpenSAF with the Windriver PNE-2.0. With the following changeset, it fails.
</p> <hr /> <p> changeset: 515:b10400b32cf8 user: Mathi <Mathivanan.NP@…> date: Thu Apr 30 19:15:01 2009 +0530 summary: Fix for <a class="closed ticket" href="http://devel.opensaf.org/ticket/303" title="defect: ncs_eds segfaults on SuSE SLES 10 SP2 when compiled at O1 or above (closed: fixed)">#303</a>: ncs_eds segfaults on SuSE SLES 10 SP2 when compiled at O1 or above </p> <p> The following error is observed after running configure. </p> <p> checking for main in -lxerces-c... no configure: error: Can't find the xerces shared libraries. </p> <p> But the xerces libraries are present. </p> <hr /> <p> With the following changeset the cross compilation succeeds. </p> <p> changeset: 468:eec5b563a77b user: Steve Constant <steve.constant@…> date: Fri Apr 10 11:23:04 2009 -0600 summary: Added tag 3.0.FC for changeset 9be981691099 </p> <p> The following ./configure options were used. </p> <p> ./configure cc_exec_prefix="/opt/windriver/pne-2.0/7211/bin/i586-wrs-linux-gnu-x86_32-glibc_cgl" cc_lib_dir="/opt/windriver/pne-2.0/7211/sysroot/lib/" --enable-hpi --with-openhpi --with-hpi-interface=B02 --host=i586-wrs-linux-gnu CFLAGS="-g -I/home/surya/" CXXFLAGS="-g -I/home/surya" LDFLAGS="-L/opt/windriver/pne-2.0/7211/sysroot/usr/lib/" </p> <p> One more point to note: with the latest changeset, the following is observed in config.log. configure:3044: checking whether we are cross compiling configure:3046: result: no </p> <p> The older changeset, with which the build succeeds, has the following log. configure:2897: checking whether we are cross compiling configure:2899: result: yes </p> <p> The config.log files for both cases are attached.
</p> <p> but previous </p> Results http://devel.opensaf.org/ticket/545#changelog http://devel.opensaf.org/ticket/548 http://devel.opensaf.org/ticket/548 #548: cpnd delays opensaf start with 5 seconds, causes csiSetcallbackTimeout Thu, 07 May 2009 09:56:54 GMT hafe <p> cpnd broadcasts a message and will get stuck in MDS for 5s due to: </p> <p> "When the subscription timer is running, any broadcast message-send is blocked till the timer expires because MDS is in the process of discovering all relevant MDS Client instances at that time." </p> <p> This causes csiSetcallbackTimeout on both the cpnd and vds components if a 5s csiSetcallbackTimeout is used (as it used to be earlier). </p> <p> Please refer to the discussion starting at <a class="ext-link" href="http://list.opensaf.org/archives/devel/2009-April/003270.html"><span class="icon">http://list.opensaf.org/archives/devel/2009-April/003270.html</span></a> </p> Results http://devel.opensaf.org/ticket/548#changelog http://devel.opensaf.org/ticket/574 http://devel.opensaf.org/ticket/574 #574: Missing error Checking in mds_dt_tipc.c Mon, 25 May 2009 12:27:03 GMT gagandeep.bajaj@… <p> Error handling is missing for calls to ncs_enc_init_space_pp() and ncs_encode_n_octets_in_uba() in mds_dt_tipc.c. </p> <p> This can cause a segmentation fault. </p> Results http://devel.opensaf.org/ticket/574#changelog http://devel.opensaf.org/ticket/579 http://devel.opensaf.org/ticket/579 #579: AVD lacks error handling for MDS send failure Thu, 28 May 2009 10:14:04 GMT hafe <p> <a class="ext-link" href="http://list.opensaf.org/archives/devel/2009-May/004155.html"><span class="icon">http://list.opensaf.org/archives/devel/2009-May/004155.html</span></a> </p> <p> Why do the sequence numbers in the AVD-AvND protocol exist? </p> <p> What is the problem that they are supposed to solve?
</p> Results http://devel.opensaf.org/ticket/579#changelog http://devel.opensaf.org/ticket/601 http://devel.opensaf.org/ticket/601 #601: FM/AVSv: SCAP failure can cause duplicate active for 2N redundant model Wed, 17 Jun 2009 11:48:22 GMT anders <p> See discussion thread in: </p> <blockquote> <p> <a class="ext-link" href="http://list.opensaf.org/archives/devel/2009-May/004096.html"><span class="icon">http://list.opensaf.org/archives/devel/2009-May/004096.html</span></a> </p> </blockquote> <p> If the SCAP process crashes then this should lead to IMMEDIATE node restart (or at least restart of middleware and SAF application at that node). </p> <p> The current solution allows applications to continue executing (for 10 seconds), then the standby is promoted to active in parallel with an order from the FM at the standby to the FM at the "active in demise" to restart. </p> <p> This solution is unreliable (we don't know if the FM at the old active will comply), dangerous (since we allow a node with extremely serious AVSv problems to continue executing) and defective (since it has a tendency to cause duplicate execution of the 2N redundancy model). </p> <p> The only reason I don't class the ticket as critical is that the problem should be rare in a real system. We have only seen the problem when testing by manually killing SCAP. </p> <p> I have provided a simple illustrative patch that shows approximately what should be done. In essence, when AVA detects loss of contact with (the local) AVND, it should terminate its hosting process. </p> <p> In addition, one of the processes/AVAs should order the node restart AND send a message to the peer FM that it is going down, which will cut short the 10 second waiting time for failover.
</p> Results http://devel.opensaf.org/ticket/601#changelog http://devel.opensaf.org/ticket/608 http://devel.opensaf.org/ticket/608 #608: ncs_scap seg faults in fma Wed, 24 Jun 2009 07:45:55 GMT hafe <p> (gdb) bt full #0 0x00002b40564b94f1 in buffered_vfprintf () from /lib64/libc.so.6 No symbol table info available. #1 0x00002b40564b520c in vfprintf () from /lib64/libc.so.6 No symbol table info available. #2 0x00002b40564d0853 in __fxprintf () from /lib64/libc.so.6 No symbol table info available. #3 0x00002b405649d2c8 in __assert_fail () from /lib64/libc.so.6 No symbol table info available.
#4 0x00002b4056063cdb in fma_fm_node_reset_ind (cb=0x6ae1d0, phy_addr={slot = 7 '\a', site = 15 '\017'}) </p> <blockquote> <p> at src/fma_init.c:102 </p> <blockquote> <p> node = (NCS_PATRICIA_NODE *) 0x2aaaaab1d660 hdl_rec = (FMA_HDL_REC *) 0x2aaaaab1d660 pend_cbk_rec = (FMA_PEND_CBK_REC *) 0x2aaaaab62b40 __PRETTY_FUNCTION__ = "fma_fm_node_reset_ind" </p> </blockquote> </blockquote> <p> #5 0x00002b40560641e9 in fma_mbx_msg_handler (cb=0x6ae1d0, fma_mbx_evt=0x2aaaaad99ab0) at src/fma_init.c:257 </p> <blockquote> <p> rc = 1 msg = (FMA_FM_MSG *) 0x2aaaaab44af0 </p> </blockquote> <p> #6 0x00002b4056064535 in fma_main_proc (fma_init_hdl=0x2b4056167938) at src/fma_init.c:349 </p> <blockquote> <p> cb = (FMA_CB *) 0x6ae1d0 mbx_sel_obj = {raise_obj = 35, rmv_obj = 36} temp_sel_obj_set = {__fds_bits = {68719476736, 0 <repeats 15 times>}} fma_mbx_evt = (FMA_MBX_EVT_T *) 0x2aaaaad99ab0 msg = 0 </p> </blockquote> <p> #7 0x00002b40566ba143 in start_thread () from /lib64/libpthread.so.0 No symbol table info available. #8 0x00002b4056534bed in clone () from /lib64/libc.so.6 No symbol table info available.
#9 0x0000000000000000 in ?? () No symbol table info available. </p>
<p> (gdb) info locals node = (NCS_PATRICIA_NODE *) 0x2aaaaab1d660 hdl_rec = (FMA_HDL_REC *) 0x2aaaaab1d660 pend_cbk_rec = (FMA_PEND_CBK_REC *) 0x2aaaaab62b40 __PRETTY_FUNCTION__ = "fma_fm_node_reset_ind"
(gdb) p hdl_red No symbol "hdl_red" in current context.
(gdb) p hdl_rec $1 = (FMA_HDL_REC *) 0x2aaaaab1d660
(gdb) p *hdl_rec $2 = {hdl_node = {bit = 7, left = 0x6ae2c8, right = 0x2aaaaab1d660, key_info = 0x2aaaaab1d680 "\001"}, hdl = 4237295617, sel_obj = {raise_obj = 37, rmv_obj = 38}, reg_cbk = { </p> <blockquote> <p> fmNodeResetIndCallback = 0x50830a <avm_rcv_fma_node_reset_cb>, fmSysManSwitchReqCallback = 0x508501 <avm_rcv_fma_switchover_req_cb>}, pend_cbk = {num = 1, head = 0x0, tail = 0x0}, </p> </blockquote> <blockquote> <p> pend_resp = {num = 0, head = 0x0, tail = 0x0}} </p> </blockquote>
<p> The assert tests: </p> <blockquote> <p> if (!((list)->head)) { \ </p> <blockquote> <p> m_FMA_ASSERT(!((list)->num)); \ (list)->head = (rec); \ </p> </blockquote> <p> } \ </p> </blockquote>
<p> That is, if head is NULL then num should be 0, but here pend_cbk has num = 1 while head is 0x0. </p> Results http://devel.opensaf.org/ticket/608#changelog http://devel.opensaf.org/ticket/610 http://devel.opensaf.org/ticket/610 #610: Runtime Object deletion failure for non-existing stream leads to node reboot Wed, 24 Jun 2009 13:53:24 GMT Sangeeta.meena@… <p> immutil_saImmOiRtObjectDelete returns SA_AIS_ERR_NOT_EXIST in case the runtime object doesn't exist in IMMND (e.g. after an IMMND restart). Since immutilWrapperProfile.errorsAreFatal is not reset to 0, IMMSV aborts and the system reboots.
</p> <p> if (lgs_cb->ha_state == SA_AMF_HA_ACTIVE) </p> <blockquote> <p> { </p> <blockquote> <p> TRACE("Stream is closed, I am HA active so remove IMM object"); SaNameT objectName; strcpy((char *) objectName.value, stream->name); objectName.length = strlen((char *) objectName.value); (void) immutil_saImmOiRtObjectDelete(lgs_cb->immOiHandle, &objectName); </p> <blockquote> <p> } </p> </blockquote> </blockquote> </blockquote> <blockquote> <p> log_stream_delete(s); </p> </blockquote> Results http://devel.opensaf.org/ticket/610#changelog http://devel.opensaf.org/ticket/612 http://devel.opensaf.org/ticket/612 #612: IMM shell commands - add missing features Tue, 30 Jun 2009 07:17:03 GMT hafe <p> Missing features: 1. Delete configuration object (done) 2. Add configuration objects from XML file (not done) 3. Search using e.g. class name and/or other ways (partly done) 4. Delete class (not done) </p> <p> As described in the "OSAF CLI" document (by Frederic Herrmann and Carol Wilhelmy, see devel list May 2008), an open topic is the XML schema for bullet 2 above. By restricting to only add operations, we can reuse the existing IMM schema. </p> Results http://devel.opensaf.org/ticket/612#changelog http://devel.opensaf.org/ticket/613 http://devel.opensaf.org/ticket/613 #613: Cold sync & Async update Related to Reader's Api is missing Wed, 01 Jul 2009 08:51:56 GMT Manoj Lalavat <manoj.lalavat@…> <p> At present notification service does not check point standby about reader's API. Both cold sync & Async update is missing for Reader API's. </p> Results http://devel.opensaf.org/ticket/613#changelog http://devel.opensaf.org/ticket/617 http://devel.opensaf.org/ticket/617 #617: AMF IMM integration, Drop 2 Mon, 06 Jul 2009 10:37:05 GMT marioa <p> <strong>General:</strong> Focus on support for Dynamic AMF Model changes, implementation better matching new AMF system model and features required by SMF contribution. 
</p> <p> Depends on ticket: <a class="closed ticket" href="http://devel.opensaf.org/ticket/161" title="enhancement: AMF IMM integration, Drop 1 (closed: fixed)">#161</a> </p> <p> <strong>Remaining Work</strong>: </p> <ul><li>function prototypes for 3 cases of adding an object to a database (initially, CCB create, and as a consequence of cold sync). </li><li>CSI assignments created/deleted in IMM </li><li>All cached runtime attributes updated in IMM appropriately (IMM runtime attributes updated properly at initial start) </li><li>Dynamic changes to AMF Model </li><li>Synchronous AMF Node admin operations (update AvD - AvND protocol) </li><li>Convert all logging and tracing to use Logtrace API </li><li>CSI dependency (instead of rank) </li><li>Align impl of SI Dependency and SI Ranked SU with other classes </li><li>Modular imm.xml (per service) </li><li>Any impact to MW AMF model caused by service modularity (service packaged per rpm) </li><li>avsv_demo packaged in rpm, dynamically configuring its Model from a script.
</li><li>additional usability in form of AMF CLI/shell commands </li><li>remove AvM coupling </li></ul> Results http://devel.opensaf.org/ticket/617#changelog http://devel.opensaf.org/ticket/618 http://devel.opensaf.org/ticket/618 #618: DTSv OI: Integrate DTSv service with IMM Mon, 06 Jul 2009 10:50:16 GMT marioa <p> DTSv service is changed to use IMM service for configuration management of the service itself (instead of MASv) Work includes: </p> <ul><li>removing DTSv dependency to MASv (code removal) </li><li>creating DTSv class model in imm.xml schema </li><li>Adapting DTSv to use IMM interface (OI & OM) for reading configuration data and providing runtime information (same classes as DTSv supports today via MASv) </li></ul> Results http://devel.opensaf.org/ticket/618#changelog http://devel.opensaf.org/ticket/622 http://devel.opensaf.org/ticket/622 #622: MQSv OI: Integrate MQSv service with IMM Tue, 07 Jul 2009 07:46:52 GMT marioa <p> MQSv service is changed to use IMM service instead of MASv. Work includes: </p> <ul><li>removing MQSv dependency to MASv (code removal) </li><li>Adapting MQSv to use IMM interface (OI) and provide runtime information for same classes as MQSv supports via MASv (three runtime classes: SaMsgQueue, SaMsgQueuePriority and SaMsgQueueGroup) </li></ul> Results http://devel.opensaf.org/ticket/622#changelog http://devel.opensaf.org/ticket/625 http://devel.opensaf.org/ticket/625 #625: Build System: Removal of MASv & PSSv Tue, 07 Jul 2009 13:41:42 GMT marioa <p> This work covers removal of MASv and PSSv code from repository and corresponding update of build system (makefiles, etc.). </p> <p> Note: To be able to execute this work all services using MASv should be migrated to use IMMSv. 
This work is covered with other tickets (<a class="closed ticket" href="http://devel.opensaf.org/ticket/161" title="enhancement: AMF IMM integration, Drop 1 (closed: fixed)">#161</a>, <a class="assigned ticket" href="http://devel.opensaf.org/ticket/618" title="enhancement: DTSv OI: Integrate DTSv service with IMM (assigned)">#618</a>, <a class="closed ticket" href="http://devel.opensaf.org/ticket/619" title="enhancement: Build System: Elimination of IFSv service in OpenSAF (closed: fixed)">#619</a>, <a class="closed ticket" href="http://devel.opensaf.org/ticket/620" title="enhancement: EDSv OI: Integrate EDSv service with IMM (closed: fixed)">#620</a>, <a class="closed ticket" href="http://devel.opensaf.org/ticket/621" title="enhancement: CPSv OI: Integrate CPSv service with IMM (closed: fixed)">#621</a>, <a class="accepted ticket" href="http://devel.opensaf.org/ticket/622" title="enhancement: MQSv OI: Integrate MQSv service with IMM (accepted)">#622</a>, <a class="closed ticket" href="http://devel.opensaf.org/ticket/623" title="enhancement: GLSv OI: Integrate GLSv service with IMM (closed: fixed)">#623</a>). So start of work covered by this ticket is depending on finalizing all referenced tickets. </p> Results http://devel.opensaf.org/ticket/625#changelog http://devel.opensaf.org/ticket/626 http://devel.opensaf.org/ticket/626 #626: SNMP subagent using NTF consumer interface for receiving notifications Tue, 07 Jul 2009 14:01:02 GMT marioa <p> SNMP subagent will be changed to use NTF consumer interface for receiving notifications (instead of EDSv). This work is depending on Ticket <a class="closed ticket" href="http://devel.opensaf.org/ticket/624" title="enhancement: AMF (AvSv) using NTFSv for sending Notifications (instead of EDSv) (closed: fixed)">#624</a>. 
</p> Results http://devel.opensaf.org/ticket/626#changelog http://devel.opensaf.org/ticket/627 http://devel.opensaf.org/ticket/627 #627: Build System: SNMP subagent as optional product Tue, 07 Jul 2009 14:14:43 GMT marioa <p> This work includes: </p> <ul><li>moving the SNMP subagent outside of the OpenSAF base repository/product (with the necessary changes to the build system) </li></ul><p> </p> Results http://devel.opensaf.org/ticket/627#changelog http://devel.opensaf.org/ticket/628 http://devel.opensaf.org/ticket/628 #628: Build System: Separate RPMs for individual services Wed, 08 Jul 2009 13:38:44 GMT marioa <p> This work is an enabler for aligning OpenSAF with the modularity capability specified in the wanted architecture. Each service will be packaged in its own RPM. The exceptions are the base infrastructure services (like MBCsv, DTSv, LEAP, MDS, RDE), which will be contained in one "base infrastructure RPM". </p> Results http://devel.opensaf.org/ticket/628#changelog http://devel.opensaf.org/ticket/630 http://devel.opensaf.org/ticket/630 #630: AvSv: Separation of AMF and CLM service Wed, 08 Jul 2009 13:50:19 GMT marioa <p> Today the AvSv service implements both AMF and CLM functionality in the same service. This ticket covers splitting AMF and CLM into their own services. </p> Results http://devel.opensaf.org/ticket/630#changelog http://devel.opensaf.org/ticket/631 http://devel.opensaf.org/ticket/631 #631: Consolidated logging (enhancement to Logtrace API) Wed, 08 Jul 2009 16:26:56 GMT marioa <p> See attached slides. Note that this ticket covers the impact on the Logtrace API and its functionality. Adapting each OpenSAF service to actually use this API will be covered by a separate ticket per service.
</p> Results http://devel.opensaf.org/ticket/631#changelog http://devel.opensaf.org/ticket/637 http://devel.opensaf.org/ticket/637 #637: Setting SaNameT length to SA_MAX_NAME_LENGTH or larger leads to crash Thu, 16 Jul 2009 10:26:59 GMT arne <p> If the length of the SaNameT fields notifyingObject and notificationObject in SaLogNtfLogHeaderT is set to SA_MAX_NAME_LENGTH or a larger value, opensaf_saflogd will crash. </p> <p> This trace should be changed to a warning or error so it is visible in syslog. </p> <p> from lgs_mds.c: </p> <blockquote> <p> if (SA_MAX_NAME_LENGTH <= ntfLogH->notificationObject->length) </p> <blockquote> <p> { </p> <blockquote> <p> TRACE("notificationObject to big"); return(0); </p> </blockquote> <p> } </p> </blockquote> </blockquote> Results http://devel.opensaf.org/ticket/637#changelog http://devel.opensaf.org/ticket/641 http://devel.opensaf.org/ticket/641 #641: Separation of AMF and CLM in OpenSAF Sat, 25 Jul 2009 05:17:08 GMT murthy <p> Currently AMF and CLM have a combined implementation in OpenSAF. This prevents implementing new concepts such as CLM cluster and AMF cluster. </p> <p> These independent cluster views are relevant for SMF and other subsequent SAF specs. </p> <p> OpenSAF should partition these spec implementations. It should also upgrade to the newer version of the CLM specification.
</p> Results http://devel.opensaf.org/ticket/641#changelog http://devel.opensaf.org/ticket/644 http://devel.opensaf.org/ticket/644 #644: ncs_rde: Mds fails to initialize tipc Mon, 10 Aug 2009 10:18:16 GMT anders <p> SC_2_1# gdb /usr/lib64/opensaf/ncs_rde ncs_rde.10352.SC_2_1.core
(gdb) bt
#0 0x00002b0c08d3ada5 in raise () from /lib64/libc.so.6
#1 0x00002b0c08d3c1a0 in abort () from /lib64/libc.so.6
#2 0x00002b0c08d342e6 in __assert_fail () from /lib64/libc.so.6
#3 0x00002b0c08f98e74 in mdtm_tipc_init (nodeid=131343, mds_tipc_ref=0x7fffa1fc60fc) at src/mds_dt_tipc.c:349
#4 0x00002b0c08fb039f in mds_mdtm_init (node_id=131343, mds_tipc_ref=0x7fffa1fc60fc) at src/mds_dt.c:42
#5 0x00002b0c08fa3710 in mds_lib_req (req=0x7fffa1fc6150) at src/mds_main.c:270
#6 0x00002b0c08f94add in ncs_mds_startup (argc=0, argv=0x7fffa1fc6300) at src/ncs_main_pub.c:447
#7 0x00002b0c08f9504e in ncs_core_agents_startup (argc=0, argv=0x7fffa1fc6300) at src/ncs_main_pub.c:632
#8 0x00002b0c08f9467e in ncs_agents_startup (argc=0, argv=0x7fffa1fc6300) at src/ncs_main_pub.c:300
#9 0x000000000040267f in rde_agents_startup () at rde_amf.c:461
#10 0x0000000000404457 in main (argc=1, argv=0x7fffa1fc6748) at rde_main.c:130
(gdb) </p> <p> That is probably: </p> <blockquote> <p> #if MDS_TIPC_1_5 </p> <blockquote> <p> tipc_node_id=mdtm_tipc_own_node(tipc_cb.BSRsock); /* This gets </p> <blockquote> <p> the tipc ownaddress*/ </p> </blockquote> </blockquote> </blockquote> <p> </p> <blockquote> <blockquote> <p> /* Connect to the Topology Server */ memset(&topsrv, 0, sizeof(topsrv)); topsrv.family = AF_TIPC; topsrv.addrtype = TIPC_ADDR_NAME; topsrv.addr.name.name.type = TIPC_TOP_SRV; topsrv.addr.name.name.instance = TIPC_TOP_SRV; </p> </blockquote> </blockquote> <blockquote> <blockquote> <p> if (0 > connect(tipc_cb.Dsock,(struct </p> <blockquote> <p> sockaddr*)&topsrv,sizeof(topsrv))) </p> </blockquote> <p> { </p> <blockquote> <p> syslog(LOG_ERR,"MDS:MDTM: Failed to connect to topology server"); assert(0); exit(1); </p> </blockquote> </blockquote> <p> } </p> </blockquote> Results http://devel.opensaf.org/ticket/644#changelog http://devel.opensaf.org/ticket/648 http://devel.opensaf.org/ticket/648 #648: segfault due to wrong string handling in CPSV/DTSV at logging Tue, 18 Aug 2009 11:09:08 GMT bertil.engelholm@… <p> When CPND logs a message such as "Ckpt Sect Get Failed" (e.g. in ckpt_evt_proc_ckpt_sect_exp_set), it tries to include the sect_id.id as a string.
The problem is that this string does not seem to be null-terminated, so garbage data is added to the log string. When this string is later sent to the device (using fprintf in dts_pvt.c:dtsv_log_msg), the garbage data might accidentally be interpreted as an fprintf format specifier (e.g. %s), which makes fprintf expect additional arguments. dtsv_log_msg does not pass any such arguments, so fprintf seems to read garbage data, causing a segfault. </p> <p> So the formatting of sect_id needs to be changed so that garbage data is not added to the log string, and the fprintf in dtsv_log_msg should be changed to e.g. fwrite, since the only thing that should be done here is to send a string to the device. There shouldn't be any format characters in the string at this point, so using fprintf is unnecessary (it only costs extra execution time to look for format specifiers that are not there). </p> Results http://devel.opensaf.org/ticket/648#changelog http://devel.opensaf.org/ticket/652 http://devel.opensaf.org/ticket/652 #652: Remove cruft from LEAP Mon, 24 Aug 2009 14:12:43 GMT hafe <p> See: </p> <p> <a class="ext-link" href="http://list.opensaf.org/archives/devel/2009-August/005000.html"><span class="icon">http://list.opensaf.org/archives/devel/2009-August/005000.html</span></a> </p> Results http://devel.opensaf.org/ticket/652#changelog http://devel.opensaf.org/ticket/658 http://devel.opensaf.org/ticket/658 #658: Remove ncs_main_pvt.c and add main() to each service Fri, 28 Aug 2009 06:23:45 GMT jfournier <p> Issues are starting to arise during the OI integration work. </p> <p> <a class="ext-link" href="http://list.opensaf.org/archives/devel/2009-August/005040.html"><span class="icon">http://list.opensaf.org/archives/devel/2009-August/005040.html</span></a> </p> <p> Now is a good time to remove ncs_main_pvt.c and move main() into each service. </p> <p> Proper 'auto-daemonization' and getopt argv parsing will be introduced at the same time.
</p> Results http://devel.opensaf.org/ticket/658#changelog http://devel.opensaf.org/ticket/660 http://devel.opensaf.org/ticket/660 #660: /bin/sh scripts use source command not available in /bin/sh Tue, 08 Sep 2009 09:07:20 GMT hafe <p> On systems where /bin/sh is not a link to /bin/bash (e.g. Ubuntu), the scripts will fail. </p> Results http://devel.opensaf.org/ticket/660#changelog http://devel.opensaf.org/ticket/664 http://devel.opensaf.org/ticket/664 #664: ncs_scap SEGV when ran under QEMU Mon, 28 Sep 2009 00:49:37 GMT jfournier <p> Core was generated by `/usr/lib/opensaf/ncs_scap ROLE=1 NID_SVC_NAME=SCAP'. Program terminated with signal 11, Segmentation fault. [New process 5869] [New process 5860] [New process 5867] [New process 5859] [New process 5892] [New process 5858] [New process 5873] [New process 5862] [New process 5870] [New process 5864] [New process 5861] [New process 5872] [New process 5865] [New process 5868]
#0 0x41e238f6 in malloc_atfork () from /lib/libc.so.6 </p> <pre class="wiki">(gdb) bt
#0 0x41e238f6 in malloc_atfork () from /lib/libc.so.6
#1 0x41e22ff5 in malloc () from /lib/libc.so.6
...
#2-3160 0x41e22ff5 in malloc () from /lib/libc.so.6
...
#3161 0x41e22ff5 in malloc () from /lib/libc.so.6
#3162 0x41f985b3 in ncs_mem_alloc (nbytes=4792, mem_region=0x0, service_id=NCS_SERVICE_ID_AVM, sub_id=9, line=71, file=0x81b36cc "src/avm_db.c") at src/sysfpool.c:692
#3163 0x0816e7cf in avm_add_ent_info (avm_cb=0x81e01ec, entity_path=0xb7edb2d4) at src/avm_db.c:71
#3164 0x08171e68 in ncsavmentdeploytableentry_set (cb=0x81e01ec, arg=0x82ab6bc, var_info=0x81c6be0, test_flag=0) at src/avm_ent.c:223
#3165 0x41fb19d8 in miblib_process_mib_op_req (cb=0x81e01ec, args=0x82ab6bc) at src/hjmiblib.c:1419
#3166 0x41fb1dbb in ncsmiblib_process_req (req_info=0xb7edbdcc) at src/hjmiblib.c:1576
#3167 0x081786e9 in avm_proc_mib (mib_req=0x83c2eb4, avm_cb=0x81e01ec) at src/avm_fsm.c:621
#3168 0x0817783e in avm_msg_handler (avm_cb=0x81e01ec, evt=0x83c2eb4) at src/avm_fsm.c:251
#3169 0x0815dcab in avm_proc () at src/avm_proc.c:111
#3170 0x08159e60 in avm_init_proc (avm_init_hdl=0x81c6b60) at src/avm_init.c:403
#3171 0x41f010f0 in start_thread () from /lib/libpthread.so.0
#3172 0x41e7e8ce in clone () from /lib/libc.so.6
</pre> Results http://devel.opensaf.org/ticket/664#changelog http://devel.opensaf.org/ticket/665 http://devel.opensaf.org/ticket/665 #665: OpenSAF scripts should be validated for /bin/sh Wed, 30 Sep 2009 19:55:54 GMT jfournier <p> I think that in the past almost all scripts were switched back to /bin/bash because their current syntax was potentially incompatible with /bin/sh. </p> <p> If running on Busybox is desired, /bin/sh compatibility is needed for scripts and init.d sanity. </p> Results http://devel.opensaf.org/ticket/665#changelog http://devel.opensaf.org/ticket/677 http://devel.opensaf.org/ticket/677 #677: Clarify what are the OpenSAF data replication needs Thu, 15 Oct 2009 17:32:18 GMT jfournier <p> As of today, the OpenSAF documentation suggests using DRBD for replication, but it doesn't prevent OpenSAF users from using another replication mechanism.
</p> <p> A single directory used to be replicated (/repl_opensaf), but its content didn't follow the LSB standard (log and data paths). </p> <p> As of today, all logs and data are under /var/lib/opensaf. Per LSB requirements, logs should be placed under /var/log/opensaf. But having the files in multiple standard locations might break the current replication scheme for some users. </p> <p> The OpenSAF documentation should not discuss any specific replication mechanism, as this can be confusing for some users. It must clearly state: </p> <ul><li>What should be replicated </li><li>What the impact is if it is replicated or not replicated </li><li>What is optional for replication, and the impact </li><li>What /var/lib/opensaf and /var/log/opensaf contain (to help users manage their own replication mechanisms) </li></ul> Results http://devel.opensaf.org/ticket/677#changelog http://devel.opensaf.org/ticket/678 http://devel.opensaf.org/ticket/678 #678: Add support for discarded notification callback Fri, 16 Oct 2009 11:46:23 GMT arne <p> Include support for SaNtfNotificationDiscardedCallbackT in the API. </p> Results http://devel.opensaf.org/ticket/678#changelog http://devel.opensaf.org/ticket/680 http://devel.opensaf.org/ticket/680 #680: Implement SAI-AIS-NTF-A.02.01 Fri, 16 Oct 2009 13:58:15 GMT arne <p> New functions: saNtfVariableDataSizeGet SaNtfStaticSuppressionFilterSetCallbackT </p> <p> Changed functions: </p> <blockquote> <p> saNtfInitialize_2 * </p> <blockquote> <p> -due to suppression callbacks </p> </blockquote> <p> saNtfStateChangeNotificationFilter_2 saNtfStateChangeNotificationAllocateFilter_2 saNtfLocalizedMessageFree_2 -add ntfHandle saNtfNotificationUnsubscribe_2 -add ntfHandle saNtfNotificationReadInitialize_2 * saNtfCallbacksT_2 * </p> <blockquote> <p> -due to suppression callbacks </p> </blockquote> </blockquote> <p> * = changed in A.03.01 </p> <p> Admin API - IMM integration Notifications </p> Results http://devel.opensaf.org/ticket/680#changelog
http://devel.opensaf.org/ticket/681 http://devel.opensaf.org/ticket/681 #681: Implement SAI-AIS-NTF-A.03.01 Fri, 16 Oct 2009 14:08:27 GMT arne <p> Changes: Suppression extended to use EventTypeBitMap; New miscellaneous notification type; New way of handling correlation identifier; Management if MIB removed </p> <p> New functions: </p> <blockquote> <p> saNtfMiscellaneousNotificationAllocate saNtfMiscellaneousNotificationFilterAllocate saNtfIdentifierAllocate saNtfNotificationSendWithId </p> </blockquote> <p> Changed functions: </p> <blockquote> <p> saNtfInitialize_3 saNtfStaticSuppressionFilterSetCallbackT_3 saNtfNotificationSubscribe_3 saNtfNotificationReadInitialize_3 saNtfStateChangeNotificationAllocate_3 SaNtfNotificationCallbackT_3 saNtfNotificationReadNext_3 </p> </blockquote> Results http://devel.opensaf.org/ticket/681#changelog http://devel.opensaf.org/ticket/682 http://devel.opensaf.org/ticket/682 #682: Overload protection Fri, 16 Oct 2009 14:41:49 GMT arne <p> Implement overload protection. Prioritize alarms and security alarms during heavy load. In extreme load situations, reject/drop new notifications. </p> <p> For slow consumers, the implementation of <a class="accepted ticket" href="http://devel.opensaf.org/ticket/678" title="enhancement: Add support for discarded notification callback (accepted)">#678</a> is needed. </p> Results http://devel.opensaf.org/ticket/682#changelog http://devel.opensaf.org/ticket/686 http://devel.opensaf.org/ticket/686 #686: avsv traces/logs should be changed from "canned strings" to "printf strings" Mon, 26 Oct 2009 15:28:26 GMT hafe <p> This is needed to enable selection of the trace/log backend, e.g.
syslog or LOG </p> Results http://devel.opensaf.org/ticket/686#changelog http://devel.opensaf.org/ticket/689 http://devel.opensaf.org/ticket/689 #689: Segfault in AvND timer expire function Tue, 27 Oct 2009 13:06:49 GMT arne <p> This problem has occurred several times in our system. Bertil sent a mail to the devel list earlier: <a class="ext-link" href="http://list.opensaf.org/archives/devel/2009-August/004968.html"><span class="icon">http://list.opensaf.org/archives/devel/2009-August/004968.html</span></a> </p> <p> Core was generated by `/opt/opensaf/controller/bin/ncs_scap ROLE=1 NID_SVC_ID=21'. Program terminated with signal 11, Segmentation fault.
#0 0x00000000004c099f in avnd_tmr_exp (uarg=0xffffff) at ./avnd_tmr.c:168 168 ./avnd_tmr.c: No such file or directory. </p> <blockquote> <p> in ./avnd_tmr.c </p> </blockquote> <p> (gdb) bt
#0 0x00000000004c099f in avnd_tmr_exp (uarg=0xffffff) at ./avnd_tmr.c:168
#1 0x00002b37dafc32e7 in sysfTmrExpiry (tmp=0x2aaaaaccb420) at ./src/sysf_tmr.c:380
#2 0x00002b37dafc36e7 in ncs_tmr_engine (tv=0x4001e4e0, next_delay=0x4001e590) at ./src/sysf_tmr.c:520
#3 0x00002b37dafc3a97 in ncs_tmr_wait () at ./src/sysf_tmr.c:646
#4 0x00002b37db278143 in start_thread () from /lib64/libpthread.so.0
#5 0x00002b37dc201bed in shmat () from
/lib64/libc.so.6
#6 0x0000000000000000 in ?? () </p> Results http://devel.opensaf.org/ticket/689#changelog http://devel.opensaf.org/ticket/693 http://devel.opensaf.org/ticket/693 #693: MDS to provide NODE_UP/NODE_DOWN (using TIPC) Mon, 02 Nov 2009 09:54:24 GMT mathi <p> MDS shall be enhanced to provide NODE_UP/NODE_DOWN messages (i.e. similar to SVC_UP/SVC_DOWN) by making use of the TIPC interfaces. </p> <p> One user of this feature will be CLM. </p> Results http://devel.opensaf.org/ticket/693#changelog http://devel.opensaf.org/ticket/695 http://devel.opensaf.org/ticket/695 #695: Allow us to use more than 1024 file descriptors Wed, 04 Nov 2009 10:05:38 GMT hakan@… <p> When I invoke saImmOmInitialize the first time, a few threads are created. At least two of these threads are using <strong>select</strong> (ncs_sel_obj_select and ncs_tmr_wait). This is very unfortunate, as the use of select imposes a superfluous limit on the maximum number of open file descriptors in our processes. If you use <strong>poll</strong> in your agent libraries, you will enable us to use more than 1024 file descriptors. </p> <p> There may be other functions (besides those mentioned above) in the OpenSAF agent libraries that also use select. See the stack trace from gdb below for details.
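</p> <p> A minimal sketch of the poll-based wait suggested here, assuming a single descriptor. The helper name wait_readable is hypothetical and is not an OpenSAF function. Unlike select, poll takes an array of descriptors rather than an fd_set bitmap, so it is not bound by FD_SETSIZE (typically 1024). </p>

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only (not part of OpenSAF).
 * Waits until fd becomes readable.
 * Returns 1 if fd is readable (or at EOF / in error, so that a following
 * read() reports it), 0 on timeout, -1 on failure.
 * A negative timeout_ms blocks indefinitely, as with poll() itself. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry when interrupted by a signal. Note that this simple loop
     * restarts the full timeout on every retry. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;              /* poll() itself failed */
    if (rc == 0)
        return 0;               /* timed out, nothing to read */
    if (pfd.revents & POLLNVAL)
        return -1;              /* fd is not an open descriptor */
    return 1;
}
```

<p> With a pipe, wait_readable(read_end, 0) returns 0 while the pipe is empty and 1 once a byte has been written. The descriptor value itself can be larger than 1023, which is the case that select cannot handle.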
</p> <p> /Håkan -- Håkan Mattsson, Erlang/OTP, Ericsson AB </p> <p> (gdb) thr 2 [Switching to thread 2 (process 15689)]
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
#1 0x00002aaaab174a05 in ncs_sel_obj_select (highest_sel_obj={raise_obj = 15, rmv_obj = 16}, rfds=0x420684b0, wfds=0x0, efds=0x0, timeout_in_10ms=0x0) at src/os_defs.c:2818
#2 0x00002aaaab149d23 in ncs_ipc_recv_common (mbx=0x2aaaab2e2580, block=1) at src/sysf_ipc.c:447
#3 0x00002aaaab149bf5 in ncs_ipc_recv (mbx=0x2aaaab2e2580) at src/sysf_ipc.c:394
#4 0x00002aaaab1b736b in dta_do_evts (mbx=0x2aaaab2e2580) at dta_api.c:1260
#5 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#6 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#7 0x0000000000000000 in ??
()
(gdb) thr 3 [Switching to thread 3 (process 15688)]
#0 0x00002b0d895a08b6 in poll () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a08b6 in poll () from /lib64/libc.so.6
#1 0x00002aaaab17d39e in mdtm_process_recv_events () at src/mds_dt_tipc.c:640
#2 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#3 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()
(gdb) thr 4 [Switching to thread 4 (process 15687)]
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
#1 0x00002aaaab14e4d6 in ncs_tmr_wait () at src/sysf_tmr.c:541
#2 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#3 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ??
() </p> Results http://devel.opensaf.org/ticket/695#changelog http://devel.opensaf.org/ticket/696 http://devel.opensaf.org/ticket/696 #696: Agent threads are not joined safely Wed, 04 Nov 2009 10:25:41 GMT hakan@… <p> When saImmOmInitialize is invoked the first time, a few threads are created. When saImmOmFinalize is invoked for the last handle, these threads exit after a while. Unfortunately, at least one of the threads is still running <strong>after</strong> saImmOmFinalize has returned. This is very unfortunate, as there are no other synchronization primitives in the API (as far as I can see) that we can use to wait for the final OpenSAF thread to exit. We have encountered some nasty crashes when we unload the OpenSAF agent library code too soon after the final call to saImmOmFinalize. You should wait for <strong>all</strong> threads to join <strong>before</strong> saImmOmFinalize returns. </p> <p> There may be other places in the OpenSAF agent library code that suffer from the same bug. See the stack trace from gdb below for details about which threads are active before the final call to saImmOmFinalize.
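</p> <p> The requested behaviour could look roughly like the sketch below. It assumes a single worker thread, and struct agent, agent_initialize and agent_finalize are hypothetical names that stand in for the agent library internals; this is not OpenSAF code. The point is only that the finalize call asks the worker to stop and then blocks in pthread_join, so no library thread is left running once it returns. </p>

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical agent state, for illustration only. */
struct agent {
    pthread_t worker;
    pthread_mutex_t lock;
    pthread_cond_t wake;
    bool stop;      /* set by agent_finalize() to ask the worker to exit */
    bool exited;    /* set by the worker just before it returns */
};

/* Worker thread: sleeps until asked to stop. The condition wait is a
 * stand-in for real event handling. */
static void *agent_worker(void *arg)
{
    struct agent *a = arg;

    pthread_mutex_lock(&a->lock);
    while (!a->stop)
        pthread_cond_wait(&a->wake, &a->lock);
    a->exited = true;
    pthread_mutex_unlock(&a->lock);
    return NULL;
}

/* Starts the worker thread. Returns 0 on success, -1 on failure. */
int agent_initialize(struct agent *a)
{
    a->stop = false;
    a->exited = false;
    if (pthread_mutex_init(&a->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&a->wake, NULL) != 0) {
        pthread_mutex_destroy(&a->lock);
        return -1;
    }
    if (pthread_create(&a->worker, NULL, agent_worker, a) != 0) {
        pthread_cond_destroy(&a->wake);
        pthread_mutex_destroy(&a->lock);
        return -1;
    }
    return 0;
}

/* Requests shutdown and does not return until the worker has terminated. */
int agent_finalize(struct agent *a)
{
    pthread_mutex_lock(&a->lock);
    a->stop = true;
    pthread_cond_signal(&a->wake);
    pthread_mutex_unlock(&a->lock);

    /* Block here until the worker thread is really gone. */
    if (pthread_join(a->worker, NULL) != 0)
        return -1;

    pthread_cond_destroy(&a->wake);
    pthread_mutex_destroy(&a->lock);
    return 0;
}
```

<p> Once agent_finalize has returned 0, the caller can safely unload the library code, because the join guarantees that the worker is no longer executing it.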
</p> <p> /Håkan -- Håkan Mattsson, Erlang/OTP, Ericsson AB </p> <p>
(gdb) thr 2
[Switching to thread 2 (process 15689)]#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
#1 0x00002aaaab174a05 in ncs_sel_obj_select (highest_sel_obj={raise_obj = 15, rmv_obj = 16}, rfds=0x420684b0, wfds=0x0, efds=0x0, timeout_in_10ms=0x0) at src/os_defs.c:2818
#2 0x00002aaaab149d23 in ncs_ipc_recv_common (mbx=0x2aaaab2e2580, block=1) at src/sysf_ipc.c:447
#3 0x00002aaaab149bf5 in ncs_ipc_recv (mbx=0x2aaaab2e2580) at src/sysf_ipc.c:394
#4 0x00002aaaab1b736b in dta_do_evts (mbx=0x2aaaab2e2580) at dta_api.c:1260
#5 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#6 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#7 0x0000000000000000 in ?? ()
(gdb) thr 3
[Switching to thread 3 (process 15688)]#0 0x00002b0d895a08b6 in poll () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a08b6 in poll () from /lib64/libc.so.6
#1 0x00002aaaab17d39e in mdtm_process_recv_events () at src/mds_dt_tipc.c:640
#2 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#3 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()
(gdb) thr 4
[Switching to thread 4 (process 15687)]#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
(gdb) bt
#0 0x00002b0d895a29a2 in select () from /lib64/libc.so.6
#1 0x00002aaaab14e4d6 in ncs_tmr_wait () at src/sysf_tmr.c:541
#2 0x00002b0d892cb143 in start_thread () from /lib64/libpthread.so.0
#3 0x00002b0d895a8b8d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ??
() </p> Results http://devel.opensaf.org/ticket/696#changelog http://devel.opensaf.org/ticket/700 http://devel.opensaf.org/ticket/700 #700: Tool/script to generate initial config (imm.xml) for OpenSAF AMF Entities for variable cluster size Tue, 10 Nov 2009 13:04:11 GMT marioa <p> This tool/script will generate the initial configuration (imm.xml) for OpenSAF MW entities. It should support the following cluster sizes: </p> <ul><li>1 (1 SC) </li><li>2 (2SC) </li><li>N>3 (2SC + (N-2) PLs) </li></ul> Results http://devel.opensaf.org/ticket/700#changelog http://devel.opensaf.org/ticket/701 http://devel.opensaf.org/ticket/701 #701: Tool/script to support dynamic changes to OpenSAF cluster size (for MW entities) Tue, 10 Nov 2009 13:08:54 GMT marioa <p> This tool should provide support for dynamically changing the size of the OpenSAF cluster (without causing downtime). </p> <p> a) The tool should support extending the cluster from 2SC + 0PL to 2SC + X PLs. b) The tool should also support "shrinking" the cluster from 2SC + M PLs to 2SC + N PLs, where 0 <= N < M. </p> Results http://devel.opensaf.org/ticket/701#changelog http://devel.opensaf.org/ticket/702 http://devel.opensaf.org/ticket/702 #702: OpenSAF Release 4 Documentation Strategy Wed, 11 Nov 2009 14:28:46 GMT marioa <p> We need to change the documentation structure to fit the new OpenSAF Architecture. Should we have additional improvements in the way documentation is generated in release 4? </p> Results http://devel.opensaf.org/ticket/702#changelog http://devel.opensaf.org/ticket/703 http://devel.opensaf.org/ticket/703 #703: Defect Classification Proposal Wed, 11 Nov 2009 21:53:17 GMT marioa <p> From Murthy: </p> <p> All, I reviewed bug classification for other open source projects (opensuse, eclipse), and opensuse bug classification is fairly relevant to OpenSAF. Based on that I created the bug classification for OpenSAF. I am attaching the same here. Let's review it at the TLC meeting. 
Thanks, Murthy </p> <p> Blocker </p> <p> Prevents developers or testers from performing their jobs. Impacts the development process. (Documentation) Missing documentation for a feature contribution. Examples: </p> <p> Unable to start the OpenSAF services. Unable to upgrade from one release to another (for releases which support in-service upgrade). </p> <p> Critical </p> <p> Crash, loss of data, corruption of data, severe memory leak. (Documentation) prescribes or doesn't warn against actions that cause data loss or corruption. Examples: </p> <p> Crash that is repeatable and evident to multiple users. Memory leaks in the standard execution flows that lead to memory outages. </p> <p> Major </p> <p> Major loss of function, as specified in the product requirements for this release, or existing in the current product. (Documentation) missing, misleading, inaccurate, or contradictory information to the degree that by following the documentation successful completion of fundamental tasks is unlikely. </p> <p> Examples: </p> <p> Feature compliance as specified in the user documentation missing. Feature not working as per advertised compliance (certain AMF redundancy model not working) in SAF services. Regression test case failure from previous OpenSAF release for SAF services. Missing critical logging information for system debugging. </p> <p> Minor </p> <p> Non-major loss of function. (Documentation) missing, misleading, inaccurate, or contradictory information in the documentation, but successful task completion is probable. Examples: </p> <p> Improper log levels in the OpenSAF services. Code doesn't have appropriate function headers. Code doesn't have proper indentation. Non-reproducible errors in infrastructure services. Not following proper naming convention for executables and source files of OpenSAF. </p> <p> Trivial </p> <p> Issue that can be viewed as trivial (e.g. cosmetic, UI, easily documented). 
(Documentation) contains stylistic or formatting issues, but functionality is not hindered. </p> <p> Examples: typos in log strings, CLI help strings, wrong comments, etc. </p> <p> </p> Results http://devel.opensaf.org/ticket/703#changelog http://devel.opensaf.org/ticket/708 http://devel.opensaf.org/ticket/708 #708: Discussion: Scope of legacy deprecation in Beta 4 Mon, 16 Nov 2009 15:04:07 GMT marioa <p> This ticket describes a topic to be discussed at the TLC meeting. </p> <p> Release 4 should avoid a situation where too much is done in the last Beta of Release 5 (Beta 5), because it will then take longer to stabilize the product. </p> <p> We should aim to make as many structural changes as possible in Beta 4, for example: </p> <ul><li>Finalize integration with IMM (DTSv OI remaining) </li><li>Remove MASv, PSSv, SNMP subagent </li><li>Remove AvM (or at least decouple AMF from it) </li></ul><p> Guiding principle should be: </p> <ul><li>if we can remove certain legacy in Beta 4 without needing to produce significant extra code that will be obsoleted in Beta 5, such a change should be done in Beta 4; otherwise the change is postponed to Beta 5. </li></ul><p> Anybody is invited to provide comments via mail (or by adding a comment to the ticket for better traceability) as input. </p> Results http://devel.opensaf.org/ticket/708#changelog http://devel.opensaf.org/ticket/715 http://devel.opensaf.org/ticket/715 #715: Deprecate NID and distribute its functionality across the Services init.d scripts Wed, 18 Nov 2009 16:27:42 GMT jfournier <p> NID functionality consists of parsing a node configuration file to know what Services to bootstrap pre-AMF. </p> <p> Each Service described in the configuration file has certain essential properties like what script to call, timeout values etc. But most of them are NID specific. 
</p> <p> Services are parsed in line order from the configuration file, once spawned, somewhere in the Services implementation a call to the public API for NID IPC is done to inform NID to proceed to the next configured Services or Retry/Abort depending on the status return. </p> <p> Such functionality can be moved completely in the various init.d scripts (see <a class="accepted ticket" href="http://devel.opensaf.org/ticket/256" title="enhancement: OpenSAF script improvements (accepted)">#256</a>, <a class="accepted ticket" href="http://devel.opensaf.org/ticket/654" title="enhancement: Various init scripts should be using LSB functions (accepted)">#654</a>, <a class="accepted ticket" href="http://devel.opensaf.org/ticket/658" title="enhancement: Remove ncs_main_pvt.c and add main() to each service (accepted)">#658</a>, <a class="accepted ticket" href="http://devel.opensaf.org/ticket/665" title="enhancement: OpenSAF scripts should be validated for /bin/sh (accepted)">#665</a>) and will be LSB compliant </p> Results http://devel.opensaf.org/ticket/715#changelog http://devel.opensaf.org/ticket/716 http://devel.opensaf.org/ticket/716 #716: Introduce the Doxygen documentation process in the source base Thu, 19 Nov 2009 01:30:54 GMT jfournier <p> Steps to slowly introduce the Doxygen process in the code base: </p> <p> 1. Write down a simple process for developers to follow (in progress) 2. Include documentation examples with the process (source and header templates) 3. Have the developers apply this documentation process to their code 4. Produce a Doxyfile and 'make docs' rules 5. Get nightly generation of the html documentation and published on the developer site under the manual section (including the doxy_warn.log to fix documentation issues) </p> <p> Welcome any other suggestions! 
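For step 2, a documentation example could look like the sketch below. It only illustrates standard Doxygen comment commands (`@file`, `@brief`, `@param`, `@return`); the function `example_count_enabled` is hypothetical and is not part of the OpenSAF code base or of any agreed template.

```c
#include <stddef.h>

/**
 * @file
 * @brief Hypothetical example of a Doxygen-documented helper.
 */

/**
 * @brief Count the entries in a flag array that are marked enabled.
 *
 * @param[in] enabled  Array of flags, non-zero meaning "enabled". May be
 *                     NULL only when @p count is 0.
 * @param[in] count    Number of entries in @p enabled.
 *
 * @return Number of non-zero entries in @p enabled.
 */
size_t example_count_enabled(const int *enabled, size_t count)
{
    size_t n = 0;

    for (size_t i = 0; i < count; i++) {
        if (enabled[i] != 0)
            n++;
    }
    return n;
}
```

Keeping the comment directly above the definition, with one `@param` per argument, lets Doxygen warn about undocumented or misnamed parameters, which is the kind of issue a warning log such as the doxy_warn.log mentioned in step 5 would collect.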
</p> Results http://devel.opensaf.org/ticket/716#changelog http://devel.opensaf.org/ticket/718 http://devel.opensaf.org/ticket/718 #718: CSI attribute management has changed in OpenSAF 4.0 beta3 Fri, 20 Nov 2009 09:31:33 GMT karin.holm@… <p> Using Appconfig.xml you specified CSI attributes in SI with the following syntax: <nameValue name="logFile" value="/home/saf-demo/notif.log"/> </p> <p> This was received in the amfCSISetCallback as: CSI attribute name = "logFile", value = "/home/saf-demo/notif.log". </p> <p> With the beta 3 release of OpenSAF the CSI attribute is specified as the following IMM object: </p> <blockquote> <p> <object class="SaAmfCSIAttribute"> </p> <blockquote> <p> <dn>safCsiAttr=logFile,safCsi=ntfSubscribeApp,safSi=SC-NWayActive,safApp=ntfSubscribeApp</dn> <attr> </p> <blockquote> <p> <name>saAmfCSIAttriValue</name> <value>/home/saf-demo/notif.log</value> </p> </blockquote> <p> </attr> </p> </blockquote> <p> </object> </p> </blockquote> <p> and received in the amfCSISetCallback as: CSI attribute name = "safCsiAttr=logFile,safCsi=ntfSubscribeApp,safSi=SC-NWayActive,safApp=ntfSubscribeApp", value = "/home/saf-demo/notif.log". </p> <p> This is quite a change in behavior and makes it hard to code a component independent of the configuration (need to know exactly how it will be configured). </p> <p> Shall the whole dn for the CSI attribute be used as name in the amfCSISetCallback call? Should not just the name part of the dn for the CSI attribute be used instead? 
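Until that question is settled, a component could normalize the name itself. The helper below is a hedged sketch, not part of AMF or OpenSAF: `csi_attr_short_name` is a hypothetical function that returns the value of a leading `safCsiAttr=` RDN and passes old-style short names through unchanged. It assumes the first RDN value contains no escaped commas.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the short CSI attribute name into `out`.
 * If `dn` starts with "safCsiAttr=", the value of that first RDN is
 * copied; otherwise the whole string is copied, so names delivered in
 * the old short form pass through unchanged.
 * Returns 0 on success, -1 if an argument is NULL or `out` is too small.
 * Assumption: the first RDN value contains no escaped commas.
 */
int csi_attr_short_name(const char *dn, char *out, size_t out_len)
{
    static const char prefix[] = "safCsiAttr=";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *start;
    const char *end;
    size_t len;

    if (dn == NULL || out == NULL || out_len == 0)
        return -1;

    if (strncmp(dn, prefix, prefix_len) == 0) {
        start = dn + prefix_len;
        end = strchr(start, ',');
        len = (end != NULL) ? (size_t)(end - start) : strlen(start);
    } else {
        start = dn;
        len = strlen(dn);
    }

    if (len >= out_len)
        return -1; /* no room for the name plus the terminating NUL */

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

With the DN from this ticket as input the helper yields `logFile`, so the same component code would work with both the Appconfig.xml style and the IMM object style of configuration.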
</p> Results http://devel.opensaf.org/ticket/718#changelog http://devel.opensaf.org/ticket/727 http://devel.opensaf.org/ticket/727 #727: Change the LEAP timer implementation to use POSIX timers Thu, 17 Dec 2009 15:27:13 GMT hafe <p> The LEAP timer implementation: - is redundant - is legacy - has bugs: <a href="http://devel.opensaf.org/ticket/689">http://devel.opensaf.org/ticket/689</a>, <a href="http://devel.opensaf.org/ticket/267">http://devel.opensaf.org/ticket/267</a> </p> <p> As a first step we should keep the API but change the implementation to use POSIX timers. Then the code could be refactored to use the POSIX API directly. </p> Results http://devel.opensaf.org/ticket/727#changelog http://devel.opensaf.org/ticket/729 http://devel.opensaf.org/ticket/729 #729: saClmClusterNodeGet() returns the name of AMF nodes Fri, 18 Dec 2009 14:52:22 GMT hafe <p> Patch is on the way. This is a very small conversion from AMF to CLM node until we have the new CLM in place. </p> Results http://devel.opensaf.org/ticket/729#changelog http://devel.opensaf.org/ticket/738 http://devel.opensaf.org/ticket/738 #738: SCAP fails to come up when an SG is configured in NWay redundancy model in OpenSAF 4.0 Beta3. Fri, 08 Jan 2010 13:46:40 GMT ashwanigoyal <p> SCAP fails to come up when an SG is configured in NWay redundancy model in OpenSAF 4.0 Beta3. </p> <p> The values of saAmfCtCompCategory and saAmfCtCompCapability change. </p> <p> In imm.xml both are set to 1 but in the AVD logs they are shown as 0 and 4. </p> <blockquote> <p> Tried to bring up OpenSAF with the following configuration. </p> </blockquote> <p> Service group type is defined with Redundancy Model (saAmfSgtRedundancyModel) 3, i.e. NWay. Service group is defined with saAmfSGNumPrefInserviceSUs=8 saAmfSGNumPrefAssignedSUs=6 saAmfSGMaxActiveSIsperSUs=3 saAmfSGMaxStandbySIsperSUs=7 which is of Service group type. 
</p> <p> Comp type is defined with saAmfCtCompCategory = 1 saAmfCtCompCapability = 1 </p> <p> When I start OpenSAF service, it is stuck at SCAP process. The AVD reports the following error: </p> <p> NOTICE : 0x0002010f 1527447491 12 1 08Jan2010_19.33.15.200 avd_comp_config_get - 'safComp=Norm1,safSu=dummy_NplusM_1Norm_1,safSg=SG_dummy_nway,safApp=<a class="missing wiki" href="http://devel.opensaf.org/wiki/NwayApp" rel="nofollow">NwayApp?</a>' </p> <p> ERROR : 0x0002010f 1222492099 12 1 08Jan2010_17.41.38.113 avd_comp_config_validate - Illegal category 0 or cap 4 for SG red model 3 ERROR : 0x0002010f 1222492099 12 1 08Jan2010_17.41.38.114 avd_imm_config_get - Failed to read configuration, AMF will not start ERROR : 0x0002010f 1222492099 12 1 08Jan2010_17.41.38.115 AVD: AN INVALID DATA VALUE at avd_avmmsg.c:372 val 2 ERROR : 0x0002010f 1222492099 12 1 08Jan2010_17.41.38.115 AVD: AN INVALID DATA VALUE at avd_avmmsg.c:373 val 1 NOTICE : 0x0002010f 1222492099 12 1 08Jan2010_17.41.38.115 AVD: AVD Role change Failure 0 </p> Results http://devel.opensaf.org/ticket/738#changelog http://devel.opensaf.org/ticket/739 http://devel.opensaf.org/ticket/739 #739: Segmentation fault in ncs_scap (avd) Fri, 08 Jan 2010 14:49:55 GMT arne <p> d_compcsi_info->attrs.list is of type NCS_AVSV_ATTR_NAME_VAL but type of bigger size AVSV_SUSI_ASGN is allocated. Read from s_compcsi_info->attrs.list during copy could generate segfault. </p> <p> In function avsv_cpy_d2n_susi_msg: </p> <blockquote> <p> d_compcsi_info->attrs.list = m_MMGR_ALLOC_AVSV_COMMON_DEFAULT_VAL((s_compcsi_info->attrs.number * sizeof(AVSV_SUSI_ASGN))); </p> </blockquote> <p> ... memcpy(d_compcsi_info->attrs.list,s_compcsi_info->attrs.list,(s_compcsi_info->attrs.number * sizeof(AVSV_SUSI_ASGN))); </p> <hr /> <p> Core was generated by `/usr/lib64/opensaf/ncs_scap ROLE=2 NID_SVC_NAME=SCAP'. Program terminated with signal 11, Segmentation fault. 
#0 0x00002b3dd1ca8413 in avsv_cpy_d2n_susi_msg (d_susi_msg=0x2aaaab0f7ff0, s_susi_msg=0x2aaaab0f8490) at avsv_d2nmsg.c:555
555 avsv_d2nmsg.c: No such file or directory.
in avsv_d2nmsg.c
(gdb) bt
#0 0x00002b3dd1ca8413 in avsv_cpy_d2n_susi_msg (d_susi_msg=0x2aaaab0f7ff0, s_susi_msg=0x2aaaab0f8490) at avsv_d2nmsg.c:555
#1 0x00002b3dd1ca8758 in avsv_dnd_msg_copy (dmsg=0x2aaaab0f7ff0, smsg=0x2aaaab0f8490) at avsv_d2nmsg.c:751
#2 0x000000000043acbc in avd_mds_cpy (cpy_info=0x4009df90) at avd_ndmsg.c:156
#3 0x0000000000439c64 in avd_mds_cbk (info=0x4009df80) at avd_mds.c:444
#4 0x00002b3dd19349fb in mcm_msg_cpy_send (to=1 '\1', svc_cb=0x6b1f10, to_msg=0x4009e150, to_svc_id=27, dest_vdest_id=65535, i_req=0x4009e210, xch_id=0, dest=564117024603568, pri=MDS_SEND_PRIORITY_HIGH) at src/mds_c_sndrcv.c:1290
#5 0x00002b3dd193444f in mds_mcm_send_msg_enc (to=1 '\1', svc_cb=0x6b1f10, to_msg=0x4009e150, to_svc_id=27, dest_vdest_id=65535, req=0x4009e210, xch_id=0, dest=564117024603568, pri=MDS_SEND_PRIORITY_HIGH) at src/mds_c_sndrcv.c:1152
#6 0x00002b3dd19342a7 in mcm_pvt_normal_snd_process_common (env_hdl=131071, fr_svc_id=26, to_msg= {msg_type = 1 '\1', data = {msg = 0x2aaaab0f8490, info = {len = 33936, buff = 0x0}}, msg_fmt_ver = 0, rem_svc_sub_part_ver = 2 '\2', rem_svc_arch_word = 8 '\b', mds_bcast_list_hdr = 0x0}, to_dest=564117024603568, to_svc_id=27, req=0x4009e210, pri=MDS_SEND_PRIORITY_HIGH, xch_id=0) at src/mds_c_sndrcv.c:1111
#7 0x00002b3dd1933d9e in mcm_pvt_normal_svc_snd (env_hdl=131071, fr_svc_id=26, msg=0x2aaaab0f8490, to_dest=564117024603568, to_svc_id=27, req=0x4009e210, pri=MDS_SEND_PRIORITY_HIGH) at src/mds_c_sndrcv.c:950
#8 0x00002b3dd19337f1 in mds_mcm_send (info=0x78b240) at src/mds_c_sndrcv.c:711
#9 0x00002b3dd1932dde in mds_send (info=0x78b240) at src/mds_c_sndrcv.c:405
#10 0x00002b3dd19329a8 in ncsmds_api (svc_to_mds_info=0x78b240) at src/mds_papi.c:111
#11 0x000000000043b198 in avd_d2n_msg_dequeue (cb=0x6a7460) at avd_ndmsg.c:308
#12 0x000000000043d321 in avd_process_event (cb_now=0x6a7460, evt=0xf839c0) at avd_proc.c:688
#13 0x000000000043ce17 in avd_main_proc (cb=0x6a7460) at avd_proc.c:600
#14 0x000000000040fceb in avd_init_proc (avd_hdl_ptr=0x686d1c) at avd.c:592
#15 0x00002b3dd294e193 in start_thread () from /lib64/libpthread.so.0
#16 0x00002b3dd27c8dfd in clone () from /lib64/libc.so.6
</p> Results http://devel.opensaf.org/ticket/739#changelog http://devel.opensaf.org/ticket/742 http://devel.opensaf.org/ticket/742 #742: component does not get CSI when it is killed but it restarts Tue, 12 Jan 2010 10:40:30 GMT ashwanigoyal <p> Whenever a component is killed, CSI is not assigned to it after it restarts. </p> <p> From the logs it is observed that it cannot come to IN SERVICE state from OUT OF SERVICE. 
</p> <p> NOTICE : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.486 avd_comp_oper_state_set - 'safComp=AvSVTest1,safSu=SC_2_1_AvSVTest1,safSg=2N_AvSVTest,safApp=AvSVTestApp' DISABLED => ENABLED DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.490 AVD: ENTRED THE FUNCTION avsv_send_ckpt_data NOTICE : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.490 avd_comp_readiness_state_set - 'safComp=AvSVTest1,safSu=SC_2_1_AvSVTest1,safSg=2N_AvSVTest,safApp=AvSVTestApp' OUT_OF_SERVICE => OUT_OF_SERVICE DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: ENTRED THE FUNCTION avsv_send_ckpt_data DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: ENTRED THE FUNCTION avsv_send_ckpt_data DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: ENTRED THE FUNCTION avd_mds_cpy DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: RECEIVED THE VALUE at avd_ndmsg.c:142 val 136081512 DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: MDS Copy Cbk Success DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: MDS Rcv Cbk Success DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: MDS Send Success DEBUG : 0x0002010f 1173700547 12 1 12Jan2010_16.13.25.491 AVD: Dump of the AVD AvND Message : 0x081C89B0 </p> Results http://devel.opensaf.org/ticket/742#changelog http://devel.opensaf.org/ticket/743 http://devel.opensaf.org/ticket/743 #743: Improve printouts from NTF Tue, 12 Jan 2010 14:03:00 GMT arne <p> Printouts to syslog at initial start from NTFS is missing. For example if "component 'safComp=CompT_NTF,safSu=SuT_NCS_CNTLR,safNode=SC-2-1' faulted due to 'healthCheckcallbackTimeout(6)" occur at node restart. </p> <p> Probably NTF is in a try again loop waiting for saflog to be available but no printout indicate that. 
</p> <p> </p> Results http://devel.opensaf.org/ticket/743#changelog http://devel.opensaf.org/ticket/744 http://devel.opensaf.org/ticket/744 #744: Trace is causing crash -- saNtfNotificationSubscribe Wed, 13 Jan 2010 06:59:11 GMT Manoj Lalavat <manoj.lalavat@…> <p> In saNtfNotificationSubscribe within for loop we are trying to trace filter handle before checking NULL. </p> Results http://devel.opensaf.org/ticket/744#changelog http://devel.opensaf.org/ticket/4 http://devel.opensaf.org/ticket/4 #4: AVA handle database not needed Tue, 05 Feb 2008 07:27:45 GMT hans.feldt@… <p> The AVA control block contains a handle database that is not really used. Could be removed. </p> Results http://devel.opensaf.org/ticket/4#changelog http://devel.opensaf.org/ticket/6 http://devel.opensaf.org/ticket/6 #6: AMF Node fail-over,SaAmfCSIRemoveCallbackT erroneously called Tue, 05 Feb 2008 09:07:57 GMT hans.feldt@… <p> <a class="ext-link" href="http://list.opensaf.org/archives/users/2008-January/000772.html"><span class="icon">http://list.opensaf.org/archives/users/2008-January/000772.html</span></a> </p> Results http://devel.opensaf.org/ticket/6#changelog http://devel.opensaf.org/ticket/130 http://devel.opensaf.org/ticket/130 #130: Name space pollution in libraries Tue, 25 Mar 2008 06:52:15 GMT hans.feldt@… <p> All SAF (and others?) libraries need a version script see: <a class="ext-link" href="http://www.gnu.org/software/binutils/manual/ld-2.9.1/html_node/ld_25.html"><span class="icon">http://www.gnu.org/software/binutils/manual/ld-2.9.1/html_node/ld_25.html</span></a> </p> <p> that hides library internal symbols from the library name space. </p> <p> This is what is required, in e.g. 
LOG: </p> <p> - Change lib/lib_SaLog/Makefile.am: </p> <p> libSaLog_la_LDFLAGS = -Wl,-version-script=libSaLog.version </p> <p> - Add file libSaLog.version in lib/lib_SaLog, contents: </p> <p> # Version and symbol export for libSaLog.so OPENAIS_LOG_A.02.01 { </p> <blockquote> <p> global: </p> <blockquote> <p> saLog*; </p> </blockquote> <p> local: </p> <blockquote> <p> lga*; </p> </blockquote> </blockquote> <p> }; </p> Results http://devel.opensaf.org/ticket/130#changelog http://devel.opensaf.org/ticket/137 http://devel.opensaf.org/ticket/137 #137: AvSv programmer guide, additional configuration example Wed, 02 Apr 2008 20:56:51 GMT marioa <p> The two examples listed below, communicated on the users list, show good use cases that are beneficial to OpenSAF users. </p> <p> The proposal is to add the examples below to the AvSv documentation. This will: </p> <ul><li>even better illustrate how to use SNMP for configuring AMF </li><li>better emphasize the configuration capabilities of OpenSAF </li></ul><p> </p> <blockquote class="citation"> <p> -----Original Message----- From: users-bounces@… users-bounces@… On Behalf Of Arya Ravi-G20265 Sent: 26 October 2007 17:48 To: Long, Qing Yang (TSG-GDCC-SH/CMEP); Sreenivasan Prabhu-G20200; users@… Subject: Re: [Users] Adding a node to OpenSAF cluster Hi Long Qingyang, Please see the following sample configuration using SNMP MIB sets to configure a component, CSI, SU, SI and all. There are dependencies between the different MIB configurations, so you need to make sure that the configuration is done in a particular order, as mentioned below (or follow the MIB descriptions). You may add a completely new configuration or you can modify the existing one. Every MIB object description says whether it has any dependency; if you do not want to look into the MIB files, then follow the configuration order mentioned below. 
Thanks- Arya Node Configuration: (Configure as many as node is needed) snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfNodeSuFailoverMax.\"safNode=PL_2_3\" u 2 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfNodeSuFailoverProb.\"safNode=PL_2_3\" x "00 00 03 A3 52 94 40 00" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB localhost SAF-AMF-MIB::saAmfNodeAdminState.\"safNode=PL_2_3\" i 2 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 NCS-AVSV-MIB::ncsNDNodeId.\"safNode=PL_2_3\" u 3 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfNodeRowStatus.\"safNode=PL_2_3\" i 1 SG Configuratin: (Configure SGs ) snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSGRedModel.\"safSg=SG_TEST\" i 1 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSGNumPrefInserviceSUs.\"safSg=SG_TEST\" u 2 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSGCompRestartProb.\"safSg=SG_TEST\" x "00 00 03 A3 52 94 40 00" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSGSuRestartProb.\"safSg=SG_TEST\" x "00 00 03 A3 52 94 40 00" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSGCompRestartMax.\"safSg=SG_TEST\" u 10 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSGSuRestartMax.\"safSg=SG_TEST\" u 10 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSGRowStatus.\"safSg=SG_TEST\" i 1 SU Configuration: (Configure SUs on a node or different nodes) snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSURank.\"safSu=SuT_TEST,safNode=PL_2_3\" u 1 snmpset -v2c -c 
public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSUNumComponents.\"safSu=SuT_TEST,safNode=PL_ 2_3\" u 1 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSUParentSGName.\"safSu=SuT_TEST,safNode=PL_2 _3\" s "safSg=SG_TEST" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSURowStatus.\"safSu=SuT_TEST,safNode=PL_2_3\" i 1 SI Configuration: (Configure SIs) snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSIRank.\"safSi=Si_TEST\" u 1 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSINumCSIs.\"safSi=Si_TEST\" u 1 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSIParentSGName.\"safSi=Si_TEST\" s "safSg=SG_TEST" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSIRowStatus.\"safSi=Si_TEST\" i 1 CSI Configuration: (Configure CSIs ) snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCSType.\"safCsi=Csi_TEST,safSi=Si_TEST\" s "safCsi=Csi_TEST" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCSIRank.\"safCsi=Csi_TEST,safSi=Si_TEST\" u 1 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCSIRowStatus.\"safCsi=Csi_TEST,safSi=Si_TEST\" i 1 SUsPerSI Configuration: snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSUsperSISUName.\"safSi=Si_TEST\".1 s "safSu=SuT_TEST,safNode=PL_2_3" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfSUsperSIRowStatus.\"safSi=Si_TEST\".1 i 1 CSTypeParam Configuration: snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompCSTypeSupportedRowStatus.\"safComp=CompT _TEST,safS 
u=SuT_TEST,safNode=PL_2_3\".\"safCsi=Csi_TEST\" i 4 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCSTypeParamRowStatus.\"safCsi=Csi_TEST\".\"t est\" i 4 Component Configuration: (Configure components) snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompCapability.\"safComp=CompT_TEST,safSu=Su T_TEST,saf Node=PL_2_3\" i 4 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompCategory.\"safComp=CompT_TEST,safSu=SuT_ TEST,safNo de=PL_2_3\" i 0 nmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompInstantiateCmd.\"safComp=CompT_TEST,safS u=SuT_TEST ,safNode=PL_2_3\" s "/home/g20265/comp_clc/test_1.sh" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompCleanupCmd.\"safComp=CompT_TEST,safSu=Su T_TEST,saf Node=PL_2_3\" s "/home/g20265/comp_clc/comp_term.pl" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompTerminateTimeout.\"safComp=CompT_TEST,sa fSu=SuT_TE ST,safNode=PL_2_3\" x "00 00 03 A3 52 94 40 00" snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompNumMaxInstantiate.\"safComp=CompT_TEST,s afSu=SuT_T EST,safNode=PL_2_3\" u 5 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompNumMaxActiveCsi.\"safComp=CompT_TEST,saf Su=SuT_TES T,safNode=PL_2_3\" u 1 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompNumMaxStandbyCsi.\"safComp=CompT_TEST,sa fSu=SuT_TE ST,safNode=PL_2_3\" u 1 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 SAF-AMF-MIB::saAmfCompAMEnable.\"safComp=CompT_TEST,safSu=SuT_ TEST,safNo de=PL_2_3\" i 2 snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB -r 0 10.232.92.186 
SAF-AMF-MIB::saAmfCompRowStatus.\"safComp=CompT_TEST,safSu=SuT_TEST,safNode=PL_2_3\" i 1

From: users-bounces@… On Behalf Of Long, Qing Yang (TSG-GDCC-SH/CMEP)
Sent: Thursday, October 25, 2007 8:23 PM
To: Sreenivasan Prabhu-G20200; users@…
Subject: Re: [Users] Adding a node to OpenSAF cluster

Thanks, Prabhu. The commands you provided work well, and payload node "PL_2_20" comes up. But this is only a basic configuration: no application is configured on this node, so it is empty and useless. Only one OpenSAF process is running: "/opt/opensaf/payload/bin/ncs_pcap NID_SVC_ID=18". Some HA-aware applications should be added to this node. How do I add a "service group", "service unit", "service instance", etc., including all kinds of software components? For example: "CompT_SC_CPND", "CompT_SC_MQND", "CompT_SC_GLND", "safComp=CompT_IFND", "safComp=CompT_EDS", "safComp=CompT_AvSvDemo", etc. When "PL_2_20" is running, it should be just like "PL_2_3", which is configured by XML files. How do I configure these software components using snmpset?

Best regards,
Long Qingyang

From: Sreenivasan Prabhu-G20200 prabhu@…
Sent: Thursday, October 25, 2007 9:04 PM
To: Long, Qing Yang (TSG-GDCC-SH/CMEP); users@…
Subject: RE: [Users] Adding a node to OpenSAF cluster

Hi Long Qingyang,

You can configure a running OpenSAF system using MIBs. For example, to add a node configuration dynamically on a running system,
say to add a payload node on slot 20 (0x02140f), give the following SNMP commands on your active controller:

snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB localhost SAF-AMF-MIB::saAmfNodeSuFailoverProb.\"safNode=PL_2_20\" x "00 00 00 00 05 F5 E1 00"
snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB localhost SAF-AMF-MIB::saAmfNodeSuFailoverMax.\"safNode=PL_2_20\" u 2
snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB localhost SAF-AMF-MIB::saAmfNodeAdminState.\"safNode=PL_2_20\" i 2
snmpset -v2c -c public -m /usr/share/snmp/mibs/NCS-AVSV-MIB localhost NCS-AVSV-MIB::ncsNDNodeId.\"safNode=PL_2_20\" u 134928
snmpset -v2c -c public -m /usr/share/snmp/mibs/SAF-AMF-MIB localhost SAF-AMF-MIB::saAmfNodeRowStatus.\"safNode=PL_2_20\" i 1

Afterwards you need to add SUs, components, etc. on this new node. Refer to SAF-AMF-MIB and NCS-AVSV-MIB for more details.

Note: copy all MIBs to /usr/share/snmp/mibs on the active controller and run export MIBDIRS=/usr/share/snmp/mibs from your shell.

Thanks,
Prabhu

From: users-bounces@… On Behalf Of Long, Qing Yang (TSG-GDCC-SH/CMEP)
Sent: Thursday, October 25, 2007 6:56 AM
To: users@…
Subject: [Users] Adding a node to OpenSAF cluster

Hello friends,

I need your help with the following questions:

1. OpenSAF uses NCSSystemBOM.xml and AppConfig.xml to configure a cluster node and its software components. How can I add a node to an OpenSAF cluster while OpenSAF is running? Do you have a C program that implements this function?

2. Or do you have a step-by-step operation guide for adding a node to an OpenSAF cluster manually?

Thanks.
Long Qingyang </p> </blockquote> Results http://devel.opensaf.org/ticket/137#changelog