informix-community

onbar -r fails to recover DB server

naguibenator
Mon, 23 Jul 2018 12:12:05 GMT

Here are the steps I followed:

1- Performed a level-0 backup on the source server. Backups go to the cloud device (an S3 bucket device set up using the onpsm tool):

    # onbar -b -L 0

2- Copied the $INFORMIXDIR/etc/psm directory and $INFORMIXDIR/etc/ixbar.? from the source server to the target, and renamed the ixbar.? file, replacing the IBM Informix server name that is used on the source computer with the IBM Informix server name of the target computer.

3- Changed ownership of the copied /psm directory and the ixbar file to user informix.

4- Shut down the db to perform a cold restore:

    onmode -yuk

5- Checked that the backup objects are there:

    onpsm -O list

6- Ran:

    onbar -r -cf yes

Got this weird error: the environment variable BAR_BSALIB_PATH must be set in order to use the -cf option! I checked BAR_BSALIB_PATH in the ONCONFIG file and it is set to $INFORMIXDIR/lib/libbsapsm.so

7- Ran:

    onsmsync -b

baract.log:

    2018-07-20 00:19:07 25656 25370 (-43207) Unable to open connection to database server: .
    2018-07-20 00:19:07 25656 25370 onsmsync complete, returning 155 (0x9b)
    2018-07-20 00:21:24 25933 25931 /opt/informix-12.10.fc10/bin/onbar_d -r

8- Ran:

    onbar -r

Output:

    Warning: Parameter's user-configured value was adjusted. (ALARMPROGRAM)
    gzip: stdin: unexpected end of file
    gzip: stdin: unexpected end of file

9- The server came back in fast recovery mode.

10- Checked the bar_dbug.log. I can see it connecting to the bucket and issuing a request like

    GET /awwdst13a/rootdbs/0/rootdbs.377.1 HTTP/1.1

then failure messages like

    Failed writing body (0 != 16347)
    Closing connection 0
    2018-07-20 00:21:25 25937 25933 smtranid.c:155 Object Transaction List was empty.
    2018-07-20 00:21:27 25933 25931 barixbarlist: enter
    2018-07-20 00:21:27 25933 25931 barixbarlist: return 0 (0x00)
    2018-07-20 00:21:27 25933 25931 barbuildtimeline: enter
    2018-07-20 00:21:27 25933 25931 barbuildtimeline: return 0 (0x00)
    2018-07-20 00:21:27 25933 25931 smcatalog.c:521 nsmOpenDatabase: Count: opened = 9, missing = 0.
    2018-07-20 00:21:27 25933 25931 smtranid.c:84 Object Transaction List was empty.
    2018-07-20 00:21:27 25941 25933 smcatalog.c:521 nsmOpenDatabase: Count: opened = 9, missing = 0.
    2018-07-20 00:21:27 25941 filter writetofilter: Write failed on parent's output pipe. errno = 32..
    2018-07-20 00:21:27 25941 filter writetofilter: Error writing to handle/fd 17 located at 0x1e05930.

11- Checked the baract.log:

    2018-07-20 00:21:27 25945 25933 Successfully connected to Storage Manager.
    2018-07-20 00:21:27 25933 25931 Begin cold level 0 restore rootdbs (Storage Manager copy ID: 0 377).
    2018-07-20 00:21:40 25945 25933 Informix Primary Storage Manager session 867 closed
    2018-07-20 00:21:40 25945 25933 The child process for the backup and restore filter is terminating with exit code 0.
    2018-07-20 00:21:41 25933 25931 Completed cold level 0 restore rootdbs.
    2018-07-20 00:21:41 25969 25933 Begin cold level 0 restore api7dbs (Storage Manager copy ID: 0 381).
    2018-07-20 00:21:42 25968 25933 Informix Primary Storage Manager session 869 opened.
    2018-07-20 00:21:42 25968 25933 Successfully connected to Storage Manager.
    2018-07-20 00:21:42 25968 25933 Starting Filter /usr/bin/gunzip.
    2018-07-20 00:21:42 25968 25933 Informix Primary Storage Manager session 869 closed
    2018-07-20 00:21:44 25971 25933 Begin cold level 0 restore dat2dbs (Storage Manager copy ID: 0 380).
    2018-07-20 00:21:51 25983 25971 There are no more bytes to read. ISAM Error = 0. OS Error = 10 (No child processes).
    2018-07-20 00:21:51 25983 25971 XBSA Error: (BSAGetData) A system error occurred. Aborting XBSA session.
    2018-07-20 00:21:51 25983 25971 Informix Primary Storage Manager session 875 closed
    2018-07-20 00:21:51 25983 25971 The child process for the backup and restore filter is terminating with exit code 0.
    2018-07-20 00:21:53 25971 25933 Unable to close the storage space restore: The physical restore was not completed..
    2018-07-20 00:21:53 25971 25933 Process 25971 25933 completed.
    2018-07-20 00:21:53 25933 25931 (-43246) The ON-Bar process 25971 exited with a problem (exit code 131 (0x83), signal 0).
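
Because several ON-Bar processes write to the activity log at once, it can be hard to see which storage spaces actually finished. The following is a minimal sketch, not an Informix tool, that pairs "Begin cold level N restore" entries with "Completed cold level N restore" entries and reports the spaces that never completed. It assumes the line layout seen in this thread (timestamp, process ID, parent process ID, message); the function name `unfinished_restores` and the `SAMPLE` excerpt are illustrative only, and warm restores are not handled.

```python
import re

# Layout assumed from the log lines quoted in this thread:
# "<date> <time> <pid> <ppid> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<pid>\d+) (?P<ppid>\d+) (?P<msg>.*)$"
)
BEGIN_RE = re.compile(r"^Begin cold level \d+ restore (\S+)")
DONE_RE = re.compile(r"^Completed cold level \d+ restore (\S+?)\.?$")


def unfinished_restores(log_text):
    """Map each dbspace that began a restore but never completed
    to the timestamp of its 'Begin' entry."""
    started = {}
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a timestamped entry
        msg = match.group("msg")
        begin = BEGIN_RE.match(msg)
        if begin:
            started.setdefault(begin.group(1), match.group("ts"))
            continue
        done = DONE_RE.match(msg)
        if done:
            started.pop(done.group(1), None)
    return started


# Excerpt taken from the baract.log lines in the post above.
SAMPLE = """\
2018-07-20 00:21:27 25933 25931 Begin cold level 0 restore rootdbs (Storage Manager copy ID: 0 377).
2018-07-20 00:21:41 25933 25931 Completed cold level 0 restore rootdbs.
2018-07-20 00:21:41 25969 25933 Begin cold level 0 restore api7dbs (Storage Manager copy ID: 0 381).
2018-07-20 00:21:44 25971 25933 Begin cold level 0 restore dat2dbs (Storage Manager copy ID: 0 380).
2018-07-20 00:21:53 25971 25933 Unable to close the storage space restore: The physical restore was not completed..
"""

for space, began_at in unfinished_restores(SAMPLE).items():
    print(f"{space}: restore began at {began_at} but no completion was logged")
```

A space listed here is only a candidate: the excerpt may simply stop before the completion entry, so the full log should be checked before drawing conclusions.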

gcastro
Mon, 23 Jul 2018 22:01:57 GMT

Hi. Your steps do not seem consistent:

In step 2: You only need to rename the ixbar file so that the name matches the SERVERNUM in your onconfig. You should NOT edit the ixbar file to change the contents or change the INFORMIXSERVER you see in the first column.

In step 6: You ran "onbar -r -cf yes", which is done to restore your config file, ixbar and sqlhosts file. But you had just copied your ixbar manually, so it is not clear why you wanted to do this. Furthermore, you got an error but did not address it.

In step 7: You ran "onsmsync -b", which is used to regenerate the ixbar file that you had just copied in step 2. This command fails because the engine must be online in order to use it.

The error "Failed writing body" means we tried to write the dbspace data but failed for some reason. You should check whether there are any errors in the online.log. In addition, I see you are using filters; there may be something wrong in your filter configuration, so double-check that.
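
To make the two configuration checks above concrete (ixbar file name versus SERVERNUM, and the filter settings), here is a minimal sketch that reads onconfig text and reports the relevant values. It assumes the usual "NAME value" line layout with "#" comment lines, and it looks at the SERVERNUM, BAR_BSALIB_PATH, BAR_IXBAR_PATH, BACKUP_FILTER and RESTORE_FILTER parameters. The helper names and the sample values (SERVERNUM 3, the gzip backup filter) are hypothetical; only the library path and the gunzip restore filter come from this thread.

```python
def parse_onconfig(text):
    """Parse 'NAME value' lines from onconfig text.
    Lines starting with '#' and blank lines are ignored; the last
    occurrence of a parameter wins."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


def restore_checklist(params):
    """Collect the settings worth comparing between source and target."""
    servernum = params.get("SERVERNUM", "")
    return {
        # Default ixbar file name; BAR_IXBAR_PATH, if set, overrides the location.
        "expected_ixbar_name": f"ixbar.{servernum}" if servernum else None,
        "BAR_IXBAR_PATH": params.get("BAR_IXBAR_PATH"),
        "BAR_BSALIB_PATH": params.get("BAR_BSALIB_PATH"),
        "BACKUP_FILTER": params.get("BACKUP_FILTER"),
        "RESTORE_FILTER": params.get("RESTORE_FILTER"),
    }


# Hypothetical onconfig fragment for illustration.
SAMPLE_ONCONFIG = """\
# Backup and restore settings
SERVERNUM 3
BAR_BSALIB_PATH $INFORMIXDIR/lib/libbsapsm.so
BACKUP_FILTER /usr/bin/gzip
RESTORE_FILTER /usr/bin/gunzip
"""

for name, value in restore_checklist(parse_onconfig(SAMPLE_ONCONFIG)).items():
    print(f"{name}: {value}")
```

Running this against the onconfig of both hosts and comparing the output is one way to spot a mismatched filter pair or an ixbar file whose suffix does not match the target's SERVERNUM.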

naguibenator
Thu, 26 Jul 2018 08:31:43 GMT

Sorry, I lumped a number of steps together when I should have outlined the 2 different restore processes I attempted. Below is what I attempted and the error I got.

On the source (backup) db:

1- Performed a level-0 backup on the source server. Backups go to the cloud device (an S3 bucket device set up using the onpsm tool):

    # onbar -b -L 0

2- On the target host, ran oninit -i then onpsm -C init -d to start clean. Also, there were 4 dbs on that system. I was able to drop two of them but not the sysmaster and sysutils ones. [Screen Shot 2018-07-26 at 1](//muut.com/u/informix-community/s1/:informix-community:TOtK:screenshot20180726at1.43.25pm.png.jpg)

3- Copied the $INFORMIXDIR/etc/psm directory and $INFORMIXDIR/etc/ixbar.x from the source server to the target (recovery host) and renamed the ixbar file so that the name matches the SERVERNUM in the onconfig file.

4- Ran:

    onpsm -D list
    onpsm -O list

and I could see the storage devices that existed on the source as well as the storage objects.

5- Ran onbar -r and got this:

    2018-07-26 16:02:57 23329 22831 Begin cold level 0 restore api3dbs (Storage Manager copy ID: 0 435).
    2018-07-26 16:03:09 23293 23291 Informix Primary Storage Manager session 1008 closed
    2018-07-26 16:03:09 23293 23291 The child process for the backup and restore filter is terminating with exit code 0.
    2018-07-26 16:03:10 23291 22831 Completed cold level 0 restore api2dbs.
    2018-07-26 16:03:10 23291 22831 Process 23291 22831 completed.
    2018-07-26 16:03:10 23729 22831 Process 23729 22831 successfully forked.
    2018-07-26 16:03:10 23729 22831 The PSM is ready.
    2018-07-26 16:03:10 23729 22831 Informix Primary Storage Manager session 1013 opened.
    2018-07-26 16:03:10 23729 22831 Successfully connected to Storage Manager.
    2018-07-26 16:03:10 23729 22831 Starting Filter /usr/bin/gunzip.
    2018-07-26 16:03:10 23729 22831 Informix Primary Storage Manager session 1013 closed
    2018-07-26 16:03:10 23731 23729 The PSM is ready.
    2018-07-26 16:03:10 23731 23729 Informix Primary Storage Manager session 1014 opened.
    2018-07-26 16:03:10 23731 23729 Successfully connected to Storage Manager.
    2018-07-26 16:03:10 23729 22831 Begin cold level 0 restore dat4dbs (Storage Manager copy ID: 0 436).
    2018-07-26 16:03:29 23316 23314 There are no more bytes to read. ISAM Error = 0. OS Error = 10 (No child processes).
    2018-07-26 16:03:29 23316 23314 XBSA Error: (BSAGetData) A system error occurred. Aborting XBSA session.
    2018-07-26 16:03:29 23316 23314 Informix Primary Storage Manager session 1010 closed
    2018-07-26 16:03:29 23316 23314 The child process for the backup and restore filter is terminating with exit code 0.
    2018-07-26 16:03:30 23314 22831 Unable to close the storage space restore: The physical restore was not completed..
    2018-07-26 16:03:30 23314 22831 Process 23314 22831 completed.
    2018-07-26 16:03:30 22831 22829 (-43246) The ON-Bar process 23314 exited with a problem (exit code 131 (0x83), signal 0).
    2018-07-26 16:03:30 24980 22831 Process 24980 22831 successfully forked.
    2018-07-26 16:03:30 24980 22831 The PSM is ready.
    2018-07-26 16:03:30 24980 22831 Informix Primary Storage Manager session 1015 opened.
    2018-07-26 16:03:30 24980 22831 Successfully connected to Storage Manager.
    2018-07-26 16:03:30 24980 22831 Starting Filter /usr/bin/gunzip.
    2018-07-26 16:03:30 24980 22831 Informix Primary Storage Manager session 1015 closed
    2018-07-26 16:03:30 24982 24980 The PSM is ready.
    2018-07-26 16:03:30 24982 24980 Informix Primary Storage Manager session 1016 opened.
    2018-07-26 16:03:30 24982 24980 Successfully connected to Storage Manager.
    2018-07-26 16:03:30 24980 22831 Begin cold level 0 restore dat0dbs (Storage Manager copy ID: 0 437).
    2018-07-26 16:03:43 23331 23329 There are no more bytes to read. ISAM Error = 0.
    2018-07-26 16:06:12 22831 22829 (-43140) Due to the previous error, logical restore will not be attempted.
    2018-07-26 16:06:12 22831 22829 /opt/informix-12.10.fc10/bin/onbar_d complete, returning 131 (0x83)

gcastro
Thu, 26 Jul 2018 18:25:48 GMT

The initial step "oninit -i" is not needed. Without seeing the whole log, it is not possible to know what is happening. At this point the error seems to indicate that one dbspace is not complete. I suggest you call Tech Support to assist you.
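
As background on the "gzip: stdin: unexpected end of file" message from the first post: gzip prints it when its input stops before the compressed stream reaches its end-of-stream trailer, which is consistent with a backup object that was not read in full. The sketch below reproduces that condition in isolation using Python's standard `gzip` module. It is only an illustration under the assumption that a copy of the compressed bytes is available to test; `gzip_stream_is_complete` is a hypothetical helper, and the payload is made-up data, not an Informix backup object.

```python
import gzip
import io
import zlib


def gzip_stream_is_complete(data: bytes) -> bool:
    """Return True only if the bytes decompress through to the gzip trailer."""
    if not data:
        return False  # an empty stream carries no gzip member at all
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as stream:
            # Read to the end in chunks; a truncated stream raises EOFError.
            while stream.read(1 << 20):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        return False


# Made-up payload standing in for filtered backup data.
payload = b"example dbspace page contents " * 5000
full = gzip.compress(payload)
truncated = full[: len(full) // 2]  # simulate a transfer that stopped early

print("full stream complete:", gzip_stream_is_complete(full))
print("truncated stream complete:", gzip_stream_is_complete(truncated))
```

This shows what the filter sees when its input ends early; it does not say why the read from the storage manager stopped, which is what the complete logs would be needed for.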