tracing - Python script randomly failing - What tools can I use to determine the cause?


I have a Python script that I wrote as part of a pool automation project. I have made many changes to it over time, improving it and adding functionality. As a result, I had not had the opportunity to let it run for long periods of time until now, when I (pretty much) got it where I wanted it. Now that I have let it run for some time, it is randomly failing and restarting (via watchdog support).

I am running the script on a Raspberry Pi 3 via systemd, and it includes watchdog support since I want/need it to run all of the time. The watchdog is catching the script when it fails and restarting it as it is supposed to, but I would rather figure out what is causing the script to fail in the first place.
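For reference, systemd's notify protocol amounts to sending short datagrams to the Unix socket named in the NOTIFY_SOCKET environment variable. The following is a minimal sketch of that mechanism, not the script's actual code; the helper name `sd_notify` and its `path` parameter are illustrative assumptions.

```python
import os
import socket


def sd_notify(message, path=None):
    """Send one notification datagram to systemd.

    Returns True if the message was sent, False if no notify socket
    is configured (for example when run by hand from a shell).
    `path` is a hypothetical override, mainly useful for testing.
    """
    path = path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.sendto(message.encode("ascii"), path)
    finally:
        sock.close()
    return True
```

With a helper like this, the service would send `sd_notify("READY=1")` once at startup and `sd_notify("WATCHDOG=1")` on every pass through its main loop.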

The script connects to a MySQL database, grabs information on the level of the swimming pool and how many watts are in use by the pool pump, and determines if we need to fill the pool. If we do, we use a relay to open a sprinkler valve attached to the pool; if not, we do not do anything. We also check to see if the sprinklers are running, if the pool pump is running, and if someone has thrown the physical disconnect switch. It has a number of status LEDs that we use and a couple of switches, along with an LCD screen that we talk to via serial from the Pi.
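To make the decision flow concrete, here is a rough sketch of how those checks might combine. It is not taken from the script: the function name `should_fill`, the 100 W pump threshold, and the rule that running sprinklers, a running pump, or a thrown disconnect switch each block a fill are all assumptions for illustration.

```python
def should_fill(level_ok, sprinklers_running, pump_watts,
                disconnect_thrown, pump_on_watts=100):
    """Return True only when the pool is low and nothing blocks a fill.

    pump_on_watts is a hypothetical threshold above which the pump is
    considered to be running; the real script may use a different rule.
    """
    pump_running = pump_watts >= pump_on_watts
    if level_ok:
        # Pool level is fine, so there is nothing to do.
        return False
    # Assumed rule: any of these conditions prevents opening the valve.
    return not (sprinklers_running or pump_running or disconnect_thrown)
```

Keeping the decision in one pure function like this also makes it easy to test away from the Pi, without touching GPIO or the database.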

This script is pretty much the only thing running on the Pi other than sshd and system stuff... no Apache, no Node-RED, no FTP, etc.

I have an SSH session open to the Pi and that session never fails, even when the script fails. A continuous ping to the Pi shows 0 packet loss, even when the script fails. The syslog shows the following when the script fails and is restarted:

jun  6 08:08:56 scruffy systemd[1]: unit pool_control.service entered failed state.
jun  6 08:08:57 scruffy systemd[1]: pool_control.service holdoff time over, scheduling restart.
jun  6 08:08:57 scruffy systemd[1]: stopping installing python script pool fill control /w watchdog...
jun  6 08:08:57 scruffy systemd[1]: starting installing python script pool fill control /w watchdog...
jun  6 08:08:58 scruffy systemd[1]: started installing python script pool fill control /w watchdog.
jun  6 08:08:58 scruffy kernel: [34864.219647] gpiomem-bcm2835 3f200000.gpiomem: gpiomem device opened.

dmesg shows this when the script fails and is restarted:

[    8.938912] gpiomem-bcm2835 3f200000.gpiomem: gpiomem device opened.
[34864.219647] gpiomem-bcm2835 3f200000.gpiomem: gpiomem device opened.

The log from the program shows nothing out of the ordinary:

2016-06-06 13:26:24,387 info notify socket = /run/systemd/notify
2016-06-06 13:26:24,616 debug pushbullet notification sent - pool fill control started
2016-06-06 13:26:24,617 info pool_fill_control.py v2.6 (2016-06-05) started
2016-06-06 13:26:25,182 debug sprinklers not running (rachio).
2016-06-06 13:26:25,183 debug sprinkler_run_led should off. blue led
2016-06-06 13:26:25,184 debug watchdog ping sent
2016-06-06 13:26:25,611 debug get_pool_level returned 1
2016-06-06 13:26:25,764 debug pool_pump_running_watts returned 12 watts in use pump.
2016-06-06 13:26:25,765 debug pump_run_led should off. yellow led
2016-06-06 13:26:25,766 debug pool_filling_led should off. blue led
2016-06-06 13:26:25,766 debug pool level ok (pfc_level_ok) sent mightyhat

While the script is running, here is the output of top:

top - 13:29:36 up 15:01,  3 users,  load average: 0.05, 0.07, 0.05
tasks: 119 total,   1 running, 118 sleeping,   0 stopped,   0 zombie
%cpu(s):  0.7 us,  1.2 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
kib mem:    947760 total,   390032 used,   557728 free,   114444 buffers
kib swap:   102396 total,        0 used,   102396 free.    97648 cached mem

And meminfo:

root scruffy: log #  cat /proc/meminfo
memtotal:         947760 kb
memfree:          558160 kb
memavailable:     864020 kb
buffers:          114460 kb
cached:            97640 kb
swapcached:            0 kb
active:           202888 kb
inactive:          31192 kb
active(anon):      23672 kb
inactive(anon):     6140 kb
active(file):     179216 kb
inactive(file):    25052 kb
unevictable:        1744 kb
mlocked:            1744 kb
swaptotal:        102396 kb
swapfree:         102396 kb
dirty:                16 kb
writeback:             0 kb
anonpages:         23844 kb
mapped:            19188 kb
shmem:              6424 kb
slab:             140780 kb
sreclaimable:     132312 kb
sunreclaim:         8468 kb
kernelstack:        1000 kb
pagetables:          668 kb
nfs_unstable:          0 kb
bounce:                0 kb
writebacktmp:          0 kb
commitlimit:      576276 kb
committed_as:      92620 kb
vmalloctotal:    1114112 kb
vmallocused:           0 kb
vmallocchunk:          0 kb
cmatotal:           8192 kb
cmafree:            3736 kb

Here is some more system information:

root scruffy: log #  uptime
 13:41:58 up 15:14,  3 users,  load average: 0.02, 0.04, 0.05

root scruffy: log #  uname -a
linux scruffy 4.4.9-v7+ #884 smp fri may 6 17:28:59 bst 2016 armv7l gnu/linux

Here is the systemd startup/shutdown script:

# This script starts and stops our pool fill control Python script

[Unit]
Description=installing python script pool fill control /w watchdog
Requires=basic.target
After=multi-user.target

[Service]
Type=notify
WatchdogSec=70s
ExecStart=/usr/bin/python /root/pool_control/pool_fill_control.py
ExecStop=/root/pool_control/setupgpio.sh
Restart=always

# The number of times the service is restarted within a time period can be set.
# If that condition is met, the RPi can be rebooted.
#
StartLimitBurst=4
StartLimitInterval=180s
# Actions can be none|reboot|reboot-force|reboot-immediate
StartLimitAction=none

# The following are defined in the /etc/systemd/system.conf file and are
# global for all services:
#
#DefaultTimeoutStartSec=90s
#DefaultTimeoutStopSec=90s
#
# They can also be set on a per-process basis here.
# If they are not defined here, they fall back to the system.conf values.
TimeoutStartSec=2s
TimeoutStopSec=2s

[Install]
WantedBy=multi-user.target
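With a 70 second watchdog setting, systemd expects a WATCHDOG=1 notification at least once every 70 seconds and treats the service as failed otherwise. systemd exports the timeout to the service in microseconds as the WATCHDOG_USEC environment variable, and the usual recommendation is to ping at half that interval. A small sketch of deriving the ping interval follows; the function name `watchdog_ping_interval` and its `environ` parameter are hypothetical, not part of the script.

```python
import os


def watchdog_ping_interval(environ=os.environ):
    """Return the recommended seconds between watchdog pings.

    Reads WATCHDOG_USEC (set by systemd when a watchdog timeout is
    configured) and returns half of it in seconds, or None if no
    watchdog is configured for this process.
    """
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1e6 / 2.0
```

For the 70 second setting this gives 35 seconds, so any single pass of the main loop that blocks longer than 70 seconds (a slow database or network call, for instance) would be enough for systemd to restart the service.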

I have tried running it on a fresh install of Jessie and moving it to another Pi, with the same results: after an indeterminate amount of time, the script fails and the watchdog restarts it.

The script in question is quite long, and I am not sure of the proper procedure for posting it here, so I have it on GitHub here:

https://github.com/rjsears/pool_fill_control/blob/master/pool_fill_control.py

I am looking for guidance on how to troubleshoot the code to determine what may be causing it to fail, or whether I have some egregious code that jumps right out at someone with more Python experience. I do not have much experience, and this is my first (what I consider real) Python script.
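One low-cost first step, sketched here as an assumption rather than something the script already does, is to route any uncaught exception and its traceback into the existing program log, so that a crash leaves evidence behind. The names `make_excepthook` and `install` are hypothetical, as is the logger name.

```python
import logging
import sys


def make_excepthook(logger):
    """Build a sys.excepthook replacement that logs the full traceback."""
    def hook(exc_type, exc_value, exc_tb):
        # exc_info makes the logging module append the formatted traceback.
        logger.critical("Unhandled exception",
                        exc_info=(exc_type, exc_value, exc_tb))
    return hook


def install(logger_name="pool_fill_control"):
    """Install the hook for the main thread and return it."""
    hook = make_excepthook(logging.getLogger(logger_name))
    sys.excepthook = hook
    return hook
```

Calling `install()` once at startup, after logging is configured, would be enough. A hook like this only fires for uncaught exceptions in the main thread. It will not run if the process is killed by a signal, for example when the systemd watchdog times out, so a silent log at the time of failure is itself a hint that the script hung rather than crashed.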

Eventually I want to interface it with an internal website and duplicate the physical functions (button presses, LEDs) via a webpage, but I want the script to work correctly before going any further.

Any help or direction would be appreciated!