OS/Comm Failures
Posted: 17 Aug 2022, 23:50
I have dozens of RevPis deployed, and I'm having issues with devices becoming unreachable on the network. This has happened a handful of times over the past couple of years, but it has become more frequent in the last couple of weeks. We did recently run 'apt update' and 'apt dist-upgrade' on some of these devices, but the issue has also been seen on devices that haven't been updated recently.

When this occurs, I'm unable to ping the devices on either Ethernet interface. The LEDs remain in their normal state. Cycling power resolves the issue. Looking through the log files, it appears the OS stops functioning (e.g. cron jobs don't run as scheduled). In one case I found an entry in kern.log (pasted below), but in most cases there is nothing of note.

It has happened to several devices installed across multiple networks and locations. I have Cores, Connects, and Compacts deployed, but this issue is occurring on Compacts running Buster. Any advice or recommended troubleshooting steps are appreciated.
Aug 12 09:35:25 revpi kernel: [244060.084807] ------------[ cut here ]------------
Aug 12 09:35:25 revpi kernel: [244060.084819] WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:2498 set_task_cpu+0x23c/0x31c
Aug 12 09:35:25 revpi kernel: [244060.084843] Modules linked in: sha256_generic cfg80211 rfkill 8021q garp stp llc raspberrypi_hwmon snd_bcm2835(C) snd_pcm snd_timer snd bcm2835_isp(C) bcm2835_codec(C) v4l2_mem2mem bcm2835_v4l2(C) videobuf2_dma_contig bcm2835_mmal_vchiq(C) vc_sm_cma(C) videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc uio_pdrv_genirq uio spidev ks8851_spi ks8851_common eeprom_93cx6 piControl(O) ad5446 ti_dac082s085 mcp320x iio_mux mux_gpio mux_core fixed gpio_74x164 spi_bcm2835aux spi_bcm2835 gpio_max3191x crc8 industrialio i2c_dev ip_tables x_tables ipv6
Aug 12 09:35:25 revpi kernel: [244060.084975] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G C O 5.10.103-rt62-v7 #1
Aug 12 09:35:25 revpi kernel: [244060.084982] Hardware name: BCM2835
Aug 12 09:35:25 revpi kernel: [244060.084986] Backtrace:
Aug 12 09:35:25 revpi kernel: [244060.084991] [<80a55220>] (dump_backtrace) from [<80a555ac>] (show_stack+0x20/0x24)
Aug 12 09:35:25 revpi kernel: [244060.085008] r7:80feace4 r6:00000000 r5:60000193 r4:80feace4
Aug 12 09:35:25 revpi kernel: [244060.085011] [<80a5558c>] (show_stack) from [<80a59918>] (dump_stack+0xbc/0xe8)
Aug 12 09:35:25 revpi kernel: [244060.085024] [<80a5985c>] (dump_stack) from [<80120010>] (__warn+0xfc/0x114)
Aug 12 09:35:25 revpi kernel: [244060.085041] r9:00000009 r8:801562d0 r7:000009c2 r6:00000009 r5:801562d0 r4:80d0f298
Aug 12 09:35:25 revpi kernel: [244060.085044] [<8011ff14>] (__warn) from [<80a55c0c>] (warn_slowpath_fmt+0x70/0xc0)
Aug 12 09:35:25 revpi kernel: [244060.085057] r7:000009c2 r6:80d0f298 r5:80f07808 r4:00000000
Aug 12 09:35:25 revpi kernel: [244060.085060] [<80a55ba0>] (warn_slowpath_fmt) from [<801562d0>] (set_task_cpu+0x23c/0x31c)
Aug 12 09:35:25 revpi kernel: [244060.085074] r9:00000000 r8:b6b57180 r7:80f01d2c r6:80f07858 r5:00000001 r4:8178e000
Aug 12 09:35:25 revpi kernel: [244060.085077] [<80156094>] (set_task_cpu) from [<8016da90>]
Aug 12 18:47:19 revpi kernel: [ 0.000000] Booting Linux on physical CPU 0x0
Aug 12 18:47:19 revpi kernel: [ 0.000000] Linux version 5.10.103-rt62-v7 (support@kunbus.com) (arm-linux-gnueabihf-gcc (Debian 8.3.0-2) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP PREEMPT_RT Tue, 24 May 2022 11:41:05 +0000
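For anyone who wants to narrow down when a device stopped logging, here is a minimal sketch of how the silent window before a power cycle could be measured from kern.log. It is only an illustration, not a diagnosis. The function names (parse_stamp, find_gaps), the 30-minute threshold, and the year are my own assumptions; syslog-style stamps carry no year, so one has to be supplied. It looks for large jumps between consecutive timestamps and notes whether the line after the jump is a "Booting Linux" line.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year=2022):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog-style line.

    The year is an assumption, because this log format does not record it.
    Returns None for lines that do not start with a valid stamp.
    """
    parts = line.split()
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def find_gaps(lines, threshold=timedelta(minutes=30), year=2022):
    """Return (last_seen, next_seen, followed_by_boot) for each silent window.

    A window is reported when two consecutive stamped lines are further
    apart than the (hypothetical) threshold.
    """
    gaps = []
    prev = None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue  # skip continuation lines or anything unparseable
        if prev is not None and stamp - prev > threshold:
            gaps.append((prev, stamp, "Booting Linux" in line))
        prev = stamp
    return gaps


if __name__ == "__main__":
    # Two lines taken from the log excerpt above.
    sample = [
        "Aug 12 09:35:25 revpi kernel: [244060.085077] [<80156094>] (set_task_cpu) from [<8016da90>]",
        "Aug 12 18:47:19 revpi kernel: [ 0.000000] Booting Linux on physical CPU 0x0",
    ]
    for last_seen, next_seen, rebooted in find_gaps(sample):
        print(last_seen, "->", next_seen, "silent for", next_seen - last_seen,
              "(boot follows)" if rebooted else "")
```

On a real device, the same function could be fed the lines of /var/log/kern.log or /var/log/syslog. Comparing the silent windows across several affected units might show whether the hangs cluster around a particular time or event.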