
RevPi NICs failing

Posted: 25 Oct 2023, 15:14
by mcww_rd
We have several deployed systems with Revolution Pis in them. Every once in a while, some of them become unreachable over the network. After investigating for a while and trying changes to our code, config, enclosure design, etc., we're out of ideas as to what we could be doing to cause this.

Code:

Oct 16 07:03:48 RevPi79982 kernel: [469514.653782] ------------[ cut here ]------------ 
Oct 16 07:03:48 RevPi79982 kernel: [469514.653812] WARNING: CPU: 0 PID: 12 at net/sched/sch_generic.c:468 dev_watchdog+0x36c/0x380
Oct 16 07:03:48 RevPi79982 kernel: [469514.653831] NETDEV WATCHDOG: eth0 (smsc95xx): transmit queue 0 timed out
Oct 16 07:03:48 RevPi79982 kernel: [469514.653836] Modules linked in: cdc_acm cfg80211 rfkill 8021q garp stp llc rtc_pcf2127 regmap_spi raspberrypi_hwmon bcm2835_isp(C) snd_bcm2835(C) snd_pcm bcm2835_codec(C) snd_timer v4l2_mem2mem snd videobuf2_dma_contig bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) vc_sm_cma(C) videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc rpivid_mem nvmem_rmem uio_pdrv_genirq uio ks8851_spi ks8851_common eeprom_93cx6 piControl(O) ad5446 ti_dac082s085 mcp320x iio_mux mux_gpio mux_core gpio_74x164 spi_bcm2835aux spi_bcm2835 gpio_max3191x crc8 industrialio i2c_dev ip_tables x_tables ipv6
Oct 16 07:03:48 RevPi79982 kernel: [469514.653971] CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G         C O      5.10.120-rt70-v7l #1
Oct 16 07:03:48 RevPi79982 kernel: [469514.653977] Hardware name: BCM2835
Oct 16 07:03:48 RevPi79982 kernel: [469514.653980] Backtrace:
Oct 16 07:03:48 RevPi79982 kernel: [469514.653984] [<c0bf1968>] (dump_backtrace) from [<c0bf1cf8>] (show_stack+0x20/0x24)
Oct 16 07:03:48 RevPi79982 kernel: [469514.653998]  r7:c14ebe48 r6:00000000 r5:60000013 r4:c14ebe48
Oct 16 07:03:48 RevPi79982 kernel: [469514.654000] [<c0bf1cd8>] (show_stack) from [<c0bf6210>] (dump_stack+0xbc/0xe8)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654010] [<c0bf6154>] (dump_stack) from [<c0222034>] (__warn+0xfc/0x114)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654023]  r9:00000009 r8:c0aca9b4 r7:000001d4 r6:00000009 r5:c0aca9b4 r4:c10c79f0
Oct 16 07:03:48 RevPi79982 kernel: [469514.654025] [<c0221f38>] (__warn) from [<c0bf2484>] (warn_slowpath_fmt+0xa4/0xc0)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654035]  r7:000001d4 r6:c10c79f0 r5:c1407848 r4:c10c79b4
Oct 16 07:03:48 RevPi79982 kernel: [469514.654037] [<c0bf23e4>] (warn_slowpath_fmt) from [<c0aca9b4>] (dev_watchdog+0x36c/0x380)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654048]  r9:00000000 r8:c2d4ff00 r7:c2d78000 r6:c2d78294 r5:c2d78300 r4:c1405100
Oct 16 07:03:48 RevPi79982 kernel: [469514.654050] [<c0aca648>] (dev_watchdog) from [<c02ba6c4>] (call_timer_fn+0x50/0x250)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654062]  r10:00000000 r9:ef721740 r8:02cbf3a0 r7:c0aca648 r6:c2d78300 r5:00000000
Oct 16 07:03:48 RevPi79982 kernel: [469514.654065]  r4:c15480e0
Oct 16 07:03:48 RevPi79982 kernel: [469514.654067] [<c02ba674>] (call_timer_fn) from [<c02bc1f0>] (run_timer_softirq+0x530/0x7cc)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654076]  r9:ef721740 r8:c0aca648 r7:c195fe7c r6:c2d78300 r5:02cbf3a0 r4:00000000
Oct 16 07:03:48 RevPi79982 kernel: [469514.654079] [<c02bbcc0>] (run_timer_softirq) from [<c020160c>] (__do_softirq+0x18c/0x4c0)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654089]  r10:00000082 r9:00000000 r8:ffffe000 r7:c1547a60 r6:00000001 r5:00000002
Oct 16 07:03:48 RevPi79982 kernel: [469514.654092]  r4:c1403084
Oct 16 07:03:48 RevPi79982 kernel: [469514.654094] [<c0201480>] (__do_softirq) from [<c022a698>] (run_ksoftirqd+0x58/0xe0)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654103]  r10:c191fddc r9:00000000 r8:00000001 r7:c140fc9c r6:00000000 r5:c1901d40
Oct 16 07:03:48 RevPi79982 kernel: [469514.654105]  r4:ffffe000
Oct 16 07:03:48 RevPi79982 kernel: [469514.654107] [<c022a640>] (run_ksoftirqd) from [<c024e610>] (smpboot_thread_fn+0x1e8/0x33c)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654115] [<c024e428>] (smpboot_thread_fn) from [<c024a024>] (kthread+0x1d4/0x1f8)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654125]  r9:c1901d40 r8:c024e428 r7:c1954200 r6:c1901d80 r5:c195e000 r4:c1901dc0
Oct 16 07:03:48 RevPi79982 kernel: [469514.654128] [<c0249e50>] (kthread) from [<c0200114>] (ret_from_fork+0x14/0x20)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654135] Exception stack(0xc195ffb0 to 0xc195fff8)
Oct 16 07:03:48 RevPi79982 kernel: [469514.654139] ffa0:                                     00000000 00000000 00000000 00000000
Oct 16 07:03:48 RevPi79982 kernel: [469514.654143] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Oct 16 07:03:48 RevPi79982 kernel: [469514.654147] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
Oct 16 07:03:48 RevPi79982 kernel: [469514.654151]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0249e50
Oct 16 07:03:48 RevPi79982 kernel: [469514.654154]  r4:c1901d80
Oct 16 07:03:48 RevPi79982 kernel: [469514.654157] ---[ end trace 0000000000000002 ]---
It looks like the NIC dies until we kick it.
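
To spot this event without reading the whole backtrace, the one line that matters is the `NETDEV WATCHDOG` message. Below is a minimal sketch that pulls the interface and driver name out of such lines. The function name `watchdog_events` is made up for illustration, and the pattern assumes the message format shown in the log above.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print "iface driver" for every
# NETDEV WATCHDOG transmit-timeout message (format as in the log above).
# watchdog_events is a hypothetical helper name, not part of any RevPi tool.
watchdog_events() {
    sed -n 's/.*NETDEV WATCHDOG: \([^ ]*\) (\([^)]*\)): transmit queue [0-9]* timed out.*/\1 \2/p'
}

# Possible usage on a device (journalctl -k prints kernel messages):
#   journalctl -k --no-pager | watchdog_events | sort | uniq -c
```

Fed the log above, this should print `eth0 smsc95xx` once per timeout, which is enough to confirm which interface and driver are affected.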

Re: RevPi NICs failing

Posted: 25 Oct 2023, 15:39
by nicolaiB
Hi

Which image version and kernel are you using?

image version:
cat /etc/revpi/image-release

kernel version:
uname -a
dpkg -l | grep raspberrypi-kernel

Nicolai

Re: RevPi NICs failing

Posted: 25 Oct 2023, 18:24
by mcww_rd
nicolaiB wrote: 25 Oct 2023, 15:39 Hi

Which image version and kernel are you using?

image version:
cat /etc/revpi/image-release

kernel version:
uname -a
dpkg -l | grep raspberrypi-kernel

Nicolai

Code:

2022-07-28-revpi-buster.img
Linux RevPi79982 5.10.120-rt70-v7l #1 SMP PREEMPT_RT Thu, 28 Jul 2022 10:36:48 +0200 armv7l GNU/Linux
ii  raspberrypi-kernel                   1:9.20220728-5.10.120+revpi1            armhf        Revolution Pi Linux kernel

Note we've seen this same issue across various kernels and images since last year.

Re: RevPi NICs failing

Posted: 26 Oct 2023, 08:36
by nicolaiB
There were some improvements to the smsc95xx driver in the latest releases. Could you please install the updates on one of the RevPis and check whether this improves stability?

Nicolai

Re: RevPi NICs failing

Posted: 26 Oct 2023, 14:43
by mcww_rd
nicolaiB wrote: 26 Oct 2023, 08:36 There were some improvements to the smsc95xx driver in the latest releases. Could you please install the updates on one of the RevPis and check whether this improves stability?

Nicolai

Which updates in particular? Just apt, raspi-config, kernel, all?

There's also the issue of testing. We don't know what the root cause is, so we can't reproduce the fault. How would we know whether the update fixes our problem, other than waiting around for months and hoping it doesn't happen again?
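
One way to make the waiting at least measurable would be to count the watchdog timeouts per device and compare updated units against units left as they are. The sketch below assumes the kernel logs from several devices are collected in classic syslog format (hostname in the fourth field, as in the log posted earlier). `timeouts_per_host` is a hypothetical name and the log collection itself is not shown.

```shell
#!/bin/sh
# Count NETDEV WATCHDOG transmit timeouts per host from syslog-style
# lines on stdin. Assumes "Mon DD HH:MM:SS hostname kernel: ..." so the
# hostname is field 4. Prints "hostname count", sorted by hostname.
# timeouts_per_host is a hypothetical helper name used for illustration.
timeouts_per_host() {
    awk '/NETDEV WATCHDOG: .* transmit queue [0-9]+ timed out/ { n[$4]++ }
         END { for (h in n) print h, n[h] }' | sort
}

# Possible usage on collected logs (file names are placeholders):
#   cat collected-logs/*.log | timeouts_per_host
```

This does not shorten the wait, but a difference in counts between updated and non-updated devices over the same period would be some evidence either way.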

Re: RevPi NICs failing

Posted: 07 Nov 2023, 20:39
by mcww_rd
Is there a way to escalate this? We're receiving no support and our e-mails are going unanswered. We're looking at changing suppliers if we don't get to the bottom of this.

Re: RevPi NICs failing

Posted: 08 Nov 2023, 07:41
by dirk
Dear mcww_rd,
could you please send an SOS report to support@kunbus.com?
Where did you send your e-mails? Could you provide a ticket number, for example privately by e-mail together with the SOS report?
Could you also provide more details about the setup, for example a sketch?

Thank you