Kernel module + Core 3 questions
Posted: 19 May 2017, 10:22
Hello,
first of all thank you for the great product!
To better understand how to develop applications for the RevPi, I've taken a look at the current kernel module source and have a couple of questions about it:
- The kernel module parses the configuration JSON file directly from the filesystem, using the set_fs(KERNEL_DS) trick to read the file through the VFS API. Wouldn't it be better for the kernel module to provide a configfs-based interface for configuring the driver (see e.g. the LIO iSCSI target or the USB Gadget support for examples of configfs used for kernel configuration), with a userspace helper that parses the current JSON configuration and maps it to the configfs backend? (Note that set_fs is likely to be removed from the kernel at some point in the future, see https://lwn.net/Articles/722267/ for details.)
- Why use a single global lock, piDev_g.lockPI, for every access to the process image? Values read from or written to that region are typically 4 bytes or smaller, so atomic operations would be enough to keep those accesses consistent. Changing individual values could be sped up quite a bit by relying on atomic operations on the kernel <-> userland side. Sure, large read/write calls would still need the lock, and the gate thread would have to reimplement its copy operation as a loop of atomic loads / stores (in addition to taking the lock). I haven't measured it, but I suspect this would improve performance for the case where the user only wants to set a single value.
- Along the same lines: wouldn't it also be much faster to be able to mmap() the process image into a program's address space and use direct memory operations from the program, instead of having to execute system calls? Sure, if an application wants consistency for areas larger than 4 bytes (the largest atomic op on 32-bit ARM), it would still have to resort to read()/write() system calls. But for just switching a single input / output, or for reading/writing single values of 4 bytes or less from/to fieldbuses, this could speed up applications even further, because you'd save one or even two context switches to the kernel. Furthermore, the bounds checks could all happen on the initial mmap() system call (+ page faults by the CPU), further increasing performance.
- Why implement waiting for events via an ioctl, KB_WAIT_FOR_EVENT? Why not make the file descriptor pollable via select/poll/epoll? This would allow easy integration into existing event loops.
- The Core 3 has an ARMv8 chip, but Raspbian only supports the 32-bit architecture (running the chip in what is effectively ARMv7 mode). I would expect that the Revolution Pi also only works in 32-bit mode? Since there's only 1 GiB of RAM anyway, this isn't a limitation from that perspective. However, the ARMv8 architecture supports additional instructions (especially related to floating-point arithmetic) in ARMv8 mode. Do you know if userspace applications are able to make use of these processor improvements even in 32-bit mode?
- Alternatively, Debian itself appears to run natively (without any Raspbian patches) on the Raspberry Pi 3 in ARMv8 mode (with aarch64 as architecture), at least if I read the wiki correctly - haven't tried it myself. Are there any plans to support the Revolution Core 3 under aarch64 based directly off Debian (+ some firmware packages)? Especially since Debian Stretch (and by extension jessie-backports) now has kernel 4.9 + RT patch already included on some architectures, albeit not ARM?
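
To make the mmap() / atomic access idea from the questions above a bit more concrete, here is a rough userspace sketch in C. It assumes a hypothetical mmap() handler in the driver, which the questions above only propose. The helper names (map_image, image_store_u32, image_load_u32) are made up for illustration, and the __atomic builtins are GCC/Clang specific. Treat it as an illustration of the idea under those assumptions, not as a description of how piControl works.

```c
/* Sketch only: assumes a HYPOTHETICAL mmap() handler in the driver. */
#define _POSIX_C_SOURCE 200809L /* POSIX declarations in strict ISO C mode */

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `size` bytes of `path` shared and read/write. Returns NULL on failure.
 * The caller passes the device path; nothing here is specific to piControl. */
static void *map_image(const char *path, size_t size)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    void *image = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    return image == MAP_FAILED ? NULL : image;
}

/* Return 0 if a 4-byte access at `offset` is aligned and in bounds, else -1. */
static int check_u32(size_t size, size_t offset)
{
    if (size < 4 || offset > size - 4 || offset % 4 != 0)
        return -1;
    return 0;
}

/* Atomically store one 32-bit value into the image, without a system call.
 * `image` itself must be 4-byte aligned (mmap() returns page-aligned memory). */
static int image_store_u32(void *image, size_t size, size_t offset,
                           uint32_t value)
{
    assert(image != NULL);
    if (check_u32(size, offset) != 0)
        return -1;
    uint32_t *slot = (uint32_t *)((unsigned char *)image + offset);
    __atomic_store_n(slot, value, __ATOMIC_RELEASE);
    return 0;
}

/* Atomically load one 32-bit value from the image into `*out`. */
static int image_load_u32(void *image, size_t size, size_t offset,
                          uint32_t *out)
{
    assert(image != NULL && out != NULL);
    if (check_u32(size, offset) != 0)
        return -1;
    uint32_t *slot = (uint32_t *)((unsigned char *)image + offset);
    *out = __atomic_load_n(slot, __ATOMIC_ACQUIRE);
    return 0;
}
```

With such a mapping in place, setting a single output would come down to one image_store_u32() call on the mapped region, with the offset taken from the configuration. Anything larger than 4 bytes that needs to be consistent would still go through read()/write() and the lock.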
Regards,
Christian