@crispyoz regarding user-space DT overlays: there is a much more convenient solution by now (it actually seems to have existed for a long time already), the dtbocfg loadable kernel module by Ichiro Kawazome, which I packaged for OpenWrt here.
The only thing required for that to work is that the kernel is compiled with CONFIG_OF_OVERLAY=y. Much easier than patching the kernel sources.
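A quick way to double-check that requirement, as a minimal sketch only: it assumes the kernel configuration is available as plain text (for example the kernel `.config` from the build tree), and the helper name `has_of_overlay` is made up for illustration.

```python
import sys


def has_of_overlay(config_text: str) -> bool:
    """Return True if CONFIG_OF_OVERLAY=y is present as an active line.

    Commented-out entries such as '# CONFIG_OF_OVERLAY is not set'
    do not count.
    """
    for line in config_text.splitlines():
        if line.strip() == "CONFIG_OF_OVERLAY=y":
            return True
    return False


# Only runs when a config file path is passed on the command line
# (the path is whatever your build tree uses; nothing is assumed here).
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        enabled = has_of_overlay(f.read())
    print("CONFIG_OF_OVERLAY enabled:", enabled)
```

Run it on the build host with the path to the kernel config as the only argument.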
This all works fine, but I have a general problem with 24.10 on the MT7688 (not on other targets): it consumes way too much CPU time doing I/O, and I have not been able to track it down yet. Performance is normal for everything else, but as soon as there is slightly more I/O going on, say a network download AND a serial output streaming some data, it sometimes stalls for seconds. I have no idea yet what is causing that. Does 24.10 run smoothly for you?
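One possible way to narrow down where the CPU time goes, as a rough sketch and not a diagnosis: compare two snapshots of the aggregate `cpu` line in `/proc/stat` and look at how the elapsed ticks split between user, system, iowait, irq and softirq. The function names below are made up for illustration, and the script can be run on any machine against `/proc/stat` text captured on the device.

```python
import time

# First seven counters of the aggregate "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(stat_text: str) -> dict:
    """Extract the aggregate 'cpu' counters from /proc/stat content."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_shares(before: dict, after: dict) -> dict:
    """Return each field's fraction of the ticks elapsed between snapshots."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: v / total for k, v in delta.items()}


def sample_proc_stat(interval: float = 5.0) -> dict:
    """Take two live /proc/stat snapshots `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_cpu_line(f.read())
    return cpu_shares(before, after)
```

Calling `sample_proc_stat()` while the download and the serial stream are both running gives a rough picture of where the time is spent. A large irq or softirq share would point towards interrupt handling, a large iowait share towards blocked I/O, but that is only a starting point.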