The LibreSOC hybrid 3D CPU-VPU-GPU is intended to significantly reduce hardware complexity, software (driver) complexity and systems integration effort, aimed initially at embedded and mobile environments.
Larrabee, or more specifically Nyuzi, showed that a software-only "Traditional Vector Processor" architecture makes for a fantastic High Performance Compute Engine that, unfortunately, also turns out to deliver only 25% of the performance/watt of current competitive embedded mobile GPUs. Beyond that, SIMD, despite being (seductively) easy for hardware engineers to implement, has been shown to have harmful consequences at the software level (loop setup and loop-end cleanup). A recent patch to glibc adding a POWER9 VSX strncpy ran to a whopping 250 hand-crafted assembly instructions, where the equivalent using Cray Vector principles is around 14.
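The setup and cleanup cost can be illustrated with a minimal Python model. This is only a sketch of the general idea, not POWER9 VSX, Simple-V or the glibc patch itself: the names copy_simd, copy_vector, SIMD_WIDTH and MAXVL are hypothetical, and list slices stand in for single hardware operations.

```python
SIMD_WIDTH = 4  # hypothetical fixed SIMD lane count (assumption)
MAXVL = 4       # hypothetical maximum vector length (assumption)


def copy_simd(src, n):
    """Fixed-width SIMD style: full-width main loop plus a scalar tail."""
    dst = [0] * n
    i = 0
    # Main loop: only whole SIMD_WIDTH-element chunks can be handled here.
    while i + SIMD_WIDTH <= n:
        dst[i:i + SIMD_WIDTH] = src[i:i + SIMD_WIDTH]  # one "SIMD" operation
        i += SIMD_WIDTH
    # Cleanup loop: leftover elements are processed one at a time.
    while i < n:
        dst[i] = src[i]
        i += 1
    return dst


def copy_vector(src, n):
    """Variable-length vector style: each pass states how many elements to process."""
    dst = [0] * n
    i = 0
    while i < n:
        vl = min(MAXVL, n - i)         # models a "set vector length" step
        dst[i:i + vl] = src[i:i + vl]  # one vector operation on vl elements
        i += vl
    return dst
```

Both functions produce the same result, but the variable-length version needs no separate cleanup loop: the final, shorter pass is simply the same operation with a smaller vl.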
All of this was solved decades ago by Cray Vector designs, and then forgotten. Only now is variable-length Vectorisation being rediscovered and deployed in modern architectures: RISC-V RVV, ARM SVE2 and also Simple-V. This talk therefore goes through the background and concepts behind Simple-V (SV). Thanks to a grant from NLnet, SV will be formally documented and proposed as an extension to OpenPOWER, for review by the OpenPOWER Foundation.
Speakers: Luke Kenneth Casson Leighton