Torvalds’ advice to Intel is to focus on things that matter instead of wasting resources on new instruction sets. He is referring to AVX-512, which he feels isn’t beneficial outside the HPC market. Torvalds made the remarks in a discussion about “Alder Lake” lacking AVX-512.
He also cautioned against placing too much weight on floating-point performance benchmarks, especially those that take advantage of exotic new instruction sets whose implementation is fragmented and varies across product lines.
“I hope AVX512 dies a painful death, and that Intel starts fixing real problems instead of trying to create magic instructions to then create benchmarks that they can look good on. I hope Intel gets back to basics: gets their process working again, and concentrate more on regular code that isn’t HPC or some other pointless special case.”
Torvalds believes AVX2 is “more than enough” thanks to its proliferation. He advocated that processor manufacturers design better FPUs for their core designs so they don’t have to rely on instruction set-level optimization to eke out performance.
“I’ve said this before, and I’ll say it again: in the heyday of x86, when Intel was laughing all the way to the bank and killing all their competition, absolutely everybody else did better than Intel on FP loads. Intel’s FP performance sucked (relatively speaking), and it matter not one iota. Because absolutely nobody cares outside of benchmarks.”
After several more paragraphs, Torvalds reaches his conclusion:
“Stop with the special-case garbage, and make all the core common stuff that everybody cares about run as well as you humanly can.”
What Is Intel AVX-512?
Intel AVX-512 (Advanced Vector Extensions 512) is a set of new CPU instructions that impacts compute, storage, and network functions. The number 512 refers to the width, in bits, of each vector register, which determines how much data a single instruction can operate on at a time.
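To make the register width concrete, here is a minimal sketch of the arithmetic. The names REGISTER_BITS and lanes are made up for illustration; the register widths themselves (128 bits for SSE, 256 for AVX2, 512 for AVX-512) are the standard figures.

```python
# Width of one vector register, in bits, for each x86 SIMD extension.
REGISTER_BITS = {"SSE": 128, "AVX2": 256, "AVX-512": 512}


def lanes(register_bits: int, element_bits: int) -> int:
    """Return how many elements of the given size fit in one register."""
    return register_bits // element_bits


for name, bits in REGISTER_BITS.items():
    # 32-bit elements are single-precision floats, 64-bit are doubles.
    print(f"{name}: {lanes(bits, 32)} x float32, {lanes(bits, 64)} x float64")
```

So a single AVX-512 instruction can, at best, work on 16 single-precision or 8 double-precision values at once, twice as many as AVX2.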
In other words, AVX is an extension that speeds up certain floating-point operations. Its impact is mainly on workloads such as media transcoding. Basically, it’s a set of instructions that programmers can use to let a CPU process vectorizable calculations, where the same operation is applied to many data elements, really fast.
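As a rough illustration of what “vectorizable” means, the sketch below models a simple calculation (a * x + y over two lists) done one element at a time and then in chunks of 16, the number of 32-bit floats that fit in a 512-bit register. This is only a counting model: Python does not emit AVX instructions here, and the function names are hypothetical.

```python
def saxpy_scalar(a, x, y):
    """Compute a * x[i] + y[i] one element per step."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def saxpy_chunked(a, x, y, lanes=16):
    """Model a SIMD version: each step handles up to `lanes` elements.

    Returns the results and the number of steps taken.
    """
    out = []
    steps = 0
    for start in range(0, len(x), lanes):
        xs = x[start:start + lanes]
        ys = y[start:start + lanes]
        # On real hardware this whole chunk would be one vector operation.
        out.extend(a * xi + yi for xi, yi in zip(xs, ys))
        steps += 1
    return out, steps


x = [float(i) for i in range(64)]
y = [1.0] * 64
result, steps = saxpy_chunked(2.0, x, y)
print(len(x), "scalar steps vs", steps, "vector steps")
```

The results are identical either way; the difference is that the chunked version needs far fewer steps, which is where the speedup on real vector hardware comes from.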