In modern CPUs, I understand there are so-called SIMD instructions, i.e. one operation applied to multiple data items. Say I want to add 4 ints to another 4 ints stored in a vector or array: instead of adding them one by one, they can be loaded together into one register and added with a single instruction. How can I do this in C/C++? I know there must be some compiler support, since the compiler has to know that the target machine's CPU model supports such instructions, and the machine that compiles the code may differ from the machine that runs it, so this involves both building and running, much like building a toolchain. But at the very least, what can the instruction set or pseudo-assembly language do, and can I do this in a high-level language like C/C++?
Response
Here's a research plan for that topic. If you need to update it, let me know!
C/C++ SIMD Vector Addition
Research Websites
(1) Research how C/C++ compilers can automatically vectorize code to utilize SIMD instructions.
(2) Investigate compiler flags and optimization levels in popular C/C++ compilers (like GCC, Clang, MSVC) that enable or enhance SIMD usage.
(3) Find information about SIMD intrinsics available in C/C++ for different processor architectures (e.g., SSE, AVX for x86; NEON for ARM).
(4) Explore how to use these intrinsics to directly program SIMD operations in C/C++.
(5) Research methods in C/C++ to detect the CPU's supported SIMD instruction sets at runtime.
(6) Investigate how to write conditional code in C/C++ that uses different SIMD instruction sets based on the target CPU's capabilities.
(7) Look for examples of C/C++ code demonstrating vector or array addition using SIMD, both through compiler auto-vectorization and explicit intrinsics.
(8) Explore libraries or frameworks in C/C++ that provide higher-level abstractions for SIMD programming, potentially offering platform independence.
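As a rough illustration of what steps (3), (4), (6) and (7) are aiming at, here is a minimal sketch of adding two int arrays four elements at a time. It assumes an x86 target with SSE2 and uses the intrinsics from `<emmintrin.h>`; on any other target it falls back to a plain scalar loop. The function name `add_i32` and the `HAVE_SSE2` macro are made up for this example, and this is only one possible way to do it, not a definitive implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Compile-time selection: use SSE2 only if the compiler targets it.
// (HAVE_SSE2 is a hypothetical macro name chosen for this sketch.)
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics: __m128i, _mm_add_epi32, ...
#define HAVE_SSE2 1
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
void add_i32(const std::int32_t* a, const std::int32_t* b,
             std::int32_t* out, std::size_t n) {
    assert(a && b && out);
    std::size_t i = 0;
#ifdef HAVE_SSE2
    // Process 4 ints per iteration: two unaligned 128-bit loads,
    // one packed 32-bit add, one unaligned 128-bit store.
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        __m128i vsum = _mm_add_epi32(va, vb);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i), vsum);
    }
#endif
    // Scalar tail (and the whole loop when SSE2 is not available).
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Calling `add_i32(a, b, out, 4)` on two 4-element arrays performs the addition with a single `_mm_add_epi32` when SSE2 is available. The choice here is made at compile time, so it reflects the machine the code is built for, not necessarily the one it runs on; picking an instruction set at run time (step 5) needs a separate CPU feature check. Compilers such as GCC and Clang can also often vectorize the plain scalar loop on their own at higher optimization levels (for example `-O3`), which is what step (1) refers to.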