Changes

#2 (Dec 1, 2023, 11:53:07 AM)

  1. Add Kokkos::num_threads() and Kokkos::device_id() — Damien L-G / githubweb
  2. Check Kokkos::num_threads and device_id in tests — Damien L-G / githubweb
  3. Use std::sort for sorting within a bin when possible — Andrey Prokopenko / githubweb
  4. RangePolicyRequire was not using require — tccleve / githubweb
  5. Fix reducer result check for Serial+HPX ParallelReduce — Daniel Arndt / githubweb
  6. Define KOKKOS_ARCH_NVIDIA_GPU macro when targeting an NVIDIA GPU architecture — Damien L-G / githubweb
  7. Use KOKKOS_ARCH_NVIDIA_GPU macro in SYCL, OpenACC, and OpenMPTarget backends where appropriate — Damien L-G / githubweb
  8. Add ParallelScanRangePolicy test — tccleve / githubweb
  9. Address review comments — Andrey Prokopenko / githubweb
  10. Fix partition_master test — Daniel Arndt / githubweb
  11. Use CombinedReducer in HostIterateTile — Daniel Arndt / githubweb
  12. Desul atomics: fix bug max uint64_t value — Damien L-G / githubweb
  13. Make CUDA bhalf conversion code more forward compatible — Damien L-G / githubweb
  14. Let KOKKOS_ARCH_NVIDIA_GPU provide the Compute Capability — Damien L-G / githubweb
  15. Bring back git info to benchmarks output — cezary.skrzynski / githubweb
  16. [ci skip] Add a comment — Andrey Prokopenko / githubweb
  17. Rename KOKKOS_{ -> IMPL_}ARCH_NVIDIA_GPU — Damien L-G / githubweb
  18. Improve macro name KOKKOS_IMPL_{ARCH_NVIDIA_GPU_AMPERE_PLUS -> NVIDIA_GPU_ARCH_SUPPORT_BHALF} — Damien L-G / githubweb
  19. Allow that C++20 is passed to nvcc — noreply / githubweb
  20. Allow c++20 in nvcc_wrapper for nvcc 12 and above — noreply / githubweb
  21. Print Kokkos version at configuration time (#5979) — noreply / githubweb
  22. #5641: Fix HIP & CUDA MDRange reduce for sizeof(value_type) < sizeof(int) (#5745) — noreply / githubweb
  23. Fix sycl.large_team_scratch_size — Daniel Arndt / githubweb
  24. Add to HIP tests in Makefile — tccleve / githubweb
  25. Bit manipulation: implement `byteswap` (#5967) — noreply / githubweb
  26. UnorderedMap: Ensure size() working in case of copies — Daniel Arndt / githubweb
  27. Use (non-mutable) std::shared_ptr instead — Daniel Arndt / githubweb
  28. Add another test case — Daniel Arndt / githubweb
  29. graph(hip): enable test — romin.tomasetti / githubweb
  30. Remove workaround for submit_barrier not being enqueued properly — Daniel Arndt / githubweb
  31. Support --compiler-options in nvcc_wrapper — noreply / githubweb
  32. Cleanup OpenMPTarget ParallelReduce — Daniel Arndt / githubweb
  33. Fix compiling OpenMPTarget for AMD GPUs — Daniel Arndt / githubweb
  34. Desul atomics: Fix NVCC warning integer conversion resulted in a change of sign — Damien L-G / githubweb
  35. Fixup NVIDIA GPU arch must be defined potentially for other backends as well — Damien L-G / githubweb
  36. Fix warning with NVC++ — Damien L-G / githubweb
  37. Fix team_scratch_1_queues for SYCL+Cuda — Daniel Arndt / githubweb
  38. m_team_scratch_pool -> m_team_scratch_event — Daniel Arndt / githubweb
  39. Explain acquire_team_scratch_space — Daniel Arndt / githubweb
  40. Move Cuda/Kokkos_Cuda_NvidiaGpuArchitectures.hpp -> impl/Kokkos_NvidiaGpuArchitectures.hpp — Damien L-G / githubweb
  41. Drive-by fix typos "fix {to -> too} many" — noreply / githubweb
  42. simd: Fixed an incorrectly returning size for uint64_t in avx2 (#6004) — noreply / githubweb
  43. Pass local_accessor directly instead — Daniel Arndt / githubweb
  44. Fix warning in some user code when using std::memcpy — Bruno Turcksin / githubweb
  45. CMake: update package compatibility mode when building within Trilinos — Andrey Prokopenko / githubweb
  46. SYCL: Make is_device_copyable future-proof (#6009) — noreply / githubweb
  47. desul: Move lock_array_copied from global scope (#5999) — noreply / githubweb
  48. Fix namespace for is_device_copyable — Daniel Arndt / githubweb
  49. Fixup 4.0 change log (#6015) — noreply / githubweb
  50. Add half_t and bhalf_t limits (#5778) — noreply / githubweb
  51. Add UnorderedMapInsertOps for coo2crs (#5877) — noreply / githubweb
  52. #6027: replace remaining instances of ALL_t with Kokkos::ALL_t (#6028) — noreply / githubweb
  53. Add -Wdeprecated-copy warning and fix OMPT scan bug related to assignment operators (#6026) — noreply / githubweb
  54. Relax scratch space limits for HIP reductions (#6029) — noreply / githubweb
  55. Fix -Wmissing-field-initializers warning — Daniel Arndt / githubweb
  56. Fix enum warnings — Daniel Arndt / githubweb
  57. Fix OMPT size compare warnings — Daniel Arndt / githubweb
  58. Add -Wextra — Daniel Arndt / githubweb
  59. Cherry-pick TriBITS update from Trilinos — Daniel Arndt / githubweb
  60. KOKKOS_COMPILER_PGI -> KOKKOS_COMPILER_NVHPC — Daniel Arndt / githubweb
  61. Try removing _kokkos_pgi_compiler_bug_workaround — Daniel Arndt / githubweb
  62. Remove unused constructors for ThreadVectorRangeBoundariesStruct — donlee / githubweb
  63. Define at most one KOKKOS_COMPILER* macro — Daniel Arndt / githubweb
  64. Define KOKKOS_COMPILER_INTEL_LLVM — Daniel Arndt / githubweb
  65. Desul atomics: let relocatable device code mode be part of the configuration (#5991) — noreply / githubweb
  66. CHANGELOG.md: add threads sort — ajpowel / githubweb
  67. Remove Kokkos Keyword Listing section from BUILD.md and refer to the wiki instead — Damien L-G / githubweb
  68. Remove (outdated) license information [ci skip] — Damien L-G / githubweb
  69. Remove libnuma (#6048) — noreply / githubweb
  70. Don't create a shared state for size() in UnorderedMap's deep_copy — Daniel Arndt / githubweb
  71. OpenMPTarget: Changes to Makefile.kokkos (#6053) — noreply / githubweb
  72. Partially reverse #5504 — Daniel Arndt / githubweb
  73. Reduce size of ScatterView test when using OpenMP — Bruno Turcksin / githubweb
  74. Fix bug in Makefile.kokkos — stamoor / githubweb
  75. Fix join for ValueWrapperForNoNeutralElement — Daniel Arndt / githubweb
  76. SYCL: Use in-order queue for SYCL+Cuda — Daniel Arndt / githubweb
  77. perf_test is still not working — Daniel Arndt / githubweb
  78. Fix PerfTests by limiting GramSchmidt — Daniel Arndt / githubweb
  79. Only pass one wrapper object in SYCL reductions — Daniel Arndt / githubweb
  80. Explicitly cast to CombinedFunctorReducerType — Daniel Arndt / githubweb
  81. Fix Kokkos_SIMD with AVX2 on 64-bit architectures (#6075) — noreply / githubweb
  82. Update changelog (#6058) — noreply / githubweb
  83. Fix up SYCL execution space instance creation for Intel GPUs — Daniel Arndt / githubweb
  84. Converted a shared_ptr to a host view in UnorderedMap (#6073) — noreply / githubweb
  85. Update OpenACC FunctorAdapter (#6077) — noreply / githubweb
  86. Improve SYCL parallel_scan — Daniel Arndt / githubweb
  87. Compiling with auto deduction of workgroup sizes — Daniel Arndt / githubweb
  88. Disable failed bit manipulation tests when compiled by NVHPC (#6088) — noreply / githubweb
  89. Add support for Darwin 32-bit and PPC (#5916) — noreply / githubweb
  90. Unconditionally enable CUDA extended lambda support — pbmille / githubweb
  91. Tentative arguments switch for nvcc 12+ — pbmille / githubweb
  92. Change Makefile.kokkos too — pbmille / githubweb
  93. Implement CMake messages per team decision — pbmille / githubweb
  94. Fix definitions and docs to remove CUDA Lambda option — pbmille / githubweb
  95. Don't fail to define broader 'lambdas are available' macro — pbmille / githubweb
  96. Always expect KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA to be set — pbmille / githubweb
  97. Remove various test exclusions based on KOKKOS_ENABLE_CUDA_LAMBDA — pbmille / githubweb
  98. Update changelog — ndellin / githubweb
  99. [ci skip] Fixup changelog — ndellin / githubweb
  100. Work around nvcc issue for view_mapping and add FIXME_NVCC comment — pbmille / githubweb
  101. OpenMPTarget: Update hierarchical parallelism. (#6043) — noreply / githubweb
  102. Enable OpenMP in CUDA-11.0-NVCC-RDC to test DEPRECATED_CODE_3=ON (#5978) — noreply / githubweb
  103. fix ternary op in subset of std algorithms not working with nvhpc (#6095) — noreply / githubweb
  104. Add implementation of bit_cast in <Kokkos_BitManipulation.hpp> — Damien L-G / githubweb
  105. Add compile time tests for the constraints on the bit_cast function template — Damien L-G / githubweb
  106. Add the Experimental:: builtin variant (just defer to regular bit_cast) — Damien L-G / githubweb
  107. Add runtime tests for bit_cast — Damien L-G / githubweb
  108. Use Kokkos::bit_cast in SIMD instead of rolling its own — Damien L-G / githubweb
  109. Clang-format glitch — Damien L-G / githubweb
  110. view(uvm): fence if needed in allocation (#6005) — romin.tomasetti / githubweb
  111. Disable tests that fail at runtime with NVHPC (likely not liking the class declaration within the body of the functor) — Damien L-G / githubweb
  112. change impl of `is_sorted_until` to use reduce (#6097) — noreply / githubweb
  113. Fix typo and remove accidentally committed assertions — noreply / githubweb
  114. Added multiple reducers support for team-level parallel reduce (#5727) — noreply / githubweb
  115. Work around NVHPC issue with enum types — crtrott / githubweb
  116. Work around NVHPC 23.x issues — crtrott / githubweb
  117. Kokkos: Remove TriBITS Kokkos subpackages (trilinos/Trilinos#11545) (#6104) — noreply / githubweb
  118. Drop pointless Kokkos{Algorithms,Containers}_config.h files — Damien L-G / githubweb
  119. Revert "Merge pull request #5964 from PhilMiller/cuda-lambda-default" — Damien L-G / githubweb
  120. Update the OpenACC parallel_reduce() constructs with Range/MDRange/Team (#6072) — noreply / githubweb
  121. Always pass -extended-lambda option to NVCC and force Kokkos_ENABLE_CUDA_LAMBDA ON — Damien L-G / githubweb
  122. Reorganize ZeroMemset (#6087) — noreply / githubweb
  123. Drop CUDA_LAMBDA guards in Cuda headers — Damien L-G / githubweb
  124. Work around NVHPC 23.x not dealing with __isGlobal — crtrott / githubweb
  125. Drop unused cmake macros — Damien L-G / githubweb
  126. Fixup cmake style — Damien L-G / githubweb
  127. use ASSERT_EQ in all std algorithms tests — fnrizzi / githubweb
  128. Reintroduce test skip for nvhpc < 23.3 — crtrott / githubweb
  129. hpcbind: check for correct Slurm variable — rberger / githubweb
  130. Fix macro guards in test for NVC++ as the CUDA compiler — Damien L-G / githubweb
  131. Allow templated functors in parallel_for, parallel_reduce and parallel_scan (#5976) — noreply / githubweb
  132. Import sycl::bit_cast into the Kokkos namespace — Daniel Arndt / githubweb
  133. Qualify possibly ambiguous calls to bit_cast — Daniel Arndt / githubweb
  134. Fix nightlies -- workaround compiler bug in GCC 9.1 and 9.2 (#6118) — noreply / githubweb
  135. Kokkos_BitManipulation: KOKKOS_COMPILER_GCC->KOKKOS_COMPILER_GNU (#6119) — noreply / githubweb
  136. Cuda: Remove unused attach_texture_object — Daniel Arndt / githubweb
  137. Move half traits to private header and add half/bhalf infinity trait (#6055) — noreply / githubweb
  138. Increase minimum required HPX version to 1.8.0 — mikael.simberg / githubweb
  139. Conditionally use hpx::post instead of hpx::apply based on HPX version — mikael.simberg / githubweb
  140. Don't restrict index type in builtin reducers — Daniel Arndt / githubweb
  141. dual view: update template types (#6085) — romin.tomasetti / githubweb
  142. sorting an empty view should exit early and not fail (#6130) — noreply / githubweb
  143. core/src: Move floating_point_wrapper to private header — eharvey / githubweb
  144. Disable tests failing with NVHPC — Daniel Arndt / githubweb
  145. Fix bit_cast for SYCL again — Daniel Arndt / githubweb
  146. Disable tests for OpenMPTarget — Daniel Arndt / githubweb
  147. Improve indentation of comments — Daniel Arndt / githubweb
  148. Allow deprecated declarations in SYCL+Cuda CI — Daniel Arndt / githubweb
  149. Try running for other execution spaces — Daniel Arndt / githubweb
  150. Add guards for Cuda — Daniel Arndt / githubweb
  151. Expand list of kokkos options not to export with cmake — Damien L-G / githubweb
  152. Do not append to Kokkos_OPTIONS variables those in the do not export list — Damien L-G / githubweb
  153. Drop Kokkos_ENABLE_LAUNCH_COMPILER option — Damien L-G / githubweb
  154. Export Kokkos_ENABLE_<OPTION> that are relevant — Damien L-G / githubweb
  155. Drop Kokkos_ENABLE_PROFILING_LOAD_PRINT option — Damien L-G / githubweb
  156. Suppress bogus warning about CUDA_LAMBDA being ON — Damien L-G / githubweb
  157. [ci skip] Add nightly ci for spack (#6135) — noreply / githubweb
  158. OpenMPTarget: Enable Cray compiler for the OpenMPTarget backend. (#5889) — noreply / githubweb
  159. Revert to `DualView<class,class=void,class=void,class=void>` when deprecated code 4 is enabled — Damien L-G / githubweb
  160. Fix Kokkos_ENABLE_CUDA_LAMBDA for Trilinos — Daniel Arndt / githubweb
  161. Fix bogus warnings in nested CUDA parallel_reduce — Daniel Arndt / githubweb
  162. `BinSort`, `BinOp1D`, `BinOp3D`: mark default constructor as deleted (#6131) — noreply / githubweb
  163. KokkosTools: Don't call callbacks before backends are initialized (#6114) — noreply / githubweb
  164. Add `Kokkos::all_libs` alias target for compatibility with TriBITS/Trilinos (#6157) — noreply / githubweb
  165. sorting: add to binsort support for strided views and reorg tests (#6081) — noreply / githubweb
  166. Allow linking against build tree (#6078) — noreply / githubweb
  167. Implement `HPX::in_parallel` (#6143) — noreply / githubweb
  168. OpenMPTarget: Changes for OpenMPTarget backend with nvhpc compiler. — rgayatri / githubweb
  169. OpenMPTarget: Add a fixme. — rgayatri / githubweb
  170. Update Makefile.kokkos — noreply / githubweb
  171. Remove extended_namespace template parameter for SYCLMemoryOrder/Scope — Daniel Arndt / githubweb
  172. OpenMPTarget: update fixme comment. — rgayatri / githubweb
  173. OpenMPTarget: Replace kokkos macros in desul. — rgayatri / githubweb
  174. OpenMPTarget: Restore desul changes. — rgayatri / githubweb
  175. Cherry-pick v3.7.02 changelog into develop [ci skip] — Damien L-G / githubweb
  176. Kokkos: Mark HWLOC as a TriBITS TPL as well (trilinos/Trilinos#11938) — rabartl / githubweb
  177. Clean up FunctorAnalysis — Daniel Arndt / githubweb
  178. SIMD: make binary op tests to test against all data types (#5913) — noreply / githubweb
  179. Also create symlinks for CMake configuration files to cmake_packages/Kokkos for TriBITS (#6163) — noreply / githubweb
  180. Allow passing a temporary std::vector to partition_space (#6167) — noreply / githubweb
  181. .github/workflows: Remove push trigger — eharvey / githubweb
  182. .github/workflows: Only trigger upon push to develop — eharvey / githubweb
  183. Replace _mm512_loadu_epi64 and _mm512_storeu_epi64 with _mm512_loadu_si512 and _mm512_storeu_si512 — donlee / githubweb
  184. OpenMPTarget: include desul changes. — rgayatri / githubweb
  185. Weed out verbose output from `dynamic_view` container unit test (#6173) — noreply / githubweb
  186. shortcut value for is_dynamic_view — fnrizzi / githubweb
  187. add trait and test — fnrizzi / githubweb
  188. Fix global fence in Kokkos::resize(DynRankView) (#6184) — noreply / githubweb
  189. Left align demangled stacktrace output. (#6191) — noreply / githubweb
  190. [HIP] Improve heuristic deciding the number of blocks used in parallel_reduce (#6160) — noreply / githubweb
  191. Improve OpenMP affinity warning to include MPI concerns (#6185) — noreply / githubweb
  192. Update version number on develop after branching off for 4.1.00 — Daniel Arndt / githubweb
  193. Fix test_quad_precision_math_constants test — Daniel Arndt / githubweb
  194. implementation and tests — fnrizzi / githubweb
  195. implementation and tests — fnrizzi / githubweb
  196. [ci skip] test_all_sandia: update compilers and queues — ndellin / githubweb
  197. team-level std algos: common code needed (#6199) — noreply / githubweb
  198. Fix compiling SYCL with KOKKOS_IMPL_DO_NOT_USE_PRINTF_USAGE — Daniel Arndt / githubweb
  199. snapshot mdspan namespace changes (#6162) — noreply / githubweb
  200. Disable AVX512 support for NVHPC — Daniel Arndt / githubweb
  201. Fix host-annotations of AVX2, AVX512, and NEON constructors — Daniel Arndt / githubweb
  202. Introduce impl_get_value/impl_get_mask — Daniel Arndt / githubweb
  203. Fix a gcc-8.4.0 warning — Daniel Arndt / githubweb
  204. Fix host-device annotation for where_expression/const_where_expression — Daniel Arndt / githubweb
  205. Make in-order queues the default via macro — Daniel Arndt / githubweb
  206. Avoid SFINAE in favor of overloads — Daniel Arndt / githubweb
  207. Move scalar overloads to Scalar header — Daniel Arndt / githubweb
  208. Disable KOKKOS_ARCH_AVX512XEON for NVHPC — Daniel Arndt / githubweb
  209. Changelog for 4.1.00 (#6225) — noreply / githubweb
  210. reorganize sort headers (#6230) — noreply / githubweb
  211. SYCL: Support for bhalf_t (#6204) — noreply / githubweb
  212. only compute with relevant entries — tccleve / githubweb
  213. make constraints on `Kokkos::sort` more visible/clear (#6234) — noreply / githubweb
  214. slim API and move code to impl — fnrizzi / githubweb
  215. This PR contains minor code changes and bug fixes needed for LLVM-Clacc — lees2 / githubweb
  216. improve all other corner cases as per review comment — fnrizzi / githubweb
  217. refine for cuda uvm — fnrizzi / githubweb
  218. use exespace to check rather than mem space — fnrizzi / githubweb
  219. Fix AVX2 simd support for ZEN2 AMD CPU. (#6238) — noreply / githubweb
  220. fix corner case — fnrizzi / githubweb
  221. Fix windows symlink configure issue (#6241) — noreply / githubweb
  222. fix corner cases — fnrizzi / githubweb
  223. bug_report.md: new PR branching from `develop` (#5034) — noreply / githubweb
  224. Fix whitespace in bug_report.md (#6244) — noreply / githubweb
  225. Avoid undefined behavior in TestTaskScheduler.hpp — Daniel Arndt / githubweb
  226. Remove calling tribits_exclude_autotools_files() — rabartl / githubweb
  227. Ensure that complex is only instantiated for cv-unqualified floating-point type — Damien L-G / githubweb
  228. Deprecated Kokkos::vector — Damien L-G / githubweb
  229. Warn if <Kokkos_Vector.hpp> is included — Damien L-G / githubweb
  230. Drop Vector test with makefiles and conditionally remove it with CMake — Damien L-G / githubweb
  231. Ignore <Kokkos_Vector.hpp> in the header self-containment tests — Damien L-G / githubweb
  232. SYCL: Use in-order queues in InterOp tests (#6246) — noreply / githubweb
  233. std_algos: fix wrong corner case for `is_partitioned` (#6257) — noreply / githubweb
  234. Make sure macros are defined — Daniel Arndt / githubweb
  235. Error out when Kokkos_Vector.hpp is included with deprecated code disabled — Daniel Arndt / githubweb
  236. SIMD: Add abs() for all int types (#6069) — noreply / githubweb
  237. Fix SIMD abs unit test accidentally using complex overload — Damien L-G / githubweb
  238. Fix SIMD tests on NEON — Daniel Arndt / githubweb
  239. Add default ParallelFor copy constructor for HIP — Bruno Turcksin / githubweb
  240. Workaround gcc/8.2.0 compiler issue with _mm512_abs_pd — ndellin / githubweb
  241. Implement Kokkos::printf (#6083) — noreply / githubweb
  242. Improve SYCL TeamPolicy reduction — Daniel Arndt / githubweb
  243. make Kokkos_CXX_COMPILER_VERSION available to CMake consumers — cwpears / githubweb
  244. Fully qualify Experimental::SYCL in algorithms to avoid finding conflicting namespaces — Daniel Arndt / githubweb
  245. subset of team level impl of std algorithms — fnrizzi / githubweb
  246. fix copyright — fnrizzi / githubweb
  247. guard for openmptarget — fnrizzi / githubweb
  248. fix for openmptarget — fnrizzi / githubweb
  249. address comments — fnrizzi / githubweb
  250. Update CMakeLists for unit tests with OpenMPTarget, OpenACC with NVHPC (#6260) — noreply / githubweb
  251. Update CI from CUDA 11.7.0 to 11.7.1 — Bruno Turcksin / githubweb
  252. Improve SYCL reduction performance: RangePolicy (#6264) — noreply / githubweb
  253. Improve SYCL reduction performance: workgroup_reduction (#6270) — noreply / githubweb
  254. SYCL TeamPolicy: Fix sign comparison warning — Daniel Arndt / githubweb
  255. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  256. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  257. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  258. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  259. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  260. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  261. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  262. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  263. format — fnrizzi / githubweb
  264. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  265. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  266. Suppress warnings — Daniel Arndt / githubweb
  267. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  268. Improve macro definitions — Daniel Arndt / githubweb
  269. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  270. Modify fences in View API test — cezary.skrzynski / githubweb
  271. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  272. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  273. bring back previous code as discussed in meeting — fnrizzi / githubweb
  274. create cudaAPI function wrappers — tccleve / githubweb
  275. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  276. Some api function require cuda11.2+ — tccleve / githubweb
  277. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  278. Rework stream inputs — tccleve / githubweb
  279. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  280. Remove static in comment — tccleve / githubweb
  281. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  282. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  283. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  284. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  285. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  286. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  287. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  288. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  289. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  290. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  291. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  292. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  293. #5635: Serial/OpenMP: Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  294. Fix reviewer's comments — Bruno Turcksin / githubweb
  295. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  296. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  297. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  298. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  299. remove spurious undefs — fnrizzi / githubweb
  300. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  301. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  302. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  303. address reviews [skip ci] — antoine.meyer54 / githubweb
  304. formatting — fnrizzi / githubweb
  305. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  306. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  307. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  308. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  309. allow sorting via native oneDPL to support views with stride = 1 (#6322) — noreply / githubweb
  310. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  311. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  312. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  313. fix lambda capture — fnrizzi / githubweb
  314. remove unnecessary file, fix constraints — fnrizzi / githubweb
  315. remove file — fnrizzi / githubweb
  316. fix — fnrizzi / githubweb
  317. fix lambda capture and constraints — fnrizzi / githubweb
  318. keep only subset — fnrizzi / githubweb
  319. revert files — fnrizzi / githubweb
  320. remove file — fnrizzi / githubweb
  321. fix syntax — fnrizzi / githubweb
  322. format — fnrizzi / githubweb
  323. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  324. Rename AMD GPU architectures (#6266) — noreply / githubweb
  325. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  326. SIMD: add generator constructors (#6347) — noreply / githubweb
  327. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  328. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  329. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  330. Initial implementation of gfx942 (#6358) — noreply / githubweb
  331. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  332. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  333. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  334. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  335. Add support for HIP Graph — Bruno Turcksin / githubweb
  336. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  337. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  338. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  339. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  340. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  341. Use constexpr West in src — Bruno Turcksin / githubweb
  342. Use constexpr West in test — Bruno Turcksin / githubweb
  343. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  344. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  345. team-level std algos: part 2 (#6205) — noreply / githubweb
  346. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  347. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  348. Rebased and applied feedback — donlee / githubweb
  349. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  350. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  351. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  352. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  353. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  354. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  355. More NVC++ 23.7 updates — crtrott / githubweb
  356. NVC++ clang-format fixes — crtrott / githubweb
  357. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  358. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  359. OpenACC: Guard tests relying on abort — crtrott / githubweb
  360. Fix TestAtomic to use the test execspace — crtrott / githubweb
  361. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  362. Update nvhpc gtest skip message — crtrott / githubweb
  363. Work around OpenMPTarget failure — crtrott / githubweb
  364. Update base docker file for nvhpc — crtrott / githubweb
  365. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  366. Fix reviewer's comments — Bruno Turcksin / githubweb
  367. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  368. HIP: Update print_configuration (#6387) — noreply / githubweb
  369. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  370. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  371. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  372. team-level std algos: part 3 (#6207) — noreply / githubweb
  373. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  374. SIMD: add float simd support (#6177) — noreply / githubweb
  375. team-level std algos: part 4 (#6208) — noreply / githubweb
  376. Added a gen ctor for float (#6397) — noreply / githubweb
  377. team-level std algos: part 5 (#6209) — noreply / githubweb
  378. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  379. Fixup checked integer operations death test — Damien L-G / githubweb
  380. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  381. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  382. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  383. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  384. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  385. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  386. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  387. team-level std algos: part 6 (#6210) — noreply / githubweb
  388. address comments — fnrizzi / githubweb
  389. OpenMP backend refactor files. (#6403) — noreply / githubweb
  390. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  391. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  392. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  393. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  394. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  395. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  396. use single — fnrizzi / githubweb
  397. address comments — fnrizzi / githubweb
  398. formatting — fnrizzi / githubweb
  399. Team-level std algos: part 7 (#6211) — noreply / githubweb
  400. formatting — fnrizzi / githubweb
  401. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  402. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  403. Drop check whether device supports unified addressing — Damien L-G / githubweb
  404. fix single as per Christian's suggestion — fnrizzi / githubweb
  405. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  406. check-copyright improvements (#6399) — noreply / githubweb
  407. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  408. Address reviewer's comments — Bruno Turcksin / githubweb
  409. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  410. add comment — fnrizzi / githubweb
  411. improve tests to address review — fnrizzi / githubweb
  412. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  413. avoid potential race condition HIP — tccleve / githubweb
  414. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  415. Same for scan — andrei.elovikov / githubweb
  416. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  417. improve tests with intra-team result check — fnrizzi / githubweb
  418. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  419. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  420. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  421. remove old impl — fnrizzi / githubweb
  422. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  423. Clean up benchmarks/gups — cwpears / githubweb
  424. benchmark/gups: use CMake — cwpears / githubweb
  425. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  426. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  427. add missing assert — fnrizzi / githubweb
  428. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  429. add intra team check for missing test — fnrizzi / githubweb
  430. fix intel compile error — fnrizzi / githubweb
  431. fix unreachable for intel — fnrizzi / githubweb
  432. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  433. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  434. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  435. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  436. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  437. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  438. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  439. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  440. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  441. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  442. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  443. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  444. improve tests (#6432) — noreply / githubweb
  445. improve tests (#6437) — noreply / githubweb
  446. Move final assignment to correct scope — cezary.skrzynski / githubweb
  447. fix casting warning in Random test — fnrizzi / githubweb
  448. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  449. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  450. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  451. improve tests to check intra-team result (#6431) — noreply / githubweb
  452. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  453. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  454. team-level std algos: part 10 (#6256) — noreply / githubweb
  455. team-level std algos: part 11 (#6258) — noreply / githubweb
  456. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  457. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  458. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  459. use shortcut — cezary.skrzynski / githubweb
  460. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  461. Fix formatting — cezary.skrzynski / githubweb
  462. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  463. Use std::is_same_v — cezary.skrzynski / githubweb
  464. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  465. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  466. Use std::is_same_v — cezary.skrzynski / githubweb
  467. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  468. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  469. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  470. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  471. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  472. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  473. Use call operator instead of run_me function — maarten.arnst / githubweb
  474. team-level std algos: part 12 (#6350) — noreply / githubweb
  475. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  476. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  477. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  478. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  479. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  480. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  481. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  482. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  483. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  484. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  485. Fix NVCC warnings (#6483) — noreply / githubweb
  486. team-level std algos: part 13 (#6351) — noreply / githubweb
  487. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  488. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  489. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  490. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  491. fix impl — fnrizzi / githubweb
  492. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  493. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  494. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  495. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  496. add overload for TeamThreadRange — fnrizzi / githubweb
  497. address review comment — fnrizzi / githubweb
  498. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  499. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  500. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  501. add threadvector — fnrizzi / githubweb
  502. fix order — fnrizzi / githubweb
  503. remove guards — fnrizzi / githubweb
  504. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  505. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  506. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  507. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  508. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  509. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  510. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  511. add guards — fnrizzi / githubweb
  512. avoid auto — fnrizzi / githubweb
  513. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  514. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  515. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  516. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  517. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  518. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  519. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  520. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  521. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  522. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  523. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  524. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  525. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  526. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  527. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  528. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  529. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  530. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  531. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  532. Remove sleep and wake functions — Bruno Turcksin / githubweb
  533. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  534. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  535. Remove unused variables — Bruno Turcksin / githubweb
  536. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  537. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  538. Remove useless forward declaration — Bruno Turcksin / githubweb
  539. Remove spawn function — Bruno Turcksin / githubweb
  540. Add comments — Bruno Turcksin / githubweb
  541. Fix indentation — Bruno Turcksin / githubweb
  542. Fix typo in macro guard — Bruno Turcksin / githubweb
  543. Reduce number of View constructor instantiations — Damien L-G / githubweb
  544. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  545. Split files in HIP backend — Bruno Turcksin / githubweb
  546. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  547. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  548. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  549. Threads remove unused variables and functions (#6566) — noreply / githubweb
  550. Remove unused Sandia testing files (#6568) — noreply / githubweb
  551. fallback implementation cleanup — donlee / githubweb
  552. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  553. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  554. Drop Clang+CUDA workaround — Damien L-G / githubweb
  555. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  556. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  557. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  558. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  559. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  560. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  561. Desul atomics: Trade SYCL-specific compile definition for a macro defintion in the configuration header — Damien L-G / githubweb
  562. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  563. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  564. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro defintion in the configuration header" — noreply / githubweb
  565. OpenACC: add atomics support (#6446) — noreply / githubweb
  566. Fix infinity, quiet_NaN, signaling_Nan, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  567. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  568. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  569. try fix — fnrizzi / githubweb
  570. Fix TestNumericTriats.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  571. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  572. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  573. Added missing operator* to NEON simd — crtrott / githubweb
  574. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  575. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  576. try fix — fnrizzi / githubweb
  577. avoid pyt package — fnrizzi / githubweb
  578. try — fnrizzi / githubweb
  579. fix for macos — fnrizzi / githubweb
  580. remove comments — fnrizzi / githubweb
  581. use reference — crtrott / githubweb
  582. add branching — fnrizzi / githubweb
  583. [ci skip] fix formatting — cezary.skrzynski / githubweb
  584. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  585. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  586. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  587. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  588. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  589. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  590. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  591. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  592. unorderedmap: modernize traits — romin.tomasetti / githubweb
  593. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb

#2 (Dec 1, 2023, 11:53:07 AM)

  1. Add Kokkos::num_threads() and Kokkos::device_id() — Damien L-G / githubweb
  2. Check Kokkos::num_threads and device_id in tests — Damien L-G / githubweb
  3. Use std::sort for sorting within a bin when possible — Andrey Prokopenko / githubweb
  4. RangePolicyRequire was not using require — tccleve / githubweb
  5. Fix reducer result check for Serial+HPX ParallelReduce — Daniel Arndt / githubweb
  6. Define KOKKOS_ARCH_NVIDIA_GPU macro when targeting an NVIDIA GPU architecture — Damien L-G / githubweb
  7. Use KOKKOS_ARCH_NVIDIA_GPU macro in SYCL, OpenACC, and OpenMPTarget backends where appropriate — Damien L-G / githubweb
  8. Add ParallelScanRangePolicy test — tccleve / githubweb
  9. Address review comments — Andrey Prokopenko / githubweb
  10. Fix partition_master test — Daniel Arndt / githubweb
  11. Use CombinedReducer in HostIterateTile — Daniel Arndt / githubweb
  12. Desul atomics: fix bug max uint64_t value — Damien L-G / githubweb
  13. Make CUDA bhalf conversion code more forward compatible — Damien L-G / githubweb
  14. Let KOKKOS_ARCH_NVIDIA_GPU provide the Compute Capability — Damien L-G / githubweb
  15. Bring back git info to benchmarks output — cezary.skrzynski / githubweb
  16. [ci skip] Add a comment — Andrey Prokopenko / githubweb
  17. Rename KOKKOS_{ -> IMPL_}ARCH_NVIDIA_GPU — Damien L-G / githubweb
  18. Improve macro name KOKKOS_IMPL_{ARCH_NVIDIA_GPU_AMPERE_PLUS -> NVIDIA_GPU_ARCH_SUPPORT_BHALF} — Damien L-G / githubweb
  19. Allow that C++20 is passed to nvcc — noreply / githubweb
  20. Allow c++20 in nvcc_wrapper for nvcc 12 and above — noreply / githubweb
  21. Print Kokkos version at configuration time (#5979) — noreply / githubweb
  22. #5641: Fix HIP & CUDA MDRange reduce for sizeof(value_type) < sizeof(int) (#5745) — noreply / githubweb
  23. Fix sycl.large_team_scratch_size — Daniel Arndt / githubweb
  24. Add to HIP tests in Makefile — tccleve / githubweb
  25. Bit manipulation: implement `byteswap` (#5967) — noreply / githubweb
  26. UnorderedMap: Ensure size() working in case of copies — Daniel Arndt / githubweb
  27. Use (non-mutable) std::shared_ptr instead — Daniel Arndt / githubweb
  28. Add another test case — Daniel Arndt / githubweb
  29. graph(hip): enable test — romin.tomasetti / githubweb
  30. Remove workaround for submit_barrier not being enqueued properly — Daniel Arndt / githubweb
  31. Support --compiler-options in nvcc_wrapper — noreply / githubweb
  32. Cleanup OpenMPTaget ParallelReduce — Daniel Arndt / githubweb
  33. Fix compiling OpenMPTarget for AMD GPUs — Daniel Arndt / githubweb
  34. Desul atomics: Fix NVCC warning integer conversion resulted in a change of sign — Damien L-G / githubweb
  35. Fixup NVIDIA GPU arch must be defined potentially for other backends as well — Damien L-G / githubweb
  36. Fix warning with NVC++ — Damien L-G / githubweb
  37. Fix team_scratch_1_queues for SYCL+Cuda — Daniel Arndt / githubweb
  38. m_team_scratch_pool -> m_team_scratch_event — Daniel Arndt / githubweb
  39. Explain acquire_team_scratch_space — Daniel Arndt / githubweb
  40. Move Cuda/Kokkos_Cuda_NvidiaGpuArchitectures.hpp -> impl/Kokkos_NvidiaGpuArchitectures.hpp — Damien L-G / githubweb
  41. Drive-by fix typos "fix {to -> too} many" — noreply / githubweb
  42. simd: Fixed an incorrectly returning size for uint64_t in avx2 (#6004) — noreply / githubweb
  43. Pass local_accessor directly instead — Daniel Arndt / githubweb
  44. Fix warning in some user code when using std::memcpy — Bruno Turcksin / githubweb
  45. CMake: update package compatibility mode when building within Trilinos — Andrey Prokopenko / githubweb
  46. SYCL: Make is_device_copyable future-proof (#6009) — noreply / githubweb
  47. desul: Move lock_array_copied from global scope (#5999) — noreply / githubweb
  48. Fix namespace for is_device_copyable — Daniel Arndt / githubweb
  49. Fixup 4.0 change log (#6015) — noreply / githubweb
  50. Add half_t and bhalf_t limits (#5778) — noreply / githubweb
  51. Add UnorderedMapInsertOps for coo2crs (#5877) — noreply / githubweb
  52. #6027: replace remaining instances of ALL_t with Kokkos::ALL_t (#6028) — noreply / githubweb
  53. Add -Wdeprecated-copy warning and fix OMPT scan bug related to assignment operators (#6026) — noreply / githubweb
  54. Relax scratch space limits for HIP reductions (#6029) — noreply / githubweb
  55. Fix -Wmissing-field-initializers warning — Daniel Arndt / githubweb
  56. Fix enum warnings — Daniel Arndt / githubweb
  57. Fix OMPT size compare warnings — Daniel Arndt / githubweb
  58. Add -Wextra — Daniel Arndt / githubweb
  59. Cherry-pick TriBITS update from Trilinos — Daniel Arndt / githubweb
  60. KOKKOS_COMPILER_PGI -> KOKKOS_COMPILER_NVHPC — Daniel Arndt / githubweb
  61. Try removing _kokkos_pgi_compiler_bug_workaround — Daniel Arndt / githubweb
  62. Remove unused constructors for ThreadVectorRangeBoundairesStruct — donlee / githubweb
  63. Define at most one KOKKOS_COMPILER* macro — Daniel Arndt / githubweb
  64. Define KOKKOS_COMPILER_INTEL_LLVM — Daniel Arndt / githubweb
  65. Desul atomics: let relocatable device code mode be part of the configuration (#5991) — noreply / githubweb
  66. CHANGELOG.md: add threads sort — ajpowel / githubweb
  67. Remove Kokkos Keyword Listing section from BUILD.md and refer to the wiki instead — Damien L-G / githubweb
  68. Remove (outdated) license information [ci skip] — Damien L-G / githubweb
  69. Remove libnuma (#6048) — noreply / githubweb
  70. Don't create a shared state for size() in UnorderedMap's deep_copy — Daniel Arndt / githubweb
  71. OpenMPTarget: Changes to Makefile.kokkos (#6053) — noreply / githubweb
  72. Partially reverse #5504 — Daniel Arndt / githubweb
  73. Reduce size of ScatterView test when using OpenMP — Bruno Turcksin / githubweb
  74. Fix bug in Makefile.kokkos — stamoor / githubweb
  75. Fix join for ValueWrapperForNoNeutralElement — Daniel Arndt / githubweb
  76. SYCL: Use in-order queue for SYCL+Cuda — Daniel Arndt / githubweb
  77. perf_test is still not working — Daniel Arndt / githubweb
  78. Fix PerfTests by limiting GramSchmidt — Daniel Arndt / githubweb
  79. Only pass one wrapper object in SYCL reductions — Daniel Arndt / githubweb
  80. Explicitly cast to CombinedFunctorReducerType — Daniel Arndt / githubweb
  81. Fix Kokkos_SIMD with AVX2 on 64-bit architectures (#6075) — noreply / githubweb
  82. Update changelog (#6058) — noreply / githubweb
  83. Fix up SYCL execution space instance creation for Intel GPUs — Daniel Arndt / githubweb
  84. Converted a shared_ptr to a host view in UnorderedMap (#6073) — noreply / githubweb
  85. Update OpenACC FunctorAdapter (#6077) — noreply / githubweb
  86. Improve SYCL parallel_scan — Daniel Arndt / githubweb
  87. Compiling with auto deduction of workgroup sizes — Daniel Arndt / githubweb
  88. Disable failed bit manipulation tests when compiled by NVHPC  (#6088) — noreply / githubweb
  89. Add support for Darwin 32-bit and PPC (#5916) — noreply / githubweb
  90. Unconditionally enable CUDA extended lambda support — pbmille / githubweb
  91. Tentative arguments switch for nvcc 12+ — pbmille / githubweb
  92. Change Makefile.kokkos too — pbmille / githubweb
  93. Implement CMake messages per team decision — pbmille / githubweb
  94. Fix definitions and docs to remove CUDA Lambda option — pbmille / githubweb
  95. Don't fail to define broader 'lambdas are available' macro — pbmille / githubweb
  96. Always expect KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA to be set — pbmille / githubweb
  97. Remove various test exclusions based on KOKKOS_ENABLE_CUDA_LAMBDA — pbmille / githubweb
  98. Update changelog — ndellin / githubweb
  99. [ci skip] Fixup changelog — ndellin / githubweb
  100. Work around nvcc issue for view_mapping and add FIXME_NVCC comment — pbmille / githubweb
  101. OpenMPTarget: Update hierarchical parallelism. (#6043) — noreply / githubweb
  102. Enable OpenMP in CUDA-11.0-NVCC-RDC to test DEPRECATED_CODE_3=ON (#5978) — noreply / githubweb
  103. fix ternary op in subset of std algorithms not working with nvhpc (#6095) — noreply / githubweb
  104. Add implementation of bit_cast in <Kokkos_BitManipulation.hpp> — Damien L-G / githubweb
  105. Add compile time tests for the constraints on the bit_cast function template — Damien L-G / githubweb
  106. Add the Experimental:: builtin variant (just defer to regular bit_cast) — Damien L-G / githubweb
  107. Add runtime tests for bit_cast — Damien L-G / githubweb
  108. Use Kokkos::bit_cast in SIMD instead of rolling its own — Damien L-G / githubweb
  109. Clang-format glitch — Damien L-G / githubweb
  110. view(uvm): fence if need in allocation (#6005) — romin.tomasetti / githubweb
  111. Disable tests that fail at runtime with NVHPC (likely not liking the class declaration within the body of the functor) — Damien L-G / githubweb
  112. change impl of `is_sorted_until` to use reduce (#6097) — noreply / githubweb
  113. Fix typo and remove accidentally committed assertions — noreply / githubweb
  114. Added multiple reducers support for team-level parallel reduce (#5727) — noreply / githubweb
  115. Work around NVHPC issue with enum types — crtrott / githubweb
  116. Work around NVHPC 23.x issues — crtrott / githubweb
  117. Kokkos: Remove TriBITS Kokkos subpackages (trilinos/Trilinos#11545) (#6104) — noreply / githubweb
  118. Drop pointless Kokkos{Algorithms,Containers}_config.h files — Damien L-G / githubweb
  119. Revert "Merge pull request #5964 from PhilMiller/cuda-lambda-default" — Damien L-G / githubweb
  120. Update the OpenACC parallel_reduce() constructs with Range/MDRange/Team (#6072) — noreply / githubweb
  121. Always pass -extended-lambda option to NVCC and force Kokkos_ENABLE_CUDA_LAMBDA ON — Damien L-G / githubweb
  122. Reorganize ZeroMemset (#6087) — noreply / githubweb
  123. Drop CUDA_LAMBDA guards in Cuda headers — Damien L-G / githubweb
  124. Work around NVHPC 23.x not dealing with __isGlobal — crtrott / githubweb
  125. Drop unused cmake macros — Damien L-G / githubweb
  126. Fixup cmake style — Damien L-G / githubweb
  127. use ASSERT_EQ in all std algorithms tests — fnrizzi / githubweb
  128. Reintroduce test skip for nvhpc < 23.3 — crtrott / githubweb
  129. hpcbind: check for correct Slurm variable — rberger / githubweb
  130. Fix macro guards in test for NVC++ as the CUDA compiler — Damien L-G / githubweb
  131. Allow templated functors in parallel_for, parallel_reduce and parallel_scan (#5976) — noreply / githubweb
  132. Import sycl::bit_cast into the Kokkos namespace — Daniel Arndt / githubweb
  133. Qualify calls possibly ambiguous calls to bit_cast — Daniel Arndt / githubweb
  134. Fix nightlies -- workaround compiler bug in GCC 9.1 and 9.2 (#6118) — noreply / githubweb
  135. Kokkos_BitManipulation: KOKKOS_COMPILER_GCC->KOKKOS_COMPILER_GNU (#6119) — noreply / githubweb
  136. Cuda: Remove unused attach_texture_object — Daniel Arndt / githubweb
  137. Move half traits to private header and add half/bhalf infinity trait (#6055) — noreply / githubweb
  138. Increase minimum required HPX version to 1.8.0 — mikael.simberg / githubweb
  139. Conditionally use hpx::post instead of hpx::apply based on HPX version — mikael.simberg / githubweb
  140. Don't restrict index type in builtin reducers — Daniel Arndt / githubweb
  141. dual view: update template types (#6085) — romin.tomasetti / githubweb
  142. sorting an empty view should exit early and not fail (#6130) — noreply / githubweb
  143. core/src: Move floating_point_wrapper to private header — eharvey / githubweb
  144. Disable tests failing with NVHPC — Daniel Arndt / githubweb
  145. Fix bit_cast for SYCL again — Daniel Arndt / githubweb
  146. Disable tests for OpenMPTarget — Daniel Arndt / githubweb
  147. Improve indentation of comments — Daniel Arndt / githubweb
  148. Allow deprecated declarations in SYCL+Cuda CI — Daniel Arndt / githubweb
  149. Try running for other execution spaces — Daniel Arndt / githubweb
  150. Add guards for Cuda — Daniel Arndt / githubweb
  151. Expand list of kokkos options not to export with cmake — Damien L-G / githubweb
  152. Do not append to Kokkos_OPTIONS variables those in the do not export list — Damien L-G / githubweb
  153. Drop Kokkos_ENABLE_LAUNCH_COMPILER option — Damien L-G / githubweb
  154. Export Kokkos_ENABLE_<OPTION> that are relevant — Damien L-G / githubweb
  155. Drop Kokkos_ENABLE_PROFILING_LOAD_PRINT option — Damien L-G / githubweb
  156. Suppress bogus warning about CUDA_LAMBDA being ON — Damien L-G / githubweb
  157. [ci skip] Add nightly ci for spack (#6135) — noreply / githubweb
  158. OpenMPTarget: Enable Cray compiler for the OpenMPTarget backend. (#5889) — noreply / githubweb
  159. Revert to `DualView<class,class=void,class=void,class=void>` when deprecated code 4 is enabled — Damien L-G / githubweb
  160. Fix Kokkos_ENABLE_CUDA_LAMBDA for Trilinos — Daniel Arndt / githubweb
  161. Fix bogus warnings in nested CUDA parallel_reduce — Daniel Arndt / githubweb
  162. `BinSort`, `BinOp1D`, `BinOp3D`: mark default constructor as deleted (#6131) — noreply / githubweb
  163. KokkosTools: Don't call callbacks before backends are initialized (#6114) — noreply / githubweb
  164. Add `Kokkos::all_libs` alias target for compatibility with TriBITS/Trilinos (#6157) — noreply / githubweb
  165. sorting: add to binsort support for strided views and reorg tests (#6081) — noreply / githubweb
  166. Allow linking against build tree (#6078) — noreply / githubweb
  167. Implement `HPX::in_parallel` (#6143) — noreply / githubweb
  168. OpenMPTarget: Changes for OpenMPTarget backend with nvhpc compiler. — rgayatri / githubweb
  169. OpenMPTarget: Add a fixme. — rgayatri / githubweb
  170. Update Makefile.kokkos — noreply / githubweb
  171. Remove extended_namespace template paramter for SYCLMemoryOrder/Scope — Daniel Arndt / githubweb
  172. OpenMPTarget: update fixme comment. — rgayatri / githubweb
  173. OpenMPTarget: Replace kokkos macros in desul. — rgayatri / githubweb
  174. OpenMPTarget: Restore desul changes. — rgayatri / githubweb
  175. Cherry-pick v3.7.02 changelog into develop [ci skip] — Damien L-G / githubweb
  176. Kokkos: Mark HWLOC as a TriBITS TPL as well (trilinos/Trilinos#11938) — rabartl / githubweb
  177. Clean up FunctorAnalysis — Daniel Arndt / githubweb
  178. SIMD: make binary op tests to test against all data types (#5913) — noreply / githubweb
  179. Also create symlinks for CMake configuration files to cmake_packages/Kokkos for TriBITS (#6163) — noreply / githubweb
  180. Allow passing a temporary std::vector to partition_space (#6167) — noreply / githubweb
  181. .github/workflows: Remove push trigger — eharvey / githubweb
  182. .github/workflows: Only trigger upon push to develop — eharvey / githubweb
  183. Replace _mm512_loadu_epi64 and _mm512_storeu_epi64 with _mm512_loadu_si512 and _mm512_storeu_si512 — donlee / githubweb
  184. OpenMPTarget: include desul changes. — rgayatri / githubweb
  185. Weed out verbose output from `dynamic_view` container unit test (#6173) — noreply / githubweb
  186. shortcut value for is_dynamic_view — fnrizzi / githubweb
  187. add trait and test — fnrizzi / githubweb
  188. Fix global fence in Kokkos::resize(DynRankView) (#6184) — noreply / githubweb
  189. Left align demangled stacktrace output. (#6191) — noreply / githubweb
  190. [HIP] Improve heuristic deciding the number of blocks used in parallel_reduce (#6160) — noreply / githubweb
  191. Improve OpenMP affinity warning to include MPI concerns (#6185) — noreply / githubweb
  192. Update version number on develop after branching off for 4.1.00 — Daniel Arndt / githubweb
  193. Fix test_quad_precision_math_constants test — Daniel Arndt / githubweb
  194. implementation and tests — fnrizzi / githubweb
  195. implementation and tests — fnrizzi / githubweb
  196. [ci skip] test_all_sandia: update compilers and queues — ndellin / githubweb
  197. team-level std algos: common code needed (#6199) — noreply / githubweb
  198. Fix compiling SYCL with KOKKOS_IMPL_DO_NOT_USE_PRINTF_USAGE — Daniel Arndt / githubweb
  199. snapshot mdspan namespace changes (#6162) — noreply / githubweb
  200. Disable AVX512 support for NVHPC — Daniel Arndt / githubweb
  201. Fix host-annotations of AVX2, AVX512, and NEON constructors — Daniel Arndt / githubweb
  202. Introduce impl_get_value/impl_get_mask — Daniel Arndt / githubweb
  203. Fix a gcc-8.4.0 warning — Daniel Arndt / githubweb
  204. Fix host-device annotation for where_expression/const_where_expression — Daniel Arndt / githubweb
  205. Make in-order queues the default via macro — Daniel Arndt / githubweb
  206. Avoid SFINAE in favor of overloads — Daniel Arndt / githubweb
  207. Move scalar overloads to Scalar header — Daniel Arndt / githubweb
  208. Disable KOKKOS_ARCH_AVX512XEON for NVHPC — Daniel Arndt / githubweb
  209. Changelog for 4.1.00 (#6225) — noreply / githubweb
  210. reorganize sort headers (#6230) — noreply / githubweb
  211. SYCL: Support for bhalf_t (#6204) — noreply / githubweb
  212. only compute with relevant entries — tccleve / githubweb
  213. make constraints on `Kokkos::sort` more visible/clear (#6234) — noreply / githubweb
  214. slim API and move code to impl — fnrizzi / githubweb
  215. This PR contains minor code changes and bug fixes needed for LLVM-Clacc — lees2 / githubweb
  216. improve all other corner cases as per review comment — fnrizzi / githubweb
  217. refine for cuda uvm — fnrizzi / githubweb
  218. use exespace to check rather than mem space — fnrizzi / githubweb
  219. Fix AVX2 simd support for ZEN2 AMD CPU. (#6238) — noreply / githubweb
  220. fix corner case — fnrizzi / githubweb
  221. Fix windows symlink configure issue (#6241) — noreply / githubweb
  222. fix corner cases — fnrizzi / githubweb
  223. bug_report.md: new PR branching from `develop` (#5034) — noreply / githubweb
  224. Fix whitespace in bug_report.md (#6244) — noreply / githubweb
  225. Avoid undefined behavior in TestTaskScheduler.hpp — Daniel Arndt / githubweb
  226. Remove calling tribits_exclude_autotools_files() — rabartl / githubweb
  227. Ensure that complex is only instantiated for cv-unqualified floating-point type — Damien L-G / githubweb
  228. Deprecated Kokkos::vector — Damien L-G / githubweb
  229. Warn if <Kokkos_Vector.hpp> is included — Damien L-G / githubweb
  230. Drop Vector test with makefiles and conditionally remove it with CMake — Damien L-G / githubweb
  231. Ignore <Kokkos_Vector.hpp> in the header self-containment tests — Damien L-G / githubweb
  232. SYCL: Use in-order queues in InterOp tests (#6246) — noreply / githubweb
  233. std_algos: fix wrong corner case for `is_partitioned` (#6257) — noreply / githubweb
  234. Make sure macros are defined — Daniel Arndt / githubweb
  235. Error out when Kokkos_Vector.hpp is included with deprecated code disabled — Daniel Arndt / githubweb
  236. SIMD: Add abs() for all int types (#6069) — noreply / githubweb
  237. Fix SIMD abs unit test accidentally using complex overload — Damien L-G / githubweb
  238. Fix SIMD tests on NEON — Daniel Arndt / githubweb
  239. Add default ParallelFor copy constructor for HIP — Bruno Turcksin / githubweb
  240. Workaround gcc/8.2.0 compiler issue with _mm512_abs_pd — ndellin / githubweb
  241. Implement Kokkos::printf (#6083) — noreply / githubweb
  242. Improve SYCL TeamPolicy reduction — Daniel Arndt / githubweb
  243. make Kokkos_CXX_COMPILER_VERSION available to CMake consumers — cwpears / githubweb
  244. Fully qualify Experimental::SYCL in algorithms to avoid finding conflicting namespaces — Daniel Arndt / githubweb
  245. subset of team level impl of std algorithms — fnrizzi / githubweb
  246. fix copyright — fnrizzi / githubweb
  247. guard for openmptarget — fnrizzi / githubweb
  248. fix for openmptarget — fnrizzi / githubweb
  249. address comments — fnrizzi / githubweb
  250. Update CMakeLists for unit tests with OpenMPTarget, OpenACC with NVHPC (#6260) — noreply / githubweb
  251. Update CI from CUDA 11.7.0 to 11.7.1 — Bruno Turcksin / githubweb
  252. Improve SYCL reduction performance: RangePolicy (#6264) — noreply / githubweb
  253. Improve SYCL reduction performance: workgroup_reduction (#6270) — noreply / githubweb
  254. SYCL TeamPolicy: Fix sign comparison warning — Daniel Arndt / githubweb
  255. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  256. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  257. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  258. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  259. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  260. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  261. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  262. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  263. format — fnrizzi / githubweb
  264. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  265. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  266. Suppress warnings — Daniel Arndt / githubweb
  267. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  268. Improve macro definitions — Daniel Arndt / githubweb
  269. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  270. Modify fences in View API test — cezary.skrzynski / githubweb
  271. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  272. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  273. bring back previous code as discussed in meeting — fnrizzi / githubweb
  274. create cudaAPI function wrappers — tccleve / githubweb
  275. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  276. Some api function require cuda11.2+ — tccleve / githubweb
  277. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  278. Rework stream inputs — tccleve / githubweb
  279. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  280. Remove static in comment — tccleve / githubweb
  281. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  282. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  283. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  284. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  285. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  286. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  287. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  288. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  289. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  290. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  291. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  292. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  293. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  294. Fix reviewer's comments — Bruno Turcksin / githubweb
  295. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  296. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  297. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  298. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  299. remove spurious undefs — fnrizzi / githubweb
  300. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  301. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  302. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  303. address reviews [skip ci] — antoine.meyer54 / githubweb
  304. formatting — fnrizzi / githubweb
  305. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  306. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  307. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  308. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  309. allow sorting via native oneDPL to support views with stride = 1 (#6322) — noreply / githubweb
  310. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  311. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  312. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  313. fix lambda capture — fnrizzi / githubweb
  314. remove unnecessary file, fix constraints — fnrizzi / githubweb
  315. remove file — fnrizzi / githubweb
  316. fix — fnrizzi / githubweb
  317. fix lambda capture and constraints — fnrizzi / githubweb
  318. keep only subset — fnrizzi / githubweb
  319. revert files — fnrizzi / githubweb
  320. remove file — fnrizzi / githubweb
  321. fix syntax — fnrizzi / githubweb
  322. format — fnrizzi / githubweb
  323. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  324. Rename AMD GPU architectures (#6266) — noreply / githubweb
  325. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  326. SIMD: add generator constructors (#6347) — noreply / githubweb
  327. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  328. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  329. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  330. Initial implementation of gfx942 (#6358) — noreply / githubweb
  331. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  332. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  333. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  334. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  335. Add support for HIP Graph — Bruno Turcksin / githubweb
  336. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  337. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  338. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  339. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  340. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  341. Use constexpr West in src — Bruno Turcksin / githubweb
  342. Use constexpr West in test — Bruno Turcksin / githubweb
  343. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  344. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  345. team-level std algos: part 2 (#6205) — noreply / githubweb
  346. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  347. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  348. Rebased and applied feedback — donlee / githubweb
  349. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  350. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  351. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  352. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  353. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  354. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  355. More NVC++ 23.7 updates — crtrott / githubweb
  356. NVC++ clang-format fixes — crtrott / githubweb
  357. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  358. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  359. OpenACC: Guard tests relying on abort — crtrott / githubweb
  360. Fix TestAtomic to use the test execspace — crtrott / githubweb
  361. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  362. Update nvhpc gtest skip message — crtrott / githubweb
  363. Work around OpenMPTarget failure — crtrott / githubweb
  364. Update base docker file for nvhpc — crtrott / githubweb
  365. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  366. Fix reviewer's comments — Bruno Turcksin / githubweb
  367. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  368. HIP: Update print_configuration (#6387) — noreply / githubweb
  369. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  370. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  371. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  372. team-level std algos: part 3 (#6207) — noreply / githubweb
  373. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  374. SIMD: add float simd support (#6177) — noreply / githubweb
  375. team-level std algos: part 4 (#6208) — noreply / githubweb
  376. Added a gen ctor for float (#6397) — noreply / githubweb
  377. team-level std algos: part 5 (#6209) — noreply / githubweb
  378. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  379. Fixup checked integer operations death test — Damien L-G / githubweb
  380. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  381. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  382. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  383. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  384. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  385. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  386. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  387. team-level std algos: part 6 (#6210) — noreply / githubweb
  388. address comments — fnrizzi / githubweb
  389. OpenMP backend refactor files. (#6403) — noreply / githubweb
  390. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  391. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  392. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  393. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  394. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  395. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  396. use single — fnrizzi / githubweb
  397. address comments — fnrizzi / githubweb
  398. formatting — fnrizzi / githubweb
  399. Team-level std algos: part 7 (#6211) — noreply / githubweb
  400. formatting — fnrizzi / githubweb
  401. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  402. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  403. Drop check whether device supports unified addressing — Damien L-G / githubweb
  404. fix single as per Christian's suggestion — fnrizzi / githubweb
  405. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  406. check-copyright improvements (#6399) — noreply / githubweb
  407. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  408. Address reviewer's comments — Bruno Turcksin / githubweb
  409. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  410. add comment — fnrizzi / githubweb
  411. improve tests to address review — fnrizzi / githubweb
  412. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  413. avoid potential race condition HIP — tccleve / githubweb
  414. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  415. Same for scan — andrei.elovikov / githubweb
  416. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  417. improve tests with intra-team result check — fnrizzi / githubweb
  418. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  419. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  420. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  421. remove old impl — fnrizzi / githubweb
  422. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  423. Clean up benchmarks/gups — cwpears / githubweb
  424. benchmark/gups: use CMake — cwpears / githubweb
  425. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  426. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  427. add missing assert — fnrizzi / githubweb
  428. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  429. add intra team check for missing test — fnrizzi / githubweb
  430. fix intel compile error — fnrizzi / githubweb
  431. fix unreachable for intel — fnrizzi / githubweb
  432. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  433. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  434. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  435. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  436. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  437. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  438. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  439. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  440. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  441. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  442. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  443. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  444. improve tests (#6432) — noreply / githubweb
  445. improve tests (#6437) — noreply / githubweb
  446. Move final assignment to correct scope — cezary.skrzynski / githubweb
  447. fix casting warning in Random test — fnrizzi / githubweb
  448. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  449. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  450. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  451. improve tests to check intra-team result (#6431) — noreply / githubweb
  452. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  453. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  454. team-level std algos: part 10 (#6256) — noreply / githubweb
  455. team-level std algos: part 11 (#6258) — noreply / githubweb
  456. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  457. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  458. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  459. use shortcut — cezary.skrzynski / githubweb
  460. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  461. Fix formatting — cezary.skrzynski / githubweb
  462. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  463. Use std::is_same_v — cezary.skrzynski / githubweb
  464. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  465. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  466. Use std::is_same_v — cezary.skrzynski / githubweb
  467. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  468. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  469. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  470. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  471. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  472. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  473. Use call operator instead of run_me function — maarten.arnst / githubweb
  474. team-level std algos: part 12 (#6350) — noreply / githubweb
  475. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  476. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  477. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  478. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  479. Allow detecting SIMD types based on compiler macros (#6188) — noreply / githubweb
  480. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  481. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  482. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  483. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  484. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  485. Fix NVCC warnings (#6483) — noreply / githubweb
  486. team-level std algos: part 13 (#6351) — noreply / githubweb
  487. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  488. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  489. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  490. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  491. fix impl — fnrizzi / githubweb
  492. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  493. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  494. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  495. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  496. add overload for TeamThreadRange — fnrizzi / githubweb
  497. address review comment — fnrizzi / githubweb
  498. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  499. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  500. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  501. add threadvector — fnrizzi / githubweb
  502. fix order — fnrizzi / githubweb
  503. remove guards — fnrizzi / githubweb
  504. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  505. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  506. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  507. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  508. Update CI in OpenMPTarget to use llvm-17 (#6472) — noreply / githubweb
  509. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  510. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  511. add guards — fnrizzi / githubweb
  512. avoid auto — fnrizzi / githubweb
  513. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  514. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  515. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  516. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  517. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  518. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  519. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  520. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  521. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  522. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  523. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  524. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  525. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  526. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  527. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  528. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  529. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  530. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  531. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  532. Remove sleep and wake functions — Bruno Turcksin / githubweb
  533. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  534. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  535. Remove unused variables — Bruno Turcksin / githubweb
  536. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  537. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  538. Remove useless forward declaration — Bruno Turcksin / githubweb
  539. Remove spawn function — Bruno Turcksin / githubweb
  540. Add comments — Bruno Turcksin / githubweb
  541. Fix indentation — Bruno Turcksin / githubweb
  542. Fix typo in macro guard — Bruno Turcksin / githubweb
  543. Reduce number of View constructor instantiations — Damien L-G / githubweb
  544. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  545. Split files in HIP backend — Bruno Turcksin / githubweb
  546. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  547. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  548. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  549. Threads remove unused variables and functions (#6566) — noreply / githubweb
  550. Remove unused Sandia testing files (#6568) — noreply / githubweb
  551. fallback implementation cleanup — donlee / githubweb
  552. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  553. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  554. Drop Clang+CUDA workaround — Damien L-G / githubweb
  555. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  556. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  557. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  558. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  559. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  560. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  561. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  562. Replace Marsaglia polar method with Box-Muller to generate a normally distributed random number (#6556) — noreply / githubweb
  563. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  564. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  565. OpenACC: add atomics support (#6446) — noreply / githubweb
  566. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  567. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  568. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  569. try fix — fnrizzi / githubweb
  570. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  571. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  572. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  573. Added missing operator* to NEON simd — crtrott / githubweb
  574. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  575. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  576. try fix — fnrizzi / githubweb
  577. avoid pyt package — fnrizzi / githubweb
  578. try — fnrizzi / githubweb
  579. fix for macos — fnrizzi / githubweb
  580. remove comments — fnrizzi / githubweb
  581. use reference — crtrott / githubweb
  582. add branching — fnrizzi / githubweb
  583. [ci skip] fix formatting — cezary.skrzynski / githubweb
  584. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  585. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  586. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  587. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  588. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  589. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  590. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  591. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  592. unorderedmap: modernize traits — romin.tomasetti / githubweb
  593. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb