Changes

#2 (Dec 1, 2023, 11:52:02 AM)

  1. Add support for HIP Graph — Bruno Turcksin / githubweb
  2. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  3. Fix reviewer's comments — Bruno Turcksin / githubweb
  4. Address reviewer' comments — Bruno Turcksin / githubweb
  5. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  6. Same for scan — andrei.elovikov / githubweb
  7. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  8. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  9. Use std::is_same_v — cezary.skrzynski / githubweb
  10. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  11. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  12. Use std::is_same_v — cezary.skrzynski / githubweb
  13. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  14. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  15. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  16. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  17. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  18. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  19. Use call operator instead of run_me function — maarten.arnst / githubweb
  20. team-level std algos: part 12 (#6350) — noreply / githubweb
  21. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  22. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  23. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  24. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  25. Allow detecting SIMD types based on compiler macros (#6188) — noreply / githubweb
  26. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  27. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  28. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  29. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  30. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  31. Fix NVCC warnings (#6483) — noreply / githubweb
  32. team-level std algos: part 13 (#6351) — noreply / githubweb
  33. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  34. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  35. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  36. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  37. fix impl — fnrizzi / githubweb
  38. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  39. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  40. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  41. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  42. add overload for TeamThreadRange — fnrizzi / githubweb
  43. address review comment — fnrizzi / githubweb
  44. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  45. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  46. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  47. ad threadvector — fnrizzi / githubweb
  48. fix order — fnrizzi / githubweb
  49. remove guards — fnrizzi / githubweb
  50. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  51. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  52. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  53. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  54. Update CI in OpenMPTarget to use llvm-17 (#6472) — noreply / githubweb
  55. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  56. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  57. add guards — fnrizzi / githubweb
  58. avoid auto — fnrizzi / githubweb
  59. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  60. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  61. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  62. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  63. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  64. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  65. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  66. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  67. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  68. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  69. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  70. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  71. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  72. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  73. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  74. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  75. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  76. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  77. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  78. Remove sleep and wake functions — Bruno Turcksin / githubweb
  79. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  80. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  81. Remove unused variables — Bruno Turcksin / githubweb
  82. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  83. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  84. Remove useless forward declaration — Bruno Turcksin / githubweb
  85. Remove spawn function — Bruno Turcksin / githubweb
  86. Add comments — Bruno Turcksin / githubweb
  87. Fix indentation — Bruno Turcksin / githubweb
  88. Fix typo in macro guard — Bruno Turcksin / githubweb
  89. Reduce number of View constructor instantiations — Damien L-G / githubweb
  90. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  91. Split files in HIP backend — Bruno Turcksin / githubweb
  92. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  93. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  94. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  95. Threads remove unused variables and functions (#6566) — noreply / githubweb
  96. Remove unused Sandia testing files (#6568) — noreply / githubweb
  97. fallback implementation cleanup — donlee / githubweb
  98. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  99. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  100. Drop Clang+CUDA workaround — Damien L-G / githubweb
  101. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  102. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  103. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  104. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  105. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  106. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  107. Desul atomics: Trade SYCL-specific compile definition for a macro defintion in the configuration header — Damien L-G / githubweb
  108. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  109. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  110. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro defintion in the configuration header" — noreply / githubweb
  111. OpenACC: add atomics support (#6446) — noreply / githubweb
  112. Fix infinity, quiet_NaN, signaling_Nan, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  113. kokkos(unique): fix allocation of temporary view to enfore using the provided space instance — romin.tomasetti / githubweb
  114. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  115. try fix — fnrizzi / githubweb
  116. Fix TestNumericTriats.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  117. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  118. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  119. Added missing operator* to NEON simd — crtrott / githubweb
  120. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  121. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  122. try fix — fnrizzi / githubweb
  123. avoid pyt package — fnrizzi / githubweb
  124. try — fnrizzi / githubweb
  125. fix for macos — fnrizzi / githubweb
  126. remove comments — fnrizzi / githubweb
  127. use reference — crtrott / githubweb
  128. add branching — fnrizzi / githubweb
  129. [ci skip] fix formatting — cezary.skrzynski / githubweb
  130. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  131. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  132. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  133. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  134. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  135. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  136. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  137. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  138. unorderedmap: modernize traits — romin.tomasetti / githubweb
  139. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb

#2 (Dec 1, 2023, 11:52:02 AM)

  1. Add support for HIP Graph — Bruno Turcksin / githubweb
  2. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  3. Fix reviewer's comments — Bruno Turcksin / githubweb
  4. Address reviewer' comments — Bruno Turcksin / githubweb
  5. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  6. Same for scan — andrei.elovikov / githubweb
  7. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  8. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  9. Use std::is_same_v — cezary.skrzynski / githubweb
  10. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  11. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  12. Use std::is_same_v — cezary.skrzynski / githubweb
  13. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  14. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  15. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  16. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  17. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  18. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  19. Use call operator instead of run_me function — maarten.arnst / githubweb
  20. team-level std algos: part 12 (#6350) — noreply / githubweb
  21. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  22. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  23. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  24. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  25. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  26. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  27. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  28. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  29. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  30. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  31. Fix NVCC warnings (#6483) — noreply / githubweb
  32. team-level std algos: part 13 (#6351) — noreply / githubweb
  33. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  34. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  35. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  36. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  37. fix impl — fnrizzi / githubweb
  38. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  39. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  40. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  41. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  42. add overload for TeamThreadRange — fnrizzi / githubweb
  43. address review comment — fnrizzi / githubweb
  44. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  45. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  46. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  47. ad threadvector — fnrizzi / githubweb
  48. fix order — fnrizzi / githubweb
  49. remove guards — fnrizzi / githubweb
  50. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  51. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  52. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  53. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  54. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  55. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  56. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  57. add guards — fnrizzi / githubweb
  58. avoid auto — fnrizzi / githubweb
  59. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  60. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  61. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  62. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  63. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  64. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  65. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  66. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  67. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  68. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  69. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  70. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  71. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  72. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  73. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  74. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  75. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  76. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  77. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  78. Remove sleep and wake functions — Bruno Turcksin / githubweb
  79. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  80. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  81. Remove unused variables — Bruno Turcksin / githubweb
  82. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  83. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  84. Remove useless forward declaration — Bruno Turcksin / githubweb
  85. Remove spawn function — Bruno Turcksin / githubweb
  86. Add comments — Bruno Turcksin / githubweb
  87. Fix indentation — Bruno Turcksin / githubweb
  88. Fix typo in macro guard — Bruno Turcksin / githubweb
  89. Reduce number of View constructor instantiations — Damien L-G / githubweb
  90. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  91. Split files in HIP backend — Bruno Turcksin / githubweb
  92. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  93. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  94. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  95. Threads remove unused variables and functions (#6566) — noreply / githubweb
  96. Remove unused Sandia testing files (#6568) — noreply / githubweb
  97. fallback implementation cleanup — donlee / githubweb
  98. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  99. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  100. Drop Clang+CUDA workaround — Damien L-G / githubweb
  101. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  102. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  103. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  104. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  105. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  106. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  107. Desul atomics: Trade SYCL-specific compile definition for a macro defintion in the configuration header — Damien L-G / githubweb
  108. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  109. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  110. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro defintion in the configuration header" — noreply / githubweb
  111. OpenACC: add atomics support (#6446) — noreply / githubweb
  112. Fix infinity, quiet_NaN, signaling_Nan, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  113. kokkos(unique): fix allocation of temporary view to enfore using the provided space instance — romin.tomasetti / githubweb
  114. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  115. try fix — fnrizzi / githubweb
  116. Fix TestNumericTriats.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  117. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  118. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  119. Added missing operator* to NEON simd — crtrott / githubweb
  120. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  121. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  122. try fix — fnrizzi / githubweb
  123. avoid pyt package — fnrizzi / githubweb
  124. try — fnrizzi / githubweb
  125. fix for macos — fnrizzi / githubweb
  126. remove comments — fnrizzi / githubweb
  127. use reference — crtrott / githubweb
  128. add branching — fnrizzi / githubweb
  129. [ci skip] fix formatting — cezary.skrzynski / githubweb
  130. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  131. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  132. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  133. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  134. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  135. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  136. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  137. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  138. unorderedmap: modernize traits — romin.tomasetti / githubweb
  139. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
