Success

Changes

Summary

  1. SYCL: Cleanup device selection (details)
  2. Adding SECURITY.md file (details)
  3. Cleanup KOKKOS_CONFIGURE_CORE (details)
  4. Avoid detecting unwanted fences in the parallel_scan_no_fence test (details)
  5. Add test for current CTAD support with RangePolicy (#6803) (details)
  6. multi-GPU support: Add test for all policies (#6782) (details)
  7. Implement sort_by_key (#6801) (details)
  8. Fix scorecard workflow (#6831) (details)
  9. Remove redundant RangePolicy constructor (#6841) (details)
  10. Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent (details)
  11. Remove variadic range policy constructor (#6845) (details)
  12. Avoid unused variable warning in TestRangePolicyCTAD.cpp (details)
  13. bytes_and_flops with CMake (details)
  14. bytes_and_flops fix a small bug in command line argument (details)
  15. add gather benchmark to CMake (details)
  16. add atomic benchmark to CMake (details)
  17. add policy_performance benchmark to CMake (details)
  18. fix some warnings in policy_performance benchmark (details)
  19. policy_benchmark: fix indentation (details)
  20. Benchmarks: disable 2 benchmarks for OpenMPTarget (details)
  21. bytes_and_flops: fix a counter name (details)
  22. Fix C-style cast (details)
  23. Fix amdclang++ compilation (#6857) (details)
  24. Fix fallback implementation for sort_by_key (#6856) (details)
  25. Update the unsafe implicit conversion error message in MDRangePolicy (#6855) (details)
  26. Split some classes from Kokkos_ViewMapping (#6859) (details)
  27. update mdspan tpl (details)
  28. CTAD (deduction guides) for RangePolicy (#6850) (details)
  29. CTAD (deduction guides) for MDRangePolicy (#5516) (details)
  30. Add support for rocThrust in sort when using HIP (#6793) (details)
  31. Add HIP specialization for sort-by-key (details)
  32. Fix unused variable warning in TestMDRangePolicyCTAD.cpp (details)
  33. Improve message on view out of bounds access and always abort (#6861) (details)
  34. Fix linking with rocthrust in downstream applications (details)
  35. SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870) (details)
  36. Don't use Fedora development version in GitHub CI (details)
  37. kokkos_tpls.cmake: update default option to enable rocthrust (details)
  38. Scratch space fix for MultiGPU (#6866) (details)
  39. [ci skip] Bump version number to 4.3.99 (details)
  40. Fix Makefile.kokkos for Threads (details)
  41. Fix a bug in Makefile when using AMD GPU architectures (#6892) (details)
  42. Cuda: Fix configuring with CMake 3.28.4 (#6898) (details)
  43. Update Intel GPU architectures in Makefile (#6895) (details)
  44. Move `Kokkos::Array` tests to a more suitable place (#6905) (details)
  45. Fixed and simplified the unit test (details)
Commit 37962b3d2f9a26ce3fdcb80229ab3a73106f456b by Daniel Arndt
SYCL: Cleanup device selection
The file was modified core/src/SYCL/Kokkos_SYCL.cpp (diff)
Commit 2a8ac6f48a467e91b0abce5edf22b34fbfe50e4f by crtrott
Adding SECURITY.md file

This is based on the one from OSSF Scorecard
The file was added SECURITY.md
Commit c3f0a2698327fbbc25a1f26674b8f60ae9eb701b by Daniel Arndt
Cleanup KOKKOS_CONFIGURE_CORE
The file was modified Makefile.kokkos (diff)
The file was modified core/src/Kokkos_Core.hpp (diff)
The file was modified cmake/kokkos_tribits.cmake (diff)
Commit e2c810e1f213a09bce152eba623b7f275c035ec8 by Daniel Arndt
Avoid detecting unwanted fences in the parallel_scan_no_fence test
The file was modified core/unit_test/tools/TestEventCorrectness.hpp (diff)
Commit 24f251a854b36c13804e1ac850e8ae2a9e2bcdac by noreply
Add test for current CTAD support with RangePolicy (#6803)

* Add test for current CTAD support with RangePolicy

Co-authored-by: Nevin Liber <nliber@anl.gov>

* Rework CTAD test to avoid "member <bla> was declared but never referenced" warnings with icpc 19

* Attempt to fix CI

* Attempt to eliminate maybe unused warning in icpc

* Disable CTAD tests for nvcc < 11.2
as compiler bugs prevent CTAD expressions inside decltype

---------

Co-authored-by: Nevin Liber <nliber@anl.gov>
Co-authored-by: Nevin ":-)" Liber <nliber+github@gmail.com>
The file was added core/unit_test/TestRangePolicyCTAD.cpp
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 16a5ebe95e88bf103ff7459ca1b1ce9983426eed by noreply
multi-GPU support: Add test for all policies (#6782)

* Cuda multi-GPU support: Test with managed and unmanaged Views

* Move check for cuda_device

* Also test copying between devices

* Refactor using StreamsAndDevices

* Don't use shared_ptr
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit c90a9c6f7bf439b4e224a0eaabcc2af815980dc5 by noreply
Implement sort_by_key (#6801)

* Implement sort_by_key

* Address review comments

* Make passed in view const ref

* Fix _via_sort for OpenMPTarget

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

* Rip out ROCThrust for now

* Few changes to address the comments

* Fix SYCL

* Fix SYCL implementation

---------

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
The file was modified algorithms/unit_tests/CMakeLists.txt (diff)
The file was modified algorithms/src/Kokkos_Sort.hpp (diff)
The file was added algorithms/src/sorting/Kokkos_SortByKeyPublicAPI.hpp
The file was added algorithms/unit_tests/TestSortByKey.hpp
The file was added algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp
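
For readers unfamiliar with the operation, the following is a minimal host-only sketch of what "sort by key" means: two equally sized sequences are reordered together so that the keys end up sorted and each value stays paired with its key. It is not the Kokkos implementation and does not use Kokkos Views. The name sort_by_key_sketch, the use of std::vector, and the choice of a stable sort over a permutation are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a Kokkos API): sorts `keys` ascending and applies
// the same reordering to `values`.
template <class Key, class Value>
void sort_by_key_sketch(std::vector<Key>& keys, std::vector<Value>& values) {
  if (keys.size() != values.size())
    throw std::invalid_argument("keys and values must have the same length");

  // Build the identity permutation, then sort it by comparing the keys.
  std::vector<std::size_t> perm(keys.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::stable_sort(perm.begin(), perm.end(),
                   [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });

  // Gather keys and values through the permutation.
  std::vector<Key> sorted_keys;
  std::vector<Value> sorted_values;
  sorted_keys.reserve(keys.size());
  sorted_values.reserve(values.size());
  for (std::size_t i : perm) {
    sorted_keys.push_back(keys[i]);
    sorted_values.push_back(values[i]);
  }
  keys = std::move(sorted_keys);
  values = std::move(sorted_values);
}

// Small usage example: each value follows its key.
inline void sort_by_key_demo() {
  std::vector<int> keys{3, 1, 2};
  std::vector<char> values{'c', 'a', 'b'};
  sort_by_key_sketch(keys, values);
  assert((keys == std::vector<int>{1, 2, 3}));
  assert((values == std::vector<char>{'a', 'b', 'c'}));
}
```

Sorting a permutation rather than the pairs themselves keeps the key and value types independent; whether the real fallback works this way is not stated in the commit message.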
Commit 058c3a08e6b871f2ddfe3774f390ee473525df5e by noreply
Fix scorecard workflow (#6831)

* Add Scorecard

* another one

* Don't run on pull requests

* Add comments back in
The file was modified .github/workflows/scorecard.yml (diff)
Commit 04a5334c699cb9b87293d27bc73090b3b7c13019 by noreply
Remove redundant RangePolicy constructor (#6841)

* Remove redundant RangePolicy constructor

* Disable test case for Cuda <11.2 completely
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit 0cdc9eb768582bec4af8f0e44057e235ca8277f0 by Damien L-G
Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent
The file was modified core/perf_test/CMakeLists.txt (diff)
Commit 8b8de2cf47cd85005b6adafb8fe5c99b57c32c23 by noreply
Remove variadic range policy constructor (#6845)

* Removed variadic ctor from RangePolicy, as well as extra
set(...) methods.

* In RangePolicy:
Deprecated set(ChunkSize)
In the ctor that takes a ChunkSize, set the chunk size via
set_chunk_size instead of via member initializers (to be more consistent
with other code)

* Removed superfluous "inline" from two of the RangePolicy constructors
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit dc524910d397feb7050b39f34d38dfd7d4f82b65 by Daniel Arndt
Avoid unused variable warning in TestRangePolicyCTAD.cpp
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 277339090e02e6555cf4f8508bd4b1f78bb6d2a2 by cedric.chevalier
bytes_and_flops with CMake

Fix some warnings as well
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_double.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench.hpp (diff)
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int64_t.cpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/bytes_and_flops/bench_int32_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_float.cpp (diff)
The file was added benchmarks/bytes_and_flops/CMakeLists.txt
The file was modified benchmarks/bytes_and_flops/bench_stride.hpp (diff)
Commit 5c9a4aa3ce9360892d8c558e19618357366b2640 by cedric.chevalier
bytes_and_flops fix a small bug in command line argument
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 932466f21364b7d99aa06d5e499c7e4bf1a1c00d by cedric.chevalier
add gather benchmark to CMake
The file was added benchmarks/gather/CMakeLists.txt
The file was modified benchmarks/gather/gather.hpp (diff)
The file was modified benchmarks/gather/main.cpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/gather/gather_unroll.hpp (diff)
Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was added benchmarks/atomic/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was added benchmarks/policy_performance/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/main.cpp (diff)
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul's suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was added core/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
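
To illustrate the deduction-guide idea described in this commit message, here is a self-contained C++17 sketch that uses toy types instead of Kokkos. Policy, DefaultExec, OtherExec, is_exec and exec_of are hypothetical stand-ins, not Kokkos names. The sketch assumes, as the message suggests, that the class template has only a parameter pack, so a constructor taking the class's own execution-space type cannot deduce a non-default space without an explicit guide.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for execution spaces; these are NOT Kokkos types.
struct DefaultExec {};
struct OtherExec {};

template <class T> struct is_exec : std::false_type {};
template <> struct is_exec<DefaultExec> : std::true_type {};
template <> struct is_exec<OtherExec> : std::true_type {};

// Assumption: the first property names the execution space; an empty pack
// falls back to DefaultExec.
template <class... P> struct exec_of { using type = DefaultExec; };
template <class E, class... P> struct exec_of<E, P...> { using type = E; };

template <class... Properties>
class Policy {
 public:
  using exec_type = typename exec_of<Properties...>::type;
  constexpr Policy() = default;
  constexpr Policy(std::int64_t b, std::int64_t e) : begin_(b), end_(e) {}
  constexpr Policy(const exec_type&, std::int64_t b, std::int64_t e)
      : begin_(b), end_(e) {}
  constexpr std::int64_t size() const { return end_ - begin_; }

 private:
  std::int64_t begin_ = 0;
  std::int64_t end_ = 0;
};

// Explicit deduction guides: spell out which specialization each call form
// should produce instead of relying on the empty-pack default.
Policy() -> Policy<>;
Policy(std::int64_t, std::int64_t) -> Policy<>;
template <class E, class = std::enable_if_t<is_exec<E>::value>>
Policy(const E&, std::int64_t, std::int64_t) -> Policy<E>;

// Test form quoted in the commit message: a variable plus a static_assert.
// The default-constructed case uses curly braces, as in the notes above.
[[maybe_unused]] static inline auto p_default = Policy{};
static_assert(std::is_same_v<Policy<>, decltype(p_default)>);

[[maybe_unused]] static inline auto p_range = Policy(0, 10);
static_assert(std::is_same_v<Policy<>, decltype(p_range)>);

[[maybe_unused]] static inline auto p_other = Policy(OtherExec{}, 0, 10);
static_assert(std::is_same_v<Policy<OtherExec>, decltype(p_other)>);
```

In this toy, the third guide is what makes Policy(OtherExec{}, 0, 10) compile at all, since the constructor-generated candidate only accepts DefaultExec.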
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was added core/unit_test/TestMDRangePolicyCTAD.cpp
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was modified cmake/KokkosCore_config.h.in (diff)
The file was added cmake/Modules/FindTPLROCTHRUST.cmake
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified .jenkins (diff)
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unnecessary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silence warnings about tracker variable not being used on the device
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
The file was added core/unit_test/TestViewOutOfBoundsAccess.hpp
The file was modified core/unit_test/CMakeLists.txt (diff)
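
The bullets above mention a within_range check, a right fold, and mixed integer index types. The sketch below shows one way such a check could be written in C++17. It is an illustration only and is not taken from the Kokkos sources: within_range_sketch, in_extent, and the use of std::array for the extents are assumptions.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helper: true if index i lies in [0, extent).
// Handles signed and unsigned index types without sign-compare surprises.
template <class Index>
constexpr bool in_extent(Index i, std::size_t extent) {
  if constexpr (std::is_signed_v<Index>) {
    if (i < 0) return false;  // negative indices are always out of bounds
  }
  return static_cast<std::size_t>(i) < extent;
}

template <std::size_t N, class... Indices, std::size_t... Is>
constexpr bool within_range_impl(const std::array<std::size_t, N>& extents,
                                 std::index_sequence<Is...>, Indices... idx) {
  // Unary right fold over &&: every index must be inside its extent.
  return (in_extent(idx, extents[Is]) && ...);
}

// Hypothetical entry point: one index per dimension, any integer types.
template <std::size_t N, class... Indices>
constexpr bool within_range_sketch(const std::array<std::size_t, N>& extents,
                                   Indices... idx) {
  static_assert(sizeof...(Indices) == N, "one index per dimension expected");
  return within_range_impl(extents, std::index_sequence_for<Indices...>{},
                           idx...);
}

// Usage: a 3 x 4 shape checked with mixed integer index types.
static_assert(within_range_sketch(std::array<std::size_t, 2>{3, 4}, 2, 3u));
static_assert(!within_range_sketch(std::array<std::size_t, 2>{3, 4}, -1, 0));
static_assert(!within_range_sketch(std::array<std::size_t, 2>{3, 4}, 0, 4L));
```

A caller could abort with a formatted message when the check returns false; how the real code reports the failing index is not shown here.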
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rocthrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified core/src/CMakeLists.txt (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
CMake Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified CMakeLists.txt (diff)
The file was modified Makefile.kokkos (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Workaround bogus(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestArray.cpp (diff)
The file was modified core/unit_test/TestAggregate.hpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)

Summary

  1. SYCL: Cleanup device selection (details)
  2. Adding SECURITY.md file (details)
  3. Cleanup KOKKOS_CONFIGURE_CORE (details)
  4. Avoid detecting unwanted fences in the parallel_scan_no_fence test (details)
  5. Add test for current CTAD support with RangePolicy (#6803) (details)
  6. multi-GPU support: Add test for all policies (#6782) (details)
  7. Implement sort_by_key (#6801) (details)
  8. Fix scorecard workflow (#6831) (details)
  9. Remove redundant RangePolicy constructor (#6841) (details)
  10. Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent (details)
  11. Remove variadic range policy constructor (#6845) (details)
  12. Avoid unused variable warning in TestRangePolicyCTAD.cpp (details)
  13. bytes_and_flops with CMake (details)
  14. bytes_and_flops fix a small bug in command line argument (details)
  15. add gather benchmark to CMake (details)
  16. add atomic benchmark to CMake (details)
  17. add policy_performance benchmark to CMake (details)
  18. fix some warnings in policy_performance benchmark (details)
  19. policy_benchmark: fix indentation (details)
  20. Benchmarks: disable 2 benchmarks for OpenMPTarget (details)
  21. bytes_and_flops: fix a counter name (details)
  22. Fix C-style cast (details)
  23. Fix amdclang++ compilation (#6857) (details)
  24. Fix fallback implementation for sort_by_key (#6856) (details)
  25. Update the unsafe implicit conversion error message in MDRangePolicy (#6855) (details)
  26. Split some classes from Kokkos_ViewMapping (#6859) (details)
  27. update mdspan tpl (details)
  28. CTAD (deduction guides) for RangePolicy (#6850) (details)
  29. CTAD (deduction guides) for MDRangePolicy (#5516) (details)
  30. Add support for rocThrust in sort when using HIP (#6793) (details)
  31. Add HIP specialization for sort-by-key (details)
  32. Fix unused variable warning in TestMDRangePolicyCTAD.cpp (details)
  33. Improve message on view out of bounds access and always abort (#6861) (details)
  34. Fix linking with rothrust in downstream applications (details)
  35. SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870) (details)
  36. Don't use Fedora development version in GitHub CI (details)
  37. kokkos_tpls.cmake: update default option to enable rocthrust (details)
  38. Scratch space fix for MultiGPU (#6866) (details)
  39. [ci skip] Bump version number to 4.3.99 (details)
  40. Fix Makefile.kokkos for Threads (details)
  41. Fix a bug in Makefile when using AMD GPU architectures (#6892) (details)
  42. Cuda: Fix configuring with CMake 3.28.4 (#6898) (details)
  43. Update Intel GPU architectures in Makefile (#6895) (details)
  44. Move `Kokkos::Array` tests to a more suitable place (#6905) (details)
  45. Fixed and simplified the unit test (details)
Commit 37962b3d2f9a26ce3fdcb80229ab3a73106f456b by Daniel Arndt
SYCL: Cleanup device selection
The file was modified core/src/SYCL/Kokkos_SYCL.cpp (diff)
Commit 2a8ac6f48a467e91b0abce5edf22b34fbfe50e4f by crtrott
Adding SECURITY.md file

This is based on the one from OSSF Scorecard
The file was addedSECURITY.md
Commit c3f0a2698327fbbc25a1f26674b8f60ae9eb701b by Daniel Arndt
Cleanup KOKKOS_CONFIGURE_CORE
The file was modified cmake/kokkos_tribits.cmake (diff)
The file was modified core/src/Kokkos_Core.hpp (diff)
The file was modified Makefile.kokkos (diff)
Commit e2c810e1f213a09bce152eba623b7f275c035ec8 by Daniel Arndt
Avoid detecting unwanted fences in the parallel_scan_no_fence test
The file was modified core/unit_test/tools/TestEventCorrectness.hpp (diff)
Commit 24f251a854b36c13804e1ac850e8ae2a9e2bcdac by noreply
Add test for current CTAD support with RangePolicy (#6803)

* Add test for current CTAD support with RangePolicy

Co-authored-by: Nevin Liber <nliber@anl.gov>

* Rework CTAD test to avoid "memeber <bla> was declared but never referenced" warnings with icpc 19

* Attempt to fix CI

* Attempt to eliminate maybe unused warning in icpc

* Disable CTAD tests for nvcc < 11.2
as compiler bugs prevent CTAD expressions inside decltype

---------

Co-authored-by: Nevin Liber <nliber@anl.gov>
Co-authored-by: Nevin ":-)" Liber <nliber+github@gmail.com>
The file was addedcore/unit_test/TestRangePolicyCTAD.cpp
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 16a5ebe95e88bf103ff7459ca1b1ce9983426eed by noreply
multi-GPU support: Add test for all policies (#6782)

* Cuda multi-GPU support: Test with managed and unmanaged Views

* Move check for cuda_device

* Also test copying between devices

* Refactor using StreamsAndDevices

* Don't use shared_ptr
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit c90a9c6f7bf439b4e224a0eaabcc2af815980dc5 by noreply
Implement sort_by_key (#6801)

* Implement sort_by_key

* Address review comments

* Make passed in view const ref

* Fix _via_sort for OpenMPTarget

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

* Rip out ROCThrust for now

* Few changes to address the comments

* Fix SYCL

* Fix SYCL implementation

---------

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
The file was modified algorithms/unit_tests/CMakeLists.txt (diff)
The file was addedalgorithms/unit_tests/TestSortByKey.hpp
The file was addedalgorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp
The file was modified algorithms/src/Kokkos_Sort.hpp (diff)
The file was addedalgorithms/src/sorting/Kokkos_SortByKeyPublicAPI.hpp
Commit 058c3a08e6b871f2ddfe3774f390ee473525df5e by noreply
Fix scorecard workflow (#6831)

* Add Scorecrad

* another one

* Don't run on pull requests

* Add comments back in
The file was modified .github/workflows/scorecard.yml (diff)
Commit 04a5334c699cb9b87293d27bc73090b3b7c13019 by noreply
Remove redundant RangePolicy constructor (#6841)

* Remove redundant RangePolicy constructor

* Disable test case for Cuda <11.2 completely
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit 0cdc9eb768582bec4af8f0e44057e235ca8277f0 by Damien L-G
Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent
The file was modified core/perf_test/CMakeLists.txt (diff)
Commit 8b8de2cf47cd85005b6adafb8fe5c99b57c32c23 by noreply
Remove variadic range policy constructor (#6845)

* Removed variadic ctor from RangePolicy, as well as extra
set(...) methods.

* In RangePolicy:
Deprecated set(ChunkSize)
In the ctor that takes a ChunkSize, set the chunk size via
set_chunk_size instead of via member initializers (to be more consistent
with other code)

* Removed superfluous "inline" from two of the RangePolicy constructors
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit dc524910d397feb7050b39f34d38dfd7d4f82b65 by Daniel Arndt
Avoid unused variable warning in TestRangePolicyCTAD.cpp
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 277339090e02e6555cf4f8508bd4b1f78bb6d2a2 by cedric.chevalier
bytes_and_flops with CMake

Fix some warnings as well
The file was modified benchmarks/bytes_and_flops/bench_double.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int64_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_float.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was addedbenchmarks/bytes_and_flops/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/bytes_and_flops/bench.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int32_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 5c9a4aa3ce9360892d8c558e19618357366b2640 by cedric.chevalier
bytes_and_flops fix a small bug in command line argument
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 932466f21364b7d99aa06d5e499c7e4bf1a1c00d by cedric.chevalier
add gather benchmark to CMake
The file was modified benchmarks/gather/gather_unroll.hpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/gather/main.cpp (diff)
The file was addedbenchmarks/gather/CMakeLists.txt
The file was modified benchmarks/gather/gather.hpp (diff)
Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was addedbenchmarks/atomic/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was addedbenchmarks/policy_performance/CMakeLists.txt
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/main.cpp (diff)
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was added core/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
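The deduction-guide approach described in this commit can be sketched with stand-in types. Everything below is an assumption made for illustration: the two execution space structs, the is_execution_space_v trait, the first_or_default helper, and the simplified RangePolicy are hypothetical and are not the Kokkos definitions. The sketch shows explicit guides mapping constructor arguments to RangePolicy<> or RangePolicy<ES>, checked in the static inline plus static_assert form quoted in the commit message.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for illustration only; these are not the Kokkos types.
struct DefaultExecutionSpace {};
struct SomeExecutionSpace {};

template <class T>
inline constexpr bool is_execution_space_v =
    std::is_same_v<T, DefaultExecutionSpace> ||
    std::is_same_v<T, SomeExecutionSpace>;

// Uses the first property as the execution space, else the default one.
template <class... Ps>
struct first_or_default { using type = DefaultExecutionSpace; };
template <class P, class... Ps>
struct first_or_default<P, Ps...> { using type = P; };

template <class... Properties>
struct RangePolicy {
  using execution_space = typename first_or_default<Properties...>::type;

  RangePolicy() = default;
  RangePolicy(std::int64_t b, std::int64_t e) : m_begin(b), m_end(e) {}
  RangePolicy(const execution_space&, std::int64_t b, std::int64_t e)
      : m_begin(b), m_end(e) {}

  std::int64_t m_begin = 0;
  std::int64_t m_end   = 0;
};

// Explicit deduction guides: none of the constructor parameters mention
// Properties in a deducible way, so the guides state the result directly.
RangePolicy() -> RangePolicy<>;
RangePolicy(std::int64_t, std::int64_t) -> RangePolicy<>;
RangePolicy(const DefaultExecutionSpace&, std::int64_t, std::int64_t)
    -> RangePolicy<>;
template <class ES, class = std::enable_if_t<is_execution_space_v<ES>>>
RangePolicy(const ES&, std::int64_t, std::int64_t) -> RangePolicy<ES>;

// Compile-time checks in the form quoted in the commit message.
struct RangePolicyCTADChecks {
  // Braces for the default-constructed case, as noted in the commit.
  [[maybe_unused]] static inline auto rp_default = RangePolicy{};
  static_assert(std::is_same_v<RangePolicy<>, decltype(rp_default)>);

  [[maybe_unused]] static inline auto rp_range = RangePolicy(0, 10);
  static_assert(std::is_same_v<RangePolicy<>, decltype(rp_range)>);

  [[maybe_unused]] static inline auto rp_des =
      RangePolicy(DefaultExecutionSpace{}, 0, 10);
  static_assert(std::is_same_v<RangePolicy<>, decltype(rp_des)>);

  [[maybe_unused]] static inline auto rp_ses =
      RangePolicy(SomeExecutionSpace{}, 0, 10);
  static_assert(
      std::is_same_v<RangePolicy<SomeExecutionSpace>, decltype(rp_ses)>);
};
```

In this sketch the non-template guide for DefaultExecutionSpace wins over the templated one, so passing the default execution space still yields RangePolicy<>, while any other execution space appears as the template argument.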
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was added core/unit_test/TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was modified .jenkins (diff)
The file was modified cmake/KokkosCore_config.h.in (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
The file was added cmake/Modules/FindTPLROCTHRUST.cmake
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unnecessary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silence warnings about tracker variable not being used on the device
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was added core/unit_test/TestViewOutOfBoundsAccess.hpp
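The commit above mentions a within_range check, a right fold, and a test with mixed integer types. A minimal sketch of how such a check could look in C++17 follows. It is an assumption-laden illustration rather than the Kokkos implementation: the signature, the std::array of extents, and the helper index_in_extent are hypothetical, and the sketch only returns a bool instead of printing a message and aborting.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helper: true when a single index lies in [0, extent).
// Negative values of signed index types are rejected before the cast.
template <class T>
constexpr bool index_in_extent(T i, std::size_t extent) {
  if constexpr (std::is_signed_v<T>) {
    if (i < 0) return false;
  }
  return static_cast<std::size_t>(i) < extent;
}

template <std::size_t... Ds, class... Is>
constexpr bool within_range_impl(
    const std::array<std::size_t, sizeof...(Is)>& extents,
    std::index_sequence<Ds...>, Is... is) {
  // Unary right fold over &&: c0 && (c1 && (c2 && ...)).
  return (index_in_extent(is, extents[Ds]) && ...);
}

// Hypothetical signature: one index per dimension, any integer types.
template <class... Is>
constexpr bool within_range(
    const std::array<std::size_t, sizeof...(Is)>& extents, Is... is) {
  return within_range_impl(extents, std::index_sequence_for<Is...>{}, is...);
}
```

For a 3 by 4 extent array, within_range(ext, 2, 3) holds while within_range(ext, 3, 0) does not. Passing a number of indices that differs from the number of extents fails to compile, which acts as a rank check in this sketch.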
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rocthrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
The file was modified core/src/CMakeLists.txt (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
CMake Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified CMakeLists.txt (diff)
The file was modified Makefile.kokkos (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Workaround bogus(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestArray.cpp (diff)
The file was modified core/unit_test/TestAggregate.hpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)

Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was addedbenchmarks/atomic/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was addedbenchmarks/policy_performance/CMakeLists.txt
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/main.cpp (diff)
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was addedcore/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was addedtpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was addedtpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse.o, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was addedcore/unit_test/TestMDRangePolicyCTAD.cpp
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was modified cmake/KokkosCore_config.h.in (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
The file was addedcmake/Modules/FindTPLROCTHRUST.cmake
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified .jenkins (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unecesary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silent warnings about tracker variable not being used on the device
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was addedcore/unit_test/TestViewOutOfBoundsAccess.hpp
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rothrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified core/src/CMakeLists.txt (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
Make Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified Makefile.kokkos (diff)
The file was modified CMakeLists.txt (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Work around bogus(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestArray.cpp (diff)
The file was modified core/unit_test/TestAggregate.hpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)


Summary

  1. SYCL: Cleanup device selection (details)
  2. Adding SECURITY.md file (details)
  3. Cleanup KOKKOS_CONFIGURE_CORE (details)
  4. Avoid detecting unwanted fences in the parallel_scan_no_fence test (details)
  5. Add test for current CTAD support with RangePolicy (#6803) (details)
  6. multi-GPU support: Add test for all policies (#6782) (details)
  7. Implement sort_by_key (#6801) (details)
  8. Fix scorecard workflow (#6831) (details)
  9. Remove redundant RangePolicy constructor (#6841) (details)
  10. Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent (details)
  11. Remove variadic range policy constructor (#6845) (details)
  12. Avoid unused variable warning in TestRangePolicyCTAD.cpp (details)
  13. bytes_and_flops with CMake (details)
  14. bytes_and_flops fix a small bug in command line argument (details)
  15. add gather benchmark to CMake (details)
  16. add atomic benchmark to CMake (details)
  17. add policy_performance benchmark to CMake (details)
  18. fix some warnings in policy_performance benchmark (details)
  19. policy_benchmark: fix indentation (details)
  20. Benchmarks: disable 2 benchmarks for OpenMPTarget (details)
  21. bytes_and_flops: fix a counter name (details)
  22. Fix C-style cast (details)
  23. Fix amdclang++ compilation (#6857) (details)
  24. Fix fallback implementation for sort_by_key (#6856) (details)
  25. Update the unsafe implicit conversion error message in MDRangePolicy (#6855) (details)
  26. Split some classes from Kokkos_ViewMapping (#6859) (details)
  27. update mdspan tpl (details)
  28. CTAD (deduction guides) for RangePolicy (#6850) (details)
  29. CTAD (deduction guides) for MDRangePolicy (#5516) (details)
  30. Add support for rocThrust in sort when using HIP (#6793) (details)
  31. Add HIP specialization for sort-by-key (details)
  32. Fix unused variable warning in TestMDRangePolicyCTAD.cpp (details)
  33. Improve message on view out of bounds access and always abort (#6861) (details)
  34. Fix linking with rothrust in downstream applications (details)
  35. SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870) (details)
  36. Don't use Fedora development version in GitHub CI (details)
  37. kokkos_tpls.cmake: update default option to enable rocthrust (details)
  38. Scratch space fix for MultiGPU (#6866) (details)
  39. [ci skip] Bump version number to 4.3.99 (details)
  40. Fix Makefile.kokkos for Threads (details)
  41. Fix a bug in Makefile when using AMD GPU architectures (#6892) (details)
  42. Cuda: Fix configuring with CMake 3.28.4 (#6898) (details)
  43. Update Intel GPU architectures in Makefile (#6895) (details)
  44. Move `Kokkos::Array` tests to a more suitable place (#6905) (details)
  45. Fixed and simplified the unit test (details)
Commit 37962b3d2f9a26ce3fdcb80229ab3a73106f456b by Daniel Arndt
SYCL: Cleanup device selection
The file was modified core/src/SYCL/Kokkos_SYCL.cpp (diff)
Commit 2a8ac6f48a467e91b0abce5edf22b34fbfe50e4f by crtrott
Adding SECURITY.md file

This is based on the one from OSSF Scorecard
The file was addedSECURITY.md
Commit c3f0a2698327fbbc25a1f26674b8f60ae9eb701b by Daniel Arndt
Cleanup KOKKOS_CONFIGURE_CORE
The file was modified core/src/Kokkos_Core.hpp (diff)
The file was modified Makefile.kokkos (diff)
The file was modified cmake/kokkos_tribits.cmake (diff)
Commit e2c810e1f213a09bce152eba623b7f275c035ec8 by Daniel Arndt
Avoid detecting unwanted fences in the parallel_scan_no_fence test
The file was modified core/unit_test/tools/TestEventCorrectness.hpp (diff)
Commit 24f251a854b36c13804e1ac850e8ae2a9e2bcdac by noreply
Add test for current CTAD support with RangePolicy (#6803)

* Add test for current CTAD support with RangePolicy

Co-authored-by: Nevin Liber <nliber@anl.gov>

* Rework CTAD test to avoid "memeber <bla> was declared but never referenced" warnings with icpc 19

* Attempt to fix CI

* Attempt to eliminate maybe unused warning in icpc

* Disable CTAD tests for nvcc < 11.2
as compiler bugs prevent CTAD expressions inside decltype

---------

Co-authored-by: Nevin Liber <nliber@anl.gov>
Co-authored-by: Nevin ":-)" Liber <nliber+github@gmail.com>
The file was addedcore/unit_test/TestRangePolicyCTAD.cpp
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 16a5ebe95e88bf103ff7459ca1b1ce9983426eed by noreply
multi-GPU support: Add test for all policies (#6782)

* Cuda multi-GPU support: Test with managed and unmanaged Views

* Move check for cuda_device

* Also test copying between devices

* Refactor using StreamsAndDevices

* Don't use shared_ptr
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit c90a9c6f7bf439b4e224a0eaabcc2af815980dc5 by noreply
Implement sort_by_key (#6801)

* Implement sort_by_key

* Address review comments

* Make passed in view const ref

* Fix _via_sort for OpenMPTarget

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

* Rip out ROCThrust for now

* Few changes to address the comments

* Fix SYCL

* Fix SYCL implementation

---------

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
The file was addedalgorithms/unit_tests/TestSortByKey.hpp
The file was addedalgorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp
The file was addedalgorithms/src/sorting/Kokkos_SortByKeyPublicAPI.hpp
The file was modified algorithms/src/Kokkos_Sort.hpp (diff)
The file was modified algorithms/unit_tests/CMakeLists.txt (diff)
Commit 058c3a08e6b871f2ddfe3774f390ee473525df5e by noreply
Fix scorecard workflow (#6831)

* Add Scorecrad

* another one

* Don't run on pull requests

* Add comments back in
The file was modified .github/workflows/scorecard.yml (diff)
Commit 04a5334c699cb9b87293d27bc73090b3b7c13019 by noreply
Remove redundant RangePolicy constructor (#6841)

* Remove redundant RangePolicy constructor

* Disable test case for Cuda <11.2 completely
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 0cdc9eb768582bec4af8f0e44057e235ca8277f0 by Damien L-G
Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent
The file was modified core/perf_test/CMakeLists.txt (diff)
Commit 8b8de2cf47cd85005b6adafb8fe5c99b57c32c23 by noreply
Remove variadic range policy constructor (#6845)

* Removed variadic ctor from RangePolicy, as well as extra
set(...) methods.

* In RangePolicy:
Deprecated set(ChunkSize)
In the ctor that takes a ChunkSize, set the chunk size via
set_chunk_size instead of via member initializers (to be more consistent
with other code)

* Removed superfluous "inline" from two of the RangePolicy constructors
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit dc524910d397feb7050b39f34d38dfd7d4f82b65 by Daniel Arndt
Avoid unused variable warning in TestRangePolicyCTAD.cpp
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 277339090e02e6555cf4f8508bd4b1f78bb6d2a2 by cedric.chevalier
bytes_and_flops with CMake

Fix some warnings as well
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was addedbenchmarks/bytes_and_flops/CMakeLists.txt
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_float.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int64_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_double.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int32_t.cpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 5c9a4aa3ce9360892d8c558e19618357366b2640 by cedric.chevalier
bytes_and_flops fix a small bug in command line argument
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 932466f21364b7d99aa06d5e499c7e4bf1a1c00d by cedric.chevalier
add gather benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/gather/gather.hpp (diff)
The file was modified benchmarks/gather/gather_unroll.hpp (diff)
The file was modified benchmarks/gather/main.cpp (diff)
The file was addedbenchmarks/gather/CMakeLists.txt
Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was addedbenchmarks/atomic/CMakeLists.txt
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was addedbenchmarks/policy_performance/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/main.cpp (diff)
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was added core/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
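The deduction guides described in #6850 can be illustrated without Kokkos. The following is a minimal sketch, assuming a hypothetical `Policy<Props...>` class and two hypothetical tag types, `DefaultSpace` and `OtherSpace`. None of these names come from Kokkos, and the guides Kokkos actually ships may differ. It shows why explicit guides help when no template parameter can be deduced from the constructor arguments, and it uses the `static inline auto` plus `static_assert` test form quoted in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for execution spaces (not Kokkos types).
struct DefaultSpace {};
struct OtherSpace {};

// Hypothetical policy: Props... never appears in a constructor signature,
// so a constructor call alone gives the compiler nothing to deduce.
template <class... Props>
struct Policy {
  std::int64_t begin = 0;
  std::int64_t end   = 0;

  Policy() = default;
  Policy(std::int64_t b, std::int64_t e) : begin(b), end(e) {}
  template <class Space>
  Policy(const Space&, std::int64_t b, std::int64_t e) : begin(b), end(e) {}
};

// Explicit deduction guides: state the intended result for each call form.
Policy() -> Policy<>;
Policy(std::int64_t, std::int64_t) -> Policy<>;
Policy(const DefaultSpace&, std::int64_t, std::int64_t) -> Policy<>;
template <class Space>
Policy(const Space&, std::int64_t, std::int64_t) -> Policy<Space>;

// Compile-time checks: declare a variable, then compare its deduced type.
struct PolicyCTADChecks {
  // Curly braces for the default-constructed case, as in the commit notes.
  [[maybe_unused]] static inline auto p0 = Policy{};
  static_assert(std::is_same_v<Policy<>, decltype(p0)>);

  [[maybe_unused]] static inline auto p1 = Policy(0, 10);
  static_assert(std::is_same_v<Policy<>, decltype(p1)>);

  [[maybe_unused]] static inline auto p2 = Policy(DefaultSpace{}, 0, 10);
  static_assert(std::is_same_v<Policy<>, decltype(p2)>);

  [[maybe_unused]] static inline auto p3 = Policy(OtherSpace{}, 0, 10);
  static_assert(std::is_same_v<Policy<OtherSpace>, decltype(p3)>);
};

// Run-time sanity check that the deduced objects carry the given range.
inline void run_ctad_checks() {
  assert(PolicyCTADChecks::p1.begin == 0 && PolicyCTADChecks::p1.end == 10);
  assert(PolicyCTADChecks::p3.end == 10);
}
```

Declaring a variable and then inspecting `decltype` of it keeps the class template argument deduction out of the `decltype` expression itself, which is the construct the commit message reports as problematic under nvcc.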
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was added core/unit_test/TestMDRangePolicyCTAD.cpp
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified cmake/KokkosCore_config.h.in (diff)
The file was added cmake/Modules/FindTPLROCTHRUST.cmake
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified .jenkins (diff)
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unnecessary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silence warnings about tracker variable not being used on the device
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
The file was added core/unit_test/TestViewOutOfBoundsAccess.hpp
The file was modified core/unit_test/CMakeLists.txt (diff)
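Two bullets in #6861 mention renaming `check_bounds` to `within_range` and preferring a right fold. As an illustration only, here is a minimal standalone sketch of a multidimensional bounds check written as a C++17 unary right fold, including the mixed integer types the commit adds a test for. The helper names `index_in_extent` and `within_range_impl`, and the `std::array` extents argument, are assumptions made for this example. The actual signature in `Kokkos_ViewMapping.hpp` is not shown in this log and may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helper: true when 0 <= i < extent. Signed and unsigned index
// types are handled separately so a negative index never wraps around.
template <class Index>
constexpr bool index_in_extent(Index i, std::size_t extent) {
  static_assert(std::is_integral_v<Index>, "indices must be integers");
  if constexpr (std::is_signed_v<Index>) {
    if (i < 0) return false;
  }
  return static_cast<unsigned long long>(i) <
         static_cast<unsigned long long>(extent);
}

// Pairs the k-th index with the k-th extent and combines the results.
template <class Extents, std::size_t... Is, class... Indices>
constexpr bool within_range_impl(const Extents& extents,
                                 std::index_sequence<Is...>, Indices... idx) {
  // Unary right fold over &&: (c0 && (c1 && (c2 && ...))).
  return (index_in_extent(idx, extents[Is]) && ...);
}

// Hypothetical signature: one extent per index, any mix of integer types.
template <class... Indices>
constexpr bool within_range(
    const std::array<std::size_t, sizeof...(Indices)>& extents,
    Indices... idx) {
  return within_range_impl(extents, std::index_sequence_for<Indices...>{},
                           idx...);
}

// Compile-time checks with mixed integer types.
static_assert(within_range(std::array<std::size_t, 2>{3, 4}, 2, 3u));
static_assert(!within_range(std::array<std::size_t, 2>{3, 4}, -1, 0));
static_assert(!within_range(std::array<std::size_t, 2>{3, 4}, 0, 4L));
```

Because `&&` short-circuits, evaluation stops at the first index that is out of range. A caller that wants the abort behaviour described in the commit would test `!within_range(...)` and then report the offending indices.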
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rothrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
The file was modified core/src/CMakeLists.txt (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
CMake Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified Makefile.kokkos (diff)
The file was modified CMakeLists.txt (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Workaround bogus(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestArray.cpp (diff)
The file was modified core/unit_test/TestAggregate.hpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)

Summary

  1. SYCL: Cleanup device selection (details)
  2. Adding SECURITY.md file (details)
  3. Cleanup KOKKOS_CONFIGURE_CORE (details)
  4. Avoid detecting unwanted fences in the parallel_scan_no_fence test (details)
  5. Add test for current CTAD support with RangePolicy (#6803) (details)
  6. multi-GPU support: Add test for all policies (#6782) (details)
  7. Implement sort_by_key (#6801) (details)
  8. Fix scorecard workflow (#6831) (details)
  9. Remove redundant RangePolicy constructor (#6841) (details)
  10. Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent (details)
  11. Remove variadic range policy constructor (#6845) (details)
  12. Avoid unused variable warning in TestRangePolicyCTAD.cpp (details)
  13. bytes_and_flops with CMake (details)
  14. bytes_and_flops fix a small bug in command line argument (details)
  15. add gather benchmark to CMake (details)
  16. add atomic benchmark to CMake (details)
  17. add policy_performance benchmark to CMake (details)
  18. fix some warnings in policy_performance benchmark (details)
  19. policy_benchmark: fix indentation (details)
  20. Benchmarks: disable 2 benchmarks for OpenMPTarget (details)
  21. bytes_and_flops: fix a counter name (details)
  22. Fix C-style cast (details)
  23. Fix amdclang++ compilation (#6857) (details)
  24. Fix fallback implementation for sort_by_key (#6856) (details)
  25. Update the unsafe implicit conversion error message in MDRangePolicy (#6855) (details)
  26. Split some classes from Kokkos_ViewMapping (#6859) (details)
  27. update mdspan tpl (details)
  28. CTAD (deduction guides) for RangePolicy (#6850) (details)
  29. CTAD (deduction guides) for MDRangePolicy (#5516) (details)
  30. Add support for rocThrust in sort when using HIP (#6793) (details)
  31. Add HIP specialization for sort-by-key (details)
  32. Fix unused variable warning in TestMDRangePolicyCTAD.cpp (details)
  33. Improve message on view out of bounds access and always abort (#6861) (details)
  34. Fix linking with rothrust in downstream applications (details)
  35. SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870) (details)
  36. Don't use Fedora development version in GitHub CI (details)
  37. kokkos_tpls.cmake: update default option to enable rocthrust (details)
  38. Scratch space fix for MultiGPU (#6866) (details)
  39. [ci skip] Bump version number to 4.3.99 (details)
  40. Fix Makefile.kokkos for Threads (details)
  41. Fix a bug in Makefile when using AMD GPU architectures (#6892) (details)
  42. Cuda: Fix configuring with CMake 3.28.4 (#6898) (details)
  43. Update Intel GPU architectures in Makefile (#6895) (details)
  44. Move `Kokkos::Array` tests to a more suitable place (#6905) (details)
  45. Fixed and simplified the unit test (details)
Commit 37962b3d2f9a26ce3fdcb80229ab3a73106f456b by Daniel Arndt
SYCL: Cleanup device selection
The file was modified core/src/SYCL/Kokkos_SYCL.cpp (diff)
Commit 2a8ac6f48a467e91b0abce5edf22b34fbfe50e4f by crtrott
Adding SECURITY.md file

This is based on the one from OSSF Scorecard
The file was added SECURITY.md
Commit c3f0a2698327fbbc25a1f26674b8f60ae9eb701b by Daniel Arndt
Cleanup KOKKOS_CONFIGURE_CORE
The file was modified core/src/Kokkos_Core.hpp (diff)
The file was modified Makefile.kokkos (diff)
The file was modified cmake/kokkos_tribits.cmake (diff)
Commit e2c810e1f213a09bce152eba623b7f275c035ec8 by Daniel Arndt
Avoid detecting unwanted fences in the parallel_scan_no_fence test
The file was modified core/unit_test/tools/TestEventCorrectness.hpp (diff)
Commit 24f251a854b36c13804e1ac850e8ae2a9e2bcdac by noreply
Add test for current CTAD support with RangePolicy (#6803)

* Add test for current CTAD support with RangePolicy

Co-authored-by: Nevin Liber <nliber@anl.gov>

* Rework CTAD test to avoid "member <bla> was declared but never referenced" warnings with icpc 19

* Attempt to fix CI

* Attempt to eliminate maybe unused warning in icpc

* Disable CTAD tests for nvcc < 11.2
as compiler bugs prevent CTAD expressions inside decltype

---------

Co-authored-by: Nevin Liber <nliber@anl.gov>
Co-authored-by: Nevin ":-)" Liber <nliber+github@gmail.com>
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was added core/unit_test/TestRangePolicyCTAD.cpp
Commit 16a5ebe95e88bf103ff7459ca1b1ce9983426eed by noreply
multi-GPU support: Add test for all policies (#6782)

* Cuda multi-GPU support: Test with managed and unmanaged Views

* Move check for cuda_device

* Also test copying between devices

* Refactor using StreamsAndDevices

* Don't use shared_ptr
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit c90a9c6f7bf439b4e224a0eaabcc2af815980dc5 by noreply
Implement sort_by_key (#6801)

* Implement sort_by_key

* Address review comments

* Make passed in view const ref

* Fix _via_sort for OpenMPTarget

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

* Rip out ROCThrust for now

* Few changes to address the comments

* Fix SYCL

* Fix SYCL implementation

---------

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
The file was modified algorithms/unit_tests/CMakeLists.txt (diff)
The file was added algorithms/src/sorting/Kokkos_SortByKeyPublicAPI.hpp
The file was added algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp
The file was modified algorithms/src/Kokkos_Sort.hpp (diff)
The file was added algorithms/unit_tests/TestSortByKey.hpp
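For readers unfamiliar with the operation added in #6801, a sort-by-key reorders the keys into sorted order and applies the same reordering to a second sequence of values. Below is a minimal host-only sketch using the C++ standard library. The function name `sort_by_key_via_permutation` and the use of `std::vector` are assumptions for illustration. This is neither the Kokkos API nor its implementation, which operates on Views and has backend-specific paths according to the commits in this list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: sort `keys` ascending and reorder `values` to match.
template <class Key, class Value>
void sort_by_key_via_permutation(std::vector<Key>& keys,
                                 std::vector<Value>& values) {
  assert(keys.size() == values.size());

  // Step 1: compute the permutation that sorts the keys.
  std::vector<std::size_t> perm(keys.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::stable_sort(perm.begin(), perm.end(),
                   [&keys](std::size_t a, std::size_t b) {
                     return keys[a] < keys[b];
                   });

  // Step 2: apply that permutation to both sequences.
  std::vector<Key> sorted_keys;
  std::vector<Value> sorted_values;
  sorted_keys.reserve(keys.size());
  sorted_values.reserve(values.size());
  for (std::size_t src : perm) {
    sorted_keys.push_back(keys[src]);
    sorted_values.push_back(values[src]);
  }
  keys   = std::move(sorted_keys);
  values = std::move(sorted_values);
}
```

`std::stable_sort` is used here so that values with equal keys keep their relative order. Whether the Kokkos routine gives the same guarantee is not stated in this log.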
Commit 058c3a08e6b871f2ddfe3774f390ee473525df5e by noreply
Fix scorecard workflow (#6831)

* Add Scorecard

* another one

* Don't run on pull requests

* Add comments back in
The file was modified .github/workflows/scorecard.yml (diff)
Commit 04a5334c699cb9b87293d27bc73090b3b7c13019 by noreply
Remove redundant RangePolicy constructor (#6841)

* Remove redundant RangePolicy constructor

* Disable test case for Cuda <11.2 completely
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 0cdc9eb768582bec4af8f0e44057e235ca8277f0 by Damien L-G
Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent
The file was modified core/perf_test/CMakeLists.txt (diff)
Commit 8b8de2cf47cd85005b6adafb8fe5c99b57c32c23 by noreply
Remove variadic range policy constructor (#6845)

* Removed variadic ctor from RangePolicy, as well as extra
set(...) methods.

* In RangePolicy:
Deprecated set(ChunkSize)
In the ctor that takes a ChunkSize, set the chunk size via
set_chunk_size instead of via member initializers (to be more consistent
with other code)

* Removed superfluous "inline" from two of the RangePolicy constructors
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit dc524910d397feb7050b39f34d38dfd7d4f82b65 by Daniel Arndt
Avoid unused variable warning in TestRangePolicyCTAD.cpp
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 277339090e02e6555cf4f8508bd4b1f78bb6d2a2 by cedric.chevalier
bytes_and_flops with CMake

Fix some warnings as well
The file was modified benchmarks/bytes_and_flops/bench.hpp (diff)
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was added benchmarks/bytes_and_flops/CMakeLists.txt
The file was modified benchmarks/bytes_and_flops/bench_int64_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_float.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_double.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int32_t.cpp (diff)
Commit 5c9a4aa3ce9360892d8c558e19618357366b2640 by cedric.chevalier
bytes_and_flops fix a small bug in command line argument
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 932466f21364b7d99aa06d5e499c7e4bf1a1c00d by cedric.chevalier
add gather benchmark to CMake
The file was modified benchmarks/gather/gather.hpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/gather/gather_unroll.hpp (diff)
The file was added benchmarks/gather/CMakeLists.txt
The file was modified benchmarks/gather/main.cpp (diff)
Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was added benchmarks/atomic/CMakeLists.txt
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was added benchmarks/policy_performance/CMakeLists.txt
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul's suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was added core/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was added core/unit_test/TestMDRangePolicyCTAD.cpp
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was added cmake/Modules/FindTPLROCTHRUST.cmake
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
The file was modified cmake/KokkosCore_config.h.in (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified .jenkins (diff)
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unnecessary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silence warnings about tracker variable not being used on the device
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was added core/unit_test/TestViewOutOfBoundsAccess.hpp
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rothrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
The file was modified core/src/CMakeLists.txt (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
CMake Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified CMakeLists.txt (diff)
The file was modified Makefile.kokkos (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Workaround bogus(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestAggregate.hpp (diff)
The file was modified core/unit_test/TestArray.cpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)

Summary

  1. SYCL: Cleanup device selection (details)
  2. Adding SECURITY.md file (details)
  3. Cleanup KOKKOS_CONFIGURE_CORE (details)
  4. Avoid detecting unwanted fences in the parallel_scan_no_fence test (details)
  5. Add test for current CTAD support with RangePolicy (#6803) (details)
  6. multi-GPU support: Add test for all policies (#6782) (details)
  7. Implement sort_by_key (#6801) (details)
  8. Fix scorecard workflow (#6831) (details)
  9. Remove redundant RangePolicy constructor (#6841) (details)
  10. Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent (details)
  11. Remove variadic range policy constructor (#6845) (details)
  12. Avoid unused variable warning in TestRangePolicyCTAD.cpp (details)
  13. bytes_and_flops with CMake (details)
  14. bytes_and_flops fix a small bug in command line argument (details)
  15. add gather benchmark to CMake (details)
  16. add atomic benchmark to CMake (details)
  17. add policy_performance benchmark to CMake (details)
  18. fix some warnings in policy_performance benchmark (details)
  19. policy_benchmark: fix indentation (details)
  20. Benchmarks: disable 2 benchmarks for OpenMPTarget (details)
  21. bytes_and_flops: fix a counter name (details)
  22. Fix C-style cast (details)
  23. Fix amdclang++ compilation (#6857) (details)
  24. Fix fallback implementation for sort_by_key (#6856) (details)
  25. Update the unsafe implicit conversion error message in MDRangePolicy (#6855) (details)
  26. Split some classes from Kokkos_ViewMapping (#6859) (details)
  27. update mdspan tpl (details)
  28. CTAD (deduction guides) for RangePolicy (#6850) (details)
  29. CTAD (deduction guides) for MDRangePolicy (#5516) (details)
  30. Add support for rocThrust in sort when using HIP (#6793) (details)
  31. Add HIP specialization for sort-by-key (details)
  32. Fix unused variable warning in TestMDRangePolicyCTAD.cpp (details)
  33. Improve message on view out of bounds access and always abort (#6861) (details)
  34. Fix linking with rocthrust in downstream applications (details)
  35. SYCL: Make sure to call find_dependency for oneDPL if necessary (#6870) (details)
  36. Don't use Fedora development version in GitHub CI (details)
  37. kokkos_tpls.cmake: update default option to enable rocthrust (details)
  38. Scratch space fix for MultiGPU (#6866) (details)
  39. [ci skip] Bump version number to 4.3.99 (details)
  40. Fix Makefile.kokkos for Threads (details)
  41. Fix a bug in Makefile when using AMD GPU architectures (#6892) (details)
  42. Cuda: Fix configuring with CMake 3.28.4 (#6898) (details)
  43. Update Intel GPU architectures in Makefile (#6895) (details)
  44. Move `Kokkos::Array` tests to a more suitable place (#6905) (details)
  45. Fixed and simplified the unit test (details)
Commit 37962b3d2f9a26ce3fdcb80229ab3a73106f456b by Daniel Arndt
SYCL: Cleanup device selection
The file was modified core/src/SYCL/Kokkos_SYCL.cpp (diff)
Commit 2a8ac6f48a467e91b0abce5edf22b34fbfe50e4f by crtrott
Adding SECURITY.md file

This is based on the one from OSSF Scorecard
The file was added SECURITY.md
Commit c3f0a2698327fbbc25a1f26674b8f60ae9eb701b by Daniel Arndt
Cleanup KOKKOS_CONFIGURE_CORE
The file was modified cmake/kokkos_tribits.cmake (diff)
The file was modified core/src/Kokkos_Core.hpp (diff)
The file was modified Makefile.kokkos (diff)
Commit e2c810e1f213a09bce152eba623b7f275c035ec8 by Daniel Arndt
Avoid detecting unwanted fences in the parallel_scan_no_fence test
The file was modified core/unit_test/tools/TestEventCorrectness.hpp (diff)
Commit 24f251a854b36c13804e1ac850e8ae2a9e2bcdac by noreply
Add test for current CTAD support with RangePolicy (#6803)

* Add test for current CTAD support with RangePolicy

Co-authored-by: Nevin Liber <nliber@anl.gov>

* Rework CTAD test to avoid "member <bla> was declared but never referenced" warnings with icpc 19

* Attempt to fix CI

* Attempt to eliminate maybe unused warning in icpc

* Disable CTAD tests for nvcc < 11.2
as compiler bugs prevent CTAD expressions inside decltype

---------

Co-authored-by: Nevin Liber <nliber@anl.gov>
Co-authored-by: Nevin ":-)" Liber <nliber+github@gmail.com>
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was added core/unit_test/TestRangePolicyCTAD.cpp
Commit 16a5ebe95e88bf103ff7459ca1b1ce9983426eed by noreply
multi-GPU support: Add test for all policies (#6782)

* Cuda multi-GPU support: Test with managed and unmanaged Views

* Move check for cuda_device

* Also test copying between devices

* Refactor using StreamsAndDevices

* Don't use shared_ptr
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit c90a9c6f7bf439b4e224a0eaabcc2af815980dc5 by noreply
Implement sort_by_key (#6801)

* Implement sort_by_key

* Address review comments

* Make passed in view const ref

* Fix _via_sort for OpenMPTarget

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

* Rip out ROCThrust for now

* Few changes to address the comments

* Fix SYCL

* Fix SYCL implementation

---------

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
The file was modified algorithms/src/Kokkos_Sort.hpp (diff)
The file was added algorithms/unit_tests/TestSortByKey.hpp
The file was added algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp
The file was modified algorithms/unit_tests/CMakeLists.txt (diff)
The file was added algorithms/src/sorting/Kokkos_SortByKeyPublicAPI.hpp
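As a rough illustration of what a sort-by-key operation does, the sketch below reorders a key sequence and applies the same permutation to a value sequence. It uses only the C++ standard library; the name sort_by_key_sketch and the std::vector interface are assumptions made for this example and are not the Kokkos API or its implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not the Kokkos API): sort `keys` in ascending order
// and apply the same permutation to `values`.
template <class Key, class Value>
void sort_by_key_sketch(std::vector<Key>& keys, std::vector<Value>& values) {
  assert(keys.size() == values.size());

  // Build the permutation that sorts the keys; stable_sort keeps the
  // relative order of equal keys.
  std::vector<std::size_t> perm(keys.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::stable_sort(
      perm.begin(), perm.end(),
      [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });

  // Apply the permutation to both sequences through temporaries.
  std::vector<Key> sorted_keys;
  std::vector<Value> sorted_values;
  sorted_keys.reserve(keys.size());
  sorted_values.reserve(values.size());
  for (std::size_t i : perm) {
    sorted_keys.push_back(keys[i]);
    sorted_values.push_back(values[i]);
  }
  keys   = std::move(sorted_keys);
  values = std::move(sorted_values);
}
```

Going through an explicit permutation costs extra memory, but it keeps the keys and values paired without requiring a combined element type.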
Commit 058c3a08e6b871f2ddfe3774f390ee473525df5e by noreply
Fix scorecard workflow (#6831)

* Add Scorecard

* another one

* Don't run on pull requests

* Add comments back in
The file was modified .github/workflows/scorecard.yml (diff)
Commit 04a5334c699cb9b87293d27bc73090b3b7c13019 by noreply
Remove redundant RangePolicy constructor (#6841)

* Remove redundant RangePolicy constructor

* Disable test case for Cuda <11.2 completely
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 0cdc9eb768582bec4af8f0e44057e235ca8277f0 by Damien L-G
Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent
The file was modified core/perf_test/CMakeLists.txt (diff)
Commit 8b8de2cf47cd85005b6adafb8fe5c99b57c32c23 by noreply
Remove variadic range policy constructor (#6845)

* Removed variadic ctor from RangePolicy, as well as extra
set(...) methods.

* In RangePolicy:
Deprecated set(ChunkSize)
In the ctor that takes a ChunkSize, set the chunk size via
set_chunk_size instead of via member initializers (to be more consistent
with other code)

* Removed superfluous "inline" from two of the RangePolicy constructors
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit dc524910d397feb7050b39f34d38dfd7d4f82b65 by Daniel Arndt
Avoid unused variable warning in TestRangePolicyCTAD.cpp
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 277339090e02e6555cf4f8508bd4b1f78bb6d2a2 by cedric.chevalier
bytes_and_flops with CMake

Fix some warnings as well
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/bytes_and_flops/bench_int32_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_float.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int64_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench.hpp (diff)
The file was added benchmarks/bytes_and_flops/CMakeLists.txt
The file was modified benchmarks/bytes_and_flops/bench_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_double.cpp (diff)
Commit 5c9a4aa3ce9360892d8c558e19618357366b2640 by cedric.chevalier
bytes_and_flops fix a small bug in command line argument
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 932466f21364b7d99aa06d5e499c7e4bf1a1c00d by cedric.chevalier
add gather benchmark to CMake
The file was modified benchmarks/gather/gather_unroll.hpp (diff)
The file was modified benchmarks/gather/main.cpp (diff)
The file was modified benchmarks/gather/gather.hpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was added benchmarks/gather/CMakeLists.txt
Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was added benchmarks/atomic/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was added benchmarks/policy_performance/CMakeLists.txt
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul's suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was added core/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
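The test form described in this commit can be sketched without Kokkos. The example below is a minimal stand-in, assuming a hypothetical variadic Policy template plus made-up DefaultSpace and OtherSpace tags; none of these names come from Kokkos. It shows explicit deduction guides that map a default-space argument to Policy<> and any other space to Policy<Space>, checked with the static inline variable plus static_assert pattern quoted above.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for execution spaces (not Kokkos types).
struct DefaultSpace {};
struct OtherSpace {};

// Pick the first property as the space, or DefaultSpace if there is none.
template <class... Props>
struct first_or_default {
  using type = DefaultSpace;
};
template <class First, class... Rest>
struct first_or_default<First, Rest...> {
  using type = First;
};

// Hypothetical policy with no directly deducible template parameters.
template <class... Props>
struct Policy {
  using space = typename first_or_default<Props...>::type;
  Policy() = default;
  Policy(std::int64_t b, std::int64_t e) : begin(b), end(e) {}
  Policy(const space&, std::int64_t b, std::int64_t e) : begin(b), end(e) {}
  std::int64_t begin = 0;
  std::int64_t end   = 0;
};

// Explicit deduction guides: the default space deduces to Policy<>,
// any other space type is carried into the template arguments.
Policy() -> Policy<>;
Policy(std::int64_t, std::int64_t) -> Policy<>;
Policy(const DefaultSpace&, std::int64_t, std::int64_t) -> Policy<>;
template <class Space>
Policy(const Space&, std::int64_t, std::int64_t) -> Policy<Space>;

// Test pattern: deduce into a variable, then compare its type.
[[maybe_unused]] static inline auto policy_empty = Policy{};
static_assert(std::is_same_v<Policy<>, decltype(policy_empty)>);

[[maybe_unused]] static inline auto policy_range = Policy(0, 10);
static_assert(std::is_same_v<Policy<>, decltype(policy_range)>);

[[maybe_unused]] static inline auto policy_default =
    Policy(DefaultSpace{}, 0, 10);
static_assert(std::is_same_v<Policy<>, decltype(policy_default)>);

[[maybe_unused]] static inline auto policy_other = Policy(OtherSpace{}, 0, 10);
static_assert(std::is_same_v<Policy<OtherSpace>, decltype(policy_other)>);
```

Because the checks are static_assert declarations, the example is verified entirely at compile time and needs no execution space at run time.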
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was added core/unit_test/TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
The file was modified .jenkins (diff)
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was added cmake/Modules/FindTPLROCTHRUST.cmake
The file was modified cmake/KokkosCore_config.h.in (diff)
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unnecessary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silence warnings about tracker variable not being used on the device
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was added core/unit_test/TestViewOutOfBoundsAccess.hpp
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
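The bounds check mentioned in this commit (a within_range helper written as a right fold and tested with mixed integer types) could look roughly like the sketch below. This is an assumption-laden illustration, not the Kokkos code: the names within_range_sketch, within_range_impl and index_within_extent, and the std::array of extents, are invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helper: true if 0 <= i < extent, for signed or unsigned i.
template <class Index>
constexpr bool index_within_extent(Index i, std::size_t extent) {
  static_assert(std::is_integral_v<Index>, "indices must be integers");
  if constexpr (std::is_signed_v<Index>) {
    if (i < 0) return false;  // negative indices are always out of range
  }
  return static_cast<std::size_t>(i) < extent;
}

template <std::size_t N, std::size_t... Is, class... Indices>
constexpr bool within_range_impl(const std::array<std::size_t, N>& extents,
                                 std::index_sequence<Is...>,
                                 Indices... indices) {
  // Unary right fold over &&: every index must lie within its extent.
  return (index_within_extent(indices, extents[Is]) && ...);
}

// Hypothetical entry point: one index per dimension, possibly of mixed types.
template <std::size_t N, class... Indices>
constexpr bool within_range_sketch(const std::array<std::size_t, N>& extents,
                                   Indices... indices) {
  static_assert(N == sizeof...(Indices), "one index per dimension expected");
  return within_range_impl(extents, std::index_sequence_for<Indices...>{},
                           indices...);
}
```

A caller would report the offending indices and abort when the check returns false; that reporting step is left out of the sketch.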
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rocthrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified core/src/CMakeLists.txt (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
CMake Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified CMakeLists.txt (diff)
The file was modified Makefile.kokkos (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Workaround bogus(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestArray.cpp (diff)
The file was modified core/unit_test/TestAggregate.hpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)

Summary

  1. SYCL: Cleanup device selection (details)
  2. Adding SECURITY.md file (details)
  3. Cleanup KOKKOS_CONFIGURE_CORE (details)
  4. Avoid detecting unwanted fences in the parallel_scan_no_fence test (details)
  5. Add test for current CTAD support with RangePolicy (#6803) (details)
  6. multi-GPU support: Add test for all policies (#6782) (details)
  7. Implement sort_by_key (#6801) (details)
  8. Fix scorecard workflow (#6831) (details)
  9. Remove redundant RangePolicy constructor (#6841) (details)
  10. Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent (details)
  11. Remove variadic range policy constructor (#6845) (details)
  12. Avoid unused variable warning in TestRangePolicyCTAD.cpp (details)
  13. bytes_and_flops with CMake (details)
  14. bytes_and_flops fix a small bug in command line argument (details)
  15. add gather benchmark to CMake (details)
  16. add atomic benchmark to CMake (details)
  17. add policy_performance benchmark to CMake (details)
  18. fix some warnings in policy_performance benchmark (details)
  19. policy_benchmark: fix indentation (details)
  20. Benchmarks: disable 2 benchmarks for OpenMPTarget (details)
  21. bytes_and_flops: fix a counter name (details)
  22. Fix C-style cast (details)
  23. Fix amdclang++ compilation (#6857) (details)
  24. Fix fallback implementation for sort_by_key (#6856) (details)
  25. Update the unsafe implicit conversion error message in MDRangePolicy (#6855) (details)
  26. Split some classes from Kokkos_ViewMapping (#6859) (details)
  27. update mdspan tpl (details)
  28. CTAD (deduction guides) for RangePolicy (#6850) (details)
  29. CTAD (deduction guides) for MDRangePolicy (#5516) (details)
  30. Add support for rocThrust in sort when using HIP (#6793) (details)
  31. Add HIP specialization for sort-by-key (details)
  32. Fix unused variable warning in TestMDRangePolicyCTAD.cpp (details)
  33. Improve message on view out of bounds access and always abort (#6861) (details)
  34. Fix linking with rocthrust in downstream applications (details)
  35. SYCL: Make sure to call find_dependency for oneDPL if necessary (#6870) (details)
  36. Don't use Fedora development version in GitHub CI (details)
  37. kokkos_tpls.cmake: update default option to enable rocthrust (details)
  38. Scratch space fix for MultiGPU (#6866) (details)
  39. [ci skip] Bump version number to 4.3.99 (details)
  40. Fix Makefile.kokkos for Threads (details)
  41. Fix a bug in Makefile when using AMD GPU architectures (#6892) (details)
  42. Cuda: Fix configuring with CMake 3.28.4 (#6898) (details)
  43. Update Intel GPU architectures in Makefile (#6895) (details)
  44. Move `Kokkos::Array` tests to a more suitable place (#6905) (details)
  45. Fixed and simplified the unit test (details)
Commit 37962b3d2f9a26ce3fdcb80229ab3a73106f456b by Daniel Arndt
SYCL: Cleanup device selection
The file was modified core/src/SYCL/Kokkos_SYCL.cpp (diff)
Commit 2a8ac6f48a467e91b0abce5edf22b34fbfe50e4f by crtrott
Adding SECURITY.md file

This is based on the one from OSSF Scorecard
The file was added SECURITY.md
Commit c3f0a2698327fbbc25a1f26674b8f60ae9eb701b by Daniel Arndt
Cleanup KOKKOS_CONFIGURE_CORE
The file was modified Makefile.kokkos (diff)
The file was modified cmake/kokkos_tribits.cmake (diff)
The file was modified core/src/Kokkos_Core.hpp (diff)
Commit e2c810e1f213a09bce152eba623b7f275c035ec8 by Daniel Arndt
Avoid detecting unwanted fences in the parallel_scan_no_fence test
The file was modified core/unit_test/tools/TestEventCorrectness.hpp (diff)
Commit 24f251a854b36c13804e1ac850e8ae2a9e2bcdac by noreply
Add test for current CTAD support with RangePolicy (#6803)

* Add test for current CTAD support with RangePolicy

Co-authored-by: Nevin Liber <nliber@anl.gov>

* Rework CTAD test to avoid "member <bla> was declared but never referenced" warnings with icpc 19

* Attempt to fix CI

* Attempt to eliminate maybe unused warning in icpc

* Disable CTAD tests for nvcc < 11.2
as compiler bugs prevent CTAD expressions inside decltype

---------

Co-authored-by: Nevin Liber <nliber@anl.gov>
Co-authored-by: Nevin ":-)" Liber <nliber+github@gmail.com>
The file was added core/unit_test/TestRangePolicyCTAD.cpp
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 16a5ebe95e88bf103ff7459ca1b1ce9983426eed by noreply
multi-GPU support: Add test for all policies (#6782)

* Cuda multi-GPU support: Test with managed and unmanaged Views

* Move check for cuda_device

* Also test copying between devices

* Refactor using StreamsAndDevices

* Don't use shared_ptr
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit c90a9c6f7bf439b4e224a0eaabcc2af815980dc5 by noreply
Implement sort_by_key (#6801)

* Implement sort_by_key

* Address review comments

* Make passed in view const ref

* Fix _via_sort for OpenMPTarget

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

* Rip out ROCThrust for now

* Few changes to address the comments

* Fix SYCL

* Fix SYCL implementation

---------

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
The file was modified algorithms/src/Kokkos_Sort.hpp (diff)
The file was added algorithms/unit_tests/TestSortByKey.hpp
The file was added algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp
The file was added algorithms/src/sorting/Kokkos_SortByKeyPublicAPI.hpp
The file was modified algorithms/unit_tests/CMakeLists.txt (diff)
Commit 058c3a08e6b871f2ddfe3774f390ee473525df5e by noreply
Fix scorecard workflow (#6831)

* Add Scorecard

* another one

* Don't run on pull requests

* Add comments back in
The file was modified .github/workflows/scorecard.yml (diff)
Commit 04a5334c699cb9b87293d27bc73090b3b7c13019 by noreply
Remove redundant RangePolicy constructor (#6841)

* Remove redundant RangePolicy constructor

* Disable test case for Cuda <11.2 completely
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit 0cdc9eb768582bec4af8f0e44057e235ca8277f0 by Damien L-G
Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent
The file was modified core/perf_test/CMakeLists.txt (diff)
Commit 8b8de2cf47cd85005b6adafb8fe5c99b57c32c23 by noreply
Remove variadic range policy constructor (#6845)

* Removed variadic ctor from RangePolicy, as well as extra
set(...) methods.

* In RangePolicy:
Deprecated set(ChunkSize)
In the ctor that takes a ChunkSize, set the chunk size via
set_chunk_size instead of via member initializers (to be more consistent
with other code)

* Removed superfluous "inline" from two of the RangePolicy constructors
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit dc524910d397feb7050b39f34d38dfd7d4f82b65 by Daniel Arndt
Avoid unused variable warning in TestRangePolicyCTAD.cpp
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 277339090e02e6555cf4f8508bd4b1f78bb6d2a2 by cedric.chevalier
bytes_and_flops with CMake

Fix some warnings as well
The file was modified benchmarks/bytes_and_flops/bench_double.cpp (diff)
The file was added benchmarks/bytes_and_flops/CMakeLists.txt
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int32_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_float.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_stride.hpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int64_t.cpp (diff)
Commit 5c9a4aa3ce9360892d8c558e19618357366b2640 by cedric.chevalier
bytes_and_flops fix a small bug in command line argument
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 932466f21364b7d99aa06d5e499c7e4bf1a1c00d by cedric.chevalier
add gather benchmark to CMake
The file was modified benchmarks/gather/gather.hpp (diff)
The file was modified benchmarks/gather/main.cpp (diff)
The file was added benchmarks/gather/CMakeLists.txt
The file was modified benchmarks/gather/gather_unroll.hpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was added benchmarks/atomic/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was modified benchmarks/CMakeLists.txt (diff)
The file was added benchmarks/policy_performance/CMakeLists.txt
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/main.cpp (diff)
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was added core/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
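The test form described in the commit message above can be illustrated with a minimal, Kokkos-free sketch. Everything in it is a stand-in chosen for illustration: `Policy`, `DefaultSpace`, `OtherSpace`, and `CtadChecks` are hypothetical names, not the Kokkos types, and the guides shown are an assumption about the general shape rather than the exact ones added to Kokkos_ExecPolicy.hpp. It requires C++17.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for execution spaces; these are not the Kokkos types.
struct DefaultSpace {};
struct OtherSpace {};

// Simplified policy whose only template parameters are a property pack,
// so a constructor call gives the compiler nothing to deduce them from.
template <class... Properties>
struct Policy {
  std::int64_t begin = 0;
  std::int64_t end   = 0;

  Policy() = default;
  Policy(std::int64_t b, std::int64_t e) : begin(b), end(e) {}
  template <class Space>
  Policy(const Space&, std::int64_t b, std::int64_t e) : begin(b), end(e) {}
};

// Explicit deduction guides: state the deduced type outright instead of
// relying on the implicit guides generated from the constructors.
Policy() -> Policy<>;
Policy(std::int64_t, std::int64_t) -> Policy<>;
template <class Space>
Policy(const Space&, std::int64_t, std::int64_t) -> Policy<Space>;

// Compile-time checks in the "static inline variable + static_assert" form.
struct CtadChecks {
  // Braces for the default-constructed case, parentheses otherwise.
  [[maybe_unused]] static inline auto by_default = Policy{};
  [[maybe_unused]] static inline auto by_range   = Policy(0, 10);
  [[maybe_unused]] static inline auto with_space = Policy(OtherSpace{}, 0, 10);

  static_assert(std::is_same_v<Policy<>, decltype(by_default)>);
  static_assert(std::is_same_v<Policy<>, decltype(by_range)>);
  static_assert(std::is_same_v<Policy<OtherSpace>, decltype(with_space)>);
};
```

Because each check names a variable and then inspects it with `decltype`, no CTAD expression has to appear inside `decltype(...)` itself, which is the construct the commit message reports as problematic under nvcc.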
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was added core/unit_test/TestMDRangePolicyCTAD.cpp
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/KokkosCore_config.h.in (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
The file was added cmake/Modules/FindTPLROCTHRUST.cmake
The file was modified .jenkins (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unecesary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silent warnings about tracker variable not being used on the device
The file was added core/unit_test/TestViewOutOfBoundsAccess.hpp
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
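The commit message above mentions a `within_range` check, a right fold, and a test with mixed integer types. The following is a rough sketch of how such a check could be written in standard C++17. Only the name `within_range` comes from the commit message; the signature, the helpers `index_in_extent` and `within_range_impl`, and the use of `std::array` for the extents are assumptions made for illustration, not the code in Kokkos_ViewMapping.hpp.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helper: true if a single index lies in [0, extent).
// Signed indices are tested for negativity before the unsigned comparison,
// so mixed signed and unsigned index types are handled without wraparound.
template <class Index>
constexpr bool index_in_extent(Index i, std::size_t extent) {
  if constexpr (std::is_signed_v<Index>) {
    if (i < 0) return false;
  }
  return static_cast<std::size_t>(i) < extent;
}

// Hypothetical helper: pairs the k-th index with the k-th extent and
// combines the per-dimension results with a unary right fold over &&.
template <std::size_t... Is, class... Indices>
constexpr bool within_range_impl(
    const std::array<std::size_t, sizeof...(Indices)>& extents,
    std::index_sequence<Is...>, Indices... indices) {
  return (index_in_extent(indices, extents[Is]) && ...);
}

// Sketch of a multidimensional bounds check (assumed signature).
template <class... Indices>
constexpr bool within_range(
    const std::array<std::size_t, sizeof...(Indices)>& extents,
    Indices... indices) {
  return within_range_impl(extents, std::index_sequence_for<Indices...>{},
                           indices...);
}
```

For a 3 x 4 extent, a call such as `within_range(std::array<std::size_t, 2>{3, 4}, 2, 3L)` mixes `int` and `long` indices; an out-of-range or negative index in any dimension makes the fold evaluate to false.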
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rothrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified core/src/CMakeLists.txt (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
Make Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified Makefile.kokkos (diff)
The file was modified CMakeLists.txt (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Workaround bogous(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestAggregate.hpp (diff)
The file was modified core/unit_test/TestArray.cpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)

Summary

  1. SYCL: Cleanup device selection (details)
  2. Adding SECURITY.md file (details)
  3. Cleanup KOKKOS_CONFIGURE_CORE (details)
  4. Avoid detecting unwanted fences in the parallel_scan_no_fence test (details)
  5. Add test for current CTAD support with RangePolicy (#6803) (details)
  6. multi-GPU support: Add test for all policies (#6782) (details)
  7. Implement sort_by_key (#6801) (details)
  8. Fix scorecard workflow (#6831) (details)
  9. Remove redundant RangePolicy constructor (#6841) (details)
  10. Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent (details)
  11. Remove variadic range policy constructor (#6845) (details)
  12. Avoid unused variable warning in TestRangePolicyCTAD.cpp (details)
  13. bytes_and_flops with CMake (details)
  14. bytes_and_flops fix a small bug in command line argument (details)
  15. add gather benchmark to CMake (details)
  16. add atomic benchmark to CMake (details)
  17. add policy_performance benchmark to CMake (details)
  18. fix some warnings in policy_performance benchmark (details)
  19. policy_benchmark: fix indentation (details)
  20. Benchmarks: disable 2 benchmarks for OpenMPTarget (details)
  21. bytes_and_flops: fix a counter name (details)
  22. Fix C-style cast (details)
  23. Fix amdclang++ compilation (#6857) (details)
  24. Fix fallback implementation for sort_by_key (#6856) (details)
  25. Update the unsafe implicit conversion error message in MDRangePolicy (#6855) (details)
  26. Split some classes from Kokkos_ViewMapping (#6859) (details)
  27. update mdspan tpl (details)
  28. CTAD (deduction guides) for RangePolicy (#6850) (details)
  29. CTAD (deduction guides) for MDRangePolicy (#5516) (details)
  30. Add support for rocThrust in sort when using HIP (#6793) (details)
  31. Add HIP specialization for sort-by-key (details)
  32. Fix unused variable warning in TestMDRangePolicyCTAD.cpp (details)
  33. Improve message on view out of bounds access and always abort (#6861) (details)
  34. Fix linking with rothrust in downstream applications (details)
  35. SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870) (details)
  36. Don't use Fedora development version in GitHub CI (details)
  37. kokkos_tpls.cmake: update default option to enable rocthrust (details)
  38. Scratch space fix for MultiGPU (#6866) (details)
  39. [ci skip] Bump version number to 4.3.99 (details)
  40. Fix Makefile.kokkos for Threads (details)
  41. Fix a bug in Makefile when using AMD GPU architectures (#6892) (details)
  42. Cuda: Fix configuring with CMake 3.28.4 (#6898) (details)
  43. Update Intel GPU architectures in Makefile (#6895) (details)
  44. Move `Kokkos::Array` tests to a more suitable place (#6905) (details)
  45. Fixed and simplified the unit test (details)
Commit 37962b3d2f9a26ce3fdcb80229ab3a73106f456b by Daniel Arndt
SYCL: Cleanup device selection
The file was modified core/src/SYCL/Kokkos_SYCL.cpp (diff)
Commit 2a8ac6f48a467e91b0abce5edf22b34fbfe50e4f by crtrott
Adding SECURITY.md file

This is based on the one from OSSF Scorecard
The file was added SECURITY.md
Commit c3f0a2698327fbbc25a1f26674b8f60ae9eb701b by Daniel Arndt
Cleanup KOKKOS_CONFIGURE_CORE
The file was modified Makefile.kokkos (diff)
The file was modified cmake/kokkos_tribits.cmake (diff)
The file was modified core/src/Kokkos_Core.hpp (diff)
Commit e2c810e1f213a09bce152eba623b7f275c035ec8 by Daniel Arndt
Avoid detecting unwanted fences in the parallel_scan_no_fence test
The file was modified core/unit_test/tools/TestEventCorrectness.hpp (diff)
Commit 24f251a854b36c13804e1ac850e8ae2a9e2bcdac by noreply
Add test for current CTAD support with RangePolicy (#6803)

* Add test for current CTAD support with RangePolicy

Co-authored-by: Nevin Liber <nliber@anl.gov>

* Rework CTAD test to avoid "memeber <bla> was declared but never referenced" warnings with icpc 19

* Attempt to fix CI

* Attempt to eliminate maybe unused warning in icpc

* Disable CTAD tests for nvcc < 11.2
as compiler bugs prevent CTAD expressions inside decltype

---------

Co-authored-by: Nevin Liber <nliber@anl.gov>
Co-authored-by: Nevin ":-)" Liber <nliber+github@gmail.com>
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was added core/unit_test/TestRangePolicyCTAD.cpp
Commit 16a5ebe95e88bf103ff7459ca1b1ce9983426eed by noreply
multi-GPU support: Add test for all policies (#6782)

* Cuda multi-GPU support: Test with managed and unmanaged Views

* Move check for cuda_device

* Also test copying between devices

* Refactor using StreamsAndDevices

* Don't use shared_ptr
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit c90a9c6f7bf439b4e224a0eaabcc2af815980dc5 by noreply
Implement sort_by_key (#6801)

* Implement sort_by_key

* Address review comments

* Make passed in view const ref

* Fix _via_sort for OpenMPTarget

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>

* Rip out ROCThrust for now

* Few changes to address the comments

* Fix SYCL

* Fix SYCL implementation

---------

Co-authored-by: Daniel Arndt <arndtd@ornl.gov>
The file was modified algorithms/src/Kokkos_Sort.hpp (diff)
The file was added algorithms/src/sorting/Kokkos_SortByKeyPublicAPI.hpp
The file was added algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp
The file was added algorithms/unit_tests/TestSortByKey.hpp
The file was modified algorithms/unit_tests/CMakeLists.txt (diff)
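For readers unfamiliar with the operation named in the commit above, a sort-by-key reorders a key sequence into sorted order and applies the same reordering to a companion value sequence. The sketch below shows that idea on the host with `std::vector`; the function name `sort_by_key_sketch` and its signature are hypothetical, and it makes no claim about how the Kokkos implementation or its backend specializations work.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical host-only illustration of sort-by-key semantics.
// Assumes keys.size() == values.size().
template <class Key, class Value>
void sort_by_key_sketch(std::vector<Key>& keys, std::vector<Value>& values) {
  // Build the identity permutation 0, 1, ..., n-1.
  std::vector<std::size_t> perm(keys.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});

  // Sort the permutation by the keys it refers to; stable_sort keeps
  // entries with equal keys in their original relative order.
  std::stable_sort(perm.begin(), perm.end(),
                   [&keys](std::size_t a, std::size_t b) {
                     return keys[a] < keys[b];
                   });

  // Apply the same permutation to both sequences.
  std::vector<Key> sorted_keys;
  std::vector<Value> sorted_values;
  sorted_keys.reserve(keys.size());
  sorted_values.reserve(values.size());
  for (std::size_t i : perm) {
    sorted_keys.push_back(keys[i]);
    sorted_values.push_back(values[i]);
  }
  keys   = std::move(sorted_keys);
  values = std::move(sorted_values);
}
```

Sorting a permutation first and then applying it to both sequences is one simple way to keep keys and values paired; it trades extra memory for clarity.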
Commit 058c3a08e6b871f2ddfe3774f390ee473525df5e by noreply
Fix scorecard workflow (#6831)

* Add Scorecrad

* another one

* Don't run on pull requests

* Add comments back in
The file was modified .github/workflows/scorecard.yml (diff)
Commit 04a5334c699cb9b87293d27bc73090b3b7c13019 by noreply
Remove redundant RangePolicy constructor (#6841)

* Remove redundant RangePolicy constructor

* Disable test case for Cuda <11.2 completely
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 0cdc9eb768582bec4af8f0e44057e235ca8277f0 by Damien L-G
Bump Google Benchmark version v1.{6.2 -> 7.1} in CMake FetchContent
The file was modified core/perf_test/CMakeLists.txt (diff)
Commit 8b8de2cf47cd85005b6adafb8fe5c99b57c32c23 by noreply
Remove variadic range policy constructor (#6845)

* Removed variadic ctor from RangePolicy, as well as extra
set(...) methods.

* In RangePolicy:
Deprecated set(ChunkSize)
In the ctor that takes a ChunkSize, set the chunk size via
set_chunk_size instead of via member initializers (to be more consistent
with other code)

* Removed superfluous "inline" from two of the RangePolicy constructors
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
Commit dc524910d397feb7050b39f34d38dfd7d4f82b65 by Daniel Arndt
Avoid unused variable warning in TestRangePolicyCTAD.cpp
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit 277339090e02e6555cf4f8508bd4b1f78bb6d2a2 by cedric.chevalier
bytes_and_flops with CMake

Fix some warnings as well
The file was modified benchmarks/bytes_and_flops/bench_int32_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench.hpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_int64_t.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_double.cpp (diff)
The file was added benchmarks/bytes_and_flops/CMakeLists.txt
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_float.cpp (diff)
The file was modified benchmarks/bytes_and_flops/bench_stride.hpp (diff)
Commit 5c9a4aa3ce9360892d8c558e19618357366b2640 by cedric.chevalier
bytes_and_flops fix a small bug in command line argument
The file was modified benchmarks/bytes_and_flops/main.cpp (diff)
Commit 932466f21364b7d99aa06d5e499c7e4bf1a1c00d by cedric.chevalier
add gather benchmark to CMake
The file was modified benchmarks/gather/gather.hpp (diff)
The file was modified benchmarks/CMakeLists.txt (diff)
The file was modified benchmarks/gather/gather_unroll.hpp (diff)
The file was modified benchmarks/gather/main.cpp (diff)
The file was added benchmarks/gather/CMakeLists.txt
Commit 16d2edbb34925d23602a69ca6866db5d783d2aa6 by cedric.chevalier
add atomic benchmark to CMake
The file was added benchmarks/atomic/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 750ef211ac68526a6677ed298300e6dc0c4b86a6 by cedric.chevalier
add policy_performance benchmark to CMake
The file was added benchmarks/policy_performance/CMakeLists.txt
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 97fa76f29d35c2dcb364473c3bc759510de45c59 by cedric.chevalier
fix some warnings in policy_performance benchmark
The file was modified benchmarks/policy_performance/policy_perf_test.hpp (diff)
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 715d6156e9729e2be930ff1c7845e32a27efe164 by cedric.chevalier
policy_benchmark: fix indentation
The file was modified benchmarks/policy_performance/main.cpp (diff)
Commit 4dcbff2cf4b6a4b7476b3cf647e7d0b8adf10808 by cedric.chevalier
Benchmarks: disable 2 benchmarks for OpenMPTarget

Apply Rahul suggestion to disable two benchmarks that are causing Internal Compiler Errors with OpenMPTarget.
The file was modified benchmarks/CMakeLists.txt (diff)
Commit 3d485c19da4aea22021df2673df2183d35fc87f2 by cedric.chevalier
bytes_and_flops: fix a counter name
The file was modified benchmarks/bytes_and_flops/bench_unroll_stride.hpp (diff)
The file was modified core/src/Serial/Kokkos_Serial_Task.hpp (diff)
Commit 99c7e1b1c430f2012eaea2f4bec90e8e0858b9f9 by noreply
Fix amdclang++ compilation (#6857)

* Fix amdclang++ compilation

* Add guards for hipcc
The file was modified cmake/kokkos_arch.cmake (diff)
Commit 9feb104d9b3ce86230f1b56c6207246f67f22de1 by noreply
Fix fallback implementation for sort_by_key (#6856)

* Fix fallback implementation for sort_by_key

* Guard with KOKKOS_ENABLE_ONEDPL

* Drop sort_on_device

* Improve wording

* Improve comment
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit c3c8a70d21ffe88f810bfe392fb5a0457c62c928 by noreply
Update the unsafe implicit conversion error message in MDRangePolicy (#6855)

* Updated the error message in MDRangePolicy about unsafe implicit conversions

* Addressed clang warnings
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)
Commit 9a7e7958ae6e35124546663936520f93f964d021 by noreply
Split some classes from Kokkos_ViewMapping (#6859)

* move ViewOffset and ViewDataAnalysis to a separate header

* only include Kokkos_Macro in ViewDataAnalysis
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was added core/src/impl/Kokkos_ViewDataAnalysis.hpp
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_extents.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/submdspan_mapping.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/extents.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded_fwd.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_stride.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_right.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/mdspan/mdspan.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/config.hpp (diff)
The file was added tpls/mdspan/include/experimental/__p2642_bits/layout_padded.hpp
The file was modified tpls/mdspan/include/experimental/__p0009_bits/layout_left.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p0009_bits/compressed_pair.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p1684_bits/mdarray.hpp (diff)
The file was modified tpls/mdspan/include/experimental/__p2630_bits/strided_slice.hpp (diff)
Commit 6db04b3b5803fb7686475effe2b6af071b885697 by noreply
CTAD (deduction guides) for RangePolicy (#6850)

* Removed the check for NVCC compiler version when testing CTAD,
as decltype(RangePolicy(...)) isn't compiling under any version
of Cuda.

* Added in the explicit deduction guides for RangePolicy:
• Correctness when passing in an execution space
• Workaround for nvcc as RangePolicy<...> doesn't have any
template parameters that can be deduced, so gcc/clang assume that a
matching ctor in the primary template deduces to RangePolicy<> while
nvcc assumes it is a bug.

Rewrote the tests to be of the form:

[[maybe_unused]] static inline auto rpxy = RangePolicy(x, y);
static_assert(is_same_v<RangePolicy<WhatShouldBeHere>, decltype(rpxy)>);

This form avoids the most vexing parse, and seems to pass on all
compilers.

Added tests for SomeExecutionSpace which is guaranteed not to be
DefaultExecutionSpace.

Added calls to ImplicitlyConvertibleToDefaultExecutionSpace::operator
DefaultExecutionSpace() and SomeExecutionSpace::concurrency() to
avoid maybe_unused errors under some compilers.

Notes:

The default constructed CTAD RangePolicy uses list initialization syntax
(curlies) instead of parentheses to get around a gcc 8.2 compiler bug.

nestodes uses "Kokkos::DefaultExecutionSpace" instead of "auto" in order
to trigger the implicit conversion.
The file was modified core/src/Kokkos_ExecPolicy.hpp (diff)
The file was modified core/unit_test/TestRangePolicyCTAD.cpp (diff)
Commit cfc260ac0aa36b42626d51093e1098aefbe79da7 by noreply
CTAD (deduction guides) for MDRangePolicy (#5516)

* CTAD for MDRangePolicy

* WIP: Moved CTAD tests from TestMDRangePolicyConstructors.hpp to
TestMDRangePolicyCTAD.cpp

* WIP Added [[maybe_unused]] to all static inline variables in
CTAD tests

* Removed TEST_EXECSPACE from the MDRangePolicy CTAD tests
(as that isn't needed for compile time only tests) and
associated cleanup

* Marked
struct ImplicitlyConvertibleToDefaultExecutionSpace::operator Kokkos::DefaultExecutionSpace() const;
as [[maybe_unused]] to make it consistent with the
other policy CTAD tests (in other PRs)

* Added a Workaround for nvc++ (CUDA-11.7-NVHPC) ignoring
[[maybe_unused]] on

ImplicitlyConvertibleToDefaultExecutionSpace::operator
Kokkos::DefaultExecutionSpace() const

by defining it and implicitly calling it in another [[maybe_unused]] static inline
variable.

* Workaround for HIP-ROCm-5.2 "declared but never referenced"

* Added MDRangePolicy CTAD tests for initializer_lists
Fixed a comment
The file was modified core/src/KokkosExp_MDRangePolicy.hpp (diff)
The file was added core/unit_test/TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/CMakeLists.txt (diff)
Commit 35ad698e03594745f07728df0bcdaa49c5d0e7d4 by noreply
Add support for rocThrust in sort when using HIP (#6793)

* Use rocthrust in sort when using HIP

* Fix reviewer's comments

* Make sure that we don't compile Kokkos for every supported architecture when enabling rocthrust

* Export ROCTHRUST as a Kokkos dependency
The file was modified .jenkins (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
The file was modified cmake/KokkosCore_config.h.in (diff)
The file was added cmake/Modules/FindTPLROCTHRUST.cmake
The file was modified algorithms/src/sorting/impl/Kokkos_SortImpl.hpp (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified core/src/HIP/Kokkos_HIP_Instance.cpp (diff)
Commit e5126e929450f52dfc18f7bd5b7b33fc221e5b48 by Bruno Turcksin
Add HIP specialization for sort-by-key
The file was modified algorithms/src/sorting/impl/Kokkos_SortByKeyImpl.hpp (diff)
Commit 3a765351c9aa01aa0d4a1bca0102f45a54b13dfe by Daniel Arndt
Fix unused variable warning in TestMDRangePolicyCTAD.cpp
The file was modified core/unit_test/TestMDRangePolicyCTAD.cpp (diff)
Commit a2b64e0e8e78fe4a257f32a871984997d48305f3 by noreply
Improve message on view out of bounds access and always abort (#6861)

* Add test for view out-of-bounds access

* [DO NOT MERGE] enable bound checking in view accesses unconditionally

* Abort on out-of-bounds access on the host side instead of throwing

* Improve out-of-bounds error message

* Add test for printing multidimensional index

* Try to get offsetview to compile

* Do not bother with the label on the device side

* Update test to reflect that device does not print the label

* Fixup OffsetView out-of-bounds on device

* Revert "Update test to reflect that device does not print the label"

This reverts commit 7dbecbf635386556d3557c78a495970b439a3e05.

* Revert "Do not bother with the label on the device side"

This reverts commit 40348e88d265f7dfaf42354db5be18db21d55808.

* Fixup UNMANAGED -> UNAVAILABLE

* Only enable the test when debug bound checking is enabled

* Revert "[DO NOT MERGE] enable bound checking in view accesses unconditionally"

This reverts commit ff9d411fa0e8898f3db60055a12f222e9836fc57.

* Fix typo formated -> formatted

* Drop unecesary cast to void and Kokkos::Impl:: qualification

* Add test with mixed integer types

* Improve function name check_bounds -> within_range

* Prefer right fold per review

* Per review drop immediately invoked lambda trick on the host side

* Silent warnings about tracker variable not being used on the device
The file was added core/unit_test/TestViewOutOfBoundsAccess.hpp
The file was modified core/src/impl/Kokkos_ViewMapping.hpp (diff)
The file was modified core/unit_test/CMakeLists.txt (diff)
The file was modified containers/src/Kokkos_OffsetView.hpp (diff)
Commit 8062a602070ce6d611b73c9eb626ebd7dc1a039b by Daniel Arndt
Fix linking with rothrust in downstream applications
The file was modified cmake/Modules/FindTPLROCTHRUST.cmake (diff)
Commit 5e7cab99b5789abe6a1b411134f41fd80bf1c740 by noreply
SYCL: Make sure to call find_dependency for oneDPL if necessary  (#6870)

* Make sure to call find_dependency for oneDPL if necessary

* Move linking with oneDPL to algorithms

* Don't guard KOKKOS_IMPORT_TPL by CMake language
The file was modified core/src/CMakeLists.txt (diff)
The file was modified cmake/Modules/FindTPLONEDPL.cmake (diff)
The file was modified algorithms/src/CMakeLists.txt (diff)
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit c1a800650e83da2eb515599f495f9bce2aaba80f by Daniel Arndt
Don't use Fedora development version in GitHub CI
The file was modified .github/workflows/continuous-integration-workflow.yml (diff)
Commit 49bd895ae27a53f06f72c7acf8a8f60043295ae1 by ndellin
kokkos_tpls.cmake: update default option to enable rocthrust

Avoid configuration issues with Trilinos of the form:
Make Error at kokkos/cmake/kokkos_tpls.cmake:29 (MESSAGE):
Enabled TPL ROCTHRUST inside TriBITS build, but this can only be enabled in
a standalone build
The file was modified cmake/kokkos_tpls.cmake (diff)
Commit 096e72437cf93d78d7a85414f811de8c7f608b0e by noreply
Scratch space fix for MultiGPU (#6866)

* Use CudaSpace::(de)allocate() instead of Kokkos_SharedAllocate.hpp for Cuda scratch memory

Fixes bug where all scratch mem was allocated on default device

* Add scratch test to MultiGPU testing

* fix typo in multi-gpu test

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>

* Use reductions for error accumulator

* Remove exec_ arg from ScratchFunctor (not needed anymore)

* avoid implicit capture of this in test

---------

Co-authored-by: Dong Hun Lee <59181952+ldh4@users.noreply.github.com>
The file was modified core/src/Cuda/Kokkos_Cuda_Instance.cpp (diff)
The file was modified core/unit_test/cuda/TestCuda_InterOp_StreamsMultiGPU.cpp (diff)
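Note on "avoid implicit capture of this" in the commit message above: when a lambda inside a member function uses a data member, it captures `this` rather than the member's value, which can leave the lambda pointing at an object that no longer exists (or that is not accessible where the lambda runs). A common remedy is to copy the member into a local variable and capture that instead. The following is a minimal host-only sketch of that idiom, assuming nothing about the actual test; `Scaler`, `make_scale_fn`, and `scaled_sum` are hypothetical names.

```cpp
#include <vector>

// Hypothetical type (not from the Kokkos test) holding a value that a
// lambda needs.
struct Scaler {
  int factor = 3;

  // Copy the member into a local, then capture the local by value.
  // The returned lambda holds its own copy of the factor and never
  // dereferences `this`.
  auto make_scale_fn() const {
    const int local_factor = factor;
    return [local_factor](int x) { return local_factor * x; };
  }
};

// Uses the lambda after the Scaler that produced it has been destroyed,
// which is safe only because `this` was not captured.
inline int scaled_sum(const std::vector<int>& values) {
  auto fn = Scaler{}.make_scale_fn();  // the temporary Scaler ends here
  int total = 0;
  for (int v : values) total += fn(v);
  return total;
}
```

Had the lambda been written as `[=](int x) { return factor * x; }`, it would have captured `this`, and calling it after the temporary `Scaler` was destroyed would be undefined behavior.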
Commit 05bd485166c98372e254f4566978f12b9eef3eeb by Damien L-G
[ci skip] Bump version number to 4.3.99
The file was modified CMakeLists.txt (diff)
The file was modified Makefile.kokkos (diff)
Commit 872dc422ff8b77616ac5884f26b83ced61328ee0 by Daniel Arndt
Fix Makefile.kokkos for Threads
The file was modified Makefile.targets (diff)
Commit 2035e313d7a54f9e1572eb5f315249ea841fb258 by noreply
Fix a bug in Makefile when using AMD GPU architectures (#6892)

* Fix bug in Makefile when using AMD GPU architectures

* Fix indentation

* Fix documentation of the architecture to match the code
The file was modified Makefile.kokkos (diff)
Commit 8d734b0267284c3232e715355eb2c690d305ae3f by noreply
Cuda: Fix configuring with CMake 3.28.4 (#6898)

* Cuda: Fix configuring with CMake 3.29.0

* CMake 3.28.4 is also affected
The file was modified cmake/Modules/FindTPLCUDA.cmake (diff)
Commit 68c66846945c28e7b95ba099f2b4fbbe2b72fd63 by noreply
Update Intel GPU architectures in Makefile (#6895)

* Update Intel GPU architectures in Makefile

* Add some comments
The file was modified Makefile.kokkos (diff)
Commit 635551058d436885bf3757eb0bd1ea17bbf0abd4 by noreply
Move `Kokkos::Array` tests to a more suitable place (#6905)

* Move Kokkos::Array tests to a more suitable place

* Workaround bogus(?) compile error with Array::operator[] not being constexpr
The file was modified core/unit_test/TestArray.cpp (diff)
The file was modified core/unit_test/TestAggregate.hpp (diff)
Commit 879ca05604e287b75e817a60ace64e564b85b3b9 by donlee
Fixed and simplified the unit test
The file was modified core/unit_test/TestMDRangePolicyConstructors.hpp (diff)