Changes

#17 (Aug 28, 2024, 8:09:30 AM)

  1. Fix bogus icpc -Werror — ndellin / githubweb
  2. Fix macro guards — Daniel Arndt / githubweb

#16 (Aug 27, 2024, 10:37:19 PM)

  1. Use raw pointers for std::sort if possible — Daniel Arndt / githubweb
  2. core(graph): allow submission onto an arbitrary exec space instance — romin.tomasetti / githubweb
  3. Fix typo in macro guard — noreply / githubweb

#15 (Aug 27, 2024, 3:20:37 PM)

  1. Make sure the graph death test has the `_DeathTest` suffix (#7262) — noreply / githubweb
  2. Allow extracting host and device views from DualView with const value type (#7242) — noreply / githubweb
  3. Split test — Daniel Arndt / githubweb

#13 (Aug 25, 2024, 11:37:05 AM)

  1. Fix DynRankView::operator[](index_type) constraint — Daniel Arndt / githubweb
  2. Move and rename Kokkos_View.hpp — crtrott / githubweb
  3. Add new Kokkos_View.hpp to include Kokkos_ViewLegacy.hpp — crtrott / githubweb
  4. Fix test — Daniel Arndt / githubweb

#12 (Aug 23, 2024, 4:44:38 PM)

  1. SYCL: Address oneAPI 2025.0.0 deprecations — Daniel Arndt / githubweb
  2. Add workflow to create releases with SLSA provenance generation — Damien L-G / githubweb
  3. Drop (unused) .clang-format-ignore file — Damien L-G / githubweb
  4. Remove (unused) code coverage configuration file — Damien L-G / githubweb
  5. Remove master_history.txt — Damien L-G / githubweb
  6. Prefer dot-file-style build configuration file with AppVeyor — Damien L-G / githubweb
  7. Remove obscure test scripts — Damien L-G / githubweb
  8. Update CONTRIBUTING and remove outdated comment to reflect our using clang-format-16 [ci skip] — Damien L-G / githubweb
  9. Drop outdated file describing how to snapshot Kokkos into Trilinos [ci skip] — Damien L-G / githubweb
  10. Ignore codebase formatting commits in the blame view (#7204) — noreply / githubweb
  11. Update error message clang-format version not 16 (#7208) — noreply / githubweb
  12. [ci skip] Do not add CI and other tools configuration files to archive files — Damien L-G / githubweb
  13. GitHub CI: Use hashes for versions — Daniel Arndt / githubweb
  14. OpenMPTarget: FunctorAdapter to centralize tag evaluation. (#7200) — noreply / githubweb
  15. Pin dependencies in Dockerfiles (#7210) — noreply / githubweb
  16. OpenMPTarget: FunctorAdapter bug in AMD GPU code path. — rgayatri / githubweb
  17. Bump actions/checkout from 4.1.6 to 4.1.7 — noreply / githubweb
  18. Bump actions/upload-artifact from 4.3.5 to 4.3.6 — noreply / githubweb
  19. Bump github/codeql-action from 3.25.6 to 3.26.0 — noreply / githubweb
  20. Bump DoozyX/clang-format-lint-action from 0.17 to 0.18 — noreply / githubweb
  21. Simplify constraints for trivial types — Damien L-G / githubweb
  22. Drop pointless constraint on Impl::ViewValueFunctor<ScalarType>::construct_shared_allocation() — Damien L-G / githubweb
  23. Specialize Impl::ViewValueFunctor for trivial types — Damien L-G / githubweb
  24. Improve fence labels in View init and delete — Damien L-G / githubweb
  25. Add functions for retrieving the default tile sizes in MDRangePolicy (#6839) — noreply / githubweb
  26. Fix ub in bin sort when sorting within bins on host (#7223) — noreply / githubweb
  27. Hands off reserved identifiers (#7224) — noreply / githubweb
  28. Hide IMPL_CUDA_MALLOC_ASYNC option when CUDA is not enabled — Damien L-G / githubweb
  29. Mark HIP_MULTIPLE_KERNEL_INSTANTIATIONS and IMPL_HIP_UNIFIED_MEMORY options as backend specific — Damien L-G / githubweb
  30. Improve further view initialization/destruction (#7225) — noreply / githubweb
  31. Cuda: Check if device support cudaMallocAsync (#7217) — noreply / githubweb
  32. Cherry-pick 4.4.00 changelog — Damien L-G / githubweb
  33. Bump github/codeql-action from 3.26.0 to 3.26.2 — noreply / githubweb
  34. Improve GH action to produce release artifacts (#7231) — noreply / githubweb
  35. Fix clang-tidy header guard to ignore tpls — Daniel Arndt / githubweb
  36. Fix misspelled cmake variable, for some systems leading to not compiling all tests — crtrott / githubweb
  37. Fix overlooked naming test in #7222 — crtrott / githubweb
  38. Introduce new `SequentialHostInit` view allocation property (#7229) — noreply / githubweb
  39. Adding occupancy tuning for CUDA architectures (#6788) — noreply / githubweb
  40. Fixup [Experimental::]SYCL — Damien L-G / githubweb
  41. Use macros to protect use of SharedSpace — crtrott / githubweb
  42. core(graph): promote `instantiate` to public API — romin.tomasetti / githubweb
  43. Enforce modernize-type-traits (#7227) — noreply / githubweb
  44. typo: kokkkos -> kokkos — romin.tomasetti / githubweb
  45. core(view): aligning `HostMirror` and `host_mirror_type` — romin.tomasetti / githubweb
  46. Add support for CUDA unified memory architectures i.e. Grace Hopper (#6823) — noreply / githubweb
  47. Using CUDA limits to set extents for blocks,grids (#7235) — noreply / githubweb
  48. Fix some more clang-tidy complains — Daniel Arndt / githubweb
  49. Update nightly CI from ROCm 6.1 to ROCm 6.2 — Bruno Turcksin / githubweb
  50. Don't use modulo — Daniel Arndt / githubweb

#12 (Aug 23, 2024, 4:44:38 PM)

  1. SYCL: Address oneAPI 2025.0.0 deprecations — Daniel Arndt / githubweb
  2. Add workflow to create releases with SLSA provenance generation — Damien L-G / githubweb
  3. Drop (unused) .clang-format-ignore file — Damien L-G / githubweb
  4. Remove (unused) code coverage configuration file — Damien L-G / githubweb
  5. Remove master_history.txt — Damien L-G / githubweb
  6. Prefer dot-file-style build configuration file with AppVeyor — Damien L-G / githubweb
  7. Remove obscure test scripts — Damien L-G / githubweb
  8. Update CONTRIBUTING and remove outdated comment to reflect our using clang-foramt-16 [ci ckip] — Damien L-G / githubweb
  9. Drop outdated file describing how to snapshot Kokkos into Trilinos [ci skip] — Damien L-G / githubweb
  10. Ignore codebase formatting commits in the blame view (#7204) — noreply / githubweb
  11. Update error message clang-format version not 16 (#7208) — noreply / githubweb
  12. [ci skip] Do not add CI and other tools configuration files to archive files — Damien L-G / githubweb
  13. GitHub CI: Use hashes for versions — Daniel Arndt / githubweb
  14. OpenMPTarget: FunctorAdapter to centralize tag evaluation. (#7200) — noreply / githubweb
  15. Pin dependencies in Dockerfiles (#7210) — noreply / githubweb
  16. OpenMPTarget: FunctorAdapter bug in AMD GPU code path. — rgayatri / githubweb
  17. Bump actions/checkout from 4.1.6 to 4.1.7 — noreply / githubweb
  18. Bump actions/upload-artifact from 4.3.5 to 4.3.6 — noreply / githubweb
  19. Bump github/codeql-action from 3.25.6 to 3.26.0 — noreply / githubweb
  20. Bump DoozyX/clang-format-lint-action from 0.17 to 0.18 — noreply / githubweb
  21. Simplify constraints for trivial types — Damien L-G / githubweb
  22. Drop pointless constrait on Impl::ViewValueFunctor<ScalarType>::construct_shared_allocation() — Damien L-G / githubweb
  23. Specialize Impl::ViewValueFunctor for trivial types — Damien L-G / githubweb
  24. Improve fence labels in View init and delete — Damien L-G / githubweb
  25. Add functions for retrieving the default tile sizes in MDRangePolicy (#6839) — noreply / githubweb
  26. Fix ub in bin sort when sorting within bins on host (#7223) — noreply / githubweb
  27. Hands off reserved identifiers (#7224) — noreply / githubweb
  28. Hide IMPL_CUDA_MALLOC_ASYNC option when CUDA is not enabled — Damien L-G / githubweb
  29. Mark HIP_MULTIPLE_KERNEL_INSTANTIATIONS and IMPL_HIP_UNIFIED_MEMORY options as backend specific — Damien L-G / githubweb
  30. Improve further view initialization/destruction (#7225) — noreply / githubweb
  31. Cuda: Check if device support cudaMallocAsync (#7217) — noreply / githubweb
  32. Cherry-pick 4.4.00 changelog — Damien L-G / githubweb
  33. Bump github/codeql-action from 3.26.0 to 3.26.2 — noreply / githubweb
  34. Improve GH action to produce release artifacts (#7231) — noreply / githubweb
  35. Fix clang-tidy header guard to ignore tpls — Daniel Arndt / githubweb
  36. Fix misspelled cmake variable, for some systems leading to not compiling all tests — crtrott / githubweb
  37. Fix overlooked naming test in #7222 — crtrott / githubweb
  38. Introduce new `SequentialHostInit` view allocation property (#7229) — noreply / githubweb
  39. Adding occupancy tuning for CUDA architectures (#6788) — noreply / githubweb
  40. Fixup [Experimental::]SYCL — Damien L-G / githubweb
  41. Use macros to protect use of SharedSpace — crtrott / githubweb
  42. core(graph): promote `instantiate` to public API — romin.tomasetti / githubweb
  43. Enforce modernize-type-traits (#7227) — noreply / githubweb
  44. typo: kokkkos -> kokkos — romin.tomasetti / githubweb
  45. core(view): aligning `HostMirror` and `host_mirror_type` — romin.tomasetti / githubweb
  46. Add support for CUDA unified memory architectures i.e. Grace Hopper (#6823) — noreply / githubweb
  47. Using CUDA limits to set extents for blocks,grids (#7235) — noreply / githubweb
  48. Fix some more clang-tidy complains — Daniel Arndt / githubweb
  49. Update nightly CI from ROCm 6.1 to ROCm 6.2 — Bruno Turcksin / githubweb
  50. Don't use modulo — Daniel Arndt / githubweb

#12 (Aug 23, 2024, 4:44:38 PM)

  1. SYCL: Address oneAPI 2025.0.0 deprecations — Daniel Arndt / githubweb
  2. Add workflow to create releases with SLSA provenance generation — Damien L-G / githubweb
  3. Drop (unused) .clang-format-ignore file — Damien L-G / githubweb
  4. Remove (unused) code coverage configuration file — Damien L-G / githubweb
  5. Remove master_history.txt — Damien L-G / githubweb
  6. Prefer dot-file-style build configuration file with AppVeyor — Damien L-G / githubweb
  7. Remove obscure test scripts — Damien L-G / githubweb
  8. Update CONTRIBUTING and remove outdated comment to reflect our using clang-foramt-16 [ci ckip] — Damien L-G / githubweb
  9. Drop outdated file describing how to snapshot Kokkos into Trilinos [ci skip] — Damien L-G / githubweb
  10. Ignore codebase formatting commits in the blame view (#7204) — noreply / githubweb
  11. Update error message clang-format version not 16 (#7208) — noreply / githubweb
  12. [ci skip] Do not add CI and other tools configuration files to archive files — Damien L-G / githubweb
  13. GitHub CI: Use hashes for versions — Daniel Arndt / githubweb
  14. OpenMPTarget: FunctorAdapter to centralize tag evaluation. (#7200) — noreply / githubweb
  15. Pin dependencies in Dockerfiles (#7210) — noreply / githubweb
  16. OpenMPTarget: FunctorAdapter bug in AMD GPU code path. — rgayatri / githubweb
  17. Bump actions/checkout from 4.1.6 to 4.1.7 — noreply / githubweb
  18. Bump actions/upload-artifact from 4.3.5 to 4.3.6 — noreply / githubweb
  19. Bump github/codeql-action from 3.25.6 to 3.26.0 — noreply / githubweb
  20. Bump DoozyX/clang-format-lint-action from 0.17 to 0.18 — noreply / githubweb
  21. Simplify constraints for trivial types — Damien L-G / githubweb
  22. Drop pointless constrait on Impl::ViewValueFunctor<ScalarType>::construct_shared_allocation() — Damien L-G / githubweb
  23. Specialize Impl::ViewValueFunctor for trivial types — Damien L-G / githubweb
  24. Improve fence labels in View init and delete — Damien L-G / githubweb
  25. Add functions for retrieving the default tile sizes in MDRangePolicy (#6839) — noreply / githubweb
  26. Fix ub in bin sort when sorting within bins on host (#7223) — noreply / githubweb
  27. Hands off reserved identifiers (#7224) — noreply / githubweb
  28. Hide IMPL_CUDA_MALLOC_ASYNC option when CUDA is not enabled — Damien L-G / githubweb
  29. Mark HIP_MULTIPLE_KERNEL_INSTANTIATIONS and IMPL_HIP_UNIFIED_MEMORY options as backend specific — Damien L-G / githubweb
  30. Improve further view initialization/destruction (#7225) — noreply / githubweb
  31. Cuda: Check if device support cudaMallocAsync (#7217) — noreply / githubweb
  32. Cherry-pick 4.4.00 changelog — Damien L-G / githubweb
  33. Bump github/codeql-action from 3.26.0 to 3.26.2 — noreply / githubweb
  34. Improve GH action to produce release artifacts (#7231) — noreply / githubweb
  35. Fix clang-tidy header guard to ignore tpls — Daniel Arndt / githubweb
  36. Fix misspelled cmake variable, for some systems leading to not compiling all tests — crtrott / githubweb
  37. Fix overlooked naming test in #7222 — crtrott / githubweb
  38. Introduce new `SequentialHostInit` view allocation property (#7229) — noreply / githubweb
  39. Adding occupancy tuning for CUDA architectures (#6788) — noreply / githubweb
  40. Fixup [Experimental::]SYCL — Damien L-G / githubweb
  41. Use macros to protect use of SharedSpace — crtrott / githubweb
  42. core(graph): promote `instantiate` to public API — romin.tomasetti / githubweb
  43. Enforce modernize-type-traits (#7227) — noreply / githubweb
  44. typo: kokkkos -> kokkos — romin.tomasetti / githubweb
  45. core(view): aligning `HostMirror` and `host_mirror_type` — romin.tomasetti / githubweb
  46. Add support for CUDA unified memory architectures i.e. Grace Hopper (#6823) — noreply / githubweb
  47. Using CUDA limits to set extents for blocks,grids (#7235) — noreply / githubweb
  48. Fix some more clang-tidy complains — Daniel Arndt / githubweb
  49. Update nightly CI from ROCm 6.1 to ROCm 6.2 — Bruno Turcksin / githubweb
  50. Don't use modulo — Daniel Arndt / githubweb

#12 (Aug 23, 2024, 4:44:38 PM)

  1. SYCL: Address oneAPI 2025.0.0 deprecations — Daniel Arndt / githubweb
  2. Add workflow to create releases with SLSA provenance generation — Damien L-G / githubweb
  3. Drop (unused) .clang-format-ignore file — Damien L-G / githubweb
  4. Remove (unused) code coverage configuration file — Damien L-G / githubweb
  5. Remove master_history.txt — Damien L-G / githubweb
  6. Prefer dot-file-style build configuration file with AppVeyor — Damien L-G / githubweb
  7. Remove obscure test scripts — Damien L-G / githubweb
  8. Update CONTRIBUTING and remove outdated comment to reflect our using clang-foramt-16 [ci ckip] — Damien L-G / githubweb
  9. Drop outdated file describing how to snapshot Kokkos into Trilinos [ci skip] — Damien L-G / githubweb
  10. Ignore codebase formatting commits in the blame view (#7204) — noreply / githubweb
  11. Update error message clang-format version not 16 (#7208) — noreply / githubweb
  12. [ci skip] Do not add CI and other tools configuration files to archive files — Damien L-G / githubweb
  13. GitHub CI: Use hashes for versions — Daniel Arndt / githubweb
  14. OpenMPTarget: FunctorAdapter to centralize tag evaluation. (#7200) — noreply / githubweb
  15. Pin dependencies in Dockerfiles (#7210) — noreply / githubweb
  16. OpenMPTarget: FunctorAdapter bug in AMD GPU code path. — rgayatri / githubweb
  17. Bump actions/checkout from 4.1.6 to 4.1.7 — noreply / githubweb
  18. Bump actions/upload-artifact from 4.3.5 to 4.3.6 — noreply / githubweb
  19. Bump github/codeql-action from 3.25.6 to 3.26.0 — noreply / githubweb
  20. Bump DoozyX/clang-format-lint-action from 0.17 to 0.18 — noreply / githubweb
  21. Simplify constraints for trivial types — Damien L-G / githubweb
  22. Drop pointless constrait on Impl::ViewValueFunctor<ScalarType>::construct_shared_allocation() — Damien L-G / githubweb
  23. Specialize Impl::ViewValueFunctor for trivial types — Damien L-G / githubweb
  24. Improve fence labels in View init and delete — Damien L-G / githubweb
  25. Add functions for retrieving the default tile sizes in MDRangePolicy (#6839) — noreply / githubweb
  26. Fix ub in bin sort when sorting within bins on host (#7223) — noreply / githubweb
  27. Hands off reserved identifiers (#7224) — noreply / githubweb
  28. Hide IMPL_CUDA_MALLOC_ASYNC option when CUDA is not enabled — Damien L-G / githubweb
  29. Mark HIP_MULTIPLE_KERNEL_INSTANTIATIONS and IMPL_HIP_UNIFIED_MEMORY options as backend specific — Damien L-G / githubweb
  30. Improve further view initialization/destruction (#7225) — noreply / githubweb
  31. Cuda: Check if device support cudaMallocAsync (#7217) — noreply / githubweb
  32. Cherry-pick 4.4.00 changelog — Damien L-G / githubweb
  33. Bump github/codeql-action from 3.26.0 to 3.26.2 — noreply / githubweb
  34. Improve GH action to produce release artifacts (#7231) — noreply / githubweb
  35. Fix clang-tidy header guard to ignore tpls — Daniel Arndt / githubweb
  36. Fix misspelled cmake variable, for some systems leading to not compiling all tests — crtrott / githubweb
  37. Fix overlooked naming test in #7222 — crtrott / githubweb
  38. Introduce new `SequentialHostInit` view allocation property (#7229) — noreply / githubweb
  39. Adding occupancy tuning for CUDA architectures (#6788) — noreply / githubweb
  40. Fixup [Experimental::]SYCL — Damien L-G / githubweb
  41. Use macros to protect use of SharedSpace — crtrott / githubweb
  42. core(graph): promote `instantiate` to public API — romin.tomasetti / githubweb
  43. Enforce modernize-type-traits (#7227) — noreply / githubweb
  44. typo: kokkkos -> kokkos — romin.tomasetti / githubweb
  45. core(view): aligning `HostMirror` and `host_mirror_type` — romin.tomasetti / githubweb
  46. Add support for CUDA unified memory architectures i.e. Grace Hopper (#6823) — noreply / githubweb
  47. Using CUDA limits to set extents for blocks,grids (#7235) — noreply / githubweb
  48. Fix some more clang-tidy complains — Daniel Arndt / githubweb
  49. Update nightly CI from ROCm 6.1 to ROCm 6.2 — Bruno Turcksin / githubweb
  50. Don't use modulo — Daniel Arndt / githubweb

#12 (Aug 23, 2024, 4:44:38 PM)

  1. SYCL: Address oneAPI 2025.0.0 deprecations — Daniel Arndt / githubweb
  2. Add workflow to create releases with SLSA provenance generation — Damien L-G / githubweb
  3. Drop (unused) .clang-format-ignore file — Damien L-G / githubweb
  4. Remove (unused) code coverage configuration file — Damien L-G / githubweb
  5. Remove master_history.txt — Damien L-G / githubweb
  6. Prefer dot-file-style build configuration file with AppVeyor — Damien L-G / githubweb
  7. Remove obscure test scripts — Damien L-G / githubweb
  8. Update CONTRIBUTING and remove outdated comment to reflect our using clang-foramt-16 [ci ckip] — Damien L-G / githubweb
  9. Drop outdated file describing how to snapshot Kokkos into Trilinos [ci skip] — Damien L-G / githubweb
  10. Ignore codebase formatting commits in the blame view (#7204) — noreply / githubweb
  11. Update error message clang-format version not 16 (#7208) — noreply / githubweb
  12. [ci skip] Do not add CI and other tools configuration files to archive files — Damien L-G / githubweb
  13. GitHub CI: Use hashes for versions — Daniel Arndt / githubweb
  14. OpenMPTarget: FunctorAdapter to centralize tag evaluation. (#7200) — noreply / githubweb
  15. Pin dependencies in Dockerfiles (#7210) — noreply / githubweb
  16. OpenMPTarget: FunctorAdapter bug in AMD GPU code path. — rgayatri / githubweb
  17. Bump actions/checkout from 4.1.6 to 4.1.7 — noreply / githubweb
  18. Bump actions/upload-artifact from 4.3.5 to 4.3.6 — noreply / githubweb
  19. Bump github/codeql-action from 3.25.6 to 3.26.0 — noreply / githubweb
  20. Bump DoozyX/clang-format-lint-action from 0.17 to 0.18 — noreply / githubweb
  21. Simplify constraints for trivial types — Damien L-G / githubweb
  22. Drop pointless constrait on Impl::ViewValueFunctor<ScalarType>::construct_shared_allocation() — Damien L-G / githubweb
  23. Specialize Impl::ViewValueFunctor for trivial types — Damien L-G / githubweb
  24. Improve fence labels in View init and delete — Damien L-G / githubweb
  25. Add functions for retrieving the default tile sizes in MDRangePolicy (#6839) — noreply / githubweb
  26. Fix ub in bin sort when sorting within bins on host (#7223) — noreply / githubweb
  27. Hands off reserved identifiers (#7224) — noreply / githubweb
  28. Hide IMPL_CUDA_MALLOC_ASYNC option when CUDA is not enabled — Damien L-G / githubweb
  29. Mark HIP_MULTIPLE_KERNEL_INSTANTIATIONS and IMPL_HIP_UNIFIED_MEMORY options as backend specific — Damien L-G / githubweb
  30. Improve further view initialization/destruction (#7225) — noreply / githubweb
  31. Cuda: Check if device support cudaMallocAsync (#7217) — noreply / githubweb
  32. Cherry-pick 4.4.00 changelog — Damien L-G / githubweb
  33. Bump github/codeql-action from 3.26.0 to 3.26.2 — noreply / githubweb
  34. Improve GH action to produce release artifacts (#7231) — noreply / githubweb
  35. Fix clang-tidy header guard to ignore tpls — Daniel Arndt / githubweb
  36. Fix misspelled cmake variable, for some systems leading to not compiling all tests — crtrott / githubweb
  37. Fix overlooked naming test in #7222 — crtrott / githubweb
  38. Introduce new `SequentialHostInit` view allocation property (#7229) — noreply / githubweb
  39. Adding occupancy tuning for CUDA architectures (#6788) — noreply / githubweb
  40. Fixup [Experimental::]SYCL — Damien L-G / githubweb
  41. Use macros to protect use of SharedSpace — crtrott / githubweb
  42. core(graph): promote `instantiate` to public API — romin.tomasetti / githubweb
  43. Enforce modernize-type-traits (#7227) — noreply / githubweb
  44. typo: kokkkos -> kokkos — romin.tomasetti / githubweb
  45. core(view): aligning `HostMirror` and `host_mirror_type` — romin.tomasetti / githubweb
  46. Add support for CUDA unified memory architectures i.e. Grace Hopper (#6823) — noreply / githubweb
  47. Using CUDA limits to set extents for blocks,grids (#7235) — noreply / githubweb
  48. Fix some more clang-tidy complains — Daniel Arndt / githubweb
  49. Update nightly CI from ROCm 6.1 to ROCm 6.2 — Bruno Turcksin / githubweb
  50. Don't use modulo — Daniel Arndt / githubweb

#12 (Aug 23, 2024, 4:44:38 PM)

  1. SYCL: Address oneAPI 2025.0.0 deprecations — Daniel Arndt / githubweb
  2. Add workflow to create releases with SLSA provenance generation — Damien L-G / githubweb
  3. Drop (unused) .clang-format-ignore file — Damien L-G / githubweb
  4. Remove (unused) code coverage configuration file — Damien L-G / githubweb
  5. Remove master_history.txt — Damien L-G / githubweb
  6. Prefer dot-file-style build configuration file with AppVeyor — Damien L-G / githubweb
  7. Remove obscure test scripts — Damien L-G / githubweb
  8. Update CONTRIBUTING and remove outdated comment to reflect our using clang-foramt-16 [ci ckip] — Damien L-G / githubweb
  9. Drop outdated file describing how to snapshot Kokkos into Trilinos [ci skip] — Damien L-G / githubweb
  10. Ignore codebase formatting commits in the blame view (#7204) — noreply / githubweb
  11. Update error message clang-format version not 16 (#7208) — noreply / githubweb
  12. [ci skip] Do not add CI and other tools configuration files to archive files — Damien L-G / githubweb
  13. GitHub CI: Use hashes for versions — Daniel Arndt / githubweb
  14. OpenMPTarget: FunctorAdapter to centralize tag evaluation. (#7200) — noreply / githubweb
  15. Pin dependencies in Dockerfiles (#7210) — noreply / githubweb
  16. OpenMPTarget: FunctorAdapter bug in AMD GPU code path. — rgayatri / githubweb
  17. Bump actions/checkout from 4.1.6 to 4.1.7 — noreply / githubweb
  18. Bump actions/upload-artifact from 4.3.5 to 4.3.6 — noreply / githubweb
  19. Bump github/codeql-action from 3.25.6 to 3.26.0 — noreply / githubweb
  20. Bump DoozyX/clang-format-lint-action from 0.17 to 0.18 — noreply / githubweb
  21. Simplify constraints for trivial types — Damien L-G / githubweb
  22. Drop pointless constrait on Impl::ViewValueFunctor<ScalarType>::construct_shared_allocation() — Damien L-G / githubweb
  23. Specialize Impl::ViewValueFunctor for trivial types — Damien L-G / githubweb
  24. Improve fence labels in View init and delete — Damien L-G / githubweb
  25. Add functions for retrieving the default tile sizes in MDRangePolicy (#6839) — noreply / githubweb
  26. Fix ub in bin sort when sorting within bins on host (#7223) — noreply / githubweb
  27. Hands off reserved identifiers (#7224) — noreply / githubweb
  28. Hide IMPL_CUDA_MALLOC_ASYNC option when CUDA is not enabled — Damien L-G / githubweb
  29. Mark HIP_MULTIPLE_KERNEL_INSTANTIATIONS and IMPL_HIP_UNIFIED_MEMORY options as backend specific — Damien L-G / githubweb
  30. Improve further view initialization/destruction (#7225) — noreply / githubweb
  31. Cuda: Check if device support cudaMallocAsync (#7217) — noreply / githubweb
  32. Cherry-pick 4.4.00 changelog — Damien L-G / githubweb
  33. Bump github/codeql-action from 3.26.0 to 3.26.2 — noreply / githubweb
  34. Improve GH action to produce release artifacts (#7231) — noreply / githubweb
  35. Fix clang-tidy header guard to ignore tpls — Daniel Arndt / githubweb
  36. Fix misspelled cmake variable, for some systems leading to not compiling all tests — crtrott / githubweb
  37. Fix overlooked naming test in #7222 — crtrott / githubweb
  38. Introduce new `SequentialHostInit` view allocation property (#7229) — noreply / githubweb
  39. Adding occupancy tuning for CUDA architectures (#6788) — noreply / githubweb
  40. Fixup [Experimental::]SYCL — Damien L-G / githubweb
  41. Use macros to protect use of SharedSpace — crtrott / githubweb
  42. core(graph): promote `instantiate` to public API — romin.tomasetti / githubweb
  43. Enforce modernize-type-traits (#7227) — noreply / githubweb
  44. typo: kokkkos -> kokkos — romin.tomasetti / githubweb
  45. core(view): aligning `HostMirror` and `host_mirror_type` — romin.tomasetti / githubweb
  46. Add support for CUDA unified memory architectures i.e. Grace Hopper (#6823) — noreply / githubweb
  47. Using CUDA limits to set extents for blocks,grids (#7235) — noreply / githubweb
  48. Fix some more clang-tidy complains — Daniel Arndt / githubweb
  49. Update nightly CI from ROCm 6.1 to ROCm 6.2 — Bruno Turcksin / githubweb
  50. Don't use modulo — Daniel Arndt / githubweb

#12 (Aug 23, 2024, 4:44:38 PM)

  1. SYCL: Address oneAPI 2025.0.0 deprecations — Daniel Arndt / githubweb
  2. Add workflow to create releases with SLSA provenance generation — Damien L-G / githubweb
  3. Drop (unused) .clang-format-ignore file — Damien L-G / githubweb
  4. Remove (unused) code coverage configuration file — Damien L-G / githubweb
  5. Remove master_history.txt — Damien L-G / githubweb
  6. Prefer dot-file-style build configuration file with AppVeyor — Damien L-G / githubweb
  7. Remove obscure test scripts — Damien L-G / githubweb
  8. Update CONTRIBUTING and remove outdated comment to reflect our using clang-foramt-16 [ci ckip] — Damien L-G / githubweb
  9. Drop outdated file describing how to snapshot Kokkos into Trilinos [ci skip] — Damien L-G / githubweb
  10. Ignore codebase formatting commits in the blame view (#7204) — noreply / githubweb
  11. Update error message clang-format version not 16 (#7208) — noreply / githubweb
  12. [ci skip] Do not add CI and other tools configuration files to archive files — Damien L-G / githubweb
  13. GitHub CI: Use hashes for versions — Daniel Arndt / githubweb
  14. OpenMPTarget: FunctorAdapter to centralize tag evaluation. (#7200) — noreply / githubweb
  15. Pin dependencies in Dockerfiles (#7210) — noreply / githubweb
  16. OpenMPTarget: FunctorAdapter bug in AMD GPU code path. — rgayatri / githubweb
  17. Bump actions/checkout from 4.1.6 to 4.1.7 — noreply / githubweb
  18. Bump actions/upload-artifact from 4.3.5 to 4.3.6 — noreply / githubweb
  19. Bump github/codeql-action from 3.25.6 to 3.26.0 — noreply / githubweb
  20. Bump DoozyX/clang-format-lint-action from 0.17 to 0.18 — noreply / githubweb
  21. Simplify constraints for trivial types — Damien L-G / githubweb
  22. Drop pointless constrait on Impl::ViewValueFunctor<ScalarType>::construct_shared_allocation() — Damien L-G / githubweb
  23. Specialize Impl::ViewValueFunctor for trivial types — Damien L-G / githubweb
  24. Improve fence labels in View init and delete — Damien L-G / githubweb
  25. Add functions for retrieving the default tile sizes in MDRangePolicy (#6839) — noreply / githubweb
  26. Fix ub in bin sort when sorting within bins on host (#7223) — noreply / githubweb
  27. Hands off reserved identifiers (#7224) — noreply / githubweb
  28. Hide IMPL_CUDA_MALLOC_ASYNC option when CUDA is not enabled — Damien L-G / githubweb
  29. Mark HIP_MULTIPLE_KERNEL_INSTANTIATIONS and IMPL_HIP_UNIFIED_MEMORY options as backend specific — Damien L-G / githubweb
  30. Improve further view initialization/destruction (#7225) — noreply / githubweb
  31. Cuda: Check if device support cudaMallocAsync (#7217) — noreply / githubweb
  32. Cherry-pick 4.4.00 changelog — Damien L-G / githubweb
  33. Bump github/codeql-action from 3.26.0 to 3.26.2 — noreply / githubweb
  34. Improve GH action to produce release artifacts (#7231) — noreply / githubweb
  35. Fix clang-tidy header guard to ignore tpls — Daniel Arndt / githubweb
  36. Fix misspelled cmake variable, for some systems leading to not compiling all tests — crtrott / githubweb
  37. Fix overlooked naming test in #7222 — crtrott / githubweb
  38. Introduce new `SequentialHostInit` view allocation property (#7229) — noreply / githubweb
  39. Adding occupancy tuning for CUDA architectures (#6788) — noreply / githubweb
  40. Fixup [Experimental::]SYCL — Damien L-G / githubweb
  41. Use macros to protect use of SharedSpace — crtrott / githubweb
  42. core(graph): promote `instantiate` to public API — romin.tomasetti / githubweb
  43. Enforce modernize-type-traits (#7227) — noreply / githubweb
  44. typo: kokkkos -> kokkos — romin.tomasetti / githubweb
  45. core(view): aligning `HostMirror` and `host_mirror_type` — romin.tomasetti / githubweb
  46. Add support for CUDA unified memory architectures i.e. Grace Hopper (#6823) — noreply / githubweb
  47. Using CUDA limits to set extents for blocks,grids (#7235) — noreply / githubweb
  48. Fix some more clang-tidy complains — Daniel Arndt / githubweb
  49. Update nightly CI from ROCm 6.1 to ROCm 6.2 — Bruno Turcksin / githubweb
  50. Don't use modulo — Daniel Arndt / githubweb

#11 (Aug 8, 2024, 10:52:26 AM)

  1. SYCL: Use sycl::shift_group_[left|right] and sycl::select_from_group (#7146) — noreply / githubweb
  2. Hidden friend operator== for Kokkos::Array (#7148) — noreply / githubweb
  3. OpenMPTarget: Update docker clang build. (#7147) — noreply / githubweb
  4. Make struct "ChunkSize" constructor explicit to avoid implicit construction in RangePolicy (#7151) — noreply / githubweb
  5. Fix Kokkos::Array<T, 0> default initialization for icpc (#7154) — noreply / githubweb
  6. Make ExecutionSpace constructors explicit (#7156) — noreply / githubweb
  7. Bump ossf/scorecard-action from 2.3.3 to 2.4.0 — noreply / githubweb
  8. Fix Kokkos_CoreUnitTest_DeviceAndThreads (#7159) — noreply / githubweb
  9. Add nvidia Grace Architecture (#7158) — noreply / githubweb
  10. tutorials: do not mention requiring c++11 — timo.heister / githubweb
  11. Enable deprecation warnings in the GCC 8.4 build — Damien L-G / githubweb
  12. Disable deprecated warnings with GCC < 11.1 for Pair<T1, void> — Damien L-G / githubweb
  13. Prefer ExecutionSpace::name() to a typeid expression in hello world — Damien L-G / githubweb
  14. OpenMPTarget: Delete unused code. — rgayatri / githubweb
  15. Implement KOKKOS_ENABLE_IMPL_VIEW_OF_VIEWS_DESTRUCTOR_PRECONDITION_VIOLATION_WORKAROUND (#7168) — noreply / githubweb
  16. Hide `IMPL_REF_COUNT_BRANCH_UNLIKELY` option (#7175) — noreply / githubweb
  17. [ci skip] Bump develop to version 4.4.99 — Damien L-G / githubweb
  18. remove usage of ENABLE_CXX11_DISPATCH_LAMBDA (#7176) — noreply / githubweb
  19. Add support for AMD Phoenix APUs with Radeon 740M/760M/780M/880M/890M (#7162) — noreply / githubweb
  20. mention indent/formatting script — timo.heister / githubweb
  21. add missing tutorials to CMake configuration — timo.heister / githubweb
  22. add NOLINT statement — timo.heister / githubweb
  23. Fix bogus warnings for cuda/11.4 with gcc/8.5 (#7181) — noreply / githubweb
  24. OpenMPTarget: Remove OpenMPTargetExec (#6594) — noreply / githubweb
  25. Avoid nesting fences into parallel_for when initializing/deleting views — Damien L-G / githubweb
  26. Fix atomic accessor for pre-volta GPU architectures (#7189) — noreply / githubweb
  27. Bump actions/upload-artifact from 4.3.4 to 4.3.5 — noreply / githubweb
  28. OpenMPTarget: DeepCopy in separate file. (#7192) — noreply / githubweb
  29. Move SYCL out of Experimental (#7171) — noreply / githubweb
  30. clang-format 16 — Daniel Arndt / githubweb

#10 (Jul 22, 2024, 4:25:43 PM)

  1. Fix HIP — Daniel Arndt / githubweb

#9 (Jul 22, 2024, 4:23:38 PM)

  1. Fix HIP — Daniel Arndt / githubweb

#8 (Jul 22, 2024, 2:15:48 PM)

  1. Fix sign comparison warnings — Daniel Arndt / githubweb

#5 (Jan 29, 2024, 7:27:58 PM)

  1. Add runtime function to query the number of devices and make device ID consistent with `KOKKOS_VISIBLE_DEVICES` (#6713) — noreply / githubweb
  2. Clean up test case — Daniel Arndt / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-Muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP: Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTransformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17 (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-Muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP: Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-Muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17 (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP: Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedback — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTransformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configurations with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configurations with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_Nan, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTriats.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_Nan, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP: Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17 (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_Nan, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if a AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros  (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. ad threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17  (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF  (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb

#3 (Jan 25, 2024, 10:41:36 AM)

  1. implementation and tests — fnrizzi / githubweb
  2. implementation and tests — fnrizzi / githubweb
  3. only compute with relevant entries — tccleve / githubweb
  4. subset of team level impl of std algorithms — fnrizzi / githubweb
  5. fix copyright — fnrizzi / githubweb
  6. guard for openmptarget — fnrizzi / githubweb
  7. fix for openmptarget — fnrizzi / githubweb
  8. address comments — fnrizzi / githubweb
  9. SIMD: suppress an uninitialized variable warning (#6294) — noreply / githubweb
  10. OpenACC CMakechange Clacc (#6250) — noreply / githubweb
  11. `Kokkos::sort` support custom comparator (#6253) — noreply / githubweb
  12. Add nightly build using latest gcc and c++23 — Bruno Turcksin / githubweb
  13. Explicitly capture this in lambda function — Bruno Turcksin / githubweb
  14. Fix typo in nightly jenkins configuration — Bruno Turcksin / githubweb
  15. Fix a memory bug in the `free_state` function of random pools. (#6290) — noreply / githubweb
  16. std_algos: for_each: try condense the impl — fnrizzi / githubweb
  17. format — fnrizzi / githubweb
  18. Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants — Daniel Arndt / githubweb
  19. Use KOKKOS_IF_ON_HOST — Daniel Arndt / githubweb
  20. Suppress warnings — Daniel Arndt / githubweb
  21. Don't suppress warnings for NVHPC — Daniel Arndt / githubweb
  22. Improve macro definitions — Daniel Arndt / githubweb
  23. Enable Serial backend in HPX build — cezary.skrzynski / githubweb
  24. Modify fences in View API test — cezary.skrzynski / githubweb
  25. Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159) — noreply / githubweb
  26. Improve SYCL reduction performance: MDRangePolicy (#6271) — noreply / githubweb
  27. bring back previous code as discussed in meeting — fnrizzi / githubweb
  28. create cudaAPI function wrappers — tccleve / githubweb
  29. Reorganize #include <cuda_runtime_api.h> — tccleve / githubweb
  30. Some api function require cuda11.2+ — tccleve / githubweb
  31. Cuda10 requires "stream=nullptr" as default arg — tccleve / githubweb
  32. Rework stream inputs — tccleve / githubweb
  33. Use "if constexpr" for setCudaDevice — tccleve / githubweb
  34. Remove static in comment — tccleve / githubweb
  35. Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314) — noreply / githubweb
  36. add helper variable templates `are_*_iterators_v` (#6312) — noreply / githubweb
  37. [HIP] Optimize parallel_reduce (#6229) — noreply / githubweb
  38. Split Kokkos_SYCL_Parallel_Range.hpp — Daniel Arndt / githubweb
  39. Split Kokkos_SYCL_Parallel_Reduce.hpp — Daniel Arndt / githubweb
  40. Move Kokkos_SYCL_Scan.hpp — Daniel Arndt / githubweb
  41. Split Kokkos_Parallel_Team.hpp — Daniel Arndt / githubweb
  42. Add comment that we don't need to link against clang_rt with ROCm 5.5 — Bruno Turcksin / githubweb
  43. Error out when CXX standard is not set when using amdclang or cray clang — Bruno Turcksin / githubweb
  44. Disable failing tests for ROCm 5.5 and 5.6 — Bruno Turcksin / githubweb
  45. #5635: Add parallel_scan overload with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  46. #5635: Add test for parallel_scan with return value for ThreadVectorRange — arek.szczepkowicz / githubweb
  47. #5635: Serial/OpenMP:  Parallel_scan with return value for TeamThreadRange (#6090) — noreply / githubweb
  48. Fix reviewer's comments — Bruno Turcksin / githubweb
  49. Allow using the SYCL execution space on AMD GPUs (#6321) — noreply / githubweb
  50. Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318) — noreply / githubweb
  51. SIMD: add shift ops for all int types (#6109) — noreply / githubweb
  52. SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on  (#6300) — noreply / githubweb
  53. remove spurious undefs — fnrizzi / githubweb
  54. Decrease maximum memory available to ccache — Bruno Turcksin / githubweb
  55. Fix gtest when using C++20 — Bruno Turcksin / githubweb
  56. Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces — Daniel Arndt / githubweb
  57. address reviews [skip ci] — antoine.meyer54 / githubweb
  58. formatting — fnrizzi / githubweb
  59. SIMD: split simd unit tests into separate files (#6278) — noreply / githubweb
  60. Run NVHPC only on V100 — Bruno Turcksin / githubweb
  61. Use checked arithmetic builtins for overflow detection (#6313) — noreply / githubweb
  62. Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273) — noreply / githubweb
  63. allow sorting via native oneDPL  to support views with stride = 1 (#6322) — noreply / githubweb
  64. Adopt new HIP cmake's way of finding clang-rt — nicurtis / githubweb
  65. Fix #6334: intel compiler does not like returning inside if constexpr (#6335) — noreply / githubweb
  66. Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337) — noreply / githubweb
  67. fix lambda capture — fnrizzi / githubweb
  68. remove unnecessary file, fix constraints — fnrizzi / githubweb
  69. remove file — fnrizzi / githubweb
  70. fix — fnrizzi / githubweb
  71. fix lambda capture and constraints — fnrizzi / githubweb
  72. keep only subset — fnrizzi / githubweb
  73. revert files — fnrizzi / githubweb
  74. remove file — fnrizzi / githubweb
  75. fix syntax — fnrizzi / githubweb
  76. format — fnrizzi / githubweb
  77. Disable default oneDPL support in Trilinos — Daniel Arndt / githubweb
  78. Rename AMD GPU architectures (#6266) — noreply / githubweb
  79. Fix compiling SIMD library with NEON and gcc-13 — Daniel Arndt / githubweb
  80. SIMD: add generator constructors (#6347) — noreply / githubweb
  81. Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability — mikael.simberg / githubweb
  82. Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346) — noreply / githubweb
  83. Use std::aligned_alloc for allocations (#6341) — noreply / githubweb
  84. Initial implementation of gfx942 (#6358) — noreply / githubweb
  85. Introduce constructor for multi-GPU support. — Daniel Arndt / githubweb
  86. Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled — Daniel Arndt / githubweb
  87. Extend 'hip_driver_check_page_migration' (#6364) — noreply / githubweb
  88. Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation — mikael.simberg / githubweb
  89. Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers — mikael.simberg / githubweb
  90. Add support for HIP Graph — Bruno Turcksin / githubweb
  91. Replace one of the ROCM 5.2 configuration with ROCM 5.6 — Bruno Turcksin / githubweb
  92. Use aligned version of operator new instead of aligned_alloc — Daniel Arndt / githubweb
  93. HPX: Don't interfere with exception handling — Daniel Arndt / githubweb
  94. Fix -Wformat-truncation warnings in CI (#6354) — noreply / githubweb
  95. Do not use HIP Graph with ROCm 5.2 — Bruno Turcksin / githubweb
  96. Use constexpr West in src — Bruno Turcksin / githubweb
  97. Use constexpr West in test — Bruno Turcksin / githubweb
  98. SIMD: convert binary operators to hidden friends (#6320) — noreply / githubweb
  99. simd: make mask and condition unit test to check with all data types (#6360) — noreply / githubweb
  100. team-level std algos: part 2 (#6205) — noreply / githubweb
  101. Add deprecated attribute to HostSpace(AllocationMechanism) definition — Damien L-G / githubweb
  102. Added gather_from and scatter_to for AVX2 and AVX512 simd — donlee / githubweb
  103. Rebased and applied feedbacks — donlee / githubweb
  104. Changed gen ctor usage to pass in KOKKOS_LAMBDA — donlee / githubweb
  105. Suppress NVCC attribute ignored warning on SIMD operators as hidden friends — Damien L-G / githubweb
  106. Remove deprecated code 3 support for volatile join — crtrott / githubweb
  107. Disable a test not working with nvhpc-23.1 — crtrott / githubweb
  108. Reenabling tests for nvhpc 23.7 — crtrott / githubweb
  109. Update containers and algorithms for NVC++ 23.7 — crtrott / githubweb
  110. More NVC++ 23.7 updates — crtrott / githubweb
  111. NVC++ clang-format fixes — crtrott / githubweb
  112. Update nvhpc to version 23.7 in the CI — crtrott / githubweb
  113. NVHPC 23.7 update: address reviewer comments — crtrott / githubweb
  114. OpenACC: Guard tests relying on abort — crtrott / githubweb
  115. Fix TestAtomic to use the test execspace — crtrott / githubweb
  116. Use NVHPC 23.7 for testing of OpenACC — crtrott / githubweb
  117. Update nvhpc gtest skip message — crtrott / githubweb
  118. Work around OpenMPTarget failure — crtrott / githubweb
  119. Update base docker file for nvhpc — crtrott / githubweb
  120. Remove stray Cuda graph pattern specialization from tag — Damien L-G / githubweb
  121. Fix typo — noreply / githubweb
  122. Fix reviewer's comments — Bruno Turcksin / githubweb
  123. Fix uninitialized variable warning with gcc 13 — Bruno Turcksin / githubweb
  124. HIP: Update print_configuration (#6387) — noreply / githubweb
  125. Add test — Daniel Arndt / githubweb
  126. Fix typo. — noreply / githubweb
  127. Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248) — noreply / githubweb
  128. Don't use local headers or runtime in HPX backend due to deprecation — mikael.simberg / githubweb
  129. Explicitly check for valid device id — Daniel Arndt / githubweb
  130. Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377) — noreply / githubweb
  131. team-level std algos: part 3 (#6207) — noreply / githubweb
  132. Adding is_scoped_enum & to_underlying (#6356) — noreply / githubweb
  133. SIMD: add float simd support (#6177) — noreply / githubweb
  134. team-level std algos: part 4 (#6208) — noreply / githubweb
  135. Added a gen ctor for float (#6397) — noreply / githubweb
  136. team-level std algos: part 5 (#6209) — noreply / githubweb
  137. Deprecate Cuda(cudaStream_t, bool) — Damien L-G / githubweb
  138. Fixup checked integer operations death test — Damien L-G / githubweb
  139. Deprecate HIP(hipStream_t, bool) — Damien L-G / githubweb
  140. Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default — Damien L-G / githubweb
  141. Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined — Damien L-G / githubweb
  142. Introduce disable_malloc_async Cuda option with generated makefiles — Damien L-G / githubweb
  143. Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC — Damien L-G / githubweb
  144. Use archive extraction time for timestamps — cezary.skrzynski / githubweb
  145. Disable performance benchmarks in AppVeyor CI — cezary.skrzynski / githubweb
  146. team-level std algos: part 6 (#6210) — noreply / githubweb
  147. address comments — fnrizzi / githubweb
  148. OpenMP backend refactor files. (#6403) — noreply / githubweb
  149. Drop (unused) `Cuda::cuda_internal_maximum_shared_words` — Damien L-G / githubweb
  150. Drop check that the host backend is initialized before the Cuda/HIP/SYCL one — Damien L-G / githubweb
  151. Drop unused HIPInternal::m_maxSharedWords data member — Damien L-G / githubweb
  152. Drop unused HIPInternal::m_hipArch static data member — Damien L-G / githubweb
  153. !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize — Damien L-G / githubweb
  154. Drop pre-Kepler logic in Cuda::impl_initialize — Damien L-G / githubweb
  155. use single — fnrizzi / githubweb
  156. address comments — fnrizzi / githubweb
  157. formatting — fnrizzi / githubweb
  158. Team-level std algos: part 7 (#6211) — noreply / githubweb
  159. formatting — fnrizzi / githubweb
  160. Enable death tests for fedora rawhide — cezary.skrzynski / githubweb
  161. core/src: Add half math functions to private header (#6124) — noreply / githubweb
  162. Drop check whether device supports unified addressing — Damien L-G / githubweb
  163. fix single as per Christian's suggestion — fnrizzi / githubweb
  164. Only warn once (at initialization) when forcing allocation in unified memory — Damien L-G / githubweb
  165. check-copyright improvements (#6399) — noreply / githubweb
  166. Use execution space instance argument to get device properties in block size deduction — Damien L-G / githubweb
  167. Address reviewer's comments — Bruno Turcksin / githubweb
  168. Fix to avoid #186-D pointless comparison warning. — maarten.arnst / githubweb
  169. add comment — fnrizzi / githubweb
  170. improve tests to address review — fnrizzi / githubweb
  171. Fix guard for isnan test for bhalf_t — Daniel Arndt / githubweb
  172. avoid potential race condition HIP — tccleve / githubweb
  173. Set the device id in cuda_kernel_arch — Daniel Arndt / githubweb
  174. [SYCL][Reduction] Group counter should use at least memory_order::acq_rel — andrei.elovikov / githubweb
  175. Same for scan — andrei.elovikov / githubweb
  176. Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy — Daniel Arndt / githubweb
  177. improve tests with intra-team result check — fnrizzi / githubweb
  178. Fixes for Kokkos::Array (#6372) — noreply / githubweb
  179. try fix for unique, previous impl to remove later — fnrizzi / githubweb
  180. #5635: Add parallel_scan changes for CUDA and TeamThreadRange — cezary.skrzynski / githubweb
  181. remove old impl — fnrizzi / githubweb
  182. #5635: Enable TeamThreadRange test for CUDA — cezary.skrzynski / githubweb
  183. Clean up benchmarks/gups — cwpears / githubweb
  184. benchmark/gups: use CMake — cwpears / githubweb
  185. OpenMPTarget: Disable check for SIMD compiler macros — Daniel Arndt / githubweb
  186. #5635: Add parallel_scan with value for CUDA and ThreadVectorRange — cezary.skrzynski / githubweb
  187. add missing assert — fnrizzi / githubweb
  188. #5635: Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  189. add intra team check for missing test — fnrizzi / githubweb
  190. fix intel compile error — fnrizzi / githubweb
  191. fix unreachable for intel — fnrizzi / githubweb
  192. re-enable unit tests for sort and random via makefile (#6422) — noreply / githubweb
  193. OpenMPTarget init-join fix (#6444) — noreply / githubweb
  194. Fix Cuda parallel_scan ThreadVectorRange range — Daniel Arndt / githubweb
  195. Assign final sum in Cuda parallel_scan ThreadVectorRange — Daniel Arndt / githubweb
  196. Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443) — noreply / githubweb
  197. std_algos: improving min, max, minmax (#6421) — noreply / githubweb
  198. team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426) — noreply / githubweb
  199. Skip bessel function tests known to fail on Intel GPUs (#6434) — noreply / githubweb
  200. team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436) — noreply / githubweb
  201. Fix race condition in functor_vec_scan_ret_val test — Daniel Arndt / githubweb
  202. Fix parallel_scan_with_reducers test — Daniel Arndt / githubweb
  203. team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425) — noreply / githubweb
  204. improve tests (#6432) — noreply / githubweb
  205. improve tests (#6437) — noreply / githubweb
  206. Move final assignment to correct scope — cezary.skrzynski / githubweb
  207. fix casting warning in Random test — fnrizzi / githubweb
  208. Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449) — noreply / githubweb
  209. HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7 — Daniel Arndt / githubweb
  210. fixes build error for TeamReduce and TeamTransformReduce tests for specific GCC (#6459) — noreply / githubweb
  211. improve tests to check intra-team result (#6431) — noreply / githubweb
  212. SIMD: Math functions should be in namespace Kokkos — Daniel Arndt / githubweb
  213. SYCL: Disable another bessel function test for Intel GPUs — Daniel Arndt / githubweb
  214. Check for default device — Daniel Arndt / githubweb
  215. team-level std algos: part 10 (#6256) — noreply / githubweb
  216. team-level std algos: part 11 (#6258) — noreply / githubweb
  217. #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302) — noreply / githubweb
  218. #5635: Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  219. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  220. use shortcut — cezary.skrzynski / githubweb
  221. Skip testing for non-power-of-two team sizes — cezary.skrzynski / githubweb
  222. Fix formatting — cezary.skrzynski / githubweb
  223. Add parallel_scan overloads with value for HIP backend — cezary.skrzynski / githubweb
  224. Use std::is_same_v — cezary.skrzynski / githubweb
  225. #5635: Move some tests for parallel_scan to TestTeamScan — cezary.skrzynski / githubweb
  226. #5635: SYCL: Add parallel_scan overload with return value — cezary.skrzynski / githubweb
  227. Use std::is_same_v — cezary.skrzynski / githubweb
  228. OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1 — Daniel Arndt / githubweb
  229. Add compatible copy assignment operator to DualView — maarten.arnst / githubweb
  230. Fix minimum version for Google benchmark — Daniel Arndt / githubweb
  231. Add test of copy constructor/assignment operator for DualView. — maarten.arnst / githubweb
  232. Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU — maarten.arnst / githubweb
  233. Modify test so that source and destination view are of different type — maarten.arnst / githubweb
  234. Use call operator instead of run_me function — maarten.arnst / githubweb
  235. team-level std algos: part 12 (#6350) — noreply / githubweb
  236. core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407) — noreply / githubweb
  237. Moving abort and assert into their own public headers (#6445) — noreply / githubweb
  238. Add test for parallel_scan with return value for ThreadVectorRange — cezary.skrzynski / githubweb
  239. Add parallel_scan overloads with value for Threads — cezary.skrzynski / githubweb
  240. Allow detecting SIMD types based on compiler macros (#6188) — noreply / githubweb
  241. Add KOKKOS_ARCH_ARM_NEON — Daniel Arndt / githubweb
  242. Fix implementation for cyl_bessel_i0 — Daniel Arndt / githubweb
  243. cleaning: remove iostream from headers where possible (IWYU) — romin.tomasetti / githubweb
  244. Fix compiling SIMD unit tests on NVIDIA — Daniel Arndt / githubweb
  245. guards to ensure DBL_EPSILON return for POWER8,9 — ajpowel / githubweb
  246. Fix NVCC warnings (#6483) — noreply / githubweb
  247. team-level std algos: part 13 (#6351) — noreply / githubweb
  248. Also fix annotations for generator constructor for AVX512 and NEON — Daniel Arndt / githubweb
  249. #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange — cezary.skrzynski / githubweb
  250. Fix atomic operations bug for Min and Max (#6435) — noreply / githubweb
  251. Fix example/build_cmake_installed_different_compiler — Daniel Arndt / githubweb
  252. fix impl — fnrizzi / githubweb
  253. Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp — noreply / githubweb
  254. Split Kokkos_Threads_Parallel files — Bruno Turcksin / githubweb
  255. HPX: Implement TeamThread and ThreadVector parallel_scan with return value — Daniel Arndt / githubweb
  256. Serial: Allow for distinct execution space instances (#6441) — noreply / githubweb
  257. add overload for TeamThreadRange — fnrizzi / githubweb
  258. address review comment — fnrizzi / githubweb
  259. Update to HIP TeamPolicy Block number heuristic (#6284) — noreply / githubweb
  260. SIMD: Split math functions from SIMD_Common.hpp (#6487) — noreply / githubweb
  261. Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON — Daniel Arndt / githubweb
  262. add threadvector — fnrizzi / githubweb
  263. fix order — fnrizzi / githubweb
  264. remove guards — fnrizzi / githubweb
  265. UnorderedMap(space instance): proposal for #6067 — romin.tomasetti / githubweb
  266. Rename Kokkos_ThreadsExec to align with the other backends — Bruno Turcksin / githubweb
  267. Promote Kokkos_Printf.hpp to public include — Daniel Arndt / githubweb
  268. simd: add floor, ceil, round, trunc operations (#6393) — noreply / githubweb
  269. Update CI in OpenMPTarget to use llvm-17 (#6472) — noreply / githubweb
  270. Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp — Bruno Turcksin / githubweb
  271. Kokkos_SIMD_Scalar.hpp: remove extra ';' — ndellin / githubweb
  272. add guards — fnrizzi / githubweb
  273. avoid auto — fnrizzi / githubweb
  274. [ci skip] Update Kokkos version to 4.2.99 — Daniel Arndt / githubweb
  275. [deprecated code 3] remove all default device init tests — Damien L-G / githubweb
  276. [deprecated code 3] remove InitArguments — Damien L-G / githubweb
  277. [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros — Damien L-G / githubweb
  278. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax — Damien L-G / githubweb
  279. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  280. [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants — Damien L-G / githubweb
  281. [deprecated code 3] remove {OpenMP,HPX}::partition_master — Damien L-G / githubweb
  282. OpenMP backend cleanup following removal of deprecated code 3 — Damien L-G / githubweb
  283. Get rid of FIXME_OPENMP — Damien L-G / githubweb
  284. [deprecated code 3] remove MasterLock — Damien L-G / githubweb
  285. [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option — Damien L-G / githubweb
  286. fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions — Damien L-G / githubweb
  287. Do not append " - blocks" to the bitset label — Damien L-G / githubweb
  288. with_updated_label -> append_to_label — Daniel Arndt / githubweb
  289. SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables — Daniel Arndt / githubweb
  290. Fixup in README (github -> GitHub) — Damien L-G / githubweb
  291. Check that device associated with stream matches requested device — Daniel Arndt / githubweb
  292. Threads: replace enum with constexpr int and enum class (#6514) — noreply / githubweb
  293. Added unit tests for reduction ops and few intel svml intrinsics — donlee / githubweb
  294. Remove sleep and wake functions — Bruno Turcksin / githubweb
  295. Remove extra constructor — Daniel Arndt / githubweb
  296. Prefer defaulted default constructor for Bitset (#6524) — noreply / githubweb
  297. SYCL: Use host-pinned memory to copy reduction/scan result (#6500) — noreply / githubweb
  298. Remove unused variables — Bruno Turcksin / githubweb
  299. Remove Sentinel struct from Threads — Bruno Turcksin / githubweb
  300. Small cleanup of ThreadsInternal::initialize — Bruno Turcksin / githubweb
  301. Remove useless forward declaration — Bruno Turcksin / githubweb
  302. Remove spawn function — Bruno Turcksin / githubweb
  303. Add comments — Bruno Turcksin / githubweb
  304. Fix indentation — Bruno Turcksin / githubweb
  305. Fix typo in macro guard — Bruno Turcksin / githubweb
  306. Reduce number of View constructor instantiations — Damien L-G / githubweb
  307. Bump HPX version used in CI to 1.9.0 — mikael.simberg / githubweb
  308. Split files in HIP backend — Bruno Turcksin / githubweb
  309. Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544) — noreply / githubweb
  310. Only define STDALGO_TEAM_SOURCES_* once — Daniel Arndt / githubweb
  311. Remove logical memory spaces — Damien L-G / githubweb
  312. Rollback changes to view constructors to reduce the number of instantiations (#6564) — noreply / githubweb
  313. Address reviewer comments — Daniel Arndt / githubweb
  314. Threads remove unused variables and functions (#6566) — noreply / githubweb
  315. Remove unused Sandia testing files (#6568) — noreply / githubweb
  316. fallback implementation cleanup — donlee / githubweb
  317. Remove empty quotation marks for static_assert — Daniel Arndt / githubweb
  318. [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header — Damien L-G / githubweb
  319. Drop Clang+CUDA workaround — Damien L-G / githubweb
  320. OpenMPTarget: CI compiler upgrade. (#6545) — noreply / githubweb
  321. m_cudaDev isn't static anymore — Daniel Arndt / githubweb
  322. Add crtrott's launch_latency benchmark (#6379) — noreply / githubweb
  323. Simplify fence functions in the Threads backend (#6571) — noreply / githubweb
  324. simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574) — noreply / githubweb
  325. Set the device id explicitly for CUDA API calls in impl_initialize — Daniel Arndt / githubweb
  326. OpenMPTarget: Guard scratch memory usage in ParallelReduce — Daniel Arndt / githubweb
  327. SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534) — noreply / githubweb
  328. Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header — Damien L-G / githubweb
  329. Replace Marsaglia polar method with Box-Muller to generate a normally distributed random number (#6556) — noreply / githubweb
  330. OpenMP: No memset in viewfill (#6573) — noreply / githubweb
  331. Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header" — noreply / githubweb
  332. OpenACC: add atomics support (#6446) — noreply / githubweb
  333. Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543) — noreply / githubweb
  334. kokkos(unique): fix allocation of temporary view to enforce using the provided space instance — romin.tomasetti / githubweb
  335. Use binary wrapper for consistency in definition of half types numeric traits (#6590) — noreply / githubweb
  336. try fix — fnrizzi / githubweb
  337. Fix TestNumericTraits.hpp for SYCL with bfloat16 support — Daniel Arndt / githubweb
  338. Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3 — crtrott / githubweb
  339. Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601) — noreply / githubweb
  340. Added missing operator* to NEON simd — crtrott / githubweb
  341. [ci skip] Update changelog on develop for 4.2.00 (#6592) — noreply / githubweb
  342. Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593) — noreply / githubweb
  343. try fix — fnrizzi / githubweb
  344. avoid pyt package — fnrizzi / githubweb
  345. try — fnrizzi / githubweb
  346. fix for macos — fnrizzi / githubweb
  347. remove comments — fnrizzi / githubweb
  348. use reference — crtrott / githubweb
  349. add branching — fnrizzi / githubweb
  350. [ci skip] fix formatting — cezary.skrzynski / githubweb
  351. GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing — Daniel Arndt / githubweb
  352. nvcc(wrapper): adding missing `--generate-line-info` arg — romin.tomasetti / githubweb
  353. Add clang-format check to GitHub workflows (#6612) — noreply / githubweb
  354. graph(HIP): adding inline keyword to fix #6623 — romin.tomasetti / githubweb
  355. Add jenkins multibranch pipeline options — Bruno Turcksin / githubweb
  356. Add warp sync for Cuda parallel reduce — tccleve / githubweb
  357. kokkos(profiling): do not finalize in any backend — romin.tomasetti / githubweb
  358. Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format — Bruno Turcksin / githubweb
  359. update comment to include final() mention — tccleve / githubweb
  360. Disabling OpenACC in the CI because it emits too many warnings — Bruno Turcksin / githubweb
  361. unorderedmap: modernize traits — romin.tomasetti / githubweb
  362. nvcc wrapper: remove troubling flag to fix 6628 (#6629) — noreply / githubweb
  363. tools(profiling): type (related to kokkos/kokkos-tools/pull/221) — romin.tomasetti / githubweb
  364. This PR fixes the too-much-OpenACC-warning issue, mentioned in PR #6639. — lees2 / githubweb
  365. add missing header fix #6644 — fnrizzi / githubweb
  366. SYCL: Restrict workaround for is_device_copyable to oneAPI versions before 2024.0.0 (#6532) — noreply / githubweb
  367. Fixup test math functions ulp should double -> int — Damien L-G / githubweb
  368. Drop DualView converting copy assignment operator — Damien L-G / githubweb
  369. Don't use rocm-docker for clang-format — Daniel Arndt / githubweb
  370. Disable HIP CI — Daniel Arndt / githubweb
  371. Remove deprecation warning for AllocationMechanism for gcc <11.0 — Daniel Arndt / githubweb
  372. OpenMPTarget: clang extensions for dynamic shared memory. (#6380) — noreply / githubweb
  373. Fix builtin_unreachable use for MSVC/CUDA — crtrott / githubweb
  374. Fix missing include on msvc/cuda — crtrott / githubweb
  375. Avoid lambdas in constexpr branch for msvc/cuda — crtrott / githubweb
  376. Sidestep lacking CTAD support msvc/cuda — crtrott / githubweb
  377. Fix formatting — crtrott / githubweb
  378. Move header for Damien because he is right — crtrott / githubweb
  379. Unit test for issue 3371 (negative vector length should not yield a negative max_team_size) (#6076) — noreply / githubweb
  380. Add CMakeLists.txt for stream benchmark — cwpears / githubweb
  381. Do not negate the dependent true traits helper — Damien L-G / githubweb
  382. Drop guards to accommodate external code defining KOKKOS_ASSERT — Damien L-G / githubweb
  383. Use omp_get_max_active_levels() when supported — Daniel Arndt / githubweb
  384. Add missing gfx940 — rberger / githubweb
  385. Add Impl::always_false type-dependent false trait — Damien L-G / githubweb
  386. Per review prefer always_false<Arg>::value to is_void_v<Arg> — Damien L-G / githubweb
  387. Improve "no copy mechanism" exception message — bmkelle / githubweb
  388. Add a unit test for new deep_copy exception msg — bmkelle / githubweb
  389. Add missing include sstream — bmkelle / githubweb
  390. src->source, dst->destination — bmkelle / githubweb
  391. Workaround for ROCm 6.0 failing to compile with AVX2 SIMD support — Bruno Turcksin / githubweb
  392. SYCL: Force inlining of Kokkos::printf (#6650) — noreply / githubweb
  393. Improve handling of printf in OMPT on Intel GPUs — Daniel Arndt / githubweb
  394. OpenMP: Use `omp_get_nested` for older gcc versions (#6685) — noreply / githubweb
  395. Disable more Bessel tests for SYCL on Intel GPUs — Daniel Arndt / githubweb
  396. fill_random without execution space instance should fence — Daniel Arndt / githubweb
  397. Drop unnecessary guarding for a tool library being loaded in ProfilingSection — Damien L-G / githubweb
  398. Drop unnecessary header include in Kokkos_Profiling_ProfileSection.hpp — Damien L-G / githubweb
  399. #5333: CUDA: Use scratch space appropriate to small reduction elements in Team reductions (#5334) — noreply / githubweb
  400. Cuda: Allocate using the correct device (#6392) — noreply / githubweb
  401. Let `Profiling::ProfilingSection(std::string)` constructor be explicit and nodiscard (#6690) — noreply / githubweb
  402. Cosmetic changes to ProfilingSection — Damien L-G / githubweb
  403. GitHub CI: Test with AddressSanitizer (#6676) — noreply / githubweb
  404. Kokkos::Array deduction guide (#6373) — noreply / githubweb
  405. Add CI for MSVC+Cuda (#6661) — noreply / githubweb
  406. SYCL: Address deprecations after oneAPI 2023.2.0 (#6577) — noreply / githubweb
  407. Fixup cast tolerance to double before printing — Damien L-G / githubweb
  408. Try linking against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE — Daniel Arndt / githubweb
  409. test_array_ctad: disable test for intel versions < 2021 — ndellin / githubweb
  410. Enable `{transform_}exclusive_scan` in place (#6667) — noreply / githubweb
  411. Add `ATOMICS_BYPASS` configuration option to disable atomics (#6692) — noreply / githubweb
  412. Check matching static extents in View constructor (#5190) — noreply / githubweb
  413. Remove Kokkos::[b]half_t volatile overloads (#6579) — noreply / githubweb
  414. add tests — fnrizzi / githubweb
  415. Provide `kokkos_swap` as part of Core and deprecate `Experimental::swap` in Algorithms (#6697) — noreply / githubweb
  416. Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` (#6687) — noreply / githubweb
  417. Fix TeamThreadMDRange parallel_reduce (#6511) — noreply / githubweb
  418. add tests for in-place `inclusive_scan` (#6682) — noreply / githubweb
  419. Drop pointless Kokkos::Impl::CudaExec forward declaration — Damien L-G / githubweb
  420. Don't use the compiler launcher script if the compile language is CUDA. (#6704) — noreply / githubweb
  421. Deprecate `{Cuda,HIP}::detect_device_count()` and `Cuda::[detect_]device_arch()` (#6710) — noreply / githubweb
  422. Get rid of CudaInternal::cuda_get_error_{name,string}_wrapper — Damien L-G / githubweb
  423. No need to jump through so many hoops to print the error message — Damien L-G / githubweb
  424. HIP: Forgot to delete matching brace closing the namespace — Damien L-G / githubweb
  425. Make initialize and finalize of the Cuda/HIP singleton less special (#6714) — noreply / githubweb
  426. Kokkos_HIP.cpp: include Kokkos_Core.hpp to resolve errors — ndellin / githubweb
  427. Add bound checks in RangePolicy and MDRangePolicy (#6617) — noreply / githubweb
  428. Temporary fix to reenable HIP CI — Bruno Turcksin / githubweb
  429. Let the smart pointer manage the CUDA/HIP stream (#6721) — noreply / githubweb
  430. Fix Docker env variables — Bruno Turcksin / githubweb
  431. Ensure view_allocation_error does not silently ignore that no exception was thrown — Damien L-G / githubweb
  432. Add RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc enumerator — Damien L-G / githubweb
  433. [OpenACC] throw if acc_malloc returned nullptr — Damien L-G / githubweb
  434. Fixup using declaration — Damien L-G / githubweb
  435. Disable openacc.view_allocation_error test — Damien L-G / githubweb
  436. Guard `[MD]RangePolicy` precondition check for deprecated code 4 (#6726) — noreply / githubweb
  437. Add C++26 standard to CMake Setup — dev / githubweb
  438. Add support for C++26 in generated makefiles — Damien L-G / githubweb
  439. Add KOKKOS_ENABLE_CXX26 to the configuration metadata — Damien L-G / githubweb
  440. Reenable HIP testing — Bruno Turcksin / githubweb
  441. Disabling failing HIP test in the CI — Bruno Turcksin / githubweb
  442. Use team_size_max to fix "Team size too large" error in reducer test (#6725) — noreply / githubweb
  443. Don't touch my records! (refactor Cuda/HIP/SYCL/Threads to not directly mess with `SharedAllocationRecord`) (#6732) — noreply / githubweb
  444. Untangle SharedAllocationRecord spaghetti code — Damien L-G / githubweb
  445. Fix TestThreadVectorMDRangeParallelReduce (#6734) — noreply / githubweb
  446. Cuda multi-GPU support: Allow execution space instance constructor to run (#6706) — noreply / githubweb
  447. Drop support for deprecated command-line arguments and environment variables (#6744) — noreply / githubweb
  448. Avoid unnecessary zero-memset of the scratch flags in SYCL (#6739) — noreply / githubweb