Changes
- Add Kokkos::num_threads() and Kokkos::device_id()
—
Damien L-G /
githubweb
- Check Kokkos::num_threads and device_id in tests
—
Damien L-G /
githubweb
- Use std::sort for sorting within a bin when possible
—
Andrey Prokopenko /
githubweb
- RangePolicyRequire was not using require
—
tccleve /
githubweb
- Fix reducer result check for Serial+HPX ParallelReduce
—
Daniel Arndt /
githubweb
- Define KOKKOS_ARCH_NVIDIA_GPU macro when targeting an NVIDIA GPU architecture
—
Damien L-G /
githubweb
- Use KOKKOS_ARCH_NVIDIA_GPU macro in SYCL, OpenACC, and OpenMPTarget backends where appropriate
—
Damien L-G /
githubweb
- Add ParallelScanRangePolicy test
—
tccleve /
githubweb
- Address review comments
—
Andrey Prokopenko /
githubweb
- Fix partition_master test
—
Daniel Arndt /
githubweb
- Use CombinedReducer in HostIterateTile
—
Daniel Arndt /
githubweb
- Desul atomics: fix bug max uint64_t value
—
Damien L-G /
githubweb
- Make CUDA bhalf conversion code more forward compatible
—
Damien L-G /
githubweb
- Let KOKKOS_ARCH_NVIDIA_GPU provide the Compute Capability
—
Damien L-G /
githubweb
- Bring back git info to benchmarks output
—
cezary.skrzynski /
githubweb
- [ci skip] Add a comment
—
Andrey Prokopenko /
githubweb
- Rename KOKKOS_{ -> IMPL_}ARCH_NVIDIA_GPU
—
Damien L-G /
githubweb
- Improve macro name KOKKOS_IMPL_{ARCH_NVIDIA_GPU_AMPERE_PLUS -> NVIDIA_GPU_ARCH_SUPPORT_BHALF}
—
Damien L-G /
githubweb
- Allow that C++20 is passed to nvcc
—
noreply /
githubweb
- Allow c++20 in nvcc_wrapper for nvcc 12 and above
—
noreply /
githubweb
- Print Kokkos version at configuration time (#5979)
—
noreply /
githubweb
- #5641: Fix HIP & CUDA MDRange reduce for sizeof(value_type) < sizeof(int) (#5745)
—
noreply /
githubweb
- Fix sycl.large_team_scratch_size
—
Daniel Arndt /
githubweb
- Add to HIP tests in Makefile
—
tccleve /
githubweb
- Bit manipulation: implement `byteswap` (#5967)
—
noreply /
githubweb
- UnorderedMap: Ensure size() working in case of copies
—
Daniel Arndt /
githubweb
- Use (non-mutable) std::shared_ptr instead
—
Daniel Arndt /
githubweb
- Add another test case
—
Daniel Arndt /
githubweb
- graph(hip): enable test
—
romin.tomasetti /
githubweb
- Remove workaround for submit_barrier not being enqueued properly
—
Daniel Arndt /
githubweb
- Support --compiler-options in nvcc_wrapper
—
noreply /
githubweb
- Cleanup OpenMPTarget ParallelReduce
—
Daniel Arndt /
githubweb
- Fix compiling OpenMPTarget for AMD GPUs
—
Daniel Arndt /
githubweb
- Desul atomics: Fix NVCC warning integer conversion resulted in a change of sign
—
Damien L-G /
githubweb
- Fixup NVIDIA GPU arch must be defined potentially for other backends as well
—
Damien L-G /
githubweb
- Fix warning with NVC++
—
Damien L-G /
githubweb
- Fix team_scratch_1_queues for SYCL+Cuda
—
Daniel Arndt /
githubweb
- m_team_scratch_pool -> m_team_scratch_event
—
Daniel Arndt /
githubweb
- Explain acquire_team_scratch_space
—
Daniel Arndt /
githubweb
- Move Cuda/Kokkos_Cuda_NvidiaGpuArchitectures.hpp -> impl/Kokkos_NvidiaGpuArchitectures.hpp
—
Damien L-G /
githubweb
- Drive-by fix typos "fix {to -> too} many"
—
noreply /
githubweb
- simd: Fixed an incorrectly returning size for uint64_t in avx2 (#6004)
—
noreply /
githubweb
- Pass local_accessor directly instead
—
Daniel Arndt /
githubweb
- Fix warning in some user code when using std::memcpy
—
Bruno Turcksin /
githubweb
- CMake: update package compatibility mode when building within Trilinos
—
Andrey Prokopenko /
githubweb
- SYCL: Make is_device_copyable future-proof (#6009)
—
noreply /
githubweb
- desul: Move lock_array_copied from global scope (#5999)
—
noreply /
githubweb
- Fix namespace for is_device_copyable
—
Daniel Arndt /
githubweb
- Fixup 4.0 change log (#6015)
—
noreply /
githubweb
- Add half_t and bhalf_t limits (#5778)
—
noreply /
githubweb
- Add UnorderedMapInsertOps for coo2crs (#5877)
—
noreply /
githubweb
- #6027: replace remaining instances of ALL_t with Kokkos::ALL_t (#6028)
—
noreply /
githubweb
- Add -Wdeprecated-copy warning and fix OMPT scan bug related to assignment operators (#6026)
—
noreply /
githubweb
- Relax scratch space limits for HIP reductions (#6029)
—
noreply /
githubweb
- Fix -Wmissing-field-initializers warning
—
Daniel Arndt /
githubweb
- Fix enum warnings
—
Daniel Arndt /
githubweb
- Fix OMPT size compare warnings
—
Daniel Arndt /
githubweb
- Add -Wextra
—
Daniel Arndt /
githubweb
- Cherry-pick TriBITS update from Trilinos
—
Daniel Arndt /
githubweb
- KOKKOS_COMPILER_PGI -> KOKKOS_COMPILER_NVHPC
—
Daniel Arndt /
githubweb
- Try removing _kokkos_pgi_compiler_bug_workaround
—
Daniel Arndt /
githubweb
- Remove unused constructors for ThreadVectorRangeBoundariesStruct
—
donlee /
githubweb
- Define at most one KOKKOS_COMPILER* macro
—
Daniel Arndt /
githubweb
- Define KOKKOS_COMPILER_INTEL_LLVM
—
Daniel Arndt /
githubweb
- Desul atomics: let relocatable device code mode be part of the configuration (#5991)
—
noreply /
githubweb
- CHANGELOG.md: add threads sort
—
ajpowel /
githubweb
- Remove Kokkos Keyword Listing section from BUILD.md and refer to the wiki instead
—
Damien L-G /
githubweb
- Remove (outdated) license information [ci skip]
—
Damien L-G /
githubweb
- Remove libnuma (#6048)
—
noreply /
githubweb
- Don't create a shared state for size() in UnorderedMap's deep_copy
—
Daniel Arndt /
githubweb
- OpenMPTarget: Changes to Makefile.kokkos (#6053)
—
noreply /
githubweb
- Partially reverse #5504
—
Daniel Arndt /
githubweb
- Reduce size of ScatterView test when using OpenMP
—
Bruno Turcksin /
githubweb
- Fix bug in Makefile.kokkos
—
stamoor /
githubweb
- Fix join for ValueWrapperForNoNeutralElement
—
Daniel Arndt /
githubweb
- SYCL: Use in-order queue for SYCL+Cuda
—
Daniel Arndt /
githubweb
- perf_test is still not working
—
Daniel Arndt /
githubweb
- Fix PerfTests by limiting GramSchmidt
—
Daniel Arndt /
githubweb
- Only pass one wrapper object in SYCL reductions
—
Daniel Arndt /
githubweb
- Explicitly cast to CombinedFunctorReducerType
—
Daniel Arndt /
githubweb
- Fix Kokkos_SIMD with AVX2 on 64-bit architectures (#6075)
—
noreply /
githubweb
- Update changelog (#6058)
—
noreply /
githubweb
- Fix up SYCL execution space instance creation for Intel GPUs
—
Daniel Arndt /
githubweb
- Converted a shared_ptr to a host view in UnorderedMap (#6073)
—
noreply /
githubweb
- Update OpenACC FunctorAdapter (#6077)
—
noreply /
githubweb
- Improve SYCL parallel_scan
—
Daniel Arndt /
githubweb
- Compiling with auto deduction of workgroup sizes
—
Daniel Arndt /
githubweb
- Disable failed bit manipulation tests when compiled by NVHPC (#6088)
—
noreply /
githubweb
- Add support for Darwin 32-bit and PPC (#5916)
—
noreply /
githubweb
- Unconditionally enable CUDA extended lambda support
—
pbmille /
githubweb
- Tentative arguments switch for nvcc 12+
—
pbmille /
githubweb
- Change Makefile.kokkos too
—
pbmille /
githubweb
- Implement CMake messages per team decision
—
pbmille /
githubweb
- Fix definitions and docs to remove CUDA Lambda option
—
pbmille /
githubweb
- Don't fail to define broader 'lambdas are available' macro
—
pbmille /
githubweb
- Always expect KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA to be set
—
pbmille /
githubweb
- Remove various test exclusions based on KOKKOS_ENABLE_CUDA_LAMBDA
—
pbmille /
githubweb
- Update changelog
—
ndellin /
githubweb
- [ci skip] Fixup changelog
—
ndellin /
githubweb
- Work around nvcc issue for view_mapping and add FIXME_NVCC comment
—
pbmille /
githubweb
- OpenMPTarget: Update hierarchical parallelism. (#6043)
—
noreply /
githubweb
- Enable OpenMP in CUDA-11.0-NVCC-RDC to test DEPRECATED_CODE_3=ON (#5978)
—
noreply /
githubweb
- fix ternary op in subset of std algorithms not working with nvhpc (#6095)
—
noreply /
githubweb
- Add implementation of bit_cast in <Kokkos_BitManipulation.hpp>
—
Damien L-G /
githubweb
- Add compile time tests for the constraints on the bit_cast function template
—
Damien L-G /
githubweb
- Add the Experimental:: builtin variant (just defer to regular bit_cast)
—
Damien L-G /
githubweb
- Add runtime tests for bit_cast
—
Damien L-G /
githubweb
- Use Kokkos::bit_cast in SIMD instead of rolling its own
—
Damien L-G /
githubweb
- Clang-format glitch
—
Damien L-G /
githubweb
- view(uvm): fence if needed in allocation (#6005)
—
romin.tomasetti /
githubweb
- Disable tests that fail at runtime with NVHPC (likely not liking the class declaration within the body of the functor)
—
Damien L-G /
githubweb
- change impl of `is_sorted_until` to use reduce (#6097)
—
noreply /
githubweb
- Fix typo and remove accidentally committed assertions
—
noreply /
githubweb
- Added multiple reducers support for team-level parallel reduce (#5727)
—
noreply /
githubweb
- Work around NVHPC issue with enum types
—
crtrott /
githubweb
- Work around NVHPC 23.x issues
—
crtrott /
githubweb
- Kokkos: Remove TriBITS Kokkos subpackages (trilinos/Trilinos#11545) (#6104)
—
noreply /
githubweb
- Drop pointless Kokkos{Algorithms,Containers}_config.h files
—
Damien L-G /
githubweb
- Revert "Merge pull request #5964 from PhilMiller/cuda-lambda-default"
—
Damien L-G /
githubweb
- Update the OpenACC parallel_reduce() constructs with Range/MDRange/Team (#6072)
—
noreply /
githubweb
- Always pass -extended-lambda option to NVCC and force Kokkos_ENABLE_CUDA_LAMBDA ON
—
Damien L-G /
githubweb
- Reorganize ZeroMemset (#6087)
—
noreply /
githubweb
- Drop CUDA_LAMBDA guards in Cuda headers
—
Damien L-G /
githubweb
- Work around NVHPC 23.x not dealing with __isGlobal
—
crtrott /
githubweb
- Drop unused cmake macros
—
Damien L-G /
githubweb
- Fixup cmake style
—
Damien L-G /
githubweb
- use ASSERT_EQ in all std algorithms tests
—
fnrizzi /
githubweb
- Reintroduce test skip for nvhpc < 23.3
—
crtrott /
githubweb
- hpcbind: check for correct Slurm variable
—
rberger /
githubweb
- Fix macro guards in test for NVC++ as the CUDA compiler
—
Damien L-G /
githubweb
- Allow templated functors in parallel_for, parallel_reduce and parallel_scan (#5976)
—
noreply /
githubweb
- Import sycl::bit_cast into the Kokkos namespace
—
Daniel Arndt /
githubweb
- Qualify possibly ambiguous calls to bit_cast
—
Daniel Arndt /
githubweb
- Fix nightlies -- workaround compiler bug in GCC 9.1 and 9.2 (#6118)
—
noreply /
githubweb
- Kokkos_BitManipulation: KOKKOS_COMPILER_GCC->KOKKOS_COMPILER_GNU (#6119)
—
noreply /
githubweb
- Cuda: Remove unused attach_texture_object
—
Daniel Arndt /
githubweb
- Move half traits to private header and add half/bhalf infinity trait (#6055)
—
noreply /
githubweb
- Increase minimum required HPX version to 1.8.0
—
mikael.simberg /
githubweb
- Conditionally use hpx::post instead of hpx::apply based on HPX version
—
mikael.simberg /
githubweb
- Don't restrict index type in builtin reducers
—
Daniel Arndt /
githubweb
- dual view: update template types (#6085)
—
romin.tomasetti /
githubweb
- sorting an empty view should exit early and not fail (#6130)
—
noreply /
githubweb
- core/src: Move floating_point_wrapper to private header
—
eharvey /
githubweb
- Disable tests failing with NVHPC
—
Daniel Arndt /
githubweb
- Fix bit_cast for SYCL again
—
Daniel Arndt /
githubweb
- Disable tests for OpenMPTarget
—
Daniel Arndt /
githubweb
- Improve indentation of comments
—
Daniel Arndt /
githubweb
- Allow deprecated declarations in SYCL+Cuda CI
—
Daniel Arndt /
githubweb
- Try running for other execution spaces
—
Daniel Arndt /
githubweb
- Add guards for Cuda
—
Daniel Arndt /
githubweb
- Expand list of kokkos options not to export with cmake
—
Damien L-G /
githubweb
- Do not append to Kokkos_OPTIONS variables those in the do not export list
—
Damien L-G /
githubweb
- Drop Kokkos_ENABLE_LAUNCH_COMPILER option
—
Damien L-G /
githubweb
- Export Kokkos_ENABLE_<OPTION> that are relevant
—
Damien L-G /
githubweb
- Drop Kokkos_ENABLE_PROFILING_LOAD_PRINT option
—
Damien L-G /
githubweb
- Suppress bogus warning about CUDA_LAMBDA being ON
—
Damien L-G /
githubweb
- [ci skip] Add nightly ci for spack (#6135)
—
noreply /
githubweb
- OpenMPTarget: Enable Cray compiler for the OpenMPTarget backend. (#5889)
—
noreply /
githubweb
- Revert to `DualView<class,class=void,class=void,class=void>` when deprecated code 4 is enabled
—
Damien L-G /
githubweb
- Fix Kokkos_ENABLE_CUDA_LAMBDA for Trilinos
—
Daniel Arndt /
githubweb
- Fix bogus warnings in nested CUDA parallel_reduce
—
Daniel Arndt /
githubweb
- `BinSort`, `BinOp1D`, `BinOp3D`: mark default constructor as deleted (#6131)
—
noreply /
githubweb
- KokkosTools: Don't call callbacks before backends are initialized (#6114)
—
noreply /
githubweb
- Add `Kokkos::all_libs` alias target for compatibility with TriBITS/Trilinos (#6157)
—
noreply /
githubweb
- sorting: add to binsort support for strided views and reorg tests (#6081)
—
noreply /
githubweb
- Allow linking against build tree (#6078)
—
noreply /
githubweb
- Implement `HPX::in_parallel` (#6143)
—
noreply /
githubweb
- OpenMPTarget: Changes for OpenMPTarget backend with nvhpc compiler.
—
rgayatri /
githubweb
- OpenMPTarget: Add a fixme.
—
rgayatri /
githubweb
- Update Makefile.kokkos
—
noreply /
githubweb
- Remove extended_namespace template parameter for SYCLMemoryOrder/Scope
—
Daniel Arndt /
githubweb
- OpenMPTarget: update fixme comment.
—
rgayatri /
githubweb
- OpenMPTarget: Replace kokkos macros in desul.
—
rgayatri /
githubweb
- OpenMPTarget: Restore desul changes.
—
rgayatri /
githubweb
- Cherry-pick v3.7.02 changelog into develop [ci skip]
—
Damien L-G /
githubweb
- Kokkos: Mark HWLOC as a TriBITS TPL as well (trilinos/Trilinos#11938)
—
rabartl /
githubweb
- Clean up FunctorAnalysis
—
Daniel Arndt /
githubweb
- SIMD: make binary op tests to test against all data types (#5913)
—
noreply /
githubweb
- Also create symlinks for CMake configuration files to cmake_packages/Kokkos for TriBITS (#6163)
—
noreply /
githubweb
- Allow passing a temporary std::vector to partition_space (#6167)
—
noreply /
githubweb
- .github/workflows: Remove push trigger
—
eharvey /
githubweb
- .github/workflows: Only trigger upon push to develop
—
eharvey /
githubweb
- Replace _mm512_loadu_epi64 and _mm512_storeu_epi64 with _mm512_loadu_si512 and _mm512_storeu_si512
—
donlee /
githubweb
- OpenMPTarget: include desul changes.
—
rgayatri /
githubweb
- Weed out verbose output from `dynamic_view` container unit test (#6173)
—
noreply /
githubweb
- shortcut value for is_dynamic_view
—
fnrizzi /
githubweb
- add trait and test
—
fnrizzi /
githubweb
- Fix global fence in Kokkos::resize(DynRankView) (#6184)
—
noreply /
githubweb
- Left align demangled stacktrace output. (#6191)
—
noreply /
githubweb
- [HIP] Improve heuristic deciding the number of blocks used in parallel_reduce (#6160)
—
noreply /
githubweb
- Improve OpenMP affinity warning to include MPI concerns (#6185)
—
noreply /
githubweb
- Update version number on develop after branching off for 4.1.00
—
Daniel Arndt /
githubweb
- Fix test_quad_precision_math_constants test
—
Daniel Arndt /
githubweb
- implementation and tests
—
fnrizzi /
githubweb
- implementation and tests
—
fnrizzi /
githubweb
- [ci skip] test_all_sandia: update compilers and queues
—
ndellin /
githubweb
- team-level std algos: common code needed (#6199)
—
noreply /
githubweb
- Fix compiling SYCL with KOKKOS_IMPL_DO_NOT_USE_PRINTF_USAGE
—
Daniel Arndt /
githubweb
- snapshot mdspan namespace changes (#6162)
—
noreply /
githubweb
- Disable AVX512 support for NVHPC
—
Daniel Arndt /
githubweb
- Fix host-annotations of AVX2, AVX512, and NEON constructors
—
Daniel Arndt /
githubweb
- Introduce impl_get_value/impl_get_mask
—
Daniel Arndt /
githubweb
- Fix a gcc-8.4.0 warning
—
Daniel Arndt /
githubweb
- Fix host-device annotation for where_expression/const_where_expression
—
Daniel Arndt /
githubweb
- Make in-order queues the default via macro
—
Daniel Arndt /
githubweb
- Avoid SFINAE in favor of overloads
—
Daniel Arndt /
githubweb
- Move scalar overloads to Scalar header
—
Daniel Arndt /
githubweb
- Disable KOKKOS_ARCH_AVX512XEON for NVHPC
—
Daniel Arndt /
githubweb
- Changelog for 4.1.00 (#6225)
—
noreply /
githubweb
- reorganize sort headers (#6230)
—
noreply /
githubweb
- SYCL: Support for bhalf_t (#6204)
—
noreply /
githubweb
- only compute with relevant entries
—
tccleve /
githubweb
- make constraints on `Kokkos::sort` more visible/clear (#6234)
—
noreply /
githubweb
- slim API and move code to impl
—
fnrizzi /
githubweb
- This PR contains minor code changes and bug fixes needed for LLVM-Clacc
—
lees2 /
githubweb
- improve all other corner cases as per review comment
—
fnrizzi /
githubweb
- refine for cuda uvm
—
fnrizzi /
githubweb
- use exespace to check rather than mem space
—
fnrizzi /
githubweb
- Fix AVX2 simd support for ZEN2 AMD CPU. (#6238)
—
noreply /
githubweb
- fix corner case
—
fnrizzi /
githubweb
- Fix windows symlink configure issue (#6241)
—
noreply /
githubweb
- fix corner cases
—
fnrizzi /
githubweb
- bug_report.md: new PR branching from `develop` (#5034)
—
noreply /
githubweb
- Fix whitespace in bug_report.md (#6244)
—
noreply /
githubweb
- Avoid undefined behavior in TestTaskScheduler.hpp
—
Daniel Arndt /
githubweb
- Remove calling tribits_exclude_autotools_files()
—
rabartl /
githubweb
- Ensure that complex is only instantiated for cv-unqualified floating-point type
—
Damien L-G /
githubweb
- Deprecated Kokkos::vector
—
Damien L-G /
githubweb
- Warn if <Kokkos_Vector.hpp> is included
—
Damien L-G /
githubweb
- Drop Vector test with makefiles and conditionally remove it with CMake
—
Damien L-G /
githubweb
- Ignore <Kokkos_Vector.hpp> in the header self-containment tests
—
Damien L-G /
githubweb
- SYCL: Use in-order queues in InterOp tests (#6246)
—
noreply /
githubweb
- std_algos: fix wrong corner case for `is_partitioned` (#6257)
—
noreply /
githubweb
- Make sure macros are defined
—
Daniel Arndt /
githubweb
- Error out when Kokkos_Vector.hpp is included with deprecated code disabled
—
Daniel Arndt /
githubweb
- SIMD: Add abs() for all int types (#6069)
—
noreply /
githubweb
- Fix SIMD abs unit test accidentally using complex overload
—
Damien L-G /
githubweb
- Fix SIMD tests on NEON
—
Daniel Arndt /
githubweb
- Add default ParallelFor copy constructor for HIP
—
Bruno Turcksin /
githubweb
- Workaround gcc/8.2.0 compiler issue with _mm512_abs_pd
—
ndellin /
githubweb
- Implement Kokkos::printf (#6083)
—
noreply /
githubweb
- Improve SYCL TeamPolicy reduction
—
Daniel Arndt /
githubweb
- make Kokkos_CXX_COMPILER_VERSION available to CMake consumers
—
cwpears /
githubweb
- Fully qualify Experimental::SYCL in algorithms to avoid finding conflicting namespaces
—
Daniel Arndt /
githubweb
- subset of team level impl of std algorithms
—
fnrizzi /
githubweb
- fix copyright
—
fnrizzi /
githubweb
- guard for openmptarget
—
fnrizzi /
githubweb
- fix for openmptarget
—
fnrizzi /
githubweb
- address comments
—
fnrizzi /
githubweb
- Update CMakeLists for unit tests with OpenMPTarget, OpenACC with NVHPC (#6260)
—
noreply /
githubweb
- Update CI from CUDA 11.7.0 to 11.7.1
—
Bruno Turcksin /
githubweb
- Improve SYCL reduction performance: RangePolicy (#6264)
—
noreply /
githubweb
- Improve SYCL reduction performance: workgroup_reduction (#6270)
—
noreply /
githubweb
- SYCL TeamPolicy: Fix sign comparison warning
—
Daniel Arndt /
githubweb
- SIMD: suppress an uninitialized variable warning (#6294)
—
noreply /
githubweb
- OpenACC CMake change Clacc (#6250)
—
noreply /
githubweb
- `Kokkos::sort` support custom comparator (#6253)
—
noreply /
githubweb
- Add nightly build using latest gcc and c++23
—
Bruno Turcksin /
githubweb
- Explicitly capture this in lambda function
—
Bruno Turcksin /
githubweb
- Fix typo in nightly jenkins configuration
—
Bruno Turcksin /
githubweb
- Fix a memory bug in the `free_state` function of random pools. (#6290)
—
noreply /
githubweb
- std_algos: for_each: try condense the impl
—
fnrizzi /
githubweb
- format
—
fnrizzi /
githubweb
- Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants
—
Daniel Arndt /
githubweb
- Use KOKKOS_IF_ON_HOST
—
Daniel Arndt /
githubweb
- Suppress warnings
—
Daniel Arndt /
githubweb
- Don't suppress warnings for NVHPC
—
Daniel Arndt /
githubweb
- Improve macro definitions
—
Daniel Arndt /
githubweb
- Enable Serial backend in HPX build
—
cezary.skrzynski /
githubweb
- Modify fences in View API test
—
cezary.skrzynski /
githubweb
- Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159)
—
noreply /
githubweb
- Improve SYCL reduction performance: MDRangePolicy (#6271)
—
noreply /
githubweb
- bring back previous code as discussed in meeting
—
fnrizzi /
githubweb
- create cudaAPI function wrappers
—
tccleve /
githubweb
- Reorganize #include <cuda_runtime_api.h>
—
tccleve /
githubweb
- Some API functions require CUDA 11.2+
—
tccleve /
githubweb
- Cuda10 requires "stream=nullptr" as default arg
—
tccleve /
githubweb
- Rework stream inputs
—
tccleve /
githubweb
- Use "if constexpr" for setCudaDevice
—
tccleve /
githubweb
- Remove static in comment
—
tccleve /
githubweb
- Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314)
—
noreply /
githubweb
- add helper variable templates `are_*_iterators_v` (#6312)
—
noreply /
githubweb
- [HIP] Optimize parallel_reduce (#6229)
—
noreply /
githubweb
- Split Kokkos_SYCL_Parallel_Range.hpp
—
Daniel Arndt /
githubweb
- Split Kokkos_SYCL_Parallel_Reduce.hpp
—
Daniel Arndt /
githubweb
- Move Kokkos_SYCL_Scan.hpp
—
Daniel Arndt /
githubweb
- Split Kokkos_Parallel_Team.hpp
—
Daniel Arndt /
githubweb
- Add comment that we don't need to link against clang_rt with ROCm 5.5
—
Bruno Turcksin /
githubweb
- Error out when CXX standard is not set when using amdclang or cray clang
—
Bruno Turcksin /
githubweb
- Disable failing tests for ROCm 5.5 and 5.6
—
Bruno Turcksin /
githubweb
- #5635: Add parallel_scan overload with return value for ThreadVectorRange
—
arek.szczepkowicz /
githubweb
- #5635: Add test for parallel_scan with return value for ThreadVectorRange
—
arek.szczepkowicz /
githubweb
- #5635: Serial/OpenMP: Parallel_scan with return value for TeamThreadRange (#6090)
—
noreply /
githubweb
- Fix reviewer's comments
—
Bruno Turcksin /
githubweb
- Allow using the SYCL execution space on AMD GPUs (#6321)
—
noreply /
githubweb
- Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318)
—
noreply /
githubweb
- SIMD: add shift ops for all int types (#6109)
—
noreply /
githubweb
- SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on (#6300)
—
noreply /
githubweb
- remove spurious undefs
—
fnrizzi /
githubweb
- Decrease maximum memory available to ccache
—
Bruno Turcksin /
githubweb
- Fix gtest when using C++20
—
Bruno Turcksin /
githubweb
- Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces
—
Daniel Arndt /
githubweb
- address reviews [skip ci]
—
antoine.meyer54 /
githubweb
- formatting
—
fnrizzi /
githubweb
- SIMD: split simd unit tests into separate files (#6278)
—
noreply /
githubweb
- Run NVHPC only on V100
—
Bruno Turcksin /
githubweb
- Use checked arithmetic builtins for overflow detection (#6313)
—
noreply /
githubweb
- Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273)
—
noreply /
githubweb
- allow sorting via native oneDPL to support views with stride = 1 (#6322)
—
noreply /
githubweb
- Adopt new HIP cmake's way of finding clang-rt
—
nicurtis /
githubweb
- Fix #6334: intel compiler does not like returning inside if constexpr (#6335)
—
noreply /
githubweb
- Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337)
—
noreply /
githubweb
- fix lambda capture
—
fnrizzi /
githubweb
- remove unnecessary file, fix constraints
—
fnrizzi /
githubweb
- remove file
—
fnrizzi /
githubweb
- fix
—
fnrizzi /
githubweb
- fix lambda capture and constraints
—
fnrizzi /
githubweb
- keep only subset
—
fnrizzi /
githubweb
- revert files
—
fnrizzi /
githubweb
- remove file
—
fnrizzi /
githubweb
- fix syntax
—
fnrizzi /
githubweb
- format
—
fnrizzi /
githubweb
- Disable default oneDPL support in Trilinos
—
Daniel Arndt /
githubweb
- Rename AMD GPU architectures (#6266)
—
noreply /
githubweb
- Fix compiling SIMD library with NEON and gcc-13
—
Daniel Arndt /
githubweb
- SIMD: add generator constructors (#6347)
—
noreply /
githubweb
- Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability
—
mikael.simberg /
githubweb
- Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346)
—
noreply /
githubweb
- Use std::aligned_alloc for allocations (#6341)
—
noreply /
githubweb
- Initial implementation of gfx942 (#6358)
—
noreply /
githubweb
- Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled
—
Daniel Arndt /
githubweb
- Extend 'hip_driver_check_page_migration' (#6364)
—
noreply /
githubweb
- Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation
—
mikael.simberg /
githubweb
- Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers
—
mikael.simberg /
githubweb
- Add support for HIP Graph
—
Bruno Turcksin /
githubweb
- Replace one of the ROCM 5.2 configurations with ROCM 5.6
—
Bruno Turcksin /
githubweb
- Use aligned version of operator new instead of aligned_alloc
—
Daniel Arndt /
githubweb
- HPX: Don't interfere with exception handling
—
Daniel Arndt /
githubweb
- Fix -Wformat-truncation warnings in CI (#6354)
—
noreply /
githubweb
- Do not use HIP Graph with ROCm 5.2
—
Bruno Turcksin /
githubweb
- Use constexpr West in src
—
Bruno Turcksin /
githubweb
- Use constexpr West in test
—
Bruno Turcksin /
githubweb
- SIMD: convert binary operators to hidden friends (#6320)
—
noreply /
githubweb
- simd: make mask and condition unit test to check with all data types (#6360)
—
noreply /
githubweb
- team-level std algos: part 2 (#6205)
—
noreply /
githubweb
- Add deprecated attribute to HostSpace(AllocationMechanism) definition
—
Damien L-G /
githubweb
- Added gather_from and scatter_to for AVX2 and AVX512 simd
—
donlee /
githubweb
- Rebased and applied feedback
—
donlee /
githubweb
- Changed gen ctor usage to pass in KOKKOS_LAMBDA
—
donlee /
githubweb
- Suppress NVCC attribute ignored warning on SIMD operators as hidden friends
—
Damien L-G /
githubweb
- Remove deprecated code 3 support for volatile join
—
crtrott /
githubweb
- Disable a test not working with nvhpc-23.1
—
crtrott /
githubweb
- Reenabling tests for nvhpc 23.7
—
crtrott /
githubweb
- Update containers and algorithms for NVC++ 23.7
—
crtrott /
githubweb
- More NVC++ 23.7 updates
—
crtrott /
githubweb
- NVC++ clang-format fixes
—
crtrott /
githubweb
- Update nvhpc to version 23.7 in the CI
—
crtrott /
githubweb
- NVHPC 23.7 update: address reviewer comments
—
crtrott /
githubweb
- OpenACC: Guard tests relying on abort
—
crtrott /
githubweb
- Fix TestAtomic to use the test execspace
—
crtrott /
githubweb
- Use NVHPC 23.7 for testing of OpenACC
—
crtrott /
githubweb
- Update nvhpc gtest skip message
—
crtrott /
githubweb
- Work around OpenMPTarget failure
—
crtrott /
githubweb
- Update base docker file for nvhpc
—
crtrott /
githubweb
- Remove stray Cuda graph pattern specialization from tag
—
Damien L-G /
githubweb
- Fix reviewer's comments
—
Bruno Turcksin /
githubweb
- Fix uninitialized variable warning with gcc 13
—
Bruno Turcksin /
githubweb
- HIP: Update print_configuration (#6387)
—
noreply /
githubweb
- Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248)
—
noreply /
githubweb
- Don't use local headers or runtime in HPX backend due to deprecation
—
mikael.simberg /
githubweb
- Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377)
—
noreply /
githubweb
- team-level std algos: part 3 (#6207)
—
noreply /
githubweb
- Adding is_scoped_enum & to_underlying (#6356)
—
noreply /
githubweb
- SIMD: add float simd support (#6177)
—
noreply /
githubweb
- team-level std algos: part 4 (#6208)
—
noreply /
githubweb
- Added a gen ctor for float (#6397)
—
noreply /
githubweb
- team-level std algos: part 5 (#6209)
—
noreply /
githubweb
- Deprecate Cuda(cudaStream_t, bool)
—
Damien L-G /
githubweb
- Fixup checked integer operations death test
—
Damien L-G /
githubweb
- Deprecate HIP(hipStream_t, bool)
—
Damien L-G /
githubweb
- Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default
—
Damien L-G /
githubweb
- Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined
—
Damien L-G /
githubweb
- Introduce disable_malloc_async Cuda option with generated makefiles
—
Damien L-G /
githubweb
- Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC
—
Damien L-G /
githubweb
- Use archive extraction time for timestamps
—
cezary.skrzynski /
githubweb
- Disable performance benchmarks in AppVeyor CI
—
cezary.skrzynski /
githubweb
- team-level std algos: part 6 (#6210)
—
noreply /
githubweb
- address comments
—
fnrizzi /
githubweb
- OpenMP backend refactor files. (#6403)
—
noreply /
githubweb
- Drop (unused) `Cuda::cuda_internal_maximum_shared_words`
—
Damien L-G /
githubweb
- Drop check that the host backend is initialized before the Cuda/HIP/SYCL one
—
Damien L-G /
githubweb
- Drop unused HIPInternal::m_maxSharedWords data member
—
Damien L-G /
githubweb
- Drop unused HIPInternal::m_hipArch static data member
—
Damien L-G /
githubweb
- !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize
—
Damien L-G /
githubweb
- Drop pre-Kepler logic in Cuda::impl_initialize
—
Damien L-G /
githubweb
- use single
—
fnrizzi /
githubweb
- address comments
—
fnrizzi /
githubweb
- formatting
—
fnrizzi /
githubweb
- Team-level std algos: part 7 (#6211)
—
noreply /
githubweb
- formatting
—
fnrizzi /
githubweb
- Enable death tests for fedora rawhide
—
cezary.skrzynski /
githubweb
- core/src: Add half math functions to private header (#6124)
—
noreply /
githubweb
- Drop check whether device supports unified addressing
—
Damien L-G /
githubweb
- fix single as per Christian's suggestion
—
fnrizzi /
githubweb
- Only warn once (at initialization) when forcing allocation in unified memory
—
Damien L-G /
githubweb
- check-copyright improvements (#6399)
—
noreply /
githubweb
- Use execution space instance argument to get device properties in block size deduction
—
Damien L-G /
githubweb
- Address reviewer's comments
—
Bruno Turcksin /
githubweb
- Fix to avoid #186-D pointless comparison warning.
—
maarten.arnst /
githubweb
- add comment
—
fnrizzi /
githubweb
- improve tests to address review
—
fnrizzi /
githubweb
- Fix guard for isnan test for bhalf_t
—
Daniel Arndt /
githubweb
- avoid potential race condition HIP
—
tccleve /
githubweb
- [SYCL][Reduction] Group counter should use at least memory_order::acq_rel
—
andrei.elovikov /
githubweb
- Same for scan
—
andrei.elovikov /
githubweb
- Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy
—
Daniel Arndt /
githubweb
- improve tests with intra-team result check
—
fnrizzi /
githubweb
- Fixes for Kokkos::Array (#6372)
—
noreply /
githubweb
- try fix for unique, previous impl to remove later
—
fnrizzi /
githubweb
- #5635: Add parallel_scan changes for CUDA and TeamThreadRange
—
cezary.skrzynski /
githubweb
- remove old impl
—
fnrizzi /
githubweb
- #5635: Enable TeamThreadRange test for CUDA
—
cezary.skrzynski /
githubweb
- Clean up benchmarks/gups
—
cwpears /
githubweb
- benchmark/gups: use CMake
—
cwpears /
githubweb
- OpenMPTarget: Disable check for SIMD compiler macros
—
Daniel Arndt /
githubweb
- #5635: Add parallel_scan with value for CUDA and ThreadVectorRange
—
cezary.skrzynski /
githubweb
- add missing assert
—
fnrizzi /
githubweb
- #5635: Add test for parallel_scan with return value for ThreadVectorRange
—
cezary.skrzynski /
githubweb
- add intra team check for missing test
—
fnrizzi /
githubweb
- fix intel compile error
—
fnrizzi /
githubweb
- fix unreachable for intel
—
fnrizzi /
githubweb
- re-enable unit tests for sort and random via makefile (#6422)
—
noreply /
githubweb
- OpenMPTarget init-join fix (#6444)
—
noreply /
githubweb
- Fix Cuda parallel_scan ThreadVectorRange range
—
Daniel Arndt /
githubweb
- Assign final sum in Cuda parallel_scan ThreadVectorRange
—
Daniel Arndt /
githubweb
- Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443)
—
noreply /
githubweb
- std_algos: improving min, max, minmax (#6421)
—
noreply /
githubweb
- team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426)
—
noreply /
githubweb
- Skip bessel function tests known to fail on Intel GPUs (#6434)
—
noreply /
githubweb
- team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436)
—
noreply /
githubweb
- Fix race condition in functor_vec_scan_ret_val test
—
Daniel Arndt /
githubweb
- Fix parallel_scan_with_reducers test
—
Daniel Arndt /
githubweb
- team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425)
—
noreply /
githubweb
- improve tests (#6432)
—
noreply /
githubweb
- improve tests (#6437)
—
noreply /
githubweb
- Move final assignment to correct scope
—
cezary.skrzynski /
githubweb
- fix casting warning in Random test
—
fnrizzi /
githubweb
- Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449)
—
noreply /
githubweb
- HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7
—
Daniel Arndt /
githubweb
- fixes build error for TeamReduce and TeamTransformReduce tests for specific GCC (#6459)
—
noreply /
githubweb
- improve tests to check intra-team result (#6431)
—
noreply /
githubweb
- SIMD: Math functions should be in namespace Kokkos
—
Daniel Arndt /
githubweb
- SYCL: Disable another bessel function test for Intel GPUs
—
Daniel Arndt /
githubweb
- team-level std algos: part 10 (#6256)
—
noreply /
githubweb
- team-level std algos: part 11 (#6258)
—
noreply /
githubweb
- #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302)
—
noreply /
githubweb
- #5635: Add parallel_scan overloads with value for Threads
—
cezary.skrzynski /
githubweb
- #5635: Move some tests for parallel_scan to TestTeamScan
—
cezary.skrzynski /
githubweb
- use shortcut
—
cezary.skrzynski /
githubweb
- Skip testing for non-power-of-two team sizes
—
cezary.skrzynski /
githubweb
- Fix formatting
—
cezary.skrzynski /
githubweb
- Add parallel_scan overloads with value for HIP backend
—
cezary.skrzynski /
githubweb
- Use std::is_same_v
—
cezary.skrzynski /
githubweb
- #5635: Move some tests for parallel_scan to TestTeamScan
—
cezary.skrzynski /
githubweb
- #5635: SYCL: Add parallel_scan overload with return value
—
cezary.skrzynski /
githubweb
- Use std::is_same_v
—
cezary.skrzynski /
githubweb
- OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1
—
Daniel Arndt /
githubweb
- Add compatible copy assignment operator to DualView
—
maarten.arnst /
githubweb
- Fix minimum version for Google benchmark
—
Daniel Arndt /
githubweb
- Add test of copy constructor/assignment operator for DualView.
—
maarten.arnst /
githubweb
- Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU
—
maarten.arnst /
githubweb
- Modify test so that source and destination view are of different type
—
maarten.arnst /
githubweb
- Use call operator instead of run_me function
—
maarten.arnst /
githubweb
- team-level std algos: part 12 (#6350)
—
noreply /
githubweb
- core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407)
—
noreply /
githubweb
- Moving abort and assert into their own public headers (#6445)
—
noreply /
githubweb
- Add test for parallel_scan with return value for ThreadVectorRange
—
cezary.skrzynski /
githubweb
- Add parallel_scan overloads with value for Threads
—
cezary.skrzynski /
githubweb
- Allow detecting SIMD types based on compiler macros (#6188)
—
noreply /
githubweb
- Add KOKKOS_ARCH_ARM_NEON
—
Daniel Arndt /
githubweb
- Fix implementation for cyl_bessel_i0
—
Daniel Arndt /
githubweb
- cleaning: remove iostream from headers where possible (IWYU)
—
romin.tomasetti /
githubweb
- Fix compiling SIMD unit tests on NVIDIA
—
Daniel Arndt /
githubweb
- guards to ensure DBL_EPSILON return for POWER8,9
—
ajpowel /
githubweb
- Fix NVCC warnings (#6483)
—
noreply /
githubweb
- team-level std algos: part 13 (#6351)
—
noreply /
githubweb
- Also fix annotations for generator constructor for AVX512 and NEON
—
Daniel Arndt /
githubweb
- #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange
—
cezary.skrzynski /
githubweb
- Fix atomic operations bug for Min and Max (#6435)
—
noreply /
githubweb
- Fix example/build_cmake_installed_different_compiler
—
Daniel Arndt /
githubweb
- fix impl
—
fnrizzi /
githubweb
- Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp
—
noreply /
githubweb
- Split Kokkos_Threads_Parallel files
—
Bruno Turcksin /
githubweb
- HPX: Implement TeamThread and ThreadVector parallel_scan with return value
—
Daniel Arndt /
githubweb
- Serial: Allow for distinct execution space instances (#6441)
—
noreply /
githubweb
- add overload for TeamThreadRange
—
fnrizzi /
githubweb
- address review comment
—
fnrizzi /
githubweb
- Update to HIP TeamPolicy Block number heuristic (#6284)
—
noreply /
githubweb
- SIMD: Split math functions from SIMD_Common.hpp (#6487)
—
noreply /
githubweb
- Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON
—
Daniel Arndt /
githubweb
- add threadvector
—
fnrizzi /
githubweb
- fix order
—
fnrizzi /
githubweb
- remove guards
—
fnrizzi /
githubweb
- UnorderedMap(space instance): proposal for #6067
—
romin.tomasetti /
githubweb
- Rename Kokkos_ThreadsExec to align with the other backends
—
Bruno Turcksin /
githubweb
- Promote Kokkos_Printf.hpp to public include
—
Daniel Arndt /
githubweb
- simd: add floor, ceil, round, trunc operations (#6393)
—
noreply /
githubweb
- Update CI in OpenMPTarget to use llvm-17 (#6472)
—
noreply /
githubweb
- Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp
—
Bruno Turcksin /
githubweb
- Kokkos_SIMD_Scalar.hpp: remove extra ';'
—
ndellin /
githubweb
- add guards
—
fnrizzi /
githubweb
- avoid auto
—
fnrizzi /
githubweb
- [ci skip] Update Kokkos version to 4.2.99
—
Daniel Arndt /
githubweb
- [deprecated code 3] remove all default device init tests
—
Damien L-G /
githubweb
- [deprecated code 3] remove InitArguments
—
Damien L-G /
githubweb
- [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros
—
Damien L-G /
githubweb
- [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax
—
Damien L-G /
githubweb
- [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions
—
Damien L-G /
githubweb
- [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants
—
Damien L-G /
githubweb
- [deprecated code 3] remove {OpenMP,HPX}::partition_master
—
Damien L-G /
githubweb
- OpenMP backend cleanup following removal of deprecated code 3
—
Damien L-G /
githubweb
- Get rid of FIXME_OPENMP
—
Damien L-G /
githubweb
- [deprecated code 3] remove MasterLock
—
Damien L-G /
githubweb
- [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option
—
Damien L-G /
githubweb
- fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions
—
Damien L-G /
githubweb
- Do not append " - blocks" to the bitset label
—
Damien L-G /
githubweb
- with_updated_label -> append_to_label
—
Daniel Arndt /
githubweb
- SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables
—
Daniel Arndt /
githubweb
- Fixup in README (github -> GitHub)
—
Damien L-G /
githubweb
- Threads: replace enum with constexpr int and enum class (#6514)
—
noreply /
githubweb
- Added unit tests for reduction ops and few intel svml intrinsics
—
donlee /
githubweb
- Remove sleep and wake functions
—
Bruno Turcksin /
githubweb
- Prefer defaulted default constructor for Bitset (#6524)
—
noreply /
githubweb
- SYCL: Use host-pinned memory to copy reduction/scan result (#6500)
—
noreply /
githubweb
- Remove unused variables
—
Bruno Turcksin /
githubweb
- Remove Sentinel struct from Threads
—
Bruno Turcksin /
githubweb
- Small cleanup of ThreadsInternal::initialize
—
Bruno Turcksin /
githubweb
- Remove useless forward declaration
—
Bruno Turcksin /
githubweb
- Remove spawn function
—
Bruno Turcksin /
githubweb
- Add comments
—
Bruno Turcksin /
githubweb
- Fix indentation
—
Bruno Turcksin /
githubweb
- Fix typo in macro guard
—
Bruno Turcksin /
githubweb
- Reduce number of View constructor instantiations
—
Damien L-G /
githubweb
- Bump HPX version used in CI to 1.9.0
—
mikael.simberg /
githubweb
- Split files in HIP backend
—
Bruno Turcksin /
githubweb
- Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544)
—
noreply /
githubweb
- Only define STDALGO_TEAM_SOURCES_* once
—
Daniel Arndt /
githubweb
- Rollback changes to view constructors to reduce the number of instantiations (#6564)
—
noreply /
githubweb
- Threads remove unused variables and functions (#6566)
—
noreply /
githubweb
- Remove unused Sandia testing files (#6568)
—
noreply /
githubweb
- fallback implementation cleanup
—
donlee /
githubweb
- Remove empty quotation marks for static_assert
—
Daniel Arndt /
githubweb
- [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header
—
Damien L-G /
githubweb
- Drop Clang+CUDA workaround
—
Damien L-G /
githubweb
- OpenMPTarget: CI compiler upgrade. (#6545)
—
noreply /
githubweb
- Add crtrott's launch_latency benchmark (#6379)
—
noreply /
githubweb
- Simplify fence functions in the Threads backend (#6571)
—
noreply /
githubweb
- simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574)
—
noreply /
githubweb
- OpenMPTarget: Guard scratch memory usage in ParallelReduce
—
Daniel Arndt /
githubweb
- SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534)
—
noreply /
githubweb
- Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header
—
Damien L-G /
githubweb
- Replace Marsaglia polar method with Box-Muller to generate a normally distributed random number (#6556)
—
noreply /
githubweb
- OpenMP: No memset in viewfill (#6573)
—
noreply /
githubweb
- Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header"
—
noreply /
githubweb
- OpenACC: add atomics support (#6446)
—
noreply /
githubweb
- Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543)
—
noreply /
githubweb
- kokkos(unique): fix allocation of temporary view to enforce using the provided space instance
—
romin.tomasetti /
githubweb
- Use binary wrapper for consistency in definition of half types numeric traits (#6590)
—
noreply /
githubweb
- try fix
—
fnrizzi /
githubweb
- Fix TestNumericTraits.hpp for SYCL with bfloat16 support
—
Daniel Arndt /
githubweb
- Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3
—
crtrott /
githubweb
- Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601)
—
noreply /
githubweb
- Added missing operator* to NEON simd
—
crtrott /
githubweb
- [ci skip] Update changelog on develop for 4.2.00 (#6592)
—
noreply /
githubweb
- Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593)
—
noreply /
githubweb
- try fix
—
fnrizzi /
githubweb
- avoid pyt package
—
fnrizzi /
githubweb
- try
—
fnrizzi /
githubweb
- fix for macos
—
fnrizzi /
githubweb
- remove comments
—
fnrizzi /
githubweb
- use reference
—
crtrott /
githubweb
- add branching
—
fnrizzi /
githubweb
- [ci skip] fix formatting
—
cezary.skrzynski /
githubweb
- GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing
—
Daniel Arndt /
githubweb
- nvcc(wrapper): adding missing `--generate-line-info` arg
—
romin.tomasetti /
githubweb
- Add clang-format check to GitHub workflows (#6612)
—
noreply /
githubweb
- graph(HIP): adding inline keyword to fix #6623
—
romin.tomasetti /
githubweb
- Add jenkins multibranch pipeline options
—
Bruno Turcksin /
githubweb
- kokkos(profiling): do not finalize in any backend
—
romin.tomasetti /
githubweb
- Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format
—
Bruno Turcksin /
githubweb
- Disabling OpenACC in the CI because it emits too many warnings
—
Bruno Turcksin /
githubweb
- unorderedmap: modernize traits
—
romin.tomasetti /
githubweb
- nvcc wrapper: remove troubling flag to fix 6628 (#6629)
—
noreply /
githubweb
- Add Kokkos::num_threads() and Kokkos::device_id()
—
Damien L-G /
githubweb
- Check Kokkos::num_threads and device_id in tests
—
Damien L-G /
githubweb
- Use std::sort for sorting within a bin when possible
—
Andrey Prokopenko /
githubweb
- RangePolicyRequire was not using require
—
tccleve /
githubweb
- Fix reducer result check for Serial+HPX ParallelReduce
—
Daniel Arndt /
githubweb
- Define KOKKOS_ARCH_NVIDIA_GPU macro when targeting an NVIDIA GPU architecture
—
Damien L-G /
githubweb
- Use KOKKOS_ARCH_NVIDIA_GPU macro in SYCL, OpenACC, and OpenMPTarget backends where appropriate
—
Damien L-G /
githubweb
- Add ParallelScanRangePolicy test
—
tccleve /
githubweb
- Address review comments
—
Andrey Prokopenko /
githubweb
- Fix partition_master test
—
Daniel Arndt /
githubweb
- Use CombinedReducer in HostIterateTile
—
Daniel Arndt /
githubweb
- Desul atomics: fix bug max uint64_t value
—
Damien L-G /
githubweb
- Make CUDA bhalf conversion code more forward compatible
—
Damien L-G /
githubweb
- Let KOKKOS_ARCH_NVIDIA_GPU provide the Compute Capability
—
Damien L-G /
githubweb
- Bring back git info to benchmarks output
—
cezary.skrzynski /
githubweb
- [ci skip] Add a comment
—
Andrey Prokopenko /
githubweb
- Rename KOKKOS_{ -> IMPL_}ARCH_NVIDIA_GPU
—
Damien L-G /
githubweb
- Improve macro name KOKKOS_IMPL_{ARCH_NVIDIA_GPU_AMPERE_PLUS -> NVIDIA_GPU_ARCH_SUPPORT_BHALF}
—
Damien L-G /
githubweb
- Allow that C++20 is passed to nvcc
—
noreply /
githubweb
- Allow c++20 in nvcc_wrapper for nvcc 12 and above
—
noreply /
githubweb
- Print Kokkos version at configuration time (#5979)
—
noreply /
githubweb
- #5641: Fix HIP & CUDA MDRange reduce for sizeof(value_type) < sizeof(int) (#5745)
—
noreply /
githubweb
- Fix sycl.large_team_scratch_size
—
Daniel Arndt /
githubweb
- Add to HIP tests in Makefile
—
tccleve /
githubweb
- Bit manipulation: implement `byteswap` (#5967)
—
noreply /
githubweb
- UnorderedMap: Ensure size() working in case of copies
—
Daniel Arndt /
githubweb
- Use (non-mutable) std::shared_ptr instead
—
Daniel Arndt /
githubweb
- Add another test case
—
Daniel Arndt /
githubweb
- graph(hip): enable test
—
romin.tomasetti /
githubweb
- Remove workaround for submit_barrier not being enqueued properly
—
Daniel Arndt /
githubweb
- Support --compiler-options in nvcc_wrapper
—
noreply /
githubweb
- Cleanup OpenMPTarget ParallelReduce
—
Daniel Arndt /
githubweb
- Fix compiling OpenMPTarget for AMD GPUs
—
Daniel Arndt /
githubweb
- Desul atomics: Fix NVCC warning integer conversion resulted in a change of sign
—
Damien L-G /
githubweb
- Fixup NVIDIA GPU arch must be defined potentially for other backends as well
—
Damien L-G /
githubweb
- Fix warning with NVC++
—
Damien L-G /
githubweb
- Fix team_scratch_1_queues for SYCL+Cuda
—
Daniel Arndt /
githubweb
- m_team_scratch_pool -> m_team_scratch_event
—
Daniel Arndt /
githubweb
- Explain acquire_team_scratch_space
—
Daniel Arndt /
githubweb
- Move Cuda/Kokkos_Cuda_NvidiaGpuArchitectures.hpp -> impl/Kokkos_NvidiaGpuArchitectures.hpp
—
Damien L-G /
githubweb
- Drive-by fix typos "fix {to -> too} many"
—
noreply /
githubweb
- simd: Fixed an incorrectly returning size for uint64_t in avx2 (#6004)
—
noreply /
githubweb
- Pass local_accessor directly instead
—
Daniel Arndt /
githubweb
- Fix warning in some user code when using std::memcpy
—
Bruno Turcksin /
githubweb
- CMake: update package compatibility mode when building within Trilinos
—
Andrey Prokopenko /
githubweb
- SYCL: Make is_device_copyable future-proof (#6009)
—
noreply /
githubweb
- desul: Move lock_array_copied from global scope (#5999)
—
noreply /
githubweb
- Fix namespace for is_device_copyable
—
Daniel Arndt /
githubweb
- Fixup 4.0 change log (#6015)
—
noreply /
githubweb
- Add half_t and bhalf_t limits (#5778)
—
noreply /
githubweb
- Add UnorderedMapInsertOps for coo2crs (#5877)
—
noreply /
githubweb
- #6027: replace remaining instances of ALL_t with Kokkos::ALL_t (#6028)
—
noreply /
githubweb
- Add -Wdeprecated-copy warning and fix OMPT scan bug related to assignment operators (#6026)
—
noreply /
githubweb
- Relax scratch space limits for HIP reductions (#6029)
—
noreply /
githubweb
- Fix -Wmissing-field-initializers warning
—
Daniel Arndt /
githubweb
- Fix enum warnings
—
Daniel Arndt /
githubweb
- Fix OMPT size compare warnings
—
Daniel Arndt /
githubweb
- Add -Wextra
—
Daniel Arndt /
githubweb
- Cherry-pick TriBITS update from Trilinos
—
Daniel Arndt /
githubweb
- KOKKOS_COMPILER_PGI -> KOKKOS_COMPILER_NVHPC
—
Daniel Arndt /
githubweb
- Try removing _kokkos_pgi_compiler_bug_workaround
—
Daniel Arndt /
githubweb
- Remove unused constructors for ThreadVectorRangeBoundariesStruct
—
donlee /
githubweb
- Define at most one KOKKOS_COMPILER* macro
—
Daniel Arndt /
githubweb
- Define KOKKOS_COMPILER_INTEL_LLVM
—
Daniel Arndt /
githubweb
- Desul atomics: let relocatable device code mode be part of the configuration (#5991)
—
noreply /
githubweb
- CHANGELOG.md: add threads sort
—
ajpowel /
githubweb
- Remove Kokkos Keyword Listing section from BUILD.md and refer to the wiki instead
—
Damien L-G /
githubweb
- Remove (outdated) license information [ci skip]
—
Damien L-G /
githubweb
- Remove libnuma (#6048)
—
noreply /
githubweb
- Don't create a shared state for size() in UnorderedMap's deep_copy
—
Daniel Arndt /
githubweb
- OpenMPTarget: Changes to Makefile.kokkos (#6053)
—
noreply /
githubweb
- Partially reverse #5504
—
Daniel Arndt /
githubweb
- Reduce size of ScatterView test when using OpenMP
—
Bruno Turcksin /
githubweb
- Fix bug in Makefile.kokkos
—
stamoor /
githubweb
- Fix join for ValueWrapperForNoNeutralElement
—
Daniel Arndt /
githubweb
- SYCL: Use in-order queue for SYCL+Cuda
—
Daniel Arndt /
githubweb
- perf_test is still not working
—
Daniel Arndt /
githubweb
- Fix PerfTests by limiting GramSchmidt
—
Daniel Arndt /
githubweb
- Only pass one wrapper object in SYCL reductions
—
Daniel Arndt /
githubweb
- Explicitly cast to CombinedFunctorReducerType
—
Daniel Arndt /
githubweb
- Fix Kokkos_SIMD with AVX2 on 64-bit architectures (#6075)
—
noreply /
githubweb
- Update changelog (#6058)
—
noreply /
githubweb
- Fix up SYCL execution space instance creation for Intel GPUs
—
Daniel Arndt /
githubweb
- Converted a shared_ptr to a host view in UnorderedMap (#6073)
—
noreply /
githubweb
- Update OpenACC FunctorAdapter (#6077)
—
noreply /
githubweb
- Improve SYCL parallel_scan
—
Daniel Arndt /
githubweb
- Compiling with auto deduction of workgroup sizes
—
Daniel Arndt /
githubweb
- Disable failed bit manipulation tests when compiled by NVHPC (#6088)
—
noreply /
githubweb
- Add support for Darwin 32-bit and PPC (#5916)
—
noreply /
githubweb
- Unconditionally enable CUDA extended lambda support
—
pbmille /
githubweb
- Tentative arguments switch for nvcc 12+
—
pbmille /
githubweb
- Change Makefile.kokkos too
—
pbmille /
githubweb
- Implement CMake messages per team decision
—
pbmille /
githubweb
- Fix definitions and docs to remove CUDA Lambda option
—
pbmille /
githubweb
- Don't fail to define broader 'lambdas are available' macro
—
pbmille /
githubweb
- Always expect KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA to be set
—
pbmille /
githubweb
- Remove various test exclusions based on KOKKOS_ENABLE_CUDA_LAMBDA
—
pbmille /
githubweb
- Update changelog
—
ndellin /
githubweb
- [ci skip] Fixup changelog
—
ndellin /
githubweb
- Work around nvcc issue for view_mapping and add FIXME_NVCC comment
—
pbmille /
githubweb
- OpenMPTarget: Update hierarchical parallelism. (#6043)
—
noreply /
githubweb
- Enable OpenMP in CUDA-11.0-NVCC-RDC to test DEPRECATED_CODE_3=ON (#5978)
—
noreply /
githubweb
- fix ternary op in subset of std algorithms not working with nvhpc (#6095)
—
noreply /
githubweb
- Add implementation of bit_cast in <Kokkos_BitManipulation.hpp>
—
Damien L-G /
githubweb
- Add compile time tests for the constraints on the bit_cast function template
—
Damien L-G /
githubweb
- Add the Experimental:: builtin variant (just defer to regular bit_cast)
—
Damien L-G /
githubweb
- Add runtime tests for bit_cast
—
Damien L-G /
githubweb
- Use Kokkos::bit_cast in SIMD instead of rolling its own
—
Damien L-G /
githubweb
- Clang-format glitch
—
Damien L-G /
githubweb
- view(uvm): fence if needed in allocation (#6005)
—
romin.tomasetti /
githubweb
- Disable tests that fail at runtime with NVHPC (likely not liking the class declaration within the body of the functor)
—
Damien L-G /
githubweb
- change impl of `is_sorted_until` to use reduce (#6097)
—
noreply /
githubweb
- Fix typo and remove accidentally committed assertions
—
noreply /
githubweb
- Added multiple reducers support for team-level parallel reduce (#5727)
—
noreply /
githubweb
- Work around NVHPC issue with enum types
—
crtrott /
githubweb
- Work around NVHPC 23.x issues
—
crtrott /
githubweb
- Kokkos: Remove TriBITS Kokkos subpackages (trilinos/Trilinos#11545) (#6104)
—
noreply /
githubweb
- Drop pointless Kokkos{Algorithms,Containers}_config.h files
—
Damien L-G /
githubweb
- Revert "Merge pull request #5964 from PhilMiller/cuda-lambda-default"
—
Damien L-G /
githubweb
- Update the OpenACC parallel_reduce() constructs with Range/MDRange/Team (#6072)
—
noreply /
githubweb
- Always pass -extended-lambda option to NVCC and force Kokkos_ENABLE_CUDA_LAMBDA ON
—
Damien L-G /
githubweb
- Reorganize ZeroMemset (#6087)
—
noreply /
githubweb
- Drop CUDA_LAMBDA guards in Cuda headers
—
Damien L-G /
githubweb
- Work around NVHPC 23.x not dealing with __isGlobal
—
crtrott /
githubweb
- Drop unused cmake macros
—
Damien L-G /
githubweb
- Fixup cmake style
—
Damien L-G /
githubweb
- use ASSERT_EQ in all std algorithms tests
—
fnrizzi /
githubweb
- Reintroduce test skip for nvhpc < 23.3
—
crtrott /
githubweb
- hpcbind: check for correct Slurm variable
—
rberger /
githubweb
- Fix macro guards in test for NVC++ as the CUDA compiler
—
Damien L-G /
githubweb
- Allow templated functors in parallel_for, parallel_reduce and parallel_scan (#5976)
—
noreply /
githubweb
- Import sycl::bit_cast into the Kokkos namespace
—
Daniel Arndt /
githubweb
- Qualify possibly ambiguous calls to bit_cast
—
Daniel Arndt /
githubweb
- Fix nightlies -- workaround compiler bug in GCC 9.1 and 9.2 (#6118)
—
noreply /
githubweb
- Kokkos_BitManipulation: KOKKOS_COMPILER_GCC->KOKKOS_COMPILER_GNU (#6119)
—
noreply /
githubweb
- Cuda: Remove unused attach_texture_object
—
Daniel Arndt /
githubweb
- Move half traits to private header and add half/bhalf infinity trait (#6055)
—
noreply /
githubweb
- Increase minimum required HPX version to 1.8.0
—
mikael.simberg /
githubweb
- Conditionally use hpx::post instead of hpx::apply based on HPX version
—
mikael.simberg /
githubweb
- Don't restrict index type in builtin reducers
—
Daniel Arndt /
githubweb
- dual view: update template types (#6085)
—
romin.tomasetti /
githubweb
- sorting an empty view should exit early and not fail (#6130)
—
noreply /
githubweb
- core/src: Move floating_point_wrapper to private header
—
eharvey /
githubweb
- Disable tests failing with NVHPC
—
Daniel Arndt /
githubweb
- Fix bit_cast for SYCL again
—
Daniel Arndt /
githubweb
- Disable tests for OpenMPTarget
—
Daniel Arndt /
githubweb
- Improve indentation of comments
—
Daniel Arndt /
githubweb
- Allow deprecated declarations in SYCL+Cuda CI
—
Daniel Arndt /
githubweb
- Try running for other execution spaces
—
Daniel Arndt /
githubweb
- Add guards for Cuda
—
Daniel Arndt /
githubweb
- Expand list of kokkos options not to export with cmake
—
Damien L-G /
githubweb
- Do not append to Kokkos_OPTIONS variables those in the do not export list
—
Damien L-G /
githubweb
- Drop Kokkos_ENABLE_LAUNCH_COMPILER option
—
Damien L-G /
githubweb
- Export Kokkos_ENABLE_<OPTION> that are relevant
—
Damien L-G /
githubweb
- Drop Kokkos_ENABLE_PROFILING_LOAD_PRINT option
—
Damien L-G /
githubweb
- Suppress bogus warning about CUDA_LAMBDA being ON
—
Damien L-G /
githubweb
- [ci skip] Add nightly ci for spack (#6135)
—
noreply /
githubweb
- OpenMPTarget: Enable Cray compiler for the OpenMPTarget backend. (#5889)
—
noreply /
githubweb
- Revert to `DualView<class,class=void,class=void,class=void>` when deprecated code 4 is enabled
—
Damien L-G /
githubweb
- Fix Kokkos_ENABLE_CUDA_LAMBDA for Trilinos
—
Daniel Arndt /
githubweb
- Fix bogus warnings in nested CUDA parallel_reduce
—
Daniel Arndt /
githubweb
- `BinSort`, `BinOp1D`, `BinOp3D`: mark default constructor as deleted (#6131)
—
noreply /
githubweb
- KokkosTools: Don't call callbacks before backends are initialized (#6114)
—
noreply /
githubweb
- Add `Kokkos::all_libs` alias target for compatibility with TriBITS/Trilinos (#6157)
—
noreply /
githubweb
- sorting: add to binsort support for strided views and reorg tests (#6081)
—
noreply /
githubweb
- Allow linking against build tree (#6078)
—
noreply /
githubweb
- Implement `HPX::in_parallel` (#6143)
—
noreply /
githubweb
- OpenMPTarget: Changes for OpenMPTarget backend with nvhpc compiler.
—
rgayatri /
githubweb
- OpenMPTarget: Add a fixme.
—
rgayatri /
githubweb
- Update Makefile.kokkos
—
noreply /
githubweb
- Remove extended_namespace template parameter for SYCLMemoryOrder/Scope
—
Daniel Arndt /
githubweb
- OpenMPTarget: update fixme comment.
—
rgayatri /
githubweb
- OpenMPTarget: Replace kokkos macros in desul.
—
rgayatri /
githubweb
- OpenMPTarget: Restore desul changes.
—
rgayatri /
githubweb
- Cherry-pick v3.7.02 changelog into develop [ci skip]
—
Damien L-G /
githubweb
- Kokkos: Mark HWLOC as a TriBITS TPL as well (trilinos/Trilinos#11938)
—
rabartl /
githubweb
- Clean up FunctorAnalysis
—
Daniel Arndt /
githubweb
- SIMD: make binary op tests to test against all data types (#5913)
—
noreply /
githubweb
- Also create symlinks for CMake configuration files to cmake_packages/Kokkos for TriBITS (#6163)
—
noreply /
githubweb
- Allow passing a temporary std::vector to partition_space (#6167)
—
noreply /
githubweb
- .github/workflows: Remove push trigger
—
eharvey /
githubweb
- .github/workflows: Only trigger upon push to develop
—
eharvey /
githubweb
- Replace _mm512_loadu_epi64 and _mm512_storeu_epi64 with _mm512_loadu_si512 and _mm512_storeu_si512
—
donlee /
githubweb
- OpenMPTarget: include desul changes.
—
rgayatri /
githubweb
- Weed out verbose output from `dynamic_view` container unit test (#6173)
—
noreply /
githubweb
- shortcut value for is_dynamic_view
—
fnrizzi /
githubweb
- add trait and test
—
fnrizzi /
githubweb
- Fix global fence in Kokkos::resize(DynRankView) (#6184)
—
noreply /
githubweb
- Left align demangled stacktrace output. (#6191)
—
noreply /
githubweb
- [HIP] Improve heuristic deciding the number of blocks used in parallel_reduce (#6160)
—
noreply /
githubweb
- Improve OpenMP affinity warning to include MPI concerns (#6185)
—
noreply /
githubweb
- Update version number on develop after branching off for 4.1.00
—
Daniel Arndt /
githubweb
- Fix test_quad_precision_math_constants test
—
Daniel Arndt /
githubweb
- implementation and tests
—
fnrizzi /
githubweb
- implementation and tests
—
fnrizzi /
githubweb
- [ci skip] test_all_sandia: update compilers and queues
—
ndellin /
githubweb
- team-level std algos: common code needed (#6199)
—
noreply /
githubweb
- Fix compiling SYCL with KOKKOS_IMPL_DO_NOT_USE_PRINTF_USAGE
—
Daniel Arndt /
githubweb
- snapshot mdspan namespace changes (#6162)
—
noreply /
githubweb
- Disable AVX512 support for NVHPC
—
Daniel Arndt /
githubweb
- Fix host-annotations of AVX2, AVX512, and NEON constructors
—
Daniel Arndt /
githubweb
- Introduce impl_get_value/impl_get_mask
—
Daniel Arndt /
githubweb
- Fix a gcc-8.4.0 warning
—
Daniel Arndt /
githubweb
- Fix host-device annotation for where_expression/const_where_expression
—
Daniel Arndt /
githubweb
- Make in-order queues the default via macro
—
Daniel Arndt /
githubweb
- Avoid SFINAE in favor of overloads
—
Daniel Arndt /
githubweb
- Move scalar overloads to Scalar header
—
Daniel Arndt /
githubweb
- Disable KOKKOS_ARCH_AVX512XEON for NVHPC
—
Daniel Arndt /
githubweb
- Changelog for 4.1.00 (#6225)
—
noreply /
githubweb
- reorganize sort headers (#6230)
—
noreply /
githubweb
- SYCL: Support for bhalf_t (#6204)
—
noreply /
githubweb
- only compute with relevant entries
—
tccleve /
githubweb
- make constraints on `Kokkos::sort` more visible/clear (#6234)
—
noreply /
githubweb
- slim API and move code to impl
—
fnrizzi /
githubweb
- This PR contains minor code changes and bug fixes needed for LLVM-Clacc
—
lees2 /
githubweb
- improve all other corner cases as per review comment
—
fnrizzi /
githubweb
- refine for cuda uvm
—
fnrizzi /
githubweb
- use exespace to check rather than mem space
—
fnrizzi /
githubweb
- Fix AVX2 simd support for ZEN2 AMD CPU. (#6238)
—
noreply /
githubweb
- fix corner case
—
fnrizzi /
githubweb
- Fix windows symlink configure issue (#6241)
—
noreply /
githubweb
- fix corner cases
—
fnrizzi /
githubweb
- bug_report.md: new PR branching from `develop` (#5034)
—
noreply /
githubweb
- Fix whitespace in bug_report.md (#6244)
—
noreply /
githubweb
- Avoid undefined behavior in TestTaskScheduler.hpp
—
Daniel Arndt /
githubweb
- Remove calling tribits_exclude_autotools_files()
—
rabartl /
githubweb
- Ensure that complex is only instantiated for cv-unqualified floating-point type
—
Damien L-G /
githubweb
- Deprecated Kokkos::vector
—
Damien L-G /
githubweb
- Warn if <Kokkos_Vector.hpp> is included
—
Damien L-G /
githubweb
- Drop Vector test with makefiles and conditionally remove it with CMake
—
Damien L-G /
githubweb
- Ignore <Kokkos_Vector.hpp> in the header self-containment tests
—
Damien L-G /
githubweb
- SYCL: Use in-order queues in InterOp tests (#6246)
—
noreply /
githubweb
- std_algos: fix wrong corner case for `is_partitioned` (#6257)
—
noreply /
githubweb
- Make sure macros are defined
—
Daniel Arndt /
githubweb
- Error out when Kokkos_Vector.hpp is included with deprecated code disabled
—
Daniel Arndt /
githubweb
- SIMD: Add abs() for all int types (#6069)
—
noreply /
githubweb
- Fix SIMD abs unit test accidentally using complex overload
—
Damien L-G /
githubweb
- Fix SIMD tests on NEON
—
Daniel Arndt /
githubweb
- Add default ParallelFor copy constructor for HIP
—
Bruno Turcksin /
githubweb
- Workaround gcc/8.2.0 compiler issue with _mm512_abs_pd
—
ndellin /
githubweb
- Implement Kokkos::printf (#6083)
—
noreply /
githubweb
- Improve SYCL TeamPolicy reduction
—
Daniel Arndt /
githubweb
- make Kokkos_CXX_COMPILER_VERSION available to CMake consumers
—
cwpears /
githubweb
- Fully qualify Experimental::SYCL in algorithms to avoid finding conflicting namespaces
—
Daniel Arndt /
githubweb
- subset of team level impl of std algorithms
—
fnrizzi /
githubweb
- fix copyright
—
fnrizzi /
githubweb
- guard for openmptarget
—
fnrizzi /
githubweb
- fix for openmptarget
—
fnrizzi /
githubweb
- address comments
—
fnrizzi /
githubweb
- Update CMakeLists for unit tests with OpenMPTarget, OpenACC with NVHPC (#6260)
—
noreply /
githubweb
- Update CI from CUDA 11.7.0 to 11.7.1
—
Bruno Turcksin /
githubweb
- Improve SYCL reduction performance: RangePolicy (#6264)
—
noreply /
githubweb
- Improve SYCL reduction performance: workgroup_reduction (#6270)
—
noreply /
githubweb
- SYCL TeamPolicy: Fix sign comparison warning
—
Daniel Arndt /
githubweb
- SIMD: suppress an uninitialized variable warning (#6294)
—
noreply /
githubweb
- OpenACC CMakechange Clacc (#6250)
—
noreply /
githubweb
- `Kokkos::sort` support custom comparator (#6253)
—
noreply /
githubweb
- Add nightly build using latest gcc and c++23
—
Bruno Turcksin /
githubweb
- Explicitly capture this in lambda function
—
Bruno Turcksin /
githubweb
- Fix typo in nightly jenkins configuration
—
Bruno Turcksin /
githubweb
- Fix a memory bug in the `free_state` function of random pools. (#6290)
—
noreply /
githubweb
- std_algos: for_each: try condense the impl
—
fnrizzi /
githubweb
- format
—
fnrizzi /
githubweb
- Enforce create_mirror restrictions w.r.t. ViewCtorArgs on all variants
—
Daniel Arndt /
githubweb
- Use KOKKOS_IF_ON_HOST
—
Daniel Arndt /
githubweb
- Suppress warnings
—
Daniel Arndt /
githubweb
- Don't suppress warnings for NVHPC
—
Daniel Arndt /
githubweb
- Improve macro definitions
—
Daniel Arndt /
githubweb
- Enable Serial backend in HPX build
—
cezary.skrzynski /
githubweb
- Modify fences in View API test
—
cezary.skrzynski /
githubweb
- Check for overflow during backend initialization (Cuda, HIP, SYCL) (#6159)
—
noreply /
githubweb
- Improve SYCL reduction performance: MDRangePolicy (#6271)
—
noreply /
githubweb
- bring back previous code as discussed in meeting
—
fnrizzi /
githubweb
- create cudaAPI function wrappers
—
tccleve /
githubweb
- Reorganize #include <cuda_runtime_api.h>
—
tccleve /
githubweb
- Some API functions require CUDA 11.2+
—
tccleve /
githubweb
- Cuda10 requires "stream=nullptr" as default arg
—
tccleve /
githubweb
- Rework stream inputs
—
tccleve /
githubweb
- Use "if constexpr" for setCudaDevice
—
tccleve /
githubweb
- Remove static in comment
—
tccleve /
githubweb
- Update CI: use CUDA 11.6.2 instead of 11.6.0 (#6314)
—
noreply /
githubweb
- add helper variable templates `are_*_iterators_v` (#6312)
—
noreply /
githubweb
- [HIP] Optimize parallel_reduce (#6229)
—
noreply /
githubweb
- Split Kokkos_SYCL_Parallel_Range.hpp
—
Daniel Arndt /
githubweb
- Split Kokkos_SYCL_Parallel_Reduce.hpp
—
Daniel Arndt /
githubweb
- Move Kokkos_SYCL_Scan.hpp
—
Daniel Arndt /
githubweb
- Split Kokkos_Parallel_Team.hpp
—
Daniel Arndt /
githubweb
- Add comment that we don't need to link against clang_rt with ROCm 5.5
—
Bruno Turcksin /
githubweb
- Error out when CXX standard is not set when using amdclang or cray clang
—
Bruno Turcksin /
githubweb
- Disable failing tests for ROCm 5.5 and 5.6
—
Bruno Turcksin /
githubweb
- #5635: Add parallel_scan overload with return value for ThreadVectorRange
—
arek.szczepkowicz /
githubweb
- #5635: Add test for parallel_scan with return value for ThreadVectorRange
—
arek.szczepkowicz /
githubweb
- #5635: Serial/OpenMP: Parallel_scan with return value for TeamThreadRange (#6090)
—
noreply /
githubweb
- Fix reviewer's comments
—
Bruno Turcksin /
githubweb
- Allow using the SYCL execution space on AMD GPUs (#6321)
—
noreply /
githubweb
- Update Clang+Cuda CI from 10.1 to 11.0.3 (#6318)
—
noreply /
githubweb
- SIMD: add shift ops for all int types (#6109)
—
noreply /
githubweb
- SYCL: Use sycl::bit_cast from oneAPI 2024.0.0 on (#6300)
—
noreply /
githubweb
- remove spurious undefs
—
fnrizzi /
githubweb
- Decrease maximum memory available to ccache
—
Bruno Turcksin /
githubweb
- Fix gtest when using C++20
—
Bruno Turcksin /
githubweb
- Fully qualify Experimental::SYCL in algorithms to avoid conflicting namespaces
—
Daniel Arndt /
githubweb
- address reviews [skip ci]
—
antoine.meyer54 /
githubweb
- formatting
—
fnrizzi /
githubweb
- SIMD: split simd unit tests into separate files (#6278)
—
noreply /
githubweb
- Run NVHPC only on V100
—
Bruno Turcksin /
githubweb
- Use checked arithmetic builtins for overflow detection (#6313)
—
noreply /
githubweb
- Passing size==0 to DeepCopy/memcpy/omp_target_memcpy (#6273)
—
noreply /
githubweb
- allow sorting via native oneDPL to support views with stride = 1 (#6322)
—
noreply /
githubweb
- Adopt new HIP cmake's way of finding clang-rt
—
nicurtis /
githubweb
- Fix #6334: intel compiler does not like returning inside if constexpr (#6335)
—
noreply /
githubweb
- Fix #6336: capturing `this` implicitly is deprecated in C++20 (#6337)
—
noreply /
githubweb
- fix lambda capture
—
fnrizzi /
githubweb
- remove unnecessary file, fix constraints
—
fnrizzi /
githubweb
- remove file
—
fnrizzi /
githubweb
- fix
—
fnrizzi /
githubweb
- fix lambda capture and constraints
—
fnrizzi /
githubweb
- keep only subset
—
fnrizzi /
githubweb
- revert files
—
fnrizzi /
githubweb
- remove file
—
fnrizzi /
githubweb
- fix syntax
—
fnrizzi /
githubweb
- format
—
fnrizzi /
githubweb
- Disable default oneDPL support in Trilinos
—
Daniel Arndt /
githubweb
- Rename AMD GPU architectures (#6266)
—
noreply /
githubweb
- Fix compiling SIMD library with NEON and gcc-13
—
Daniel Arndt /
githubweb
- SIMD: add generator constructors (#6347)
—
noreply /
githubweb
- Use /usr/bin/env bash in nvcc_wrapper and kokkos_launch_compiler for portability
—
mikael.simberg /
githubweb
- Fix cudaAPI wrapper errors for CUDA_MALLOC_ASYNC (#6346)
—
noreply /
githubweb
- Use std::aligned_alloc for allocations (#6341)
—
noreply /
githubweb
- Initial implementation of gfx942 (#6358)
—
noreply /
githubweb
- Only set KOKKOS_ARCH_AMD_GPU if an AMD GPU architecture is enabled
—
Daniel Arndt /
githubweb
- Extend 'hip_driver_check_page_migration' (#6364)
—
noreply /
githubweb
- Replace volatile m_ready_count loads by desul::atomic_loads in task queue implementation
—
mikael.simberg /
githubweb
- Add Kokkos_Atomic.hpp headers to OpenMP and HPX task headers
—
mikael.simberg /
githubweb
- Add support for HIP Graph
—
Bruno Turcksin /
githubweb
- Replace one of the ROCm 5.2 configurations with ROCm 5.6
—
Bruno Turcksin /
githubweb
- Use aligned version of operator new instead of aligned_alloc
—
Daniel Arndt /
githubweb
- HPX: Don't interfere with exception handling
—
Daniel Arndt /
githubweb
- Fix -Wformat-truncation warnings in CI (#6354)
—
noreply /
githubweb
- Do not use HIP Graph with ROCm 5.2
—
Bruno Turcksin /
githubweb
- Use constexpr West in src
—
Bruno Turcksin /
githubweb
- Use constexpr West in test
—
Bruno Turcksin /
githubweb
- SIMD: convert binary operators to hidden friends (#6320)
—
noreply /
githubweb
- simd: make mask and condition unit test to check with all data types (#6360)
—
noreply /
githubweb
- team-level std algos: part 2 (#6205)
—
noreply /
githubweb
- Add deprecated attribute to HostSpace(AllocationMechanism) definition
—
Damien L-G /
githubweb
- Added gather_from and scatter_to for AVX2 and AVX512 simd
—
donlee /
githubweb
- Rebased and applied feedback
—
donlee /
githubweb
- Changed gen ctor usage to pass in KOKKOS_LAMBDA
—
donlee /
githubweb
- Suppress NVCC attribute ignored warning on SIMD operators as hidden friends
—
Damien L-G /
githubweb
- Remove deprecated code 3 support for volatile join
—
crtrott /
githubweb
- Disable a test not working with nvhpc-23.1
—
crtrott /
githubweb
- Reenabling tests for nvhpc 23.7
—
crtrott /
githubweb
- Update containers and algorithms for NVC++ 23.7
—
crtrott /
githubweb
- More NVC++ 23.7 updates
—
crtrott /
githubweb
- NVC++ clang-format fixes
—
crtrott /
githubweb
- Update nvhpc to version 23.7 in the CI
—
crtrott /
githubweb
- NVHPC 23.7 update: address reviewer comments
—
crtrott /
githubweb
- OpenACC: Guard tests relying on abort
—
crtrott /
githubweb
- Fix TestAtomic to use the test execspace
—
crtrott /
githubweb
- Use NVHPC 23.7 for testing of OpenACC
—
crtrott /
githubweb
- Update nvhpc gtest skip message
—
crtrott /
githubweb
- Work around OpenMPTarget failure
—
crtrott /
githubweb
- Update base docker file for nvhpc
—
crtrott /
githubweb
- Remove stray Cuda graph pattern specialization from tag
—
Damien L-G /
githubweb
- Fix reviewer's comments
—
Bruno Turcksin /
githubweb
- Fix uninitialized variable warning with gcc 13
—
Bruno Turcksin /
githubweb
- HIP: Update print_configuration (#6387)
—
noreply /
githubweb
- Check AVX, AVX2, AVX512XEON compiler macros on KokkosCore_config.h (#6248)
—
noreply /
githubweb
- Don't use local headers or runtime in HPX backend due to deprecation
—
mikael.simberg /
githubweb
- Add a more comprehensive `kokkos_{malloc, free}` perf_test (#6377)
—
noreply /
githubweb
- team-level std algos: part 3 (#6207)
—
noreply /
githubweb
- Adding is_scoped_enum & to_underlying (#6356)
—
noreply /
githubweb
- SIMD: add float simd support (#6177)
—
noreply /
githubweb
- team-level std algos: part 4 (#6208)
—
noreply /
githubweb
- Added a gen ctor for float (#6397)
—
noreply /
githubweb
- team-level std algos: part 5 (#6209)
—
noreply /
githubweb
- Deprecate Cuda(cudaStream_t, bool)
—
Damien L-G /
githubweb
- Fixup checked integer operations death test
—
Damien L-G /
githubweb
- Deprecate HIP(hipStream_t, bool)
—
Damien L-G /
githubweb
- Let Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC be ON by default
—
Damien L-G /
githubweb
- Print whether KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC is defined
—
Damien L-G /
githubweb
- Introduce disable_malloc_async Cuda option with generated makefiles
—
Damien L-G /
githubweb
- Preserve one build that disables Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC
—
Damien L-G /
githubweb
- Use archive extraction time for timestamps
—
cezary.skrzynski /
githubweb
- Disable performance benchmarks in AppVeyor CI
—
cezary.skrzynski /
githubweb
- team-level std algos: part 6 (#6210)
—
noreply /
githubweb
- address comments
—
fnrizzi /
githubweb
- OpenMP backend refactor files. (#6403)
—
noreply /
githubweb
- Drop (unused) `Cuda::cuda_internal_maximum_shared_words`
—
Damien L-G /
githubweb
- Drop check that the host backend is initialized before the Cuda/HIP/SYCL one
—
Damien L-G /
githubweb
- Drop unused HIPInternal::m_maxSharedWords data member
—
Damien L-G /
githubweb
- Drop unused HIPInternal::m_hipArch static data member
—
Damien L-G /
githubweb
- !initialized() should be a precondition for calling {Cuda,HIP,SYCL}Internal::initialize
—
Damien L-G /
githubweb
- Drop pre-Kepler logic in Cuda::impl_initialize
—
Damien L-G /
githubweb
- use single
—
fnrizzi /
githubweb
- address comments
—
fnrizzi /
githubweb
- formatting
—
fnrizzi /
githubweb
- Team-level std algos: part 7 (#6211)
—
noreply /
githubweb
- formatting
—
fnrizzi /
githubweb
- Enable death tests for fedora rawhide
—
cezary.skrzynski /
githubweb
- core/src: Add half math functions to private header (#6124)
—
noreply /
githubweb
- Drop check whether device supports unified addressing
—
Damien L-G /
githubweb
- fix single as per Christian's suggestion
—
fnrizzi /
githubweb
- Only warn once (at initialization) when forcing allocation in unified memory
—
Damien L-G /
githubweb
- check-copyright improvements (#6399)
—
noreply /
githubweb
- Use execution space instance argument to get device properties in block size deduction
—
Damien L-G /
githubweb
- Address reviewer's comments
—
Bruno Turcksin /
githubweb
- Fix to avoid #186-D pointless comparison warning.
—
maarten.arnst /
githubweb
- add comment
—
fnrizzi /
githubweb
- improve tests to address review
—
fnrizzi /
githubweb
- Fix guard for isnan test for bhalf_t
—
Daniel Arndt /
githubweb
- avoid potential race condition HIP
—
tccleve /
githubweb
- [SYCL][Reduction] Group counter should use at least memory_order::acq_rel
—
andrei.elovikov /
githubweb
- Same for scan
—
andrei.elovikov /
githubweb
- Initialize m_num_scratch_locks for Cuda parallel_for TeamPolicy
—
Daniel Arndt /
githubweb
- improve tests with intra-team result check
—
fnrizzi /
githubweb
- Fixes for Kokkos::Array (#6372)
—
noreply /
githubweb
- try fix for unique, previous impl to remove later
—
fnrizzi /
githubweb
- #5635: Add parallel_scan changes for CUDA and TeamThreadRange
—
cezary.skrzynski /
githubweb
- remove old impl
—
fnrizzi /
githubweb
- #5635: Enable TeamThreadRange test for CUDA
—
cezary.skrzynski /
githubweb
- Clean up benchmarks/gups
—
cwpears /
githubweb
- benchmark/gups: use CMake
—
cwpears /
githubweb
- OpenMPTarget: Disable check for SIMD compiler macros
—
Daniel Arndt /
githubweb
- #5635: Add parallel_scan with value for CUDA and ThreadVectorRange
—
cezary.skrzynski /
githubweb
- add missing assert
—
fnrizzi /
githubweb
- #5635: Add test for parallel_scan with return value for ThreadVectorRange
—
cezary.skrzynski /
githubweb
- add intra team check for missing test
—
fnrizzi /
githubweb
- fix intel compile error
—
fnrizzi /
githubweb
- fix unreachable for intel
—
fnrizzi /
githubweb
- re-enable unit tests for sort and random via makefile (#6422)
—
noreply /
githubweb
- OpenMPTarget init-join fix (#6444)
—
noreply /
githubweb
- Fix Cuda parallel_scan ThreadVectorRange range
—
Daniel Arndt /
githubweb
- Assign final sum in Cuda parallel_scan ThreadVectorRange
—
Daniel Arndt /
githubweb
- Fix compiling code using Kokkos::printf for OpenMPTarget on Intel GPUs (#6443)
—
noreply /
githubweb
- std_algos: improving min, max, minmax (#6421)
—
noreply /
githubweb
- team-level stdalgos: improve tests, check intra-team result matching (part 2/7) (#6426)
—
noreply /
githubweb
- Skip bessel function tests known to fail on Intel GPUs (#6434)
—
noreply /
githubweb
- team-level stdalgos: improve tests, check intra-team result matching (part 6/7) (#6436)
—
noreply /
githubweb
- Fix race condition in functor_vec_scan_ret_val test
—
Daniel Arndt /
githubweb
- Fix parallel_scan_with_reducers test
—
Daniel Arndt /
githubweb
- team-level stdalgos: improve tests, check intra-team result matching (part 3/7) (#6425)
—
noreply /
githubweb
- improve tests (#6432)
—
noreply /
githubweb
- improve tests (#6437)
—
noreply /
githubweb
- Move final assignment to correct scope
—
cezary.skrzynski /
githubweb
- fix casting warning in Random test
—
fnrizzi /
githubweb
- Workaround for ROCm 5.6+ failing to compile with AVX2 SIMD support (#6449)
—
noreply /
githubweb
- HIP: Restrict AVX2 workaround to ROCm 5.6 and 5.7
—
Daniel Arndt /
githubweb
- fixes build error for TeamReduce and TeamTranformReduced tests for specific GCC (#6459)
—
noreply /
githubweb
- improve tests to check intra-team result (#6431)
—
noreply /
githubweb
- SIMD: Math functions should be in namespace Kokkos
—
Daniel Arndt /
githubweb
- SYCL: Disable another bessel function test for Intel GPUs
—
Daniel Arndt /
githubweb
- team-level std algos: part 10 (#6256)
—
noreply /
githubweb
- team-level std algos: part 11 (#6258)
—
noreply /
githubweb
- #5635: HIP: Add Overloads for parallel_scan with return value for TeamThreadRange (#6302)
—
noreply /
githubweb
- #5635: Add parallel_scan overloads with value for Threads
—
cezary.skrzynski /
githubweb
- #5635: Move some tests for parallel_scan to TestTeamScan
—
cezary.skrzynski /
githubweb
- use shortcut
—
cezary.skrzynski /
githubweb
- Skip testing for non-power-of-two team sizes
—
cezary.skrzynski /
githubweb
- Fix formatting
—
cezary.skrzynski /
githubweb
- Add parallel_scan overloads with value for HIP backend
—
cezary.skrzynski /
githubweb
- Use std::is_same_v
—
cezary.skrzynski /
githubweb
- #5635: Move some tests for parallel_scan to TestTeamScan
—
cezary.skrzynski /
githubweb
- #5635: SYCL: Add parallel_scan overload with return value
—
cezary.skrzynski /
githubweb
- Use std::is_same_v
—
cezary.skrzynski /
githubweb
- OpenMP: Fix TeamThreadRange parallel_scan with return value for team_size > 1
—
Daniel Arndt /
githubweb
- Add compatible copy assignment operator to DualView
—
maarten.arnst /
githubweb
- Fix minimum version for Google benchmark
—
Daniel Arndt /
githubweb
- Add test of copy constructor/assignment operator for DualView.
—
maarten.arnst /
githubweb
- Compute concurrency on HIP using Kokkos hardcoded m_maxWavesPerCU
—
maarten.arnst /
githubweb
- Modify test so that source and destination view are of different type
—
maarten.arnst /
githubweb
- Use call operator instead of run_me function
—
maarten.arnst /
githubweb
- team-level std algos: part 12 (#6350)
—
noreply /
githubweb
- core/src: Add half single and double mixed compare (LT,GT,LE,GE) (#6407)
—
noreply /
githubweb
- Moving abort and assert into their own public headers (#6445)
—
noreply /
githubweb
- Add test for parallel_scan with return value for ThreadVectorRange
—
cezary.skrzynski /
githubweb
- Add parallel_scan overloads with value for Threads
—
cezary.skrzynski /
githubweb
- Allow detecting SIMD types based on compiler macros (#6188)
—
noreply /
githubweb
- Add KOKKOS_ARCH_ARM_NEON
—
Daniel Arndt /
githubweb
- Fix implementation for cyl_bessel_i0
—
Daniel Arndt /
githubweb
- cleaning: remove iostream from headers where possible (IWYU)
—
romin.tomasetti /
githubweb
- Fix compiling SIMD unit tests on NVIDIA
—
Daniel Arndt /
githubweb
- guards to ensure DBL_EPSILON return for POWER8,9
—
ajpowel /
githubweb
- Fix NVCC warnings (#6483)
—
noreply /
githubweb
- team-level std algos: part 13 (#6351)
—
noreply /
githubweb
- Also fix annotations for generator constructor for AVX512 and NEON
—
Daniel Arndt /
githubweb
- #5635: SYCL: Add parallel_scan overload with value for ThreadVectorRange
—
cezary.skrzynski /
githubweb
- Fix atomic operations bug for Min and Max (#6435)
—
noreply /
githubweb
- Fix example/build_cmake_installed_different_compiler
—
Daniel Arndt /
githubweb
- fix impl
—
fnrizzi /
githubweb
- Update core/src/HIP/Kokkos_HIP_KernelLaunch.hpp
—
noreply /
githubweb
- Split Kokkos_Threads_Parallel files
—
Bruno Turcksin /
githubweb
- HPX: Implement TeamThread and ThreadVector parallel_scan with return value
—
Daniel Arndt /
githubweb
- Serial: Allow for distinct execution space instances (#6441)
—
noreply /
githubweb
- add overload for TeamThreadRange
—
fnrizzi /
githubweb
- address review comment
—
fnrizzi /
githubweb
- Update to HIP TeamPolicy Block number heuristic (#6284)
—
noreply /
githubweb
- SIMD: Split math functions from SIMD_Common.hpp (#6487)
—
noreply /
githubweb
- Allow NVHPC as device compiler only with Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON
—
Daniel Arndt /
githubweb
- add threadvector
—
fnrizzi /
githubweb
- fix order
—
fnrizzi /
githubweb
- remove guards
—
fnrizzi /
githubweb
- UnorderedMap(space instance): proposal for #6067
—
romin.tomasetti /
githubweb
- Rename Kokkos_ThreadsExec to align with the other backends
—
Bruno Turcksin /
githubweb
- Promote Kokkos_Printf.hpp to public include
—
Daniel Arndt /
githubweb
- simd: add floor, ceil, round, trunc operations (#6393)
—
noreply /
githubweb
- Update CI in OpenMPTarget to use llvm-17 (#6472)
—
noreply /
githubweb
- Rename Kokkos_ThreadsTeam.hpp to Kokkos_Threads_Team.hpp
—
Bruno Turcksin /
githubweb
- Kokkos_SIMD_Scalar.hpp: remove extra ';'
—
ndellin /
githubweb
- add guards
—
fnrizzi /
githubweb
- avoid auto
—
fnrizzi /
githubweb
- [ci skip] Update Kokkos version to 4.2.99
—
Daniel Arndt /
githubweb
- [deprecated code 3] remove all default device init tests
—
Damien L-G /
githubweb
- [deprecated code 3] remove InitArguments
—
Damien L-G /
githubweb
- [deprecated code 3] remove KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macros
—
Damien L-G /
githubweb
- [deprecated code 3] remove using declaration in Kokkos::Experimental:: for clamp, min, max, and minmax
—
Damien L-G /
githubweb
- [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions
—
Damien L-G /
githubweb
- [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math constants
—
Damien L-G /
githubweb
- [deprecated code 3] remove {OpenMP,HPX}::partition_master
—
Damien L-G /
githubweb
- OpenMP backend cleanup following removal of deprecated code 3
—
Damien L-G /
githubweb
- Get rid of FIXME_OPENMP
—
Damien L-G /
githubweb
- [deprecated code 3] remove MasterLock
—
Damien L-G /
githubweb
- [deprecated code 3] remove ENABLE_DEPRECATED_CODE_3 option
—
Damien L-G /
githubweb
- fixup! [deprecated code 3] remove using declaration in Kokkos::Experimental:: for all math functions
—
Damien L-G /
githubweb
- Do not append " - blocks" to the bitset label
—
Damien L-G /
githubweb
- with_updated_label -> append_to_label
—
Daniel Arndt /
githubweb
- SYCL: Use SYCL_EXT_ONEAPI_DEVICE_GLOBAL to detect support for device global variables
—
Daniel Arndt /
githubweb
- Fixup in README (github -> GitHub)
—
Damien L-G /
githubweb
- Threads: replace enum with constexpr int and enum class (#6514)
—
noreply /
githubweb
- Added unit tests for reduction ops and few intel svml intrinsics
—
donlee /
githubweb
- Remove sleep and wake functions
—
Bruno Turcksin /
githubweb
- Prefer defaulted default constructor for Bitset (#6524)
—
noreply /
githubweb
- SYCL: Use host-pinned memory to copy reduction/scan result (#6500)
—
noreply /
githubweb
- Remove unused variables
—
Bruno Turcksin /
githubweb
- Remove Sentinel struct from Threads
—
Bruno Turcksin /
githubweb
- Small cleanup of ThreadsInternal::initialize
—
Bruno Turcksin /
githubweb
- Remove useless forward declaration
—
Bruno Turcksin /
githubweb
- Remove spawn function
—
Bruno Turcksin /
githubweb
- Add comments
—
Bruno Turcksin /
githubweb
- Fix indentation
—
Bruno Turcksin /
githubweb
- Fix typo in macro guard
—
Bruno Turcksin /
githubweb
- Reduce number of View constructor instantiations
—
Damien L-G /
githubweb
- Bump HPX version used in CI to 1.9.0
—
mikael.simberg /
githubweb
- Split files in HIP backend
—
Bruno Turcksin /
githubweb
- Trim some fat in `CudaInternal` (towards multiple GPUs support) (#6544)
—
noreply /
githubweb
- Only define STDALGO_TEAM_SOURCES_* once
—
Daniel Arndt /
githubweb
- Rollback changes to view constructors to reduce the number of instantiations (#6564)
—
noreply /
githubweb
- Threads remove unused variables and functions (#6566)
—
noreply /
githubweb
- Remove unused Sandia testing files (#6568)
—
noreply /
githubweb
- fallback implementation cleanup
—
donlee /
githubweb
- Remove empty quotation marks for static_assert
—
Daniel Arndt /
githubweb
- [ci skip] Drop unused <impl/Kokkos_Memory_Fence.hpp> header
—
Damien L-G /
githubweb
- Drop Clang+CUDA workaround
—
Damien L-G /
githubweb
- OpenMPTarget: CI compiler upgrade. (#6545)
—
noreply /
githubweb
- Add crtrott's launch_latency benchmark (#6379)
—
noreply /
githubweb
- Simplify fence functions in the Threads backend (#6571)
—
noreply /
githubweb
- simd: temporarily skip device math ops unit test for OpenMPTarget build (#6574)
—
noreply /
githubweb
- OpenMPTarget: Guard scratch memory usage in ParallelReduce
—
Daniel Arndt /
githubweb
- SYCL: Implement DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION path (#6534)
—
noreply /
githubweb
- Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header
—
Damien L-G /
githubweb
- Replace Marsaglia polar method with Box-Muller to generate a normally distributed random number (#6556)
—
noreply /
githubweb
- OpenMP: No memset in viewfill (#6573)
—
noreply /
githubweb
- Revert "Desul atomics: Trade SYCL-specific compile definition for a macro definition in the configuration header"
—
noreply /
githubweb
- OpenACC: add atomics support (#6446)
—
noreply /
githubweb
- Fix infinity, quiet_NaN, signaling_NaN, isfinite, isnan, isinf for half_t and bhalf_t (#6543)
—
noreply /
githubweb
- kokkos(unique): fix allocation of temporary view to enforce using the provided space instance
—
romin.tomasetti /
githubweb
- Use binary wrapper for consistency in definition of half types numeric traits (#6590)
—
noreply /
githubweb
- try fix
—
fnrizzi /
githubweb
- Fix TestNumericTraits.hpp for SYCL with bfloat16 support
—
Daniel Arndt /
githubweb
- Fix generated Makefile when using gnu_generate_makefile.sh and make >= 4.3
—
crtrott /
githubweb
- Threads: add missing broadcast to TeamThreadRange parallel_scan (#6601)
—
noreply /
githubweb
- Added missing operator* to NEON simd
—
crtrott /
githubweb
- [ci skip] Update changelog on develop for 4.2.00 (#6592)
—
noreply /
githubweb
- Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF (#6593)
—
noreply /
githubweb
- try fix
—
fnrizzi /
githubweb
- avoid pyt package
—
fnrizzi /
githubweb
- try
—
fnrizzi /
githubweb
- fix for macos
—
fnrizzi /
githubweb
- remove comments
—
fnrizzi /
githubweb
- use reference
—
crtrott /
githubweb
- add branching
—
fnrizzi /
githubweb
- [ci skip] fix formatting
—
cezary.skrzynski /
githubweb
- GitHub Workflows: Use Ubuntu 22.04 instead of Fedora for Intel compiler testing
—
Daniel Arndt /
githubweb
- nvcc(wrapper): adding missing `--generate-line-info` arg
—
romin.tomasetti /
githubweb
- Add clang-format check to GitHub workflows (#6612)
—
noreply /
githubweb
- graph(HIP): adding inline keyword to fix #6623
—
romin.tomasetti /
githubweb
- Add jenkins multibranch pipeline options
—
Bruno Turcksin /
githubweb
- kokkos(profiling): do not finalize in any backend
—
romin.tomasetti /
githubweb
- Replace ubuntu:18.04 with ubuntu:20.04 as base image for clang-format
—
Bruno Turcksin /
githubweb
- Disabling OpenACC in the CI because it emits too many warnings
—
Bruno Turcksin /
githubweb
- unorderedmap: modernize traits
—
romin.tomasetti /
githubweb
- nvcc wrapper: remove troubling flag to fix 6628 (#6629)
—
noreply /
githubweb