Parallel Computing: Highly Cited Articles

Highly Cited Articles

Article title	Citations
Multithreaded sparse matrix-matrix multiplication for many-core and GPU architectures	8
Optimizations of the eigensolvers in the ELPA library	7
Batched QR and SVD algorithms on GPUs with applications in hierarchical matrix compression	7
DVFS-aware application classification to improve GPGPUs energy efficiency	5
Accelerating the SVD two stage bidiagonal reduction and divide and conquer using GPUs	4
Comparing load-balancing algorithms for MapReduce under Zipfian data skews	4
Proteus: Exploiting precision variability in deep neural networks	3
SAGE: Percipient Storage for Exascale Data Centric Computing	3
Manila: Using a densely populated PMC-space for power modelling within large-scale systems	3
Performance optimization, modeling and analysis of sparse matrix-matrix products on multi-core and many-core processors	3
Benchmarking the GPU memory at the warp level	3
Performance of asynchronous optimized Schwarz with one-sided communication	3
IR+: Removing parallel I/O interference of MPI programs via data replication over heterogeneous storage devices	3
Exponential integrators with parallel-in-time rational approximations for the shallow-water equations on the rotating sphere	3
A distributed-memory hierarchical solver for general sparse linear systems	3
Distributed ant colony optimization based on actor model	2
PSelInv - A distributed memory parallel algorithm for selected inversion: The non-symmetric case	2
A hybrid CPU/GPU approach for optimizing sorting throughput	2
Characterizing the performance benefit of hybrid memory system for HPC applications	2
Overcoming the No Free Lunch Theorem in Cut-off Algorithms for Fork-Join programs	2
The time and energy efficiency of modern multicore systems	2
Evaluating the SW26010 many-core processor with a micro-benchmark suite for performance optimizations	2
Microwave tomographic imaging of cerebrovascular accidents by using high-performance computing	2
Incomplete Sparse Approximate Inverses for Parallel Preconditioning	2
PMIx: Process management for exascale environments	2
Machine Learning in Multi-Agent Systems using Associative Arrays	2
Optimized large-message broadcast for deep learning workloads: MPI, MPI+NCCL, or NCCL2?	2
Integrating blocking and non-blocking MPI primitives with task-based programming models	2
Utility-based resource management in an oversubscribed energy-constrained heterogeneous environment executing parallel applications	2
A comparative evaluation of three volume rendering libraries for the visualization of sheared thermal convection	1
Targeting GPUs with OpenMP directives on Summit: A simple and effective Fortran experience	1
Searching for common patterns on protein sequences by means of a parallel hybrid honey-bee mating optimization algorithm	1
Client-side straggler-aware I/O scheduler for object-based parallel file systems	1
Concurrency of three-dimensional refined isogeometric analysis	1
Parallel eigenvalue computation for banded generalized eigenvalue problems	1
A time-stamping system to detect memory consistency errors in MPI one-sided applications	1
Characterizing MPI matching via trace-based simulation	1
Hybrid parallelization of a multi-tree path search algorithm: Application to highly-flexible biomolecules	1
Petascale scramjet combustion simulation on the Tianhe-2 heterogeneous supercomputer	1
Comparing the performance of rigid, moldable and grid-shaped applications on failure-prone HPC platforms	1
Accelerating the task/data-parallel version of ILUPACK's BiCG in multi-CPU/GPU configurations	1
GeneaLog: Fine-grained data streaming provenance in cyber-physical systems	1
Computation of the 100 quadrillionth hexadecimal digit of pi on a cluster of Intel Xeon Phi processors	1
Introducing the explicitly many-processor approach	1
Parallel accelerated vector similarity calculations for genomics applications	1
Practical, distributed, low overhead algorithms for irregular gather and scatter collectives	1
Superlinear speedup phenomenon in parallel 3D Discrete Element Method (DEM) simulations of complex-shaped particles	1
Data staging for efficient high throughput stream processing	1
Exploring stream parallel patterns in distributed MPI environments	1
The OpenACC data model: Preliminary study on its major challenges and implementations	1
