List of works - Jesus Labarta - Dominio Público Uruguay

2. Performance Analysis: From Art to Science

A Job Scheduling Approach for Multi-core Clusters Based on Virtual Malleability

article published in 2012

A Proposal for Error Handling in OpenMP

article by Alejandro Duran et al published 28 June 2007 in International Journal of Parallel Programming

A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures

article

A Proposal to Extend the OpenMP Tasking Model with Dependent Tasks

article published in 2009

A Simulation Framework to Automatically Analyze the Communication-Computation Overlap in Scientific Applications

A Simulation of Seismic Wave Propagation at High Resolution in the Inner Core of the Earth on 2166 Processors of MareNostrum

A Study of Speculative Distributed Scheduling on the Cell/B.E.

A dependency-aware task-based programming environment for multi-core architectures

A high-productivity task-based programming model for clusters

ALOJA: A systematic study of Hadoop deployment variables to enable automated characterization of cost-effectiveness

AMA: Asynchronous Management of Accelerators for Task-based Programming Models

Align and distribute-based linear loop transformations

An Evaluation of Marenostrum Performance

article

An Expert Assistant for Computer Aided Parallelization

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs

Analyzing reference patterns in automatic data distribution tools

Analyzing scheduling policies using Dimemas

Another approach to backfilled jobs

Artificial Intelligence to Identify Retinal Fundus Images, Quality Validation, Laterality Evaluation, Macular Degeneration, and Suspected Glaucoma

scientific article published in 2020

Asynchronous and Exact Forward Recovery for Detected Errors in Iterative Solvers

scholarly article by Luc Jaulmes et al published 1 September 2018 in IEEE Transactions on Parallel and Distributed Systems

Automatic Evaluation of the Computation Structure of Parallel Applications

Automatic Exploration of Potential Parallelism in Sequential Applications

Automatic Grid workflow based on imperative programming languages

Automatic Phase Detection and Structure Extraction of MPI Applications

Automatic Refinement of Parallel Applications Structure Detection

Automatic analysis of speedup of MPI applications

article published in 2008

Automatic detection of parallel applications computation phases

BSC Vision Towards Exascale

BSLD threshold driven power management policy for HPC centers

Balancing HPC applications through smart allocation of resources in MT processors

Bio-Inspired Call-Stack Reconstruction for Performance Analysis

article

Boosting irregular array Reductions through In-lined Block-ordering on fast processors

CATA: Criticality Aware Task Acceleration for Multicore Processors

CRC-Based Memory Reliability for Task-Parallel HPC Applications

CellSs: Scheduling Techniques to Better Exploit Memory Hierarchy

CellSs: a Programming Model for the Cell BE Architecture

article

ClusterSs

Contention-aware node allocation policy for high-performance capacity systems

Criticality-Aware Dynamic Task Scheduling for Heterogeneous Architectures

Data Distribution Strategies for Domain Decomposition Applications in Grid Environments

Data distribution and loop parallelization for shared-memory multiprocessors

Detailed Load Balance Analysis of Large Scale Parallel Applications

Detailed Performance Analysis Using Coarse Grain Sampling

article

Detailed and simultaneous power and performance analysis

article

Dual-Level Parallelism Exploitation with OpenMP in Coastal Ocean Circulation Modeling

Dynamic task scheduling in distributed real time systems using fuzzy rules

Effective Quality-of-Service Policy for Capacity High-Performance Computing Systems

article

Effective communication and computation overlap with hybrid MPI/SMPSs

scholarly article published 2010

Effective communication and computation overlap with hybrid MPI/SMPSs

Evaluating the Impact of OpenMP 4.0 Extensions on Relevant Parallel Workloads

Exploiting Locality on the Cell/B.E. through Bypassing

Exploiting asynchrony from exact forward recovery for DUE in iterative solvers

Exploiting parallelism through directives on the nano-threads programming model

Exploring dynamic parallelism in OpenMP

article

Exploring pattern-aware routing in generalized fat tree networks

article

Extending OpenMP to Survive the Heterogeneous Multi-Core Era

Extracting the optimal sampling frequency of applications using spectral analysis

Fault-Tolerant Protocol for Hybrid Task-Parallel Message-Passing Applications

Folding: Detailed Analysis with Coarse Sampling

article

Framework for a productive performance optimization

article

Graph-Based Task Replication for Workflow Applications

Guided Performance Analysis Combining Profile and Trace Tools

Handling task dependencies under strided and aliased references

scholarly article published 2010

Hierarchical Task-Based Programming With StarSs

Hints to improve automatic load balancing with LeWI for hybrid applications

Identifying Code Phases Using Piece-Wise Linear Regressions

article

Impact of Inter-application Contention in Current and Future HPC Systems

article

Impact of the Memory Hierarchy on Shared Memory Architectures in Multicore Programming Models

Implementing OmpSs support for regions of data in architectures with multiple address spaces

Improving the Integration of Task Nesting and Dependencies in OpenMP

Improving the Interoperability between MPI and Task-Based Programming Models

Including SMP in Grids as Execution Platform and Other Extensions in GRID Superscalar

Integration of the Enanos Execution Framework with GRMS

Just-in-Time Renaming and Lazy Write-Back on the Cell/B.E.

LeWI: A Runtime Balancing Algorithm for Nested Parallelism

Linear programming based parallel job scheduling for power constrained systems

scholarly article published July 2011

Low-Overhead Detection of Memory Access Patterns and Their Time Evolution

article

MUSA: A Multi-level Simulation Approach for Next-Generation HPC Machines

Making the Best of Temporal Locality: Just-in-Time Renaming and Lazy Write-Back on the Cell/B.E.

Marriage Between Coordinated and Uncoordinated Checkpointing for the Exascale Era

Memory---CellSs

Monitoring and Analysis Framework for Grid Middleware

Monitoring and analysing a Grid Middleware Node

Multiple Target Task Sharing Support for the OpenMP Accelerator Model

NanoCheckpoints: A Task-Based Asynchronous Dataflow Framework for Efficient and Scalable Checkpoint/Restart

Noise Inspector Tool

Oblivious routing schemes in extended generalized Fat Tree networks

article

On the Instrumentation of OpenMP and OmpSs Tasking Constructs

article

On the trade-off of mixing scientific applications on capacity high-performance computing systems

article

On the usefulness of object tracking techniques in performance analysis

article

On-line detection of large-scale parallel application's structure

On-the-Fly Adaptive Routing in High-Radix Hierarchical Networks

scholarly article published September 2012

OpenMP Extensions for Thread Groups and Their Run-Time Support

Optimizing job performance under a given power constraint in HPC centers

Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL

article published in 2011

Overlapping communication and computation by using a hybrid MPI/SMPSs approach

scholarly article published 2010

PARSECSs

POSTER

Page Migration with Dynamic Space-Sharing Scheduling Policies: The Case of the SGI O2000

ParaView + Alya + D8tree: Integrating High Performance Computing and High Performance Data Analytics

Parallel Implementation of the Integral Histogram

Parallelizing dense and banded linear algebra libraries using SMPSs

scholarly article by Rosa M. Badia et al published 22 July 2009 in Concurrency and Computation: Practice and Experience

Performance Analysis and Parallelization Strategies in Neuron Simulation Codes

Performance Analysis of Domain Decomposition Applications Using Unbalanced Strategies in Grid Environments

Performance Data Extrapolation in Parallel Codes

Performance Visualization Of Grid Applications Based On OCM-G And Paraver

scholarly article published 2008

Poster

Power-aware load balancing of large scale MPI applications

Prediction of behavior of MPI applications

Productive Cluster Programming with OmpSs

Productive Programming of GPU Clusters with OmpSs

Programmability Issues

Programmability and portability for exascale: Top down programming methodology and tools with StarSs

Programmable and Scalable Reductions on Clusters

Programmer-directed partial redundancy for resilient HPC

Programming Grid Applications with GRID Superscalar

article

PyCOMPSs: Parallel computational workflows in Python

Quantifying the Potential Task-Based Dataflow Parallelism in MPI Applications

Quiet Neighborhoods: Key to Protect Job Performance Predictability

article

Reducing Cache Coherence Traffic with Hierarchical Directory Cache and NUMA-Aware Runtime Scheduling

Runtime Address Space Computation for SDSM Systems

Runtime Parallelization of the Finite Element Code Permas

Runtime-Aware Architectures

Runtime-Guided Management of Scratchpad Memories in Multicore Architectures

Runtime-Guided Mitigation of Manufacturing Variability in Power-Constrained Multi-Socket NUMA Nodes

scholarly article published 2016

SSMART

Scheduler-Activated Dynamic Page Migration for Multiprogrammed DSM Multiprocessors

Scheduling parallel jobs on multicore clusters using CPU oversubscription

Self-Adaptive OmpSs Tasks in Heterogeneous Environments

Short Reasons for Long Vectors in HPC CPUs: A Study Based on RISC-V

scientific article published in 2023

Simulating Whole Supercomputer Applications

Simulation environment for studying overlap of communication and computation

Spark deployment and performance evaluation on the MareNostrum supercomputer

Sparse Matrix Structure for Dynamic Parallelisation Efficiency

Spatial Support Vector Regression to Detect Silent Errors in the Exascale Era

Supercomputing for the Future, Supercomputing from the Past (Keynote)

Supporting Adaptive Privatization Techniques for Irregular Array Reductions in Task-Parallel Programming Models

Tareador

article

Task Superscalar: An Out-of-Order Task Pipeline

Task-based programming in COMPSs to converge from HPC to big data

The Impact of Application's Micro-Imbalance on the Communication-Computation Overlap

The International Exascale Software Project roadmap

The Mont-Blanc Prototype: An Alternative Approach for HPC Systems

The Network Adapter: The Missing Link between MPI Applications and Network Performance

article

Tools for Power-Energy Modelling and Analysis of Parallel Scientific Applications

Topic 1: Support Tools and Environments

Towards Task-Parallel Reductions in OpenMP

article published in 2015

Trace Spectral Analysis toward Dynamic Levels of Detail

Transactional Memory and OpenMP

Unveiling Internal Evolution of Parallel Application Computation Phases

Utilization driven power-aware parallel job scheduling

Variable Batched DGEMM

cuHinesBatch: Solving Multiple Hines systems on GPUs Human Brain Project

article
