Apan Qasem and Hartwig Anzt and Eduard Ayguade and Katharine Cahill and Ramon Canal and Jany Chan and Eric Fosler-Lussier and Fritz Gobel and Arpan Jain and Marcel Koch and Mateusz Kuzak and Josep Llosa and Raghu Machiraju and Xavier Martorell and Pratik Nayak and Shameema Oottikkal and Marcin Ostasz and Dhabaleswar K. Panda and Dirk Pleiter and Rajiv Ramnath and Maria-Ribera Sancho and Alessio Sclocco and Aamir Shafi and Hanno Spreeuw and Hari Subramoni and Karen Tomko. Lightning Talks of EduHPC 2022. 10th IEEE/ACM Workshop on Education for High Performance Computing, EduHPC@SC 2022, Dallas, TX, USA, November 14, 2022, 2022
David Bunde and Kishwar Ahmed and Sridevi Ayloo and Tisha Brown-Gaines and Joel Fuentes and Vishwesh Jatala and Ruth Kurniawat and Isil Oz and Apan Qasem and Philip J. Schielke and Mary C. Tedeschi and Thomas Y. Yeh. Adopting Heterogeneous Computing Modules: Experiences from a ToUCH Summer Workshop. 10th IEEE/ACM Workshop on Education for High Performance Computing, EduHPC@SC 2022, Dallas, TX, USA, November 14, 2022, 2022
Rafi, Md Erfanul Haque and Qasem, Apan. Optimal Launch Bound Selection in CPU-GPU Hybrid Graph Applications with Deep Learning. 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC), 2022
Haque Rafi, Md Erfanul and Williams, Kaylee and Qasem, Apan. Raptor: Mitigating CPU-GPU False Sharing Under Unified Memory Systems. 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC), 2022
Apan Qasem. YODA: A Pedagogical Tool for Teaching Systems Concepts. SIGCSE 2022: The 53rd ACM Technical Symposium on Computer Science Education, Providence, RI, USA, March 3-5, 2022, Volume 1, 2022
Apan Qasem and David P. Bunde. Heterogeneous Computing for Undergraduates: Introducing the ToUCH Module Repository. SIGCSE 2022: The 53rd ACM Technical Symposium on Computer Science Education, Providence, RI, USA, March 3-5, 2022, Volume 2, 2022
Blake Ford and Ziliang Zong and Apan Qasem and Jelena Tesic. Migrating Software from x86 to ARM Architecture: An Instruction Prediction Approach. 2021 IEEE International Conference on Networking, Architecture and Storage, NAS 2021, Riverside, CA, USA, October 24-26, 2021, 2021
Jacob M. Hope and Mikel Gjergji and Johana Di Girolamo and Marco A. Alvarez and Apan Qasem. Characterizing Input-sensitivity in Tightly-Coupled Collaborative Graph Algorithms. 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021, Melbourne, Australia, May 10-13, 2021, 2021
Bunde, David P. and Qasem, Apan and Schielke, Philip. Teaching about Heterogeneous Computing. Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, 2021
Sultana, Tanzima and Allen, Blake and Qasem, Apan. Intelligent Data Placement on Discrete GPU Nodes with Unified Memory. Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, 2020
Apan Qasem. A Gentle Introduction to Heterogeneous Computing for CS1 Students. 2019 IEEE/ACM Workshop on Education for High-Performance Computing (EduHPC) co-located with SC19, 2019
Jacob M. Hope and Trisha Nag and Apan Qasem. Energy-Efficient GPU Graph Processing with On-Demand Page Migration. Tenth International Green and Sustainable Computing Conference, IGSC 2019, Alexandria, VA, USA, October 21-24, 2019, 2019
Md Syadus Sefat and Semih Aslan and Jeffrey W. Kellington and Apan Qasem. Accelerating HotSpots in Deep Neural Networks on a CAPI-Based FPGA. 21st IEEE International Conference on High Performance Computing and Communications (HPCC19), 2019
Apan Qasem and Clara Novoa and Chandra Kolla and Samantha Coyle. High-Accuracy Scalable Solutions to the Dynamic Facility Layout Problem [Extended Abstract]. 31st International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC18), 2018
Md Syadus Sefat and Semih Aslan and Apan Qasem. Hardware Acceleration of CNNs with Coherent FPGAs [Extended Abstract]. 31st International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC18), 2018
Apan Qasem and Ashwin M. Aji and Michael L. Chu. Investigating Data Layout Transformations in Chapel. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2018, Vancouver, BC, Canada, May 21-25, 2018, 2018
Apan Qasem and Samuel Teich. Evaluating the impact of data layout and placement on the energy efficiency of heterogeneous applications. Eighth International Green and Sustainable Computing Conference, IGSC 2017, Orlando, FL, USA, October 23-25, 2017, 2017
Apan Qasem and Samuel Teich. Mitigating register pressure in GPU kernels for improved energy efficiency. Eighth International Green and Sustainable Computing Conference, IGSC 2017, Orlando, FL, USA, October 23-25, 2017, 2017
Tiffany A. Connors and Apan Qasem. Automatically Selecting Profitable Thread Block Sizes for Accelerated Kernels. 19th IEEE International Conference on High Performance Computing and Communications (HPCC17), 2017
Biplab Saha and Tiffany A. Connors and Saami Rahman and Apan Qasem. A Machine Learning Approach to Automatic Creation of Architecture-sensitive Performance Heuristics. 19th IEEE International Conference on High Performance Computing and Communications (HPCC17), 2017
Apan Qasem and Ashwin Aji and Gregory Rodgers. Characterizing Data Organization Effects on Heterogeneous Memory Architectures. Proceedings of the International Symposium on Code Generation and Optimization (CGO), 2017
Tiffany Connors and Apan Qasem. Power-performance Analysis of Metaheuristic Search Algorithms on the GPU. International Workshop on Green Programming, Computing and Data Processing (GPCDP15), 2015
Claudia Alvarado and Dan Tamir and Apan Qasem. Realizing Energy-efficient Thread Affinity Configurations with Supervised Learning. Sixth International Green and Sustainable Computing Conference (IGSC15), 2015
Mario Gutierrez and Saami Rahman and Apan Qasem. Neural Network Methods for Fast and Portable Prediction of CPU Power Consumption. Sixth International Green and Sustainable Computing Conference (IGSC15), 2015
Biplab Kumar Saha and Saami Rahman and Apan Qasem. MLTUNE: A Tool-chain for Automating the Workflow of Machine-Learning Based Performance Tuning (Extended Poster Abstract). 28th International Conference on High Performance Computing, Networking, Storage and Analysis (SC15), 2015
Saami Rahman and Apan Qasem. Investigating Prefetch Potential on the Xeon Phi with Autotuning (Extended Poster Abstract). 28th International Conference on High Performance Computing, Networking, Storage and Analysis (SC15), 2015
Claudia Alvarado and Dan Tamir and Apan Qasem. Energy-Efficient Thread Migration via Dynamic Characterization of Resource Utilization. International Multi-Conference on Computing in the Global Information Technology (ICCGI), 2015
Mario Gutierrez and Dan Tamir and Apan Qasem. Evaluating Neural Network Methods for PMC-based CPU Power Prediction. International Multi-Conference on Computing in the Global Information Technology (ICCGI), 2015
Saami Rahman and Martin Burtscher and Ziliang Zong and Apan Qasem. Maximizing Hardware Prefetch Effectiveness with Machine Learning. 17th IEEE International Conference on High Performance Computing and Communications (HPCC15), 2015
Abhilash Chaparala and Clara Novoa and Apan Qasem. Autotuning GPU-accelerated QAP Solvers for Power and Performance. 17th IEEE International Conference on High Performance Computing and Communications (HPCC15), 2015
Saeed Taheri and Apan Qasem and Martin Burtscher. A Tool for Automatically Suggesting Source-Code Optimizations for Complex GPU Kernels. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), 2015
Clara Novoa and Apan Qasem and Abhilash Chaparala. A SIMD tabu search implementation for solving the quadratic assignment problem with GPU acceleration. Fourth Annual XSEDE Conference (XSEDE15), 2015
Martin Burtscher and Wuxu Peng and Apan Qasem and Hongchi Shi and Dan Tamir and Heather Thiry. A Module-based Approach to Adopting the 2013 ACM Curricular Recommendations on Parallel Computing. Proceedings of the 36th SIGCSE Technical Symposium on Computer Science Education (SIGCSE15), 2015
Apan Qasem. Exposing Undergraduates to Parallel Performance Concepts with a Three-Module Sequence. Workshop on Education for High-Performance Computing (EDUHPC14, educator workshop at SC14), 2014
Abhilash Chaparala and Clara Novoa and Apan Qasem. A SIMD Solution for the Quadratic Assignment Problem with GPU Acceleration. Third Annual XSEDE Conference (XSEDE14), 2014
Hyatt, Christopher and LaKomski, Greg and Alvarado, Claudia and Hay, Richard and Qasem, Apan and Tamir, Dan. Power Aware Task Matching and Migration in Heterogeneous Processing Environments. International Conference on Computational Science and Computational Intelligence (CSCI14), 2014
Claudia Alvarado and Dan Tamir and Apan Qasem. Dynamic Feedback-Driven Thread Migration for Energy-Efficient Execution of Multithreaded Workloads. Proceedings of the 16th Annual TECHCON Conference, 2014
Shwetha Shankar and Greg Lakomski and Claudia Alvarado and Richard Hay and Christopher Hyatt and Dan Tamir and Apan Qasem. Power Aware Work Stealing in Homogeneous Multicore Systems. The Sixth International Conference on Future Computational Technologies and Applications, 2014
Martin Burtscher and Wuxu Peng and Apan Qasem and Hongchi Shi and Dan Tamir and Heather Thiry. Integrating Parallel Computing into the Undergraduate Curriculum at Texas State University: Experiences from the First Year. Workshop on Parallel, Distributed, and High-Performance Computing in Undergraduate Curricula (EDUPDHPC13, educator workshop at SC13), 2013
Martin Burtscher and Wuxu Peng and Apan Qasem and Hongchi Shi and Dan Tamir. Preparing Computer Science Students for an Increasingly Parallel World: Teaching Parallel Computing Early and Often (Extended poster abstract). 26th International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), 2013
Saami Rahman and Richard Hay and Apan Qasem. Enhancing Learning-based Autotuning with Composite and Diagnostic Feature Vectors (Extended poster abstract). 26th International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), 2013
Hammad Rashid and Richard Hay and Clara Novoa and Apan Qasem. Algorithmic Choice in Optimization Problems: A Performance Study (Extended poster abstract). 26th International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), 2013
Jim Holt and George Bazzera and Apan Qasem and Jason Miller and Henry Hoffmann. A Pattern Language for Adaptive Parallel Software. Proceedings of the 20th International Conference on Pattern Languages of Programs, 2013
Christopher R Hyatt and Greg R. LaKomski and Dan Tamir and Apan Qasem. Power Aware Task Matching and Migration in Heterogeneous Processing Environments. Proceedings of the 15th Annual TECHCON Conference, 2013
Shwetha Shankar and Dan Tamir and Apan Qasem. Towards an Operating System Based Framework for Energy-Efficient Scheduling of Parallel Workloads. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), 2013
Apan Qasem and Michael Jason Cade and Dan Tamir. Improved Energy Efficiency for Multithreaded Kernels Through Model-based Autotuning. Proceedings of the 2012 IEEE Green Technology Conference (GTC12), 2012
Swapneela Unkule and Christopher Shaltz and Apan Qasem. Automatic Restructuring of GPU Kernels for Exploiting Inter-thread Data Locality. Proc. Int'l. Conf. on Compiler Construction (CC12), 2012
Apan Qasem. Efficient execution of time-step computations with pipelined parallelism and inter-thread data locality optimizations. Proceedings of the 2012 PPOPP International Workshop on Programming Models and Applications for Multicore and Manycores (PMAM12), 2012
Apan Qasem and Dan Tamir. Memory Performance Diagnosis Through Feedback Synthesis. Proceedings of the Workshop on Feedback-Directed Compiler Optimization for Multi-Core Architectures (COMA12, a HiPEAC workshop), 2012
Faizur Rahman and Qing Yi and Apan Qasem. Understanding stencil code performance on multicore architectures. Conf. Computing Frontiers (CF11), 2011
Swapneela Unkule and Apan Qasem. Register pressure aware code transformations on GPU. 24th International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC11), 2011
Clara Novoa and Apan Qasem and Hammad Rashid and Mark McKenney. Dynamic Programming Solutions for the Integral Knapsack Problem on Multicore Architectures (Extended Abstract). 11th INFORMS Computing Society Conference, (ICS11), 2011
Santosh Sarangkar and Apan Qasem. Intelligent Feedback For Fast and Effective Autotuning (Extended Poster Abstract). 23rd International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC10), 2010
Qing Yi and Jichi Guo and Apan Qasem. Evaluating the Role of Optimization-Specific Search Heuristics in Effective Autotuning (short paper). 23rd International Workshop on Languages and Compilers for Parallel Computing (LCPC10), 2010
Apan Qasem. Locality-Conscious Superpaging for Improved TLB Behavior of Stencil Computations. Proceedings of the 2010 International Conference on High Performance Computing Systems (HPCS10), 2010
Qing Yi and Santosh Sarangkar and Apan Qasem. Improving Autotuning Efficiency and Portability Through Feedback Diagnostics. Proceedings of the Fifth International Workshop on Automatic Performance Tuning (iWAPT10), 2010
Hammad Rashid and Clara Novoa and Apan Qasem. An Evaluation of Parallel Knapsack Algorithms on Multicore Architectures. Proceedings of the 2010 International Conference on Scientific Computing (CSC10), 2010
Santosh Sarangkar and Apan Qasem. Restructuring parallel loops to curb false sharing on multicore architectures. 24th IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW10), 2010
Apan Qasem and Jichi Guo and Faizur Rahman and Qing Yi. Exposing Tunable Parameters in Multi-threaded Numerical Code. Network and Parallel Computing, IFIP International Conference, (NPC10), 2010
Joshua Magee and Apan Qasem. A case for compiler-driven superpage allocation. Proceedings of the 47th Annual Southeast Regional Conference, (ACMSE09), 2009
Michael Jason Cade and Apan Qasem. Balancing Locality and Parallelism on Shared-cache Multi-core Systems. 11th IEEE International Conference on High Performance Computing and Communications (HPCC09), 2009
Qing Yi and Apan Qasem. Exploring the Optimization Space of Dense Linear Algebra Kernels. 21st International Workshop on Languages and Compilers for Parallel Computing (LCPC08), 2008
Apan Qasem. Evaluating an Early-stop Criterion and a Statistical Pruning Strategy of the Optimization Search Space. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), 2008
Apan Qasem and Ken Kennedy. Pruning the Optimization Search Space Using Architecture-aware Cost Models. Proceedings of the First Workshop on Statistical and Machine Learning Approaches Applied to Architecture and Compilation (SMART07), 2007
Apan Qasem and Ken Kennedy. Profitable loop fusion and tiling using model-driven empirical search. Proceedings of the 20th Annual International Conference on Supercomputing (ICS), 2006
Apan Qasem and Ken Kennedy. A Cache-Conscious Profitability Model for Empirical Tuning of Loop Fusion. 18th International Workshop on Languages and Compilers for Parallel Computing, (LCPC), 2005
Apan Qasem and Ken Kennedy and John Mellor-Crummey. Automatic Tuning of Whole Applications Using Direct Search and a Performance-based Transformation System. Proceedings of the Los Alamos Computer Science Institute 5th Annual Symposium (LACSI04), 2004
Robert Fowler and John Mellor-Crummey and Guohua Jin and Apan Qasem. A Source-to-source Loop Transformation Tool (Extended poster abstract). Proceedings of the Los Alamos Computer Science Institute 3rd Annual Symposium (LACSI02), 2002
Apan Qasem and David B. Whalley and Xin Yuan and Robert van Engelen. Using a Swap Instruction to Coalesce Loads and Stores. 7th International Euro-Par Conference Parallel Processing, (EuroPar01), 2001