Apan Qasem and Hartwig Anzt and Eduard Ayguade and Katharine Cahill and Ramon Canal and Jany Chan and Eric Fosler-Lussier and Fritz Gobel and Arpan Jain and Marcel Koch and Mateusz Kuzak and Josep Llosa and Raghu Machiraju and Xavier Martorell and Pratik Nayak and Shameema Oottikkal and Marcin Ostasz and Dhabaleswar K. Panda and Dirk Pleiter and Rajiv Ramnath and Maria-Ribera Sancho and Alessio Sclocco and Aamir Shafi and Hanno Spreeuw and Hari Subramoni and Karen Tomko. Lightning Talks of EduHPC 2022. 10th IEEE/ACM Workshop on Education for High Performance Computing, EduHPC@SC 2022, Dallas, TX, USA, November 14, 2022

David Bunde and Kishwar Ahmed and Sridevi Ayloo and Tisha Brown-Gaines and Joel Fuentes and Vishwesh Jatala and Ruth Kurniawat and Isil Oz and Apan Qasem and Philip J. Schielke and Mary C. Tedeschi and Thomas Y. Yeh. Adopting Heterogeneous Computing Modules: Experiences from a ToUCH Summer Workshop. 10th IEEE/ACM Workshop on Education for High Performance Computing, EduHPC@SC 2022, Dallas, TX, USA, November 14, 2022

Md Erfanul Haque Rafi and Apan Qasem. Optimal Launch Bound Selection in CPU-GPU Hybrid Graph Applications with Deep Learning. 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC), 2022

Md Erfanul Haque Rafi and Kaylee Williams and Apan Qasem. Raptor: Mitigating CPU-GPU False Sharing Under Unified Memory Systems. 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC), 2022

Apan Qasem. YODA: A Pedagogical Tool for Teaching Systems Concepts. SIGCSE 2022: The 53rd ACM Technical Symposium on Computer Science Education, Providence, RI, USA, March 3-5, 2022, Volume 1, 2022

Apan Qasem and David P. Bunde. Heterogeneous Computing for Undergraduates: Introducing the ToUCH Module Repository. SIGCSE 2022: The 53rd ACM Technical Symposium on Computer Science Education, Providence, RI, USA, March 3-5, 2022, Volume 2, 2022

Blake Ford and Ziliang Zong and Apan Qasem and Jelena Tesic. Migrating Software from x86 to ARM Architecture: An Instruction Prediction Approach. 2021 IEEE International Conference on Networking, Architecture and Storage, NAS 2021, Riverside, CA, USA, October 24-26, 2021

Jacob M. Hope and Mikel Gjergji and Johana Di Girolamo and Marco A. Alvarez and Apan Qasem. Characterizing Input-sensitivity in Tightly-Coupled Collaborative Graph Algorithms. 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021, Melbourne, Australia, May 10-13, 2021

David P. Bunde and Apan Qasem and Philip Schielke. Teaching about Heterogeneous Computing. Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, 2021

Tanzima Sultana and Blake Allen and Apan Qasem. Intelligent Data Placement on Discrete GPU Nodes with Unified Memory. Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, 2020

Apan Qasem. A Gentle Introduction to Heterogeneous Computing for CS1 Students. 2019 IEEE/ACM Workshop on Education for High-Performance Computing (EduHPC) co-located with SC19, 2019

Jacob M. Hope and Trisha Nag and Apan Qasem. Energy-Efficient GPU Graph Processing with On-Demand Page Migration. Tenth International Green and Sustainable Computing Conference, IGSC 2019, Alexandria, VA, USA, October 21-24, 2019

Md Syadus Sefat and Semih Aslan and Jeffrey W. Kellington and Apan Qasem. Accelerating HotSpots in Deep Neural Networks on a CAPI-Based FPGA. 21st IEEE International Conference on High Performance Computing and Communications (HPCC19), 2019

Apan Qasem and Clara Novoa and Chandra Kolla and Samantha Coyle. High-Accuracy Scalable Solutions to the Dynamic Facility Layout Problem [Extended Abstract]. 31st International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC18), 2018

Md Syadus Sefat and Semih Aslan and Apan Qasem. Hardware Acceleration of CNNs with Coherent FPGAs [Extended Abstract]. 31st International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC18), 2018

Apan Qasem and Ashwin M. Aji and Michael L. Chu. Investigating Data Layout Transformations in Chapel. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2018, Vancouver, BC, Canada, May 21-25, 2018

Apan Qasem and Samuel Teich. Evaluating the impact of data layout and placement on the energy efficiency of heterogeneous applications. Eighth International Green and Sustainable Computing Conference, IGSC 2017, Orlando, FL, USA, October 23-25, 2017

Apan Qasem and Samuel Teich. Mitigating register pressure in GPU kernels for improved energy efficiency. Eighth International Green and Sustainable Computing Conference, IGSC 2017, Orlando, FL, USA, October 23-25, 2017

Tiffany A. Connors and Apan Qasem. Automatically Selecting Profitable Thread Block Sizes for Accelerated Kernels. 19th IEEE International Conference on High Performance Computing and Communications (HPCC17), 2017

Biplab Saha and Tiffany A. Connors and Saami Rahman and Apan Qasem. A Machine Learning Approach to Automatic Creation of Architecture-sensitive Performance Heuristics. 19th IEEE International Conference on High Performance Computing and Communications (HPCC17), 2017

Apan Qasem and Ashwin Aji and Gregory Rodgers. Characterizing Data Organization Effects on Heterogeneous Memory Architectures. Proceedings of the International Symposium on Code Generation and Optimization (CGO), 2017

Tiffany Connors and Apan Qasem. Power-performance Analysis of Metaheuristic Search Algorithms on the GPU. International Workshop on Green Programming, Computing and Data Processing (GPCDP15), 2015

Claudia Alvarado and Dan Tamir and Apan Qasem. Realizing Energy-efficient Thread Affinity Configurations with Supervised Learning. Sixth International Green and Sustainable Computing Conference (IGSC15), 2015

Mario Gutierrez and Saami Rahman and Apan Qasem. Neural Network Methods for Fast and Portable Prediction of CPU Power Consumption. Sixth International Green and Sustainable Computing Conference (IGSC15), 2015

Biplab Kumar Saha and Saami Rahman and Apan Qasem. MLTUNE: A Tool-chain for Automating the Workflow of Machine-Learning Based Performance Tuning (Extended Poster Abstract). 28th International Conference on High Performance Computing, Networking, Storage and Analysis (SC15), 2015

Saami Rahman and Apan Qasem. Investigating Prefetch Potential on the Xeon Phi with Autotuning (Extended Poster Abstract). 28th International Conference on High Performance Computing, Networking, Storage and Analysis (SC15), 2015

Claudia Alvarado and Dan Tamir and Apan Qasem. Energy-Efficient Thread Migration via Dynamic Characterization of Resource Utilization. International Multi-Conference on Computing in the Global Information Technology (ICCGI), 2015

Mario Gutierrez and Dan Tamir and Apan Qasem. Evaluating Neural Network Methods for PMC-based CPU Power Prediction. International Multi-Conference on Computing in the Global Information Technology (ICCGI), 2015

Saami Rahman and Martin Burtscher and Ziliang Zong and Apan Qasem. Maximizing Hardware Prefetch Effectiveness with Machine Learning. 17th IEEE International Conference on High Performance Computing and Communications (HPCC15), 2015

Abhilash Chaparala and Clara Novoa and Apan Qasem. Autotuning GPU-accelerated QAP Solvers for Power and Performance. 17th IEEE International Conference on High Performance Computing and Communications (HPCC15), 2015

Saeed Taheri and Apan Qasem and Martin Burtscher. A Tool for Automatically Suggesting Source-Code Optimizations for Complex GPU Kernels. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), 2015

Clara Novoa and Apan Qasem and Abhilash Chaparala. A SIMD tabu search implementation for solving the quadratic assignment problem with GPU acceleration. Fourth Annual XSEDE Conference (XSEDE15), 2015

Martin Burtscher and Wuxu Peng and Apan Qasem and Hongchi Shi and Dan Tamir and Heather Thiry. A Module-based Approach to Adopting the 2013 ACM Curricular Recommendations on Parallel Computing. Proceedings of the 36th SIGCSE Technical Symposium on Computer Science Education (SIGCSE15), 2015

Apan Qasem. Exposing Undergraduates to Parallel Performance Concepts with a Three-Module Sequence. Workshop on Education for High-Performance Computing (EDUHPC14, educator workshop at SC14), 2014

Abhilash Chaparala and Clara Novoa and Apan Qasem. A SIMD Solution for the Quadratic Assignment Problem with GPU Acceleration. Third Annual XSEDE Conference (XSEDE14), 2014

Christopher Hyatt and Greg LaKomski and Claudia Alvarado and Richard Hay and Apan Qasem and Dan Tamir. Power Aware Task Matching and Migration in Heterogeneous Processing Environments. International Conference on Computational Science and Computational Intelligence (CSCI14), 2014

Claudia Alvarado and Dan Tamir and Apan Qasem. Dynamic Feedback-Driven Thread Migration for Energy-Efficient Execution of Multithreaded Workloads. Proceedings of the 16th Annual TECHCON Conference, 2014

Shwetha Shankar and Greg LaKomski and Claudia Alvarado and Richard Hay and Christopher Hyatt and Dan Tamir and Apan Qasem. Power Aware Work Stealing in Homogeneous Multicore Systems. The Sixth International Conference on Future Computational Technologies and Applications, 2014

Martin Burtscher and Wuxu Peng and Apan Qasem and Hongchi Shi and Dan Tamir and Heather Thiry. Integrating Parallel Computing into the Undergraduate Curriculum at Texas State University: Experiences from the First Year. Workshop on Parallel, Distributed, and High-Performance Computing in Undergraduate Curricula (EDUPDHPC13, educator workshop at SC13), 2013

Martin Burtscher and Wuxu Peng and Apan Qasem and Hongchi Shi and Dan Tamir. Preparing Computer Science Students for an Increasingly Parallel World: Teaching Parallel Computing Early and Often (Extended poster abstract). 26th International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), 2013

Saami Rahman and Richard Hay and Apan Qasem. Enhancing Learning-based Autotuning with Composite and Diagnostic Feature Vectors (Extended poster abstract). 26th International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), 2013

Hammad Rashid and Richard Hay and Clara Novoa and Apan Qasem. Algorithmic Choice in Optimization Problems: A Performance Study (Extended poster abstract). 26th International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), 2013

Jim Holt and George Bazzera and Apan Qasem and Jason Miller and Henry Hoffmann. A Pattern Language for Adaptive Parallel Software. Proceedings of the 20th International Conference on Pattern Languages of Programs, 2013

Christopher R Hyatt and Greg R. LaKomski and Dan Tamir and Apan Qasem. Power Aware Task Matching and Migration in Heterogeneous Processing Environments. Proceedings of the 15th Annual TECHCON Conference, 2013

Shwetha Shankar and Dan Tamir and Apan Qasem. Towards an Operating System Based Framework for Energy-Efficient Scheduling of Parallel Workloads. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), 2013

Apan Qasem and Michael Jason Cade and Dan Tamir. Improved Energy Efficiency for Multithreaded Kernels Through Model-based Autotuning. Proceedings of the 2012 IEEE Green Technology Conference (GTC12), 2012

Swapneela Unkule and Christopher Shaltz and Apan Qasem. Automatic Restructuring of GPU Kernels for Exploiting Inter-thread Data Locality. Proc. Int'l. Conf. on Compiler Construction (CC12), 2012

Apan Qasem. Efficient execution of time-step computations with pipelined parallelism and inter-thread data locality optimizations. Proceedings of the 2012 PPOPP International Workshop on Programming Models and Applications for Multicore and Manycores (PMAM12), 2012

Apan Qasem and Dan Tamir. Memory Performance Diagnosis Through Feedback Synthesis. Proceedings of the Workshop on Feedback-Directed Compiler Optimization for Multi-Core Architectures (COMA12, a HiPEAC workshop), 2012

Faizur Rahman and Qing Yi and Apan Qasem. Understanding stencil code performance on multicore architectures. Conf. Computing Frontiers (CF11), 2011

Swapneela Unkule and Apan Qasem. Register pressure aware code transformations on GPU. 24th International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC11), 2011

Clara Novoa and Apan Qasem and Hammad Rashid and Mark McKenney. Dynamic Programming Solutions for the Integral Knapsack Problem on Multicore Architectures (Extended Abstract). 11th INFORMS Computing Society Conference (ICS11), 2011

Santosh Sarangkar and Apan Qasem. Intelligent Feedback For Fast and Effective Autotuning (Extended Poster Abstract). 23rd International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC10), 2010

Qing Yi and Jichi Guo and Apan Qasem. Evaluating the Role of Optimization-Specific Search Heuristics in Effective Autotuning (short paper). 23rd International Workshop on Languages and Compilers for Parallel Computing (LCPC10), 2010

Apan Qasem. Locality-Conscious Superpaging for Improved TLB Behavior of Stencil Computations. Proceedings of the 2010 International Conference on High Performance Computing Systems (HPCS10), 2010

Qing Yi and Santosh Sarangkar and Apan Qasem. Improving Autotuning Efficiency and Portability Through Feedback Diagnostics. Proceedings of the Fifth International Workshop on Automatic Performance Tuning (iWAPT10), 2010

Hammad Rashid and Clara Novoa and Apan Qasem. An Evaluation of Parallel Knapsack Algorithms on Multicore Architectures. Proceedings of the 2010 International Conference on Scientific Computing (CSC10), 2010

Santosh Sarangkar and Apan Qasem. Restructuring parallel loops to curb false sharing on multicore architectures. 24th IEEE International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum (IPDPSW10), 2010

Apan Qasem and Jichi Guo and Faizur Rahman and Qing Yi. Exposing Tunable Parameters in Multi-threaded Numerical Code. Network and Parallel Computing, IFIP International Conference (NPC10), 2010

Joshua Magee and Apan Qasem. A case for compiler-driven superpage allocation. Proceedings of the 47th Annual Southeast Regional Conference (ACMSE09), 2009

Michael Jason Cade and Apan Qasem. Balancing Locality and Parallelism on Shared-cache Multi-core Systems. 11th IEEE International Conference on High Performance Computing and Communications (HPCC09), 2009

Qing Yi and Apan Qasem. Exploring the Optimization Space of Dense Linear Algebra Kernels. 21st International Workshop on Languages and Compilers for Parallel Computing (LCPC08), 2008

Apan Qasem. Evaluating an Early-stop Criterion and a Statistical Pruning Strategy of the Optimization Search Space. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), 2008

Apan Qasem and Ken Kennedy. Pruning the Optimization Search Space Using Architecture-aware Cost Models. Proceedings of the First Workshop on Statistical and Machine Learning Approaches Applied to Architecture and Compilation (SMART07), 2007

Apan Qasem and Ken Kennedy. Profitable loop fusion and tiling using model-driven empirical search. Proceedings of the 20th Annual International Conference on Supercomputing (ICS), 2006

Apan Qasem and Ken Kennedy. A Cache-Conscious Profitability Model for Empirical Tuning of Loop Fusion. 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC), 2005

Apan Qasem and Ken Kennedy and John Mellor-Crummey. Automatic Tuning of Whole Applications Using Direct Search and a Performance-based Transformation System. Proceedings of the Los Alamos Computer Science Institute 5th Annual Symposium (LACSI04), 2004

Robert Fowler and John Mellor-Crummey and Guohua Jin and Apan Qasem. A Source-to-source Loop Transformation Tool (Extended poster abstract). Proceedings of the Los Alamos Computer Science Institute 3rd Annual Symposium (LACSI02), 2002

Apan Qasem and David B. Whalley and Xin Yuan and Robert van Engelen. Using a Swap Instruction to Coalesce Loads and Stores. 7th International Euro-Par Conference Parallel Processing (EuroPar01), 2001