Publication Date
May 2025
First Advisor
Michael Robson
Document Type
Honors Project
Degree Name
Bachelor of Arts
Department
Computer Science
Keywords
large language model, parallel program, high performance computing, artificial intelligence, performance testing, basic linear algebra subprograms (BLAS)
Abstract
Large Language Models (LLMs) have demonstrated strong capabilities in code generation for general-purpose programming, but their potential in the domain of High-Performance Computing (HPC) remains underexplored. This thesis investigates whether LLMs can effectively generate parallel programs suitable for HPC tasks by evaluating their performance across multiple models and parallel architectures. Using three representative parallel programming tasks (vector summation, SAXPY, and MPI-based averaging), we benchmarked code generated by eight LLMs (GPT-2, GPT-Neo, HPC-Coder, PolyCoder, Meta-LLaMA, o1-mini, Claude Sonnet 3.5, Gemini-Pro 1.5) against three baseline implementations: ATLAS (Auto-Tuned Linear Algebra Software), OpenMP (a hand-written naive approach), and OpenBLAS (an optimized implementation). Our evaluation covers generation time, code compilation readiness, and runtime performance across different computing environments, including machines with varying core counts and memory capacities. Our results show that LLMs can produce syntactically correct and compilable code with performance comparable to traditional handwritten baseline implementations. Despite variability in output across models, the generated programs consistently demonstrate potential for practical use in parallel computing tasks. These findings suggest that LLMs can be valuable tools for accelerating HPC development workflows. In addition to speeding up code generation, LLMs also reduce the manual effort required to port HPC codes across systems or to establish reasonable baseline implementations for optimization, making them useful entry points for further performance tuning.
This work lays the foundation for further research into model fine-tuning, prompt engineering, and expanding LLM capabilities for domain-specific code generation in scientific computing, while also providing a reproducible pipeline for evaluating LLMs on parallel code generation and their scalability across diverse computing environments.
Rights
©2025 Ramsha Rauf. Access limited to the Smith College community and other researchers while on campus. Smith College community members also may access from off-campus using a Smith College log-in. Other off-campus researchers may request a copy through Interlibrary Loan for personal use.
Language
English
Recommended Citation
Rauf, Ramsha, "Evaluation of Large Language Models for Parallel Program Generation" (2025). Honors Project, Smith College, Northampton, MA.
https://scholarworks.smith.edu/theses/2760

Comments
[9], 69, [22] pages: color charts. Includes bibliographical references (pages [70-72]).