Katherine T. Halvorsen
Bachelor of Arts
Statistical and Data Sciences
Metabolomic analysis, Machine learning, Prediction modeling, Lasso
Metabolites are small biological molecules involved in converting food to energy and in generating new cells. Metabolomics reveals unique features of cancer that genomics cannot provide. Current metabolomic research is limited by the number of metabolites that a study measures. Our goal is to predict unidentified metabolites. We used data from eight studies across six cancer types: renal cell carcinoma, breast cancer, Hürthle cell carcinoma of the thyroid, diffuse large B-cell lymphoma, pancreatic cancer, and prostate cancer. We built prediction models using two methods, Principal Component Regression (PCR) and the Least Absolute Shrinkage and Selection Operator (Lasso). We evaluated the models on existing data and achieved robust performance. Prediction models for a portion of the metabolites transferred successfully to metabolites from an unseen cancer type or study.
©2020 Ziwei Zang. Access limited to the Smith College community and other researchers while on campus. Smith College community members also may access from off-campus using a Smith College log-in. Other off-campus researchers may request a copy through Interlibrary Loan for personal use.
Zang, Ziwei, "Metabolites prediction in multiple studies using machine learning" (2020). Honors Project, Smith College, Northampton, MA.