Projects
If not available on GitHub, code for any project is available upon request.
AlphaZero Implementation
Implemented Google DeepMind’s AlphaZero algorithm in PyTorch to play Tic-Tac-Toe and Connect4.
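For readers unfamiliar with AlphaZero, the search step picks moves with the PUCT rule, which adds an exploration bonus weighted by the network's prior to each move's mean value. The snippet below is a minimal sketch of that rule in plain Python, not the project's code: the names `puct_score` and `select_action`, the tuple layout of `children`, and the default `c_puct=1.5` are all assumptions made for illustration.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Mean value plus a prior-weighted exploration bonus (PUCT).

    c_puct is a hypothetical default chosen for illustration only.
    """
    bonus = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + bonus


def select_action(children, c_puct=1.5):
    """Return the action with the highest PUCT score.

    children maps action -> (mean_value, prior, visit_count); this layout
    is an assumption of the sketch, not the project's data structure.
    """
    parent_visits = sum(visits for _, _, visits in children.values())
    return max(
        children,
        key=lambda a: puct_score(
            children[a][0], children[a][1], parent_visits, children[a][2], c_puct
        ),
    )
```

With equal values and priors, the rule prefers the less-visited move, which is how the search balances exploring new moves against exploiting known good ones.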
Bit String Design with GFlowNets
Implemented a GFlowNet in PyTorch to construct desired bit strings.
Evaluating and Extending Biomedical Image Segmentation Models
COS 429: Computer Vision Final Project
The field of biomedical image segmentation has long been dominated by hand-crafted techniques and mathematical modeling-based approaches. With the rise of deep learning, there has been growing interest in developing segmentation models based on deep learning techniques, which have shown rapid improvements in segmentation accuracy and efficiency. In this project, I evaluate cutting-edge deep-learning-based biomedical image segmentation models and report the detailed training results that many biomedical image segmentation papers omit. I also train these models on the widely used COCO dataset and report those training results as well.
Improving Downstream Task Performance on Beta Lactamase Activity Prediction by Fine-Tuning Protein Language Models
COS 445: Computational Biology Final Project with Keerthana Nallamotu, Hee-Yun Suh, and Nick Sudarsky
In this study, we use the learned protein representations from ProteinBERT and ProtBert to predict the class and activity of beta-lactamase, an enzyme that threatens the effectiveness of antibiotics such as penicillin and cephalosporins, in order to support informed treatment decisions and drug development. In our approach, we append and fine-tune the last multilayer perceptron (MLP) layers of ProteinBERT and ProtBert for two tasks: classification of beta-lactamase enzymes and regression of beta-lactamase activity. For classification, initial results showed 95% accuracy, highlighting the value of ProtBert embeddings in downstream beta-lactamase tasks. For regression, we implemented a simplified model that is competitive with the PEER benchmark: our test metric (Spearman’s rank correlation coefficient) is within 0.02 of the score reported in PEER [8]. We also ran the PEER benchmark code on Princeton’s Della computing cluster for an extensive analysis of PEER’s training. Our work on regression is a replication of the PEER benchmark, and our work on classification extends fine-tuning of ProtBert to a beta-lactamase classification task. This project underscores the importance of protein language models (PLMs) in solving challenges in computational biology and paves the way for continued research on refining PLMs for antimicrobial resistance research.
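The regression metric mentioned above, Spearman’s rank correlation coefficient, is the Pearson correlation computed on ranks rather than raw values. The following is a minimal standard-library sketch of that definition, not the evaluation code used in the project or in PEER; the function names `average_ranks` and `spearman` are chosen here for illustration.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(predicted, actual):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(predicted), average_ranks(actual)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Because only the ordering matters, the metric rewards a model that ranks enzyme variants correctly even if its predicted activity values are on a different scale.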
Vector Search for Linkage Mechanisms
COS 597A: Long-Term Memory in AI: Vector Search and Databases Final Project
I explore vector search use cases for mechanical linkage designs and a novel generative model for producing these linkage designs. In doing so, I lay the foundation for a vector database that linkage designers can use in conjunction with my generative model when designing optimal linkages.
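As a rough illustration of what vector search over linkage designs could look like, the sketch below ranks stored embeddings by cosine similarity to a query vector using brute force. It is an assumption-laden example rather than the project's method: the embedding values, the design names, and the functions `cosine_similarity` and `top_k` are all hypothetical, and a real vector database would typically use an approximate index instead of a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=3):
    """Return the names of the k stored vectors most similar to the query.

    index maps a design name to its embedding; both are hypothetical here.
    """
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical embeddings for three linkage designs.
linkage_index = {
    "four_bar_a": [1.0, 0.0],
    "four_bar_b": [0.0, 1.0],
    "six_bar_a": [1.0, 1.0],
}
nearest = top_k([1.0, 0.1], linkage_index, k=2)
```

A designer could embed a target motion or geometry as the query vector and retrieve the closest stored designs as starting points.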
ClubPrinceton
Joint Work with Aabid Ismail, Christopher Speed, and Roy Mazumder
Developed a social media application with the goal of connecting 3000+ Princeton students with 100+ on-campus organizations. Implemented full-stack services, such as user ratings, club account creation/approval, and club page administration.
Implementation of CAS Authentication in ExpressJS
Implemented an API for Princeton’s Central Authentication Service in ExpressJS and EJS. This software will be used by 80+ students in Princeton’s software engineering course.