Hi, I'm Smyan

I'm a third-year Computer Science student at Northeastern University, passionate about building software and solving problems.

Smyan Sengupta

Experience

Machine Learning Engineering Co-op

July - December 2025
Pfizer, Andover, MA
  • Engineered a spaCy-based Natural Language Processing (NLP) chat assistant with a custom Named Entity Recognition (NER) pipeline from scratch, providing automated insights on experimental data, process documentation, and model optimization strategies for workflows across 5 distinct manufacturing processes
  • Developed, automated, and deployed a data cleaning and predictive modeling pipeline using XGBoost, Random Forest, and SVR on 9+ time-series lab datasets (1,000+ datapoints each) for an $11 billion polysaccharide data analysis project
  • Implemented SHAP explainability analysis and EconML causal inference to identify highest-impact factors across datasets, ensuring model interpretability and consistency
  • Researched and implemented MLflow experiment tracking and Data Version Control (DVC) for reproducible ML workflows; presented proof-of-concept to modeling teams across 4 sites

Teaching Assistant – Foundations of Data Science

September 2024 - April 2025
Khoury College of Computer Sciences, Boston, MA
  • Held 6+ office hours per week, directed project meetings, and proctored labs and exams to help students with assignments and with data analysis, linear algebra, statistics, and machine learning concepts
  • Graded 90+ assignments per week, providing useful feedback

Director of Operations

August 2025 - Present
Northeastern Claude Builder Club, Boston, MA
  • Secured Anthropic as a sponsor for two large-scale 24-hour hackathons and negotiated Claude API credit allocations for winners
  • Coordinated demos and workshops with Claude Student Ambassadors and student developers to drive adoption and showcase capabilities across the Northeastern developer community

Founding Member and Vice Chair

September 2024 - Present
Northeastern Association for Computing Machinery, Boston, MA
  • Directed the planning and execution of two large-scale 24-hour hackathons with 40+ attendees, coordinating budgeting, sponsorships, judging, workshops, and cross-club collaborations across 6+ student organizations and 3 local startups
  • Facilitated communications between student software engineers and local startups to develop impactful real-world products

Co-Founder and Vice President

May 2025 - Present
MedCS Lab, Boston, MA
  • Spearheaded an initiative to establish an interdisciplinary undergraduate research group focused on the intersection of computer science, data science, and the life sciences, facilitating communication between members and advisors

Projects

NewsFactChecker

Designed a probabilistic NLP fact-checking model on 10K+ statements from the LIAR dataset using a 6-class Bayesian softmax classifier, applying Hamiltonian Monte Carlo with JAX to sample 5,000 posterior weight estimates. Implemented semantic feature extraction using SentenceTransformer embeddings; benchmarked posterior-averaged credibility scores against logistic regression baselines using AUC and threshold-optimized F1.

Python, JAX, NumPy, Scikit-learn, SentenceTransformer

Guardrails: Atomic

1st Place - 2025 AI Agent Hackathon

Architected an AI-powered full-stack formal verification platform in Next.js/TypeScript that iteratively generates mathematically verified code from natural language inputs, winning 1st place at an AI Agent Hackathon. Engineered a YAML-to-Z3 conversion pipeline with automated counterexample generation for program verification.

TypeScript, Next.js, Z3 Prover, MongoDB, OpenRouter

OpenLegislation

1st Place - HackHarvard 2024 Open Data Track

Built a web application that simplifies congressional legislation for general audiences, enabling users to search, read, and understand active bills without legal expertise, winning 1st place in the Open Data Track at HackHarvard 2024. Implemented semantic search using OpenAI embeddings to surface relevant bills from natural language queries, and integrated GPT-based summarization to translate legal language into plain English.

React, TailwindCSS, OpenAI API

HealthSync

Developed a mobile application that analyzes users' health data and health-related journal entries to provide AI-powered analysis and recommendations.

Flutter, Python, MongoDB Atlas, Gemini API

Second Sight

Developed an AI-powered journaling application where users can log their daily mood, create journal entries, and review entries over time. Leveraged Gemini AI to analyze entries and mood trends, helping users better understand themselves.

Flutter, Kotlin, Gemini API

Hoo Wants A Degree

Developed a degree planner web application that helps University of Virginia School of Engineering students create AI-generated four-year degree plans.

React, Perplexity AI API

Stocks Simulator

Developed a full-stack Java application using MVC architecture that lets users create, update, and maintain stock portfolios. Incorporated algorithms to calculate stock gain/loss, moving averages, and portfolio rebalancing with real-time data for 1,000+ stocks.

Java, RESTful APIs, MVC Architecture, Swing, Alpha Vantage API

Contact

Feel free to reach out if you'd like to connect or collaborate!