Investment Data Analyst Intern
New York, NY
- Rose AI is a financial data vendor and data product. Rose helps financial professionals access and audit the data they need from within Excel, Google Sheets, or Python. Rose prides itself on the auditability of its data, specifically the source of the data and the transformations applied to it.
- Developed transformations for ingested data in Rose’s proprietary language. These transformations are visible to individuals who audit the data, and the auditability of data is one of the core premises of Rose’s product.
- Ingested over 10,000 data series from 2 major vendors and numerous minor vendors. Selected data vendors according to the needs of financial professionals, evaluating each for currency, openness, and accuracy. Developed documentation to navigate the data available from each vendor.
- Developed scripts to fetch the raw data from vendors. Collaborated with the data pipeline team to deploy these scripts.
Cloud Data Analyst
New York, NY
- Tringapps is a software consulting company specializing in the development of SaaS applications. Applications developed by Tringapps serve a cumulative 400 million active users.
- One of Tringapps’ clients was hosting an event and wanted an application that would give attendees information about the event and allow them to communicate with other attendees.
- Developed the backend for the group chat feature that allowed attendees to communicate with each other. The application supported message deletion and read receipts, among other basic functionality expected of a chat application.
- Developed a WebSocket API on AWS to handle the API requests and used AWS DynamoDB to store the associated data. The solution needed to be flexible enough to accommodate the evolving demands of the client.
- The app was downloaded 1,000+ times on the Google Play Store. It is rated 4.2/5 on the Apple App Store and 3.7/5 on the Google Play Store. Since chat is one of the app's two core features, these ratings suggest that the chat functionality did not deviate dramatically from end-user expectations.
- One of Tringapps’ clients was a utility company that wanted to explore using machine learning algorithms to forecast customer usage and estimate bills. The experiment was a success, and a model was deployed to serve these forecasts to the customer.
- Trained two models on the client’s time series data and documented the findings. Followed typical best practices, including splitting the data into training, validation, and test sets while controlling for the training variables. One model was a custom model intended to capture both trend and seasonal variations. The other was a forecasting model provided by AWS.
- Developed data pipelines to store the client’s data in a dedicated time series database and to transform that data into the format the machine learning model expects.
- Deployed the model so that it would be updated once a week with new data. The model accounted for parameters such as location, the associated weather patterns and forecasts, and site-specific details such as square footage and commercial/residential designation.
- Tringapps was expanding its data solutions team and wanted an onboarding process that would quickly prepare interns to handle the common data pipeline tasks they were likely to encounter. I was selected to develop this process because the pipelines I had built were reliable, easy to maintain and modify, and delivered quickly.
- Developed a 4-week training program of 6 tasks that interns completed alongside their other responsibilities. The tasks escalated in difficulty, from basic API data ingestion to mapping data from multiple sources between dramatically different schemas.
- Personally led 3 interns through this training program, and at least one other intern was trained through the same process. All tasks were completed using AWS Glue and PySpark.
B.Sc. Mathematics & B.A. Linguistics
Champaign, IL
- Graduated B.Sc. Mathematics with distinction; 3.37 GPA.
- Completed coursework in Statistics, Differential Equations, Abstract Linear Algebra, Real Analysis, Euclidean & Non-Euclidean Geometry, Number Theory, Syntax, Morphology, Classical Mechanics, and Electromagnetism.