beaukramer.github.io


Beau Kramer

Resume

Projects

Finance

Machine Learning

NLP, Deep Learning, Feature Engineering, Data Visualization

For this NLP project, my team selected an automated essay grading challenge. We were interested in automatically generating feedback for students and in contrasting linear models with deep learning ones. We performed extensive feature engineering and trained deep learning models at both the essay and the sentence level.

NLP, Classification, Data Processing

I trained classifiers using a bag-of-words model to identify the topic of a news article. After some initial attempts at the problem, I applied preprocessing to the texts, which improved the model's ability to generalize.
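As a rough illustration of why preprocessing can help a bag-of-words model, here is a minimal sketch. The specific steps (lowercasing, stripping punctuation, dropping stop words), the `bag_of_words` helper, and the stop-word list are all assumptions made for the example, not necessarily what the project used.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption, not the project's actual list).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}

def bag_of_words(text, preprocess=True):
    """Return token counts for a single document."""
    if not preprocess:
        # Naive baseline: split on whitespace only.
        return Counter(text.split())
    # Lowercase, keep alphabetic runs only, then drop stop words.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

headline = "The Senate votes. the senate VOTES!"
raw = bag_of_words(headline, preprocess=False)
clean = bag_of_words(headline)
```

Without preprocessing, `raw` treats "Senate", "senate", "votes." and "VOTES!" as unrelated features (six distinct tokens in total). With it, `clean` collapses them into two features, "senate" and "votes", so a classifier sees the same signal regardless of casing or punctuation.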

Using PCA to reduce dimensionality, I clustered data about mushrooms to try to classify poisonous ones. I used KMeans and Gaussian Mixture Models.

For this group project, my team selected the forest cover prediction challenge. We had to predict the species of tree that lived in a 30 m x 30 m cell in several Colorado forests. We summarized our lessons learned in this presentation.

Statistics

Classic linear regression was the focus of this project. We were given a dataset about crime in North Carolina in the 1980s with the goal of providing policy recommendations to politicians.

This discrete response modeling project addressed whether temperature, pressure, or both are related to the failure of the primary O-rings in the space shuttle.

This was a simple exercise in forecasting an e-commerce time series from the Federal Reserve Bank of St. Louis. After exploring the data and verifying its suitability for the model, we fitted a seasonal ARIMA model to it.

This panel dataset on driving laws proved a challenge to explore visually, but we managed to create some visuals that helped us understand the data. We then used a fixed effects model to capture the effects of different policies on fatality rates.

Multinomial regression on a cereal dataset was the main task in this project. In addition to calculating odds ratios, we built up toward visuals that showed how the predicted shelf probabilities vary with nutritional content.

The focus of this project was the importance of exploratory data analysis in any project. We explored a dataset about a Portuguese national park that experienced severe wildfires to see if we could begin predicting the conditions that make an area vulnerable to extreme fires.

Fun