At the beginning of the pandemic, I took multiple Udacity courses. This is a summary of the AI for Healthcare course and what I learned from it.
AI for Healthcare
In this course, I learned how to apply machine learning techniques to health care data.
The first project was a CNN that used chest X-rays to diagnose pneumonia. Through this project, I learned a lot about how machine learning can be applied for health care purposes. It covered the types of error, the steps involved in getting a model approved by the FDA, and how to properly handle sensitive medical data. The model itself was not particularly powerful; however, with continued work it could be made much more robust.
A visualization of the model’s performance
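For context, the core of such a classifier is a fairly standard convolutional network. Below is a minimal sketch of my own, not the course's actual architecture; the layer sizes are illustrative:

```python
import torch
import torch.nn as nn

# Minimal binary classifier for 224x224 grayscale chest X-rays.
# Layer sizes are illustrative, not the course's actual architecture.
class PneumoniaCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, 128), nn.ReLU(),
            nn.Linear(128, 1),  # single logit: pneumonia vs. no pneumonia
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = PneumoniaCNN()
logits = model(torch.randn(4, 1, 224, 224))  # batch of 4 fake X-rays
print(logits.shape)  # torch.Size([4, 1])
```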
The next project used MRI scans for hippocampus segmentation. This was my first experience with three-dimensional data, or with MRI scans in general. The UNet-based model was written in PyTorch, a library I had very little familiarity with at the time. The whole project was very new to me, and I learned a significant amount. In the future, I want to experiment more with the UNet architecture and explore what other problems I can apply it to.
A visualization of the segmentation model’s training performance
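The heart of a UNet is an encoder-decoder with skip connections. As a rough sketch of the idea, here is a simplified 2D, one-level version of my own, not the project's full 3D model; the three output classes assume background plus two structures:

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two conv+ReLU blocks, the basic building unit of a UNet
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    """One-level UNet: downsample once, upsample once, one skip connection."""
    def __init__(self, in_ch=1, num_classes=3):
        super().__init__()
        self.enc = double_conv(in_ch, 16)
        self.down = nn.MaxPool2d(2)
        self.bottleneck = double_conv(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = double_conv(32, 16)  # 32 = 16 upsampled + 16 from the skip
        self.head = nn.Conv2d(16, num_classes, 1)

    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        u = self.up(b)
        # Skip connection: concatenate encoder features with decoder features
        return self.head(self.dec(torch.cat([u, e], dim=1)))

out = TinyUNet()(torch.randn(1, 1, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64]) — one class score per pixel
```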
The next project was to build a regression model that predicted expected hospitalization time for a patient given an experimental diabetes drug; the prediction was then used to decide whether or not the patient should be included in the clinical trial. The dataset was synthetic, and the real challenge was preprocessing and analyzing it rather than building the model. Working with a medical dataset was difficult, requiring a considerable amount of preprocessing, feature engineering, and filtering. I also learned how to identify model bias across demographic groups and how to mitigate it. This project was very informative on how to work with medical data and how to analyze a model trained on it.
A visualization of the model’s biases
The model shows significant bias toward Caucasians and African-Americans, as well as a bias toward women over men.
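Checking for this kind of bias mostly comes down to computing the same metric separately for each demographic group. A rough sketch of the idea with pandas; the data and column names here are made up for illustration, not the project's dataset:

```python
import pandas as pd

# Hypothetical results table: one row per patient with the model's
# predicted and actual hospitalization time plus demographic attributes.
df = pd.DataFrame({
    "actual":    [5, 7, 3, 6, 4, 8],
    "predicted": [6, 7, 2, 9, 4, 5],
    "race":      ["Caucasian", "African-American", "Caucasian",
                  "African-American", "Asian", "Asian"],
    "gender":    ["F", "M", "F", "F", "M", "M"],
})
df["abs_error"] = (df["actual"] - df["predicted"]).abs()

# If mean absolute error differs noticeably between groups, the model
# is performing unevenly across demographics.
print(df.groupby("race")["abs_error"].mean())
print(df.groupby("gender")["abs_error"].mean())
```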
The final project used data from wearables, in this case heart rate. This was a type of data I had very little experience handling, and preprocessing and analyzing it was difficult. Still, I was able to complete the project, and I learned a lot from the experience.
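To give a sense of why the preprocessing was hard: raw wearable signals are noisy, so a typical first step is band-pass filtering to the plausible heart-rate range before estimating beats per minute. A sketch with SciPy; the sampling rate, cutoffs, and synthetic signal are all illustrative, not the project's actual pipeline:

```python
import numpy as np
from scipy import signal

fs = 125.0  # sampling rate in Hz (illustrative)

# Fake noisy pulse-like signal: a 1.2 Hz beat (72 BPM) buried in noise
t = np.arange(0, 10, 1 / fs)
raw = np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.random.randn(t.size)

# Band-pass to the plausible heart-rate band, roughly 40-240 BPM
b, a = signal.butter(3, [40 / 60, 240 / 60], btype="bandpass", fs=fs)
filtered = signal.filtfilt(b, a, raw)

# Estimate heart rate from the dominant frequency of the filtered signal
freqs, psd = signal.periodogram(filtered, fs=fs)
print(f"Estimated heart rate: {freqs[np.argmax(psd)] * 60:.0f} BPM")
```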
Overall, this course introduced me to many new data types and to the possible applications of machine learning in the health care industry. I also learned how to handle sensitive health care data, the impact of demographic model bias, the intricacies of the FDA approval process, and the types of errors and metrics to consider in a health care setting.
During my winter break, I took on the Cassava Leaf Detection competition on Kaggle, which I thought would be a quick project to keep me occupied. I was greatly mistaken: what I expected to be a quick few-week project ended up taking almost three months. I started at the beginning of my winter break, December 23, and made my last submission on the day the competition ended, February 11.
Initially, I hoped I could build a simple model based on the ResNet architecture, since I had had success with it before. Those hopes were crushed when I found I could get no more than 70% accuracy. I decided I needed a more complex model and turned to transfer learning, starting with the pre-trained ResNet50. To my surprise, this performed worse than my initial model. I then went back and fine-tuned my initial model, reaching 75% before overfitting.
Next I tried both the pre-trained Inception and MobileNet architectures. The Inception model could only hit 70% without any fine-tuning, so I quickly abandoned it. The MobileNet architecture reached 80% with a lot of fine-tuning, but I dropped it because I didn't think I could get any more out of it. So, late into the competition, I was scrambling to find an architecture that could work.
I finally stumbled onto the pre-trained EfficientNet model. In my first test, I was able to hit 83% before overfitting. The only problem was that, for a reason I have yet to discern, the model would sometimes simply fail to learn, wasting a training session along with my precious, limited GPU hours. In the final week of the competition, furiously rushing, I managed to tune the model to 86%. With my GPU hours gone and the competition coming to a close, I submitted my final model with a score of 0.8611. I finished 2854th out of 3900 teams; the top team scored 0.9132.
A visualization of the model
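The transfer-learning setup itself is simple; the tuning is what ate the time. For reference, here is a sketch of the pattern using torchvision's pre-trained EfficientNet-B0. This is not my actual competition notebook, just one way to set up the same idea; the competition had five classes:

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained EfficientNet-B0 and freeze the backbone
model = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False

# Swap the final layer for a 5-way head (the competition had 5 classes)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 5)

# Train only the new head first; unfreezing later backbone layers
# afterwards is the usual fine-tuning step
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 5])
```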
This was my first Kaggle competition, and I learned a lot. If I could redo it, I would not spend as much time fine-tuning the MobileNet model, since most of that work didn't lead anywhere. Overall, I am happy with this project and think I learned quite a bit.
I meant to get this post out during February or March, but I was still revamping the website and didn't finish that until March 19th, the end of my spring break. With the end of the school year approaching, I wasn't able to work on any personal projects, and I could only return to this post at the start of summer. I also have a recap of the extended (thanks to quarantine) summer of 2020 that I need to post, and I hope to do that soon.
I have updated the website again to make it easier to navigate and better looking. I returned to Jekyll after switching to Hugo, using the Type-on-Strap theme to revamp the site. In my opinion, Jekyll has better-looking themes than Hugo; however, Hugo is a lot simpler, and the Jekyll site took longer to set up. Both use Markdown for posts, so moving the old posts over was fairly simple. The most time-consuming part was changing how images were imported, because of the two systems' different folder structures.
I did have some trouble with the audio embed from the previous post and had to rework how I embedded it.
I also took the opportunity to improve some of the previous posts and redesign my logo. I had time to revamp the portfolio section as well, creating some quick designs for each of my projects.
In the future I want to expand the project descriptions; however, I don't know how to do that without making the blog section redundant. With the full website redesign, I took the time to create a simple graphic for the front page using Krita and to establish a color palette for the rest of the website. This may be subject to change.
I hope this is the last time I have to completely redesign the website, though I do plan on refining some elements.
I took on a small project this year. I wanted to work more with neural networks, so I attempted to create one that generates music. I used a dataset of MIDI versions of Mozart's music.
To create the algorithm, I followed this tutorial, which helped me understand how to work with MIDI data, something I had never worked with before.
I used a modified version of the WaveNet architecture for my model. One problem I ran into was that the model tended to return a long run of the same note. As a result, it only generated a good song about half the time. In the future, I want to make the model more complex and train it with more data.
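The gist of the approach is to parse the MIDI files into a sequence of notes and train the model to predict the next note from the previous ones. A sketch of the parsing half using music21; the file path is a placeholder, and this is my simplified take, not the tutorial's exact code:

```python
from music21 import converter, note, chord

def midi_to_notes(path):
    """Parse one MIDI file into a flat sequence of pitch strings."""
    notes = []
    for element in converter.parse(path).flat.notes:
        if isinstance(element, note.Note):
            notes.append(str(element.pitch))  # e.g. "C4"
        elif isinstance(element, chord.Chord):
            # Represent a chord as its pitch classes joined by dots
            notes.append(".".join(str(n) for n in element.normalOrder))
    return notes

# "mozart.mid" is a placeholder path
print(midi_to_notes("mozart.mid")[:10])
```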
Here is one of the better pieces of generated music:
thumbnail and header photo by Scott Kelley from Unsplash
I have not been good about updating my blog; this year I want to do better. Throughout the year, I have been taking on small projects. During the school year, I attempted to complete two courses, both by Brandon Rohrer, whose courses you can find here.
The first was a decision tree that predicted subway arrival times in Boston.
An early version of the decision tree
In the end, I achieved good results with a fairly accurate model. I did run into one problem: because I live in the Central time zone and the data was in Eastern time, the model initially didn't learn properly. That early, broken version is the decision tree pictured above.
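That timezone bug is an easy one to hit: time-of-day features silently shift by an hour if the timestamps are interpreted in the wrong zone. A sketch of the fix with pandas; the column names and timestamps are made up:

```python
import pandas as pd

# Hypothetical arrival timestamps, recorded naively in Eastern time
arrivals = pd.DataFrame({
    "arrival_time": pd.to_datetime(["2020-01-06 08:15", "2020-01-06 17:40"])
})

# Wrong: letting my machine assume Central time skews the hour-of-day
# feature, so the tree learns the wrong schedule.
# Right: localize to the data's actual zone before deriving features.
eastern = arrivals["arrival_time"].dt.tz_localize("America/New_York")
arrivals["hour"] = eastern.dt.hour
arrivals["weekday"] = eastern.dt.dayofweek
print(arrivals)
```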
The second course built a polynomial classifier that, given the name of a dog breed, returns similarly sized breeds.
A visualization of the classifier
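I won't reproduce the course's polynomial method here, but the task itself (given a breed, return the breeds closest in size) can be illustrated with a much simpler nearest-size lookup over toy data; the weights below are rough placeholders, not the course's dataset:

```python
# Toy average weights in kg; real values would come from a breed dataset
breeds = {
    "Chihuahua": 2.5, "Beagle": 10.0, "Border Collie": 17.0,
    "Labrador Retriever": 30.0, "Great Dane": 60.0,
}

def similar_breeds(name, k=2):
    """Return the k breeds closest in size to the named breed."""
    target = breeds[name]
    others = [(b, abs(w - target)) for b, w in breeds.items() if b != name]
    return [b for b, _ in sorted(others, key=lambda pair: pair[1])[:k]]

print(similar_breeds("Beagle"))  # ['Border Collie', 'Chihuahua']
```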