Blog Post #12: The Outro

Hey everyone! Welcome to the final blog post of my Senior Project Experience! This week I’ll be giving a review of everything that has happened over the past 12 weeks.

At the very least, I would definitely say that the senior project was a journey with its highs and lows. Academically, this project involved an extremely dense amount of work. I had to cover a large amount of content in a short amount of time (12 weeks), including learning some linear algebra and machine learning concepts, both of which are very technical fields. Thankfully, I got through it! However, I believe I overestimated my ability to understand these topics, because even after the experience I still don’t completely understand some of the concepts I had to deal with, primarily principal component analysis. I feel that I may have been too ambitious with the scope of my project; I should have worked with only one type of classifier instead of two with completely different architectures, as understanding how each one works took time I would rather have spent studying principal component analysis.

At first, the project was going without a hitch; I was moving at a faster pace than the schedule on my syllabus. However, once I got to the very dense topics, primarily generating adversarial examples and implementing PCA, my progress slowed down as I had to study up on them, since they required a lot of mathematics and coding. The major obstacle, though, was that implementing PCA in the convolutional neural network turned out to be extremely difficult, because I didn’t know how to reflect the algorithm’s processing in the classifier. I lost the most time trying to fix this issue, and I had to cut the portion of the project that involved implementing PCA in the adversarial image generator.

Overall, though, I did enjoy my Senior Project Experience, and it provided me with a great learning experience, both personally and academically. My final product will be an article detailing my findings from this project. I will also be presenting my Senior Project Experience on May 23rd at the DoubleTree Hotel near San Jose International Airport. Hope to see some of you there!

Special thanks to Ms. Jefferson, my BASIS Advisor; Ms. Belcher, my Senior Project Coordinator; Dr. Mitra, my external advisor; and my friends and family!

Until next time!

-Rishab


Blog Post #11: Wrapping it Up

Greetings! Welcome to the 11th blog post of my senior project! This post is going to be a short one, as this week was spent testing the last piece of the project: the convolutional neural network outfitted with PCA.

After the CNN was implemented with PCA, the overall classification accuracy of the classifier was about 94.6%, which is great because it is very close to the classifier’s original classification accuracy of about 95.2%. However, when tested with the adversarial images, the PCA-implemented classifier’s accuracy was about 81.9%. When tested with the same images, the classifier without PCA had a classification accuracy of 82.6%. Thus, the two had similar accuracies on the adversarial images, which also means similar dropoff rates. Therefore, PCA seemed to have a minimal effect, either harmful or beneficial, on the classifier’s accuracy. This may be because the input function of the convolutional neural network is very similar to the PCA algorithm, so the PCA step may be redundant information for the classifier, which would explain why the classification accuracies were so similar.

In terms of the significance of the finding, I cannot conclude much about the effectiveness of PCA in defending against adversarial attacks. What I can conclude is that PCA’s effectiveness varies on a case-by-case basis. The algorithm was very effective on the logistic regression classifier, and since that classifier is more rudimentary, this may indicate that PCA is more effective on more basic classifiers, though other classifiers of the same ilk would need to be tested. However, PCA was not as effective on the more advanced convolutional classifier, which may foreshadow similar results for other advanced classifiers. Overall, though, PCA cannot be considered a universal defense.

Next week, I will be recapping and reviewing my senior project experience! I will be telling you about the highlights, the lowlights, any future work I will do, and whether I want to pursue a similar topic in the future. Thank you guys for bearing with me on this wild ride!

Blog Post #10: The Home Stretch

Hey everyone! Welcome to the first-ever Post #10 for the Attacks on Self-Driving Cars blog!

In the last blog post, I detailed how I was about a couple of weeks behind in terms of progress due to the large obstacle I faced with the convolutional neural network’s accuracy. Originally, the plan was to use the principal component analysis algorithm on the neural network that generated the adversarial attacks and generate a new batch of images to be used on the classifiers implemented with PCA. Thus, early in the week I started to work on implementing the algorithm in the adversarial image generator. However, I hit another roadblock this week, involving this image I displayed in Blog Post #3:

[Image from Blog Post #3]

With the implementation of the PCA algorithm, the data will be reduced and reformatted. J, the classification loss function that is minimized to reduce the error in the neural network, must therefore change as well. The data will have fewer dimensions, and with a change in dimensionality comes a change in how the error is calculated, so using the same loss function may result in an adversarial image that looks nothing like its original counterpart.
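To make the role of J a bit more concrete, here is a minimal, purely illustrative sketch of a gradient-based attack in the style of the fast gradient sign method (that is an assumption on my part, not necessarily the exact generator I used). The model, image, and label names are placeholders. The key point is that the gradient of J is taken with respect to the input, so if PCA changes the input’s dimensionality, the loss and its gradient live in a different space:

```python
import tensorflow as tf

def fgsm_style_attack(model, image, label, epsilon=0.01):
    """Illustrative only: nudge an input in the direction that increases the
    classification loss J. All arguments are hypothetical stand-ins."""
    image = tf.convert_to_tensor(image, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        # J: the classification loss that training normally minimizes
        loss = tf.keras.losses.sparse_categorical_crossentropy(label, prediction)
    gradient = tape.gradient(loss, image)        # dJ / d(input), same shape as the input
    return image + epsilon * tf.sign(gradient)   # a PCA-reduced input changes this shape
```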

Thus, the question is: how must the loss function change? Unfortunately, I’m not sure I can answer this question properly. Due to the time constraints of the project, my external advisor suggested that I make an executive decision with the attack generator; rather than create entirely new adversarial images, he suggested using the same adversarial images, because my testing from Blog Post #6 showed that they transferred effectively from one architecture to another, so the same could hold here.

Thus, I headed into the next stage of the project, testing, leaving the question above for the future work I do in adversarial machine learning. I started with the logistic regression model, training it with the original traffic sign images and testing it with the adversarial images. The classification accuracy for this model turned out to be about 71.2%, which is higher than the results from Blog Post #6. This is definitely a great sign!

Next week I will be testing the adversarial images on the convolutional neural network implemented with PCA. If the results are similar to the logistic regression model’s, it will show that PCA actually has a positive effect as a defense against adversarial attacks, which would be a great finding.

Strap in, everyone! We’re in the homestretch of the project!

-Rishab

Blog Post #9: An Update on Last Week’s Events

Hey everyone! I’m going to pick right back up from where I left off last week, with the problem I was having with the convolutional neural network and logistic regression fitted with principal component analysis: the classification accuracies of the two classifiers were only about 19% and 16% respectively, which was definitely not where I wanted to be. I wanted to be in the range of 80% or above.

Well, that obstacle took me this entire week, most likely because I am not a great debugger; instead of facing the problem head on, I kept trying to find detours around it. I’ll go over in detail what I did to fix the problem and the various efforts I tried, many of which failed.

Originally, my mentor and I thought that there was a problem with the PCA function I used, as I had coded the function myself, referencing the mathematics of the paper I linked in Blog Post #7. Therefore, I searched for any functions in the TensorFlow software library that would let me do the same matrix math as the PCA function I coded myself. Thankfully, I found just the right function in the TensorFlow library, singular value decomposition, or tf.svd, which can also be used to find the principal components of a matrix. With my advisor’s assistance, we replaced my PCA code with the built-in tf.svd function. However, the results were still the same, with classification accuracies of 19.6% and 17%.
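For reference, here is a minimal sketch of how principal components can be computed with TensorFlow’s SVD routine (tf.linalg.svd, aliased as tf.svd in older releases). This is not my exact code, just the general shape of the computation, and the component count is only an example:

```python
import tensorflow as tf

def pca_project(images, k=10):
    """Project flattened images (shape (n, 1024)) onto their top-k principal components."""
    x = tf.convert_to_tensor(images, dtype=tf.float32)
    x_centered = x - tf.reduce_mean(x, axis=0, keepdims=True)   # PCA needs centered data
    # SVD of the centered data: x_centered = u * diag(s) * v^T,
    # so the columns of v are the principal directions.
    s, u, v = tf.linalg.svd(x_centered, full_matrices=False)
    return tf.matmul(x_centered, v[:, :k])                      # (n, k) reduced data
```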

The next attempt I made to fix the problem was to port the neural networks to another language by myself, which was admittedly not a very smart decision. Because I was programming against a low-level TensorFlow API, I did not have much experience debugging at that level. Thus, I turned to MATLAB’s neural network toolbox, a high-level software library that I have more experience programming and debugging in, and decided to port my convolutional neural network and logistic regression classifiers over to that software. Importing these classifiers into the MATLAB toolbox was surprisingly easy; there is a function called importKerasNetwork that brought my convolutional neural network and logistic regression classifier into the software. The toolbox also has a built-in PCA function, so I used it to process the data and break it down into principal components. Sadly, the classification accuracy was still the same. Looking back, going on this detour did not make much sense, as I made virtually no progress in debugging and instead changed the PCA function again, which, as my advisor and I learned, was not the problem to begin with. Essentially, I was just going in circles and wasting precious time.

With no other options, my advisor suggested that I find a fresh set of eyes to look at the convolutional neural network code and see if they could spot the problem. Thus, I asked an acquaintance of mine, a graduate student at UC Berkeley with experience in TensorFlow, to look at my code. Thankfully, he found the problem, which was in the preprocessing stage of the data where the PCA was applied. Without realizing it, I was using the test data to find the key principal components, which captures only a sliver of the variance in the entire dataset; I also needed more principal components for the data than were being used. With those fixes in place, the classification accuracies are much more respectable, with the logistic regression classifier at 80.7% and the convolutional neural network at 88.2%.
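For anyone curious what the fix amounts to, here is a small hypothetical sketch (using NumPy rather than my actual script) of the idea: the mean and components are fit on the training data only, and then the same transform is applied to every split:

```python
import numpy as np

def fit_pca(train_x, k):
    """Fit PCA (mean + top-k components) on the training images only."""
    mean = train_x.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(train_x - mean, full_matrices=False)
    return mean, vt[:k].T              # shapes (d,), (d, k)

def apply_pca(x, mean, components):
    """Project any split (train, validation, or test) with the training-set PCA."""
    return (x - mean) @ components
```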

With all of these efforts to fix the problem, I am now a week behind in my project, as I should have finished this stage last Friday. Luckily, I have all of Spring Break next week, so I will definitely make up for lost time.

Until next time!

-Rishab

Blog Post #8: Running Through PCA

Hey everyone, welcome to my 8th blog post! I can’t believe I’ve written 8 posts already; time has been flying by!

This week was primarily dedicated to processing the traffic sign images through the principal component analysis algorithm. I will spare you the details of how the mathematics of the algorithm works, as it involves terminology such as eigenvalues and eigenvectors, but I will link the paper I’ve used as a mathematical reference this entire week.

In terms of what the image data looks like when PCA is applied, I’ve taken pictures of scatter plots that represent the variance of the dataset, using a tool called TensorBoard. Here are some of the images below:

With only one component, capturing about 9.8% of the data’s variance:

[Scatter plot: one principal component]

With two components, capturing about 17.9% of the dataset’s variance:

[Scatter plot: two principal components]

Finally, with three components, capturing about 29.9% of the dataset’s variance (I tried to make this scatter plot into a .gif to capture the three-dimensional sensibility, but unfortunately I’m not a great .gif creator):

[Scatter plot: three principal components]

Sadly, the TensorBoard visualizer only allows me to graph in three dimensions, although it is understandable why the software cannot visualize beyond the third dimension. Ultimately, I settled on using ten principal components, which captured about 81.2% of the dataset’s variance. That means it passed the 80/20 rule I discussed in last week’s blog post with flying colors, as ten principal components is far less than 20% of the image’s 1024 dimensions.
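As a rough sketch of how the captured variance can be computed (this is not my exact code, and the array name is a placeholder), the singular values of the centered data give the per-component variance directly:

```python
import numpy as np

def cumulative_variance(images):
    """Cumulative fraction of variance captured by each principal component.
    `images` is assumed to be an (n, 1024) array of flattened 32x32 images."""
    x = images - images.mean(axis=0)
    _, s, _ = np.linalg.svd(x, full_matrices=False)
    var = s ** 2                        # proportional to the covariance eigenvalues
    return np.cumsum(var) / var.sum()   # e.g. index 9 -> variance kept by 10 components
```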

I applied the principal component analysis algorithm to the preprocessing stages of the logistic regression and convolutional neural network classifiers, where the data is processed to optimize the accuracy of the training algorithm. However, with PCA applied, every run of training gave me a consistent classification accuracy of 19% for the convolutional neural network and 16% for the logistic regression classifier. Those are clearly unacceptable classification accuracies, and when I take the PCA function out, I revert to the original classification accuracies. Therefore, there must be something wrong with the PCA function, but I just can’t seem to figure it out. My advisor and I have been trying to debug the program for the past couple of days, and we haven’t made much progress in fixing the issue. I am spending some time over this weekend on it, because I have no idea how long it will take, and if I spend the next week trying to figure it out, I will be behind on the project by a week. I will update this blog post if I get an acceptable classification accuracy for the convolutional neural network of about 80%, and explain what caused the issue and how I fixed it.
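For a sense of what “PCA in the preprocessing stage” looks like in code, here is a small, hypothetical Keras sketch of a logistic-regression-style classifier (a single softmax layer) trained on PCA-reduced features. The data, class count, and component count are stand-ins, not my actual setup:

```python
import numpy as np
import tensorflow as tf

# Stand-in data: flattened 32x32 images and, e.g., 43 traffic-sign classes.
train_x = np.random.rand(1000, 1024).astype("float32")
train_y = np.random.randint(0, 43, size=1000)

# Preprocessing: reduce each 1024-dimensional image to 10 principal components.
mean = train_x.mean(axis=0)
_, _, vt = np.linalg.svd(train_x - mean, full_matrices=False)
train_pca = (train_x - mean) @ vt[:10].T          # shape (1000, 10)

# Logistic regression as a single dense softmax layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(43, activation="softmax", input_shape=(10,))
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_pca, train_y, epochs=5, verbose=0)
```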

Next week, if all goes well, I will reuse the adversarial images I generated a while ago, test the classifiers on them, and see if there is any drop in accuracy. Keep your fingers crossed!

Until next time,

Rishab


Blog Post #7: A Primer on PCA

Hey everyone! Welcome to the first week of the second half of the project, where I will be working to create a defense against adversarial attacks on self-driving cars! The defense that I will be working with for this project is called principal component analysis, or PCA for short.

This week, my external advisor had to go on a business trip to Croatia, so I did not get much done in terms of implementing PCA in the neural networks I have been working with, as he usually helps me understand and code the algorithms in Python. However, he did assign me some readings for this week that detail how PCA works and how it can be implemented as a function. Therefore, this week’s blog post is dedicated to my thoughts on the readings I perused all week involving principal component analysis.

Essentially, the main purpose of principal component analysis is to reduce the dimensionality of a dataset consisting of many variables, which may or may not be correlated with each other, while retaining the essence of the dataset’s variance. In other words, it is a method of summarizing data while retaining its core qualities, thus reducing the data’s complexity.

Since the pure definition of principal component analysis is so abstract, I feel that it would be best to give a visual example of the PCA process, which is detailed very helpfully in this video here, from which I will take images.

[Scatter plot of the two-variable dataset]

The image above shows a dataset with two variables, cell 1 and cell 2. The first principal component is the line along which the data has the highest variance; visually, it is the line that splits the dataset diagonally.

[Scatter plot with the first principal component drawn through the data]

Now, it is possible to reorient the data points along the black line, or the first principal component, by projecting all of the points onto it, thus turning the two-dimensional graph into a one-dimensional graph, but that would lead to quite a large loss of data. Thus, we should also consider the direction with the second-largest amount of variance, which must be orthogonal, or perpendicular, to the first line. We find that line as follows:

[Scatter plot with the PC1 and PC2 axes]

Essentially, with the second axis, PC2, the data is simply reoriented along the PC1 and PC2 axes, so overall the dimensionality did not change; we have only transformed the two-dimensional data into another two-dimensional form, one that better explains the variation among the data points. If we then keep only PC1, there is certainly a loss of data, but it is minor, since the first principal component captures most of the variance. Hopefully this leads to a more accurate and more robust neural network!
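To tie the pictures above to actual numbers, here is a tiny synthetic version of the same idea: two correlated variables, reoriented along PC1 and PC2 (the variable names and data are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
cell1 = rng.normal(size=200)
cell2 = 0.8 * cell1 + 0.2 * rng.normal(size=200)   # correlated with cell 1
data = np.column_stack([cell1, cell2])

centered = data - data.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)

reoriented = centered @ vt.T            # same 2-D data, expressed along PC1 and PC2
variance_ratio = s ** 2 / np.sum(s ** 2)
print(variance_ratio)                   # PC1 captures the vast majority of the variance
```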

In terms of how to implement PCA in the self-driving car classifier, it ultimately boils down to the size of the images. Since each image is 32 by 32 pixels, it is a 32 × 32 matrix. If we flatten each matrix, it turns into a 1024-dimensional vector, with each component of the vector describing the color intensity of a pixel. So the dataset as a whole will be a collection of 1024-dimensional vectors.
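In code, the flattening step is just a reshape; here is a short illustrative snippet with placeholder data standing in for the real traffic sign images:

```python
import numpy as np

images = np.random.rand(100, 32, 32)          # stand-in for 100 traffic sign images
flattened = images.reshape(len(images), -1)   # shape (100, 1024): one 1024-d vector per image
```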

I am still not entirely sure how many principal components I will need for the dataset, but I am going to abide by the Pareto rule, or 80/20 rule, so that the dataset, after PCA is applied, retains 80% of its variance with only 20% (or fewer) of the dimensions, or principal components. Next week, since my mentor will be back from Croatia, I will take a stab at coding the PCA algorithm into the traffic sign classifiers and report my experiences in doing so!

Stay tuned!

-Rishab

Blog Post #6: Testing 1 2 3

Hey everyone! This week was light in terms of studying and coding, but very significant in terms of determining the direction of the project. This was the week I tested the two traffic sign image classifiers with the adversarial images I generated last week. Generating the 40,000 adversarial images was actually not that hard, albeit time-consuming (about 90 minutes), and I pickled the data, or in other words packaged it so that I could use it in another script, namely the two image classifiers I created in Week #2. Below, I will discuss the changes in classification rate for both the logistic regression model and the neural network model.
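For those unfamiliar with pickling, here is roughly what that packaging step looks like; the file name and variable names are illustrative stand-ins, not the ones from my actual scripts:

```python
import pickle
import numpy as np

# Stand-ins for the adversarial batch produced by the generator script.
adv_images = np.zeros((40000, 32, 32), dtype=np.float32)
adv_labels = np.zeros(40000, dtype=np.int64)

# In the generator script: write the batch to disk.
with open("adversarial_images.pkl", "wb") as f:
    pickle.dump({"images": adv_images, "labels": adv_labels}, f)

# In the classifier script: load it back.
with open("adversarial_images.pkl", "rb") as f:
    data = pickle.load(f)
images, labels = data["images"], data["labels"]
```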

In Week #2, the logistic regression model had a classification accuracy of 81.3%, which is definitely not a great accuracy, but a serviceable one. I tested the model again with the adversarial images, and its classification accuracy dropped to 65.6%. For the convolutional neural network, the original classification accuracy was about 94.7%, much better than the logistic regression model’s. I also tested this model with the adversarial images, and its new classification accuracy was about 82.6%. While I was certainly expecting a drop in classification accuracy for each model, I was not expecting drops as significant as about 16% and 13%. However, the relative sizes of the drops were what I expected: I anticipated that the drop for the logistic regression model would be greater than that of the convolutional neural network, since logistic regression is a more basic, streamlined model.
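The comparison itself is simple to run once both test sets are in hand; here is a hedged sketch using Keras’s evaluate (assuming the model was compiled with an accuracy metric; all names are stand-ins):

```python
from tensorflow import keras

def accuracy_drop(model: keras.Model, clean, adversarial):
    """Compare a trained classifier's accuracy on clean vs. adversarial images.
    `clean` and `adversarial` are (images, labels) pairs; assumes the model was
    compiled with metrics=["accuracy"]."""
    _, clean_acc = model.evaluate(*clean, verbose=0)
    _, adv_acc = model.evaluate(*adversarial, verbose=0)
    return clean_acc, adv_acc, (clean_acc - adv_acc) * 100   # drop in percentage points
```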

Nevertheless, I am glad that I confirmed the adversarial images are effective at causing the classifiers to malfunction. This marks the end of the first half of the project. Even though there have been some obstacles in my way, I’m satisfied with my progress. Now I will be proceeding with the second half: defense with principal component analysis. I imagine this will be a very dense topic, but I can’t wait to dive headfirst into it next week!

Until next time,

Rishab