
81114 Since I last posted, I’ve been revising and rewriting my convolutional neural network, which incorporates transfer learning from one of the five pre-trained models mentioned below, to predict dog breeds from random photos selected from the ImageNet database. This is part of the Udacity Deep Learning Nanodegree ‘dog-breed-project’, written in a Jupyter notebook in the Python programming language.

I’ve really enjoyed this project, as it involves computer vision algorithms used to predict image content. Although I’m not quite finished with it, I realized this morning that I might be able to combine the five pre-trained transfer-learning models into a single classifier. After I post this, I’ll experiment with that ensemble idea in the hope of higher predictive accuracy, and then move on to the next Udacity Deep Learning Nanodegree project, ‘Generate TV Scripts’, which uses a recurrent neural network. Wish me luck as I continue to study this interesting technology.
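
Here is a hypothetical sketch of that ensemble idea: average the 133-way probability outputs of the individual classifiers and take the most likely breed. This is only an illustration of the concept, not my actual code; it assumes each model is already trained and that the image batch has been preprocessed appropriately for every backbone (a real implementation would need per-backbone preprocessing).

```python
import numpy as np

def ensemble_predict(models, image_batch):
    # Each model.predict returns class probabilities of shape (n_images, 133).
    all_probs = [model.predict(image_batch) for model in models]
    # Average the probabilities across models, then pick the top class.
    mean_probs = np.mean(all_probs, axis=0)
    return np.argmax(mean_probs, axis=1)  # predicted breed index per image
```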

81110 Continued working on the second project in the Udacity Deep Learning Nanodegree this afternoon. I’ve written and tested sequential convolutional neural network classifiers for human face and dog images, attempting to classify 133 different dog breeds using one of the following pre-trained models: ResNet50, VGG16, VGG19, InceptionV3, and Xception. So far I have been making relatively good progress.
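
For reference, a minimal sketch of the transfer-learning pattern involved, using ResNet50 as an example backbone. The framework (Keras), layer sizes, and optimizer here are my assumptions for illustration, not the exact notebook code.

```python
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense
from keras.models import Sequential

# Backbone with the ImageNet classification head removed; global average
# pooling turns each image into a single 2048-dimensional feature vector.
backbone = ResNet50(weights='imagenet', include_top=False, pooling='avg')
# bottleneck_features = backbone.predict(preprocessed_images)

# Small 133-way classification head trained on the extracted features.
head = Sequential()
head.add(Dense(133, activation='softmax', input_shape=(2048,)))
head.compile(optimizer='rmsprop', loss='categorical_crossentropy',
             metrics=['accuracy'])
# head.fit(bottleneck_features, train_targets, epochs=20, batch_size=32)
```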

81109 Today I am approaching the end of this project and will be attempting to close it out and submit it to Udacity for grading by next Wednesday at the latest. Then it’s on to the third project in the Udacity Deep Learning Nanodegree, “Generate TV Scripts”. The goal of that project is to train a recurrent neural network on a set of existing TV scripts and then use the trained network to generate an original script.

81106 Today more progress was made on this project; in fact, I hit a prediction accuracy of about 76% against a minimum target of 60%. This was accomplished using a Sequential model built on VGG19, with ‘relu’ and ‘sigmoid’ activations followed by a final ‘softmax’.
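
A hedged sketch of one way such a head might look on top of VGG19 bottleneck features (512-dimensional after global average pooling). The layer widths, dropout, and optimizer here are guesses for illustration, assuming Keras; this is not my exact model.

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(512,)))
model.add(Dropout(0.3))
model.add(Dense(256, activation='sigmoid'))
model.add(Dense(133, activation='softmax'))  # one unit per dog breed
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])
```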

81105-2 Later today I was able to move beyond the sticking point I had bumped into yesterday and the day before, so I should be able to move on to parts 5 and 6 shortly.

81105 For the past two days I had been making great progress on the second project in this ND. Unfortunately, yesterday I came to a virtual standstill on the fourth and fifth sections of the Jupyter notebook. One thing I find quite frustrating is how linear the sections of the project are. The code blocks in sections four and five depend on results from the previous three sections, and section three contains a block that runs very slowly, so progress in sections four and five is constrained by re-running that slow block every time you want to try something new. Of course, I could be doing something wrong here and just haven’t figured out how to deal with it properly yet.

81103 Just a quick update on my DLND progress. Yesterday I was working on the second project in the course, which is about building and training a convolutional neural network on human face and dog images in order to get the model to predict dog breeds. The good news is that in part three of the project, where I build a CNN from scratch to detect dog breeds, I made some significant progress yesterday. So for the rest of today I’ll be picking up where I left off and attempting to complete this part of the project.
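
For context, a small from-scratch CNN for this kind of task might look roughly like the following. This is an illustrative sketch assuming Keras and 224x224 RGB inputs; the filter counts and depth are arbitrary choices, not my exact architecture.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense

model = Sequential()
model.add(Conv2D(16, (3, 3), activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(GlobalAveragePooling2D())
model.add(Dense(133, activation='softmax'))  # 133 dog breeds
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])
```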

81101 Today is a national holiday here in Hungary, ‘All Saints Day’. It’s a day when most people visit a cemetery where one or more family members have been laid to rest, placing candles at the grave sites to burn throughout the night and possibly longer. In my case, I’ll be working on the next project in the Deep Learning Nanodegree by Udacity, the Dog Breed Classifier project, which uses a convolutional neural network architecture. Holidays, for me, are typically when I get a lot of work done on a project.

81031 Successfully completed the first project in the Udacity Deep Learning Nanodegree yesterday. Starting on the second project, the dog breed classifier, which uses a convolutional neural network.

81030 Submitted the first project in the Udacity Nanodegree today: Predicting Bike Ridership with a neural network.

81028 This recurring post is on the topic of Deep Learning in Artificial Intelligence.

After some weeks away from this course to put more emphasis on another one, I’ve returned to it so that I can continue on and possibly complete this Deep Learning Nanodegree.

The course begins with an introduction module followed by five additional modules, or sections, which are, in order: neural networks, convolutional networks, recurrent networks, generative adversarial networks, and finally deep reinforcement learning.

Each section has lecture materials in the form of videos with supporting slides and exercises that promote better understanding of the concepts, as well as a project fully covering the topics in that section, so that you not only gain a good mental understanding but also experience in the practical application of the concepts.

My experience has been that the majority of true learning occurs in the project phase of each section. At the moment I am on the Neural Networks section. Yesterday I spent a good amount of time on the project in this module, called ‘First Neural Network’, and I expect to complete it shortly.


81101 This is a reflection on the time I spent, from late 2015 to mid 2017, learning computer programming by enrolling in and completing a data science specialization from Johns Hopkins University’s Bloomberg School of Public Health, Department of Biostatistics, offered online through Coursera.org.

When I began this specialization, I didn’t really know how to code, even though I had been trying to learn for several years. My previous attempts were through MIT OpenCourseWare courses online as well as a couple of coding websites like Code Academy and DataCamp.

Actually, when I began this specialization I still didn’t know how to code, even though I now knew some programming vocabulary and concepts. I was vaguely familiar with concepts such as while loops, for loops, if (conditional) statements, booleans, tuples, mutability and immutability, recursion, and a few other programming fundamentals from these free online courses.

As a result of completing the specialization, I gained an enormous amount of knowledge and new skills within the data science landscape regarding programming and statistics in the R programming language, a language designed specifically for statistical analysis.

I spent many days, and sometimes weeks, stuck on a single problem or issue. I was not making consistent daily or weekly progress, but I was persistent and continued my struggle to learn what I could each day. Throughout the course I wasn’t sure whether I would be able to complete the entire specialization.

Once I had completed the specialization, though, I thought to myself how valuable these newly learned skills were, and that if I didn’t use them every day going forward, all the months (close to a year) I had just spent learning how to program and do statistical analysis would be wasted.

That’s when I decided to enroll in the Flying Car Nanodegree at Udacity. My belief was that I would get to learn something I had been dreaming about all my life, being part of creating a flying car, while maintaining the programming skills I had just learned.

Without hesitation, I enrolled in the Flying Car, Data Analyst, and Deep Learning Nanodegrees. Currently, I’m working on completing the last two. By completing them I’ll be able to maintain my current programming skills while extending well beyond the R language and its programming environment.

One thing that I like about computer programming and that keeps it exciting for me is the massive number of new developments happening almost daily, which for the most part tend to simplify the programmer’s job or process in one way or another.

That was a short recap of the past three years of my efforts to learn how to write different types of computer programs.

81101 Reflecting on the Flying Car Nanodegree, I cannot deny that I was exposed to an enormous amount of new material in a whole new industry, using a completely new set of tools that I had previously known nothing about.

To say it was quite challenging for me would be an understatement. I spent practically every minute of every day for months poring over the material in this Nanodegree.

The first significant challenge for me was not having prior familiarity with the tools (i.e., the C++ and Python languages, Jupyter notebooks, NumPy, pandas, Matplotlib). Trying to learn these new tools while also revisiting the laws of physics as applied to an object moving in three-dimensional space made it even more difficult. Fortunately, I was able to put in the time to study and work through the course exercises and ended up completing the course successfully.

I regret that I started the course with such a lack of skills, but I’m glad I did it nonetheless. The reason for my regret is that I wasn’t able to achieve the level of expertise in each of the course’s sub-modules that I would have liked to attain.

One of the highlights of the course for me was applying the code I was writing to the quadrotor in the simulation environment. Seeing code running on my laptop control the simulated quadrotor was deeply gratifying.

I really do miss having full access to this course now. If I still had it, I could go back and continue upgrading the skills I barely had time to learn. In other words, I was at the beginning of multiple learning curves and had to learn multiple things simultaneously, so I did not have as much time as many students who were already familiar with many of the tools used in the course.

All in all it was a great experience, and I will continue my learning journey with autonomous agent programming in some form or another.

I highly recommend this course if you are interested in learning what it takes to program an autonomous unmanned aerial vehicle.

81015 A few days ago I started work on the last project in the Udacity Data Analyst Nanodegree which is about A/B Testing.

An A/B test is commonly used to determine which of two possible options has the higher probability of producing a desired result. In this project a hypothetical company operates a website and collects data about the conversion rate (CR) of what is referred to as a landing page: a page designed to receive new visitors and, hopefully, lead them through the marketing funnel to a conversion of some kind, such as signing up for a course or making some other type of purchase.

We begin here by defining some terms used in this analytical process. First, as mentioned earlier, the process is called an A/B test. It is built upon a method of analysis that pits two competing hypotheses against each other: one called the null hypothesis and the other called the alternative hypothesis.

The null hypothesis, often written mathematically as H_0, represents our belief about the current situation, in this case the probability that a new visitor to a given landing page will be converted into a student enrolled in one or more courses. The null hypothesis is also usually associated with what is called the control group, the group against which the alternative’s results are compared.

The alternative hypothesis, on the other hand, is written mathematically as H_1 and represents another landing page that we have created and collected data on as well. The test is run for a period of time in order to collect enough data to enable a statistical comparison of the conversion rates (CR) of the two pages.
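
Concretely, if p_old and p_new denote the conversion rates of the control and new landing pages, one one-sided formulation would be H_0: p_new - p_old <= 0 versus H_1: p_new - p_old > 0. (This particular formulation is only an illustration; the direction of the inequality depends on the question being asked.)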

The data analyst’s job requires preparing the data and getting it into a form that is usable for analysis. Then the analyst performs statistical calculations on the data to determine which of the two hypotheses is more likely to provide the better conversion rate.
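
As a concrete illustration of that kind of calculation, a two-proportion z-test on the conversion counts of the two pages can be run with the proportions_ztest function from statsmodels. The counts below are made up purely for the example.

```python
import statsmodels.api as sm

convert_old, convert_new = 17489, 17264   # hypothetical conversion counts
n_old, n_new = 145274, 145310             # hypothetical page views

# One-sided test of whether the new page converts better than the old page.
z_stat, p_value = sm.stats.proportions_ztest(
    [convert_new, convert_old], [n_new, n_old], alternative='larger')
print(z_stat, p_value)  # fail to reject H_0 if p_value > alpha (e.g. 0.05)
```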

A simple A/B test generally involves a single explanatory variable with two levels, whereas a more complicated analysis, such as one using logistic regression, might involve multiple explanatory variables as well as possible interactions between two or more of them.
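
A rough sketch of what such a logistic regression with an interaction term could look like, using made-up toy data and hypothetical column names ('ab_page' for the landing-page indicator, 'UK' for a country dummy):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data standing in for the real dataset (values are made up).
rng = np.random.RandomState(0)
df = pd.DataFrame({
    'converted': rng.binomial(1, 0.12, 5000),
    'ab_page':   rng.binomial(1, 0.5, 5000),
    'UK':        rng.binomial(1, 0.3, 5000),
})
df['intercept'] = 1
df['ab_page_x_UK'] = df['ab_page'] * df['UK']  # interaction term

logit = sm.Logit(df['converted'],
                 df[['intercept', 'ab_page', 'UK', 'ab_page_x_UK']])
results = logit.fit()
print(results.summary())  # coefficients and p-values for each term
```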

Another important feature of the A/B testing methodology is the alpha and beta rates. The alpha rate represents the acceptable percentage of Type I errors, that is, rejecting the null hypothesis when it is actually true. An example of a Type I error would be concluding that someone is guilty of a crime when they are in fact innocent, so that we end up convicting an innocent person and potentially sending them to prison.

Conversely, a Type II error is the reverse, failing to reject the null hypothesis when it is actually false, and its acceptable rate is represented by beta. An example of this kind of error is concluding that a person who is actually guilty of a crime is innocent and incorrectly letting them go free. Although this is also a mistake, it is not considered as severe as sending an innocent person to prison, or in some cases to death.

So far I have made good progress on this project and should be done within a few more days.

81028 I’ve been making progress again on the Deep Learning Nanodegree and should be completing the second project in that course very soon.  On the 6th of November the second term of the Data Analyst Nanodegree opens back up and I will continue working on both of these Nanodegrees while working on some other business projects.

81022 Successfully completed the 4th project of the Udacity Data Analyst Nanodegree Term 1 yesterday. This project was on the topic of A/B testing, using a 294,478-row, 5-column data set that was eventually joined with a second data set. The project develops a better understanding of hypothesis testing using null and alternative hypotheses, confidence intervals, and Type I and Type II error rates (alpha and beta). The project flow moved from manual probability calculations, to z-tests and p-values from the ‘statsmodels.api’ library, to multivariable logistic regression, determining statistical significance from the results of all three. The project also involved bootstrapping simulated data using NumPy’s random binomial generator. Next step is to start Term 2 and continue the Deep Learning Nanodegree Term 1.
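
A simplified sketch of the bootstrapping step mentioned above: simulating conversion-rate differences under the null hypothesis with NumPy’s binomial generator. The sample sizes, pooled rate, and observed difference below are illustrative placeholders, not the project’s actual values.

```python
import numpy as np

p_null = 0.1196                 # pooled conversion rate under H_0 (illustrative)
n_new, n_old = 145310, 145274   # illustrative group sizes

diffs = []
for _ in range(10000):
    new_converted = np.random.binomial(n_new, p_null)
    old_converted = np.random.binomial(n_old, p_null)
    diffs.append(new_converted / n_new - old_converted / n_old)

# p-value: share of simulated differences at least as large as the observed one.
obs_diff = -0.0016              # illustrative observed difference in conversion rates
p_value = (np.array(diffs) > obs_diff).mean()
print(p_value)
```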

81021 Submitted my fourth and final project for the first term in the Udacity Data Analyst Nanodegree that I’ve been working on for several months now. Hopefully it will pass the requirements.

81006 It’s been a frustrating day for progress on my 3rd project in the Udacity Data Analyst Nanodegree. The problem started this afternoon when the Jupyter notebook I have been working in for about a month crashed. In other words, I’m getting an error message that the kernel died, and although it tries to restart, that isn’t happening.

I’ve filed an issue with the Jupyter project on GitHub after researching the problem online and not finding an answer. Hopefully somebody will contact me soon about how to resolve it.

This all happened right after I updated the Seaborn package, but I’m not sure whether that had anything to do with it; it might also have been caused by the Apple Xcode application, which I’ve had conflicts with many times before. Who knows?

Update, 3 hours later: after removing the Anaconda and Seaborn installations and reinstalling Anaconda, the kernel is working again. I’ve submitted the 3rd project for the second time, passed the requirements, and am moving on to the fourth and last project in Term 1 now.

Update, 12 hours later: the project submission received a successful review since my last update, and I’ve resumed working on the fourth and final project, A/B testing, which will complete the first term of this Nanodegree. Two out of three down and one to go.


It’s been several months now since beginning three programs with Udacity out of Mountain View, California. The first was the Flying Car Nanodegree. The second is the Data Analyst Nanodegree. Finally, the third program is the Deep Learning Nanodegree.

The Flying Car Nanodegree, or FCND for short, was quite challenging, involving writing code for linear algebra related to physics and geometry. I had doubts I’d complete this Nanodegree right up until the last project submission. Fortunately, I completed it, and my persistence paid off after many months.

Now I am in the last days before completing the Data Analyst Nanodegree and looking forward to trying to complete the Deep Learning one also.

I highly recommend enrolling in an online course on whatever topic interests you. Completing an online course is highly rewarding simply for the new knowledge you gain. The only drawback, for me at least, is that these courses have taken a huge amount of my time, and I’ve had to give up a lot of activities to make the time I needed for nearly all of the online courses I’ve taken in the last ten years.

Getting back to the Data Analyst Nanodegree, also referred to as the DAND: this program is divided into two terms, and I am almost done with the first one. The first term is broken into three modules: introduction to Python, introduction to data analysis, and practical statistics. Each module includes a project that requires applying the concepts and techniques covered in that module. Completing a Nanodegree therefore gives you the confidence needed to be proficient in the course topic.

Since I began taking courses involving computer programming I’ve been amazed at the results that can be derived from developing a software program. That’s it for today’s thoughts on data analysis. Thanks for stopping by!

Beginning October 2018, the intent of this blog will be to discuss the programming of autonomous agents, artificial intelligence, data science and analysis, and the supporting tools that facilitate accomplishing tasks in these domains.


The video clip above was my submission for Project 2 of the Udacity Flying Car Nanodegree.

This is a demonstration of autonomous 3D flight planning using the A-star search algorithm to find a path for a simulated quadrotor through a simulated 3D section of downtown San Francisco, California, as the physical environment.

This solution needs some improvement to reduce the number of nodes along the path and eliminate the need for the quadrotor to double back when it overshoots a given node. The process of reducing the number of waypoints, or nodes, is called path pruning, and there are several techniques available for achieving it.
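
One common pruning approach is a collinearity check: drop any waypoint that lies on the straight line between its neighbours, so the quadrotor flies straight through instead of stopping at every grid cell. Below is a rough sketch of that idea for 2D grid points, not the exact project code.

```python
import numpy as np

def collinear(p1, p2, p3, eps=1e-6):
    """True if three 2D points lie (nearly) on one straight line."""
    matrix = np.array([[p1[0], p1[1], 1.0],
                       [p2[0], p2[1], 1.0],
                       [p3[0], p3[1], 1.0]])
    return abs(np.linalg.det(matrix)) < eps

def prune_path(path):
    """Remove intermediate waypoints that add no change in direction."""
    pruned = list(path)
    i = 0
    while i < len(pruned) - 2:
        if collinear(pruned[i], pruned[i + 1], pruned[i + 2]):
            del pruned[i + 1]   # middle point is redundant
        else:
            i += 1
    return pruned

# Example: a staircase-free straight run collapses to its two endpoints.
print(prune_path([(0, 0), (1, 1), (2, 2), (3, 2)]))
```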