Learning to recognize handwritten digits

The scikit-learn library provides numerous datasets that are useful for testing data-analysis and prediction problems; one of them is the Digits dataset. A scientist claims that the digit can be predicted correctly 95% of the time. We perform a data analysis to accept or reject this hypothesis.


In this project, we use the Handwritten Digits dataset, which ships with the sklearn library. We can import the dataset as follows:


               from sklearn import datasets
               digits = datasets.load_digits()

Info about Dataset:


                print(digits.DESCR)

OUTPUT:




                main_data = digits['data']
                targets = digits['target']
                len(main_data)






%matplotlib inline
import matplotlib.pyplot as plt

# Show the last six images of the dataset (indices 1791 to 1796) in a 3x2 grid
for position, index in enumerate(range(1791, 1797), start=1):
    plt.subplot(3, 2, position)
    plt.imshow(digits.images[index], cmap=plt.cm.gray_r,
               interpolation='nearest')

OUTPUT:




Support Vector Classifier:


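                         from sklearn import svm
                         svc = svm.SVC(gamma=0.001, C=100.)
                         # Train on samples 0..1789
                         svc.fit(main_data[:1790], targets[:1790])
                         # Predict the last six samples (1791..1796); sample 1790 is
                         # used for neither training nor testing
                         predictions = svc.predict(main_data[1791:])
                         predictions, targets[1791:]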
OUTPUT:

          (array([4, 9, 0, 8, 9, 8]), array([4, 9, 0, 8, 9, 8]))

From the SVC we get 100% accuracy, but on only 6 test samples, which is too few for a reliable estimate.
Training data : 1790
Test data : 6
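
As a rough sketch (not part of the original experiment), the same SVC settings can be checked on a larger held-out set. The 25% test fraction, the `stratify` option and `random_state=0` are assumptions chosen only for illustration.

```python
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()

# Assumed split: hold out 25% of the samples, shuffled and stratified by digit
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target,
    test_size=0.25, stratify=digits.target, random_state=0,
)

# Same hyperparameters as in the post
svc = svm.SVC(gamma=0.001, C=100.)
svc.fit(X_train, y_train)

svc_accuracy = accuracy_score(y_test, svc.predict(X_test))
print(len(y_test), svc_accuracy)
```

With a few hundred test samples instead of six, a single misclassified image no longer moves the accuracy by a large step.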




Decision Tree Classifier:


                    from sklearn.tree import DecisionTreeClassifier
                    dt = DecisionTreeClassifier(criterion='gini')
                    dt.fit(main_data[:1600], targets[:1600])

                    predictions2 = dt.predict(main_data[1601:])
                    from sklearn.metrics import accuracy_score, confusion_matrix
                    # Rows are the true digits, columns the predicted digits
                    confusion_matrix(targets[1601:], predictions2)
OUTPUT:


       array([[17,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 17,  0,  0,  1,  0,  0,  0,  2,  0],
       [ 0,  0, 13,  1,  0,  1,  0,  1,  1,  0],
       [ 0,  2,  2,  9,  0,  3,  2,  4,  0,  0],
       [ 0,  0,  0,  0, 18,  0,  1,  2,  0,  1],
       [ 0,  0,  0,  1,  2, 15,  0,  0,  1,  0],
       [ 0,  0,  0,  1,  2,  0, 19,  0,  0,  0],
       [ 0,  0,  0,  2,  1,  0,  0, 17,  0,  0],
       [ 0,  2,  1,  0,  0,  0,  0,  1, 13,  0],
       [ 0,  1,  0,  0,  0,  0,  0,  2,  1, 16]], dtype=int64)

                    accuracy_score(targets[1601:] , predictions2)
OUTPUT:
           0.7857142857142857

From the Decision Tree Classifier we get about 79% accuracy (154 of 196 test samples correct).
Training data : 1600
Test data : 196
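
The accuracy can be read directly off the confusion matrix printed above: correct predictions sit on the diagonal, and the sum of all entries is the number of test samples. A small sketch in plain Python, with the matrix copied from the output (the variable names are illustrative):

```python
# Confusion matrix from the output above: rows = true digit, columns = predicted digit
matrix = [
    [17, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 17, 0, 0, 1, 0, 0, 0, 2, 0],
    [0, 0, 13, 1, 0, 1, 0, 1, 1, 0],
    [0, 2, 2, 9, 0, 3, 2, 4, 0, 0],
    [0, 0, 0, 0, 18, 0, 1, 2, 0, 1],
    [0, 0, 0, 1, 2, 15, 0, 0, 1, 0],
    [0, 0, 0, 1, 2, 0, 19, 0, 0, 0],
    [0, 0, 0, 2, 1, 0, 0, 17, 0, 0],
    [0, 2, 1, 0, 0, 0, 0, 1, 13, 0],
    [0, 1, 0, 0, 0, 0, 0, 2, 1, 16],
]

# Correct predictions lie on the diagonal
correct = sum(matrix[i][i] for i in range(len(matrix)))
# Every test sample is counted exactly once in the matrix
total = sum(sum(row) for row in matrix)
dt_accuracy = correct / total

# Fraction of each true digit that was recognised (per-class recall)
recall_per_digit = [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]

print(correct, total, dt_accuracy)  # 154 correct out of 196
```

The total of 196 matches the slice `main_data[1601:]` (indices 1601 to 1796), and the digit 3 is the weakest class, with 9 of 22 samples recognised.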



Random Forest Classifier:



                      

from sklearn.ensemble import RandomForestClassifier


rc = RandomForestClassifier(n_estimators = 150)
rc.fit(main_data[:1500] , targets[:1500])
predictions3 = rc.predict(main_data[1501:])
accuracy_score(targets[1501:] , predictions3)
OUTPUT:

0.9222972972972973

From the Random Forest Classifier we get about 92% accuracy with n_estimators = 150.
Training data : 1500
Test data : 296
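
How much a measured accuracy can be trusted depends on the number of test samples. The slice `main_data[1501:]` holds 296 samples, and 0.9223 × 296 = 273 of them were classified correctly. One possible way to quantify the uncertainty, which the original analysis does not use, is a Wilson score interval; the helper name `wilson_interval` and the 95% level (z = 1.96) are assumptions for this sketch.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Approximate Wilson score interval for a proportion (z = 1.96 for ~95%)."""
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width

# Random forest: 273 of 296 test samples correct
rf_low, rf_high = wilson_interval(273, 296)

# SVC: 6 of 6 test samples correct
svc_low, svc_high = wilson_interval(6, 6)

print(rf_low, rf_high)
print(svc_low, svc_high)
```

Under these assumptions the random forest interval is roughly 0.886 to 0.948, so the 95% target sits right at the edge of what this single split can resolve, while the six-sample SVC result is compatible with a true accuracy anywhere from about 0.61 to 1.0.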




Conclusion:

Data matters the most: we need a good amount of data to train the model. If we have less data, we can use other machine learning classifiers such as random forest, which gives 92% accuracy with a training set of 1500 samples, fewer than the support vector classifier was trained on.



As per our hypothesis, we can say that with hyperparameter tuning across different machine learning models, or with more data, we can achieve close to 95% accuracy on the handwritten digits dataset. We must also keep a good amount of test data; otherwise the accuracy estimate is unreliable and overfitting can go unnoticed.
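
Hyperparameter tuning could look roughly like the sketch below, which uses scikit-learn's `GridSearchCV` with cross-validation on the training part only. The parameter grid, the 5 folds and the 25% held-out split are illustrative assumptions, not values from the original experiment.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split

digits = datasets.load_digits()

# Assumed split: keep 25% of the data aside as a final test set
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target,
    test_size=0.25, stratify=digits.target, random_state=0,
)

# Small illustrative grid around the values used in the post (an assumption)
param_grid = {"C": [1, 10, 100], "gamma": [0.0001, 0.001, 0.01]}

# 5-fold cross-validation on the training data selects the best combination
search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)

# The held-out test set is used once, after tuning, with the refitted best model
test_accuracy = search.score(X_test, y_test)
print(test_accuracy)
```

Keeping the test set out of the tuning loop means the final number is not inflated by the search itself.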
