{ "cells": [ { "cell_type": "markdown", "id": "ca2f8f92-9d01-49a1-b1b0-71dbceb5ae14", "metadata": {}, "source": [ "A classic example of machine learning is shown with scikit-learn." ] }, { "cell_type": "markdown", "id": "59dff0f9-1ace-43f2-9a7d-49f8783f37e5", "metadata": {}, "source": [ "# 0. Installing scikit-learn" ] }, { "cell_type": "markdown", "id": "a4b3c36d-9425-4e81-921d-5f3e4be5c97a", "metadata": {}, "source": [ "To install scikit-learrn with miniconda, do\n", "\n", "```\n", "conda install scikit-learn\n", "```\n", "\n", "in a terminal window. For plotting, `matplotlib` is useful as well." ] }, { "cell_type": "markdown", "id": "d9991074-9301-48f8-839d-66b12ac1a5ae", "metadata": {}, "source": [ "# 1. The Handwritten Digits Data" ] }, { "cell_type": "code", "execution_count": 1, "id": "9c524321-d3f6-45d3-8f53-54805da5be7d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn import datasets\n", "'load_digits' in dir(datasets)" ] }, { "cell_type": "code", "execution_count": 3, "id": "56729d2f-8fdc-4662-b14f-dea3e8d984a7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function load_digits in module sklearn.datasets._base:\n", "\n", "load_digits(*, n_class=10, return_X_y=False, as_frame=False)\n", " Load and return the digits dataset (classification).\n", " \n", " Each datapoint is a 8x8 image of a digit.\n", " \n", " ================= ==============\n", " Classes 10\n", " Samples per class ~180\n", " Samples total 1797\n", " Dimensionality 64\n", " Features integers 0-16\n", " ================= ==============\n", " \n", " This is a copy of the test set of the UCI ML hand-written digits datasets\n", " https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits\n", " \n", " Read more in the :ref:`User Guide `.\n", " \n", " Parameters\n", " ----------\n", " n_class : int, default=10\n", " The number of classes to return. Between 0 and 10.\n", " \n", " return_X_y : bool, default=False\n", " If True, returns ``(data, target)`` instead of a Bunch object.\n", " See below for more information about the `data` and `target` object.\n", " \n", " .. versionadded:: 0.18\n", " \n", " as_frame : bool, default=False\n", " If True, the data is a pandas DataFrame including columns with\n", " appropriate dtypes (numeric). The target is\n", " a pandas DataFrame or Series depending on the number of target columns.\n", " If `return_X_y` is True, then (`data`, `target`) will be pandas\n", " DataFrames or Series as described below.\n", " \n", " .. versionadded:: 0.23\n", " \n", " Returns\n", " -------\n", " data : :class:`~sklearn.utils.Bunch`\n", " Dictionary-like object, with the following attributes.\n", " \n", " data : {ndarray, dataframe} of shape (1797, 64)\n", " The flattened data matrix. If `as_frame=True`, `data` will be\n", " a pandas DataFrame.\n", " target: {ndarray, Series} of shape (1797,)\n", " The classification target. If `as_frame=True`, `target` will be\n", " a pandas Series.\n", " feature_names: list\n", " The names of the dataset columns.\n", " target_names: list\n", " The names of target classes.\n", " \n", " .. versionadded:: 0.20\n", " \n", " frame: DataFrame of shape (1797, 65)\n", " Only present when `as_frame=True`. DataFrame with `data` and\n", " `target`.\n", " \n", " .. versionadded:: 0.23\n", " images: {ndarray} of shape (1797, 8, 8)\n", " The raw image data.\n", " DESCR: str\n", " The full description of the dataset.\n", " \n", " (data, target) : tuple if ``return_X_y`` is True\n", " A tuple of two ndarrays by default. The first contains a 2D ndarray of\n", " shape (1797, 64) with each row representing one sample and each column\n", " representing the features. The second ndarray of shape (1797) contains\n", " the target samples. If `as_frame=True`, both arrays are pandas objects,\n", " i.e. `X` a dataframe and `y` a series.\n", " \n", " .. versionadded:: 0.18\n", " \n", " Examples\n", " --------\n", " To load the data and visualize the images::\n", " \n", " >>> from sklearn.datasets import load_digits\n", " >>> digits = load_digits()\n", " >>> print(digits.data.shape)\n", " (1797, 64)\n", " >>> import matplotlib.pyplot as plt\n", " >>> plt.gray()\n", " >>> plt.matshow(digits.images[0])\n", " <...>\n", " >>> plt.show()\n", "\n" ] } ], "source": [ "help(datasets.load_digits)" ] }, { "cell_type": "code", "execution_count": 4, "id": "85746ed9-8a59-4d33-8cc9-11683af2f39a", "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import load_digits" ] }, { "cell_type": "code", "execution_count": 5, "id": "27043372-d387-4c3b-9606-66302f040444", "metadata": {}, "outputs": [], "source": [ "digits = load_digits()" ] }, { "cell_type": "code", "execution_count": 6, "id": "c83f5bd2-a0a3-45ae-8492-ab176a2beb53", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1797, 64)\n" ] } ], "source": [ "print(digits.data.shape)" ] }, { "cell_type": "markdown", "id": "4f0fe2c7-ecbb-4154-9f89-fabf33133110", "metadata": {}, "source": [ "The size of the data consists of 1797 samples, each of 64 numbers. Each sample represents an 8-by-8 image. The first image is shown below." ] }, { "cell_type": "code", "execution_count": 7, "id": "98e5a630-22b7-4daa-b4eb-0c3183f00cdd", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 8, "id": "74ddd5ef-55f7-4b6e-a85b-fdf250a49aac", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.gray()" ] }, { "cell_type": "code", "execution_count": 9, "id": "43270bfc-1d91-4694-93b1-7d761d42158a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZoAAAGkCAYAAAAIduO+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAYkElEQVR4nO3df2yUhR3H8c9B4VBsz4IU23BARSI/CogtcwWcP8AmDRLJNtQFWR1zWWdBsDHR6h+yXxz+sUUXZrMy0kkIlpAJsmyAJZPiYrqVaiNDg7ASeyisgcFd6ZIjts/+8mKH/fEc/fL0ub5fyZN5t+e8T0zl7dO79gKO4zgCAMDICK8HAADSG6EBAJgiNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYSpvQvPbaa8rPz9eYMWNUWFiod9991+tJ/Tpy5IiWL1+uvLw8BQIB7d271+tJAxKJRLRgwQJlZmYqJydHK1as0IkTJ7yeNSDV1dWaO3eusrKylJWVpeLiYu3fv9/rWa5FIhEFAgFt2LDB6yn92rhxowKBQI/j1ltv9XrWgHz22Wd6/PHHNX78eN14442688471dzc7PWsfk2dOvWqf+aBQEAVFRWe7EmL0OzatUsbNmzQiy++qA8++ED33HOPSktL1dbW5vW0PnV2dmrevHnasmWL11NcaWhoUEVFhRobG1VfX68vvvhCJSUl6uzs9HpavyZNmqTNmzfr6NGjOnr0qB544AE9/PDDOn78uNfTBqypqUk1NTWaO3eu11MGbPbs2Tp79mzyOHbsmNeT+nXx4kUtWrRIo0aN0v79+/XRRx/pV7/6lW6++Wavp/Wrqampxz/v+vp6SdLKlSu9GeSkgW984xtOeXl5j/tmzJjhPP/88x4tck+Ss2fPHq9npKS9vd2R5DQ0NHg9JSXZ2dnO73//e69nDEhHR4czffp0p76+3rn33nud9evXez2pXy+99JIzb948r2e49txzzzmLFy/2esagWL9+vTNt2jSnu7vbk+f3/RXNlStX1NzcrJKSkh73l5SU6L333vNo1fASi8UkSePGjfN4iTtdXV2qq6tTZ2eniouLvZ4zIBUVFVq2bJmWLl3q9RRXTp48qby8POXn5+uxxx5Ta2ur15P6tW/fPhUVFWnlypXKycnR/PnztXXrVq9nuXblyhXt2LFDa9asUSAQ8GSD70Nz/vx5dXV1aeLEiT3unzhxos6dO+fRquHDcRxVVlZq8eLFKigo8HrOgBw7dkw33XSTgsGgysvLtWfPHs2aNcvrWf2qq6vT+++/r0gk4vUUV+6++25t375dBw8e1NatW3Xu3DktXLhQFy5c8Hpan1pbW1VdXa3p06fr4MGDKi8v19NPP63t27d7Pc2VvXv36tKlS3riiSc825Dh2TMPsv8vteM4ntV7OFm7dq0+/PBD/e1vf/N6yoDdcccdamlp0aVLl/THP/5RZWVlamhoGNKxiUajWr9+vd5++22NGTPG6zmulJaWJv96zpw5Ki4u1rRp0/T666+rsrLSw2V96+7uVlFRkTZt2iRJmj9/vo4fP67q6mp9//vf93jdwG3btk2lpaXKy8vzbIPvr2huueUWjRw58qqrl/b29quucjC41q1bp3379umdd97RpEmTvJ4zYKNHj9btt9+uoqIiRSIRzZs3T6+++qrXs/rU3Nys9vZ2FRYWKiMjQxkZGWpoaNBvfvMbZWRkqKury+uJAzZ27FjNmTNHJ0+e9HpKn3Jzc6/6j4+ZM2cO+TcZfdWnn36qQ4cO6cknn/R0h+9DM3r0aBUWFibfVfGl+vp6LVy40KNV6c1xHK1du1Zvvvmm/vrXvyo/P9/rSdfEcRwlEgmvZ/RpyZIlOnbsmFpaWpJHUVGRVq1apZaWFo0cOdLriQOWSCT08ccfKzc31+spfVq0aNFVb9v/5JNPNGXKFI8WuVdbW6ucnBwtW7bM0x1p8a2zyspKrV69WkVFRSouLlZNTY3a2tpUXl7u9bQ+Xb58WadOnUrePn36tFpaWjRu3DhNnjzZw2V9q6io0M6dO/XWW28pMzMzeTUZCoV0ww03eLyuby+88IJKS0sVDofV0dGhuro6HT58WAcOHPB6Wp8yMzOveg1s7NixGj9+/JB/bezZZ5/V8uXLNXnyZLW3t+sXv/iF4vG4ysrKvJ7Wp2eeeUYLFy7Upk2b9Mgjj+gf//iHampqVFNT4/W0Aenu7lZtba3KysqUkeHxH/WevNfNwG9/+1tnypQpzujRo5277rrLF2+1feeddxxJVx1lZWVeT+vT122W5NTW1no9rV9r1qxJfp1MmDDBWbJkifP22297PSslfnl786OPPurk5uY6o0aNcvLy8pxvf/vbzvHjx72eNSB/+tOfnIKCAicYDDozZsxwampqvJ40YAcPHnQkOSdOnPB6ihNwHMfxJnEAgOHA96/RAACGNkIDADBFaAAApggNAMAUoQEAmCI0AABTaRWaRCKhjRs3Dvmf8v5/ft0t+Xe7X3dL/t3u192Sf7cPld1p9XM08XhcoVBIsVhMWVlZXs8ZML/ulvy73a+7Jf9u9+tuyb/bh8rutLqiAQAMPYQGAGDquv+mte7ubn3++efKzMwc9M+LicfjPf7XL/y6W/Lvdr/ulvy73a+7Jf9ut97tOI46OjqUl5enESN6v2657q/RnDlzRuFw+Ho+JQDAUDQa7fMzqa77FU1mZub1fkpIWrFihdcTUrJx40avJ6Ts8OHDXk9IiZ//mV+6dMnrCcNSf3+uX/fQ8PHK3hg1apTXE1Li5/8wGeqfzdMb/h2FW/19zfBmAACAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKYIDQDAFKEBAJgiNAAAU4QGAGCK0AAATKUUmtdee035+fkaM2aMCgsL9e677w72LgBAmnAdml27dmnDhg168cUX9cEHH+iee+5RaWmp2traLPYBAHzOdWh+/etf64c//KGefPJJzZw5U6+88orC4bCqq6st9gEAfM5VaK5cuaLm5maVlJT0uL+kpETvvffe1z4mkUgoHo/3OAAAw4er0Jw/f15dXV2aOHFij/snTpyoc+fOfe1jIpGIQqFQ8giHw6mvBQD4TkpvBggEAj1uO45z1X1fqqqqUiwWSx7RaDSVpwQA+FSGm5NvueUWjRw58qqrl/b29quucr4UDAYVDAZTXwgA8DVXVzSjR49WYWGh6uvre9xfX1+vhQsXDuowAEB6cHVFI0mVlZVavXq1ioqKVFxcrJqaGrW1tam8vNxiHwDA51yH5tFHH9WFCxf0s5/9TGfPnlVBQYH+8pe/aMqUKRb7AAA+5zo0kvTUU0/pqaeeGuwtAIA0xO86AwCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKYIDQDAVEoffAb/2bx5s9cTUnLbbbd5PSFl2dnZXk9IyX/+8x+vJ6TskUce8XpCSnbv3u31BFNc0QAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAw5To0R44c0fLly5WXl6dAIKC9e/cazAIApAvXoens7NS8efO0ZcsWiz0AgDST4fYBpaWlKi0ttdgCAEhDrkPjViKRUCKRSN6Ox+PWTwkAGELM3wwQiUQUCoWSRzgctn5KAMAQYh6aqqoqxWKx5BGNRq2fEgAwhJh/6ywYDCoYDFo/DQBgiOLnaAAAplxf0Vy+fFmnTp1K3j59+rRaWlo0btw4TZ48eVDHAQD8z3Vojh49qvvvvz95u7KyUpJUVlamP/zhD4M2DACQHlyH5r777pPjOBZbAABpiNdoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAw5fqDz4azwsJCryek7LbbbvN6QkqmTZvm9YSUtba2ej0hJfX19V5PSJlf/x3dvXu31xNMcUUDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKYIDQDAFKEBAJgiNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmXIUmEolowYIFyszMVE5OjlasWKETJ05YbQMApAFXoWloaFBFRYUaGxtVX1+vL774QiUlJers7LTaBwDwuQw3Jx84cKDH7draWuXk5Ki5uVnf+ta3BnUYACA9uArN/4vFYpKkcePG9XpOIpFQIpFI3o7H49fylAAAn0n5zQCO46iyslKLFy9WQUFBr+dFIhGFQqHkEQ6HU31KAIAPpRyatWvX6sMPP9Qbb7zR53lVVVWKxWLJIxqNpvqUAAAfSulbZ+vWrdO+fft05MgRTZo0qc9zg8GggsFgSuMAAP7nKjSO42jdunXas2ePDh8+rPz8fKtdAIA04So0FRUV2rlzp9566y1lZmbq3LlzkqRQKKQbbrjBZCAAwN9cvUZTXV2tWCym++67T7m5uclj165dVvsAAD7n+ltnAAC4we86AwCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKYIDQDAlKsPPhvusrOzvZ6QsubmZq8npKS1tdXrCcOOX79WMHRxRQMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKYIDQDAlKvQVFdXa+7cucrKylJWVpaKi4u1f/9+q20AgDTgKjSTJk3S5s2bdfToUR09elQPPPCAHn74YR0/ftxqHwDA5zLcnLx8+fIet3/5y1+qurpajY2Nmj179qAOAwCkB1eh+aquri7t3r1bnZ2dKi4u7vW8RCKhRCKRvB2Px1N9SgCAD7l+M8CxY8d00003KRgMqry8XHv27NGsWbN6PT8SiSgUCiWPcDh8TYMBAP7iOjR33HGHWlpa1NjYqJ/85CcqKyvTRx991Ov5VVVVisViySMajV7TYACAv7j+1tno0aN1++23S5KKiorU1NSkV199Vb/73e++9vxgMKhgMHhtKwEAvnXNP0fjOE6P12AAAPgqV1c0L7zwgkpLSxUOh9XR0aG6ujodPnxYBw4csNoHAPA5V6H597//rdWrV+vs2bMKhUKaO3euDhw4oAcffNBqHwDA51yFZtu2bVY7AABpit91BgAwRWgAAKYIDQDAFKEBAJgiNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKVcffDbcZWdnez0hZYcOHfJ6AnzCz1/nFy9e9HoCvgZXNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYOqaQhOJRBQIBLRhw4ZBmgMASDcph6apqUk1NTWaO3fuYO4BAKSZlEJz+fJlrVq1Slu3blV2dvZgbwIApJGUQlNRUaFly5Zp6dKl/Z6bSCQUj8d7HACA4SPD7QPq6ur0/vvvq6mpaUDnRyIR/fSnP3U9DACQHlxd0USjUa1fv147duzQmDFjBvSYqqoqxWKx5BGNRlMaCgDwJ1dXNM3NzWpvb1dhYWHyvq6uLh05ckRbtmxRIpHQyJEjezwmGAwqGAwOzloAgO+4Cs2SJUt07NixHvf94Ac/0IwZM/Tcc89dFRkAAFyFJjMzUwUFBT3uGzt2rMaPH3/V/QAASPxmAACAMdfvOvt/hw8fHoQZAIB0xRUNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKYIDQDAFKEBAJgiNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmrvmDz4aTixcvej0hZYWFhV5PGHays7O9npASP3+t7N692+sJ+Bpc0QAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAw5So0GzduVCAQ6HHceuutVtsAAGkgw+0DZs+erUOHDiVvjxw5clAHAQDSi+vQZGRkcBUDABgw16/RnDx5Unl5ecrPz9djjz2m1tbWPs9PJBKKx+M9DgDA8OEqNHfffbe2b9+ugwcPauvWrTp37pwWLlyoCxcu9PqYSCSiUCiUPMLh8DWPBgD4h6vQlJaW6jvf+Y7mzJmjpUuX6s9//rMk6fXXX+/1MVVVVYrFYskjGo1e22IAgK+4fo3mq8aOHas5c+bo5MmTvZ4TDAYVDAav5WkAAD52TT9Hk0gk9PHHHys3N3ew9gAA0oyr0Dz77LNqaGjQ6dOn9fe//13f/e53FY/HVVZWZrUPAOBzrr51dubMGX3ve9/T+fPnNWHCBH3zm99UY2OjpkyZYrUPAOBzrkJTV1dntQMAkKb4XWcAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKYIDQDAFKEBAJhy9cFnw11ra6vXE1JWWFjo9YSUrFy50usJKfPzdr96+eWXvZ6Ar8EVDQDAFKEBAJgiNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmHIdms8++0yPP/64xo8frxtvvFF33nmnmpubLbYBANJAhpuTL168qEWLFun+++/X/v37lZOTo3/961+6+eabjeYBAPzOVWhefvllhcNh1dbWJu+bOnXqYG8CAKQRV98627dvn4qKirRy5Url5ORo/vz52rp1a5+PSSQSisfjPQ4AwPDhKjStra2qrq7W9OnTdfDgQZWXl+vpp5/W9u3be31MJBJRKBRKHuFw+JpHAwD8w1Vouru7ddddd2nTpk2aP3++fvzjH+tHP/qRqqure31MVVWVYrFY8ohGo9c8GgDgH65Ck5ubq1mzZvW4b+bMmWpra+v1McFgUFlZWT0OAMDw4So0ixYt0okTJ3rc98knn2jKlCmDOgoAkD5cheaZZ55RY2OjNm3apFOnTmnnzp2qqalRRUWF1T4AgM+5Cs2CBQu0Z88evfHGGyooKNDPf/5zvfLKK1q1apXVPgCAz7n6ORpJeuihh/TQQw9ZbAEApCF+1xkAwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADBFaAAApggNAMAUoQEAmCI0AABThAYAYIrQAABMERoAgClCAwAwRWgAAKZcf/DZcNba2ur1hJQ9//zzXk9IyebNm72ekLLm5mavJ6SkqKjI6wlIM1zRAABMERoAgClCAwAwRWgAAKYIDQDAFKEBAJgiNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAEwRGgCAKUIDADDlKjRTp05VIBC46qioqLDaBwDwuQw3Jzc1Namrqyt5+5///KcefPBBrVy5ctCHAQDSg6vQTJgwocftzZs3a9q0abr33nsHdRQAIH24Cs1XXblyRTt27FBlZaUCgUCv5yUSCSUSieTteDye6lMCAHwo5TcD7N27V5cuXdITTzzR53mRSEShUCh5hMPhVJ8SAOBDKYdm27ZtKi0tVV5eXp/nVVVVKRaLJY9oNJrqUwIAfCilb519+umnOnTokN58881+zw0GgwoGg6k8DQAgDaR0RVNbW6ucnBwtW7ZssPcAANKM69B0d3ertrZWZWVlyshI+b0EAIBhwnVoDh06pLa2Nq1Zs8ZiDwAgzbi+JCkpKZHjOBZbAABpiN91BgAwRWgAAKYIDQDAFKEBAJgiNAAAU4QGAGCK0AAATBEaAIApQgMAMEVoAACmCA0AwBShAQCYIjQAAFOEBgBgitAAAExd94/I5LNsvHHlyhWvJ6Sko6PD6wkp++9//+v1BOC66O/P9YBznf/kP3PmjMLh8PV8SgCAoWg0qkmTJvX6/1/30HR3d+vzzz9XZmamAoHAoP694/G4wuGwotGosrKyBvXvbcmvuyX/bvfrbsm/2/26W/LvduvdjuOoo6NDeXl5GjGi91dirvu3zkaMGNFn+QZDVlaWr74YvuTX3ZJ/t/t1t+Tf7X7dLfl3u+XuUCjU7zm8GQAAYIrQAABMpVVogsGgXnrpJQWDQa+nuOLX3ZJ/t/t1t+Tf7X7dLfl3+1DZfd3fDAAAGF7S6ooGADD0EBoAgClCAwAwRWgAAKYIDQDAFKEBAJgiNAAAU4QGAGDqf64lQwQHsEU+AAAAAElFTkSuQmCC", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.matshow(digits.images[0])" ] }, { "cell_type": "markdown", "id": "99904c29-3502-42d5-9cac-3f6c433f2aeb", "metadata": {}, "source": [ "# 2. Classification" ] }, { "cell_type": "markdown", "id": "dd20f832-74b6-4281-b8d9-1dc853f29a15", "metadata": {}, "source": [ "This section follows the documentation at\n", "\n", "```\n", "https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html\n", "```\n", "\n", "attributed to Gael Varoquaux. License: BSD 3 clause." ] }, { "cell_type": "code", "execution_count": 10, "id": "3ab910b8-7cb3-4083-a256-9dbf1809c76d", "metadata": {}, "outputs": [], "source": [ "# Import datasets, classifiers and performance metrics\n", "from sklearn import datasets, metrics, svm\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "code", "execution_count": 12, "id": "d2e91b32-d742-4128-b51c-02b180433ae0", "metadata": {}, "outputs": [], "source": [ "# flatten the images\n", "n_samples = len(digits.images)\n", "data = digits.images.reshape((n_samples, -1))\n", "\n", "# Create a classifier: a support vector classifier\n", "clf = svm.SVC(gamma=0.001)\n", "\n", "# Split data into 50% train and 50% test subsets\n", "X_train, X_test, y_train, y_test = train_test_split(\n", " data, digits.target, test_size=0.5, shuffle=False\n", ")\n", "\n", "# Learn the digits on the train subset\n", "clf.fit(X_train, y_train)\n", "\n", "# Predict the value of the digit on the test subset\n", "predicted = clf.predict(X_test)" ] }, { "cell_type": "code", "execution_count": 13, "id": "f71ac0e8-ceb7-4a8b-b72e-6d54d48c4f55", "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAxsAAADQCAYAAABvGXwjAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAT5UlEQVR4nO3dfWxV9f0H8E+xPAhUijxVNLaDjafoqBszSBYoghqNjDqXZckW2zoNbOhWInOabaEiGcJ0YVl0j5EGh8uMLCXDTE1mqU5Q2WIJbEadoTiJqOCKEkEePL8/CN2v8ozn20vZ65X0j57e+z7fW+6n5757Ti9FWZZlAQAAkLMehV4AAABwZlI2AACAJJQNAAAgCWUDAABIQtkAAACSUDYAAIAklA0AACAJZQMAAEhC2QAAAJLoVmWjsbExioqKOj6Ki4vjggsuiLq6uti6dWuXrKGioiJqa2s7Pl+zZk0UFRXFmjVrTipn7dq10dDQEO3t7Yd9raqqKqqqqj7ROvP24osvRnV1dQwfPjz69u0bY8aMiQULFsQHH3xQ6KXxMeakcMxJ92FOTg+//e1vo6ioKPr371/opXAE5qRwXnjhhbjqqquipKQk+vfvH1OnTo1nn3220Ms6JcWFXsCpWLZsWYwZMyZ2794dTz/9dCxatChaWlpi48aN0a9fvy5dy+c+97lYt25djBs37qTut3bt2rjrrruitrY2SktLO33tgQceyHGFn9w///nPmDRpUowePTqWLl0agwcPjqeffjoWLFgQf//732PVqlWFXiJHYE66ljnpnsxJ4WzdujXmzZsXw4cPj507dxZ6ORyDOela69evj8mTJ8ell14aDz30UGRZFkuWLIlp06ZFc3NzXHbZZYVe4knplmXjoosuigkTJkRExNSpU+PAgQNx9913R1NTU3z9618/4n0++OCD6Nu3b+5rOeecc2LixIm5Zp7sAKX28MMPx549e2LlypUxcuTIiIi4/PLL480334xf//rX8Z///CcGDhxY4FXyceaka5mT7smcFM7s2bNj8uTJce6558ajjz5a6OVwDOaka/3oRz+K0tLSePzxxzu+h9OnT48RI0bEvHnzut0Zjm51GdXRHHrSbdmyJSIiamtro3///rFx48a48soro6SkJKZNmxYREXv37o2FCxfGmDFjonfv3jFkyJCoq6uLd955p1Pmvn374vbbb4+ysrLo27dvfPGLX4wXXnjhsH0f7XTe888/HzNmzIhBgwZFnz59YuTIkVFfXx8REQ0NDfG9730vIiI+9alPdZyePJRxpNN57777bnz729+O888/P3r16hUjRoyIH/zgB/Hhhx92ul1RUVHccsst8dBDD8XYsWOjb9++MX78+Fi9evVJf18P6dmzZ0REDBgwoNP20tLS6NGjR/Tq1euUs+k65uS/zAlHY07+K8WcHPK73/0uWlpaTrvfKHNizMl/pZiTZ599NqqqqjqVtZKSkpg8eXKsXbs23nzzzVPOLoRueWbj4/71r39FRMSQIUM6tu3duze+9KUvxaxZs+KOO+6I/fv3x0cffRQzZ86MZ555Jm6//faYNGlSbNmyJebPnx9VVVXxt7/9Lc4+++yIiLj55ptj+fLlMW/evLjiiiti06ZN8eUvfznef//9467niSeeiBkzZsTYsWPjpz/9aVx44YXR1tYWTz75ZERE3HTTTfHuu+/Gz3/+8/jjH/8Y5513XkQcvVnv2bMnpk6dGq+99lrcdddd8dnPfjaeeeaZWLRoUbS2tsZjjz3W6faPPfZYrF+/PhYsWBD9+/ePJUuWxHXXXRcvv/xyjBgxouN2RUVFMWXKlONe91hTUxNLly6Nb33rW7F48eIYMmRItLS0xK9+9auYM2dOl59C5dSYE3PC8ZmTtHMSEfH2229HfX193HPPPXHBBRcc9/acfsxJ2jnZu3dv9O7d+7Dth7Zt3Lix4zF0C1k3smzZsiwisueeey7bt29f9v7772erV6/OhgwZkpWUlGTbtm3LsizLampqsojIHnzwwU73//3vf59FRLZy5cpO29evX59FRPbAAw9kWZZlL730UhYR2dy5czvdbsWKFVlEZDU1NR3bmpubs4jImpubO7aNHDkyGzlyZLZ79+6jPpaf/OQnWURkmzdvPuxrU6ZMyaZMmdLx+S9/+cssIrJHHnmk0+0WL16cRUT25JNPdmyLiGzYsGHZe++917Ft27ZtWY8ePbJFixZ1uv9ZZ52VXX755Udd4//30ksvZWPGjMkiouPjO9/5TvbRRx+d0P3pOubEnHB85qRwc3L99ddnkyZN6piLmpqarF+/fid0X7qWOSnMnFRWVmajRo3KDhw40LFt37592YgRI7KIyB5++OHjZpxOuuVlVBMnToyePXtGSUlJXHvttVFWVhZ//vOfY9iwYZ1ud/3113f6fPXq1VFaWhozZsyI/fv3d3xUVlZGWVlZR9Nsbm6OiDjsOsSvfvWrUVx87JNBr7zySrz22mvxzW9+M/r06fMJH+lBTz31VPTr1y++8pWvdNp+6N0Z/vKXv3TaPnXq1CgpKen4fNiwYTF06NCO052H7N+//7D7HklbW1vHqclHH300WlpaYsmSJdHY2Bg33XTTKT4qUjMnB5kTjsWcHNRVc7Jy5cr405/+FL/5zW+iqKjoFB8FXc2cHNRVc3LrrbfGK6+8Erfcckts3bo1/v3vf8fs2bM78nr06F4v37vlZVTLly+PsWPHRnFxcQwbNuyIp5L69u0b55xzTqdtb731VrS3tx/12unt27dHRMSOHTsiIqKsrKzT14uLi2PQoEHHXNuhaxDzPDW8Y8eOKCsrO+wH89ChQ6O4uLhjvYccaY29e/eO3bt3n9L+77jjjnjvvfeitbW141KQyZMnx+DBg+PGG2+MG264IaZMmXJK2aRjTg4yJxyLOTmoK+Zk165dMWfOnLj11ltj+PDhHW9Bunfv3oiIaG9vj549e7rk8DRkTg7qquPJjTfeGO+8804sXLgwfvGLX0RExGWXXRbz5s2LxYsXx/nnn39KuYXSLcvG2LFjO94V4WiO9BuTwYMHx6BBg+Lxxx8/4n0OtdJDT5pt27Z1+gfdv3//YU+wjzt0/eIbb7xxzNudjEGDBsXzzz8fWZZ1elxvv/127N+/PwYPHpzbvo6ktbU1xo0bd9gB4Atf+EJERGzatMmLqNOQOTnInHAs5uSgrpiT7du3x1tvvRX33Xdf3HfffYd9feDAgTFz5sxoampKtgZOjTk5qKuOJxER3//+96O+vj5effXVKCkpifLy8pg1a1b069cvPv/5zyfff56613mYT+jaa6+NHTt2xIEDB2LChAmHfYwePToiouMdCVasWNHp/o888kjs37//mPsYNWpUjBw5Mh588MHD3rHg/zv0Rz4n0nqnTZsWu3btOuwH8PLlyzu+ntLw4cPjH//4R+zatavT9nXr1kVEvr9NoPDMyakxJ/9bzMnJKysri+bm5sM+rrrqqujTp080NzfHwoULk+2frmdOPpnevXvHRRddFOXl5fH666/HH/7wh7j55ps7/qi+u+iWZzZO1de+9rVYsWJFXHPNNfHd7343Lr300ujZs2e88cYb0dzcHDNnzozrrrsuxo4dG9/4xjdi6dKl0bNnz5g+fXps2rQp7r333sNOER7J/fffHzNmzIiJEyfG3Llz48ILL4zXX389nnjiiY5BuvjiiyMi4mc/+1nU1NREz549Y/To0Z2u+TvkhhtuiPvvvz9qamqira0tLr744vjrX/8aP/7xj+Oaa66J6dOnn9L3o7i4OKZMmXLc6wfr6+ujuro6rrjiipg7d24MHjw4nnvuuVi0aFGMGzcurr766lPaP6cnc9KZOeFIzElnJzInffr0OeL/0tzY2BhnnXXWafc/OPPJmZPOTvR4smnTpli5cmVMmDAhevfuHRs2bIh77rknPvOZz8Tdd999SvsuqML+ffrJOfSuCOvXrz/m7Y71zhb79u3L7r333mz8+PFZnz59sv79+2djxozJZs2alb366qsdt/vwww+z2267LRs6dGjWp0+fbOLEidm6deuy8vLy474rQpZl2bp167Krr746GzBgQNa7d+9s5MiRh73Lwp133pkNHz4869GjR6eMj78rQpZl2Y4dO7LZs2dn5513XlZcXJyVl5dnd955Z7Znz55Ot4uIbM6cOYc97o+v+9BtP76fo3nqqaeyK6+8MisrK8vOPvvsbNSoUdltt92Wbd++/YTuT9cxJ+aE4zMnhZuTj/NuVKcvc1KYOXn55ZezyZMnZ+eee27Wq1ev7NOf/nT2wx/+MNu1a9dx73s6KsqyLOvaegMAAPwv+J/6mw0AAKDrKBsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkMQZ9z+It7e3555ZW1ube2Zra2vumRFpHv+aNWtyz6ysrMw9kxPX2NiYe2ZDQ0PumVu2bMk9MyKiqakp98yZM2fmnsmZJ8XP0+rq6twzIyKWLl2ae2aK4ymFl+K1R4pjSopjX0REVVVV7pkpHn+hXns5swEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsWF3Hl7e3vumVVVVblnbtiwIffMKVOm5J4ZEdHS0pJ7ZlNTU+6ZlZWVuWeeqdra2nLPrKuryz2zO9m8eXOhl8D/qPr6+twzKyoqcs+MiKiurk6Sy5knxXMlxeuEFMfTiIja2trcM1tbW3PPLNRrL2c2AACAJJQNAAAgCWUDAABIQtkAAACSUDYAAIAklA0AACAJZQMAAEhC2QAAAJJQNgAAgCSUDQAAIAllAwAASELZAAAAklA2AACAJJQNAAAgCWUDAABIQtkAAACSUDYAAIAklA0AACAJZQMAAEhC2QAAAJIoLuTOly5dmnvmhg0bcs9sbm7OPbOtrS33zIiIlpaW3DMvueSS3DMprAEDBuSeuXPnztwzU6wzIqK6ujpJLmeW7nKM2rx5c+6ZERGlpaVJcjnztLe3555ZUVGRe2ZTU1PumRERq1atyj2zsrIy98xCcWYDAABIQtkAAACSUDYAAIAklA0AACAJZQMAAEhC2QAAAJJQNgAAgCSUDQAAIAllAwAASELZAAAAklA2AACAJJQNAAAgCWUDAABIQtkAAACSUDYAAIAklA0AACAJZQMAAEhC2QAAAJJQNgAAgCSUDQAAIIniQu78kksuyT1zwIABuWcuXbo098y2trbcMyMiysvLc8+cOXNm7pmcuIqKitwzUzyn6+rqcs9MpampKffM+vr63DM5cWvWrMk9s6GhIffM+fPn556Z4mdERMSqVatyz3Q8OTOlOKY0NjbmnpnqtVeK155VVVW5ZxaKMxsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASRRlWZYVehF5amtryz2ztrY298yWlpbcMyMixo8fn3tma2tr7pkUVkVFRe6ZVVVV3SIzIqKuri73zBdffDH3zMrKytwzz1TV1dW5Z6b42Zcis6mpKffMiDRzkmKtM2fOzD0TTkaKY1WK154pMk+EMxsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsWFXkDeKioqcs9sb2/PPTOVDRs25J7Z2NiYe2ZtbW3umWeqFM+/LVu25J5ZX1+fe2ZlZWXumRERdXV1uWeuWbMm98xUj7/QUjynV61alXtmeXl57pnV1dW5Z7a0tOSemUqK7yknp6GhIffM0tLS3DNTHFNSaW1tzT1z4MCBuWcWijMbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASygYAAJCEsgEAACShbAAAAEkUF3oB3cGGDRsKvYSCam9vL/QS/qeVlpbmnllTU5N7ZkNDQ+6ZqQwYMCD3zKqqqtwzz1Td5Tnd1taWe2ZFRUXumS0tLblnRqT5nlZWVuaeycmpr6/PPbO6ujr3zNbW1twza2trc8+MiNi5c2fumeXl5blnFoozGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJFGVZlhV6Eae76urq3DPb2tpyz4yIKC0tzT2zqakp98wU6+TEtba25p6ZYk62bNmSe2ZExLJly3LPrK2tzT2TM09jY2PumXV1dblnRkRs3rw598yKiorcMzkzVVZW5p65YcOG3DMjIubPn597ZkNDQ+6ZheLMBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASygYAAJCEsgEAACShbAAAAEkoGwAAQBLKBgAAkISyAQAAJKFsAAAASSgbAABAEsoGAACQhLIBAAAkoWwAAABJKBsAAEASRVmWZYVeBAAAcOZxZgMAAEhC2QAAAJJQNgAAgCSUDQAAIAllAwAASELZAAAAklA2AACAJJQNAAAgCWUDAABI4v8AJzg3oap6e70AAAAASUVORK5CYII=", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))\n", "for ax, image, prediction in zip(axes, X_test, predicted):\n", " ax.set_axis_off()\n", " image = image.reshape(8, 8)\n", " ax.imshow(image, cmap=plt.cm.gray_r, interpolation=\"nearest\")\n", " ax.set_title(f\"Prediction: {prediction}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "c8813b88-ea26-4afd-a871-8e200980f2ed", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.10" } }, "nbformat": 4, "nbformat_minor": 5 }