{"nbformat":4,"nbformat_minor":0,"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"},"colab":{"name":"1.2 Downloading MachineLearningCSV CICIDS2017 Dataset .ipynb","provenance":[],"collapsed_sections":[]}},"cells":[{"cell_type":"markdown","metadata":{"id":"ovbd1i5W0yqa","colab_type":"text"},"source":["# Description\n","\n","This notebook demonstrates how to download the MachineLearningCSV version of the CICIDS2017 dataset.\n","\n","*Author*: **Mahendra Data** mahendra.data@dbms.cs.kumamoto-u.ac.jp\n","\n","License: **BSD 3 clause**"]},{"cell_type":"markdown","metadata":{"id":"QHdWG9Ol00PE","colab_type":"text"},"source":["# Mounting Google Drive\n","\n","We will save the downloaded dataset to Google Drive."]},{"cell_type":"code","metadata":{"id":"8Q4fip2H02Pd","colab_type":"code","colab":{}},"source":["from google.colab import drive\n","drive.mount(\"/content/drive\")"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Dnr9koUm0yqd","colab_type":"text"},"source":["# Downloading the dataset\n","\n","A description of the CICIDS2017 dataset is available at https://www.unb.ca/cic/datasets/ids-2017.html\n","\n","There are three versions available:\n","\n","1. Raw captured network data (PCAPs),\n","2. Generated Labelled Flows, and\n","3. 
Machine Learning CSV.\n","\n","In this notebook, we will download the `MachineLearningCSV.zip` version of this dataset.\n","\n","While downloading, we save `MachineLearningCSV.zip` as `MachineLearningCVE.zip`, because the checksum file `MachineLearningCSV.md5` lists `MachineLearningCVE.zip` as the target filename."]},{"cell_type":"code","metadata":{"id":"KDQBKsnl0yqf","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":187},"executionInfo":{"status":"ok","timestamp":1596691896000,"user_tz":-540,"elapsed":140853,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"7b0683ac-f930-4803-f21e-ab4b58085ac1"},"source":["!wget -nc -O MachineLearningCVE.zip http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.zip"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2020-08-06 05:29:51-- http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.zip\n","Connecting to 205.174.165.80:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 235102953 (224M) [application/zip]\n","Saving to: ‘MachineLearningCVE.zip’\n","\n","MachineLearningCVE. 
100%[===================>] 224.21M 6.73MB/s in 91s \n","\n","2020-08-06 05:31:35 (2.46 MB/s) - ‘MachineLearningCVE.zip’ saved [235102953/235102953]\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"QpaSsiJU0yqk","colab_type":"text"},"source":["# Integrity check\n","\n","Download the `MachineLearningCSV.md5` file to check the integrity of the downloaded archive."]},{"cell_type":"code","metadata":{"id":"F7DMcBUz0yqk","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":187},"executionInfo":{"status":"ok","timestamp":1596691897711,"user_tz":-540,"elapsed":142560,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"3b5bfb92-3f09-4822-897e-17df12cc495d"},"source":["!wget -nc http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.md5"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2020-08-06 05:31:36-- http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.md5\n","Connecting to 205.174.165.80:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 57\n","Saving to: ‘MachineLearningCSV.md5’\n","\n","\rMachineLearningCSV. 0%[ ] 0 --.-KB/s \rMachineLearningCSV. 
100%[===================>] 57 --.-KB/s in 0s \n","\n","2020-08-06 05:31:36 (7.06 MB/s) - ‘MachineLearningCSV.md5’ saved [57/57]\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"akxfdrZ_0yqo","colab_type":"text"},"source":["Check the file integrity against the downloaded checksum."]},{"cell_type":"code","metadata":{"id":"8zEf1JtW0yqp","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":34},"executionInfo":{"status":"ok","timestamp":1596691899602,"user_tz":-540,"elapsed":144447,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"44b15cc6-9da5-4965-9593-5d598861ceeb"},"source":["!md5sum -c MachineLearningCSV.md5"],"execution_count":null,"outputs":[{"output_type":"stream","text":["MachineLearningCVE.zip: OK\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"J3NpbnnN0yqt","colab_type":"text"},"source":["If the downloaded file is intact, the output should be:\n","\n","`MachineLearningCVE.zip: OK`"]},{"cell_type":"markdown","metadata":{"id":"fueJb4vK3RYs","colab_type":"text"},"source":["# Saving the dataset\n","\n","Copy the zip file to Google Drive."]},{"cell_type":"code","metadata":{"id":"-Qbe7PBF1rIP","colab_type":"code","colab":{}},"source":["!mkdir -p \"/content/drive/My Drive/CICIDS2017/\"\n","\n","!cp MachineLearningCVE.zip \"/content/drive/My Drive/CICIDS2017/\""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"vWkym95UfOYu","colab_type":"text"},"source":["The dataset is now saved in the `CICIDS2017` folder of your Google Drive."]}]}