{"nbformat":4,"nbformat_minor":0,"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"},"colab":{"name":"1.1 Downloading GeneratedLabelledFlows CICIDS2017 Dataset .ipynb","provenance":[],"collapsed_sections":[]}},"cells":[{"cell_type":"markdown","metadata":{"id":"ovbd1i5W0yqa","colab_type":"text"},"source":["# Description\n","\n","This notebook demonstrates how to download GeneratedLabelledFlows from the CICIDS2017 dataset.\n","\n","*Author*: **Mahendra Data** mahendra.data@dbms.cs.kumamoto-u.ac.jp\n","\n","License: **BSD 3 clause**"]},{"cell_type":"markdown","metadata":{"id":"QHdWG9Ol00PE","colab_type":"text"},"source":["# Mounting Google Drive\n","\n","We will save the downloaded dataset to Google Drive."]},{"cell_type":"code","metadata":{"id":"8Q4fip2H02Pd","colab_type":"code","colab":{}},"source":["from google.colab import drive\n","drive.mount(\"/content/drive\")"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Dnr9koUm0yqd","colab_type":"text"},"source":["# Downloading the dataset\n","\n","The description of the CICIDS2017 dataset is available at https://www.unb.ca/cic/datasets/ids-2017.html\n","\n","There are three versions available:\n","\n","1. Raw network captured data (PCAPs),\n","2. Generated Labelled Flows, and\n","3. 
Machine Learning CSV.\n","\n","In this notebook, we will download the `GeneratedLabelledFlows.zip` version of this dataset."]},{"cell_type":"code","metadata":{"id":"KDQBKsnl0yqf","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":187},"executionInfo":{"status":"ok","timestamp":1596692030271,"user_tz":-540,"elapsed":126707,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"050ac7f2-df10-48a4-c24e-c5630c6562ad"},"source":["!wget -nc -O GeneratedLabelledFlows.zip http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/GeneratedLabelledFlows.zip"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2020-08-06 05:32:43-- http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/GeneratedLabelledFlows.zip\n","Connecting to 205.174.165.80:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 283876488 (271M) [application/zip]\n","Saving to: ‘GeneratedLabelledFlows.zip’\n","\n","GeneratedLabelledFl 100%[===================>] 270.73M 6.75MB/s in 65s \n","\n","2020-08-06 05:33:49 (4.16 MB/s) - ‘GeneratedLabelledFlows.zip’ saved [283876488/283876488]\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"QpaSsiJU0yqk","colab_type":"text"},"source":["# Integrity check\n","\n","Download the `GeneratedLabelledFlows.md5` file to check the integrity of the downloaded file."]},{"cell_type":"code","metadata":{"id":"F7DMcBUz0yqk","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":187},"executionInfo":{"status":"ok","timestamp":1596692032061,"user_tz":-540,"elapsed":128488,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"b54391fe-ea88-4fe4-efd0-fc38ba296c57"},"source":["!wget -nc 
http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/GeneratedLabelledFlows.md5"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2020-08-06 05:33:50-- http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/GeneratedLabelledFlows.md5\n","Connecting to 205.174.165.80:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 61\n","Saving to: ‘GeneratedLabelledFlows.md5’\n","\n","\r Generated 0%[ ] 0 --.-KB/s \rGeneratedLabelledFl 100%[===================>] 61 --.-KB/s in 0s \n","\n","2020-08-06 05:33:50 (6.88 MB/s) - ‘GeneratedLabelledFlows.md5’ saved [61/61]\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"akxfdrZ_0yqo","colab_type":"text"},"source":["Check the file integrity with `md5sum`."]},{"cell_type":"code","metadata":{"id":"8zEf1JtW0yqp","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":34},"executionInfo":{"status":"ok","timestamp":1596692034074,"user_tz":-540,"elapsed":130490,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"71075251-0e2a-4509-8a52-f0d2c6960908"},"source":["!md5sum -c GeneratedLabelledFlows.md5"],"execution_count":null,"outputs":[{"output_type":"stream","text":["GeneratedLabelledFlows.zip: OK\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"J3NpbnnN0yqt","colab_type":"text"},"source":["If the downloaded file is intact, the output should be:\n","\n","`GeneratedLabelledFlows.zip: OK`"]},{"cell_type":"markdown","metadata":{"id":"fueJb4vK3RYs","colab_type":"text"},"source":["# Saving the dataset\n","\n","Save the downloaded zip file to Google Drive."]},{"cell_type":"code","metadata":{"id":"-Qbe7PBF1rIP","colab_type":"code","colab":{}},"source":["!mkdir -p \"/content/drive/My Drive/CICIDS2017/\"\n","\n","!cp GeneratedLabelledFlows.zip \"/content/drive/My 
Drive/CICIDS2017/\""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"vWkym95UfOYu","colab_type":"text"},"source":["The dataset is now saved in the `CICIDS2017` folder of your Google Drive."]}]}