{"nbformat":4,"nbformat_minor":0,"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"},"colab":{"name":"1.2 Downloading MachineLearningCSV CICIDS2017 Dataset .ipynb","provenance":[],"collapsed_sections":[]}},"cells":[{"cell_type":"markdown","metadata":{"id":"ovbd1i5W0yqa","colab_type":"text"},"source":["# Description\n","\n","This notebook demonstrates how to download MachineLearningCSV from the CICIDS2017 dataset.\n","\n","*Author*: **Mahendra Data** mahendra.data@dbms.cs.kumamoto-u.ac.jp\n","\n","License: **BSD 3 clause**"]},{"cell_type":"markdown","metadata":{"id":"QHdWG9Ol00PE","colab_type":"text"},"source":["# Mounting Google Drive\n","\n","We will save the downloaded dataset to Google Drive."]},{"cell_type":"code","metadata":{"id":"8Q4fip2H02Pd","colab_type":"code","colab":{}},"source":["from google.colab import drive\n","drive.mount(\"/content/drive\")"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Dnr9koUm0yqd","colab_type":"text"},"source":["# Downloading the dataset\n","\n","The description of the CICIDS2017 dataset is available at https://www.unb.ca/cic/datasets/ids-2017.html\n","\n","There are three versions available:\n","\n","1. Raw network captured data (PCAPs),\n","2. Generated Labelled Flows, and\n","3. Machine Learning CSV.\n","\n","In this notebook, we will download the `MachineLearningCSV.zip` version of this dataset.\n","\n","When downloading this dataset, we rename the `MachineLearningCSV.zip` file to `MachineLearningCVE.zip` because the `MachineLearningCSV.md5` checksum file lists the target filename as `MachineLearningCVE.zip`."]},{"cell_type":"code","metadata":{"id":"KDQBKsnl0yqf","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":187},"executionInfo":{"status":"ok","timestamp":1596691896000,"user_tz":-540,"elapsed":140853,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"7b0683ac-f930-4803-f21e-ab4b58085ac1"},"source":["!wget -nc -O MachineLearningCVE.zip http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.zip"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2020-08-06 05:29:51-- http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.zip\n","Connecting to 205.174.165.80:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 235102953 (224M) [application/zip]\n","Saving to: MachineLearningCVE.zip\n","\n","MachineLearningCVE. 100%[===================>] 224.21M 6.73MB/s in 91s \n","\n","2020-08-06 05:31:35 (2.46 MB/s) - MachineLearningCVE.zip saved [235102953/235102953]\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"QpaSsiJU0yqk","colab_type":"text"},"source":["# Integrity check\n","\n","Download the `MachineLearningCSV.md5` file to check the integrity of the downloaded file."]},{"cell_type":"code","metadata":{"id":"F7DMcBUz0yqk","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":187},"executionInfo":{"status":"ok","timestamp":1596691897711,"user_tz":-540,"elapsed":142560,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"3b5bfb92-3f09-4822-897e-17df12cc495d"},"source":["!wget -nc http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.md5"],"execution_count":null,"outputs":[{"output_type":"stream","text":["--2020-08-06 05:31:36-- http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.md5\n","Connecting to 205.174.165.80:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 57\n","Saving to: MachineLearningCSV.md5\n","\n","\rMachineLearningCSV. 0%[ ] 0 --.-KB/s \rMachineLearningCSV. 100%[===================>] 57 --.-KB/s in 0s \n","\n","2020-08-06 05:31:36 (7.06 MB/s) - MachineLearningCSV.md5 saved [57/57]\n","\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"akxfdrZ_0yqo","colab_type":"text"},"source":["Checking the file integrity."]},{"cell_type":"code","metadata":{"id":"8zEf1JtW0yqp","colab_type":"code","colab":{"base_uri":"https://localhost:8080/","height":34},"executionInfo":{"status":"ok","timestamp":1596691899602,"user_tz":-540,"elapsed":144447,"user":{"displayName":"Mahendra Data","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Ghn7DAlkRKEg-Y82BqktrBT0ABMFy8r5576xhbKDQ=s64","userId":"08049029618478467489"}},"outputId":"44b15cc6-9da5-4965-9593-5d598861ceeb"},"source":["!md5sum -c MachineLearningCSV.md5"],"execution_count":null,"outputs":[{"output_type":"stream","text":["MachineLearningCVE.zip: OK\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"J3NpbnnN0yqt","colab_type":"text"},"source":["If the downloaded file is intact, the output should be:\n","\n","`MachineLearningCVE.zip: OK`"]},{"cell_type":"markdown","metadata":{"id":"fueJb4vK3RYs","colab_type":"text"},"source":["# Saving the dataset\n","\n","Save the zip file to Google Drive."]},{"cell_type":"code","metadata":{"id":"-Qbe7PBF1rIP","colab_type":"code","colab":{}},"source":["!mkdir -p \"/content/drive/My Drive/CICIDS2017/\"\n","\n","!cp MachineLearningCVE.zip \"/content/drive/My Drive/CICIDS2017/\""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"vWkym95UfOYu","colab_type":"text"},"source":["The dataset is now saved in the `CICIDS2017` folder of your Google Drive."]}]}