# Description

This notebook demonstrates how to download the MachineLearningCSV version of the CICIDS2017 dataset.

*Author*: **Mahendra Data** mahendra.data@dbms.cs.kumamoto-u.ac.jp

License: **BSD 3-Clause**

# Mounting Google Drive

We will save the downloaded dataset to Google Drive.

In [None]:
from google.colab import drive
drive.mount("/content/drive")

# Downloading the dataset

The description of the CICIDS2017 dataset is available at https://www.unb.ca/cic/datasets/ids-2017.html

There are three versions available:

1. Raw network captured data (PCAPs),
2. Generated Labelled Flows, and
3. Machine Learning CSV.

In this notebook, we will download the `MachineLearningCSV.zip` version of this dataset.

We save the download as `MachineLearningCVE.zip` instead of `MachineLearningCSV.zip` because the checksum file `MachineLearningCSV.md5` lists `MachineLearningCVE.zip` as the target filename, and `md5sum -c` looks for the file under that name.

In [None]:
!wget -nc -O MachineLearningCVE.zip http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.zip

--2020-08-06 05:29:51--  http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.zip
Connecting to 205.174.165.80:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 235102953 (224M) [application/zip]
Saving to: ‘MachineLearningCVE.zip’


2020-08-06 05:31:35 (2.46 MB/s) - ‘MachineLearningCVE.zip’ saved [235102953/235102953]
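
For environments where `wget` is not available, the download-and-rename step can be sketched in Python. This is a minimal sketch, not part of the original notebook: the helper name `download_if_missing` and the temporary `.part` suffix are assumptions made for illustration. It roughly mirrors `wget -nc -O`, skipping the download when the target file already exists.

```python
import os
import urllib.request


def download_if_missing(url, dest):
    """Download `url` to `dest` unless `dest` already exists.

    Hypothetical helper: returns True if a download happened,
    False if the existing file was kept.
    """
    if os.path.exists(dest):
        return False  # keep the existing file, like `wget -nc`

    # Download to a temporary name first so that an interrupted
    # transfer never leaves a truncated file under the final name.
    tmp = dest + ".part"
    urllib.request.urlretrieve(url, tmp)
    os.replace(tmp, dest)
    return True
```

Calling `download_if_missing("http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.zip", "MachineLearningCVE.zip")` would then save the archive under the name that the checksum file expects.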



# Integrity check

Download the `MachineLearningCSV.md5` file to check the integrity of the downloaded archive.

In [None]:
!wget -nc http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.md5

--2020-08-06 05:31:36--  http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/MachineLearningCSV.md5
Connecting to 205.174.165.80:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 57
Saving to: ‘MachineLearningCSV.md5’


2020-08-06 05:31:36 (7.06 MB/s) - ‘MachineLearningCSV.md5’ saved [57/57]



Check the integrity of the downloaded file against the checksum.

In [None]:
!md5sum -c MachineLearningCSV.md5

MachineLearningCVE.zip: OK


If the downloaded file is intact, the output should look like this:

`MachineLearningCVE.zip: OK`
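
The same check can be approximated in Python with `hashlib`. This is a minimal sketch, assuming the checksum file uses the usual `md5sum` layout of one `<hash>  <filename>` entry per line. The function names `md5_of_file` and `verify_md5_file` and the `base_dir` parameter are hypothetical, introduced only for this example.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5_file(md5_path, base_dir="."):
    """Check every entry of an md5sum-style file.

    Returns a dict mapping each listed filename to True (digest
    matches) or False (digest differs). Filenames are resolved
    relative to `base_dir`.
    """
    results = {}
    with open(md5_path, "r") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            expected, name = line.split(None, 1)
            name = name.lstrip("*")  # drop md5sum's binary-mode marker
            actual = md5_of_file(os.path.join(base_dir, name))
            results[name] = (actual == expected.lower())
    return results
```

With the files from this notebook in the working directory, `verify_md5_file("MachineLearningCSV.md5")` should report `True` for `MachineLearningCVE.zip` when the archive is intact.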

# Saving the dataset

Save the zip file to Google Drive.

In [None]:
!mkdir -p "/content/drive/My Drive/CICIDS2017/"

!cp MachineLearningCVE.zip "/content/drive/My Drive/CICIDS2017/"

The dataset is now saved in the `CICIDS2017` folder of your Google Drive.
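
The copy step can also be written in Python, which makes it easy to confirm that the file arrived complete. This is a minimal sketch, not part of the original notebook: the helper name `save_to_folder` and the size comparison are assumptions added for illustration.

```python
import os
import shutil


def save_to_folder(src, dest_dir):
    """Copy `src` into `dest_dir`, creating the folder if needed.

    Hypothetical helper: returns the path of the copied file and
    raises OSError if the copy has a different size than the source.
    """
    os.makedirs(dest_dir, exist_ok=True)  # same effect as `mkdir -p`
    dest = shutil.copy2(src, dest_dir)    # returns the new file's path
    if os.path.getsize(dest) != os.path.getsize(src):
        raise OSError("size mismatch after copying to " + dest)
    return dest
```

In this notebook the call would be `save_to_folder("MachineLearningCVE.zip", "/content/drive/My Drive/CICIDS2017/")`, assuming Google Drive is mounted as shown earlier.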