How to Import Data from Kaggle to Google Colab

Gal Hever
2 min read · Dec 2, 2019

An Easy Way to Download Kaggle Data into a Google Colab Notebook

If you work on a Kaggle dataset in Google Colab, this tutorial is for you. Here I’ll show an easy and convenient way to import data from Kaggle directly into your Colab notebook.

First, let’s install the Kaggle package that will be used for importing the data.

!pip install kaggle

Next, we need to upload our Kaggle account credentials. To get them, go to your Kaggle profile and click “Create New API Token”. If you already have a token, you can click “Expire API Token” first and then create a new one.

Creating the token gives you a JSON file, kaggle.json, with your credentials. Save it on your computer, then upload it to Colab using the code below:

from google.colab import files
files.upload()
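In Colab, files.upload() returns a dictionary that maps each uploaded file name to its contents as bytes, so it is easy to check that the right file arrived before going on. The snippet below is only a minimal sketch: check_kaggle_token is a hypothetical helper name, and it assumes the token file is a JSON object with "username" and "key" fields.

```python
import json


def check_kaggle_token(raw: bytes) -> bool:
    """Return True if `raw` looks like the contents of a Kaggle token file.

    Hypothetical helper: it only checks the shape of the JSON,
    not whether Kaggle would accept the credentials.
    """
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False
    if not isinstance(data, dict):
        return False
    # Both fields must be present and be non-empty strings.
    return all(
        isinstance(data.get(field), str) and data.get(field)
        for field in ("username", "key")
    )


# In a Colab cell this could be used as:
#   uploaded = files.upload()
#   print(check_kaggle_token(uploaded["kaggle.json"]))
```

If the check fails, the most likely cause is that a different file was picked in the upload dialog.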

The Kaggle API client expects the JSON file to be in the ~/.kaggle folder, so let’s create that folder and copy the file into it. The chmod command then restricts the file so that only your user can read and write it.

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
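If you prefer to stay in Python instead of using shell commands, the same three steps can be sketched with the standard library. This is just an illustration: install_kaggle_token and its home parameter are hypothetical names, and the home parameter exists only so the function can be tried out against a scratch folder.

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(src="kaggle.json", home=None):
    """Copy a token file into <home>/.kaggle/ and restrict its permissions.

    Hypothetical helper mirroring the mkdir / cp / chmod commands.
    `home` defaults to the current user's home directory.
    """
    home = Path(home) if home is not None else Path.home()
    config_dir = home / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)  # like mkdir -p
    dest = config_dir / "kaggle.json"
    shutil.copyfile(src, dest)                     # like cp
    os.chmod(dest, 0o600)                          # like chmod 600
    return dest


# In a Colab cell, after uploading kaggle.json:
#   install_kaggle_token("kaggle.json")
```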

Next, on the page of the dataset that we want to import from Kaggle, we will click on the three vertical dots on the right side of the screen and choose “Copy API command”.

Then we will paste this command into our notebook, prefixed with ! so that Colab runs it as a shell command:

!kaggle datasets download -d rounakbanik/the-movies-dataset
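The last part of the command is the dataset reference, written as owner/dataset-name. In this tutorial the downloaded archive is named after the second half of that reference, so a small sketch like the one below can work out which zip file to look for. Note that parse_dataset_ref is a hypothetical helper, and the naming rule is an assumption based on this example rather than a documented guarantee.

```python
def parse_dataset_ref(ref: str):
    """Split an 'owner/dataset-name' reference into its parts.

    Returns (owner, dataset_name, expected_zip_name). The zip name is an
    assumption: it follows the pattern seen in this tutorial.
    """
    owner, sep, name = ref.strip().partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/dataset-name', got {ref!r}")
    return owner, name, f"{name}.zip"


owner, name, zip_name = parse_dataset_ref("rounakbanik/the-movies-dataset")
print(zip_name)  # the-movies-dataset.zip
```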

Let’s list the downloaded files:

!ls

Now, the last step is to extract the downloaded archive and load the data:

import zipfile
import pandas as pd

# Extract the downloaded archive into the "files" folder
with zipfile.ZipFile('the-movies-dataset.zip', 'r') as zip_ref:
    zip_ref.extractall('files')

# Load the extracted CSV files (Colab's working directory is /content)
movies = pd.read_csv('/content/files/movies_metadata.csv')
ratings = pd.read_csv('/content/files/ratings.csv')
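If you are not sure which CSV files an archive contains, it can help to list them after extraction before calling pd.read_csv. Here is a minimal sketch using only the standard library; extract_and_list_csvs is a hypothetical helper name, and the "files" default simply matches the folder used in this tutorial.

```python
import zipfile
from pathlib import Path


def extract_and_list_csvs(zip_path, dest="files"):
    """Extract `zip_path` into `dest` and return the CSV files found there.

    Hypothetical helper: returned paths are relative to `dest`,
    use forward slashes, and are sorted alphabetically.
    """
    dest = Path(dest)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*.csv"))


# In a Colab cell, after the download step:
#   print(extract_and_list_csvs("the-movies-dataset.zip"))
```

Each name in the returned list can then be joined with the destination folder and passed to pd.read_csv.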

That’s it! Now your data is ready and you can start working on it.

End Notes

I hope this blog post makes connecting Kaggle and Colab much easier and faster for you than before.

You can find the full code for this tutorial on GitHub.
