January 1, 2021

#102 - Public time-series datasets

The website http://timeseriesclassification.com/ hosts a large collection of public time-series datasets covering several applications, including sensor, motion, image and audio data. This post shows how to use these datasets in Python.

Using the Python code shown on GitHub, it is possible to convert these files from the .arff format and use the datasets in Python with libraries such as pandas, scipy and sktime. In addition to data formatting and manipulation, the example also runs a KNN classifier.
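
As a rough illustration of that workflow (not the code from the GitHub repository), the sketch below reads ARFF content with scipy.io.arff.loadarff, moves it into pandas, and classifies the test series with a small nearest-neighbour routine. Everything named here is an assumption made for the example: the toy ARFF strings stand in for downloaded train and test files, load_arff and knn_predict are hypothetical helpers, and the plain Euclidean KNN written with numpy is a stand-in for whichever KNN classifier the original example uses.

```python
import io

import numpy as np
import pandas as pd
from scipy.io import arff

# Made-up ARFF content standing in for downloaded train/test files.
# Each row is one time series (3 time points) followed by its class label.
TRAIN_ARFF = """@relation toy
@attribute att1 numeric
@attribute att2 numeric
@attribute att3 numeric
@attribute target {a,b}
@data
0.0,0.1,0.2,a
0.1,0.2,0.3,a
1.0,1.1,1.2,b
1.1,1.2,1.3,b
"""

TEST_ARFF = """@relation toy
@attribute att1 numeric
@attribute att2 numeric
@attribute att3 numeric
@attribute target {a,b}
@data
0.05,0.15,0.25,a
1.05,1.15,1.25,b
"""


def load_arff(source):
    """Read ARFF data (file path or file-like object) into X and y."""
    data, _meta = arff.loadarff(source)
    df = pd.DataFrame(data)
    # Nominal attributes come back as bytes, so decode the label column.
    y = df.iloc[:, -1].str.decode("utf-8")
    X = df.iloc[:, :-1].astype(float)
    return X, y


def knn_predict(X_train, y_train, X_test, k=1):
    """Hypothetical helper: Euclidean k-nearest-neighbour majority vote."""
    train = X_train.to_numpy()
    test = X_test.to_numpy()
    labels = y_train.to_numpy()
    # Pairwise distances with shape (n_test, n_train).
    dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    predictions = []
    for row in nearest:
        values, counts = np.unique(labels[row], return_counts=True)
        predictions.append(values[np.argmax(counts)])
    return predictions


# With real files, pass the file path to load_arff instead of a StringIO.
X_train, y_train = load_arff(io.StringIO(TRAIN_ARFF))
X_test, y_test = load_arff(io.StringIO(TEST_ARFF))

y_pred = knn_predict(X_train, y_train, X_test, k=1)
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print(y_pred, accuracy)
```

The sketch assumes the class label is the last attribute of each row; if a file stores it elsewhere, the column selection in load_arff needs to change.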

Notes:

  • arff is a data format created by the University of Waikato (New Zealand) for use with its machine learning software, WEKA.
  • some of the algorithms in sktime require the data (e.g. X_train, X_test) to be a nested dataframe. The GitHub code details how to format the data this way.
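
To make the nested dataframe idea concrete, here is a minimal sketch using only pandas. In a nested dataframe each row is one instance and each cell holds a whole time series as a pandas Series, rather than one column per time point. The helper name to_nested, the column name "dim_0" and the toy values are assumptions for illustration; this is not necessarily how the GitHub code does the conversion.

```python
import pandas as pd

# Flat layout: one row per instance, one column per time point (toy values).
X_flat = pd.DataFrame(
    [
        [0.0, 0.1, 0.2, 0.3],
        [1.0, 1.1, 1.2, 1.3],
        [2.0, 2.1, 2.2, 2.3],
    ]
)


def to_nested(X, column="dim_0"):
    """Hypothetical helper: pack each row of X into a single Series cell."""
    cells = [pd.Series(row) for row in X.to_numpy()]
    return pd.DataFrame({column: pd.Series(cells, index=X.index)})


X_nested = to_nested(X_flat)

print(X_nested.shape)         # one column holding one series per row
print(X_nested.iloc[0, 0])    # the first instance's full time series
```

For a univariate dataset a single column is enough; a multivariate dataset would use one such column per dimension.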
