ETSC is a publicly available Python library for early time-series classification. It accompanies the paper "A Framework to Evaluate Early Time-Series Classification Algorithms" by Charilaos Akasiadis, Evgenios Kladis, Petro-Foti Kamberi, Evangelos Michelioudakis, Elias Alevizos, and Alexander Artikis.
Cite as:
Akasiadis, C., Kladis, E., Kamberi, P. F., Michelioudakis, E., Alevizos, E., & Artikis, A. (2024). A Framework to Evaluate Early Time-Series Classification Algorithms. EDBT 2024: 27th International Conference on Extending Database Technology, Proceedings (pp. 623–635). ISBN 978-3-89318-095-0 on OpenProceedings.org
The aim of this work is to study and collect algorithms that conduct early time-series classification, in a user-friendly format, for researchers to use as a benchmark.
Currently, six algorithms are included in this directory. A Python CLI simplifies the execution of each algorithm. The predictions are evaluated through metrics such as earliness, accuracy, F1-score, the harmonic mean of accuracy and earliness, and computation time for both training and testing.
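The evaluation metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the library's own code: the function names are hypothetical, earliness is assumed to be the mean fraction of each series observed before a prediction is made, and the harmonic mean is assumed to combine accuracy with (1 - earliness), so that higher is better for both terms.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def earliness(prefix_lengths, full_lengths):
    """Mean fraction of each series observed before predicting (lower is earlier)."""
    ratios = [p / n for p, n in zip(prefix_lengths, full_lengths)]
    return sum(ratios) / len(ratios)


def harmonic_mean(acc, earl):
    """Harmonic mean of accuracy and (1 - earliness); assumed definition."""
    timeliness = 1.0 - earl
    if acc + timeliness == 0:
        return 0.0
    return 2 * acc * timeliness / (acc + timeliness)


# Toy values, made up for illustration only.
acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
earl = earliness([10, 20, 30, 40], [100, 100, 100, 100])
hm = harmonic_mean(acc, earl)
print(acc, earl, hm)
```

With this definition, a classifier that is accurate but waits for the whole series (earliness close to 1) scores close to 0, as does one that predicts immediately but wrongly.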
We would like to thank the creators of the UCR/UEA repository for making the datasets openly available. Special thanks to Evangelos Michelioudakis (vagmcs@iit.demokritos.gr) for the contribution to the development of this repository.
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see the GNU General Public License v3 for more details.
Python 3 is required to install the libraries listed in requirements.txt.
JVM >= 1.8 is required to run the algorithms that are implemented in Java.
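Before installing, the two prerequisites can be checked with a short script. This is only a sketch: the helper name check_prerequisites is hypothetical, and the Java check merely confirms that a java launcher is on the PATH without verifying that its version is 1.8 or newer.

```python
import shutil
import sys


def check_prerequisites():
    """Report whether Python 3 is running and a `java` launcher is on PATH."""
    return {
        "python3": sys.version_info.major == 3,
        # shutil.which only confirms that a `java` executable can be found;
        # it does not check that the JVM version is 1.8 or newer.
        "java_on_path": shutil.which("java") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```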
Using conda:

- Create the environment:
  conda create -n py37 python=3.7
- Activate it:
  conda activate py37
- Install required packages:
  pip3 install -r requirements.txt
- Locally install timeline:
  pip install --editable .

Using virtualenv:

- Install the virtualenv package:
  pip3 install virtualenv
- Create a new virtual environment:
  virtualenv venv
- Activate virtual environment:
  . venv/bin/activate
- Install required packages:
  pip3 install -r requirements.txt
- Locally install timeline:
  pip install --editable .

For downloading the data run the script download_data.sh found in the script folder. The downloaded data can be found inside folder data.
Ten datasets are available, derived from the UCR_UEA library. Multivariate datasets from the Biological and Maritime fields are also provided.
Note that only ECTS was implemented by us, using the algorithm's paper as a guide. The rest of the algorithms derive from the sources listed in the following table. All credit goes to the original creators of the algorithms' papers.
| Algorithm | Parameters |
|---|---|
| ECTS [paper] | support = 0 |
| EDSC [paper] [code] | CHE k=3, min_length=5, max_length=len(time_series)/2 |
| TEASER [paper] [code] | S=20 (for the UCR_UEA), S=10 (for the biological and maritime) |
| ECEC [paper] [code] | training_times=20, length = len(time_series)/20,a=0.8 |
| MLSTM [paper] [code] | LSTM cells = [8, 64, 128], tested_lengths = [0.4,0.5,0.6] % |
| ECONOMY-K [paper] [code] | k = [1, 2, 3], λ = 100, cost = 0.001 |
After running the virtual environment commands stated above, run ets to display a menu with all program options.
A running command is constructed as follows:
ets <program commands> <algorithm> <algorithm commands>
If you want to see the algorithm's menu run:
ets <program commands> <algorithm> --help
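When launching many runs from a script, the command structure above can be assembled programmatically. The following is a minimal sketch: build_ets_command is a hypothetical helper, and train.csv and test.csv are placeholder file names. It only builds and prints the argument list, it does not call ets.

```python
import shlex


def build_ets_command(program_args, algorithm, algorithm_args=()):
    """Assemble: ets <program commands> <algorithm> <algorithm commands>."""
    return ["ets", *program_args, algorithm, *algorithm_args]


# Placeholder file names; replace them with real dataset paths.
program_args = ["-t", "train.csv", "-e", "test.csv", "--make-cv",
                "-h", "Class", "-c", "-1", "-g", "vote"]

run_cmd = build_ets_command(program_args, "ects", ["-u", "0.0"])
help_cmd = build_ets_command(program_args, "ects", ["--help"])

# Quote each argument so the printed line can be pasted into a shell.
run_line = " ".join(shlex.quote(arg) for arg in run_cmd)
print(run_line)
```

Passing the list to subprocess.run(run_cmd, check=True) would execute it, provided ets is installed in the active environment.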
-i <file path> : Only one file is given for cross validation with a given number of folds.
-t <file-path> : The training file used. A -e command is also required.
-e <file-path> : The testing file used. A -t command is also required.
-o <file-path> : The desired output stream file. The default output stream is the console.
-s <char>: The separator of each column in the file(s).
-d & -h: Indicate the column that holds the classes in the input file(s): either the <int> index of the column for -d, or its <name> for -h.
-v <int>: For multivariate input, the number of variables; it should always be followed by -g. In multivariate input files, each time-series should occupy -v consecutive lines, one per univariate variable, all bearing the same label.
-g <method>: The method used to handle multivariate time-series. We used vote, which conducts the voting as explained in the paper, and normal, which passes the whole multivariate input to the algorithm (currently possible only with MLSTM). MLSTM also requires -g normal for univariate time-series.
--java & --cplus: Required for non-Python implementations: --java for TEASER and ECEC, --cplus for EDSC.
-c <number>: The class for which the F1-score will be calculated. If -1 is passed then the F1-score of all classes is calculated (not supported for multivariate time-series yet).
--make-cv: Takes the training and testing file, merges them and conducts cross validation.
--folds : Used when there are premade folds available.
--trunc : Use STRUT approach to find the best time-point to perform ETSC.
--pyts-csv: Use pyts format for STRUT Weasel's input, when the dataset comes in csv format.
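The multivariate layout described for -v can be illustrated with a small reader. This is a sketch under stated assumptions, not the library's loader: group_multivariate is a hypothetical helper, the file is assumed to be comma-separated with no header row, and the class label is assumed to sit in column 0 (as with -d 0).

```python
import csv
import io


def group_multivariate(rows, n_vars, class_col=0):
    """Group every `n_vars` consecutive rows into one multivariate series."""
    if len(rows) % n_vars != 0:
        raise ValueError("row count is not a multiple of the number of variables")
    series = []
    for start in range(0, len(rows), n_vars):
        block = rows[start:start + n_vars]
        # All variables of one series must carry the same class label.
        labels = {row[class_col] for row in block}
        if len(labels) != 1:
            raise ValueError(f"rows {start}..{start + n_vars - 1} carry different labels")
        values = [
            [float(x) for i, x in enumerate(row) if i != class_col]
            for row in block
        ]
        series.append((labels.pop(), values))
    return series


# Two series with two variables each; made-up values for illustration.
sample = io.StringIO(
    "a,1.0,2.0,3.0\n"
    "a,0.1,0.2,0.3\n"
    "b,4.0,5.0,6.0\n"
    "b,0.4,0.5,0.6\n"
)
rows = list(csv.reader(sample))
grouped = group_multivariate(rows, n_vars=2)
print(grouped)
```

Each entry of grouped pairs a label with one list of values per variable, which makes it easy to spot files whose row count or labels do not match the -v value.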
ects : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote ects -u 0.0
edsc : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --cplus -g vote edsccplus
ecec : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote ecec
teaser : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote teaser -s 20
mlstm : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g normal mlstm
eco-k : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote economy-k
strut - minirocket : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m minirocket -p 0 -s 2
strut - weasel : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m weasel -p 0 -s 2
strut - minirocket-fav : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m minirocket_fav -p 0 -s 2
strut - weasel-fav : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m weasel_fav -p 0 -s 2
ects : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 ects -u 0.0
edsc : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --cplus edsccplus
ecec : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --java ecec
teaser : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --java teaser -s 10
mlstm : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 -g normal mlstm
eco-k : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 economy-k
strut - minirocket : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --trunc strut -m minirocket -p 0 -s 2
strut - weasel : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --pyts-csv --trunc strut -m weasel -p 0 -s 2
strut - minirocket-fav : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --trunc strut -m minirocket_fav -p 0 -s 2
strut - weasel-fav : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --pyts-csv --trunc strut -m weasel_fav -p 0 -s 2
Any false product and misuse of the used algorithms is on the authors of the original papers. Please, inform us if you detect any misconduct or misuse of the code/datasets used in this repository.