`model_selection.split_data_train_validation_test`: Test set starts one point too early #774

When calling `model_selection.split_data_train_validation_test()` in backtest mode, the test set starts one point too early. Assume you have a DataFrame with 100 rows (timestamps 1 to 100) and `test_fraction` is 0.1.
Then `start_date_test = end_date - np.round(number_indices * test_fraction) * delta` (line 176) places the start of the test set 10 time steps before the last timestamp, i.e. at timestamp 90. Because both timestamp 90 and the last timestamp 100 are included, the test set covers 11 points instead of the expected 10. This is an off-by-one error.

A simple fix would be to change line 176 in `model_selection/model_selection.py` to
```
start_date_test = end_date - (np.round(number_indices * test_fraction) - 1) * delta
```
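To illustrate the arithmetic, here is a minimal sketch that reproduces the example outside the library. The helper `start_of_test_set` and the daily index are hypothetical, and the sketch assumes the test set is selected as all timestamps greater than or equal to `start_date_test`, which matches the reported size of 11 but is not confirmed against the library code.

```python
import numpy as np
import pandas as pd


def start_of_test_set(index, test_fraction, fixed=False):
    """Hypothetical helper mirroring the start-date arithmetic from line 176."""
    number_indices = len(index)
    delta = index[1] - index[0]  # assumes a regularly spaced index
    end_date = index[-1]
    # Cast to int so the multiplication with a Timedelta is unambiguous.
    steps = int(np.round(number_indices * test_fraction))
    if fixed:
        steps -= 1  # the proposed fix: go back one step less
    return end_date - steps * delta


# 100 daily timestamps, as in the example above.
index = pd.date_range("2021-01-01", periods=100, freq="D")

# Assumption: the test set is every timestamp >= start_date_test (inclusive).
buggy_size = int((index >= start_of_test_set(index, 0.1)).sum())
fixed_size = int((index >= start_of_test_set(index, 0.1, fixed=True)).sum())

print(buggy_size, fixed_size)
```

Under these assumptions the current formula yields 11 test points and the proposed one yields 10.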
