Research Staff Report Discussion Paper 148 (2018.12)

Boundary problem and data leakage: A caveat for wavelet-based forecasting

Ryo HASUMI
  Specially Appointed Fellow
Yuto KAJITA
   

2018/12/03

The application of machine learning to economics has drawn much attention in recent years. Forecasting economic data with machine learning requires feature extraction to achieve good performance. In time series forecasting, researchers often preprocess the data with the wavelet transform, and have reported that combining a neural network model with the wavelet transform improves prediction accuracy. Many papers on wavelet-based forecasting, however, do not provide sufficient information on how the time series data were processed. We show that inappropriate procedures for applying the wavelet decomposition to time series data easily lead to data leakage: the forecast draws on data that would not yet have been observed at the time of prediction, so its results appear extremely precise. We find that wavelet-based forecasting in which the time series data are processed appropriately cannot outperform even a naive prediction. Prediction performance based on wavelets is unreliable if the researcher does not specify the data processing method.
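The leakage mechanism described above can be illustrated with a minimal sketch. It is not the procedure used in the paper: it assumes a hand-written one-level Haar approximation in place of a full wavelet decomposition, a simulated random walk in place of economic data, and hypothetical helper names (`haar_approx`, `leaky_feature`, `causal_feature`). The idea is only to check whether a feature assigned to time `t` changes when observations after `t` are altered.

```python
import numpy as np


def haar_approx(x):
    """One-level Haar approximation coefficients (expects an even-length array)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] + x[1::2]) / np.sqrt(2.0)


def leaky_feature(series, t):
    """Decompose the WHOLE series first, then read the coefficient covering time t.

    For even t, that coefficient averages x[t] and x[t + 1], so it contains
    an observation from the future of t.
    """
    return haar_approx(series)[t // 2]


def causal_feature(series, t):
    """Decompose only the observations available up to and including time t (t >= 1)."""
    window = np.asarray(series, dtype=float)[: t + 1]
    if len(window) % 2 == 1:
        window = window[1:]  # drop the oldest point so the length is even
    return haar_approx(window)[-1]


# Assumption: a simulated random walk stands in for an economic time series.
rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=64))

t = 40  # forecast origin: only series[0..t] would be observed at this point
altered = series.copy()
altered[t + 1:] += 5.0  # change the future only

print("leaky feature changed: ", leaky_feature(series, t) != leaky_feature(altered, t))
print("causal feature changed:", causal_feature(series, t) != causal_feature(altered, t))
```

In this sketch only the feature computed from the whole series reacts to the altered future values, which is the sense in which transforming the full sample before splitting it can leak unobserved data into the forecast. A real decomposition with longer filters and boundary extension would need the same check applied to every coefficient used as a model input.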