Fidel-TS: A High-Fidelity Multimodal Benchmark for Time Series Forecasting
arXiv:2509.24789v3 Announce Type: replace-cross Abstract: The evaluation of time series forecasting models is hindered by a critical lack of high-quality benchmarks, leading to a potential illusion of progress. Existing datasets suffer from issues ranging from pre-training data contamination in the age of LLMs to the temporal and description leakage prevalent in early multimodal designs. To address this, we formalize the core principles of high-fidelity benchmarking, focusing on data sourcing integrity, leak-free and causally sound design, and structural clarity. […]