Unable to generate example report (timeseries data) #1773

@dhofstetter

Description

Current Behaviour

Running the script below with uv fails to produce the report:

uv run <script.py>

The error I got is:

Upgrade to ydata-sdk
Improve your data and profiling with ydata-sdk, featuring data quality scoring, redundancy detection, outlier identification, text validation, and synthetic data generation.
Register at https://ydata.ai/register
 32%|████████████████████████████████████████████████████████████▍                                                                                                                               | 9/28 [00:05<00:11,  1.61it/s]
Summarize dataset:  42%|███████████████████████████████████████████████████████████▊                                                                                 | 14/33 [00:06<00:09,  2.08it/s, Describe variable: CO AQI]
Traceback (most recent call last):
  File "/home/dhofstetter/Development/cognify/sag/salzburg-ag-netze-ngop-netzlastpunktprognose/scripts/ydata.py", line 37, in <module>
    profile.to_file("report_timeseries.html")
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/profile_report.py", line 381, in to_file
    data = self.to_html()
           ^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/profile_report.py", line 498, in to_html
    return self.html
           ^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/profile_report.py", line 294, in html
    self._html = self._render_html()
                 ^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/profile_report.py", line 411, in _render_html
    report = self.report
             ^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/profile_report.py", line 288, in report
    self._report = get_report_structure(self.config, self.description_set)
                                                     ^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/profile_report.py", line 270, in description_set
    self._description_set = describe_df(
                            ^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/describe.py", line 89, in describe
    series_description = get_series_descriptions(
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/summary.py", line 62, in get_series_descriptions
    return pandas_get_series_descriptions(config, df, summarizer, typeset, pbar)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/pandas/summary_pandas.py", line 98, in pandas_get_series_descriptions
    name, description = future.result()
                        ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/pandas/summary_pandas.py", line 81, in describe_column
    description = pandas_describe_1d(config, series, summarizer, typeset)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/pandas/summary_pandas.py", line 63, in pandas_describe_1d
    summary = summarizer.summarize(config, series, dtype=vtype)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/summarizer.py", line 50, in summarize
    return self.handle(str(dtype), config, series, {"type": str(dtype)})
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/handler.py", line 59, in handle
    summary = op(*args)[-1]
              ^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/handler.py", line 21, in composed_function
    result = func(*result) if isinstance(result, tuple) else func(result)
             ^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/summary_algorithms.py", line 72, in inner
    return fn(config, series, summary)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/summary_algorithms.py", line 89, in inner
    return fn(config, series, summary)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/pandas/describe_timeseries_pandas.py", line 220, in pandas_describe_timeseries_1d
    stats["gap_stats"] = compute_gap_stats(series)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/pandas/describe_timeseries_pandas.py", line 182, in compute_gap_stats
    gap_stats, gaps = identify_gaps(gap, is_datetime)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/ydata_profiling/model/pandas/describe_timeseries_pandas.py", line 151, in identify_gaps
    diff = gap.diff()
           ^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/pandas/core/series.py", line 3131, in diff
    result = algorithms.diff(self._values, periods)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhofstetter/.cache/uv/environments-v2/ydata-d492d7a1d6b2acca/lib/python3.12/site-packages/pandas/core/algorithms.py", line 1435, in diff
    out_arr[res_indexer] = op(arr[res_indexer], arr[lag_indexer])
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: unsupported operand type(s) for -: 'str' and 'str'
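
The last frames of the traceback show `Series.diff()` being applied to values that are strings. A minimal sketch of that failure mode in plain pandas, without ydata-profiling, is below. The sample values are made up, and the assumption that the differenced values are unparsed date strings (for example from `Date Local`) is a guess based on the traceback, not something confirmed against the library source.

```python
import pandas as pd

# Hypothetical stand-in for date values that were read from a CSV as plain text.
dates_as_text = pd.Series(
    ["2000-01-01", "2000-01-02", "2000-01-03"], dtype=object
)

# Subtracting one string from another is undefined, so diff() raises TypeError.
try:
    dates_as_text.diff()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Once the values are parsed to datetimes, diff() returns timedeltas instead.
parsed_diff = pd.to_datetime(dates_as_text).diff()
print(parsed_diff.iloc[1])
```

If this is what happens inside `identify_gaps`, the gap computation would need datetime or numeric values to work with.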

Expected Behaviour

The report file `report_timeseries.html` is generated without errors.

Data Description

The dataset used in the time-series example in your docs (`pollution_us_2000_2016.csv`, downloaded by the script below).

Code that reproduces the bug

# /// script
# requires-python = "==3.11"
# dependencies = [
#     "ydata-profiling>=4.16.1",
# ]
# ///
import pandas as pd

from ydata_profiling.utils.cache import cache_file
from ydata_profiling import ProfileReport

file_name = cache_file(
    "pollution_us_2000_2016.csv",
    "https://query.data.world/s/mz5ot3l4zrgvldncfgxu34nda45kvb",
)

df = pd.read_csv(file_name, index_col=[0])

# Filtering time-series to profile a single site
site = df[df["Site Num"] == 3003]

# Setting what variables are time series
type_schema = {
    "NO2 Mean": "timeseries",
    "NO2 1st Max Value": "timeseries",
    "NO2 1st Max Hour": "timeseries",
    "NO2 AQI": "timeseries",
}

profile = ProfileReport(
    site,
    tsmode=True,
    type_schema=type_schema,
    sortby="Date Local",
    title="Time-Series EDA for site 3003",
)
profile.to_file("report_timeseries.html")
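
A possible workaround, not verified against ydata-profiling 4.16.1, is to parse `Date Local` to datetimes before building the report, since `pd.read_csv` leaves it as text in the script above. The sketch below only shows the conversion step on a tiny made-up frame. `with_parsed_dates` is a hypothetical helper name, and whether this avoids the `TypeError` is an assumption.

```python
import pandas as pd


def with_parsed_dates(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a copy of df with `column` converted to datetime64 values."""
    out = df.copy()
    out[column] = pd.to_datetime(out[column])
    return out


# Tiny in-memory stand-in for the filtered `site` frame (values are made up).
sample = pd.DataFrame(
    {
        "Date Local": ["2000-01-02", "2000-01-01"],
        "NO2 Mean": [19.0, 22.5],
    }
)

fixed = with_parsed_dates(sample, "Date Local")
print(fixed.dtypes["Date Local"])
```

In the script above this would mean passing `with_parsed_dates(site, "Date Local")` to `ProfileReport` instead of `site`, with all other arguments unchanged.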

pandas-profiling version

4.16.1

Dependencies

Using Python 3.12.3 environment at: test
Package            Version
------------------ -----------
annotated-types    0.7.0
attrs              25.3.0
certifi            2025.8.3
charset-normalizer 3.4.3
contourpy          1.3.3
cycler             0.12.1
dacite             1.9.2
fonttools          4.59.0
htmlmin            0.1.12
idna               3.10
imagehash          4.3.1
jinja2             3.1.6
joblib             1.5.1
kiwisolver         1.4.9
llvmlite           0.44.0
markupsafe         3.0.2
matplotlib         3.10.0
multimethod        1.12
networkx           3.5
numba              0.61.0
numpy              2.1.3
packaging          25.0
pandas             2.3.1
patsy              1.0.1
phik               0.12.5
pillow             11.3.0
puremagic          1.30
pydantic           2.11.7
pydantic-core      2.33.2
pyparsing          3.2.3
python-dateutil    2.9.0.post0
pytz               2025.2
pywavelets         1.9.0
pyyaml             6.0.2
requests           2.32.4
scipy              1.15.3
seaborn            0.13.2
six                1.17.0
statsmodels        0.14.5
tqdm               4.67.1
typeguard          4.4.4
typing-extensions  4.14.1
typing-inspection  0.4.1
tzdata             2025.2
urllib3            2.5.0
visions            0.8.1
wordcloud          1.9.4
ydata-profiling    4.16.1

OS

Ubuntu

Checklist

  • There is not yet another bug report for this issue in the issue tracker
  • The problem is reproducible from this bug report. This guide can help to craft a minimal bug report.
  • The issue has not been resolved by the entries listed under Common Issues.
