ray.data.aggregate.Sum

class ray.data.aggregate.Sum(on: str | None = None, ignore_nulls: bool = True, alias_name: str | None = None)

Bases: AggregateFnV2

Defines sum aggregation.

Example

import ray
from ray.data.aggregate import Sum

ds = ray.data.range(100)
# Schema: {'id': int64}
ds = ds.add_column("group_key", lambda x: x["id"] % 3)
# Schema: {'id': int64, 'group_key': int64}

# Summing the "id" column across all rows (no grouping):
result = ds.aggregate(Sum(on="id"))
# result: {'sum(id)': 4950}
Parameters:
  • on – The name of the numerical column to sum. Must be provided.

  • ignore_nulls – Whether to ignore null values during summation. If True (default), nulls are skipped. If False, the sum will be null if any value in the group is null.

  • alias_name – Optional name for the resulting column.
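
The ignore_nulls behavior described above can be illustrated in plain Python. This is a minimal sketch of the documented semantics, not Ray's implementation: sum_with_nulls is a hypothetical helper, None stands in for a null value, and returning 0 for an input with no non-null values is an assumption of the sketch rather than something stated here.

```python
from typing import Iterable, Optional


def sum_with_nulls(
    values: Iterable[Optional[float]], ignore_nulls: bool = True
) -> Optional[float]:
    """Hypothetical helper mirroring the documented ignore_nulls semantics."""
    total = 0
    for value in values:
        if value is None:
            if ignore_nulls:
                # ignore_nulls=True: skip nulls and keep summing.
                continue
            # ignore_nulls=False: any null makes the whole sum null.
            return None
        total += value
    # Assumption: an input with no non-null values yields 0 in this sketch.
    return total


skipped = sum_with_nulls([1, None, 2])                         # nulls skipped
propagated = sum_with_nulls([1, None, 2], ignore_nulls=False)  # null propagates
```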

Methods

finalize

Transforms the final accumulated state into the desired output.