Data Processing
Split functions and generator functions for data transformation
Last updated: February 8, 2026
Data Processing
Data processing pipeline transforms CSV data into chart datasets and reports using split functions and generators.
Pipeline Overview
CSV Data → getData() → Split Function → Generator Function → Chart/Report
Split Functions
Split functions partition data into groups for aggregation.
Time-based Splits
splitDailyDates
Groups by day (YYYY-MM-DD).
// Input timestamp: 2025-01-15
// Output key: "2025-01-15"
splitByYearMonth
Groups by month (YYYY-MM).
// Input timestamp: 2025-01-15
// Output key: "2025-01"
splitByYear
Groups by year (YYYY).
// Input timestamp: 2025-01-15
// Output key: "2025"
splitByQuarter
Groups by fiscal quarter (YYYY-Q#).
// Input timestamp: 2025-01-15 (Q1)
// Output key: "2025-Q1"
// Quarter mapping:
// Q1: Jan-Mar (months 0-2)
// Q2: Apr-Jun (months 3-5)
// Q3: Jul-Sep (months 6-8)
// Q4: Oct-Dec (months 9-11)
splitByWeek
Groups by ISO week number (YYYY-W##).
// Input timestamp: 2025-01-15
// Output key: "2025-W03"
ISO week calculation:
- Week starts Monday
- Week 1 contains first Thursday of year
- Weeks numbered 01-53
Dimension-based Splits
splitBy
Generic split by any field (timestamp|category|subcategory|value|extra).
// dataSourceKey: "extra"
// Input extra: "RRSP"
// Output key: "RRSP"
splitByCategory
Groups by category field.
// Input category: "Income"
// Output key: "Income"
splitBySubcategory
Groups by subcategory field.
// Input subcategory: "Salary"
// Output key: "Salary"
splitByValueRange
Groups by value magnitude.
// Value ranges:
// "Small (<100)": value < 100
// "Medium (100-1000)": 100 ≤ value ≤ 1000
// "Large (>1000)": value > 1000
// Uses absolute value: |value|
Generator Functions
Generator functions transform grouped data into chart datasets or reports.
Chart Generators
generateDailyDataSet
Creates line chart with daily values (last value per day).
- Filters zero-sum days
- Handles missing data (NaN for gaps)
- Preserves actual daily values
// Use case: Daily portfolio snapshots
// Each point = last recorded value for that day
generateSumDataSet
Aggregates values within each group.
- Sums all entries per period
- One dataset per subcategory
- Useful for total expenses/income
// Use case: Monthly total expenses
// Each point = sum of all expenses in month
generateSumDataSetPerTypes
Creates separate dataset per category.
- One line per category
- Sums values per period
- Enables category comparison
// Use case: Income vs Expenses comparison
// Two lines: Income total and Expense total per month
generateCumulativeSumDataSet
Running total across periods.
- Cumulative sum per subcategory
- Tracks accumulation over time
// Use case: Total saved over time
// Each point = current period + all previous periods
generateCumulativeSumDataSetPerTypes
Running total per category.
- Separate cumulative line per category
- Useful for portfolio growth tracking
generateDifference
Calculates difference between two categories.
- Requires
valuesparameter: “Category1, Category2” - Result: Category1 - Category2
- Single dataset
// values: "Income, Expenses"
// Output: Net income per period
generateCumulativeDifference
Cumulative difference over time.
- Running net total
- Tracks cash flow
generateSum
Adds two or more categories.
- Requires
valuesparameter - Result: Category1 + Category2 + …
- Single dataset
// values: "Expenses, House Expenses"
// Output: Total expenses per period
Dividend Generators
generateDividendMonthlyBySymbol
Monthly dividends split by symbol (subcategory).
- One dataset per symbol
- Shows dividend payments per stock
generateCumulativeDividendBySymbol
Cumulative dividends per symbol.
- Tracks total dividends received per stock over time
Report Generators
getLastValuePerTypeForCurrentMonth
Latest value per category for specified month.
- Defaults to current month if no
dateparameter - Useful for portfolio snapshots
- Output:
IReportData
// Returns: { label: "Portfolio", data: 150000, date: "2025-01" }
reportDifference
Category difference in report format.
- Text or table view
- Requires
valuesparameter
reportSum
Category sum in report format.
- Aggregated totals per period
- Table/text output
reportDividendAnalysis
Comprehensive dividend metrics per symbol:
- Total dividends received
- Average monthly dividend
- Payment count
- First and last payment dates
- Output:
IReportMultiData
Data Flow Example
Input CSV
Category,Subcategory,Value,TimeStamp,Extra
Income,Salary,5000,2025-01-15,Regular
Expenses,Rent,1500,2025-01-01,Monthly
Expenses,Food,300,2025-01-20,Weekly
Processing Steps
- Parse CSV (
getData())
[
{ category: "Income", subcategory: "Salary", value: 5000, timestamp: Date(2025-01-15), extra: "Regular" },
{ category: "Expenses", subcategory: "Rent", value: 1500, timestamp: Date(2025-01-01), extra: "Monthly" },
{ category: "Expenses", subcategory: "Food", value: 300, timestamp: Date(2025-01-20), extra: "Weekly" }
]
- Split by Month (
splitByYearMonth)
{
"2025-01": [
{ category: "Income", subcategory: "Salary", value: 5000, ... },
{ category: "Expenses", subcategory: "Rent", value: 1500, ... },
{ category: "Expenses", subcategory: "Food", value: 300, ... }
]
}
- Generate Sum Dataset (
generateSumDataSet)
{
labels: ["2025-01"],
datasets: [
{ label: "Salary", data: [5000], ... },
{ label: "Rent", data: [1500], ... },
{ label: "Food", data: [300], ... }
]
}
Custom Processing
Creating Custom Split Function
Add to methods.ts:
splitByCustom: {
help: "Description of split logic",
exec: (input: Array<IInput>, key: IDataSourceKeys) => {
return input.reduce((acc, current) => {
const customKey = /* your logic */;
if (!acc[customKey]) acc[customKey] = [];
acc[customKey].push(current);
return acc;
}, {});
}
}
Creating Custom Generator
Add to methods.ts:
generateCustom: {
help: "Description of generator",
exec: ({ categoriesToSelect, input, labels, categories, colors, values }) => {
// Transform input into IDataset or IReportData
return {
labels,
datasets: [/* your datasets */]
};
}
}
Performance Considerations
Split Function Performance
- O(n) complexity for all split functions
- Single-pass reduce operations
- Efficient for large datasets
Generator Function Performance
- Chart generators: O(n × m) where n = periods, m = categories
- Report generators: O(n) for most operations
- Dividend analysis: O(n × log n) for date sorting
Error Handling
Invalid data handling:
- NaN timestamps sorted to end
- Missing values default to 0
- Empty groups filtered out
- Invalid model names throw errors with helpful messages
Date Utilities
Helper functions in utils.ts:
getMonth(): Returns zero-padded month (01-12)getDate(): Returns zero-padded day (01-31)getToday(): Returns current date in ISO formatskipped(): Handles Chart.js segment gaps