Financial Data Glossary

Definitions for key terms in financial data operations — from data aggregation and ETL to data lineage, DataOps, and financial services terminology.

A

Alternative Data

Non-traditional data sources used by investment managers to generate alpha — including satellite imagery, web traffic data, credit card transaction data, and social media sentiment. Alternative data requires specialized aggregation and normalization pipelines distinct from traditional financial data feeds.

API (Application Programming Interface)

A defined interface that allows two software systems to communicate and exchange data. In financial services, REST APIs are increasingly replacing FTP for data delivery — providing encrypted transport, structured authentication, real-time capability, and queryable access to specific data rather than full file delivery.

AUM (Assets Under Management)

The total market value of investments managed by a financial institution on behalf of clients. AUM is a primary size metric for investment managers, and data operations complexity generally scales with the number of accounts, custodians, and asset classes covered, which itself correlates with AUM.

B

Batch Processing

A data processing approach that accumulates records over a period — typically one business day — and processes them all at once in a scheduled run. Batch processing is the foundation of legacy financial data pipelines and is increasingly being supplemented or replaced by near-real-time or real-time processing for latency-sensitive use cases.
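The accumulate-then-process pattern can be sketched in a few lines. This is a minimal illustration, not a production scheduler; the class and field names are invented for the example.

```python
from datetime import date

class DailyBatch:
    """Sketch of a batch job: records accumulate during the business day
    and are processed together in one scheduled run."""

    def __init__(self):
        self.records = []

    def accumulate(self, record):
        # During the day, records are only collected, never processed.
        self.records.append(record)

    def run(self, business_date: date):
        # The scheduled run processes everything at once and clears the batch.
        processed = [{**r, "processed_on": business_date.isoformat()}
                     for r in self.records]
        self.records = []
        return processed

batch = DailyBatch()
batch.accumulate({"trade_id": "T1", "qty": 100})
batch.accumulate({"trade_id": "T2", "qty": -50})
results = batch.run(date(2024, 3, 29))
```

The latency cost is visible in the structure: a record received a minute after the scheduled run waits a full cycle before it is processed.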

Broker-Dealer

A financial firm that executes securities transactions on behalf of clients (broker) or for its own account (dealer). Broker-dealers generate transaction data, trade confirmations, and position reports that must be aggregated and normalized for downstream reporting and compliance systems.

C

Cloud Data Platform

A data infrastructure platform hosted and operated in cloud environments (AWS, Azure, Google Cloud) rather than on-premises servers. Cloud data platforms provide financial institutions with scalability, geographic redundancy, enterprise-grade security, and managed infrastructure that would be impractical to replicate with on-premises deployments.

Custodian

A financial institution that holds and safeguards financial assets — securities, cash, and other instruments — on behalf of institutional investors. Custodians report holdings, transactions, income, and corporate actions to their clients through data feeds that must be aggregated and normalized for portfolio management and reporting. Major institutional custodians include BNY Mellon, State Street, Northern Trust, and J.P. Morgan.

D

Data Aggregation

The process of collecting data from multiple sources, consolidating it into a unified dataset, and making it available for downstream use. In financial services, data aggregation specifically refers to collecting financial data from custodians, fund administrators, market data vendors, and other sources into a single normalized view of assets, performance, and risk.

Data Audit Trail

A complete, tamper-evident chronological record of all data operations — including source identification, receipt timestamp, transformation steps applied, validation results, manual exceptions, delivery confirmation, and access log. Audit trails are required for regulatory examinations under ERISA and SEC rules, and are a key differentiator between modern data platforms and legacy FTP infrastructure.

Data Distribution

The process of delivering processed and normalized financial data to all downstream consumers — portfolio management systems, reporting platforms, analytics tools, client portals, regulatory submission systems, and data warehouses. Distribution is the final stage of the financial data pipeline and must support multiple delivery methods (API, file, direct database) and formats simultaneously.

Data Fabric

An architectural approach to data management that creates a unified layer of data services — access, integration, governance, and security — across heterogeneous data environments. Data fabric enables consistent data management practices across on-premises and cloud environments without requiring all data to be moved to a central repository.

Data Feed

A continuous or scheduled stream of financial data delivered from a source system — typically a custodian, market data vendor, or fund administrator — to a consuming institution. Data feeds can be delivered via API (continuous or on-request), SFTP (scheduled file delivery), or event-driven mechanisms (webhooks, message queues).

Data Governance

The framework of policies, standards, processes, and accountabilities that ensure data is accurate, secure, and compliant throughout its lifecycle. For financial institutions, data governance covers data quality standards, lineage documentation, access control, retention policies, and regulatory compliance obligations imposed by ERISA, SEC rules, GDPR, and other frameworks.

Data Labeling

The process of annotating or tagging data with metadata that classifies, categorizes, or enriches it for downstream use. In financial data operations, labeling includes classifying transactions by type, tagging positions by asset class or strategy, and annotating data with institutional-specific identifiers that differ from the identifiers used by the source system.

Data Lineage

The documented history of a data point's origin and transformation — where it came from, every system it passed through, every transformation applied, and its current form and location. Field-level lineage enables financial institutions to trace any number in any report back to the original source data, satisfying regulatory requirements for data provenance documentation.
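One way to picture field-level lineage is a value that carries its origin and every transformation applied to it. The record shape below is an assumption for illustration, not a standard schema.

```python
def with_lineage(value, source, feed_file):
    """Wrap a raw value with a lineage record naming its source feed."""
    return {"value": value,
            "lineage": {"source": source, "file": feed_file, "steps": []}}

def apply_step(record, step_name, fn):
    """Apply a transformation and append its name to the lineage trail."""
    steps = record["lineage"]["steps"] + [step_name]
    return {"value": fn(record["value"]),
            "lineage": {**record["lineage"], "steps": steps}}

raw = with_lineage("1,234.50", source="custodian_a",
                   feed_file="positions_20240329.csv")
parsed = apply_step(raw, "strip_thousands_separator",
                    lambda v: v.replace(",", ""))
final = apply_step(parsed, "to_float", float)
```

Given such records, tracing a number in a report back to its source file and transformation history is a lookup rather than an investigation.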

Data Mesh

An organizational approach to data architecture in which data is treated as a product owned and managed by domain teams rather than centralized in a single data platform. In financial services, a data mesh structure might assign ownership of investment data, risk data, and operations data to the corresponding domain teams, with each team responsible for the quality and availability of its data domain.

Data Normalization

The transformation of data from multiple sources — each with different field names, formats, conventions, and standards — into a consistent master data model. Financial data normalization is complex because each custodian, administrator, and data vendor uses different representations for the same concepts, requiring field-by-field mapping specifications.

Data Pipeline

The end-to-end infrastructure — technology, configuration, processes, and monitoring — that moves data from sources through transformation and validation to downstream consumers. A financial data pipeline includes source connectivity, ingestion scheduling, format parsing, transformation, quality validation, and distribution to all required destinations.

Data Provenance

The documentation of data's origin, custody, and transformation history. Data provenance answers the question: where did this number come from, and how was it derived? In financial services, data provenance is required to satisfy regulatory examination questions about the accuracy and completeness of regulatory filings and client reports.

Data Quality

The degree to which data meets defined standards across five dimensions: accuracy (values correctly represent reality), completeness (all expected data is present), consistency (data is represented the same way across systems), timeliness (data is available when needed), and validity (values conform to defined formats and ranges). Data quality failures in financial services directly affect investment decisions, regulatory filings, and client reporting.
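Several of these dimensions can be checked programmatically per record. The sketch below covers completeness, validity, and timeliness with invented field names and rules; accuracy and consistency require a reference source or a second system to compare against, so they are omitted here.

```python
# Hypothetical expected schema for a position record.
EXPECTED_FIELDS = {"security_id", "quantity", "market_value", "as_of"}

def quality_checks(record, as_of_expected):
    """Run per-record checks for three of the five quality dimensions."""
    checks = {
        # Completeness: all expected fields are present.
        "completeness": EXPECTED_FIELDS <= set(record),
        # Validity: values conform to expected types and ranges.
        "validity": isinstance(record.get("quantity"), (int, float))
                    and record.get("market_value", -1) >= 0,
        # Timeliness: data is for the business date we expect.
        "timeliness": record.get("as_of") == as_of_expected,
    }
    checks["passed"] = all(checks.values())
    return checks

good = quality_checks({"security_id": "X", "quantity": 10,
                       "market_value": 100.0, "as_of": "2024-03-29"},
                      "2024-03-29")
stale = quality_checks({"security_id": "X", "quantity": 10,
                        "market_value": 100.0, "as_of": "2024-03-28"},
                       "2024-03-29")
```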

Data Reconciliation

The process of comparing data from two or more sources that should agree, identifying discrepancies, and resolving them. In financial data operations, reconciliation compares custodian-reported positions against internal records, fund administrator NAVs against manager-calculated values, and transaction data across multiple systems — with discrepancies investigated and resolved before data is used in reports.
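A position reconciliation can be sketched as a comparison of two views keyed by security, flagging any security that is missing from either side or differs beyond a tolerance. The tolerance and data shapes here are illustrative.

```python
def reconcile(custodian, internal, tolerance=0.0):
    """Compare custodian-reported positions against internal records
    and return the breaks that need investigation."""
    breaks = []
    for sec_id in sorted(set(custodian) | set(internal)):
        c, i = custodian.get(sec_id), internal.get(sec_id)
        # A break: position missing on one side, or quantities disagree.
        if c is None or i is None or abs(c - i) > tolerance:
            breaks.append({"security_id": sec_id,
                           "custodian": c, "internal": i})
    return breaks

breaks = reconcile(
    custodian={"AAPL": 500, "MSFT": 200},
    internal={"AAPL": 500, "MSFT": 210, "GOOG": 50},
)
```

In practice each break would be routed to an exception workflow and resolved before the data is released to reporting.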

Data Transformation

Any processing step that changes the structure, format, or content of data as it moves through a pipeline. Financial data transformation includes format conversion, field mapping, value normalization, unit conversion, derived field calculation, aggregation, and filtering — all steps required to convert raw source data into the format required by downstream systems.

Data Vendor

An organization that provides financial data to institutional clients as a commercial service. Categories include market data vendors (Bloomberg, Refinitiv/LSEG, FactSet), benchmark providers (MSCI, S&P Dow Jones), reference data vendors (FactSet, ICE Data Services), and alternative data providers. Managing data vendor relationships — including contracts, delivery standards, and quality SLAs — is a significant operational function for financial institutions.

DataOps

The application of DevOps principles — automation, continuous monitoring, rapid iteration, and cross-functional collaboration — to data operations. DataOps treats data pipelines as managed infrastructure rather than collections of ad hoc scripts, automates quality checks, enables operations teams to modify data workflows without IT involvement, and applies continuous improvement disciplines to data quality and reliability.

E

ELT (Extract, Load, Transform)

A data integration pattern that extracts data from source systems, loads it into the destination system in raw form, and applies transformations within the destination. ELT is increasingly preferred over ETL for cloud data warehouse environments (Snowflake, BigQuery, Redshift) where compute is cheap and transformation logic benefits from the destination system's processing capabilities.

ETL (Extract, Transform, Load)

The classic data integration pattern that extracts data from source systems, transforms it into the target format, and loads it into the destination. ETL has been the standard approach for financial data integration for decades, though modern platforms increasingly use ELT patterns or streaming architectures that do not fit the traditional ETL model.
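The three stages map directly onto three functions. This is a toy sketch with an in-memory list standing in for the destination warehouse and an invented two-column feed; a real pipeline adds validation, error handling, and lineage at each stage.

```python
import csv
import io

# Hypothetical raw custodian feed, as delivered.
RAW_FEED = "SecID,Qty\n037833100,500\n594918104,200\n"

def extract(feed_text):
    """Extract: parse the raw feed into rows."""
    return list(csv.DictReader(io.StringIO(feed_text)))

def transform(rows):
    """Transform: rename fields and convert types into the target format."""
    return [{"security_id": r["SecID"], "quantity": int(r["Qty"])}
            for r in rows]

def load(rows, destination):
    """Load: write the transformed rows into the destination."""
    destination.extend(rows)
    return len(rows)

warehouse = []
loaded = load(transform(extract(RAW_FEED)), warehouse)
```

In an ELT variant the `transform` step would instead run inside the destination system, with raw rows loaded first.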

F

Family Office

A private wealth management firm serving one (single-family office) or multiple (multi-family office) ultra-high-net-worth families. Family offices typically manage complex portfolios spanning traditional and alternative assets across multiple entities and custodians, requiring sophisticated data aggregation to produce consolidated views that commercial custodian platforms cannot provide.

Financial Data Aggregator

A platform or service that connects to multiple financial data sources and consolidates the data into a unified format for downstream consumption. Institutional financial data aggregators — like FyleHub — focus on B2B data flows between financial institutions, custodians, and fund administrators, as distinct from consumer-facing aggregators (Plaid, Yodlee) that aggregate retail bank account data with consumer permission.

FTP (File Transfer Protocol)

A legacy network protocol designed in 1971 for transferring files between computers. FTP transmits data — including credentials — in unencrypted plaintext, provides no native audit trail, and lacks the monitoring and access control capabilities required by modern financial data governance standards. FTP is increasingly being replaced by SFTP, FTPS, or API-based data delivery.

FTPS (FTP Secure)

An extension of FTP that adds TLS encryption to the FTP protocol. FTPS is more secure than plain FTP but less commonly supported than SFTP, can be blocked by certain firewalls due to its use of multiple ports, and still lacks the audit trail and operational management capabilities of modern API-based platforms. FTPS is a transitional technology rather than a long-term destination.

M

Master Data Management

The discipline of creating and maintaining a single, authoritative master record for key data entities — securities, counterparties, accounts, and instruments — across an organization. In financial data operations, master data management ensures that the same security is represented consistently across all systems and that custodian-specific identifiers are mapped to internal master identifiers.

Multi-Custodian Data

Financial data aggregated from multiple custodians holding assets for a single institution or client. Multi-custodian data aggregation is one of the core challenges in institutional financial data operations — each custodian delivers data in a different format, on a different schedule, with different field definitions — requiring normalization to produce a unified view of assets.

O

Open Banking

A regulatory and industry framework that requires financial institutions to provide third-party providers access to customer financial data through APIs, with customer consent. Open banking standards (like PSD2 in Europe, CDR in Australia) are driving API-based data sharing in retail banking and are beginning to influence institutional financial data standards.

P

Pension Fund Administrator

A specialized service provider that manages the recordkeeping, reporting, and compliance functions for defined benefit or defined contribution pension plans. Pension fund administrators aggregate data from custodians, investment managers, actuarial firms, and benefit systems to produce trustee reports, regulatory filings, and member communications.

R

Real-Time Data Processing

Data processing that operates on each record as it arrives, making results available within seconds or minutes. Real-time processing enables continuous risk monitoring, intraday portfolio visibility, and immediate response to market events. Architecturally, it requires event-driven infrastructure — message queues, stream processing engines, and event-triggered distribution — distinct from batch processing architectures.
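The structural difference from batch processing is that a handler fires per record as it arrives. In the sketch below, the standard-library `queue.Queue` stands in for a message broker and the handler logic is invented for illustration.

```python
import queue

events = queue.Queue()   # stands in for a message queue / event stream
positions = {}

def on_event(event):
    """Apply each trade to the running position the moment it arrives,
    keeping intraday positions continuously current."""
    sec = event["security_id"]
    positions[sec] = positions.get(sec, 0) + event["qty"]

# Trades arriving over the course of the day.
for trade in [{"security_id": "AAPL", "qty": 100},
              {"security_id": "AAPL", "qty": -40}]:
    events.put(trade)

while not events.empty():
    on_event(events.get())
```

The position is correct after every event, not only after an end-of-day run, which is what enables continuous risk monitoring.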

REST API

Representational State Transfer Application Programming Interface — the dominant architectural style for web-based APIs. REST APIs use standard HTTP methods (GET, POST, PUT, DELETE) to enable data exchange between systems. In financial services, REST APIs are the primary technology replacing FTP for data delivery, offering TLS encryption by default, OAuth 2.0 authentication, structured error handling, and real-time or on-demand data access.
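The shape of such a request can be shown with the standard library. The endpoint, token, and query parameters below are hypothetical; the request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_positions_request(base_url, token, account, as_of):
    """Build a GET request for a hypothetical positions endpoint:
    HTTPS URL, OAuth 2.0 bearer token, query parameters selecting
    specific data rather than a full file."""
    params = urlencode({"account": account, "asOf": as_of})
    return Request(
        f"{base_url}/v1/positions?{params}",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/json"},
        method="GET",
    )

req = build_positions_request("https://api.example.com", "TOKEN",
                              "ACC-123", "2024-03-29")
```

Contrast this with FTP: the consumer asks for exactly the account and date it needs, over an encrypted channel, with authentication carried in a revocable token rather than a shared password.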

RIA (Registered Investment Advisor)

An investment advisor registered with the SEC or state regulators that provides investment advice for a fee. RIAs are subject to SEC books and records requirements that mandate retention and auditability of investment data. Many RIAs aggregate client account data from multiple custodians (Schwab, Fidelity, TD Ameritrade) to support portfolio management and client reporting.

S

SFTP (Secure File Transfer Protocol)

A network protocol that provides encrypted file transfer over SSH. Unlike plain FTP, SFTP encrypts the entire session including authentication. SFTP is the most common file delivery method for financial data between institutions and is a significantly more secure alternative to FTP — though it still lacks the audit trail capabilities, real-time delivery, and operational management features of modern API-based platforms.

W

Wealth Management Platform

A technology platform used by wealth managers and RIAs to manage client relationships, portfolios, and reporting. Wealth management platforms (Tamarac, Orion, Envestnet, Black Diamond) consume aggregated and normalized custodian data to power portfolio analytics, performance reporting, and client-facing portals. Data aggregation platforms like FyleHub are frequently used to feed normalized custodian data into these systems.

About This Glossary

Who is this glossary written for?

This glossary is written for financial services professionals who need to understand financial data terminology — including operations leaders, compliance officers, technology executives, and advisors who work with financial data but may not have a data engineering background. Technical terms are explained in plain language with financial services context.

How are terms kept current?

Terms are reviewed and updated annually to reflect changes in regulatory standards, technology, and industry practice. If you notice a term that needs updating or would like to suggest an addition, please contact the FyleHub team.

See These Concepts in Practice

Book a demo to see how FyleHub handles financial data aggregation, normalization, and distribution for institutional investors.

Purpose-built for institutional finance. Enterprise-grade security.