Lesson 04 · 6 min

Harmonizing the Data

UTI, UPI, and the CDE: one global dictionary of fields, and the package spread that finally has a home.

Same fields, everywhere

The last lesson said the rewrites brought "globally harmonized critical data elements." That phrase is this lesson. The breaks you just saw had two roots, and only one was about interpretation. The other was cruder: regimes asked for different fields. If London wants the notional as a schedule and New York wants a single current number, the same trade reported to both can never line up, no matter how carefully each side reads its own rules. So before anyone could share code, the regulators had to share a vocabulary.

They built it through CPMI and IOSCO, the global standard-setters that sit above the national regulators for exactly this kind of plumbing. It has three pieces, published in 2017 and 2018 and then written into the rewrites. The UTI answers "which trade": one identifier both sides cite so reports can be paired, the one you have used all along. The UPI answers "what product": a single code standing for "USD fixed-float interest rate swap," so a product is named the same way everywhere; it has been live in reporting since 2024. The CDE, the critical data elements, covers every other field that matters, each pinned to one definition, one format, and one set of allowable values.

[Figure: one vocabulary the regimes agreed on, before sharing any code. UTI (which trade), UPI (what product), and CDE (every other field) form one dictionary with one definition each, set by CPMI-IOSCO and adopted by every rewrite: CFTC, EU / UK EMIR, ASIC, MAS, … Harmonized data; the shared logic to derive it comes next.]
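To make "one identifier both sides cite so reports can be paired" concrete, here is a minimal Python sketch of pairing by UTI. The report layout, the helper name `pair_by_uti`, and the sample identifiers are all invented for illustration; a real repository's matching rules are more involved.

```python
from collections import defaultdict


def pair_by_uti(reports):
    """Group reports by UTI and split them into paired and unpaired sets."""
    by_uti = defaultdict(list)
    for report in reports:
        by_uti[report["uti"]].append(report)
    # A trade is "paired" here when exactly two reports cite the same UTI.
    paired = {uti: rs for uti, rs in by_uti.items() if len(rs) == 2}
    unpaired = {uti: rs for uti, rs in by_uti.items() if len(rs) != 2}
    return paired, unpaired


# Hypothetical submissions; UTIs, UPIs, and field names are placeholders.
reports = [
    {"uti": "UTI-EXAMPLE-0001", "reporter": "dealer", "upi": "UPI-EXAMPLE-A"},
    {"uti": "UTI-EXAMPLE-0001", "reporter": "client", "upi": "UPI-EXAMPLE-A"},
    {"uti": "UTI-EXAMPLE-0002", "reporter": "dealer", "upi": "UPI-EXAMPLE-A"},
]

paired, unpaired = pair_by_uti(reports)
print(sorted(paired))    # UTIs cited by both sides
print(sorted(unpaired))  # UTIs with a missing or extra report
```

Because both sides cite the same key, pairing reduces to a simple group-by. Without a shared UTI, the same join would have to guess from dates, counterparties, and notionals.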

What harmonized buys: the package, settled

Take the earlier curve trade, the two swaps Pershing Square executed with Goldman at a single 12.5 basis point spread. The problem there was not that anyone misread the rule. It was that the 12.5 belonged to the package, every field on the report belonged to a single leg, and no agreed field held a package-level number. The harmonized dictionary closes that gap directly, and the way it does it is worth seeing in full.

The same package, harmonized

    {
      "packageIdentifier": "GS-PKG-2026-0415-07",
      "leg1": {
        "uti": "…K528CRV1",
        "fixedRate": 0.0341,
        "price": "not applicable"
      },
      "leg2": {
        "uti": "…K528CRV2",
        "fixedRate": 0.03535,
        "price": "not applicable"
      },
      "packageTransactionSpread": 12.5,
      "packageSpreadNotation": "basis points"
    }

Each swap keeps its own UTI and its own fixed rate. The package identifier links them, and the 12.5 sits in a field built to hold it. Nothing has to be folded, split, or dropped.

Two things changed. First, the package became addressable: a Package identifier links the two swaps' UTIs, so a reader can see they are one trade in two parts, and the 12.5 moved into Package transaction spread, a field that exists precisely to hold the level a package traded at. Second, the leg reports stopped pretending to carry a price. A vanilla interest rate swap has no single price; its economics are a rate, the fixed rate on each leg plus any floating spread. So the harmonized rules mark the generic Price element not applicable for the product and let the rate fields do the work. "Report the price" was never a question an interest rate swap could answer, and the dictionary makes that official.
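Once the spread has its own field, a reader can check it against the legs. The sketch below does that in Python for the sample package. It assumes the spread is quoted as leg 2's fixed rate minus leg 1's, which is what the sample figures imply but which the dictionary itself does not dictate. The helper names are hypothetical, and the elided UTIs are kept exactly as shown.

```python
from decimal import Decimal

# The sample package; rates are held as strings so Decimal arithmetic is exact.
package = {
    "packageIdentifier": "GS-PKG-2026-0415-07",
    "leg1": {"uti": "…K528CRV1", "fixedRate": "0.0341"},
    "leg2": {"uti": "…K528CRV2", "fixedRate": "0.03535"},
    "packageTransactionSpread": "12.5",
    "packageSpreadNotation": "basis points",
}

BASIS_POINTS_PER_UNIT = Decimal("10000")


def implied_spread_bp(pkg):
    """Leg 2 fixed rate minus leg 1 fixed rate, in basis points (assumed convention)."""
    diff = Decimal(pkg["leg2"]["fixedRate"]) - Decimal(pkg["leg1"]["fixedRate"])
    return diff * BASIS_POINTS_PER_UNIT


def spread_is_consistent(pkg):
    """True when the reported package spread equals the leg-implied spread."""
    return implied_spread_bp(pkg) == Decimal(pkg["packageTransactionSpread"])


print(implied_spread_bp(package))
print(spread_is_consistent(package))
```

Decimal is used instead of float because 0.03535 - 0.0341 is not exactly 0.00125 in binary floating point, and an equality check on a reported field should not fail on rounding noise.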

Key ideas
  • Three standards, one vocabulary

    UTI for which trade, UPI for what product, CDE for every other critical field. Set above the national regulators, by CPMI-IOSCO, and adopted by every rewrite. It is why a 2024 EU report and a 2024 US report finally describe the same trade with the same words.

  • Each field pinned three ways

    A CDE element fixes what a field means, how it is written (its format and notation), and which values are legal. "Notional" stops being whatever each firm assumed, and a repository can validate every submission against one definition.

  • The package gets its own fields

    A Package identifier links the components; Package transaction price or Package transaction spread carries the package-level economics. That 12.5 finally has a home, and no leg has to absorb a number that was never its own.

  • Harmonized data, not yet harmonized logic

    A shared dictionary says what to report and what each field means, not how to derive it from a given trade. Two firms can read the same CDE definition and still compute it differently. Closing that last gap is the next lesson: the DRR.
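The "pinned three ways" idea from the list above can be sketched as a small validator in Python. `ElementSpec` is a hypothetical stand-in for a dictionary entry, and the patterns, definitions, and allowed values below are illustrative, not quoted from the CDE guidance.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ElementSpec:
    """Hypothetical dictionary entry: meaning, format, and legal values."""
    name: str
    definition: str                            # what the field means
    pattern: str                               # how it is written (a regex)
    allowed: Optional[FrozenSet[str]] = None   # legal values; None = any match

    def validate(self, value: str) -> bool:
        # The format check comes first, then the allowable-values check.
        if re.fullmatch(self.pattern, value) is None:
            return False
        return self.allowed is None or value in self.allowed


# Illustrative specs only; the real formats and value lists may differ.
NOTIONAL = ElementSpec(
    name="Notional amount",
    definition="Amount the payments are calculated on (paraphrased).",
    pattern=r"\d{1,20}(\.\d{1,5})?",
)
SPREAD_NOTATION = ElementSpec(
    name="Package transaction spread notation",
    definition="How the package spread is expressed (paraphrased).",
    pattern=r"[a-z ]+",
    allowed=frozenset({"basis points", "decimal", "monetary amount"}),
)

samples = [
    (NOTIONAL, "1000000.00"),
    (NOTIONAL, "1,000,000"),
    (SPREAD_NOTATION, "basis points"),
    (SPREAD_NOTATION, "bps"),
]
for spec, value in samples:
    print(spec.name, repr(value), spec.validate(value))
```

A check like this can say whether a value is well formed. It cannot say whether the firm derived the right value from the trade.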

Where price hides, and where it does not

The Price element is not useless; it is conditional. For a cash equity or a bond, price is the number that moves, and reporting it is the point. For an FX forward it is the agreed rate. For a vanilla interest rate swap there is no such single number, so the harmonized rules carry the meaning in the fixed-rate and spread fields and mark the generic Price not applicable. Part of the CDE's job is exactly this bookkeeping: saying, per product, which fields are the real ones, so that "the price" is never left to a local guess.
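The per-product bookkeeping described above can be pictured as an applicability table. The Python sketch below is an assumption-laden illustration: the product labels, the field names `exchangeRate` and `fixedRate`, and the helper `check_price_fields` are invented here. It also simplifies the swap case to the fixed rate alone, leaving out any floating spread.

```python
# Hypothetical applicability table: which field carries "the price" per product.
PRICE_CARRIER = {
    "cash equity": "price",
    "bond": "price",
    "fx forward": "exchangeRate",
    "vanilla interest rate swap": "fixedRate",
}

NOT_APPLICABLE = "not applicable"


def check_price_fields(product, report):
    """Return a list of problems with how a report carries its price."""
    carrier = PRICE_CARRIER[product]
    errors = []
    # The field that really carries the economics must be populated.
    if report.get(carrier) in (None, NOT_APPLICABLE):
        errors.append(f"{carrier} is required for {product}")
    # Where another field carries the economics, generic Price must stay empty.
    if carrier != "price" and report.get("price", NOT_APPLICABLE) != NOT_APPLICABLE:
        errors.append(f"price must be '{NOT_APPLICABLE}' for {product}")
    return errors


swap_leg = {"fixedRate": 0.0341, "price": "not applicable"}
print(check_price_fields("vanilla interest rate swap", swap_leg))
print(check_price_fields("vanilla interest rate swap", {"price": 101.25}))
print(check_price_fields("bond", {"price": 99.5}))
```

Keeping the rule in a table rather than in each firm's head is the point: the answer to "where does price live for this product?" is looked up, not guessed.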

Try it: Open TradeIdentifierTypeEnum in the explorer. The CDM carries the UTI as a first-class identifier type, the same UTI the dictionary uses to pair the two sides of a trade. Harmonized identifiers in, joinable reports out, which is the half of the problem a shared dictionary can solve on its own.
◆ Checkpoint

01. What do the UTI, UPI, and CDE standardize, respectively?
