convex-testing-interface
Safe Haskell: Safe-Inferred
Language: Haskell2010

Convex.ThreatModel.DatumBloat

Description

Threat model for detecting Datum Bloat Attack vulnerabilities.

A Datum Bloat Attack exploits validators that don't limit the size of data fields within their datums. Unlike the Large Data Attack (which adds extra constructor fields), this attack inflates existing fields - specifically lists and byte strings within the datum structure.

Consequences

  1. Increased execution costs: Processing bloated datums wastes CPU/memory execution units, making transactions more expensive.
  2. Permanent fund locking: If a list or bytestring field is bloated sufficiently:
  • Deserializing the datum may exceed execution unit limits
  • The transaction required to spend the UTxO may exceed protocol size limits

In these cases, the UTxO becomes permanently unspendable and funds are locked forever with no possibility of recovery.

Vulnerable Patterns

Pattern 1: Unbounded list fields

type Datum {
  owner: VerificationKeyHash,
  messages: List<ByteArray>  -- No list length limit!
}

An attacker can append arbitrarily many items to the messages list, bloating the datum beyond transaction limits. Caught by datumListBloatAttack.

Pattern 2: Unbounded ByteString fields

type Datum {
  owner: VerificationKeyHash,
  messages: List<ByteArray>  -- No ByteArray SIZE limit!
}

An attacker can replace small ByteArrays with huge ones (e.g., "Hello" -> 100KB of filler bytes). Caught by datumByteBloatAttack.

Mitigation

A secure validator should apply at least one of the following:

  • Enforce maximum field sizes in the validator logic
  • Check list lengths explicitly (e.g., length messages <= maxMessages)
  • Limit ByteArray sizes (e.g., lengthOfByteString msg <= maxMsgSize)
  • Hash large data instead of storing it inline
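
The size checks listed above can be sketched in plain Haskell. This is an illustrative off-chain model, not on-chain validator code: TipJarDatum, maxMessages, maxMsgSize and datumWithinLimits are hypothetical names, and the limit values are arbitrary placeholders.

```haskell
import qualified Data.ByteString as BS

-- Simplified stand-in for the tipjar-style datum used in the examples
-- in this module (hypothetical, not a type exported by the library).
data TipJarDatum = TipJarDatum
  { owner    :: BS.ByteString
  , messages :: [BS.ByteString]
  }

-- Hypothetical limits; real values depend on the application.
maxMessages :: Int
maxMessages = 50

maxMsgSize :: Int
maxMsgSize = 256

-- True only if both the list length and the size of every item are bounded.
datumWithinLimits :: TipJarDatum -> Bool
datumWithinLimits d =
  length (messages d) <= maxMessages
    && all (\m -> BS.length m <= maxMsgSize) (messages d)
```

A validator that applies a check of this shape to the output datum would be expected to reject both the list bloat and the byte bloat variants, since each one violates one of the two bounds.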

This threat model tests if a script output with an inline datum still validates when list fields are bloated with additional large items, or when byte string fields are replaced with much larger ones.

List bloating attacks

datumListBloatAttack :: ThreatModel ()

Check for Datum Bloat vulnerabilities with default parameters.

Appends 5 items of 100 bytes each to every list found in the datum. If the transaction still validates, the script doesn't limit datum field sizes.

datumListBloatAttackWith :: Int -> Int -> ThreatModel ()

Check for Datum Bloat vulnerabilities with configurable parameters.

For a transaction with script outputs containing inline datums:

  • Recursively find all ScriptDataList fields in the datum
  • Append numItems large ScriptDataBytes items to each list
  • Each appended item is itemSize bytes of 0x42 ('B')
  • If the transaction still validates, the script doesn't enforce field size limits - it only checks the fields it expects.

This catches vulnerabilities where validators have unbounded list fields (like a list of messages or a list of signatures) that can be exploited to bloat the datum beyond spendable limits.

datumListBloatAttackWith 5 100  -- Add 5 items of 100 bytes each
datumListBloatAttackWith 10 500 -- More aggressive: 10 items of 500 bytes

bloatLists :: Int -> Int -> ScriptData -> ScriptData

Recursively bloat all list fields in a ScriptData value.

For ScriptDataList items, appends numItems copies of ScriptDataBytes (BS.replicate itemSize 0x42) to the list.

Recursively processes ScriptDataConstructor fields and nested lists.

For other ScriptData variants (Map, Number, Bytes), returns the value unchanged.
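
The traversal described above can be sketched as follows. This is a minimal sketch, not the library's source: the ScriptData type here is a local stand-in that mirrors the constructors named in this documentation, bloatListsSketch is a hypothetical name, and recursing into existing list items before appending is an assumption.

```haskell
import qualified Data.ByteString as BS

-- Local stand-in mirroring the constructors named in this documentation;
-- the real type is provided by cardano-api.
data ScriptData
  = ScriptDataConstructor Integer [ScriptData]
  | ScriptDataMap [(ScriptData, ScriptData)]
  | ScriptDataList [ScriptData]
  | ScriptDataNumber Integer
  | ScriptDataBytes BS.ByteString
  deriving (Eq, Show)

-- Append numItems filler items of itemSize bytes to every list that is
-- reachable through constructor fields and nested lists.
bloatListsSketch :: Int -> Int -> ScriptData -> ScriptData
bloatListsSketch numItems itemSize = go
  where
    filler = ScriptDataBytes (BS.replicate itemSize 0x42)
    go (ScriptDataList xs) =
      ScriptDataList (map go xs ++ replicate numItems filler)
    go (ScriptDataConstructor tag fields) =
      ScriptDataConstructor tag (map go fields)
    -- Map, Number and Bytes are returned unchanged.
    go other = other
```

With the default parameters (5 items of 100 bytes), each list in the datum grows by 500 bytes of payload, before any serialization overhead.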

ByteString inflation attacks

datumByteBloatAttack :: ThreatModel ()

Test if ByteString fields in the datum can be inflated.

This catches validators that don't limit the size of individual ByteString fields (e.g., messages, names, arbitrary data).

The attack replaces every ScriptDataBytes field found at any depth (except the first field of the top-level constructor, typically an owner hash) with a much larger ByteString.

For a tipjar datum Con0(owner_hash, ["Hello"]):

  • owner_hash is preserved (first field must match for validation)
  • "Hello" inside the list gets inflated to 10KB of 0x42
  • Result: Con0(owner_hash, [bytes])
  • The validator checks: list.push([], bytes) == [bytes] → True!

This enables a DoS attack where an attacker can:

  1. Create a valid transaction with a small message
  2. Intercept/frontrun and replace the message with a huge ByteArray
  3. The bloated datum may exceed transaction limits for future spending

Default inflation size is 10,000 bytes (10KB).

datumByteBloatAttackWith :: Int -> ThreatModel ()

Check for ByteString inflation vulnerabilities with configurable size.

This attack is specifically designed to catch validators like tipjar that:

  1. Allow adding items to a list
  2. Check that list.push(old_items, new_item) == new_items
  3. But DON'T limit the SIZE of new_item

The attack inflates only the FIRST item in lists (typically the newly-added item), leaving existing items unchanged so the structural check passes.

datumByteBloatAttackWith 10000   -- Inflate first list item to 10KB
datumByteBloatAttackWith 50000   -- More aggressive: 50KB

inflateBytes :: Int -> ScriptData -> ScriptData

Replace all ScriptDataBytes with inflated versions.

Preserves the first field of the top-level constructor (typically an owner/address hash that must match exactly for validation).

Inflates all other ScriptDataBytes found at any depth with a ByteString of the given size filled with 0x42 ('B').

For the tipjar use case, this inflates EVERY message in the list, which changes the structure too much. For validators that do structural checks like list.push(old_msgs, new_msg) == new_msgs, this will fail.

Use inflateFirstListItem for a more targeted attack that only inflates the first (newest) message in a list.
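
The behaviour described for inflateBytes can be sketched as below. This is an illustration under stated assumptions, not the library's implementation: ScriptData is a local stand-in for the cardano-api type, inflateBytesSketch and inflateAll are hypothetical names, and leaving map entries untouched is a simplification made for this sketch.

```haskell
import qualified Data.ByteString as BS

-- Local stand-in mirroring the constructors named in this documentation.
data ScriptData
  = ScriptDataConstructor Integer [ScriptData]
  | ScriptDataMap [(ScriptData, ScriptData)]
  | ScriptDataList [ScriptData]
  | ScriptDataNumber Integer
  | ScriptDataBytes BS.ByteString
  deriving (Eq, Show)

-- Keep the first field of the top-level constructor (typically the owner
-- hash) and inflate every other byte string.
inflateBytesSketch :: Int -> ScriptData -> ScriptData
inflateBytesSketch size (ScriptDataConstructor tag (firstField : rest)) =
  ScriptDataConstructor tag (firstField : map (inflateAll size) rest)
inflateBytesSketch size other = inflateAll size other

-- Replace every ScriptDataBytes reachable through lists and constructor
-- fields with size bytes of 0x42.
inflateAll :: Int -> ScriptData -> ScriptData
inflateAll size = go
  where
    go (ScriptDataBytes _) = ScriptDataBytes (BS.replicate size 0x42)
    go (ScriptDataList xs) = ScriptDataList (map go xs)
    go (ScriptDataConstructor tag fields) =
      ScriptDataConstructor tag (map go fields)
    -- Assumption for this sketch: maps and numbers are left as they are.
    go other = other
```

Because every message in the list is rewritten, a validator that compares the old messages against the input datum would be expected to reject the modified output, which is the limitation noted above.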

inflateFirstListItem :: Int -> ScriptData -> ScriptData

Inflate only the FIRST ScriptDataBytes found in lists.

This is a more targeted attack for validators like tipjar that check: list.push(input_messages, new_msg) == output_messages

The validator only cares that the NEW message (head of the list) was correctly prepended. It doesn't check the SIZE of that message.

For a tipjar datum Con0(owner_hash, ["New", "Old1", "Old2"]):

  • owner_hash is preserved
  • "New" (first/newest message) gets inflated to 10KB
  • "Old1", "Old2" are left unchanged (must match input)
  • Result: Con0(owner_hash, [10KB, "Old1", "Old2"])

The validator check:

  • Input: ["Old1", "Old2"]
  • list.push(["Old1", "Old2"], 10KB) = [10KB, "Old1", "Old2"]
  • This equals the output! Vulnerability exploited.
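
The walkthrough above can be modelled in a few lines of Haskell. This is a sketch under assumptions, not the library's code: ScriptData is a local stand-in for the cardano-api type, inflateFirstListItemSketch and structuralCheck are hypothetical names, and "first item" is interpreted here as the head of each list when that head is a byte string.

```haskell
import qualified Data.ByteString as BS

-- Local stand-in mirroring the constructors named in this documentation.
data ScriptData
  = ScriptDataConstructor Integer [ScriptData]
  | ScriptDataMap [(ScriptData, ScriptData)]
  | ScriptDataList [ScriptData]
  | ScriptDataNumber Integer
  | ScriptDataBytes BS.ByteString
  deriving (Eq, Show)

-- Inflate only the head of each list found in constructor fields.
-- Byte strings outside lists (such as the owner hash) are not touched.
inflateFirstListItemSketch :: Int -> ScriptData -> ScriptData
inflateFirstListItemSketch size = go
  where
    go (ScriptDataConstructor tag fields) =
      ScriptDataConstructor tag (map go fields)
    go (ScriptDataList (ScriptDataBytes _ : rest)) =
      ScriptDataList (ScriptDataBytes (BS.replicate size 0x42) : rest)
    go other = other

-- Model of the tipjar-style structural check: the output list must be the
-- input list with the new message prepended. Sizes are never inspected.
structuralCheck :: [ScriptData] -> ScriptData -> [ScriptData] -> Bool
structuralCheck inputMsgs newMsg outputMsgs =
  (newMsg : inputMsgs) == outputMsgs
```

In this model the check succeeds for any newMsg, whatever its length, as long as the older messages are unchanged, which is why inflating only the head keeps the transaction valid.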