This article assumes that you are familiar with the EUTxO model and with Cardano transactions at an abstract level (e.g. you can understand the information shown in a blockchain explorer such as CardanoScan).

Introduction

Dapp development on Cardano is still at an early stage: smart contracts were only introduced with the Alonzo era, in September 2021. Currently, there is no standard approach that covers the complete end-to-end flow of applications, and most relevant projects have had to develop in-house solutions for several aspects of dapp development. At the core of any dapp is the construction of well-formed Alonzo transactions, that is, transactions that are able to deal with Plutus scripts and their execution. There are a number of libraries and frameworks that can be used for this, each with its pros and cons. Among these we can find the PAB (Haskell), the CLI (shell), the CTL (PureScript) and the CSL (Rust, TypeScript).

These libraries greatly simplify the development process. However, it is very likely that problems start as soon as testing starts, and that the submission of freshly built transactions to a testnet fails with pretty esoteric error messages. In Cardano, apparently well-formed transactions can be rejected for very subtle reasons. This is particularly true from the Alonzo era onwards, where transactions get more complicated with new Plutus-related components such as scripts, redeemers and data. To be able to debug these errors, it is important to understand that, at the end of the day, Cardano transactions are just a raw sequence of bytes, and that there is a very detailed and precise description of their format: the Cardano Ledger specification. Any transaction not meeting the specification will be rejected.

In this article we give an introduction to Cardano transactions at a low level, following the Cardano specification and with a focus on the Plutus aspects introduced in the Alonzo era. We also take a practical approach, providing an example and highlighting typical aspects that may be a source of errors.

CBORs

The first thing to know about the specification is that transactions are defined and serialized using the CBOR format, first defined in RFC 7049 and later updated in RFC 8949. CBOR stands for Concise Binary Object Representation, a data format that can be seen as a “binary JSON”. This binary representation allows for more compact messages at the cost of human readability. Fortunately, CBOR messages can be easily encoded and decoded using any existing implementation of the CBOR protocol, available in a variety of languages, or online using the CBOR playground. CBOR is all around Cardano: transactions themselves are encoded in this format, and so are the Plutus data inside them and even complete blocks of transactions.
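To get a feeling for how compact and regular the format is, here is a minimal sketch of a CBOR decoder in Python. It is hand-rolled for illustration only (it is not what any Cardano library uses) and it deliberately ignores indefinite-length items and floats. Representing tags as ("tag", number, item) tuples is a choice made for this sketch. For real work, an existing CBOR implementation is the better option.

```python
def decode(data: bytes, pos: int = 0):
    """Decode one CBOR item starting at `pos`; return (value, next_pos)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F  # 3-bit major type, 5-bit additional info
    pos += 1

    # The "argument" is either the additional info itself or the next 1/2/4/8 bytes.
    if info < 24:
        arg = info
    elif info <= 27:
        size = 1 << (info - 24)
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("indefinite-length items are not handled in this sketch")

    if major == 0:  # unsigned integer
        return arg, pos
    if major == 1:  # negative integer
        return -1 - arg, pos
    if major == 2:  # byte string
        return data[pos:pos + arg], pos + arg
    if major == 3:  # text string
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array
        items = []
        for _ in range(arg):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map
        entries = {}
        for _ in range(arg):
            key, pos = decode(data, pos)
            value, pos = decode(data, pos)
            entries[key] = value
        return entries, pos
    if major == 6:  # tag: keep the tag number next to the tagged item
        inner, pos = decode(data, pos)
        return ("tag", arg, inner), pos
    # major == 7: only false, true and null are handled here
    simple = {20: False, 21: True, 22: None}
    if info in simple:
        return simple[info], pos
    raise ValueError("unsupported simple value or float")


# A map with a single entry, key 2 and value 473726
fee_only, _ = decode(bytes.fromhex("a1021a00073a7e"))
print(fee_only)

# The integer 48 inside a one-element array, wrapped in tag 121
tagged, _ = decode(bytes.fromhex("d879811830"))
print(tagged)
```

The first byte of every item carries both the type and, for small values, the value itself, which is why small integers and short arrays cost a single byte.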

The Cardano specification/s

As Cardano has passed through different eras, the specification is split into several documents, one for each era, each describing incrementally the modifications and additions that the era introduced. For each era, the specification also includes a text file that precisely defines the CBOR schema used for blocks and transactions. This file is written in CDDL (Concise Data Definition Language), a notational convention defined in RFC 8610 that is used to describe CBOR data structures.

In this post, we will concentrate on the Alonzo era specification, the one that introduced all the smart contract functionality through Plutus.

First, an example: A simple smart contract

To learn about Alonzo transactions we will take an example transaction related to a very simple smart contract designed for this purpose. The smart contract is just a single script UTxO that holds an integer in its datum. It admits two operations: increment, to add 1 to the integer, and decrement, to subtract 1 from it. The UTxO also holds in its value an NFT that we only use to identify it.

The example transaction we provide is an increment operation. It increments the integer from 47 to 48 in the datum, while preserving the NFT in the value. It can be illustrated the following way:

Here, besides the script input and output, there are also wallet input and output, that are used to pay for the fees and get back the change.

The raw transaction

Here you can download the raw transaction:

example.tx

The transaction was made in the preprod testnet, and it can be found in the Cardano explorer with hash 8ae88d7ee59eda5a7a95dd66e9cf123a89758f2ec31e73a5c65b4d9cf312f71c. However, the raw transaction is not available here or in any other blockchain explorer we know. To obtain it, we log it in the browser console just before submission. In the case of Nami, the raw transaction is also available under the “Details” section of the signing window:

Example of where to find the raw transaction in Nami.

As the transaction is in CBOR format, we can decode it using the CBOR playground to find something that looks like this:

[
  {
    0: [[h'A1D13B016FD106784482D2B2E1C85330090B3C27464D2163F752AD42730A2867', 0], 
        [h'A51F7E6EC66F1DB366F2A7AD63B8041B51B269CEBD5D52A140F6C5B7E069DFB7', 1]], 
    1: [[h'706EB7C29D9362DD710902977BB645EB97BB76C6246768851E76489F1A', 
            [2000000, {h'725BA16E744ABF2074C951C320FCC92EA0158ED7BB325B092A58245D': {h'': 1}}], 
            h'B034C17CF9EEF7E2D38FFF1EC8956C3A3C9FECE616E1CE03DF5860FEE81ADB1E'], 
       [h'002D106290A8EAEB46BDC5A5AD92401306263E77FAA00D3E7E60055C0784064433AEAE5D686016D7CA93DB82FD72D83AA67E55511B0AE9C3F8', 
            1247328024]], 
    2: 473726, 
   11: h'E1D6B63A31FD6FBE2BF9A7403160BEF41A6E03B31776BB6B140BF0C1B19494D4', 
   13: [[h'834E3714ED521A76CCAD0DFECCAB998C25EA068BFE120C2DDCF32D4555F79A6B', 7]]
  }, 
  {
    0: [[h'41BF65F9DBEACE48BDC6ABA5CC9149C16077EE2338A97292414FD0C1BB7D797E', 
         h'F2F0A179D98C04B51BC1C7A891708ADFEF9EB28818054E28FCA6CF769518E4DD148ED024208CA33F5ED996BD1A0A6AF46E1A941BDD54061EC47CA10A1D8D2C00']], 
    3: [h
    4: [121([121([47])]), 121([121([48])])], 
    5: [[0, 0, 121([]), [1302238, 360901332]]]
  }, 
  true,
  null
]

Quite a mess, right? Don’t let it scare you. Each line here has a meaning, and it is worth understanding it, or at least knowing where to find a good explanation for it. Let’s see.

Transactions in the Alonzo era

The two main references to understand Cardano transactions in the Alonzo era are the specification document and the CDDL specification. The CDDL file serves as a good starting point, and in particular, we can find in line 13 the definition of a transaction:


transaction =
  [ transaction_body
  , transaction_witness_set
  , bool
  , auxiliary_data / null
  ]

Here, we can see that transactions are comprised of four parts. The two main parts of a transaction are its body and its witness set. In this article we focus on these parts, and ignore the other two, just saying that the third one has to do with transaction validity, and the fourth one is where the metadata goes, among other things.

The transaction body

We can find the schema for the transaction body at line 50 of the CDDL specification:


transaction_body =
  { 0 : set<transaction_input>    ; inputs
  , 1 : [* transaction_output]
  , 2 : coin                      ; fee
  , ? 3 : uint                    ; time to live
  , ? 4 : [* certificate]
  , ? 5 : withdrawals
  , ? 6 : update
  , ? 7 : auxiliary_data_hash
  , ? 8 : uint                    ; validity interval start
  , ? 9 : mint
  , ? 11 : script_data_hash       ; New
  , ? 13 : set<transaction_input> ; Collateral ; new
  , ? 14 : required_signers       ; New
  , ? 15 : network_id             ; New
  }

So, the transaction body is a map from integer keys to values of different types. Some of the entries, marked as ‘?’, are optional. In the comments (starting from ‘;’), we can see some explanations and which parts were introduced for Plutus (marked as “New”).

The mandatory parts are the inputs (field 0), the outputs (field 1) and the fee (field 2). In the example, fields 11 and 13 are also present; they are always required if the transaction involves script execution.

Transaction inputs

Transaction inputs are listed in field 0 of the transaction body. A transaction input, as you probably know, is a reference to a UTxO (Unspent Transaction Output). In other words, it is the output of a previous transaction that has not yet been spent (i.e. used as input) by any other transaction. According to the CDDL, a reference to a UTxO is defined by the following pair:


transaction_input = [ transaction_id : $hash32
                    , index : uint
                    ]

where transaction_id is the hash of the transaction that generated the UTxO, and index is the index in the list of its outputs (starting from 0, obviously).

In our example, two inputs are present:


 0: [[h'A1D13B016FD106784482D2B2E1C85330090B3C27464D2163F752AD42730A2867', 0], 
      [h'A51F7E6EC66F1DB366F2A7AD63B8041B51B269CEBD5D52A140F6C5B7E069DFB7', 1]], 

So, the first input is the UTxO corresponding to the first output of transaction A1D13B…, and the second one corresponds to the second output of transaction A51F7E…. As we will see next, one of them corresponds to the smart contract, and the other one is used to pay for the transaction fees.

Inputs Information

To be able to understand and debug our transaction, it is important to know the information about the inputs, not present in the transaction itself. The information is comprised of these three components:

  • The address encoding the owner of the input: for a wallet UTxO it contains a pubkey hash, and for a script UTxO it contains the hash of the validation script.
  • The value it holds, a number of ADA and maybe some other assets.
  • An optional datum hash, in the case the UTxO encodes some data.

In our example, we can find this information by navigating the Cardano explorer:

From this information, it is clear that the first input (A1D13B…) is the “smart contract”, and the second one (A51F7E…) is used to pay for the transaction. Usually, when the off-chain code of a dapp builds a transaction, the inputs used to pay for it are introduced in a last stage called “balancing”. As a wallet may have several UTxOs, selecting which ones will be used to pay is a complex subject called “coin selection”, something that is extensively discussed in CIP 2.

The ordering of the inputs

In the CDDL specification it can be seen that the inputs are in a set, not a list. Why a set? Well, Cardano doesn’t allow us to choose how to order the inputs. The ordering we use in the serialized raw transaction is completely ignored. Instead, the specification assumes that the inputs are ordered lexicographically by the pair (transaction_id, index):

Alonzo specification, section 4.1, p. 15.

This is important, because in the redeemers we will use indexes to refer to positions in the list of inputs following this ordering criterion. We will talk about this later.
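The ordering rule can be sketched in a few lines of Python. The function names are hypothetical and the inputs are represented as plain (transaction_id, index) tuples, which is just a convenience for this sketch. Transaction ids are compared as raw bytes, and the index only breaks ties between outputs of the same transaction.

```python
def sort_inputs(inputs):
    """Order (transaction_id_hex, index) pairs lexicographically, as the ledger does."""
    return sorted(inputs, key=lambda i: (bytes.fromhex(i[0]), i[1]))


def spend_redeemer_index(inputs, target):
    """Position of `target` in the sorted inputs, i.e. the index a spending redeemer must use."""
    return sort_inputs(inputs).index(target)


# The two inputs of the example, deliberately listed in reverse order
inputs = [
    ("A51F7E6EC66F1DB366F2A7AD63B8041B51B269CEBD5D52A140F6C5B7E069DFB7", 1),
    ("A1D13B016FD106784482D2B2E1C85330090B3C27464D2163F752AD42730A2867", 0),
]

for position, (tx_id, index) in enumerate(sort_inputs(inputs)):
    print(position, tx_id[:6], index)

print(spend_redeemer_index(inputs, inputs[1]))  # position of A1D13B...#0
```

Whatever order the inputs are serialized in, A1D13B…#0 ends up at position 0 and A51F7E…#1 at position 1. Adding or removing an input during balancing can shift these positions, which is a typical way of ending up with a redeemer that points at the wrong input.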

Transaction outputs

Transaction outputs (field 1) are a bit more complex than inputs, as they are new UTxOs that are being created by the transaction. Their CDDL specification is:


transaction_output =
  [ address
  , amount : value
  , ? datum_hash : $hash32 ; New
  ]

The components are: the raw value of the address where the UTxO is paid to, the value it will contain and an optional datum hash it can also carry. The datum hash can be used to encode data in the UTxO, and was introduced in Alonzo to store “state” information for script UTxOs. While not forbidden, datum hashes are rarely used in wallet UTxOs.

In the example we have two outputs:

  • Output 0:
      • Raw address: 706EB7...
      • Value: [2000000, {h'725BA1...': {h'': 1}}]
      • Datum hash: B034C1... (encodes the integer 48, see section “The Plutus data” below)
  • Output 1:
      • Raw address: 002D10...
      • Value: 1247328024
The first one corresponds to the script address where the smart contract lives. The value is a list because it contains not only ADA but also another asset: an NFT with currency symbol 725BA1… (aka policy ID) and an empty token name. You can see how values are specified in line 379 of the CDDL. The datum hash encodes the new integer value: 48.

The second output is the “change”, the remaining ADA value that goes back to the wallet that paid for the transaction. This output is usually introduced in the balancing stage of the transaction building process.
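The raw addresses can also be inspected by hand. As a sketch, and assuming the header layout described in CIP 19 (upper four bits for the address type, lower four bits for the network id), the first byte is enough to tell a script address from a wallet address. The function name and the returned dictionary are made up for this illustration, and only the address types relevant to the example are listed.

```python
ADDRESS_KINDS = {
    0b0000: "base address: key payment part, key stake part",
    0b0110: "enterprise address: key payment part",
    0b0111: "enterprise address: script payment part",
}


def describe_address(raw: bytes) -> dict:
    """Read the header byte of a raw Shelley-style address."""
    header = raw[0]
    address_type = header >> 4   # upper 4 bits
    network_id = header & 0x0F   # lower 4 bits: 0 for testnets, 1 for mainnet
    return {
        "kind": ADDRESS_KINDS.get(address_type, "other (see CIP 19)"),
        "network": "mainnet" if network_id == 1 else "testnet",
        "payment_credential": raw[1:29].hex(),  # 28-byte key hash or script hash
    }


script_output = bytes.fromhex("706EB7C29D9362DD710902977BB645EB97BB76C6246768851E76489F1A")
change_output = bytes.fromhex(
    "002D106290A8EAEB46BDC5A5AD92401306263E77FAA00D3E7E60055C07"
    "84064433AEAE5D686016D7CA93DB82FD72D83AA67E55511B0AE9C3F8"
)

print(describe_address(script_output)["kind"])
print(describe_address(change_output)["kind"])
```

The header 0x70 of output 0 reads as a testnet script address without a staking part, and the header 0x00 of output 1 as a testnet wallet address with both payment and stake keys.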

The script data hash

Field 11 of the transaction body is the script data hash, also called ScriptIntegrityHash in the specification document. This hash encodes information that determines the results of script execution and is otherwise not present in the body. The encoding includes the redeemers and the data, both from the witnesses (see below), but also the protocol parameters that determine the costs and limits for script execution.

Computing this field is a bit complex, but all transaction libraries that support Plutus are able to do it for you. However, you must keep in mind that any modification to a transaction that may alter the script data hash requires recomputing and updating the value of this field.

Collaterals

Starting from Alonzo, transaction validation is divided into two phases, where phase 1 involves all basic checks for transaction correctness, and phase 2 is comprised of the execution of all the involved Plutus scripts. If the validation in phase 2 fails, the transaction is rejected but a penalty must be applied to cover the execution costs (and discourage failing transactions). This is the collateral, a set of inputs that is spent in this case.

Collaterals (field 13) must be wallet UTxOs, can only contain ADA and the included signatures must allow their spending. An interesting observation is that the same inputs can be used as regular inputs and as collateral, because only one of the two sets will be spent.

In our example, the collateral is a single input with 5 ADA, a standard amount for collaterals. Light wallets such as Nami and Eternl provide functionality to create a UTxO suitable to be used as collateral.
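The ledger also checks that the collateral is large enough with respect to the fee, using the collateralPercentage protocol parameter. The following sketch assumes a value of 150 for that parameter, so check the current protocol parameters before relying on it. The function name is hypothetical.

```python
def required_collateral(fee: int, collateral_percentage: int = 150) -> int:
    """Minimum total collateral in Lovelace: ceil(fee * percentage / 100)."""
    # Ceiling division with integers only, to avoid floating point rounding
    return -(-fee * collateral_percentage // 100)


fee = 473_726           # field 2 of the example body
collateral = 5_000_000  # the single 5 ADA collateral input

print(required_collateral(fee))
print(collateral >= required_collateral(fee))
```

With the assumed percentage, the example fee requires 710589 Lovelace of collateral, so the 5 ADA input is more than enough.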

Required signers

The required signers (field 14) is a set of hashed keys that can be used to require additional signatures besides those required to spend wallet UTxOs. If a key is present but the corresponding signature is not in the witness set, the transaction will fail in phase 1.

The required signers set is also made available to the Plutus script executions through the “script context”. This way, validation scripts can do indirect checks on the presence of signatures, as the script context doesn’t explicitly include them.

Our example doesn’t have required signers, but it is an important component because checking for signatures is a frequent requirement in smart contracts.

Other relevant body fields

So far we only covered fields 0, 1, 11, 13 and 14 of the transaction body. Of course, there are other important fields. We briefly describe here the ones we find interesting:

  • Field 2: The fee paid by this transaction (in Lovelace). It must be enough to cover for the costs related to transaction size and script execution units.
  • Fields 3 and 8: The “time to live” (TTL) and the “validity interval start”. Together, they form the ValidityInterval defined in the specification document (Fig. 2). It is the slot range where we expect the transaction to be executed, and phase 1 will fail if it is not the case. If phase 1 succeeds, the interval information is then converted to a POSIX time range and passed to the scripts, allowing for phase 2 checks.
  • Field 9: The minted value, all assets that are being minted or burned in the transaction. Minting can be done using pre-Alonzo simple scripts (defined in field 1 of the witness set) or using Plutus minting policies (included in field 3 of the witness set). For the latter, redeemers must be specified, and for this a lexicographical ordering of the policy IDs is assumed (see below in the section about redeemers).
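To make the fee requirement of field 2 more concrete, here is a sketch of the minimum fee formula of the Alonzo specification: a fixed part, a part proportional to the serialized size, and the price of the declared execution units. The function names are hypothetical, and the default parameter values (44 Lovelace per byte, 155381 Lovelace fixed, 0.0577 per memory unit, 0.0000721 per CPU step) are assumptions that must be replaced with the protocol parameters of the network you are targeting.

```python
from fractions import Fraction
from math import ceil


def script_fee(mem: int, steps: int,
               price_mem: Fraction = Fraction(577, 10_000),
               price_steps: Fraction = Fraction(721, 10_000_000)) -> int:
    """Fee in Lovelace for the declared execution units, rounded up."""
    return ceil(price_mem * mem + price_steps * steps)


def min_fee(tx_size: int, mem: int, steps: int,
            fee_per_byte: int = 44, fee_fixed: int = 155_381) -> int:
    """Minimum fee in Lovelace for a transaction of `tx_size` bytes."""
    return fee_fixed + fee_per_byte * tx_size + script_fee(mem, steps)


# Execution units declared in the redeemer of the example transaction
print(script_fee(1_302_238, 360_901_332))
```

Exact fractions are used instead of floats so that the rounding matches an integer computation. Under the assumed prices, the execution units of the example account for 101161 Lovelace of the fee; the rest depends on the size of the serialized transaction.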

The witness set

Here is the CDDL specification for the witness set:


transaction_witness_set =
  { ? 0: [* vkeywitness ]
  , ? 1: [* native_script ]
  , ? 2: [* bootstrap_witness ]
  , ? 3: [* plutus_script ] ;New
  , ? 4: [* plutus_data ] ;New
  , ? 5: [* redeemer ] ;New
  }

You can see that all fields are optional. However, field 0 will be present as it contains the transaction signatures and at least one signature is always required. For Plutus, the most relevant fields are the last three, so we address them in the following subsections.

The Plutus scripts

Field 3 is the list of Plutus scripts, that is, the binaries of the Plutus Core code for all the Plutus scripts that must be executed to validate the transaction, both for consuming script UTxOs and for minting Plutus assets.

Plutus scripts are without doubt the biggest part of Alonzo transactions and an important source of headaches for any engineer trying to develop a meaningful dapp without hitting the transaction size limit of 16 kB.

For instance, in our example the Plutus script takes up 4353 bytes, more than 45% of the total transaction size (9666 bytes).

Fortunately, the Babbage era, recently started with the Vasil hardfork, introduced “reference scripts” (CIP 33), a feature that provides a way to use scripts without the need for explicitly including them in transactions. We will leave this discussion for a future article.

The Plutus data

Field 4 is the Plutus data, a list that has the unhashed datums of all datum hashes present in the transaction inputs and outputs. These datums are made available to the Plutus validators through the “script context”, so checks can be made on them.

In the example, the Plutus data is:


4: [121([121([47])]), 121([121([48])])],

The first one corresponds to the datum hash of the script input, and the second one to the datum hash of the script output. In these datums, the integers are wrapped into some other data constructors, but this is just the way we chose to encode them, and has to do with the Haskell data structures we defined for the contract state.
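A datum hash is the Blake2b-256 digest of the CBOR bytes of the datum, so it can be recomputed with the Python standard library. The sketch below hashes two candidate encodings of 121([121([48])]) written by hand for this illustration: one with definite-length arrays and one with indefinite-length arrays. Both decode to the same value but hash differently, and which of them (if any) matches the hash in the output depends on the bytes the transaction builder actually emitted. When debugging, take the datum bytes from the raw transaction instead of re-encoding them.

```python
import hashlib


def datum_hash(datum_cbor: bytes) -> str:
    """Blake2b-256 digest of the serialized datum, as uppercase hex."""
    return hashlib.blake2b(datum_cbor, digest_size=32).hexdigest().upper()


# 121([121([48])]) with definite-length arrays (0x81 = array of one item)
definite = bytes.fromhex("d87981d879811830")
# The same value with indefinite-length arrays (0x9f ... 0xff)
indefinite = bytes.fromhex("d8799fd8799f1830ffff")

print(datum_hash(definite))
print(datum_hash(indefinite))
```

A mismatch between the datum bytes in the witness set and the datum hash in an input or output is a common reason for a transaction to be rejected, and an accidental change of encoding is one way to cause it.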

The redeemers

Field 5 is the list of redeemers. Each redeemer refers to the execution of a Plutus script.

The CDDL specifies that a redeemer is as follows:


redeemer = [ tag: redeemer_tag, index: uint, data: plutus_data, ex_units: ex_units ] ; New
redeemer_tag = ; New
    0 ; inputTag "Spend"
  / 1 ; mintTag  "Mint"
  / 2 ; certTag  "Cert"
  / 3 ; wdrlTag  "Reward"
ex_units = [mem: uint, steps: uint] ; New

So, a redeemer is a 4-tuple with the following components:

tag: It specifies the type of redeemer, and has four possible values. Value 0 is used for spending a script UTxO, value 1 for minting/burning with a Plutus minting policy. Don’t worry about values 2 and 3, as they have to do with staking.

index: It is an integer with a different meaning depending on the tag. For the spending tag (value 0), the index refers to the position in the inputs list after ordering it lexicographically according to the TxId and TxIdx. For the minting tag (value 1), the index refers to the position in the lexicographically ordered list of policy IDs present in the minting field.

data: This is arbitrary data that is passed as a parameter to the script. Most times, this data is what is actually called the “redeemer”, instead of the complete 4-tuple. This is the case in the Plutus Haskell code, in particular the Redeemer type, and in the Plutus documentation.

ex_units: The budget for the script execution in memory and CPU units. These numbers are used to compute the fee and must be greater than or equal to the actual units used by the script execution. Execution units are computed according to the cost model, which is part of the protocol parameters of the Cardano blockchain. There is also a limit on the total memory and CPU units that all redeemers of a transaction can use. These limits are also defined in the protocol parameters: Alonzo started with limits of 10,000 million units for CPU and 10 million units for memory. However, it was soon noticed that the memory limit was too low, and it was later raised to 16 million units. Execution units and their limits are a big issue in Cardano, as they impose important restrictions on smart contracts, and developers must pay special attention to on-chain code optimization.

In our example, we have only one redeemer for spending the script UTxO that is the first input according to the lexicographic order:


5: [[0, 0, 121([]), [1302238, 360901332]]]

We use 121([]) as the redeemer data to indicate to the script that we are trying to perform an increment operation. If it were 122([]), it would be a decrement operation. The script will validate, among other things, that the datum is updated according to the operation. The execution budget is 1302238 memory units and 360901332 CPU units, and was obtained in the balancing stage by doing an off-chain run of the validation.
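The following sketch shows how a decoded redeemer can be read. It relies on the convention used by the Plutus data encoding, where CBOR tags 121 to 127 stand for constructors 0 to 6 and tags 1280 to 1400 for constructors 7 to 127, and it uses the limits mentioned above (16 million memory units, 10,000 million CPU units) as default values; the actual limits must be taken from the current protocol parameters. Function names and the tuple representation of tagged data are choices made for this illustration.

```python
REDEEMER_TAGS = {0: "Spend", 1: "Mint", 2: "Cert", 3: "Reward"}


def constructor_index(cbor_tag: int) -> int:
    """Map the CBOR tag of a Plutus data constructor to its constructor index."""
    if 121 <= cbor_tag <= 127:
        return cbor_tag - 121
    if 1280 <= cbor_tag <= 1400:
        return cbor_tag - 1280 + 7
    raise ValueError("not a compact constructor tag")


def within_limits(redeemers, max_mem=16_000_000, max_steps=10_000_000_000) -> bool:
    """Check the total declared budget of all redeemers against the transaction limits."""
    total_mem = sum(r[3][0] for r in redeemers)
    total_steps = sum(r[3][1] for r in redeemers)
    return total_mem <= max_mem and total_steps <= max_steps


# The redeemer of the example; tagged data is written as (cbor_tag, fields)
redeemer = [0, 0, (121, []), [1_302_238, 360_901_332]]
tag, index, (data_tag, fields), ex_units = redeemer

print(REDEEMER_TAGS[tag], index)    # kind of redeemer and position in the sorted inputs
print(constructor_index(data_tag))  # constructor 0: increment in this contract
print(within_limits([redeemer]))
```

Under this convention, 121([]) is constructor 0 with no fields and 122([]) is constructor 1, which is how the two operations of the contract are told apart.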

Trivia

To end, I suggest you try to answer the following questions. Thinking about them can be an interesting way to improve your understanding of the Cardano specification.

  1. Is it possible to successfully submit a transaction with no wallet inputs?
  2. Why is it not possible to successfully submit a transaction without signatures?
  3. Is it possible to successfully submit a transaction with an empty list of inputs?
  4. Is it possible to successfully submit a transaction with an empty list of outputs?
  5. Answer True or False to the following assertions:
      a. Every transaction needs a collateral.
      b. Every transaction with script inputs needs a collateral.
      c. No transaction with no script inputs needs a collateral.

Last considerations: The Cardano specification was for us a priceless source of knowledge in our process of building and debugging our dapps. Subtle but crucial information can be found in the specification documents, such as the implicit lexicographical ordering of the inputs. CIPs also play an important role in standardization and can be helpful for dapp development.

What we covered here were the most important aspects of Alonzo transactions, but there is much more to be discovered in the specification, not only for Alonzo, but also for previous Cardano features and for recent updates introduced in the Babbage era. Interesting topics are, among others, protocol parameters, fee calculation, validity intervals, metadata, reference inputs and reference scripts.

We expect to cover some of these aspects in future articles.

Authors

Franco Luque