Tezos operations: how to build, simulate and inject

This article explains in detail how Tezos wallets originate (= deploy) smart contracts, by doing it ourselves without any wallet, in working OCaml code!

To read the code you have to be familiar with OCaml.

The whole code is available at https://gitlab.com/dailambda/tezos-origination-demo.

RPC via JSON

JSON

We can access Tezos nodes via RPC and they talk in JSON over HTTP. Don’t confuse it with JSON-RPC.

Here is a small module Json to handle JSON data. It is based on Data_encoding. The Tezos protocol provides many encodings of type 'a Data_encoding.t to convert its data to and from JSON and binary, and we use them in this example.

For simplicity, there is no error handling:

module Json : sig
  type t

  (** construction *)

  val obj : (string * t) list -> t
  val array : t list -> t
  val string : string -> t
  val z : Z.t -> t

  (** parsing *)

  val parse : string -> t

  (** accessors *)

  val get_field_exn : t -> string -> t

  (** alias of [get_field_exn] *)
  val (.%{}) : t -> string -> t

  val get_string_exn : t -> string
  val get_z_exn : t -> Z.t
  val get_array_exn : t -> t list

  (** printers *)

  val pp : Format.formatter -> t -> unit

  val to_string : t -> string

  (** construction using [Data_encoding] *)
  val construct : 'a Data_encoding.t -> 'a -> t
end

HTTP

For HTTP, we use Cohttp. It provides nice Lwt-based asynchronous I/O, but we are not interested in that here and use it synchronously. Since we only use JSON in the communication with Tezos nodes, I made an RPC module to get JSON data from the nodes and post JSON data to them:

module RPC : sig
  (** Tezos node to communicate *)
  val node : string

  val get : string -> Json.t
  val post : string -> data:Json.t -> Json.t
end

Network

Tezos has multiple networks. For this example, NEVER use Mainnet, whose tokens have real financial value. Use one of the testnets, such as ghostnet.

Tezos node

We need a Tezos node to inject the operation. Start your own testnet node or use one of the public nodes, then set RPC.node to its URL string. There are lists of public nodes such as https://tezostaquito.io/docs/rpc_nodes/.

Protocol library

To construct the operation for the contract origination, we need a Tezos protocol library. Tezos regularly upgrades its protocol and there are 18 protocols(!) as of this writing, therefore there are 18 versions of protocol libraries.

The most appropriate one is the one currently used by the node. For example, for protocol Mumbai, the best protocol library is Tezos_protocol_016_PtMumbai. For basic operations like contract origination, however, the latest development version of the protocol library, Tezos_protocol_alpha, is most likely compatible. So, let’s use its protocol code:

open Tezos_protocol_alpha.Protocol

Keys, the account, and signatures

We are going to build an operation to originate a smart contract by an account. For this, we need:

  • The private key of the account
  • Some budget in the account
  • The public key revealed on the chain

Signature module

Elliptic curve cryptography signature algorithms are defined in Tezos_crypto.Signature and we use it a lot. Let’s define an alias:

module Signature = Tezos_crypto.Signature

Secret key

In this example we use the following secret key:

let secret_key_b58check =
  "edsk34AV748RcC45ZgnUpky8qgBkus19wZA8hAL8dDyLrihsrnVbdT"

Do NOT use your own secret key which carries some Tezzies in Mainnet. Feel free to use the above key: its Mainnet balance is 0, unless someone mistakenly sends tokens to it.

The keys and hashes in Tezos are Base58Check encoded: the encoding adds a textual prefix to the data for easier identification.
For example, the Base58Check encoding of a Tezos ED25519 secret key always starts with edsk. These prefixes prevent you from revealing your secret key while thinking it is a public key [1]. The encoding also includes a checksum.
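
As a small illustration of the prefix convention, here is a sketch that classifies a Base58Check string by its prefix alone. It uses only the OCaml standard library. The type kind and the functions has_prefix and classify are names made up for this example, and edpk (the prefix of ED25519 public keys) is the only prefix that does not appear elsewhere in this article. The sketch neither decodes the string nor verifies its checksum:

```ocaml
(* Hypothetical helper: guess what a Base58Check string encodes from its
   human-readable prefix. Only ED25519-related prefixes are listed, and the
   checksum is NOT verified. *)
type kind = Secret_key | Public_key | Public_key_hash | Unknown

let has_prefix ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

let classify s =
  if has_prefix ~prefix:"edsk" s then Secret_key
  else if has_prefix ~prefix:"edpk" s then Public_key
  else if has_prefix ~prefix:"tz1" s then Public_key_hash
  else Unknown
```

A tool could call classify before printing or sending a key and refuse to continue when the result is Secret_key.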

We get secret_key from its Base58Check encoding:

let secret_key : Signature.Secret_key.t =
  Signature.Secret_key.of_b58check_exn secret_key_b58check

Public key

The public key is derivable from the secret key:

let public_key : Signature.Public_key.t =
  Signature.Secret_key.to_public_key secret_key

Public key hash aka address

The public key hash is the hash of the public key. It is also known as the account address. Its Base58Check encoding has the tz1 prefix (if it is an ED25519 key pair), which is familiar to Tezos users:

(* The address is the hash of public key *)
let public_key_hash : Signature.Public_key_hash.t =
  Signature.Public_key.hash public_key

(* tz1WfwVfEDaVVnRu1TDGd4VqKRJiLsZA9nCS *)
let () =
  Format.eprintf "public_key_hash: %a@."
    Signature.Public_key_hash.pp public_key_hash

Budget for origination

Accounts must have some balance for the smart contract origination, since it is not a free operation. On Tezos testnets there are several web services that distribute free tokens, such as https://faucet.ghostnet.teztnets.xyz/.

Revealing the public key

To perform any operation, an account must first reveal its public key (also known as the manager key in Tezos) to the blockchain. Otherwise the blockchain cannot verify the signatures of its operations with the public key. In this example, we assume the account’s public key is already revealed.

Of course, we could make a reveal operation and inject it to the node just like we do here for the origination.

Michelson smart contract

Tezos smart contracts are written in Michelson. To deploy a smart contract we need to parse the Michelson expressions for the code and the initial storage and obtain values of type Script_repr.lazy_expr.

The syntax of Michelson, together with its syntax tree, is called Micheline, and it is defined in the Tezos_micheline library. The opcodes of Michelson are protocol-specific and therefore defined in Tezos_protocol_alpha.Protocol.Michelson_v1_primitives. Combining them, we define a parsing function for Michelson expressions:

module Michelson : sig
  val parse_lazy_expr : string -> Script_repr.lazy_expr
end

We use the following code and initial storage. For the origination operation itself, you do not need to understand them:

Code
parameter unit;
storage int;
code
  {
    CDR;
    PUSH int 1;
    ADD;
    NIL operation;
    PAIR;
  };
Initial storage
1

Both strings are parsed to Script_repr.lazy_expr:

let code = Michelson.parse_lazy_expr code

let storage = Michelson.parse_lazy_expr storage

We combine the code and storage to make script : Script_repr.t:

let script : Script_repr.t = { code; storage }

Next steps

Now we have all the required information for the smart contract origination:

  • Keys: the secret key, the revealed public key, and the public key hash
  • Some budget to perform an operation
  • Smart contract code and its initial storage value

The remaining things are:

  1. Build the operation data
  2. Ask the node to simulate it and retrieve the cost estimations and the address of the originated contract
  3. Update the gas and storage limits from the estimated costs
  4. Finally, inject the operation
  5. Wait until the operation is included in a block

Build an operation

Here the goal is to make a value of type _ Operation_repr.contents_list.

We start from making the manager operation of the contract origination:

let operation : _ Operation_repr.manager_operation =
  Origination {
    delegate;
    script;
    credit;
  }
delegate
We are not interested in delegating the balance of the smart contract to a PoS delegate. Therefore, None:
let delegate : Signature.Public_key_hash.t option = None
script
This is the code and storage of the smart contract. We have defined script above.
credit
The initial balance of the smart contract. This amount of tokens will be transferred from the account which deploys the contract. This time, 0 tez:
let credit : Tez_repr.t = from_Some @@ Tez_repr.of_mutez 0L

Then we make a value of Operation_repr.contents. We need to fill 5 more fields in addition to the operation defined above:

let contents : _ Operation_repr.contents =
  Manager_operation {
    source;
    fee;
    counter;
    operation;
    gas_limit;
    storage_limit;
  }
source
Source is the public key hash of the account deploying the contract. Here it is tz1WfwVfEDaVVnRu1TDGd4VqKRJiLsZA9nCS:
(* tz1WfwVfEDaVVnRu1TDGd4VqKRJiLsZA9nCS *)
let source : Signature.Public_key_hash.t = public_key_hash
fee
The fee is the amount of tokens paid to the validator that includes the operation in a block. Operations with too low a fee can be ignored by validators. The optimal fee should be calculated from the gas and storage usage. Since we do not know them at this stage, we use an arbitrary fee:
(* 1 tez fee should be large enough for any origination *)
let fee = from_Some @@ Tez_repr.of_mutez 1_000_000L
counter
To prevent replay attacks, each account has an integer counter. The counter of an operation must be the successor of the current counter value of the source account. We perform an RPC call to obtain the current counter of the source:
let counter =
  Json.get_z_exn
  @@ RPC.get
    (Printf.sprintf
       "/chains/main/blocks/head/context/contracts/%s/counter"
       (Signature.Public_key_hash.to_b58check public_key_hash))

Then we increment it and convert the type to Manager_counter_repr.t:

let counter : Manager_counter_repr.t =
  from_Some @@ Manager_counter_repr.Internal_for_injection.of_string
  @@ Z.to_string @@ Z.succ counter
operation
We have already defined it above.
gas_limit and storage_limit
An operation fails if its gas consumption exceeds its gas limit or if its storage allocation exceeds its storage limit. At this stage we have no idea what they should be; they will be decided from the result of the origination simulation. Both limits have a per-operation upper bound, which we can get with the following RPC call:
% ./octez-client -E <NODE_URL> rpc get /chains/main/blocks/head/context/constants
{ ..,
  "hard_gas_limit_per_operation": "1040000",
  ..,
  "hard_storage_limit_per_operation": "60000",
  .. }

We can use these numbers for the simulation:

let gas_limit : Gas_limit_repr.Arith.integral =
  Gas_limit_repr.Arith.integral_of_int_exn 1_040_000
let storage_limit : Z.t = Z.of_int 60_000

Of course it is totally possible to get these values via RPC, instead of hard coding them.

Tezos operations can be combined into a group to make their execution atomic. The type Operation_repr.contents_list is for this grouping. Since we have only 1 manager operation here, we simply use the Single constructor. If we need multiple operations executed at once, Cons must be used instead:

let contents_list : _ Operation_repr.contents_list =
  Single contents
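
To give a feel for how Single and Cons fit together, here is a toy model of the grouping type. It is only a sketch: the types contents and contents_list below are simplified stand-ins written for this example (the real Operation_repr types carry much more type-level detail), and the string payloads are just labels. It groups a reveal with an origination, as one would for an account whose key is not yet revealed:

```ocaml
(* Simplified stand-ins for the protocol types; the payload is only a label. *)
type 'kind contents = Manager_operation of string

type _ contents_list =
  | Single : 'kind contents -> 'kind contents_list
  | Cons : 'kind contents * 'rest contents_list -> ('kind * 'rest) contents_list

(* Count the operations in a group. *)
let rec length : type a. a contents_list -> int = function
  | Single _ -> 1
  | Cons (_, rest) -> 1 + length rest

(* Collect the labels in the order the operations appear. *)
let rec labels : type a. a contents_list -> string list = function
  | Single (Manager_operation l) -> [ l ]
  | Cons (Manager_operation l, rest) -> l :: labels rest

(* One operation: Single. Several operations: Cons ending with Single. *)
let single = Single (Manager_operation "origination")

let batch =
  Cons (Manager_operation "reveal", Single (Manager_operation "origination"))
```

The type parameter records the shape of the group, which is why functions over it need the locally abstract type annotation (type a.).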

Simulation

A simulation RPC call checks the validity of operations and estimates their gas and storage usage. Here, we build the data for the simulation RPC and post it to get the cost estimations.

Data preparation

First, we make an Operation_repr.protocol_data. This is a pair of the contents_list and a signature. The signature can be omitted here. (Later, at injection, we must sign the final contents_list with updated gas and storage limits):

(* We need no signature for simulation *)
let protocol_data : _ Operation_repr.protocol_data =
  { contents = contents_list;
    signature = None
  }

Then we combine this protocol_data with a shell header, which is built in the Branch section below:

let packed_operation : Operation_repr.packed_operation =
  { shell= shell_header;
    protocol_data= Operation_data protocol_data
  }

Branch

When we send an operation to the Tezos blockchain, we must declare the branch to which the operation should be appended. A blockchain is not a simple chain: it has conflicting branches whenever more than 1 child block is proposed at the same block level. But do not worry, the conflicts are quickly resolved by the branch selection agreement among the validators.

The branch for the operation should NOT be the latest head, but a few blocks before it, because the latest head can be taken over by another block. Tezos now uses the Tenderbake consensus algorithm, which has deterministic finality: a block becomes final once it has 2 additional blocks on top of it.

We perform an RPC to get the block header of head~2, the grandparent of the current head:

(* Branch: we must specify the branch where the operation should be
   attached to. If we use the very [head], the operation is created
   but it is unlikely taken into a block.
*)
let header = RPC.get "/chains/main/blocks/head~2"

From this header we extract the following 3 pieces of information:

protocol
The version of the protocol used for the operation. Strictly speaking, this value can be wrong if a protocol change happens after head~2. But that is a rare event, and the operation would simply fail safely if it happened.
(* See Json module for (.%{}) notation *)
let protocol =
  let open Json in
  get_string_exn header.%{"protocol"}
chain_id
It is a value to identify running chains. Currently we have only 1 chain, so it is always the same:
let chain_id =
  let open Json in
  get_string_exn header.%{"chain_id"}
block_hash
The identifier of the branch:
let block_hash =
  let open Json in
  get_string_exn header.%{"hash"}

Now the branch block_hash is known and we can build a packed operation:

let shell_header : Tezos_base.Operation.shell_header =
  { branch = Tezos_crypto.Hashed.Block_hash.of_b58check_exn block_hash }

let packed_operation : Operation_repr.packed_operation =
  { shell= shell_header;
    protocol_data= Operation_data protocol_data
  }

Simulation RPC

There is an RPC entry /chains/main/blocks/head/helpers/scripts/run_operation to simulate an operation. Unfortunately it is not listed in the list of the RPCs of the current protocol; you can find the details in the protocol code. It takes a JSON object with 2 fields: operation and chain_id. We make the object and post it to the RPC:

let operation_result =
  let data =
    let open Json in
    (* proto_016_PtMumbai/lib_plugin/RPC.Scripts.run_operation *)
    obj [ "operation", Json.construct Operation_repr.encoding packed_operation;
          "chain_id", string chain_id
        ]
  in
  let resp =
    RPC.post
      "/chains/main/blocks/head/helpers/scripts/run_operation"
      ~data
  in
  Format.eprintf "@[<2>simulation result:@ %a@]@." Json.pp resp;
  parse_result_contents resp

It returns a large JSON object describing the simulation result. We extract the following information from it using the parse_result_contents function:

originated_contract
The KT1 address of the originated smart contract
consumed_milligas
The amount of consumed gas, in milligas (= 1/1000 gas unit)
storage_size
Newly allocated storage size in bytes
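
The article does not show parse_result_contents, so here is a self-contained sketch of what such a function could look like. Everything in it is an assumption made for illustration: the tiny json type stands in for Json.t, the numbers are plain int instead of Z.t, and the response shape (contents, then metadata, then operation_result, with the fields originated_contracts, consumed_milligas and storage_size) is the shape I assume the node returns. Check it against the simulation result printed by your own node:

```ocaml
(* A tiny JSON tree, standing in for the article's Json.t. *)
type json = Obj of (string * json) list | Arr of json list | Str of string

let field name = function
  | Obj fields -> List.assoc name fields (* raises Not_found if absent *)
  | _ -> failwith ("not an object while looking for " ^ name)

let str = function Str s -> s | _ -> failwith "not a string"
let first = function Arr (x :: _) -> x | _ -> failwith "not a non-empty array"

type result_contents = {
  originated_contract : string;
  consumed_milligas : int;
  storage_size : int;
}

(* ASSUMED response shape:
   { "contents": [ { "metadata": { "operation_result": { ... } } } ] } *)
let parse_result_contents resp =
  let r =
    resp |> field "contents" |> first |> field "metadata"
    |> field "operation_result"
  in
  { originated_contract = r |> field "originated_contracts" |> first |> str;
    consumed_milligas = r |> field "consumed_milligas" |> str |> int_of_string;
    storage_size = r |> field "storage_size" |> str |> int_of_string }

(* Made-up sample data, not a real node response. *)
let sample =
  Obj [ "contents", Arr [ Obj [ "metadata", Obj [ "operation_result", Obj [
    "originated_contracts", Arr [ Str "KT1-placeholder" ];
    "consumed_milligas", Str "1234567";
    "storage_size", Str "46" ] ] ] ] ]
```

Like the rest of the example code, the sketch has no real error handling: it raises an exception on any unexpected shape.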

Updating the limits and fee

From the cost estimation, we update the gas and storage limits for the injection.

Gas limit

The simulation returns consumed_milligas in milligas. We use it for the gas limit, converting the unit to gas:

(* Let's add a buffer of 100 gas units to the gas_limit *)
let gas_limit =
  let open Z in
  Gas_limit_repr.Arith.integral_of_int_exn
    (Z.to_int (operation_result.consumed_milligas / ~$1000 + ~$100))

The actual gas used at injection may differ for various reasons: for example, the state of the source account may be different from the one at simulation time because of other transactions involving it. We add 100 more gas as a safety buffer.
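
One detail of the conversion: integer division rounds down, so 1_234_567 milligas becomes 1234 gas units although 1235 are needed. The buffer of 100 easily covers this, but if you prefer to round up explicitly, a minimal sketch with plain int arithmetic could look like this (gas_limit_of_milligas is a name made up for this example):

```ocaml
(* Convert milligas to gas units, rounding UP, then add a safety buffer.
   Plain ints are used here instead of Z.t to keep the sketch self-contained. *)
let gas_limit_of_milligas ?(buffer = 100) consumed_milligas =
  (consumed_milligas + 999) / 1000 + buffer
```

For example, gas_limit_of_milligas 1_234_567 evaluates to 1335, that is 1235 gas units plus the buffer of 100.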

Storage limit

It is a bit tricky, but storage_size cannot be used directly as the storage limit of the origination. When a contract is originated, the protocol considers that 257 more bytes are used for the storage space of the new contract. This value can be obtained from the RPC get /chains/main/blocks/head/context/constants:

% ./octez-client -E <NODE_URL> rpc get /chains/main/blocks/head/context/constants
{ ..,
  "origination_size": 257,
  .. }

For the storage limit we must include these bytes:

let storage_limit : Z.t =
  let open Z in
  operation_result.storage_size + ~$257

Fee

In order to make the operation included in a block by a validator, we must set an appropriate fee for it. A larger fee is safer but we want to save our expenses.

There may be generous validators which take operations with 0 fees, but there is a standard fee calculation in the protocol code for the client:

(* in nanotez: 10^{-9} tez *)
minimal_fees_in_nanotez
+ minimal_nanotez_per_gas_unit * gas
+ minimal_nanotez_per_byte * storage

The default values of these parameters are defined here:

default_minimal_fees = 100 mutez = 100_000 nanotez
default_minimal_nanotez_per_gas_unit = 100 nanotez
default_minimal_nanotez_per_byte = 1000 nanotez

From these parameters and the gas and storage limits, we can guess the minimal fee:

let minimal_fee_in_mutez =
  let open Z in
  (default_minimal_fees_in_nanotez
   + default_minimal_nanotez_per_gas_unit * Gas_limit_repr.Arith.integral_to_z gas_limit
   + default_minimal_nanotez_per_byte * storage_limit
   + ~$999) / ~$1000

We add 10% for safety:

let fee =
  Tez_repr.of_mutez_exn Z.(to_int64 (minimal_fee_in_mutez + minimal_fee_in_mutez / ~$10))
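
To make the arithmetic concrete, here is the same calculation as a self-contained sketch with plain int values instead of Z.t. It follows the formula exactly as written in this article; the function names and the sample limits (1_500 gas, 300 bytes) are made up for the illustration:

```ocaml
(* Default parameters quoted above, all in nanotez. *)
let minimal_fees_nanotez = 100_000 (* 100 mutez *)
let nanotez_per_gas_unit = 100
let nanotez_per_byte = 1_000

(* Minimal fee in mutez, rounded up to the next mutez. *)
let minimal_fee_mutez ~gas_limit ~storage_limit =
  let nanotez =
    minimal_fees_nanotez
    + (nanotez_per_gas_unit * gas_limit)
    + (nanotez_per_byte * storage_limit)
  in
  (nanotez + 999) / 1000

(* Add a 10% safety margin. *)
let fee_with_margin ~gas_limit ~storage_limit =
  let fee = minimal_fee_mutez ~gas_limit ~storage_limit in
  fee + (fee / 10)
```

With a gas limit of 1_500 and a storage limit of 300, the sum is 100_000 + 150_000 + 300_000 = 550_000 nanotez, that is 550 mutez, and 605 mutez after adding the margin.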

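(Note: I am still not very confident about this fee setting. In particular, the Octez documentation describes the per-byte term of the minimal fee as applying to the size in bytes of the serialized operation rather than to the storage limit, so the formula above should be read as an approximation.)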

With the updated limits and fee, we rebuild the contents_list:

let contents : _ Operation_repr.contents =
  Manager_operation {
    source;
    fee; (* updated *)
    counter;
    operation;
    gas_limit; (* updated *)
    storage_limit; (* updated *)
  }

let contents_list : _ Operation_repr.contents_list =
  Single contents

Signing

Now we sign this operation with the source account’s private key. First, we combine contents_list with the shell_header:

(* Using GADT to hide the parameter type *)
let packed_contents_list : Operation_repr.packed_contents_list =
  Contents_list contents_list

(* This is the target of key signing *)
type unsigned_operation =
  (Tezos_base.Operation.shell_header * Operation_repr.packed_contents_list)

let unsigned_operation : unsigned_operation =
  (shell_header, packed_contents_list)

We encode this unsigned operation to binary using Operation_repr.unsigned_operation_encoding:

(* sign target *)
let unsigned_operation_bin =
  Data_encoding.Binary.to_bytes_exn
    Operation_repr.unsigned_operation_encoding
    unsigned_operation

In Tezos, there is a signing convention of prefixing the operation binary with a “watermark” byte, 0x03. The watermarks are defined at Tezos_crypto.Signature_v1.watermark:

(* When signing an operation, we need a prefix called watermark 0x03 *)
let sign =
  Signature.sign
    ~watermark:Signature.Generic_operation (* 0x03 *)
    secret_key
    unsigned_operation_bin
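
The prefixing step itself is simple. The following sketch shows only that step with the standard Bytes module; hashing the prefixed bytes and producing the actual signature are done inside Signature.sign and are not reproduced here. The names generic_operation_watermark and with_watermark are made up for this example:

```ocaml
(* The one-byte watermark for generic operations. *)
let generic_operation_watermark = Bytes.make 1 '\x03'

(* Prepend the watermark to the bytes to be signed. *)
let with_watermark (unsigned_operation_bin : bytes) : bytes =
  Bytes.cat generic_operation_watermark unsigned_operation_bin
```

Because of the prefix, a signature produced for an operation cannot be reused for data of a different kind that happens to have the same bytes.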

Preapplication (optional)

Preapplication is optional, but it is a good way to see what will happen without injecting the operation. Unlike the simulation, the preapplication requires a real signature:

let sign_b58check = Signature.to_b58check sign

let preapplication_arg =
  let open Json in
  array [
    obj [ "protocol"  , string protocol
        ; "branch"    , string block_hash
        ; "contents"  , Json.construct Operation_repr.contents_list_encoding packed_contents_list
        ; "signature" , string sign_b58check
        ]
  ]

let () =
  let resp =
    RPC.post
      "/chains/main/blocks/head/helpers/preapply/operations"
      ~data: preapplication_arg
  in
  Format.eprintf "@[<2>preapplication_result:@ %a@]@." Json.pp resp

Injection

The injection RPC does not take JSON but raw data either in hex or binary.

The signed operation in this format is just a concatenation of the unsigned operation in binary and the signature:

let signed_operation_bin =
  Bytes.cat unsigned_operation_bin (Signature.to_bytes sign)

We post this binary in hex to the RPC with ?chain=main argument:

let () =
  let resp =
    RPC.post
      "/injection/operation?chain=main"
      ~data: (Json.string (hex signed_operation_bin))
  in
  Format.eprintf "%a@." Json.pp resp
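
The hex function used above is not defined in this article. A possible definition, written as a minimal sketch with the standard library only, encodes each byte as two lowercase hexadecimal digits:

```ocaml
(* Encode bytes as a lowercase hexadecimal string, two digits per byte. *)
let hex (b : bytes) : string =
  let buf = Buffer.create (2 * Bytes.length b) in
  Bytes.iter
    (fun c -> Buffer.add_string buf (Printf.sprintf "%02x" (Char.code c)))
    b;
  Buffer.contents buf
```

For example, hex (Bytes.of_string "\x03\xab\x00") evaluates to "03ab00".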

If the injection succeeds, we should see a hash of the injected operation, such as:

"onjQsLmtHBiPBsaPSCkLuykpczuHipraTefhHTNFfM4Vg62JVBV"

Wait and see the operation included in a block

We skip this final step in the example code. Here is only an outline of it.

We monitor the incoming blocks following the branch block (head~2) and find one which contains the operation we have injected.

We call RPC get /chains/main/blocks/head/header. If it is a new block, we scan from it back through its ancestors until we reach a block we have already seen. This gives us the list of blocks from the branch to the current head. We repeat this process periodically (do not spam the node too much!) until we find the operation in one of the blocks.
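
The scanning step can be sketched as follows. This is not part of the demo code: the record type block, the lookup function (which stands in for the RPC calls that fetch a block by its hash) and the function names are all assumptions made for this illustration, and the chain is just data in memory:

```ocaml
module String_set = Set.Make (String)

(* Hypothetical, minimal view of a block. *)
type block = { hash : string; predecessor : string; operations : string list }

(* Walk back from [hash] through the predecessors until a block already in
   [seen] is reached. Returns the new blocks, oldest first. *)
let rec new_blocks ~lookup ~seen hash acc =
  if String_set.mem hash seen then acc
  else
    match lookup hash with
    | None -> acc (* unknown block: stop scanning *)
    | Some b -> new_blocks ~lookup ~seen b.predecessor (b :: acc)

(* Find the first new block that contains the operation [op_hash]. *)
let find_inclusion ~lookup ~seen ~head ~op_hash =
  new_blocks ~lookup ~seen head []
  |> List.find_opt (fun b -> List.mem op_hash b.operations)
```

A real monitor would call find_inclusion each time the head changes, add the scanned blocks to seen, and sleep between the polls.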

If something goes wrong, for example if no validator takes the operation into a block because its fee is too low, we will never find the operation in the following blocks. We should give up waiting once the operation has not appeared for too many block levels. In that case, we can inject another operation with a higher fee. There is no risk of double execution as long as the operations use the same counter value; only one of them can be included in the blockchain.
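
Why the counter rules out double execution can be seen in a toy model. This is only a sketch of the rule as described in this article, not the protocol's implementation; the account type and include_operation are names made up for the illustration:

```ocaml
(* Toy account state: only the counter matters here. *)
type account = { mutable counter : int }

(* An operation is accepted only if its counter is exactly the successor of
   the account's current counter. Accepting it advances the counter. *)
let include_operation account ~op_counter =
  if op_counter = account.counter + 1 then begin
    account.counter <- op_counter;
    Ok ()
  end
  else Error "counter mismatch"
```

If the original operation and its higher-fee replacement both carry counter 42, whichever is included first moves the account counter to 42, and the other one is rejected from then on.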


  1. Some blockchains use the same hex representation for both secret and public keys, and people sometimes mix up the keys: accounts created using the public keys of other accounts as secret keys have been drained.