Is there any stable serialization method for different languages?

In my project we have an API, and many clients may send transactions to it. Transactions must be signed. Clients may be written in any language (C++, C#, Python, Go, and so on) and may run on any CPU architecture and endianness.

The problem now is to serialize our Transaction model into bytes, in order to be able to sign and then send it.

Our team selected protobuf v3.3.0 (proto syntax = proto3) for this purpose.

We wanted to use the envelope pattern, which looks like this:

message SignedTransaction {
  message Transaction {/* any data that should be signed */}
  Transaction transaction = 1;
  Signature signature = 2;
}

To sign, we just serialize the inner Transaction object:

Transaction tx = <...>;
std::string bytes = tx.SerializeAsString();
// and then sign bytes

The problem is that protobuf does not seem to be deterministic across languages. Today we wrote a simple proto file with a few integers and a string, filled it with the same data, serialized it in different languages, and compared the results.

We tried JavaScript, C++, Java, and Swift, and it turned out that every implementation except C++ produces the same output string:

JavaScript, Java, Swift produced: 08B90A10BA0A1A106C6F6C206B656B20636865627572656B

C++ produced: 8FFFFFFB9A10FFFFFFBAA1A106C6F6C206B656B20636865627572656B

C++ ParseFromString(str) is able to deserialize the strings produced by the other languages, but not vice versa.

Questions are:

Why does C++ protobuf produce a different string?
What libraries can we use for our use case?

Details:

// test.proto:
syntax = "proto3";
package api;

message Msg {
    uint32 a = 1;
    int32  b = 2;
    string c = 3;
    bytes  d = 4;
}

// test.cpp:
api::Msg msg;

msg.set_a(1337);
msg.set_b(1338);
msg.set_c("lol kek cheburek");

std::string str = msg.SerializeAsString();
// str = 8FFFFFFB9A10FFFFFFBAA1A106C6F6C206B656B20636865627572656B

Answer

It turned out that my code that prints the hex string had a bug in it.

Short answer: Protobuf is a stable serialization method and can be used for the described use case.
