Friday, April 29, 2022

A better C++ than C++?

tl;dr -- Is Rust the language that C++ promised but failed to deliver? It's still unclear.

Programmers are adults. We don't need our hands held, and we certainly don't need to be told "Don't do that!" The only kind of advice that is needed is "I see what you are *trying* to do. This might be a better way to do it..."

Case in point: reading data from external sensors. I am still working on the rollercoasterometer (now for 13 glorious years!) and am doing my usual fighting with C++ about getting formatted data out of a byte buffer. Back in the bad old days, we would cast the address of the buffer to a pointer to a struct, and read the data out directly. Then the compiler writers, with their fancy-schmancy alignment and such, said to not do that, because there is no guarantee that the structure will line up with what you think. The compiler is free to put padding between fields, add invisible fields like the vptr, and (before C++23, for fields under different access specifiers) even reorder the fields.

That by itself isn't bad. The bad part is when I ask how to do what I want, and the answer comes back: "Don't". There is no portable way to guarantee that any field in a struct lands anywhere. This way we can write C++ targeting the Apollo Guidance Computer, with its 15-bit words, no such concept as "byte", ones' complement arithmetic with +0 and -0, etc. It doesn't matter that basically every machine in the last 40 years has used two's complement, 8-bit bytes, and a word size that is a power-of-2 number of bytes. Basically the only disagreement is endianness.
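The language won't let me demand a layout, but it will at least refuse to compile when a layout I am assuming does not hold. A minimal sketch, assuming a hypothetical `PacketHeader` struct whose field names and sizes are invented for illustration (this is not the real rollercoasterometer format):

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical packet header, invented for this example.
struct PacketHeader {
    std::uint16_t apid;       // packet identifier
    std::uint16_t length;     // payload length in bytes
    std::uint32_t timestamp;  // timer ticks
};

// Each check fails the build instead of silently misreading data.
static_assert(CHAR_BIT == 8, "bytes are assumed to be 8 bits wide");
static_assert(std::is_standard_layout<PacketHeader>::value,
              "offsetof is only well-defined for standard-layout types");
static_assert(std::is_trivially_copyable<PacketHeader>::value,
              "the struct must be safe to copy byte by byte");
static_assert(sizeof(PacketHeader) == 8, "no padding expected");
static_assert(offsetof(PacketHeader, apid) == 0, "apid at byte 0");
static_assert(offsetof(PacketHeader, length) == 2, "length at byte 2");
static_assert(offsetof(PacketHeader, timestamp) == 4, "timestamp at byte 4");
```

This checks the layout rather than guaranteeing it: on a compiler that lays the struct out differently, the build simply fails, and endianness is not checked at all.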

But C++ won't let me take advantage of the fact that both machines in a transaction have the same native word format. No, I have to individually extract each byte, shift and OR it myself, etc., to get the data out. If I am lucky, the compiler will see that I am translating from English to English, and optimize it out.
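To make the shift-and-OR dance concrete, here is a sketch that pulls a hypothetical two-field record out of a byte buffer. The `Reading` struct, its wire format (little-endian, no padding), and all function names are assumptions made up for this example. The second parser uses `std::memcpy`, which is the closest thing the standard offers to "just read it in native format"; it only gives the right answer when the host byte order matches the wire.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical record: 16-bit id, then 32-bit value, little-endian on the wire.
struct Reading {
    std::uint16_t id;
    std::uint32_t value;
};

// Portable: assemble each field one byte at a time.
inline std::uint16_t read_u16_le(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t read_u32_le(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

inline Reading parse_portable(const unsigned char* buf) {
    assert(buf != nullptr);
    Reading r;
    r.id = read_u16_le(buf);
    r.value = read_u32_le(buf + 2);
    return r;
}

// Native: copy the bytes straight into each field.
// Correct only if the host is little-endian like the wire format.
inline Reading parse_native(const unsigned char* buf) {
    assert(buf != nullptr);
    Reading r;
    std::memcpy(&r.id, buf, sizeof r.id);
    std::memcpy(&r.value, buf + 2, sizeof r.value);
    return r;
}

// Runtime probe: true when the lowest-addressed byte holds the low bits.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

Optimizing compilers typically turn a small fixed-size `memcpy` like these into a plain load, but that is the "if I am lucky" part: nothing in the language promises it.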

Game designers learned long ago to learn from their users. If all the Minecraft users are building farms, then support the building of farms. If the farm depends on a glitch, consider formalizing the glitch and making it an official feature. Don't just shower them with whatever they are farming for "free", but don't take away their ability to farm either. For instance, Mojang has considered in several instances changing the mechanics of how villagers and iron golems work to discourage farming iron. They got quite a bit of pushback from the community, and have therefore backed off. They don't "support" iron farming, but they haven't removed it either.

The C++ committee, on the other hand, seems to be driven by two factors:

  1. Backward compatibility indefinitely into the past
  2. Ability of compiler writers to game benchmarks

C++ has a lot of good ideas (some might say too many), but it is a language that has evolved for decades without being able to shed old, bad ideas or old, bad implementations of good ideas.

C++, for whatever reason, also has a burning hatred for the preprocessor. The preprocessor and compiler are trapped in a loveless marriage, and the simmering hatred has boiled over to the point where the official C++ FAQ considers macros to be "evil". Now, the preprocessor does have some minuses, mainly in type checking. So, we were given constants and templates, and told that templates would basically eliminate the need for macros. However, when we tried to use them like that, we found that the promise has not quite been kept. Templates do some, but not all, of the compile-time processing that we want. Constexpr functions are helping, but aren't there yet.

My use case is that I want to make a self-documenting logger. The rollercoasterometer reads a bunch of sensors, formats the data into packets, then writes the packets to a file on the SD card. Since the data from the sensors doesn't naturally come in packets, and doesn't naturally have a timestamp, the main firmware timestamps the data and formats it into packets. Since I write the firmware, I also have to write the code on the host which interprets the packets. I came up with a clever idea to have the program emit a series of packet-documentation packets as it starts up. One way to do it is to have the program create a documentation packet the first time it emits each packet, and I have done this. It looks something like this:

void start_packet(apid, packet_desc_str) {
  if apid is not yet documented:
     write a documentation packet for this packet, using the packet_desc_str
  start the packet, perhaps to a backup buffer if we are documenting the packet
}
void fill<type>(data_value, field_desc_str) {
  if apid is not yet documented:
    write a documentation packet for this field, using the field_desc_str
  write the field to the packet, perhaps to the backup buffer
}
void finish_packet(apid) {
  if the apid is being documented:
    take note that we have documented it and don't do it next time
  finish the packet and write it, perhaps from the backup buffer
}
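For readers who want something they can compile, here is a minimal sketch of that runtime scheme. It is not the actual firmware: the `Logger` class, the reserved `kDocApid` value, and the documentation packet format (APID, described APID, one length byte, text) are all assumptions made up for illustration. For simplicity it always builds the data packet in the backup buffer, and it writes to a byte vector instead of an SD card.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical APID reserved for documentation packets.
constexpr std::uint16_t kDocApid = 0xFFFF;

class Logger {
public:
    void start_packet(std::uint16_t apid, const std::string& desc) {
        apid_ = apid;
        documenting_ = documented_.count(apid) == 0;  // first time we see it?
        backup_.clear();
        if (documenting_) write_doc(desc);
        put_u16(backup_, apid);  // the data packet starts in the backup buffer
    }

    void fill_u16(std::uint16_t value, const std::string& desc) {
        if (documenting_) write_doc(desc);
        put_u16(backup_, value);
    }

    void finish_packet() {
        if (documenting_) documented_.insert(apid_);  // don't document again
        documenting_ = false;
        out_.insert(out_.end(), backup_.begin(), backup_.end());
    }

    const std::vector<std::uint8_t>& output() const { return out_; }

private:
    // Append a 16-bit value, low byte first.
    static void put_u16(std::vector<std::uint8_t>& dst, std::uint16_t v) {
        dst.push_back(static_cast<std::uint8_t>(v & 0xFF));
        dst.push_back(static_cast<std::uint8_t>(v >> 8));
    }

    // Documentation packets go straight to the output, ahead of the data.
    void write_doc(const std::string& text) {
        assert(text.size() < 256);  // length is stored in one byte
        put_u16(out_, kDocApid);
        put_u16(out_, apid_);
        out_.push_back(static_cast<std::uint8_t>(text.size()));
        out_.insert(out_.end(), text.begin(), text.end());
    }

    std::vector<std::uint8_t> out_;     // stands in for the SD card file
    std::vector<std::uint8_t> backup_;  // data packet under construction
    std::set<std::uint16_t> documented_;
    std::uint16_t apid_ = 0;
    bool documenting_ = false;
};
```

Every description string is carried into every call at runtime, even though it is only used the first time each packet goes out.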

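Note that we need to have two buffers: the documentation packets have to go out while the data packet they describe is still being built, so the half-built data packet has to wait somewhere. It would be far better if we did something like this: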

void start_packet(apid,packet_desc_str) {
  compile_time_code {
    create a packet describing this packet in the packet description block. This block will be a read-only blob as far as the runtime code is concerned
  }
  start the packet, no need for backup buffer
}
void fill<type>(data_value,field_desc_str) {
  compile_time_code {
    Add a packet describing this field to the packet description block
  }
  add the field to the packet
}
void finish_packet() {
  finish the packet and write it
}
void setup() {
   open sd card
   write packet description block
   set up sensors etc
}
void loop() {
   read sensor
   start_packet(0x1234,"sensor packet");
   fill<u16>(value_from_sensor,"value from sensor");
   ...
   finish_packet();
}

Preprocessor macros might be able to do it, but the preprocessor is Evil. Templates might be able to do it, but it might require template metaprogramming, which is actually evil. It seems like there isn't a way to do it in C++, certainly not a clean way. Therefore we are forced to either use the official preprocessor, write our own preprocessor (which has its own headaches), or do it at runtime.
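To show where plain C++ runs out, here is the closest thing I can sketch without macros or heavy metaprogramming: a `constexpr` table that ends up as a read-only blob, plus a compile-time check against it. The `FieldDesc` struct, the table contents, and `field_count` are hypothetical names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical description of one field of one packet.
struct FieldDesc {
    std::uint16_t apid;  // packet this field belongs to
    const char* type;    // wire type, as text
    const char* name;    // human-readable description
};

// The packet description block: built at compile time, read-only at runtime.
constexpr FieldDesc kDescriptions[] = {
    {0x1234, "u16", "value from sensor"},
    {0x1234, "u32", "timestamp"},
};

constexpr std::size_t kDescriptionCount =
    sizeof(kDescriptions) / sizeof(kDescriptions[0]);

// Count the described fields of one packet, evaluated by the compiler.
constexpr std::size_t field_count(std::uint16_t apid, std::size_t i = 0) {
    return i == kDescriptionCount
               ? 0
               : (kDescriptions[i].apid == apid ? 1 : 0) +
                     field_count(apid, i + 1);
}

// The build breaks if the table and this expectation drift apart.
static_assert(field_count(0x1234) == 2, "sensor packet should have 2 fields");
```

The catch is that the table lives apart from the `fill` calls, so every field is described in one place and written in another. Keeping the description at the call site, as in the pseudocode above, is exactly the part this sketch does not deliver.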

The language which might be a better C++ than C++ is Rust. This is a statically typed language which is designed to be compiled into good machine code (the reference compiler uses LLVM as its back end) but with some features added and some taken away. I'm not sure if I like mutability and ownership yet, and haven't gotten used to the concept of borrowing yet, but it does look like there is support for forcing a struct to land on certain bytes, and it does look like (with procedural macros) there might be enough compiler support for compile-time computation.

To test this out, I am going to work on three projects:
  1. A conversion of kwantrace to Rust, to experiment with plain application-domain programming
  2. A packet parser for reading rollercoasterometer logs
  3. Firmware for the rollercoasterometer

The last one is probably not going to be ready in time for my next expedition to a roller coaster.