Type sizes (bytes):
char      : 1
short     : 2
int       : 2
long      : 4
long long : 8
bool      : 1
void*     : 2

Operation timings:
Note: timings for some operations depend heavily on the input data
nop           0.07 usec/call
cli/sei       0.14 usec/call
micros()      3.17 usec/call
millis()      1.47 usec/call
fadd          9.18 usec/call
fsub          9.25 usec/call
fmul          7.96 usec/call
fdiv          7.96 usec/call
dadd          9.18 usec/call
dsub          9.25 usec/call
dmul          6.14 usec/call
ddiv          6.07 usec/call
sin()       112.35 usec/call
cos()       113.11 usec/call
tan()       154.92 usec/call
acos()      167.69 usec/call
asin()       82.16 usec/call
atan2()     193.34 usec/call
sqrt()       31.67 usec/call
iadd8         0.48 usec/call
isub8         0.48 usec/call
imul8         0.67 usec/call
idiv8         5.57 usec/call
iadd16        0.92 usec/call
isub16        0.92 usec/call
imul16        1.42 usec/call
idiv16       13.31 usec/call
iadd32        1.80 usec/call
isub32        1.80 usec/call
imul32        4.69 usec/call
idiv32       37.89 usec/call
iadd64        7.96 usec/call
isub64        8.53 usec/call
imul64       45.82 usec/call
idiv64      230.95 usec/call
memcpy128    56.76 usec/call
memset128    49.84 usec/call
delay(1)   1007.65 usec/call
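
A minimal sketch of how a "usec/call" figure like those above can be
derived: run the operation many times, time the whole loop, and divide
by the iteration count. This is not the benchmark that produced the
table. The names usec_per_call, op_under_test and sink are hypothetical,
and the standard C clock() stands in for the board's micros() so that
the sketch also runs on a host. Loop overhead is not subtracted here.

```c
#include <time.h>

/* volatile so the compiler cannot optimise the measured work away */
volatile unsigned long sink;

/* Hypothetical operation to measure: a single add on a volatile. */
void op_under_test(void)
{
    sink += 1;
}

/* Average cost of one call to fn(), in microseconds, measured over
 * `iterations` calls. Assumes iterations > 0. */
double usec_per_call(void (*fn)(void), unsigned long iterations)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++) {
        fn();
    }
    clock_t end = clock();

    double total_usec = (double)(end - start) * 1e6 / (double)CLOCKS_PER_SEC;
    return total_usec / (double)iterations;
}
```

Calling usec_per_call(op_under_test, 100000UL) returns the averaged
cost per call. A larger iteration count reduces the effect of timer
resolution on very short operations such as nop.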



Additional notes:

 eeprom_read_byte: 2 usec/call
 eeprom_write_byte: the first call costs 5 usec; each subsequent byte
                    costs 3480 usec, because the call waits for the
                    EEPROM to finish writing the previous byte
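
 A rough cost model for the eeprom_write_byte figures above, assuming
 the bytes are written back to back so that every write after the first
 pays the full wait. The function name eeprom_write_cost_usec is
 hypothetical; the constants are the two figures quoted in this note.

```c
/* Figures taken from the note above, in microseconds. */
#define EEPROM_FIRST_WRITE_USEC 5UL
#define EEPROM_NEXT_WRITE_USEC  3480UL

/* Estimated time to write n_bytes with consecutive eeprom_write_byte
 * calls: the first call returns quickly, and each later call waits for
 * the previous byte to finish. */
unsigned long eeprom_write_cost_usec(unsigned long n_bytes)
{
    if (n_bytes == 0) {
        return 0;
    }
    return EEPROM_FIRST_WRITE_USEC + (n_bytes - 1) * EEPROM_NEXT_WRITE_USEC;
}
```

 For example, 16 consecutive bytes come to 5 + 15 * 3480 = 52205 usec,
 or about 52 ms.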

 pgm_read_byte: 0.5 usec per byte

 dataflash write:
   50 usec per byte, same on APM1 and APM2
   if the API is changed to do block writes, this would be 8 usec per byte
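
 To put the two dataflash rates side by side, the sketch below computes
 the write time for a buffer at each rate. The function names and the
 512-byte buffer in the example are hypothetical; only the 50 and 8 usec
 per byte rates come from the note above.

```c
/* Per-byte rates from the note above, in microseconds. */
#define DATAFLASH_BYTE_WRITE_USEC  50UL  /* current byte-at-a-time API */
#define DATAFLASH_BLOCK_WRITE_USEC 8UL   /* figure quoted for block writes */

/* Time to write n_bytes one byte at a time. */
unsigned long dataflash_bytewise_usec(unsigned long n_bytes)
{
    return n_bytes * DATAFLASH_BYTE_WRITE_USEC;
}

/* Time to write n_bytes at the block-write rate. */
unsigned long dataflash_block_usec(unsigned long n_bytes)
{
    return n_bytes * DATAFLASH_BLOCK_WRITE_USEC;
}
```

 For a 512-byte buffer this gives 25600 usec byte by byte against
 4096 usec with block writes.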