Some assembly tricks:
* self-modifying code (SMC) instead of checking the XMS
driver address in the DOS DS stub,
* SMC patches the address directly into the operand of a
`call far immediate` instruction,
* use `repe cmpsw` to compare multiple words (saves
space over individual word compares),
* near calls to far functions use `push cs` before the
call to build a far-call stack frame,
* segments 0 and FFFFh generated by segment arithmetic
instead of loading from memory,
* common case (A20 already enabled) made to be the case
where the conditional branch just falls through, which
may be slightly faster (a not-taken conditional jump costs
fewer cycles than a taken one on 8086-class CPUs).
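
To illustrate why segments 0 and FFFFh matter for the A20 test, here is a rough sketch that models real-mode address formation in portable C. It assumes the check is the customary wrap-around comparison (0000:off against FFFF:off+10h), which this message does not spell out. The names `linear_address`, `a20_looks_enabled` and `a20_demo`, and the simulated memory image, are hypothetical and not taken from the kernel source.

```c
#include <stdint.h>
#include <string.h>

/* Real-mode address formation: segment * 16 + offset.
   With the A20 gate off, address bit 20 is forced to zero,
   so FFFF:0010 wraps around to linear address 0. */
static uint32_t linear_address(uint16_t seg, uint16_t off, int a20_gate)
{
    uint32_t addr = ((uint32_t)seg << 4) + off;
    return a20_gate ? addr : (addr & ~(UINT32_C(1) << 20));
}

/* Compare `words` 16-bit words at 0000:off and FFFF:off+10h in a
   simulated memory image, roughly what one `repe cmpsw` would do.
   Different contents mean the two addresses do not alias, i.e. A20
   is on. Identical contents are inconclusive on real hardware. */
static int a20_looks_enabled(const uint8_t *mem, uint16_t off,
                             unsigned words, int a20_gate)
{
    uint32_t low  = linear_address(0x0000, off, a20_gate);
    uint32_t high = linear_address(0xFFFF, (uint16_t)(off + 0x10), a20_gate);
    return memcmp(mem + low, mem + high, words * 2u) != 0;
}

/* Hypothetical driver: builds a memory image with a pattern at
   address 0 and zeros just above 1 MiB, then runs the check. */
static int a20_demo(int a20_gate)
{
    static uint8_t mem[0x110000];
    memset(mem, 0, sizeof mem);
    for (unsigned i = 0; i < 8; i++)
        mem[i] = (uint8_t)(i + 1);
    return a20_looks_enabled(mem, 0x0000, 4, a20_gate);
}
```

In this model `a20_demo(1)` reports the line as enabled and `a20_demo(0)` reports it as disabled, because with the gate off both segment:offset pairs resolve to the same bytes.
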
DOSTEXT(x) accesses x in LGROUP, and DOSDATA(x) accesses x in DGROUP.
These macros are necessary since ia16-elf-gcc does not understand data
in far segments.
For symbols not wrapped in these macros, FP_SEG needs to be replaced
by explicit segment references.
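
As a minimal sketch of what an "explicit segment reference" amounts to, the following portable C models a far pointer as a 32-bit segment:offset pair. It is an assumption-laden illustration, not the kernel's code: `farptr_t`, `mk_fp`, `fp_seg`, `fp_off` and `far_ref_explicit` are hypothetical names that mimic the traditional DOS compiler macros, and the segment value passed in stands in for whatever segment the symbol's group is actually loaded at.

```c
#include <stdint.h>

/* Hypothetical far-pointer model: segment in the high word,
   offset in the low word. */
typedef uint32_t farptr_t;

static farptr_t mk_fp(uint16_t seg, uint16_t off)
{
    return ((farptr_t)seg << 16) | off;
}

static uint16_t fp_seg(farptr_t fp) { return (uint16_t)(fp >> 16); }
static uint16_t fp_off(farptr_t fp) { return (uint16_t)(fp & 0xFFFFu); }

/* When the compiler only knows a symbol's near offset, the segment
   cannot be recovered from the pointer itself, so the caller supplies
   it explicitly (hypothetical helper). */
static farptr_t far_ref_explicit(uint16_t group_seg, uint16_t near_off)
{
    return mk_fp(group_seg, near_off);
}
```

The point of the sketch is only that the segment half has to come from somewhere other than the near pointer; how the real code obtains it is not described here.
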