Abstraction layers
Assembly, however, seems to be more intimidating than it actually should be. With all the abstractions that we have today, people tend to forget the fundamentals of computing systems, such as bytes, or never learned them in the first place. Everything that happens behind the scenes is treated as stable magic. I think that is awfully sad. Structurally knowing how something works in each layer of the system is the most wonderful thing about this field of work.
I began with binary and building CPUs in Minecraft many years ago. Before I ever touched a line of C, I programmed individual bits of machine language into some torch ROM and had the CPU do simple counting. It was not comparable to common microprocessors (because it was a NISC, had two registers, no other memory, and no ability to do loops or conditional execution), but for 8-year-old me, it was super exciting to see something logical work as expected. And don't get me wrong, there are so many things that I wish I knew, or knew better, but currently don't. Modern video processing is one of those things.
Writing assembly requires you to visualise the machine state and imagine transfers of data from one register to the next. It involves knowing exactly what an instruction does, its side effects, and the correct order of operations to perform a useful task (for example, knowing what operations you can or cannot do that clobber the flags register before a branch, to save a few instructions; see line 17 in the example below). Any abstraction above assembly will make the flags register of the CPU invisible.
Do you trust your dependencies?
Recently, I've been writing a little bit of TypeScript for work to
control the firmware of another device. It made me think about the
levels of abstraction that software depends on. It goes way further
than the enormous node_modules directory that you can quickly
end up with, although that's pretty bad, too. The runtime of the
application probably uses lots of software from different vendors, too,
which accounts for processes like memory allocation, deallocation,
communication with the operating system, and the general reliability of
the application. Do you trust your interpreter?
Most software depends on the system C library, or libc, to provide
basic functions and interfaces to the kernel. It would be catastrophic
if a libc's malloc were not optimised the way the one in glibc or musl is.
In any case, all applications today depend on the operating system to
host an environment which is suitable for the level of abstraction that
is used. And don't forget, that consists of a lot of different parts,
too. It is just assumed that all basic principles are well tested when
other applications run on it. I would be impressed if Slack made a
release for bare metal (if only it wasn't Electron).
Modern complexity
I'd say that writing assembly in abstracted environments, that is, constrained by an underlying operating system, is far more scary. Everything about computing systems has grown more complex: speed, execution, caching, and input/output control. For one, a simple read from memory goes through layers of caches and block operations. Fifty years ago, by contrast, you could monitor the bus directly because you controlled the components directly.
The scale of demand has also become enormous. Some time ago, Pong was the most innovative videogame there was. You could change the video output by changing a bitmap in memory. Now, there are shaders and realistic physics. So much more processing time is necessary.
A funny example of 'scary assembly' that I recently wrote was in this commit in ZOS. The ZOS project is an operating system for the Z80. It's just the microprocessor, some input/output, wiring, and myself.
In the snippet, a dictionary is traversed to find the instruction
address for a given command. Register hl is used as the search
string pointer, and ix as the dictionary pointer. To make this
work, the search string is repeatedly compared with the keys of
the dictionary until a match is found. The zero flag reports the
result: it is reset (non-zero) when a match is found, in which
case register ix points to the corresponding value in the
dictionary.
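For readers who think more easily in C, the lookup described above can be sketched roughly as follows. This is not the ZOS code: the byte layout (a NUL-terminated key, then a 2-byte little-endian value, with a lone zero byte ending the dictionary), the width of the value, and the helper name value_of are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout (hypothetical, not taken from ZOS):
 *   key bytes, 0x00, value_lo, value_hi, key bytes, 0x00, ..., 0x00
 * A zero byte where a key should start terminates the dictionary. */

/* Returns a pointer to the 2-byte value of the matching key,
 * or NULL when the termination byte is reached without a match. */
static const unsigned char *map_find(const unsigned char *dict,
                                     const char *needle)
{
    assert(dict != NULL && needle != NULL);
    while (*dict != 0) {                      /* termination byte check */
        size_t key_len = strlen((const char *)dict);
        /* The value starts right after the key's terminating NUL. */
        const unsigned char *value = dict + key_len + 1;
        if (strcmp((const char *)dict, needle) == 0)
            return value;                     /* match: point at the value */
        dict = value + 2;                     /* skip value, seek next key */
    }
    return NULL;
}

/* Decodes the assumed 2-byte little-endian value. */
static unsigned value_of(const unsigned char *v)
{
    return (unsigned)v[0] | ((unsigned)v[1] << 8);
}
```

In C the "found or not" result travels in the return value (a pointer or NULL). In the assembly version that job is done by the zero flag, which no C-level construct can express directly.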
Firstly, you may find it odd that the entrypoint of the subroutine
map_find is in the middle of the assembly block. This
allows control flow to travel into it without any additional jump
instructions. The search pointer hl is stored onto the
stack because it needs to be reset on each search. As the Z80 doesn't
have a stack peek instruction, it is popped (line 22) from the stack
and pushed (line 23) back onto it.
Subsequently, a byte is loaded from the dictionary pointer ix to
check whether the dictionary has reached its termination byte.
A load instruction doesn't affect any flags, so the instruction
or 0 is used to route the accumulator through the ALU and
update the zero flag. After streq is called (conveniently
using the same pointer registers), its result is checked.
streq modifies the registers as a side effect, and the
ix register needs to seek its next dictionary key if the
strings were not equal (from line 16). Otherwise, the next address of
ix points to the value after the string termination byte
(line 29). An or 1 instruction (line 30) resets the zero
flag, which is part of the return contract of the subroutine.
