On Names
One of my favorite things in this world is the dishwasher because it has a great name (and because it does the dishes for you!). What does a dishwasher do, you ask? Well, it washes dishes, of course! It’s a great name because the meaning can be understood from the composition of its parts, assuming you know what a dish is and what a washer is. It also doesn’t refer to a Whirlpool Dish-O-Matic 9000 with dual action scrubbing heads; it just means “that appliance that shoots hot soapy water on your dishes because why would we do that with our hands anymore”. We take the relative ease of reaching consensus on words like this for granted in everyday life, yet we battle with naming in places like mathematics, the sciences, and programming.
Names are pretty overloaded in their purpose. Sometimes we use a name to uniquely (up to a reasonable degree) identify something, the way a person’s name identifies a particular human. Other times we use a name, like “dishwasher”, to identify a whole class of things with some shared trait(s). Often an item is identifiable by more than one name: a dishwasher is also an appliance. If you wanted to tell your landlord your dishwasher was broken, it would be pretty useless to tell them that your appliance is broken because that is ambiguous. Going down the chain of these names (appliance, kitchen appliance, large kitchen appliance, etc.), we find that dishwasher uniquely identifies the thing you’re talking about given the context – you and your landlord share the assumption that you will only be talking about your own appliances. So we see that the number of things which you could be referring to changes with your shared context. An important thing to note here is that for the purposes of communication, context is always a shared thing.
Names are useful because they let us refer to things without writing that exact thing out all the time – they act as a sort of compression. Someone who doesn’t want to write much could simply say, “I want to tell you b8aa2a36-c9e8-4c3f-9d22-d323f7e730a1” every time they want to say something. They have simply given a new name (a universally unique identifier) which somehow references whatever they want to say, and left it as an exercise to the reader to look up this (presumably) new name. This is extremely compact for the speaker because that name could reference all of Wikipedia, so they can refer to an enormous body of text in just a few characters. Of course, what is efficient for the speaker comes at the cost of the reader, who must look up (presumably in a very large dictionary) what that name refers to. The space savings are even more pertinent if the sentence were, “I was reading b8aa2a36-c9e8-4c3f-9d22-d323f7e730a1 the other day and wanted to tell you about b8aa2a36-c9e8-4c3f-9d22-d323f7e730a1”. Now, instead of repeating all of Wikipedia verbatim twice, we can do so with our awesome name. Of course, our typical language would conquer this problem with nifty things like “it”, so the sentence could also be “I was reading b8aa2a36-c9e8-4c3f-9d22-d323f7e730a1 the other day and wanted to tell you about it”. Through some hocus pocus, we all know which thing “it” references in this sentence, and this lets us reuse the same word to reference many things in different contexts. In this way, “it” is a name for the class of all other words, but is unique in the context of a sentence (modulo ambiguous phrases).
“It” is interesting because it gives us a temporary name to refer to something that doesn’t have – or perhaps deserve – a name in its own right. Consider: “You remember that recipe I made last week? I want to make it again”. If we had to constantly generate new names every time we wanted to refer to something, we would have enormous dictionaries and spend inordinate amounts of time reading them. Even without “it”, it can be more economical to just repeat what you need verbatim. Language is somehow optimizing a cost function on the information bandwidth in communication by varying the reader and writer’s compression level along with the size of the shared context necessary.
Computer science gives us a new perspective on names. Whereas b8aa2a36-c9e8-4c3f-9d22-d323f7e730a1 was a name that I just made up to refer to all of Wikipedia, we could also derive a name which refers to all of Wikipedia by using a hash function. Cryptographic hash functions give us confidence that no other thing will generate the same name (uniqueness) and also let people check whether the thing they have is the same thing the name refers to (verifiability) by re-deriving the name. It is interesting to think about the situations which benefit from using a derived name vs a given name.
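To make the given/derived distinction concrete, here is a minimal sketch in Python. The choice of SHA-256, the function names given_name and derived_name, and the sample text are all my own assumptions for illustration; any cryptographic hash would make the same point.

```python
import hashlib
import uuid


def given_name() -> str:
    """A given name: made up on the spot, says nothing about the content."""
    return str(uuid.uuid4())


def derived_name(content: bytes) -> str:
    """A derived name: computed from the content itself (SHA-256 hex digest)."""
    return hashlib.sha256(content).hexdigest()


# Stand-in for "all of Wikipedia" (hypothetical sample text).
article = b"Names are useful because they act as a sort of compression."

name = derived_name(article)

# Verifiability: anyone holding the content can re-derive the name and compare.
assert derived_name(article) == name

# Changing the content changes the name.
assert derived_name(article + b"!") != name

# A given name only means something through a shared lookup table.
registry = {given_name(): article}
```

The derived name needs no registry to check: the content is its own proof. The given name is shorter to invent, but it is useless without the shared dictionary.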
Programming uses a lot of names. Many of them fall on a spectrum from derived to given: a name reused from common parlance, like “Matrix” for your datatype, is given by you but also derived from a shared understanding of what a matrix is. In any case, the thing that has been bothering me lately about names in programming is all the libraries/packages that use given names only distantly, if at all, related to what they do. Often they are supposed to be cute, clever, funny, or just somehow memorable, but why do they need a name to begin with?
To answer the above, first a detour.
Writing a program requires wearing two hats. The first hat is used for specifying how to do a certain operation, for example sorting a list of numbers. There are plenty of ways to sort numbers and we expect that each implementation arrives at the same answer, but if you are writing a new sorting function or specializing/improving an existing sorting function, then you are deeply concerned with exactly how things get done. The second hat is used for specifying what the result of a certain operation should be. Today, the second hat is almost always worn over the top of the first hat because (practically speaking) all of your code directly references an implementation of a function, which specifies a how and not only a what. To approach a clear separation between the two, we must leave all function calls abstract (e.g. by naming them as an enclosing function parameter, or by using functors, type-classes, traits, interfaces, etc.). In many cases, the pursuit of these pervasive interfaces is unachievable or at least inefficient, because many things we want to program are made of some bizarro business/real-world rules, exceptions, but-then-also-add-this-back-in, and and-then-only-if-I-feel-like-it operations that define their own interface, and that interface will only ever have one implementation.
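Here is a rough sketch of the two hats in Python, using the “enclosing function parameter” approach. The names Sorter, insertion_sort, builtin_sort, and median are hypothetical, chosen only for this illustration.

```python
from typing import Callable, List

# The "what": anything that turns a list of ints into a sorted list of ints.
Sorter = Callable[[List[int]], List[int]]


def insertion_sort(xs: List[int]) -> List[int]:
    """First hat: one particular 'how'. Builds a new sorted list."""
    out: List[int] = []
    for x in xs:
        i = len(out)
        # Walk left past every element larger than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def builtin_sort(xs: List[int]) -> List[int]:
    """First hat again: a different 'how' with the same result."""
    return sorted(xs)


def median(xs: List[int], sort: Sorter) -> int:
    """Second hat: needs a sorted list and does not care how it is produced.

    Returns the upper median for even-length input.
    """
    ordered = sort(xs)
    return ordered[len(ordered) // 2]


# The caller picks the 'how'; median only ever names the 'what'.
assert median([5, 1, 4, 2, 3], insertion_sort) == 3
assert median([5, 1, 4, 2, 3], builtin_sort) == 3
```

Note that median never mentions an implementation by name. The cost is that somebody, somewhere, still has to pick one and pass it in.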
And now back to answer our question.
It seems clear that people use clever package names as a form of namespacing, allowing them to directly call their implementation of matrix-multiply and not Alice’s. This is obvious to state, but I think we all need to think about what this means for how we share code, what it means for code reuse, and how it affects software over time. I feel like the same mantras about coupling, “interface, not implementation”, and so on are repeated over and over, and yet we still program like this. I’m not convinced that our needs are so different that we each require our own implementation to specify some behavior. And once more than one person uses that thing, doesn’t that automatically rule out the idea that the function was so special it needed its own implementation? There is obviously something common going on, and once that happens, I think it should be made an interface, and consumers should only reference that functionality via the name of the interface and not the implementation.
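As a sketch of what “reference the interface, not the implementation” might look like, here is a small Python example using typing.Protocol. The interface name Matrices and the two implementations NaiveMatrices and TransposedMatrices are invented for illustration; they stand in for “mine” and “Alice’s”.

```python
from typing import List, Protocol

Matrix = List[List[float]]


class Matrices(Protocol):
    """An interface named for what it provides, not for who wrote it."""

    def multiply(self, a: Matrix, b: Matrix) -> Matrix: ...


class NaiveMatrices:
    """One hypothetical implementation: the textbook triple loop."""

    def multiply(self, a: Matrix, b: Matrix) -> Matrix:
        return [
            [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))
        ]


class TransposedMatrices:
    """Another hypothetical implementation: transpose b, then take dot products."""

    def multiply(self, a: Matrix, b: Matrix) -> Matrix:
        cols = list(zip(*b))
        return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def square(m: Matrix, matrices: Matrices) -> Matrix:
    """Consumer code: only the interface name appears here."""
    return matrices.multiply(m, m)


m = [[1.0, 2.0], [3.0, 4.0]]
assert square(m, NaiveMatrices()) == square(m, TransposedMatrices())
```

The consumer, square, would not change if a third implementation showed up tomorrow. Only the single place that chooses an implementation needs to know its exact name.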
I dream of a language where things are named by their essence. I want “packages” (if there is an analogue) with names like Maths, Strings, Trees, Drawings, etc. Using an exact name should be the rare case and reused only inside your codebase. We should seek to unify our understanding of as many interfaces as possible, perhaps even more than we are comfortable with in our typical human languages because we have these machines that can remember far more words than we can.