The Cannery — Is a tomato a fruit or a vegetable?

At times, we distinguish between what something is and how something is used. The tomato is a classic example. Botanically speaking, it is a fruit (a “sweet and fleshy product of a tree or other plant that contains seed and can be eaten as food”). In a culinary context, we classify it as a vegetable because we use it in savory dishes.

In software, we often write code that looks like one thing to its authors and another to its consumers.

For example, a Set is a collection. See

in JavaScript, the MDN Web Docs list Set under the heading “Keyed Collections”
in .NET, HashSet<T> lives in the System.Collections.Generic namespace and implements ICollection<T>
in Java, Set implements Collection and Iterable
and so on.

As a tomato is botanically a fruit, a Set is a collection.

But.

Suppose you are writing a Java class that checks if an ID is valid. You implement the validator like so:

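import java.util.List;

@FunctionalInterface
interface IdRule {
    boolean isValid(String id);
}

class IdValidator {
    private final List<IdRule> rules;

    public IdValidator(List<IdRule> rules) {
        this.rules = rules;
    }

    public boolean validate(String id) {
        // Assumed behavior: an ID is valid only if every rule accepts it
        return rules.stream().allMatch(rule -> rule.isValid(id));
    }
}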

You have several rules defined for your IDs:

import java.util.Set;

class IdLengthRule implements IdRule {
    private static final int MINIMUM_LENGTH = 20;

    public boolean isValid(String id) {
        // Ensure the length meets or exceeds the minimum
        return id.length() >= MINIMUM_LENGTH;
    }
}

class IdUniquenessRule implements IdRule {
    private final Set<String> existingIds;
    
    public IdUniquenessRule(Set<String> existingIds) {
        this.existingIds = existingIds;
    }

    public boolean isValid(String id) {
        // Ensure the ID does not already exist in the system
        return !this.existingIds.contains(id);
    }
}

Here we have a Rule that is implemented with a Collection: The IdUniquenessRule is implemented with a Set<String>. It is not too hard to separate the ideas because we have different objects for each: the field existingIds is a collection; the class IdUniquenessRule defines the rule.
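To see the pieces working together, here is a minimal runnable sketch. It assumes that validate accepts an ID only when every rule accepts it (the original listing leaves the method body open), and the sample IDs and the ValidatorDemo class are made up for illustration.

```java
import java.util.List;
import java.util.Set;

@FunctionalInterface
interface IdRule {
    boolean isValid(String id);
}

class IdLengthRule implements IdRule {
    private static final int MINIMUM_LENGTH = 20;

    public boolean isValid(String id) {
        // Ensure the length meets or exceeds the minimum
        return id.length() >= MINIMUM_LENGTH;
    }
}

class IdUniquenessRule implements IdRule {
    private final Set<String> existingIds;

    IdUniquenessRule(Set<String> existingIds) {
        this.existingIds = existingIds;
    }

    public boolean isValid(String id) {
        // Ensure the ID does not already exist in the system
        return !existingIds.contains(id);
    }
}

class IdValidator {
    private final List<IdRule> rules;

    IdValidator(List<IdRule> rules) {
        this.rules = rules;
    }

    // Assumption: an ID is valid only when every rule accepts it
    boolean validate(String id) {
        return rules.stream().allMatch(rule -> rule.isValid(id));
    }
}

// Hypothetical demo class with made-up sample IDs
class ValidatorDemo {
    public static void main(String[] args) {
        Set<String> existingIds = Set.of("ORDER-2024-0000000001");
        IdValidator validator = new IdValidator(
                List.of(new IdLengthRule(), new IdUniquenessRule(existingIds)));

        System.out.println(validator.validate("ORDER-2024-0000000002")); // true: long enough and unused
        System.out.println(validator.validate("ORDER-2024-0000000001")); // false: already exists
        System.out.println(validator.validate("short"));                 // false: too short
    }
}
```

The validator only ever sees IdRule, so it has no idea that one of its rules is backed by a Set.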

It gets fuzzier if we skip the IdUniquenessRule class and take advantage of the fact that Set<String>::contains is compatible with IdRule::isValid: both can be called with a String and return a boolean.

Instead of creating a class for the IdUniquenessRule, we can refer directly to the contains method:

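Set<String> existingIds = getExistingIds();
// contains is true for IDs that already exist, so negate it to get a uniqueness rule
Predicate<String> alreadyExists = existingIds::contains; // java.util.function.Predicate
IdRule idUniquenessRule = alreadyExists.negate()::test;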

Now, the rule and the collection are harder to separate. Is this thing a rule or a collection? Is a tomato a fruit or a vegetable?
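When reusing contains as a rule, it is worth checking which question the method reference answers. Set::contains returns true for an ID that is already present, whereas IdUniquenessRule.isValid returns true for an ID that is not. The sketch below is one way to compare the two; ExistingIdRules and its two factory methods are hypothetical names introduced only for this illustration.

```java
import java.util.Set;
import java.util.function.Predicate;

@FunctionalInterface
interface IdRule {
    boolean isValid(String id);
}

// Hypothetical helper class, not part of the original example
class ExistingIdRules {
    // Bare method reference: "valid" here means the ID is already in the set
    static IdRule alreadyExists(Set<String> existingIds) {
        return existingIds::contains;
    }

    // Negated form: "valid" means the ID is not in the set yet
    static IdRule isUnique(Set<String> existingIds) {
        Predicate<String> inSet = existingIds::contains;
        return inSet.negate()::test;
    }

    public static void main(String[] args) {
        Set<String> existingIds = Set.of("abc", "def");

        System.out.println(alreadyExists(existingIds).isValid("abc")); // true
        System.out.println(isUnique(existingIds).isValid("abc"));      // false
        System.out.println(isUnique(existingIds).isValid("xyz"));      // true
    }
}
```

Either way, the value assigned to the IdRule variable is derived straight from the collection, which is exactly why the two ideas blur together.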

It depends who is asking.

Example: encapsulation respects both what the code is and what it’s for

We can embrace this tension and use method and class names to separate the identity from the utility. Inside the class, we are precise about the identity. Outside the class, we are clear about the utility.

Consider these three examples. The first two choose either identity or utility. The last one encapsulates the identity and expresses the utility.

In all three of these examples, a function returns a rule. The function takes no arguments and returns something we can use to check that a given ID does not already exist.

Identity focus (“fruit”)

Function<String, Boolean> getExistingIdSetRule() {
    return id -> !existingIds.contains(id);
}

This function precisely expresses the identity of the parts (“the tomato is a fruit!”). The return type is Function<String, Boolean>, emphasizing that we are returning a thing that takes in a string and returns a boolean. The name is getExistingIdSetRule, which emphasizes that the rule this function returns is all about a set of existing IDs. In the implementation, the ID set is called existingIds, which emphasizes its true nature as a collection.
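One practical cost of the identity-focused signature is that a Function<String, Boolean> is not an IdRule, so a caller who wants to hand it to something expecting IdRule has to adapt it first. The sketch below illustrates that step, assuming the IdRule interface from earlier; IdentityFocusDemo, its sample data, and the asIdRule helper are hypothetical.

```java
import java.util.Set;
import java.util.function.Function;

@FunctionalInterface
interface IdRule {
    boolean isValid(String id);
}

// Hypothetical class holding made-up sample data
class IdentityFocusDemo {
    private static final Set<String> existingIds = Set.of("abc");

    // The identity-focused version from the text
    static Function<String, Boolean> getExistingIdSetRule() {
        return id -> !existingIds.contains(id);
    }

    // Hypothetical adapter: the caller must bridge the two types by hand.
    // The Boolean returned by apply is unboxed to the boolean isValid expects.
    static IdRule asIdRule(Function<String, Boolean> function) {
        return function::apply;
    }

    public static void main(String[] args) {
        IdRule rule = asIdRule(getExistingIdSetRule());
        System.out.println(rule.isValid("abc")); // false: already exists
        System.out.println(rule.isValid("xyz")); // true: not in the set
    }
}
```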

Utility focus (“vegetable”)

IdRule getIdUniquenessRule() {
    return id -> !idUniquenessData.contains(id);
}

The signature of this method improves over the previous one: the return type (IdRule) and the method name (getIdUniquenessRule) clearly express how this thing is useful: here is a method that returns an ID Rule that determines uniqueness. From the outside, it’s very helpful to express the purpose of this method. This is like classifying a tomato as a vegetable: it helps folks who are learning to cook if they think about the tomato as a vegetable. “Use this the way you would use a vegetable.” So here, we see “use this to get an ID Rule governing uniqueness”.

However, this option takes the “tomato is a vegetable” stance to an extreme: in the implementation, the set of existing IDs is called idUniquenessData which does a great job of expressing the purpose, but makes the identity of the thing unclear.

Encapsulation: express the identity inside and express the utility outside

IdRule getIdUniquenessRule() {
    return id -> !existingIds.contains(id);
}

This is the best of both worlds: inside the method, we are precise about the identity. Here, we are dealing with a collection of existing IDs. (“Technically, the tomato is a fruit.”) Outside the method, the signature expresses the intended use of the code. (“Practically, the tomato is a vegetable.”)
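As a rough sketch of how this might look in a complete class: the collection stays private and keeps its collection-flavored name, while callers only ever see an IdRule. The IdRegistry class and its register method are hypothetical, added here only to give the method a home.

```java
import java.util.HashSet;
import java.util.Set;

@FunctionalInterface
interface IdRule {
    boolean isValid(String id);
}

// Hypothetical owner of the existing IDs
class IdRegistry {
    // Inside: named for what it is, a set of existing IDs
    private final Set<String> existingIds = new HashSet<>();

    void register(String id) {
        existingIds.add(id);
    }

    // Outside: named for what it is for, a rule about uniqueness
    IdRule getIdUniquenessRule() {
        return id -> !existingIds.contains(id);
    }

    public static void main(String[] args) {
        IdRegistry registry = new IdRegistry();
        IdRule rule = registry.getIdUniquenessRule();

        System.out.println(rule.isValid("abc")); // true: nothing registered yet
        registry.register("abc");
        System.out.println(rule.isValid("abc")); // false: the rule reads the live set
    }
}
```

Because the set never leaves the class, it could later be swapped for a database lookup without any caller noticing.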

Conclusion

A tomato is both a fruit and a vegetable. Depending on our context, we should think about it as one or the other.

When we code, we can use abstractions like variables, methods, and classes to separate implementation detail from intention. Inside the abstraction, it’s helpful to be precise about the technical details of how it works. In the same way, it’s helpful for a botanist to talk about a tomato as a fruit. Outside the abstraction, it’s helpful to be clear about the purpose or intended use of the code, even if it is not technically correct. Likewise, it’s helpful for a cooking instructor to speak about a tomato as a vegetable.

Are you a “vegetable” or “fruit” person?

If you often say “that’s not accurate” or “that’s not really _”, you’re probably more of a “fruit” person – you care about the true identity. On the other hand, if you are quick to think of abstractions, you’re probably more of a “vegetable” person – you care about the practical application of the thing. If you can identify where you tend to focus, then you could practice the opposite.

If you’re a “vegetable” person

As someone who focuses on the practical use of things, keep an eye out for precise and clear variable names and private method names.

If you’re a “fruit” person

As someone who focuses on the true identity of things, you may want to review your working code for clear abstractions. Edit so that public method names and class names express your intention (not just technical details about how you built them).

For everyone

There is a time and place to call a tomato a “fruit” or a “vegetable”. It’s helpful to learn to wear two hats – to have the role of a botanist or a chef. As we code, we switch between “technically speaking…” and “for all intents and purposes…”. It’s helpful as a communicator to know when to be precise, and when to be practical.