Generated by the author using Midjourney prompt: hieronymus bosch style woman overwhelmed by choices detailed technology --v 5.1 --style raw --s 250

The Minimum Viable Language Model

Hilary Hayes
3 min read · Oct 20, 2023

Consider the delta between how Apple’s Siri and Amazon’s Alexa communicated their features: Alexa’s value prop was setting timers, playing music, and giving weather reports, while Siri prompted the user to ask anything. Alexa surprised and delighted users with Easter egg features, while Siri failed to meet its own astronomically high bar.

Setting functionality expectations through effective onboarding is key to user success, no matter the product. That goes for language models, too.

To better meet user needs, language models need to hone their training scope to ensure meaningful coverage of product, service, or experience contexts, not “ask me anything”.

The size of the model itself is a double-edged sword: yes, the user can ask anything, but the capacity to answer any query may come at the cost of giving the best answers, the ones that make for high-quality user journeys. A customer could walk into a car dealership and ask the salesperson about the plot of a Shakespearean play, the best way to remove a red wine stain, or how to trim their own bangs. The salesperson might have answers for the customer, but they will likely not be the best answers, because those questions don’t leverage the training the salesperson has received.

Model success is not about sizing. It’s not about being able to answer anything. It’s about experience context parameters: defining the edges, then going deep.

The key usability issue with conversational interfaces, including LLM chat interfaces, continues to be:

People don’t know what to ask because they don’t know what they can ask.

The proliferation of ChatGPT prompt “cheat sheets” underscores the fact that even when users can ask anything at all, they still don’t know where to start. The blank page is daunting. The possibilities are limitless, and that’s not a good thing when it comes to user experience.

Evaluating language models for product experience coverage can be challenging because it’s hard to know where a model’s edges are, especially since one of the selling points of the models themselves has been their massive knowledge bases, trained on expansive libraries of data. So far, products have been built to leverage language model data, but this data bloat and experience overshoot greatly increase the risk of hallucination or context mismatch.
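
As a rough illustration of what a coverage check could look like, here is a minimal sketch. The prompts, the stand-in model, and the keyword heuristics are all placeholders for a real model client and real relevance and refusal classifiers, not an actual evaluation suite.

```python
from typing import Callable

# Illustrative prompt sets; a real evaluation would draw these from user research.
IN_SCOPE_PROMPTS = [
    "What financing options do you offer on the 2024 hatchback?",
    "Can I book a test drive for Saturday?",
]
OUT_OF_SCOPE_PROMPTS = [
    "Summarize the plot of Hamlet.",
    "What's the best way to remove a red wine stain?",
]

# Crude keyword heuristics standing in for real relevance/refusal classifiers.
PRODUCT_TERMS = {"financing", "test drive", "warranty", "trade-in", "dealership"}
REFUSAL_TERMS = {"can't help with that", "outside what i can help with"}


def on_topic(response: str) -> bool:
    return any(term in response.lower() for term in PRODUCT_TERMS)


def declines(response: str) -> bool:
    return any(term in response.lower() for term in REFUSAL_TERMS)


def coverage_report(ask_model: Callable[[str], str]) -> dict:
    """How often in-scope prompts get on-topic answers, and how often
    out-of-scope prompts get declined or redirected."""
    covered = sum(on_topic(ask_model(p)) for p in IN_SCOPE_PROMPTS)
    contained = sum(declines(ask_model(p)) for p in OUT_OF_SCOPE_PROMPTS)
    return {
        "in_scope_coverage": covered / len(IN_SCOPE_PROMPTS),
        "out_of_scope_containment": contained / len(OUT_OF_SCOPE_PROMPTS),
    }


# Example with a trivial stand-in model; swap in the product's real model client.
print(coverage_report(lambda prompt: "Happy to help with financing or booking a test drive."))
```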

Products shouldn’t simply leverage general-purpose language models; instead, models must be built and trained for the use cases and user contexts related to the product or service, which are established via user research. After going deep on key contexts, model training can expand outwards into more tangentially related topics to become more “T”-shaped. Product analytics are then reviewed as they become available, and further model training is done to close knowledge gaps.
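
As a rough sketch of that analytics loop (the field names and topic labels below are assumptions, not a real schema), the gap-closing step can start as simply as counting where the model repeatedly falls short:

```python
from collections import Counter

# Illustrative analytics export: each chat turn carries the user's query, a topic
# label, and a simple was_helpful signal.
logged_turns = [
    {"query": "Can I trade in my lease early?", "topic": "trade-in", "was_helpful": False},
    {"query": "Is roadside assistance included in the warranty?", "topic": "warranty", "was_helpful": False},
    {"query": "Book a test drive for Saturday.", "topic": "test-drive", "was_helpful": True},
    {"query": "What does the EV battery warranty cover?", "topic": "warranty", "was_helpful": False},
]


def training_gap_candidates(turns, min_failures: int = 2):
    """Topics where the model repeatedly fell short: candidates for the next
    round of focused training or retrieval content."""
    failures = Counter(t["topic"] for t in turns if not t["was_helpful"])
    return [topic for topic, count in failures.most_common() if count >= min_failures]


print(training_gap_candidates(logged_turns))  # ['warranty']
```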

When designing product experiences that may include language models, remember to set functional expectations early, and build in affordances that make it clear how the model can or should be used to provide maximum value to the user. A well-designed product doesn’t need a cheat sheet.
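
One such affordance is the empty state itself. A minimal sketch, assuming a hypothetical dealership assistant, of replacing the blank “ask me anything” box with in-scope starter prompts:

```python
# Illustrative starter prompts scoped to what the assistant is actually trained
# to handle; the labels and examples are placeholders, not a real product's copy.
STARTER_PROMPTS = {
    "Browse": "Which models have third-row seating?",
    "Finance": "Estimate my monthly payment on a $30,000 vehicle.",
    "Service": "Book an oil change for next week.",
}


def empty_state_message() -> str:
    """Render the chat empty state with concrete, in-scope suggestions instead
    of a blank prompt box."""
    lines = ["Here are a few things I can help with:"]
    lines += [f'  - {label}: "{example}"' for label, example in STARTER_PROMPTS.items()]
    return "\n".join(lines)


print(empty_state_message())
```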


Hilary Hayes

Generative & multimodal experience design researcher ✨