
Models vs. Reality, Part 1

Wine appreciation requires language. But the way you use language depends on what you consider to be a “good tasting note.” What is good? What’s the norm?

…writing is a learned activity, no different in that regard from hitting a golf ball or playing the piano. Yes, some people naturally do it better than others. But apart from a few atypical autodidacts (who exist in all disciplines), there’s no practical way to learn to write, hit a golf ball, or play the piano without guidance on many points, large and small. And everyone, even the autodidact, requires considerable effort and practice in learning the norms. The norms are important even to those who ultimately break them to good effect. Bryan A. Garner, Garner’s Modern American Usage (2009, p. 104)

Famous critics and formal tasting systems provide models/norms/reference points. But how good are those norms?

What does “green apple, citrus peel, medium+ acidity” mean, exactly?

Models are useful, but only if we don’t lose touch with what is actually going on. So let’s calibrate our models to reality. What is being written in practice? Do we really know what the norms are, or are we just imagining things?

To the best of our knowledge this is the first attempt to use NLP (natural language processing) algorithms to find structure in wine notes. Algorithms are tools we can exploit to explore what wine means to us. We are excited by this opportunity to shine new light on wine words (see our last couple of posts for background).

Models and Reality

We’ve built a tool that allows us to cluster wine words based on whether or not they occur in similar contexts. The idea is to understand wine words by analyzing their usage in context: what company does a word keep? For example, are the words oyster and sea used in similar contexts to describe breezy whites?

In case you’re wondering, the tool was easy to build. There are many pre-packaged NLP techniques that can be cobbled together to do what you want. This is the golden age of data exploration, after all! We’re a bit surprised no one else has done what we’re doing with wine words before, but it’s fun to be first!

Tasting Model Review

We have intuitive notions about how wine descriptors relate. For example, WSET has a guide to wine-word usage based on a few broad categories, called the “Systematic Approach to Tasting”. Then there’s the classic “Aroma Wheel”, originally due to UC Davis Professor Ann C. Noble. A beautiful new attempt to categorize and visualize wine descriptors is due to WineFolly; see “Wine Descriptors & what they mean”.

These tasting systems / word categorizations are models. If you adhere to one of these models, your wine notes will be somewhat predictable.

If you like the aroma wheel, “blackcurrant” and “cassis” may frequently occur together in your notes to describe intense, fruity red wines.

If you’re following the WSET system, cassis isn’t a “standard word” so perhaps blackcurrant and black cherry would be more likely to occur together.

If WineFolly’s poster is on your wall, I expect a different style of note entirely: this Syrah is “fleshy”, “flamboyant” and “plummy”.

Tasting systems / models / norms — whatever you want to call them, we’re being loose with words here — are important. They’re especially important when you start learning about the conventional tasting wisdom. I want to think of a model as a lens that flattens something complicated and turns it into digestible conceptual chunks.

A map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness. Alfred Korzybski, Science and Sanity (1933, p. 58)

Rethinking the Models: Calibrating to real notes

So the whole point here is that rather than starting with a model, we want to turn things around and start with what the critics are actually writing.

There are many ways of building a tasting system from a collection of wine notes. The questions you have to answer are: how do I summarize the main features of the text? And what does “similar” mean? Is blackberry similar to raspberry if the two words consistently occur in the same sentence? Or are they similar if they consistently have the same neighbouring words?
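To make the two notions of “similar” concrete, here is a minimal sketch. The four notes are invented for illustration (they are not from the real collection), and the function names are hypothetical; the post does not say which measure the tool actually uses. The first function counts how often two words share a note, the second compares the words that sit next to each of them.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy notes, invented for illustration (not the real dataset).
NOTES = [
    "ripe blackberry and raspberry with soft tannins",
    "fresh raspberry with bright acidity",
    "ripe blackberry with firm tannins",
    "fresh strawberry with bright acidity",
]


def same_sentence_count(notes, a, b):
    """Number of notes in which both words appear."""
    return sum(1 for note in notes if a in note.split() and b in note.split())


def neighbour_counts(notes, word, window=2):
    """Count the words found within `window` positions of `word`."""
    counts = Counter()
    for note in notes:
        tokens = note.split()
        for i, token in enumerate(tokens):
            if token != word:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(tokens[j] for j in range(lo, hi) if j != i)
    return counts


def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c1[key] * c2[key] for key in c1)
    n1 = sqrt(sum(v * v for v in c1.values()))
    n2 = sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def neighbour_similarity(notes, a, b, window=2):
    """Similarity of two words judged by the company they keep."""
    return cosine(neighbour_counts(notes, a, window),
                  neighbour_counts(notes, b, window))


print(same_sentence_count(NOTES, "raspberry", "strawberry"))
print(round(neighbour_similarity(NOTES, "raspberry", "strawberry"), 2))
```

In this toy collection raspberry and strawberry never share a note, yet they keep very similar company (“fresh”, “with”, “bright”), so the two definitions disagree about whether the words are similar.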

It’s a bit like the process Bendor Grosvenor et al. rely on in the BBC’s Fake or Fortune: bounce light off the painting’s surface, record what comes back, let algorithms tell you what elemental features make up the composition, then compare the features to your model of what real paintings by the artist should be like.

Down and Dirty with the Algos

OK, back to the business of building wine-word models by looking at what is actually being written.

To start off with, let’s focus on a simple kind of lens, which may be familiar from a statistics course: PCA (principal component analysis). (If it’s not familiar, it really doesn’t matter; we’re just trying to pretend that we’re clever by mentioning fancy acronyms. “Judge the acronym by the picture it produces” is a good rule to live by. We will meet more lenses and pictures over the next three posts.)

Our input is a high-dimensional set of features which summarise our dataset of tasting notes…

… Think of it as follows: wine contains a vast amount of information, which you boil down to a tasting note. The neural network algorithm boils down a large number of tasting note sentences and represents them as 200-dimensional vectors. The purpose of the lens is to then focus this high-dimensional summary of tasting notes down to three dimensions.

200 dimensions are hard to keep in your head, but we can visualize 3-dimensions on a simple plot: two axes + a third dimension represented on a colour-scale. Different lenses highlight different facets. So depending on which looking glass we use, we’ll get a different perspective. Some may be more useful / intuitive than others.

  1. Wines -> tasted by critics
  2. Critics write tasting notes -> critics publish tasting notes
  3. Database of tasting notes -> algo summarizes main features of wine note collection
  4. 200-dimensional representation of tasting vocab -> lens focuses it into 3D
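The last step of this pipeline, focusing word vectors down to three dimensions, can be sketched as follows. This is an illustration under stated assumptions, not the tool itself: the word vectors are invented, they have 4 dimensions standing in for 200, and the principal components are found by plain power iteration with deflation so that the example needs no libraries. In practice you would more likely call a ready-made routine such as scikit-learn’s `PCA`.

```python
# Hypothetical word vectors, invented for illustration (4-D stand-ins for 200-D).
EMBEDDINGS = {
    "strawberry":   [0.9, 0.1, 0.3, 0.0],
    "cranberry":    [0.8, 0.2, 0.1, 0.1],
    "blackcurrant": [-0.7, 0.6, 0.2, 0.0],
    "cassis":       [-0.8, 0.5, -0.1, 0.1],
    "gooseberry":   [0.1, -0.9, 0.4, 0.2],
    "tomato":       [0.0, -0.8, -0.5, 0.1],
}


def centre(rows):
    """Subtract each column's mean so every dimension averages to zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance(centred):
    """Sample covariance matrix (d x d) of already-centred rows."""
    n, d = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    d = len(matrix)
    start = [1.0 / (i + 1) for i in range(d)]  # fixed, arbitrary starting vector
    norm = sum(x * x for x in start) ** 0.5
    v = [x / norm for x in start]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(d) for j in range(d))
    return eigenvalue, v


def pca_project(rows, k=3):
    """Return (coordinates on the first k components, variance along each)."""
    centred = centre(rows)
    cov = covariance(centred)
    d = len(cov)
    components, variances = [], []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the component just found before looking for the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    coords = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centred]
    return coords, variances


coords, variances = pca_project(list(EMBEDDINGS.values()), k=3)
for word, (pc1, pc2, pc3) in zip(EMBEDDINGS, coords):
    print(f"{word:13s} x={pc1:+.2f}  y={pc2:+.2f}  colour={pc3:+.2f}")
```

Each word ends up with three numbers: the first two can go on the axes of a plot and the third on a colour scale. The signs of the components are arbitrary, so a mirrored picture would carry the same information.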

Juicy Berries

Let’s start by looking at how different kinds of berries feature in our wine note collection as seen through the PCA lens. The size of the bubbles in the plot below indicates how often the descriptor shows up in the collection of notes.


Common berry descriptors clustered using classical neural net algorithms and viewed through the 3-D PCA lens. X-axis: PCA component 1; Y-axis: PCA component 2; colour (RdPu): PCA component 3; size: frequency of the word in the collection of notes.


Through the Looking Glass

The horizontal axis on the graph is the most telling. The further apart the words are, the more dissimilar they are through our lens. In particular, tomato (yes, a tomato is technically a berry) and gooseberry are out on their own. You’d expect this, right?

Note that even though tomato and gooseberry are close together in the picture, their colours are quite different. (This means that in a 3-D plot, they’d be far apart.)
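A small worked example of that point, using made-up coordinates (the real component values are not given in the post): two words can sit almost on top of each other on the two plotted axes and still be far apart once the colour axis is counted.

```python
from math import dist

# Hypothetical (PC1, PC2, PC3) coordinates, invented for illustration.
tomato = (2.0, 0.5, -1.5)
gooseberry = (2.2, 0.4, 1.5)

flat_distance = dist(tomato[:2], gooseberry[:2])  # what the 2-D picture shows
full_distance = dist(tomato, gooseberry)          # also counts the colour axis

print(f"distance on the page: {flat_distance:.2f}")
print(f"distance in 3-D:      {full_distance:.2f}")
```

With these numbers the 3-D distance is more than ten times the distance on the page, which is why the colour scale is worth reading alongside the bubble positions.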

“Berry” and “cherry” also stand out. Both are more generic than the other descriptors. In particular, cherry is usually divided up into “black” or “red”. So effectively we are combining two quite distinctive descriptors into one here, which explains why it stands out so vibrantly.

Cranberry, strawberry and redcurrant are all red and fresh and somewhat young Pinot-like, but why is raspberry not in this cluster as I would have expected? According to our lens, raspberry is closer to blackcurrant and cassis than to strawberry.

So what?

That’s it for today. Feel free to use this picture to inform your use of “berry”. For example, you could separate the graph above into 4-5 clusters:

When you’re writing a note and you’ve used the words “blueberry” and “blackcurrant”, then you’ve covered most of the space on this picture already. So perhaps it’s time to move on to secondary flavour descriptors? On the other hand, perhaps you want to reinforce the impression that this is where the flavour’s at? In that case, why not add “cherry” and “mulberry” to the mix?

See you next time for more wine words in context!

In the meantime you may enjoy (if you haven’t already) Making Peace in the Language Wars, and Tense Present: Democracy, English, and the Wars over Usage. That’s really what this is all about.

Additional Material (added on 03.04)

In response to questions on PCA and variance explained (see comments).
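As a reminder of what “variance explained” means, here is a short sketch. The notation is introduced here and is not from the post: λ_i is the variance along principal component i, and d is the number of dimensions of the word vectors (200, assuming the vectors described earlier). The arithmetic uses the three percentages quoted in the comments.

```latex
\[
  \text{explained variance ratio of PC}_i
  \;=\; \frac{\lambda_i}{\sum_{j=1}^{d} \lambda_j},
  \qquad
  0.22 + 0.15 + 0.12 = 0.49
\]
```

On those figures, the three components shown in the plot retain roughly half of the total variance; the rest lies in directions the picture does not show.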






  1. Very interesting stuff. Heuristically it looks like the PCA has done a nice job at clustering various words together, although the cassis/blackcurrant split is curious.

    How much of the variance is explained by each component? PC1 looks to me to be some kind of ripeness/body measure, with the greener wines at the start, moving through light Pinot-like wines, into the bruisers. Have you looked at how this component correlates with “traditional” wine measures (or is that to come in later posts)? I can’t immediately translate what PC2 might be, perhaps you have some insight into this.



    • Good questions! I think I’d now like to look into the cassis/blackcurrant split further. Do the critics who use “cassis” sometimes use “blackcurrant” in a different context to make a different kind of point? Or perhaps the split is explained by having 1-2 writers in the mix who don’t use “cassis” + their use of “blackcurrant” is different? The split isn’t huge, so I do think it’s curious rather than problematic… To answer your question: 22%, 15%, 12% (added a picture to the post). As to the interpretation of the components, I think that requires a lot more work. I like your interpretation of PC1. I think of these methods as providing tastes/flavours of large bodies of tasting notes. There’s a lot of information in the notes, so it’s interesting to look at different representations, through different lenses. Of course to the machine there’s no difference between vectorizations of wine notes and dance choreographies. But that’s the beauty of machine-assisted tasting note analysis: we may well be able to attach a meaning to “PC1” like you have done (we can also try to check/verify these hunches by looking at the underlying wines)… And I hope ultimately someone will consider these pictures as useful additions to standard tasting norms. It works best if you ask a targeted question like: ABC is my favourite critic, I’m tasting wine xyz give me a map of his/her descriptors related to “tannic” for Australian red wines. OK, now compare it to the same map for critic DEF… Let’s discuss further!

      Edit: Initial comment had a mistake (explained variance was not in proportion). Corrected.


      • Thanks! Agree with all your points. I think understanding this mapping from flavours to words is important, as it’s key to how people understand wine, so it’s exciting to see you are having a stab. I look forward to seeing the further analysis as it comes.

