Methodology, best practices and limitations

Photo: Landesarchiv Baden-Württemberg

Terminology

A placement is an item in a wine list. It usually corresponds to a single wine, although some restaurants, usually in the category of fast food or casual dining, list more than one wine within a placement, for example as “House wine red/white/rosé” or “Riesling dry or sweet”. In the portal’s database, placements can be by-the-bottle (BTB), by-the-glass (BTG), or both. If a wine is listed both by the glass and by the bottle, it is counted in statistics as one placement.

A wine list is a document, an image or a website that contains a number of placements. An account can have several wine lists (for example, a list of by-the-glass placements, a list of reserve wines, a list of dessert wines, and a list for a happy hour). Conversely, a wine list can be shared by multiple accounts, which is a common occurrence for chain restaurants.

An account is an establishment: a restaurant or a bar with a distinct address. This term may be confusing or sound unnatural to users in some countries, but it is conventional in the US wine trade for referring to any establishment that sells wine, regardless of its size, corporate status and current relationship to a given distributor. The words account, establishment and restaurant are used on the portal interchangeably.

An explicit tag is a property of a placement read directly from its representation in a wine list. It is explicitly mentioned and is thus obvious to a customer. The words tag and property are used on the portal interchangeably in most cases.

An implicit tag is a property that was not directly mentioned in a wine list but was inferred algorithmically. It may not be immediately obvious to a restaurant customer who is not very knowledgeable about wine. For example, every red Burgundy, with the exception of Beaujolais-related appellations, is implicitly marked as a Pinot Noir, and any white Burgundy except for Saint-Bris, Bourgogne Aligoté and Bouzeron is assigned the tag Chardonnay. On bar charts, implicit tags are shown with a semi-transparent fill while explicit tags are shown with a solid fill.
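The Burgundy rule described above can be sketched in a few lines of Python. This is only an illustration and not the portal's actual code: the function name, the tag strings and the abbreviated list of Beaujolais-related appellations are all assumptions made for the example.

```python
# Hypothetical sketch of implicit variety tagging for Burgundy wines.
# Tag names and appellation lists are illustrative assumptions.

# Beaujolais-related appellations (abbreviated, illustrative list).
BEAUJOLAIS_RELATED = {"Beaujolais", "Beaujolais-Villages", "Morgon", "Fleurie"}

# White Burgundy appellations named in the text as exceptions.
WHITE_EXCEPTIONS = {"Saint-Bris", "Bourgogne Aligoté", "Bouzeron"}


def implicit_variety_tags(explicit_tags: set[str]) -> set[str]:
    """Return variety tags inferred from explicit tags (possibly empty)."""
    if "Burgundy" not in explicit_tags:
        return set()
    # Red Burgundy outside Beaujolais-related appellations -> Pinot Noir.
    if "red" in explicit_tags and not (explicit_tags & BEAUJOLAIS_RELATED):
        return {"Pinot Noir"}
    # White Burgundy outside the listed exceptions -> Chardonnay.
    if "white" in explicit_tags and not (explicit_tags & WHITE_EXCEPTIONS):
        return {"Chardonnay"}
    return set()
```

Because the rule is a plain lookup with no randomness, the same placement always receives the same implicit tags.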

Methods and limitations

As of January 2026, wine lists presented on the portal were collected solely from accounts’ websites. Different algorithms and techniques were used both to retrieve and to parse the data, in order to achieve the highest possible rate of extraction. When large language models were used, their results were verified by robust deterministic algorithms to rule out possible AI hallucinations. Premium users can access local copies of wine lists that were published as PDF documents or images. Web pages, however, were not locally archived (although this may be done in future versions).

The first major limitation emerges from the data gathering process: only wine lists that were published online were saved into the database. The percentage of high-end restaurants that post their wine lists online is certainly higher than that of casual dining venues. This skews the statistics toward the wines that one is more likely to find in upscale restaurants. It is possible, however, to evaluate this skew quantitatively, and we look forward to implementing this functionality in the future.

The share of restaurants that publish their wine lists online also depends on a country’s culture. In the US, the UK and Switzerland, restaurants publish wine lists on their websites significantly more frequently than in Belgium, France, Italy and Spain. In regions not very attractive to tourists, this share can be lower than in popular holiday destinations.

For tagging, no stochastic or other non-deterministic methods were used, as we wanted to maintain full control and transparency of the results. It is not uncommon for a wine list to have a section that encompasses various styles of wines without further breakdown, such as “rosé and orange” or “Jura and Savoie”. Tags derived from such a section heading may contradict each other. Although a number of heuristics were applied to the database to minimize these contradictions, in rare cases some placements may simultaneously have mutually exclusive tags.

Assigning implicit tags can also be problematic. When it comes to appellation-named wines from regions where blends are traditional, a certain approximation is usually the only possible option. For example, all red Bordeaux wines were assigned the tag Merlot, as virtually all of them contain it, although it is legally permitted to produce a Bordeaux AOC red wine without Merlot. For all Médoc appellations, the tag Cabernet Sauvignon is added to Merlot, although a small percentage of Médoc wines made without Cabernet Sauvignon certainly exists. On the other hand, tags for Cabernet Franc, Petit Verdot, Malbec and Carménère were not automatically assigned to any of the Bordeaux appellations, because doing so would certainly be misleading.

Some appellations allow rather diverse styles of wines. A white wine from Jura can be Savagnin or Chardonnay or a blend of both, with varieties frequently omitted both on a label and in a wine list. A typical Provençal rosé likely contains Grenache among other varieties, but wouldn’t it be presumptuous to assign the tag Grenache to all Provençal rosés? As a result, a substantial number of wines in the database do not have any variety-related tag assigned.

The current structure of the database was designed with a discrete division between by-the-glass (BTG) and by-the-bottle (BTB) placements for the sake of structural simplicity. From the business point of view, these are two conceptually and economically different categories. However, one should be careful with price comparisons, as glasses can differ substantially in volume. In Germany, it is common to serve “open wines” in different volumes or by “carafes” along with “glasses”. Conversely, by-the-bottle placements can contain magnums, jeroboams and other formats.

Currently, the database contains more than 37,000 individual wine brands. While it covers the overwhelming majority of the wines of the world, some existing brands are not yet included. If your wine is missing from the database, please contact us at support@winemarkets.co.

Best practices

The most common use cases are mentioned on the main page. For many purposes, you may want to limit the statistics to a certain scope of wines. In the on-premise statistics interface, this can be done by limiting the subset to a number of tags or by excluding wines with selected tags from a subset (buttons New filter by tag, Add tag to filter and Exclude tag in the menu that opens after clicking a tag name on a chart). Probably the most frequently used filter is the exclusion of sparkling and fortified wines. As they constitute very specific categories that are often incommensurable with other wines, you might want to subtract them from many reports: for example, while analyzing the popularity of grape varieties in a given market, you might want to subtract the contribution of Champagne wines to Chardonnay, Pinot Noir and Pinot Meunier.
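The effect of such a filter can be illustrated with a small Python sketch. It is not a description of the portal's internals: placements are modelled here simply as sets of tags, the sample data is made up, and the assumption that an include filter requires all of the listed tags is ours.

```python
# Hypothetical sketch: each placement is modelled as a set of tags.

def filter_placements(placements, include=(), exclude=()):
    """Keep placements that have every tag in `include` and no tag in `exclude`."""
    include, exclude = set(include), set(exclude)
    return [
        tags for tags in placements
        if include <= tags and not (exclude & tags)
    ]


# Made-up sample data.
placements = [
    {"Chardonnay", "Burgundy", "white"},
    {"Chardonnay", "Pinot Noir", "Champagne", "sparkling"},
    {"Pinot Noir", "Burgundy", "red"},
    {"Port", "fortified"},
]

# Exclude sparkling and fortified wines before counting varieties.
still_wines = filter_placements(placements, exclude={"sparkling", "fortified"})
chardonnay_count = sum("Chardonnay" in tags for tags in still_wines)
```

In this sample, the Champagne placement no longer contributes to the Chardonnay count once sparkling wines are excluded.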

To evaluate a brand’s presence in the market, counting accounts that carry it may be more indicative than counting placements. It is not uncommon for a restaurant to list tens or even hundreds of wines from the same producer, from different appellations and vintages. In a report, a single account with one hundred placements of a certain producer, perhaps affiliated with it, may outweigh a hundred accounts that don’t carry this winery. This is why the option Counting accounts with at least one placement is important.
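The difference between the two counting modes can be shown with a short Python sketch. The account and producer names and the data layout (one pair per placement) are invented for illustration and do not reflect the portal's database schema.

```python
from collections import defaultdict

# Made-up data: one (account, producer) pair per placement.
placements = [("Account A", "Producer X")] * 100 + [
    ("Account B", "Producer Y"),
    ("Account C", "Producer Y"),
    ("Account D", "Producer Y"),
]


def count_placements(rows):
    """Number of placements per producer."""
    counts = defaultdict(int)
    for _account, producer in rows:
        counts[producer] += 1
    return dict(counts)


def count_accounts(rows):
    """Number of distinct accounts with at least one placement per producer."""
    accounts = defaultdict(set)
    for account, producer in rows:
        accounts[producer].add(account)
    return {producer: len(accs) for producer, accs in accounts.items()}
```

By placements, Producer X leads 100 to 3; by accounts, Producer Y leads 3 to 1, which is arguably the better picture of market presence in this example.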

Hovering over bar charts and popularity maps will show a tooltip with absolute numbers of samples. While working with narrow subsets and with less popular tags, relying on graphical or percentage representations might be misleading. The smaller the sample size, the easier it is to confuse a signal with noise and draw wrong conclusions from data. Maintaining awareness of your sample size is always good practice.
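One way to make the sample-size caveat concrete is to compute a confidence interval for a share. The sketch below uses the Wilson score interval, a standard statistical technique that we chose for illustration; it is not a feature of the portal, and the function name is our own.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


small = wilson_interval(3, 10)      # 30% share from 10 placements
large = wilson_interval(300, 1000)  # 30% share from 1,000 placements
```

Both samples show the same 30% share, but with 10 placements the interval spans roughly 11% to 60%, while with 1,000 placements it narrows to roughly 27% to 33%.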