The Times They Are A-Changin’




The Numbers Behind The New York Times’ Digital Transition

The Chart of the Week is a weekly Visual Capitalist feature on Fridays.

For the most part, legacy print media stalwarts are dying a death by a thousand cuts.

There are exceptions to this rule, and The New York Times is often touted as the best example of an old-school media company that is successfully navigating the challenging transition to digital. They’ve experimented with different types of content and tactics to get eyeballs, while also shifting their company-wide strategy and culture to take a digital-first approach.

While pundits give credit to the Times for their latest efforts, this doesn’t mean it’s been an easy transition for the iconic newspaper. The path forward has been littered with roadbumps, and the most recent one is hard to ignore for shareholders.

Earlier this week, The New York Times announced a 95.7% decrease in quarterly profit. We dug a little deeper in this week’s chart to provide some context behind the newspaper’s challenges in maintaining its relevance in the 21st century.

Goodbye, Ad Dollars

The primary challenge faced by the Times is pretty obvious.

In the early 2000s, the company easily made over $2 billion in advertising revenue per year. Today, they make about $600 million from ads.

Why has the transition to digital hurt ad revenues so much? There are many reasons, but here are a few of them:

  • Physical circulation of The New York Times and other newspapers is dropping rapidly.
  • Traditional display ads aren’t particularly effective, and are part of the “old-school” of digital thought.
  • Programmatic bidding drives down prices for these ads, bringing in even less revenue.
  • Digital lends itself to long-term, results-driven campaigns. It takes time to set these up and measure them properly, especially at scale.
  • Ads need to match the editorial stream to be effective. Quality over quantity.
  • There’s more competition in the digital space, which is a stark contrast to the distribution oligopolies enjoyed by big newspapers in the legacy era.
  • Madison Avenue is also slow at switching to digital, which only adds to the lag time.

These are just some of the reasons why advertising was able to make up 65% of the Times’ revenues in 2004, but only 39% in 2016.
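As a back-of-envelope check on the figures above, the following Python sketch computes the drop in ad revenue and the total revenue implied by the 39% share. It is only a sketch under stated assumptions: "over $2 billion" is treated as exactly $2 billion (a lower bound), "today" is taken to mean 2016, and all variable names are hypothetical.

```python
# Figures quoted in the article (rounded; treated as assumptions here).
ad_revenue_early_2000s = 2_000_000_000  # "over $2 billion", so a lower bound
ad_revenue_2016 = 600_000_000           # "about $600 million", assumed to be 2016
ad_share_2016 = 0.39                    # ads as a share of total revenue in 2016

# Fractional decline in annual ad revenue.
ad_decline = 1 - ad_revenue_2016 / ad_revenue_early_2000s

# Total revenue implied by the ad figure and its share of revenue.
implied_total_2016 = ad_revenue_2016 / ad_share_2016

print(f"Ad revenue decline: at least {ad_decline:.0%}")
print(f"Implied 2016 total revenue: ~${implied_total_2016 / 1e9:.2f} billion")
```

Because the starting figure is a lower bound, the real decline is at least as large as the one computed here.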

Hello, Digital Subscriptions

While I don’t personally agree that a paywall is a long-term answer to any of their problems, it is true that the New York Times has used this as a temporary crutch to at least counter lost ad dollars.

In Q3 2016, revenue from digital-only subscriptions increased 16.4%, and money coming in from subscriptions has increased year-on-year since 2011.

Sometime between 2011 and 2012, subscription revenue (powered by digital-only subscriptions) passed ad revenues as the most important source of incoming cash for the company. The ramp-up has been impressive, and The New York Times now has 1.6 million digital subscribers.

My personal take? Digital subscriptions will plateau in the next five years or maybe sooner. Further, I think that content that isn’t industry- or niche-specific will generally drift towards being free for users over time. The New York Times will have to solve its ad problem, but the paywall will buy it a bit of time to do so.

Charted: The Exponential Growth in AI Computation

In eight decades, artificial intelligence has moved from the purview of science fiction to reality. Here’s a quick history of AI computation.



A cropped version of the time series chart showing the creation of machine learning systems on the x-axis and the amount of AI computation they used on the y-axis measured in FLOPs.


In the 1940s, electronic computers had barely been around for a decade when experiments with AI began. Now we have AI models that can write poetry and generate images from textual prompts. But what has led to such exponential growth in such a short time?

This chart from Our World in Data tracks the history of AI through the amount of computation power used to train an AI model, using data from Epoch AI.

The Three Eras of AI Computation

In the 1950s, American mathematician Claude Shannon trained a robotic mouse called Theseus to navigate a maze and remember its course—the first apparent artificial learning of any kind.

Theseus was built on 40 floating point operations (FLOPs), a unit used to count the basic arithmetic operations (addition, subtraction, multiplication, or division) that a computer or processor performs.

ℹ️ In this article, FLOPs count the total operations used to train a system. The related rate, FLOPS (floating point operations per second), is often used to measure the computational performance of computer hardware: the higher the figure, the more powerful the system.

Computation power, availability of training data, and algorithms are the three main ingredients of AI progress. For the first few decades of AI advances, compute (the computational power needed to train an AI model) grew in line with Moore’s Law, doubling roughly every 18 to 24 months.

Period       Era                   Compute Doubling
1950–2010    Pre-Deep Learning     18–24 months
2010–2016    Deep Learning         5–7 months
2016–2022    Large-scale models    11 months

Source: “Compute Trends Across Three Eras of Machine Learning” by Sevilla et al., 2022.

However, at the start of the Deep Learning Era, heralded by AlexNet (an image recognition AI) in 2012, that doubling timeframe shortened considerably to six months, as researchers invested more in computation and processors.

With the emergence of AlphaGo in 2015—a computer program that beat a human professional Go player—researchers have identified a third era: that of the large-scale AI models whose computation needs dwarf all previous AI systems.
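To make the doubling times for the three eras easier to compare, the sketch below converts each one into an approximate growth multiple per year. It is only an illustration: the function name is hypothetical, and using the midpoint of each quoted range (21 months and 6 months) is an assumption, not a figure from the source paper.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Return how many times compute multiplies in 12 months,
    given a doubling time expressed in months."""
    return 2 ** (12 / doubling_months)


# Doubling times from the table; range midpoints are an assumption.
eras = {
    "Pre-Deep Learning (1950-2010)": 21,   # midpoint of 18-24 months
    "Deep Learning (2010-2016)": 6,        # midpoint of 5-7 months
    "Large-scale models (2016-2022)": 11,
}

for era, months in eras.items():
    print(f"{era}: ~{annual_growth_factor(months):.2f}x per year")
```

A six-month doubling time, for instance, means compute quadruples every year, while a 12-month doubling time would mean it merely doubles.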

Predicting AI Computation Progress

Looking back at only the last decade, compute has grown so tremendously that it’s difficult to comprehend.

For example, the compute used to train Minerva, an AI which can solve complex math problems, is nearly 6 million times that which was used to train AlexNet 10 years ago.

Here’s a list of important AI models through history and the amount of compute used to train them.

AI Model             Year       Training Compute (FLOPs)
Perceptron Mark I    1957–58    695,000
Neocognitron         1980       228 million
NetTalk              1987       81 billion
TD-Gammon            1992       18 trillion
NPLM                 2003       1.1 petaFLOPs
AlexNet              2012       470 petaFLOPs
AlphaGo              2016       1.9 million petaFLOPs
GPT-3                2020       314 million petaFLOPs
Minerva              2022       2.7 billion petaFLOPs

Note: One petaFLOP = one quadrillion FLOPs. Source: “Compute Trends Across Three Eras of Machine Learning” by Sevilla et al., 2022.
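The Minerva-to-AlexNet comparison can be checked directly from the listed figures. The sketch below, with hypothetical variable names, computes the ratio, the number of doublings it represents, and the average doubling time this would imply over the 10 years between the two models. The last figure is a rough two-point estimate, not a result from the source paper.

```python
import math

PETA = 1e15  # one petaFLOP = one quadrillion FLOPs

# Training compute as listed above.
alexnet_flops = 470 * PETA     # AlexNet, 2012
minerva_flops = 2.7e9 * PETA   # Minerva, 2022

# How many times more compute Minerva used than AlexNet.
ratio = minerva_flops / alexnet_flops

# Number of doublings needed to get from AlexNet to Minerva.
doublings = math.log2(ratio)

# Average doubling time implied over 10 years (120 months).
implied_doubling_months = 120 / doublings

print(f"Ratio: ~{ratio / 1e6:.1f} million times")
print(f"Doublings: ~{doublings:.1f}")
print(f"Implied doubling time: ~{implied_doubling_months:.1f} months")
```

The ratio comes out at roughly 5.7 million, consistent with the "nearly 6 million" quoted above. Since this compares only two models, the implied doubling time should be read as an order-of-magnitude check rather than a trend estimate.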

The result of this growth in computation, along with the availability of massive data sets and better algorithms, has yielded a lot of AI progress in seemingly very little time. Now AI doesn’t just match, but also beats human performance in many areas.

It’s difficult to say if the same pace of computation growth will be maintained. Large-scale models require increasingly more compute power to train, and if computation doesn’t continue to ramp up it could slow down progress. Exhausting all the data currently available for training AI models could also impede the development and implementation of new models.

However, with all the funding poured into AI recently, perhaps more breakthroughs are around the corner, such as matching the computation power of the human brain.

Where Does This Data Come From?

Source: “Compute Trends Across Three Eras of Machine Learning” by Sevilla et al., 2022.

Note: The estimated time for computation to double varies between research efforts, including Amodei and Hernandez (2018) and Lyzhov (2021). This article is based on our source’s findings. Please see their full paper for further details. Furthermore, the authors are cognizant of the framing concerns with deeming an AI model “regular-sized” or “large-sized” and said further research is needed in the area.

Methodology: The authors of the paper used two methods to determine the amount of compute used to train AI models: counting the number of operations and tracking GPU time. Both approaches have drawbacks, namely a lack of transparency about training processes and severe complexity as ML models grow.
