
The importance of a shared language in newsletter analytics

How the Shorenstein Center is building valuable resources and open-source tools to help organizations better understand their newsletter investments.

Enter the Newsletter Zone

A newsroom or marketing department decides they need a newsletter. They assign it to a junior employee with no clear direction or significant resources. Three months down the line they will have to give a performance report. There is no clear explanation as to why the project has commenced, and how it will be measured.

Fast-forward 90 days. They have amassed 800 subscribers. Is that a good thing? Well, theSkimm has millions, and other newsletters have tens or hundreds of thousands. The paper’s open rate reached a peak of 30%. Their boss read somewhere that Lenny Letter had a staggering open rate of 70%. The New York Times gets a 70% open rate, but wait, that’s a gross open rate. Is that what they should be looking at?

So they fervently turn to the Mailchimp benchmarking report for a frame of reference. Are they marketing/advertising or media/publishing or professional services? They’re not quite sure, so they pick one category where they’re a percentage point ahead of the average open rate. They’re not quite sure what this means, but it’s something.

Considering the limited resources and dedicated tools out there, it’s not exactly this theoretical (but all-too-real) marketing team’s fault. Email service providers (ESPs) are primarily built for conversion-oriented email marketing, not brand-building editorial newsletters. 

The language and tools around email analytics are squarely focused on the goals of ecommerce marketers. Editorially-oriented newsletters have long been relegated to the realm of auto-generated blog roundups or side projects, and there hasn’t been a clear base of knowledge on how to create, measure and understand them. 

Due to this, the questions that actually matter – is the newsletter email list full of high-value subscribers? Does it have readers who open their emails over and over again? Can the team calculate some semblance of financial ROI for the project? – are nervously swept under a rug.

These questions are the inspiration for this series. They are also what a team of researchers, including Hong Qu and Will Hakim*, at Harvard’s Shorenstein Center on Media, Politics and Public Policy is trying to help answer for companies of all sizes.

A Shorenstein team composed of Emily Roseman, Caroline Porter, Joseph Lichterman, Jacqueline Boltik, Charley Bodkin, Francisco Rivera, Abigail Hartstone, and Bobby Courtney built the canonical resource, one that feels like an entire MBA program for newsletters. Just as laudable, they’re building open-source tools to democratize access to collecting and processing the required data, so it’s not limited to operations with big IT departments and investor checks.

*(Will is currently at The Information)

The Newsletter Why

“Newsletters in general have not been very sophisticated or data-driven.” – Hong Qu

For years, newsletters have revolved around the Big Three: clicks, opens and list size – what Hong and Will, two Shorenstein researchers with a technical background, argue are the vanity metrics of email. 

Why vanity? It’s easy to point to a rise in open or click rates and say things are going well. It’s easy to point at a newsletter’s list size, grunt “bigger”, and get a pat on the back from management. Doing so sidesteps the harder task of explaining why growth is a good thing, or how the newsletter retains users over the long run.

There’s a reason ecommerce folk have such sophisticated SaaS products around email marketing. With clearly defined goals, funnel vision, a dead-simple primary KPI (sales), and easy justification of why something works and why it doesn’t (the former sells more stuff), ecommerce marketers have been able to build up a robust language around email marketing.

Publishers (newsrooms or editorial marketers) don’t have it so easy. The blessing of plugging an ESP into your CMS and spitting out an auto-generated blog roundup may have been our original sin. It minimizes the resources required to such a degree that anyone could launch a newsletter product without asking themselves why.

ESPs themselves make it notoriously easy to stop asking questions past a certain point. Every ESP gives easy access to the Big Three. Like Internet Explorer, the fact that these three are just there, quickly accessible and ubiquitous, has made them the default choice for most marketers.

You might be selling subscriptions, selling an event, or selling sponsorships. You might be building a brand, trying to win over a specific audience, or spreading an idea. You might be trying to drive traffic to your blog (though, unless you’re a web-display-ad-only operation, that never really made sense either). In any case, the Big Three have never been enough for the editorially-minded to properly answer why.

It quickly becomes clear why it’s tough to come up with a shared understanding of how to measure things. But it’s at least clear we need something, given the influx of long, valuable, self-contained newsletter products from every major publisher. So what should we be looking at?

Let’s Start with the Open Rate

People always start with the open rate. It’s easy enough, but what is a good open rate? ESP-produced benchmark reports are a rough starting point, but remain incredibly vague. How big are the lists in question? How old are the email projects (do they account for natural decay)? What types of companies actually compose something like media/publishing? 

As part of their efforts, the Shorenstein Center has aggregated anonymized, validated data from small, mostly nonprofit newsrooms (they get very granular on the composition of companies and the data is obtained via direct connection to their ESPs, so the numbers are very real).

It’s right there, in all its scattered beauty. Individual, real-life data points that visualize open rates in the context of list size and newsletter age. You can at least start to get an understanding of where your real benchmark is.

Distributions, Not Averages

But an open rate is just one very specific, top-level metric. What should we be looking at next?

Let’s start with a basic assumption. A newsletter project should get the right people reading a lot. Unlike ecommerce where 1 open out of 100 emails might result in a $1,000 purchase, with newsletters, where the product is content, there is no real equivalent transaction. 

Selling sponsorships? Building influence among key buyers? Building a strong brand? All these hypothetical objectives require high-value engagement over time. Even for an editorial subscription conversion, which is ostensibly ecommerce, you have to deliver valuable content on an ongoing basis.

How do you know if you’re reaching the right readers? How do you know if they’re reading a lot?

Imagine a reader who opened 50% of your emails over a one-year period. That is insane editorial brand loyalty. That’s far more valuable than someone who’s opened 50% of your emails in a one-week period. Think about all the emails you’re inundated with daily. And yet someone has consciously chosen, over a very long period of time, to open the ones you send.

Now let’s work backwards for a second. Can you get these metrics by looking at clicks, opens or list size? Nope. You have to get way more granular than that. You have to start thinking in distributions.

The Distribution Equation

The Shorenstein research tries to instill the habit of inspecting the behavior of very engaged users, somewhat engaged users, and those who don’t seem to be interacting with your emails at all. 

“One of the reasons we talk about vanity metrics like list size, open rate, click rate being ‘bad’ or less-useful metrics, is that they don’t differentiate people [at different levels of engagement].” – Will

Organizations should consider distributions when it comes to engagement, not single numbers. “From there, you can analyze different sections of these distributions, and try to figure out the behavior of these groups,” says Will.

That means putting less emphasis on averages. We’ve seen this before in Rameez’s Law. As Axios’ VP of Growth notes, you don’t get a 50% open rate by having a 100% opener and a 0% opener. You get two cohorts that you act upon in different ways.
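To make the point concrete, here is a minimal sketch in Python. The two lists and the loyalty thresholds are invented for illustration and do not come from Axios or the Shorenstein research. Both lists share the same average open rate, yet they describe very different audiences.

```python
from statistics import mean

# Hypothetical per-subscriber open rates (fraction of sent emails opened).
list_a = [0.5, 0.5, 0.5, 0.5]   # everyone opens half the time
list_b = [1.0, 1.0, 0.0, 0.0]   # two loyal readers, two who never open

def summarize(open_rates, loyal_at=0.8, dormant_at=0.1):
    """Return the average plus the share of loyal and dormant subscribers.

    The 0.8 and 0.1 cutoffs are illustrative assumptions, not a standard.
    """
    n = len(open_rates)
    return {
        "average": mean(open_rates),
        "loyal_share": sum(r >= loyal_at for r in open_rates) / n,
        "dormant_share": sum(r <= dormant_at for r in open_rates) / n,
    }

summary_a = summarize(list_a)
summary_b = summarize(list_b)
print(summary_a)
print(summary_b)
```

The averages match, but only the second list contains a loyal cohort worth studying and a dormant cohort worth re-engaging or pruning.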

However, these aren’t exactly easy things to measure on a deep, meaningful level. Some ESPs do allow for more advanced segmentation based around email engagement over time, but the task of analyzing these cohorts quickly bleeds over to data science once the ESP reveals its reporting limitations around these segments. 

A Language Around Newsletters

“Vanity metrics are very easy to understand for those who aren’t very technical. List size or click rate don’t require any explaining.” – Will 

This isn’t the fault of organizations – there’s simply a lack of education and dedicated tools. To get on the level of Axios or Industry Dive, and other organizations we’ve interviewed for this series, you’d need a team of developers working full-time. 

Jacque Boltik and Nicco Mele have been trying to democratize in-depth engagement analytics since the release of their email analysis research guide in 2017. Their dive was the first we’d seen to openly explore granular audience metrics specific to newsletters. In fact, it was one of the first true attempts to form a common language around the practice. Let’s take a look.

Justification Science

At the heart of Shorenstein’s 2017 dive are two Jupyter Notebooks (here’s the first Notebook). 

While the guide directly addresses data scientists, and the Notebook looks like gibberish to non-coders, it’s relatively simple when broken down. The Notebooks pull MailChimp data into an easily-readable table, then explore this data visually. That’s it – it’s simple data manipulation.

Let’s explore a few charts.

In the above line graph, each point represents the current percentage of the email list that’s subscribed, unsubscribed, or pending, grouped by when people signed up.

For example, 90% of people who signed up around December 2015 are still subscribed, and 80% of people who signed up in June 2017 are still subscribed. Shorenstein points towards the anomaly between April 2016 and August 2016. There could be numerous reasons for the large number of pending subscriptions.

For example, you could have hosted an event where you registered attendees, automatically adding their email to a segment, but didn’t send a confirmation email. The point here: you could not have gotten this by just looking at the metrics in MailChimp’s UI – or most ESPs for that matter.
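The calculation behind a chart like this can be sketched in a few lines of Python. This is not the Shorenstein Notebook code. The records below are made up, and the field layout (signup month, current status) is an assumption about what a list export might contain.

```python
from collections import Counter, defaultdict

# Hypothetical member records: (signup month, current status).
members = (
    [("2015-12", "subscribed")] * 9 + [("2015-12", "unsubscribed")] * 1
    + [("2016-06", "subscribed")] * 2 + [("2016-06", "pending")] * 2
    + [("2016-06", "unsubscribed")] * 1
    + [("2017-06", "subscribed")] * 4 + [("2017-06", "unsubscribed")] * 1
)

def status_share_by_cohort(records):
    """Group members by signup month and return each status as a share."""
    counts = defaultdict(Counter)
    for month, status in records:
        counts[month][status] += 1
    shares = {}
    for month in sorted(counts):
        total = sum(counts[month].values())
        shares[month] = {s: c / total for s, c in counts[month].items()}
    return shares

shares = status_share_by_cohort(members)
for month, breakdown in shares.items():
    print(month, breakdown)
```

A cohort with an unusually high pending share, like the invented 2016-06 group here, is the kind of anomaly worth tracing back to a specific signup source.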

Another example. Take this distribution bar chart for unique opens per subscriber:

The largest group of users (about a third) have opened fewer than 10% of all emails sent to them, while there is a dedicated minority of readers who are very loyal, opening nearly all your emails. How do you shift this distribution? Where do the readers who have over 60% unique opens come from? When did they join? Some great questions blossom from visuals like these.
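A distribution like this comes from bucketing each subscriber’s personal open rate. Here is a minimal Python sketch under stated assumptions: the input pairs of emails sent and unique opens are invented, and the 10-point bucket width is a choice for illustration rather than anything prescribed by the Shorenstein guide.

```python
from collections import Counter

# Hypothetical (emails_sent, unique_opens) pairs, one per subscriber.
subscribers = [
    (40, 2), (40, 0), (40, 3), (40, 12), (40, 22),
    (40, 38), (20, 19), (10, 1), (40, 1),
]

def open_rate_histogram(rows, bucket_width=10):
    """Count subscribers per open-rate bucket, keyed by the bucket's lower bound.

    With the default width the keys are 0, 10, ..., 90; a 100% opener
    is folded into the top bucket.
    """
    buckets = Counter()
    for sent, opens in rows:
        if sent == 0:
            continue  # no emails sent yet, so no rate to compute
        pct = 100 * opens / sent
        start = min(int(pct // bucket_width) * bucket_width, 100 - bucket_width)
        buckets[start] += 1
    return dict(sorted(buckets.items()))

histogram = open_rate_histogram(subscribers)
print(histogram)
```

Each key is the lower edge of a bucket, so the count under 0 is the number of subscribers who opened fewer than 10% of what they were sent.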

These charts can get really detailed. Take this stacked area graph showing last active users by the time they joined:

Such analyses are meant to provoke questions and start discussions among various teams in an organization. They allow the discussion to move past “we got an open rate two points above the Mailchimp benchmark” (and the following silence from a lack of clarity on what to do next).

Who are your most engaged/valuable/loyal readers and where did they come from? From there you can go several levels deeper. What organizations are these VIP cohorts from? What newsletter topics do they find most engaging? How many of them register for your events, or what percentage subscribe to paywalled content? 

Newsletter analytics still isn’t an exact science and there is much experimentation in the nascent space. But it’s more about empowering oneself to feel confident in asking questions. “It’s more of a learning process than coming up with an equation,” says Hong. 

Tech Wizardry

Tech sophistication is one of the telltale signs an organization is ready for audience-level newsletter analytics. More sophisticated newsrooms have the resources to explore audience-level metrics, and the people to create an analytics pipeline.

Most of the folks we’ve spoken to as part of this series have similar setups: daily extraction of an ESP’s data to a database, followed by programmatic analysis via SQL queries, and sometimes downstream analysis via Python’s array of data science libraries. 
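As a rough sketch of that setup, the Python below loads a made-up daily export into an in-memory SQLite database and runs one per-subscriber query. The table name, column names, and sample rows are all hypothetical. A real pipeline would pull rows from the ESP’s API or export files and write to a persistent database.

```python
import sqlite3

# Hypothetical rows as they might arrive from a daily ESP export:
# (subscriber, campaign, opened), with opened stored as 0 or 1.
daily_export = [
    ("a@example.com", "c1", 1), ("a@example.com", "c2", 1),
    ("b@example.com", "c1", 0), ("b@example.com", "c2", 1),
    ("c@example.com", "c1", 0), ("c@example.com", "c2", 0),
]

# An in-memory database stands in for the real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sends (subscriber TEXT, campaign TEXT, opened INTEGER)")
conn.executemany("INSERT INTO sends VALUES (?, ?, ?)", daily_export)

# Per-subscriber open rate: the audience-level view most ESP dashboards lack.
query = """
    SELECT subscriber,
           COUNT(*) AS sent,
           SUM(opened) AS opens,
           ROUND(1.0 * SUM(opened) / COUNT(*), 2) AS open_rate
    FROM sends
    GROUP BY subscriber
    ORDER BY open_rate DESC, subscriber
"""
per_subscriber = conn.execute(query).fetchall()
for row in per_subscriber:
    print(row)
conn.close()
```

From a table like this, downstream analysis in Python can group subscribers into cohorts or plot the distribution of open rates.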

When you’re a local paper in a small town with a shoestring budget, this becomes the stuff of dreams. This is where the Shorenstein Center’s mission of building valuable tools like open-sourced Python notebooks comes into play.


Let’s go back to our befuddled newsroom/marketing team from the beginning and see what they’ve gained working with the Shorenstein Center’s tools. 

Primarily, they now have something useful to tell their managers. They can speak in distributions of engagement, distinct cohorts, loyalty over time, anomalies in behavior, and so on. And while setting up proper analytics infrastructure can be difficult and expensive, with these open source tools they have access to what was previously only available to the newsletter elite – the Smartbriefs and Industry Dives of the space. 

As Hong noted, this is still an ongoing learning process, and it will likely continue for years. But with some general awareness of the problem and open source tools that help democratize audience-level analytics, at least a common language is forming to help us conduct more valuable conversations after those first few newsletter editions have gone out.

P.S. If you want a recent (2019) real-world example of newsletter analytics in action, we recommend you check out a piece by Sunnie Huang, a newsletter editor at The Economist, who shared how her team built a custom newsletter analytics dashboard to better inform the editorial team.