Unreliable News Sites: An Index
By IFCN and BG
Propagating misinformation is a thriving industry, supported by advertising dollars and political donations. Fighting falsehoods is the focus of scores of research reports. Their studies often require lists of untrustworthy “news” sources. But many of these lists have grown out of date and incomplete. Better data means better results, so we built a better dataset: the updated Unreliable News Site index below.
We merged five major lists, then purged those sites no longer active. We only incorporated lists that were public and curated by established journalists or academics, contained original data (rather than information derived from other lists), stated their criteria for inclusion, and defined their assigned labels (detailed in Methodology). Our initial release relies on their determinations and definitions of “unreliable.”
- FactCheck.org’s Misinformation Directory (FC), created by the Annenberg Public Policy Center at the University of Pennsylvania.
- Fake News Codex (FN), widely quoted by Snopes and others, maintained by web developer and data designer Chris Herbert.
- OpenSources (OS), run by Merrimack College media studies professor Melissa Zimdars.
- PolitiFact’s Fake News Almanac (PF), by PolitiFact, a joint project of the Tampa Bay Times and the Poynter Institute.
- Snopes’ Field Guide to Fake News Sites and Hoax Purveyors (SN), created by Snopes, the oldest and largest online fact-checking organization.
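A minimal sketch of how lists like the five above could be merged into one deduplicated set of domains, keeping track of which source flagged each one. The function names and the sample entries are hypothetical, and the normalization rule (lowercase, drop a leading "www.") is an assumption rather than the procedure actually used for this index.

```python
from collections import defaultdict
from urllib.parse import urlparse


def normalize_domain(entry: str) -> str:
    """Reduce a URL or bare hostname to a lowercase domain without 'www.'."""
    entry = entry.strip().lower()
    # urlparse only fills in the host when a scheme is present.
    if "://" not in entry:
        entry = "http://" + entry
    host = urlparse(entry).hostname or ""
    return host[4:] if host.startswith("www.") else host


def merge_lists(lists: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each unique domain to the set of source codes that list it."""
    merged: dict[str, set[str]] = defaultdict(set)
    for source, entries in lists.items():
        for entry in entries:
            domain = normalize_domain(entry)
            if domain:
                merged[domain].add(source)
    return dict(merged)


# Hypothetical sample entries, not taken from the real lists.
sample = {
    "FC": ["http://www.example-news.com/", "hoax.example.org"],
    "OS": ["Example-News.com", "rumor.example.net"],
}
merged = merge_lists(sample)
for domain, sources in sorted(merged.items()):
    print(domain, sorted(sources))
```

Keeping the source codes alongside each domain makes it possible to see how many curators independently flagged a site.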
Our next release will introduce our own inclusion criteria and fold in more lists. We hope researchers, reporters, and readers find this project helpful.
Unreliable News Sites
The next phase is to automate this list by dynamically removing inactive sites, adding sites by following URL redirects (which often lead to new fake-news schemes) and harvesting related domains via their shared IDs and IPs. Notorious conspiracy site YourNewsWire.com, for instance, now redirects to NewsPunch.com. The SpyOnWeb research tool detects their relationship, along with other domains connected by IP address or Google Analytics and AdSense ID (see Bellingcat’s site-research guide).
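One way the "related domains" step might work is to cluster sites that share any identifier, directly or through a chain of other sites. The sketch below does this with a small union-find structure. The domains and identifiers are made up for illustration, and the function name is hypothetical; it does not reproduce SpyOnWeb's data or method.

```python
from collections import defaultdict


def group_related(site_ids: dict[str, set[str]]) -> list[set[str]]:
    """Cluster domains that share at least one identifier, directly or transitively."""
    parent = {domain: domain for domain in site_ids}

    def find(domain: str) -> str:
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[domain] != domain:
            parent[domain] = parent[parent[domain]]
            domain = parent[domain]
        return domain

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen: dict[str, str] = {}
    for domain, ids in site_ids.items():
        for ident in ids:
            if ident in first_seen:
                union(first_seen[ident], domain)
            else:
                first_seen[ident] = domain

    clusters: dict[str, set[str]] = defaultdict(set)
    for domain in site_ids:
        clusters[find(domain)].add(domain)
    # Only clusters with more than one domain indicate a relationship.
    return [members for members in clusters.values() if len(members) > 1]


# Hypothetical identifiers (analytics ID, ad publisher ID, IP address).
sites = {
    "a.example": {"UA-111"},
    "b.example": {"UA-111", "pub-222"},
    "c.example": {"pub-222"},
    "d.example": {"ip:192.0.2.9"},
}
related = group_related(sites)
```

Here `a.example` and `c.example` share nothing directly, but both are tied to `b.example`, so all three land in one cluster.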
Blacklisting bots, fraud and false-news sites
Fake news is a business. Much of that business is ad-supported. Advertisers don’t want to support publishers that might tar their brand with hate speech, falsehoods, or some kinds of political messaging, but too often, they have little choice in the matter.
Most ad-tech dashboards make it hard for businesses to prevent their ads from appearing on (and funding) disreputable sites. Marketers can create blacklists, but those lists, too, have been out of date and incomplete.
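As a rough illustration of how a blacklist could be applied, the sketch below checks whether an ad placement URL falls on a blocked domain or any of its subdomains. The function name and sample domains are hypothetical, and real ad platforms expose their own exclusion mechanisms, which this does not model.

```python
from urllib.parse import urlparse


def is_blocked(placement_url: str, blocklist: set[str]) -> bool:
    """Return True if the URL's host, or any parent domain of it, is blocklisted."""
    host = (urlparse(placement_url).hostname or "").lower()
    parts = host.split(".")
    # Check the full host and each parent: www.hoax.example, hoax.example, example.
    return any(".".join(parts[i:]) in blocklist for i in range(len(parts)))


# Hypothetical blocklist entry.
blocklist = {"hoax.example"}
print(is_blocked("https://www.hoax.example/story?id=1", blocklist))
print(is_blocked("https://nothoax.example/", blocklist))
```

Matching on whole domain labels, rather than on substrings, avoids blocking an unrelated site whose name merely ends with a blocked one.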
Our index compiles existing site lists, curated by academics and journalists. For now, we depend on their expertise for accuracy. (A protocol to review and add sites is in the works.)
The site tags above come from those assigned by the original list curators. We grouped their differing labels into our set of six tags.
|Our tag||Their tag and description|
|bias:||OpenSources: "Extreme Bias: Sources that come from a particular point of view and may rely on propaganda, decontextualized information, and opinions distorted as facts."|
|conspiracy:||OpenSources: "Conspiracy Theory: Sources that are well-known promoters of kooky conspiracy theories."|
|clickbait:||OpenSources: "Clickbait: Sources that provide generally credible content, but use exaggerated, misleading, or questionable headlines, social media descriptions, and/or images."|
|fake:||Fake News Codex: "Sites that are fake,… A site doesn't need to exclusively publish fake content to qualify. In fact, many publish a great deal of authentic material, though it’s typically presented in a biased and tawdry fashion. This 'real' content serves as cover for the fake."
OpenSources: "Fake News: Sources that entirely fabricate information, disseminate deceptive content, or grossly distort actual news reports."
PolitiFact: "Fake news sites: There's little consistency of content or style among fake news sites — the common thread appears to be that they distribute fabricated content, but the reasons aren’t always apparent."
PolitiFact: "News imposter sites: Adding to the fog of fake news online, several websites appear to try to confuse readers into thinking they are the online outlets of traditional or mainstream media sources."
Snopes: "Fake News Sites and hoax purveyors… spreading fake news and outlandish rumors" and "false, disruptive claims" that "regularly fabricate salacious and attention-grabbing tales."|
|satire:||Fake News Codex: "Sites that are not necessarily intended to mislead (such as The Onion and its legion of imitators), but that can be misunderstood by naive readers."
OpenSources: "Satire: Sources that use humor, irony, exaggeration, ridicule, and false information to comment on current events."
PolitiFact: "Parody or joke sites: Many of the deliberately false or fake news stories we see in social media feeds begin on websites that attempt to parody real news — imagine the humor website The Onion, but without the name recognition (or often the comedic writing talent)."|
|unreliable:||FactCheck.org: "Websites that have posted deceptive content."
Fake News Codex: Sites that are "extremely misleading… We do not include sites that merely have a clear political or ideological bias."
OpenSources: "Hate News: Sources that actively promote racism, misogyny, homophobia, and other forms of discrimination."
OpenSources: "Junk Science: Sources that promote pseudoscience, metaphysics, naturalistic fallacies, and other scientifically dubious claims."
OpenSources: "Rumor Mill: Sources that traffic in rumors, gossip, innuendo, and unverified claims."
OpenSources: "State News: Sources in repressive states operating under government sanction."
OpenSources: "Unreliable/Proceed With Caution: Sources that may be reliable but whose contents require further verification."
PolitiFact: "Sites that contain some fake news: Finally, some websites appear to get duped like the rest of us."|
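The grouping in the table can be expressed as a lookup from a curator's label to one of the six tags. The sketch below covers only the OpenSources and PolitiFact labels that the table names explicitly. The key format (source code plus label text) and the function name are assumptions for illustration; the original lists may spell or store their labels differently.

```python
# (source code, curator's label) -> our tag, following the table above.
TAG_MAP = {
    ("OS", "Extreme Bias"): "bias",
    ("OS", "Conspiracy Theory"): "conspiracy",
    ("OS", "Clickbait"): "clickbait",
    ("OS", "Fake News"): "fake",
    ("PF", "Fake news sites"): "fake",
    ("PF", "News imposter sites"): "fake",
    ("OS", "Satire"): "satire",
    ("PF", "Parody or joke sites"): "satire",
    ("OS", "Hate News"): "unreliable",
    ("OS", "Junk Science"): "unreliable",
    ("OS", "Rumor Mill"): "unreliable",
    ("OS", "State News"): "unreliable",
    ("OS", "Unreliable/Proceed With Caution"): "unreliable",
    ("PF", "Sites that contain some fake news"): "unreliable",
}


def our_tags(labels: list[tuple[str, str]]) -> list[str]:
    """Collapse a site's curator labels into a sorted list of our tags."""
    # Labels missing from the map are skipped rather than guessed at.
    return sorted({TAG_MAP[label] for label in labels if label in TAG_MAP})


print(our_tags([("OS", "Satire"), ("OS", "Fake News")]))
```

A site flagged by several curators can end up with more than one tag, which is why the function returns a list.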
The combined lists had 1,043 unique domain names. Of these, as of November 2018, the 515 above were still active and the other 528 (51 percent) were inactive: either no longer online or no longer posting stories. We detected inactive sites programmatically by retrieving HTTP status codes (404s or 301s), using auto-generated screenshots, and, in some cases, by visual inspection.
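The status-code part of that check might look something like the sketch below, which uses only Python's standard library. The function names, the HEAD request, and the choice to treat an unreachable host as inactive are assumptions, not a description of the script actually used. A 301 is only a hint, since many live sites redirect from http to https, which is one reason screenshots and visual inspection are still needed.

```python
import urllib.error
import urllib.request
from typing import Optional

# Status codes the text names as signs of an inactive site.
INACTIVE_CODES = {301, 404}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so a 301 is reported as-is."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # declining the redirect makes urllib raise HTTPError


def fetch_status(domain: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for the domain's home page, or None if unreachable."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(f"http://{domain}/", method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses arrive here
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def looks_inactive(status: Optional[int]) -> bool:
    """Flag a site for review if it is unreachable or returns 301 or 404."""
    return status is None or status in INACTIVE_CODES


# Figures from the text: 528 inactive sites out of 1,043 unique domains.
inactive, total = 528, 1043
share = round(100 * inactive / total)
print(f"{share} percent inactive")
```

Calling `looks_inactive(fetch_status("example.com"))` would run the check for a single domain. Some servers reject HEAD requests, so a fallback to GET may be needed in practice.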
We curated the resulting list, trimming it a bit, by removing several sites whose stories, though highly politicized, were mostly not fake: alternet.org, cato.org, heritage.org, nationalreview.com, thedailybeast.com, theintercept.com, thinkprogress.org, and weeklystandard.com. We determined this by checking their stories at PolitiFact and Snopes.
Several sites we reviewed had mostly false fact-check judgments. These stayed on the list (links go to examples of their failed fact checks): addictinginfo.org, breitbart.com, dailycaller.com, dailykos.com, and judicialwatch.org.
Our Google spreadsheet has additional data: the year of domain registration and the number of scripts each site uses for advertising and tracking (thanks to BuiltWith). There's also a sheet of correlations between factors and averages for individual factors.
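For readers who want to compute such correlations themselves, here is a minimal sketch of a Pearson correlation between two per-site factors. The spreadsheet's own formula is not stated, so Pearson is an assumption, and the sample numbers are invented purely to show the call.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented values: domain registration year and number of ad/tracking scripts.
years = [2009.0, 2012.0, 2015.0, 2016.0]
scripts = [4.0, 9.0, 7.0, 12.0]
r = pearson(years, scripts)
```

The result ranges from -1 to 1, and a value near 0 suggests no linear relationship between the two factors.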
If you have additions or corrections, please use this form to notify us. Remember, our list includes only sites whose stories are demonstrably false -- not merely biased or partisan. Send links to fact-checks demonstrating whether the site you’d like us to review publishes fake or fact-based news.
Written by IFCN, BG and…