Microsoft lays off journalists to replace them with AI
Microsoft is laying off dozens of journalists and editorial workers at its Microsoft News and MSN organizations. The layoffs are part of a bigger push by Microsoft to rely on artificial intelligence to pick news and content that’s presented on MSN.com, inside Microsoft’s Edge browser, and in the company’s various Microsoft News apps. Many of the affected workers are part of Microsoft’s SANE (Search, Ads, News, Edge) division, and are contracted as human editors to help pick stories.
“Like all companies, we evaluate our business on a regular basis,” says a Microsoft spokesperson in a statement. “This can result in increased investment in some places and, from time to time, re-deployment in others. These decisions are not the result of the current pandemic.”
While Microsoft says the layoffs aren’t directly related to the ongoing coronavirus pandemic, media businesses across the world have been hit hard by plummeting advertising revenues across TV, newspapers, online, and more.
The layoffs are happening in the US and UK
Business Insider first reported the layoffs on Friday, and says that around 50 jobs are affected in the US. The Microsoft News job losses are also affecting international teams, and The Guardian reports that around 27 are being let go in the UK after Microsoft decided to stop employing humans to curate articles on its homepages.
Microsoft has been in the news business for more than 25 years, after launching MSN all the way back in 1995. At the launch of Microsoft News nearly two years ago, Microsoft revealed it had “more than 800 editors working from 50 locations around the world.”
Microsoft has gradually been moving towards AI for its Microsoft News work in recent months, and has been encouraging publishers and journalists to make use of AI, too. The company has been using AI to scan for content, process and filter it, and even suggest photos for human editors to pair with stories. Until now, human editors have curated top stories from a variety of sources to display on Microsoft News, MSN, and Microsoft Edge.
These Ex-Journalists Are Using AI to Catch Online Defamation
Like many stories about people trying to help fix the internet, this one begins in the aftermath of 2016. From his home in Ireland, Conor Brady had watched the Brexit vote and the election of Donald Trump with disbelief. In his view, the prominence of false stories during each election—whether about Muslim immigrants or Hillary Clinton’s health—was the direct consequence of a hollowed-out news industry without the resources to check the spread of disinformation.
At the time, Conor’s son, Neil—also a former journalist—was working as a digital policy analyst at the Institute of International and European Affairs, researching neural networks and machine learning. The two got to thinking. Wouldn’t it be great, they wondered, if a machine-learning tool could approximate the wisdom of editors and lawyers in order to help overstretched newsrooms? As they thought about it, one use case seemed especially ripe: automated defamation detection. Libel lawsuits are a major threat to news organizations. A system that could flag potentially risky stories before publication could save serious time and money.
“I said to him, ‘Do you think an editor, a journalist, would use that if we could build that kind of tool?’” Neil Brady recalls. “And he said, ‘I’ve no bloody doubt they would.’ And that’s when we said, OK, let’s do it.”
CaliberAI is the startup that eventually launched from that conversation, with a €300,000 pre-seed grant from Enterprise Ireland, a government fund, in November 2020. The basic idea is to provide an extra, automated set of eyes to reporters and editors—like a warning system for potential libel. (Defamation lawsuits tend to be much easier to bring against publishers in Europe than in the US, where the First Amendment gives journalists extra protection.) But the long-term play is more ambitious. The European Union and the United Kingdom are both in the process of crafting laws that could impose new legal liability on platforms for harmful and illegal content, including defamation. In the US, Congress keeps making noises about reforming Section 230 of the Communications Decency Act, the legal shield that protects American companies from liability over user posts. Social media platforms around the world may soon be confronting a version of the legal liability that newspapers have long had to deal with. And their ability to handle it could depend on the success of tools like the one the Bradys are building.
Defamation plays an important but overlooked role in the history of the internet. In the US, Section 230 was originally passed, in 1996, to deal with the fallout from a libel lawsuit. Traditional media organizations, like newspapers and TV news shows, face harsh liability rules for publishing a defamatory claim—a false statement that harms someone’s reputation—or even just passing along a defamatory statement made by someone else. In the 1990s, a trial court ruled that the same standard should apply to online platforms that took steps to moderate user-generated content. This created a perverse incentive: Companies might have avoided moderating anything for fear of falling under the ruling, thereby hosting a complete free-for-all, or they might have chosen to moderate with excessive caution, stifling too many innocent posts in the process. And so Congress passed Section 230, establishing that platforms generally can’t be held liable for user posts no matter what.
A key part of the thinking behind Section 230 was that while a newspaper might publish a few dozen or a hundred stories a day, an internet platform might host thousands or millions or, eventually, billions of pieces of content uploaded by users. At that scale, it’s impossible to vet everything in the same way an editor or legal department might. While the major platforms today enlist thousands of moderators, they rely even more on automation to flag violations. And the challenge appears especially daunting for defamation. Whether a statement is defamatory depends on whether it’s true or false—a particularly tough judgment to automate. Unlike a list of prohibited words, the universe of potential defamatory posts is infinite.
The insight driving CaliberAI is that this universe is a bounded infinity. While AI moderation is nowhere close to being able to decisively rule on truth and falsity, it should be able to identify the subset of statements that could even potentially be defamatory.
Carl Vogel, a professor of computational linguistics at Trinity College Dublin, has helped CaliberAI build its model. He has a working formula for statements highly likely to be defamatory: They must implicitly or explicitly name an individual or group; present a claim as fact; and use some sort of taboo language or idea—like suggestions of theft, drunkenness, or other kinds of impropriety. If you feed a machine-learning algorithm a large enough sample of text, it will detect patterns and associations among negative words based on the company they keep. That will allow it to make intelligent guesses about which terms, if used about a specific group or person, place a piece of content into the defamation danger zone.
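Purely as an illustration, Vogel’s three-part test can be sketched as a crude rule-based check. The word lists below are invented for the example; CaliberAI’s actual lexicons and trained model are far larger and not public.

```python
# A toy version of the three-part test. The cue and taboo lists
# are hypothetical stand-ins, not CaliberAI's real data.
FACT_CUES = {"everyone knows", "it is a fact that", "there is no doubt"}
TABOO_TERMS = {"liar", "thief", "fraud", "drunk", "corrupt"}

def looks_risky(sentence: str, named_entities: set) -> bool:
    """Flag a sentence that names someone, presents a claim as
    fact, and uses taboo language (the defamation danger zone)."""
    text = sentence.lower()
    names_someone = any(name.lower() in text for name in named_entities)
    asserted_as_fact = any(cue in text for cue in FACT_CUES)
    uses_taboo_language = any(term in text for term in TABOO_TERMS)
    return names_someone and asserted_as_fact and uses_taboo_language

print(looks_risky("Everyone knows John is a liar", {"John"}))  # True
print(looks_risky("I believe John is a liar", {"John"}))       # False
```

A real system would learn these associations from data rather than consult fixed lists, but the three conditions give it a tractable target: not “is this false?” but “could this even be defamatory?”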
Logically enough, there was no data set of defamatory material sitting out there for CaliberAI to use, because publishers work very hard to avoid putting that stuff into the world. So the company built its own. Conor Brady started by drawing on his long experience in journalism to generate a list of defamatory statements. “We thought about all the nasty things that could be said about any person and we chopped, diced, and mixed them until we’d kind of run the whole gamut of human frailty,” he says. Then a group of annotators, overseen by Alan Reid and Abby Reynolds, a computational linguist and data linguist on the team, used the original list to build up a larger one. They use this made-up data set to train the AI to assign probability scores to sentences, from 0 (definitely not defamatory) to 100 (call your lawyer).
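In the same spirit, here is a minimal sketch of how a labelled data set like that could train a probability-scoring classifier, assuming a scikit-learn pipeline and a tiny invented training set. The company’s real corpus, features, and model are proprietary, and four sentences obviously cannot produce meaningful scores; the point is the shape of the pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the annotated corpus: 1 = potentially
# defamatory, 0 = innocuous. The real data set is far larger.
sentences = [
    "Everyone knows John is a liar",
    "The councillor stole public funds",
    "I believe John is a liar",
    "John gave a speech on Tuesday",
]
labels = [1, 1, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression())
model.fit(sentences, labels)

def defamation_score(sentence: str) -> int:
    """Return a 0-100 risk score, mirroring the scale described above."""
    return round(100 * model.predict_proba([sentence])[0][1])

print(defamation_score("Everyone knows the senator is a thief"))
```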
The result, so far, is something like spell-check for defamation. You can play with a demo version on the company’s website, which cautions that “you may notice false positives/negatives as we refine our predictive models.” I typed in “I believe John is a liar,” and the program spit out a probability of 40, below the defamation threshold. Then I tried “Everyone knows John is a liar,” and the program spit out a probability of 80, flagging “Everyone knows” (statement of fact), “John” (specific person), and “liar” (negative language). Of course, that doesn’t quite settle the matter. In real life, my legal risk would depend on whether I could prove that John really is a liar.
“We are classifying on a linguistic level and returning that advisory to our customers,” says Paul Watson, the company’s chief technology officer. “Then our customers have to use their many years of experience to say, ‘Do I agree with this advisory?’ I think that’s a very important fact of what we’re building and trying to do. We’re not trying to build a ground-truth engine for the universe.”
It’s fair to wonder whether professional journalists really need an algorithm to warn that they might be defaming someone. “Any good editor or producer, any experienced journalist, ought to know it when he or she sees it,” says Sam Terilli, a professor at the University of Miami’s School of Communication and the former general counsel of the Miami Herald. “They ought to be able to at least identify those statements or passages that are potentially risky and worthy of a deeper look.”
That ideal might not always be in reach, however, especially during a period of thin budgets and heavy pressure to publish as quickly as possible.
“I think there’s a really interesting use case with news organizations,” says Amy Kristin Sanders, a media lawyer and journalism professor at the University of Texas. She points out the particular risks involved with reporting on breaking news, when a story might not go through a thorough editorial process. “For small- to medium-size newsrooms—who don’t have a general counsel present with them on a daily basis, who may rely on lots of freelancers, and who may be short staffed, so content is getting less of an editorial review than it has in the past—I do think there could be value in these kinds of tools.”
On the other hand, Sanders says, adopting a tool like CaliberAI could increase a publication’s legal exposure if it turned out that a journalist ignored a warning sign before publishing something defamatory. “I would not want my client to be the first publication to try this out,” she says. “Kudos to them for making it; let’s see what the courts think about this.”
The first set of CaliberAI users will be media organizations. The company is currently negotiating its first contract, with a chain of Irish newspapers owned by the Belgian publishing group Mediahuis. The far more interesting potential market, however, is not traditional media but social media. Neil Brady says he has had some preliminary conversations with major social networks. But as things stand, they have little reason to invest in something like CaliberAI’s software because they generally can’t get sued over user posts. The question is how long that will remain the case.
In the EU, under the upcoming Digital Services Act, platforms will be liable for illegal content that they know about and fail to remove. And in the US, the congressional debate around repealing or modifying Section 230 remains lively. (“We need Joe Biden to bring in liability,” Neil Brady half-jokes, when I ask about his company’s biggest challenges. “If he could just get on with that, that would be nice.”) Whatever final form these new liability rules take, platforms will almost certainly need new probabilistic methods of identifying and screening content that they mostly haven’t had to worry about so far, including defamation. In the case of CaliberAI, Neil Brady says, that could involve moderators using its tool, but it also could involve applying its analysis to steer users away from inflammatory posts in the first place.
“The internet is so dysfunctional in so many ways, and yet at the same time, there’s this very difficult balance to be struck between censorship and freedom of expression,” he says. “In the longer term, one of the big ways I can see that problem being addressed is the insertion of intelligent layers of technology like this, that essentially try and nudge better decisionmaking. It’s a kind of nudge-tech.”
One of the most compelling arguments made by people who support Section 230 immunity is that changing it would disproportionately hurt smaller platforms. Facebook can afford to expand moderation, the thinking goes, but newer companies might not. Startups like CaliberAI represent the other side of the coin. If legal changes force platforms to have more robust content moderation from the get-go, they won’t all build their systems from scratch. Companies like CaliberAI will proliferate to satisfy the startup market’s demand for moderation tools, in the same way many startups outsource payroll or other business functions.
It would be fitting if a team led by journalists helped shape the next phase of social media content moderation. Conor Brady, who teaches journalism at an Irish university, notes that the journalism profession is guided not just by legal pressures but by a set of values—like accuracy, impartiality, and independence—that date back to the late 19th century. Thinking on that timescale, it’s little wonder that social media hasn’t developed its own set of analogous norms. Conor likes to give his students a thought experiment. “Think about how you can actually take what are essentially 19th- and 20th-century editorial values and re-embed them, recast them in 21st-century technology,” he says. “It’s an easy thing to say it; it’s a damn difficult thing to actually put into effect.”
Artificial intelligence and journalism: a race with machines
Artificial Intelligence (AI) is a somewhat catch-all term that refers to the different possibilities offered by recent technological developments. From machine learning to natural language processing, news organisations can use AI to automate a huge number of tasks that make up the chain of journalistic production, including detecting, extracting and verifying data; producing stories and graphics; publishing (with sorting, selection and prioritisation filters); and automatically tagging articles.
These systems offer numerous advantages: speed in executing complex procedures based on large volumes of data; support for journalistic routines through alerts on events and the provision of draft texts to be supplemented with contextual information; an expansion of media coverage to areas that were previously either not covered or not well covered (the results of matches between ‘small’ sports clubs, for example); optimisation of real-time news coverage; strengthening a media outlet’s ties with its audiences by providing them with personalised context according to their location or preferences; and more.
But there is a flipside to the coin: the efficiency of these systems depends on the availability and the quality of data fed into them. The principle of garbage in, garbage out (GIGO), tried and tested in the IT world, essentially states that without reliable, accurate and precise input, it is impossible to obtain reliable, accurate and precise output.
News automation is the most visible aspect of this phenomenon and has undoubtedly given rise to the most heated debates within the journalistic profession. The idea of ‘robot journalism’, as it is often called, has inspired visions both dystopian and utopian.
At its worst, automation could threaten jobs and journalistic identity by taking over work usually done by humans. At its best, it could lead to a renewal of journalism by taking over repetitive and time-consuming tasks, freeing up journalists to focus on producing content with high added value.
But the automation of journalistic production methods is not limited to the generation of texts. The BBC recently introduced a synthetic voice to read aloud the articles published on its website; last year, Reuters launched an automated video system to cover sports matches.
No AI without human and financial resources
In his 2019 survey of 71 news organisations in 32 countries in Europe, North America, South America and Asia, Charlie Beckett, director of the Journalism AI project, reported that nearly four out of ten organisations have already deployed artificial intelligence strategies. The main obstacles to the development of these technologies lie in cultural resistance linked to fears regarding job loss and changing work routines, and sometimes even general hostility to technology. But they are also linked to the high cost of development, which explains why larger companies have greater access to them.
In what could be seen as a charm offensive aimed at easing tensions with newspaper publishers who criticise Google for using their content without compensation, the Google Digital News Innovation Fund has made significant contributions to funding projects in Europe that explore the possibilities of new technologies. At the time of the fund’s launch in 2015, Carlo D’Asaro Biondo, president of strategic partnerships at Google Europe, said the following: “I firmly believe that Google has always wanted to be a friend and partner to the news industry, but I also accept we’ve made some mistakes along the way.” Google DNI has gone on to support no fewer than 662 projects to the tune of €150 million.
One such project is RADAR (Reporters and Data and Robots) in the UK, which received funding of €706,000. According to the project’s website: “We have built the world’s only automated local news agency. We provide data-driven articles to hundreds of news websites, papers and radio stations across the United Kingdom.” The service is not entirely automated: a team of journalists works closely with the algorithms to ensure editorial control.
In Italy, the SESAAB group received €400,000 to develop algorithms that organise content according to the behaviour of internet users. Its tailor-made recommendation system is intended to increase subscriptions, and with them revenue, so that the journalists at its regional newspapers are able to devote themselves to creating ‘high quality’ content.
There are ways to take advantage of AI tools that don’t require such substantial resources. In addition to technologies developed to meet the specific requirements of a given media outlet, off-the-shelf natural language generation software packages are available at prices within reach of most news organisations.
According to a report by consultancy firm Gartner, the cost of accessing these platforms ranges from US$250 to US$4,800 a year. Their main advantage lies in the control they offer their end users, who are able to determine the software’s parameters – from choosing data to the form that generated texts will take – without requiring specialised skills. Swiss media group Tamedia opted for this solution in order to automate reporting on popular vote results in Switzerland. The system is capable of generating around 40,000 articles within a few minutes. It took five political journalists two to three days of work to configure ‘Tobi’, as the textbot was called.
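To give a sense of how such template-driven systems work, here is a minimal sketch in the spirit of a textbot like Tobi. The data fields and wording are invented for the example; Tamedia’s actual templates and parameters are not public.

```python
# Hypothetical vote-result records; a real system would ingest
# official feeds for every municipality and ballot question.
results = [
    {"municipality": "Examplingen", "yes_pct": 54.3, "turnout": 46.1},
    {"municipality": "Mustertal", "yes_pct": 41.8, "turnout": 51.9},
]

TEMPLATE = ("{municipality} has {verdict} the proposal, with "
            "{yes_pct:.1f}% voting yes on a turnout of {turnout:.1f}%.")

for r in results:
    # Derive the editorial verdict from the raw numbers, then
    # fill the journalist-written template with the data fields.
    r["verdict"] = "accepted" if r["yes_pct"] > 50 else "rejected"
    print(TEMPLATE.format(**r))
```

Once the template wording and data mapping are settled, generating one article or forty thousand is the same loop, which is what makes the approach economical for routine, data-rich stories.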
Embracing the phenomenon in order to shape its development
Considering that a computerised procedure is based on human choices, which are by definition not neutral, it is not absurd to think that influence should also flow in the opposite direction, with journalistic values informing the design of the tools. The “new players in the world of journalism” are computer engineers, linguists and data scientists. Companies that provide the media with technological solutions do not consider themselves to be doing journalism, even though they are actively involved in the journalistic production chain.
Professional organisations should reflect on how to pursue inclusive policies, insofar as the exercise of the news media’s social responsibility is as much individual as it is collective. The major challenges of integrating AI technologies into journalism also lie in the field of ethics. As French economist Michel Volle writes: “The good and the evil lie in the intention, not in the tool.”
According to a 2017 study by the Tow Center for Digital Journalism, AI technologies should integrate editorial values into their design. The report also stresses that “readers deserve to be given a transparent methodology of how AI tools were used to perform an analysis, identify a pattern, or report a finding. But that description must be translated into non-technical terms, and told in a concise manner…”.
At the end of 2020, the Council for Mass Media in Finland published a report recommending that the profession’s self-regulatory bodies should not delay in taking up the issues of data processing, choices in computerised procedures and transparency towards audiences. According to the report, if the media councils do not take the lead, others will: “And whoever it is – whether national legislators, the EU or platform companies – they might jeopardise the freedom of the press.”