Journalists used technology to identify bombs in Gaza and police violence in Chicago

*By Andrew Deck

Fifteen works received the Pulitzer Prize for journalism on May 6 of this year. For the first time, 2 of the winners disclosed the use of AI (artificial intelligence) in the production of their reports.

“We have no immediate knowledge of precedents,” said Marjorie Miller, Pulitzer Prize administrator. “Previous data reveals that winners may have used low-level machine learning applications. This is the first time we have asked the question explicitly [about the use of AI],” she explained.

In March, Alex Perry reported for Nieman Lab that 5 of this year’s 45 finalists disclosed their use of AI while researching, reporting or writing the stories submitted to compete for the journalism award. While cycles of excitement and fear around generative AI play out in U.S. newsrooms, it was actually machine learning, used for investigative reporting, that appeared most often among the finalists.

The winners of the award for the article “Missing in Chicago”, from City Bureau and the Invisible Institute, trained a custom machine learning tool to comb through thousands of police misconduct files. The visual investigations department of the New York Times trained a model to identify craters left by 2,000-pound bombs in areas marked safe for civilians in Gaza. This story was one of several that earned the newspaper the international reporting award.

Miller also confirmed the other 3 finalists who disclosed their use of AI. They included a series of local news stories from The Villages Daily Sun, a newspaper covering a large Florida retirement community, about the government’s response to Hurricane Ian; a Bloomberg investigation into how the US government fuels the global spread of gun violence; and reports on the water harvesting industry.

Nieman spoke to the journalists behind the two Pulitzer-winning stories that used AI, asking how they brought machine learning to their investigations and what other journalists can learn from their work.

Community-based data journalism

This year’s Pulitzer winner in the local reporting category was “Missing in Chicago,” a series that exposed systemic failures in the Chicago Police Department’s handling of investigations into missing and murdered black women. Published by City Bureau and the Invisible Institute, non-profit outlets based in Chicago, the series took years to produce. One of the pillars of its reporting was a machine learning tool called Judy.

“We used machine learning to analyze text from police misconduct records, specifically documents that contained narratives,” said Trina Reynolds-Tyler, chief data officer at the Invisible Institute, who shared the Pulitzer with City Bureau reporter Sarah Conway.

Reynolds-Tyler began building Judy in 2021 as part of an Invisible Institute project to process thousands of Chicago Police Department misconduct files released by court order. The files covered 2011 to 2015. Reynolds-Tyler enlisted 200 members of the Chicago community to help develop Judy: volunteers manually read and labeled the misconduct files, creating the system’s training data.

Though the volunteers were not AI experts, Reynolds-Tyler believes people in the impacted community had an inherent understanding of local police data. Even without the vocabulary to describe a machine learning algorithm, they had lived experience that a third-party data labeler did not. In total, Judy surfaced 54 allegations of police misconduct relating to missing persons over the 4-year period.
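Judy’s code and data are not public, but the workflow described above (volunteers label narrative text, then a model learns to flag similar allegations) can be sketched with a standard text-classification pipeline. The snippet below is a hypothetical illustration using scikit-learn; the narratives and labels are invented for this example, not the Invisible Institute’s data.

```python
# Hypothetical sketch of a Judy-style classifier: volunteer-labeled
# narratives train a model that flags missing-person-related allegations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented examples; labels: 1 = missing-person related, 0 = not.
narratives = [
    "officer refused to file a missing persons report for the family",
    "complainant says police delayed the search for her missing daughter",
    "dispute over a parking citation escalated at the scene",
    "officer used profanity during a routine traffic stop",
    "detective closed the missing person case without interviewing witnesses",
    "complaint about excessive force during an arrest for theft",
]
labels = [1, 1, 0, 0, 1, 0]

# TF-IDF features (unigrams and bigrams) feeding a logistic regression.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(narratives, labels)

# Triage a new, unlabeled narrative.
flagged = model.predict(["police ignored reports that a woman had gone missing"])
print(flagged[0])
```

In a real project, the training set would contain thousands of volunteer-labeled narratives, and every machine-flagged file would still be read and verified by a reporter.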

For the Pulitzer-winning investigation, these 54 cases became a kind of roadmap for other reporting by Reynolds-Tyler and Conway. The themes of the 54 cases validated the pain and neglect of families who have had loved ones go missing in recent years. The report proved that the incidents were not isolated, but were part of a history of systemic failure by the Chicago police.

Reynolds-Tyler hopes other reporters who rely on machine learning tools understand the value of embedding themselves in the community they are reporting about and basing their data work on real people and places. “We must make it our business to bring people with us into the future,” Reynolds-Tyler said about adopting AI in investigative reporting. “They can help you look at what needs to be seen and understand the data.”

Finding patterns

In the international reporting category, a December 2023 article from the visual investigations department at the New York Times was one of several recognized stories about the war in Gaza. The Pulitzer-winning team trained a tool capable of identifying craters left by 2,000-pound bombs, among the largest in Israel’s weapons arsenal. The New York Times used the tool to analyze satellite images and confirm that hundreds of these bombs were dropped by the Israeli military in southern Gaza, especially in areas marked as safe for civilians.

“There are many AI tools that are fundamentally just powerful pattern recognizers,” said Ishaan Jhaveri, a reporter on the team who specializes in computational reporting. He explained that if a mountain of material needs to be sifted through for an investigative project, an AI algorithm can be trained to recognize the pattern it is looking for. It could be the sound of someone’s voice in hours of audio recordings, a specific scenario described in a stack of OSHA (US Occupational Safety and Health Administration) violation reports, or, in the case of the winning report, the outline of craters in aerial photos.

Jhaveri said the team decided that an object detection algorithm was best suited for their investigation. They turned to a platform called Picterra to train this algorithm. Journalists manually selected craters in satellite images uploaded to the platform, training Picterra to do the same automatically, at scale.
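Picterra’s training pipeline is proprietary, but the core idea (teach a detector what a crater looks like from labeled examples, then scan imagery for matches at scale) can be shown with a toy sketch. The snippet below is purely illustrative: it uses normalized cross-correlation against a hand-made ring template on synthetic data, whereas a production system like the one described here would use a learned object detector on real satellite tiles.

```python
# Toy stand-in for crater detection: slide a small "crater" template over a
# synthetic image and flag windows that correlate strongly with it.
import numpy as np

def make_crater(size=7):
    """A bright ring around a dark center, a crude crater-like pattern."""
    y, x = np.mgrid[:size, :size]
    r = np.hypot(x - size // 2, y - size // 2)
    return ((r > 1.5) & (r < 3.5)).astype(float)

def detect(image, template, threshold=0.8):
    """Return (row, col) of windows whose normalized correlation beats threshold."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-9)
    hits = []
    for i in range(image.shape[0] - th + 1):
        for j in range(image.shape[1] - tw + 1):
            w = image[i:i + th, j:j + tw]
            wn = (w - w.mean()) / (w.std() + 1e-9)
            if (wn * t).mean() > threshold:  # Pearson correlation of the window
                hits.append((i, j))
    return hits

template = make_crater()
image = np.random.default_rng(0).random((40, 40)) * 0.1  # low-level noise
image[10:17, 20:27] += template  # plant one synthetic "crater"

hits = detect(image, template)
print(hits)  # the planted crater at (10, 20)
```

The manual labeling step the Times journalists performed plays the role of `make_crater` here: it gives the system ground-truth examples of the shape to look for, after which the scan over the full image runs automatically.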

One of the advantages of using Picterra was its computational power. Satellite images can easily exceed several hundred megabytes or even a few gigabytes, according to Jhaveri. “Any local development work on satellite imagery would naturally be clumsy and time-consuming,” he said, suggesting that many newsrooms simply don’t have the necessary infrastructure. A platform like Picterra is all about processing power.

After eliminating false positives (like shadows and lakes, for example), the visual investigations team discovered that, as of November 17, 2023, there were more than 200 craters corresponding to this type of bomb in southern Gaza – representing “a widespread threat to civilians seeking safety across southern Gaza”, said the New York Times in its investigation. “It is likely that more of these bombs were used than what was captured in our reports,” the newspaper noted.

“We did not use AI to replace what would be done manually. We used AI precisely because it was the kind of task that would take so long to do manually that it [would divert attention from] other investigative work,” said Jhaveri. AI can help investigative reporters find the needle in the haystack, he explained.

*Andrew Deck is a writer on the Generative AI team at Nieman Lab.

Translated by Fernanda Bassi. Read the original in English.

Poder360 has a partnership with two divisions of Harvard’s Nieman Foundation: the Nieman Journalism Lab and the Nieman Reports. The agreement consists of translating the texts of the Nieman Journalism Lab and Nieman Reports into Portuguese and publishing this material on Poder360. To access all translations already published, click here.

