Talend Data Masters 2016: How the ICIJ Decoded the Panama Papers with Talend

  • Martine Vesco
    Martine Vesco joined Talend in 2016 as a Senior Customer Marketing Manager. In this role, Martine develops and maintains a trusted advisor relationship with key customer contacts and creates Customer Reference programs as well as communities. Prior to Talend, Martine held a number of senior positions in customer marketing at leading software companies such as Dassault Systèmes, Business Objects and Workday.


The Panama Papers is the biggest data leak and the largest cross-border investigation in journalism history. For one year, around 400 reporters across almost 80 countries dug into this massive trove of information, which exposed how the offshore economy works. Talend Big Data was instrumental in bringing that information into the public domain.

Founded in 1997, the International Consortium of Investigative Journalists (ICIJ) is a global network of more than 190 independent journalists in more than 65 countries who collaborate on exposing big investigative stories of global social interest. In May 2015, ICIJ obtained from German newspaper Süddeutsche Zeitung an encrypted hard drive with leaked data from the Panamanian law firm Mossack Fonseca.

Massive Mountains of Information

The 2.6 terabytes and 11.5 million files of Panama Papers data were made up of more than 320,000 text documents, 1.1 million images, 2.15 million PDF files, 3 million database excerpts and 4.8 million emails. Printed out, the entire set of documents would weigh 3,200 tons, take more than 41 years of nonstop operation on an office laser printer, and consume a small forest of 80,000 trees as paper.


Open-Source Technology Inside 

The ICIJ used Talend Big Data to reconstruct Mossack Fonseca’s client database from the database excerpts and convert it into a Neo4j graph database. The team then used Linkurious, a graph visualization platform, to organize and access the information.
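
To make the idea of converting tabular excerpts into a graph more concrete, here is a minimal Python sketch. It is not ICIJ's actual pipeline, which was built with Talend and loaded into Neo4j; the record layout (officer, entity, role) and the helper names are assumptions chosen purely for illustration. It shows the general technique: flat rows become nodes and relationships, with names normalized so that the same person or company is not counted twice.

```python
from collections import namedtuple

# Hypothetical record layout: the article does not describe the real
# structure of the Mossack Fonseca database excerpts.
Row = namedtuple("Row", ["officer", "entity", "role"])


def clean(name):
    """Collapse repeated whitespace so 'Jane  Doe' becomes 'Jane Doe'."""
    return " ".join(name.split())


def build_graph(rows):
    """Turn flat rows into graph nodes and relationships.

    Returns (nodes, relationships) where
      nodes         maps (label, normalized name) -> display name
      relationships is a list of (officer key, role, entity key)
    """
    nodes = {}
    relationships = []
    seen = set()
    for row in rows:
        # Keys are case-insensitive so spelling variants merge into one node.
        officer_key = ("Officer", clean(row.officer).upper())
        entity_key = ("Entity", clean(row.entity).upper())
        # Keep the first display name seen for each node.
        nodes.setdefault(officer_key, clean(row.officer))
        nodes.setdefault(entity_key, clean(row.entity))
        rel = (officer_key, clean(row.role).upper(), entity_key)
        if rel not in seen:  # skip duplicate relationships
            seen.add(rel)
            relationships.append(rel)
    return nodes, relationships
```

In a graph database, each entry in `nodes` would correspond to a node and each entry in `relationships` to a typed edge between two nodes, which is what lets a tool such as Linkurious draw who is connected to which company.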

They knew from the beginning that they ultimately wanted to make this database open to the public. That raised the data quality requirements, since millions of people would see the information and a mistake could be catastrophic for ICIJ in terms of reputation and lawsuits. Talend was key in enabling the ICIJ’s data team to work efficiently and remotely across two continents and to document each step of the preparation process.

Data Democratization

On April 3, 2016, more than 100 media organizations published the results of the year-long investigation. The list of more than 210,000 companies across 21 jurisdictions included activities connected to the ongoing Syrian war, the looting of resources in Africa, and individual offshore transactions by billionaires, sports players and other celebrities. The reporting also linked companies to 140 politicians in more than 50 countries, including 12 current or former world leaders.

The political reaction came almost immediately. Iceland’s prime minister resigned two days after the revelations, France put Panama back on its tax haven list, and U.S. President Barack Obama called for international tax reform. Swiss police conducted two raids, including one on the headquarters of UEFA, the body that oversees professional soccer in Europe. A member of FIFA’s ethics committee was forced to resign.  

ICIJ’s Panama Papers investigation produced a daily drumbeat of regulatory moves, follow-up stories and calls for more action to combat offshore financial secrecy. At least 150 inquiries, audits or investigations into its revelations have been announced in 79 countries around the world, a result of ICIJ’s pioneering use of data to help uncover illegal activity.

