UPDATED 18:02 EDT / SEPTEMBER 11 2024

Meta confirms it scrapes Australian users’ posts for AI training without opt-out option

Meta Platforms Inc. is scraping Australian users’ public Facebook and Instagram posts to train AI models without offering an opt-out, the company acknowledged today.

Melinda Claybaugh, Meta’s global privacy director, detailed the practice during a hearing before the Australian Senate. The executive stated that the company uses the public posts of adult users in Australia to train artificial intelligence models. There is no opt-out, and the only way users can prevent scraping is to set their posts to private.

“We are using public data to train our foundation model and the services we build on that model,” Claybaugh told lawmakers. “We are using public data from our product and services.”

Meta also collects the public data of users in the European Union, but provides them with an opt-out button. Claybaugh said that the feature’s availability is a “direct result of the existing regulatory landscape in the EU,” which has implemented privacy rules such as GDPR. The Australian Broadcasting Corporation reported that a similar opt-out setting is not available in Australia.

Meta develops a family of open-source large language models called Llama. The newest and most advanced addition to the series, Llama 3.1 405B, was trained using more than 15 trillion tokens of data. That corresponds to several hundred million books’ worth of information.

Meta trained the model, which it says can outperform GPT-4o across certain benchmarks, through a multistep process. Some of the steps didn’t involve information from the public web but rather synthetic data, or data generated by an AI model specifically for neural network training purposes. To support the development of Llama 3.1 405B, Meta created several technical workflows for ensuring that its synthetic data meets quality requirements.

The Llama series also includes a number of simpler LLMs with less advanced capabilities. Collectively, the models in the series had about 350 million downloads as of late August, ten times more than the same time a year earlier.

In July, Meta revealed that it’s working on a multimodal version of Llama. The company stated at the time that it won’t make the LLM accessible to developers in the EU because of the “unpredictable nature of the European regulatory environment.” The disclosure came a few weeks after the EU passed a landmark piece of legislation designed to regulate certain AI use cases.

This week’s Australian Senate hearing isn’t the first time that Meta has drawn scrutiny in the country.

In 2021, the Facebook parent blocked Australian users from sharing news articles over a proposed law aimed at requiring social media companies to compensate publishers. Meta reversed the move after lawmakers made amendments to the law, which was passed later that year. This past June, the company raised the possibility that it may once again block news content over the licensing fees it may be required to pay publishers.

Photo: Wikimedia Commons
