How to opt out of AI training across social media platforms
Whether people are aware of it or not, businesses are increasingly using their social media posts -- written content and images -- to train AI systems.
Have you always wanted to be a teacher? Your dream might be coming true -- but not necessarily how you hoped.
That latest photo you posted on Instagram might be used to train an AI model or used in an AI-generated image. Your resume data on LinkedIn might be fed to an AI model. Your face might even appear in an ad if you use a feature on Snapchat.
AI companies rely on the internet to train their models because of the massive amounts of data they need. Not only are there vast amounts of data on the internet -- including social media sites -- it's also free.
Whether you want to train AI with your data or not, you have options.
How does AI scrape data from social media?
AI training models consume data faster than humans can produce it. They scrape the internet for information to learn how to respond to questions. AI chatbots -- such as ChatGPT -- use the information they pull from the web to formulate answers to questions. Companies also mine social media for language data that helps large language models understand how people converse and keep up with the latest trends.
"AI models rely on unstructured data from social media, including text, images and videos. Through techniques such as natural language processing and computer vision, AI attempts to understand and categorize this data," said Matt Hasan, CEO of AIResults Inc., an AI-powered marketing and CRM company. "But social media is chaotic, spanning multiple languages and contexts, which makes it difficult for AI to learn accurately. AI can easily misinterpret what it sees."
Companies also use AI to capture people's posts on social media for targeted ads. They use AI to analyze your posts, likes and actions to learn more about you. They want to reach you, learn more about you, and use AI to figure out what appeals to you, said Rogers Jeffrey Leo John, co-founder and CTO of DataChat, a generative AI platform for instant analytics.
Why you should opt out of AI training on social media
Transparency and disclosure are important. If you don't clearly understand how a platform uses your data, you are better off opting out, said Kamal Ahluwalia, president of Ikigai Labs, a generative AI data platform.
There are several reasons you should consider taking steps to prevent your information from being used to train AI models, including the following:
- No control over how your information -- including images or private information -- is used.
- Plagiarism of your thoughts and text posts.
- Spreading of false information -- including misinformation and disinformation.
- Lack of privacy.
"Once a model is trained on your data, there's no way to make it 'unlearn' or erase it, making it safer to exclude such data from AI training to protect privacy," John said.
Issues with training AI
Social media sites might not provide the highest quality data for AI model performance. For reliable and accurate data output, companies need high-quality and diverse data. Using social media data might result in biased information, human slang, jargon, harmful content and disinformation.
The quality of data also varies across platforms. LinkedIn tends to have higher-quality career posts, while Reddit might have more diverse perspectives. Companies that train models on this information need to identify misinformation and disinformation, some of which is purposely spread to harm the public. Left unfiltered, this could become a safety hazard.
John said companies need to filter data because it is often biased and misinformative. Social media also holds vast amounts of private data -- such as birthdate, relationship status, and contact and employment information -- which has been exploited by malicious actors.
When reviewing products or company data, people tend to share negative experiences more freely. There is a good chance there will be more negative than positive commentary, even though more people have positive experiences.
"Negativity seems to percolate these days a lot faster," Ahluwalia said. These negative experiences about products and services can give an inaccurate representation of a product launch when performing a sentiment analysis.
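The skew described above can be shown with simple arithmetic. The following is a minimal sketch in Python, not a real sentiment analysis; the function name and all the rates are hypothetical numbers chosen only to illustrate how a mostly satisfied customer base can still produce mostly negative posts.

```python
def observed_negative_share(satisfied_rate, post_rate_satisfied, post_rate_unsatisfied):
    """Return the fraction of posts that are negative, given who chooses to post."""
    unsatisfied_rate = 1.0 - satisfied_rate
    positive_posts = satisfied_rate * post_rate_satisfied
    negative_posts = unsatisfied_rate * post_rate_unsatisfied
    return negative_posts / (positive_posts + negative_posts)


# Hypothetical figures: 80% of customers are satisfied, but only 5% of them
# post about it, while 40% of unsatisfied customers post.
share = observed_negative_share(0.80, 0.05, 0.40)
print(f"{share:.0%} of posts are negative")
```

With these assumed rates, about two-thirds of the posts are negative even though four out of five customers had a positive experience.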
Ahluwalia also said there's a lot of noise in social media content, which mixes data generated by people with data generated by machines. The noise is neither good nor bad in itself, but models and developers don't know how to remove it.
"It's genuinely a lot of garbage in, and it's hard to take that garbage out," Ahluwalia said.
Is it ethical for AI to use social media information without permission?
Privacy is a major concern. Opting out of AI training on social media isn't straightforward, as most platforms include your data by default, Hasan said. Users are often unaware that their data is being used to build and train AI models, which is a fairness issue. And often, platforms profit from user data without compensating individuals.
Ahluwalia said he thinks anyone training their models or using people's data should get permission. He said EU AI regulations make it very clear that companies must obtain consent from users before using their data for AI training and must communicate the specific purpose of the training.
How to opt out
Opt-out settings vary between platforms, and not all platforms offer the option to opt out. Most details on how platforms use your data are buried in privacy policies and terms of service.
The following social media platforms each have more than 100 million users and appear on top social media site lists such as those from Semrush and Sprout Social.
Discord
To stop Discord from training its AI on your data, go to "Privacy & Safety" under "User Settings" on the left side. Scroll to the "How we use your data" section. Turn "use data to improve Discord" and "use data to customize my Discord experience" off.
LinkedIn
LinkedIn also lets users opt out of training AI models with their data. It posted FAQs outlining how it uses personal data for generative AI.
Click on the "Me" tab at the top of the screen. Select "Settings & Privacy," and then the "Data Privacy" tab on the left column to opt out. Then select "Data for Generative AI Improvement" under "How LinkedIn uses your data." Toggle the button to off.
When you select this option, LinkedIn and its affiliates will no longer use your data for AI models moving forward. However, this does not affect previous training before you selected this option.
Meta
Meta states in its privacy policy that it might use public Facebook and Instagram posts, comments, profile photos and audio to train AI systems -- including its AI chatbot.
If you do not want to share your information, make your account private. If you belong to any public groups, any posts you make and share there might also be used to train AI.
To make your account private on Facebook, adjust your privacy settings in the "Audience and visibility" section of "Settings & privacy." Select "Friends" or "Only Me" instead of "Public" in "Followers and public content."
Click on the privacy settings instructions for the other Meta apps to learn how to make the following accounts private:
- Instagram privacy settings.
- Threads privacy settings.
- WhatsApp privacy settings.
Reddit
Reddit is a little trickier because it is a public platform, and AI will crawl its forums as it does other websites. Reddit states in its privacy policy that when users submit content, it becomes a public part of the services, and the content might also be available in search results and provided in AI chatbot answers. The privacy policy also states, "You should take the public nature of the Services into consideration before posting."
Users cannot opt out of sharing public posts because of the nature of the platform, but private messages and posts in private communities are not shared with third parties. Reddit has deals to share platform data with Google and OpenAI to help train AI models.
Read more about Reddit's other privacy settings.
Snapchat
Snapchat has an AI chatbot and a "My Selfie" feature that lets users turn their selfies into AI-generated images. These images can be turned into advertisements and used to develop and train machine learning models as stated in Snapchat's terms of service.
To turn off the My Selfie feature, go to "Settings" then "My Account." Select the "My Selfie" feature and toggle "See My Selfie in Ads" to off. This will prevent your image from being used in AI-generated images for sponsored content.
To clear data from the My AI chatbot used for personalized ads and Snap products, go to "privacy controls" and select "clear data." There is a "Clear My AI Data" option.
TikTok
TikTok offers private and public accounts, although private accounts tend to get little engagement. The platform does use generative AI features. Learn how to adjust privacy settings on TikTok by reading its account settings information.
Tumblr
Tumblr states it discourages "crawlers" from getting information off of its site. However, Tumblr reportedly has agreements with OpenAI and Midjourney that give them access to information for their AI models.
To opt out, go to blog settings, then click "visibility." Toggle "prevent third-party sharing" to "on." If you have multiple blogs, you will have to complete this step for each one.
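The article does not say how Tumblr discourages crawlers. One common mechanism on the web is a robots.txt file, which asks named crawlers to stay away from parts of a site; compliance is voluntary on the crawler's side. The sketch below uses Python's standard `urllib.robotparser` module to show how such a rule is read. The crawler name `ExampleAIBot`, the rules and the URL are all hypothetical and are not taken from Tumblr.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Check whether robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


crawler_ok = is_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.com/post/1")
browser_ok = is_allowed(ROBOTS_TXT, "SomeBrowser", "https://example.com/post/1")
print(crawler_ok, browser_ok)
```

A rule like this only states the site's preference. It does not technically stop a crawler that ignores it, and it has no effect on data shared through direct agreements.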
X, formerly Twitter
X updated its terms of service, with the changes going into effect on Nov. 15, 2024. The new terms of service state, "You agree that this license includes the right for us to analyze text and other information you provide and to provide, promote, and improve the Services, including, for example, for use with and training of our machine learning and artificial intelligence models, whether generative or another type." This means that continuing to post on X allows the company to use your data to train its AI models.
If you do not want your data used to train the AI chatbot -- Grok -- you must opt out. Go to "Settings." Select "Privacy and safety." Under "Data sharing and personalization" there is a tab for "Grok." Toggle the option to "off."
Amanda Hetler is a senior editor and writer for WhatIs where she writes technology explainer articles and works with freelancers.