Machine learning can help to flag risky messages on Instagram while preserving users' privacy

AI and Instagram issues

As regulators and providers struggle to protect younger social media users from offensive behavior such as bullying while also safeguarding their privacy, a group of researchers from four top universities has proposed a method that uses machine learning to flag risky conversations on Instagram without eavesdropping on them. The finding may enable platforms to protect younger users while preserving their privacy.

Researchers from Drexel University, Boston University, Georgia Institute of Technology, and Vanderbilt University recently published their timely work in the Proceedings of the Association for Computing Machinery's Conference on Human-Computer Interaction. The study examined which types of data input, such as metadata, text, and image features, are most helpful for machine learning models in identifying risky conversations. According to their research, metadata traits such as conversation duration and participant engagement can be used to identify dangerous conversations.

Their efforts target a growing problem on the social media platform most used by Americans between the ages of 13 and 21. Recent studies have shown that bullying on Instagram is contributing to a sharp spike in depression among the app's youngest users, in particular a rise in eating disorders and other mental health issues among adolescent girls.

Scholars say that because Instagram makes younger audiences feel safe and encourages them to open up, it also exposes them to abuse, harassment and bullying by malicious users.

In the wake of the Cambridge Analytica incident and the European Union's precedent-setting privacy protection rules, platforms are under increased pressure to protect the privacy of their users. As a result, end-to-end encryption of all messages is being implemented across Meta's platforms, which include Facebook and Instagram. This means that message content is technologically secured and only the participants in a conversation can view it.

However, this increased degree of security also makes it more challenging for platforms to use automated tools to identify and prevent online harms. For this reason, the group's system may be crucial for safeguarding users.

Automated risk-detection tools are "one method to confront this spike in rogue actors, at a scale that can safeguard vulnerable customers," Razi said. The difficulty, however, lies in designing them in an ethically sound way that allows them to be accurate without intruding on privacy. When introducing security features like end-to-end encryption in communication platforms, it is critical to prioritize both the safety and the privacy of the younger generation.

The method created by Razi and her colleagues employs machine learning algorithms in a tiered manner to generate a metadata profile of a problematic conversation (for instance, it is likely to be brief and one-sided), combined with context indicators such as whether pictures or links are shared. Using only these sparse, anonymous signals, the algorithm identified risky conversations with 87% accuracy in their tests.

To test the system, the researchers reviewed about 17,000 private conversations from 172 users aged 13-21, comprising over 4 million messages. The participants labeled their chats as "safe" or "unsafe". About 3,000 chats were labeled "unsafe" and subsequently assigned to one of five categories: sexual messages, nudity/porn, harassment, sale, and illicit activity.

Using a random sample of conversations from each category, the team ran several machine learning models to extract the set of metadata features most closely associated with risky conversations. The researchers looked at average conversation duration, the number of users involved, the number of messages sent, reply time, how many images were sent, and whether participants were mutually connected to other people on Instagram.
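To make the metadata-only idea concrete, here is a minimal sketch (not the authors' code) of how such features could be computed for a conversation and combined into a risk flag. The field names, thresholds, and weights are illustrative assumptions; the study's actual models were learned from labeled data.

```python
from dataclasses import dataclass

@dataclass
class ConversationMeta:
    """Metadata-only view of a chat; no message content is inspected."""
    duration_minutes: float   # how long the conversation lasted
    num_participants: int     # users in the thread
    messages_per_user: dict   # user id -> message count
    avg_reply_seconds: float  # average time between replies
    num_images: int           # images shared
    mutual_connections: int   # shared connections between participants

def metadata_risk_score(meta: ConversationMeta) -> float:
    """Toy risk score built from cues reported in the study:
    risky chats tend to be brief and one-sided."""
    total = sum(meta.messages_per_user.values())
    top = max(meta.messages_per_user.values()) if total else 0
    one_sidedness = top / total if total else 0.0  # 1.0 = one person sent everything
    score = 0.0
    if meta.duration_minutes < 10:    # brief conversation
        score += 0.4
    if one_sidedness > 0.8:           # heavily one-sided
        score += 0.4
    if meta.mutual_connections == 0:  # participants share no connections
        score += 0.2
    return score

def is_risky(meta: ConversationMeta, threshold: float = 0.6) -> bool:
    """Flag conversations whose score crosses a threshold."""
    return metadata_risk_score(meta) >= threshold
```

A trained classifier would replace the hand-set weights, but the key point survives: every input here is metadata that remains visible even when message content is encrypted.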

This information allowed the team to build a program that uses just metadata, some of which would remain accessible even if Instagram conversations were end-to-end encrypted.

Overall, the researchers concluded that their findings have "implications for the sector as a whole" and "promising potential for further study." "First, risk identification that is based only on metadata attributes enables lightweight detection techniques that do not necessitate the costly processing associated with evaluating text and pictures. Second, creating systems without content analysis helps to resolve some of the ethical and privacy concerns that emerge in this field, protecting users."

To refine the program and enable it to identify the specific type of risk when users or parents choose to share additional details of a conversation for safety reasons, the team used the same dataset to perform a similar machine learning analysis of linguistic cues and image features.

In this case, machine learning techniques searched through the content of the conversations and, after determining which ones users had labeled "unsafe", identified the phrases and word combinations that occur frequently enough in problematic chats to raise a warning.
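The underlying idea, finding words that appear disproportionately in "unsafe" chats, can be sketched with a simple log-odds profile over labeled text. This is a hypothetical stand-in for the study's actual linguistic models; the corpora, smoothing, and scoring here are illustrative.

```python
import math
from collections import Counter

def phrase_profile(safe_chats, unsafe_chats, min_count=2):
    """Per-word log-odds profile: positive scores mark words that
    appear disproportionately in chats labeled 'unsafe'."""
    safe = Counter(w for chat in safe_chats for w in chat.lower().split())
    unsafe = Counter(w for chat in unsafe_chats for w in chat.lower().split())
    vocab = set(safe) | set(unsafe)
    n_safe, n_unsafe = sum(safe.values()), sum(unsafe.values())
    profile = {}
    for w in vocab:
        if safe[w] + unsafe[w] < min_count:
            continue  # skip rare words
        # add-one smoothing keeps the log finite for one-sided words
        p_unsafe = (unsafe[w] + 1) / (n_unsafe + len(vocab))
        p_safe = (safe[w] + 1) / (n_safe + len(vocab))
        profile[w] = math.log(p_unsafe / p_safe)
    return profile

def chat_score(chat, profile):
    """Sum the log-odds of a chat's words; higher = more 'unsafe'-like."""
    return sum(profile.get(w, 0.0) for w in chat.lower().split())
```

A warning would then be raised when a chat's score exceeds a calibrated threshold. Unlike the metadata-only detector, this step requires access to message content, which is exactly the trade-off the article discusses.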

To analyze the images and videos, which are central to communication on Instagram, the researchers combined two systems: one that recognizes and extracts text overlaid on photos and videos, and another that examines each image and produces a caption. The machine learning systems then applied a similar textual analysis to build a profile of phrases suggestive of photos and videos exchanged in dangerous chats.
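The pipeline shape is worth spelling out: both systems reduce an image to text, which can then be scored with the same phrase profile used for messages. The article does not name the OCR or captioning models, so in this sketch `ocr_text` and `caption` are hypothetical callables standing in for whatever models are used.

```python
def describe_image(image_path, ocr_text, caption):
    """Reduce an image to text: combine overlay text (from an OCR model)
    and a generated caption into one lowercase snippet.
    ocr_text and caption are placeholders for real models."""
    return f"{ocr_text(image_path)} {caption(image_path)}".strip().lower()

def image_risk_terms(image_path, unsafe_terms, ocr_text, caption):
    """Return which terms from an 'unsafe' phrase profile appear in
    the image's combined textual description."""
    words = set(describe_image(image_path, ocr_text, caption).split())
    return words & unsafe_terms
```

The design choice this illustrates: once media is mapped into text, one shared textual risk profile covers messages, overlay text, and captions alike.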

The machine learning system, trained on these risky-conversation features, was then tested against a random selection of chats from the larger dataset that had not been used for profile generation or training. Using a combination of metadata qualities, verbal signals and visual features, the software recognized unsafe interactions with an accuracy of up to 85%.

They claim that metadata alone provides insights about communication that can be potentially harmful for juveniles, but that recognizing and responding to the exact kind of risk requires analyzing language and images. Because such contextual cues might be helpful for well-designed risk mitigation systems that employ AI, this study raises significant philosophical and ethical questions in light of Meta's current push for end-to-end encryption.

The method might be adapted to evaluate conversations on other platforms that use end-to-end encryption. The researchers acknowledge that their study has limits because it only examined Instagram messages. They also point out that if the program were trained on a larger sample of conversations, it might become even more accurate.

However, they point out that while protecting privacy is a legitimate concern, there are ways to make progress, and these steps should be taken to protect the most vulnerable users of these well-known platforms. This study, they say, demonstrates the viability of effective automated risk identification.

In their paper, they state that "our research provides a crucial first step to allow automated, machine learning-based detection of online risk behavior moving ahead. Our method is based on reactive conversational features, but our study also opens the door for more proactive techniques to risk identification that are likely to be more applicable in the real world given their rich ecological validity."

Yasmin Anderson

AI Catalog's chief editor
