Using AI to deduce bias in social media and news

Hard Find Electronics Ltd 2018-10-31 00:00

The 2018 Dawn or Doom conference features talks by more than three dozen Purdue faculty members and national experts representing four areas — Machines: artificial intelligence, robotics, autonomous vehicles and drones; Mind: internet and social media effects; Body: bioengineering and human design; and Data: Internet of Things, privacy and cybersecurity. Credit: Purdue University

"I'm feeling sick." "This video game is SICK!" To a computer, the word "sick" may have the same meaning in these two sentences.

But a Purdue professor is combining machine learning with models of social relationships and behavior to read between the lines of text and capture the author's intent more fully. The technology could help identify biases in social media posts and news articles, making it easier to judge the information's validity.

Traditional natural language processing involves homing in on keywords; for example, the word "good" would normally indicate a positive opinion. This works well for certain applications, but it breaks down when the text is ambiguous, such as when the author intends a word or phrase to be sarcastic or tongue-in-cheek.
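To make the limitation concrete, here is a minimal sketch of keyword-based scoring in Python. The word lists and the function name `keyword_sentiment` are hypothetical, chosen only for illustration; they are not taken from Goldwasser's work or from any particular library.

```python
# Hypothetical keyword lists, chosen only for illustration.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "sick", "awful"}

def keyword_sentiment(text: str) -> int:
    """Return (number of positive keywords) minus (number of negative keywords)."""
    # Split on whitespace, trim surrounding punctuation, and lowercase each word.
    words = [w.strip('.,!?"\'').lower() for w in text.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives

print(keyword_sentiment("I'm feeling sick."))         # -1
print(keyword_sentiment("This video game is SICK!"))  # -1, although the author means praise
```

Both sentences receive the same negative score because the scorer sees only the word, not the context in which it is used.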

That's where Purdue professor Dan Goldwasser's approach comes in. He focuses particularly on current events and political issues, and analyzes news articles and politicians' tweets to try to determine how the author frames certain issues and what their ideology is.

Goldwasser, an assistant professor of computer science, will talk about this work at Dawn or Doom '18, Purdue's annual conference on the risks and rewards of emerging technologies. Dawn or Doom will be held on Purdue's West Lafayette campus Monday and Tuesday (Nov. 5-6). The conference, now in its fifth year, is free and open to the public.

Dawn or Doom is aligned with Purdue's Giant Leaps Sesquicentennial Campaign and is part of the Ideas Festival theme, Giant Leaps in Artificial Intelligence, Algorithms, and Automation: Balancing Humanity and Technology. The Ideas Festival is the centerpiece of the campaign and connects world-renowned speakers and Purdue expertise in a conversation on the most critical problems and opportunities facing the world.

In one project, Goldwasser is analyzing Twitter posts from political officials. Tweets can be a challenging form of text to interpret, because they're short and may be ambiguous. As an example, after a mass shooting, the phrase "thoughts and prayers" may be used sincerely to express sympathy for the victims' families, but it may also be used sarcastically as a criticism of the lack of government action on gun control.

Goldwasser and his team are trying to understand how politicians frame issues or events, and how that framing sheds light on their stance. To do this, they are combining linguistic analysis with models of social relationships and behavior. Social networks can give insight into the meaning of text, because if two people are closely connected, they're likely to share similar ideologies. Behavior, such as when an individual posts on social media, can indicate what issues they care about. Combining all three models gives a more complete picture of the author's intent than relying on any one of them alone.

In another project, funded by Google, Goldwasser is using social relationship models to try to identify bias in news sources. Keywords can be a good way of differentiating ideology for a small set of data. For example, an article about a mass shooting that focuses on the mental health of the shooter is more likely to have a conservative viewpoint, whereas an article that discusses how the gun was obtained is more likely to have a liberal outlook.

"The problem is that manually identifying the relevant indicators for each event is difficult to scale up," Goldwasser says.

Instead, his team is collecting multiple news articles about the same event and building a network of people who share the articles on social media. Based on the network's connection to individuals or organizations with a known political slant, the perspective of the article can be inferred without having to manually generate relevant keywords.
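The general idea of inferring an article's perspective from who shares it can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the team's actual model: the accounts, the follow graph, the share log and the scoring scale (-1.0 for liberal, +1.0 for conservative) are all hypothetical, and plain averaging stands in for whatever inference method the researchers use.

```python
# Hypothetical accounts with a known political slant
# (-1.0 = liberal, +1.0 = conservative; the scale is an assumption).
KNOWN_SLANT = {"org_a": -1.0, "org_b": 1.0}

# Hypothetical follow graph: user -> accounts the user is connected to.
FOLLOWS = {
    "user_1": ["org_a"],
    "user_2": ["org_a", "org_b"],
    "user_3": ["org_b"],
    "user_4": [],
}

# Hypothetical share log: article -> users who shared it.
SHARES = {
    "article_x": ["user_1", "user_2"],
    "article_y": ["user_3", "user_4"],
}

def user_slant(user):
    """Average slant of the user's known-slant connections, or None if there are none."""
    scores = [KNOWN_SLANT[a] for a in FOLLOWS.get(user, []) if a in KNOWN_SLANT]
    return sum(scores) / len(scores) if scores else None

def article_slant(article):
    """Average slant of the users who shared the article, skipping users with no signal."""
    scores = [s for u in SHARES.get(article, []) if (s := user_slant(u)) is not None]
    return sum(scores) / len(scores) if scores else None

print(article_slant("article_x"))  # -0.5
print(article_slant("article_y"))  # 1.0
```

No keywords are involved: the estimate comes entirely from the network's connections to accounts whose slant is already known.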
