The good, the bad and the algorithm

By Arianna Danganan and Natasha Grodzinski

Social media data is already used for marketing and political surveys; as of 2013, more than 80 per cent of Canada’s population had some form of online presence. Now that data may also be used to screen for individuals at risk and detect early signs of mental illness.

Diana Inkpen, a computer science professor and researcher at the University of Ottawa, is leading a team of PhD students and doctors from Ottawa in collecting massive amounts of social media data.

Diana Inkpen, lead researcher and computer science professor at the University of Ottawa. [Photo © Arianna Danganan]

Inkpen and her team received more than $400,000 in federal funding for their research from the Natural Sciences and Engineering Research Council of Canada. The grant is part of a federal initiative to put $48 million into scientific research across Canada.

Their goal for the three-year project is to create algorithms that sort social media text by its emotional content: whether the tone is happy, sad, surprised or angry. Once the data is mined, doctors on Inkpen’s team annotate it; the annotations allow them to cross-reference posts and link certain characteristics to certain mental illnesses.

“The thing about these text mining algorithms,” says Inkpen, “is that they can learn from data. Once the annotations are made, the algorithm can make the correlations between the characteristic and mental illness.”
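Inkpen’s description of a program that learns emotion labels from expert-annotated text is, at its simplest, a supervised text classifier. The toy sketch below trains a Naive Bayes model on a handful of invented, hand-labelled posts; the corpus, labels and code are illustrative only and are not from Inkpen’s project.

```python
import math
from collections import Counter, defaultdict

# Toy annotated corpus: each post carries an emotion label, standing in for
# the expert annotations described in the article. All examples are invented.
TRAIN = [
    ("i feel great today so happy", "happy"),
    ("what a wonderful happy day", "happy"),
    ("i am so sad and alone", "sad"),
    ("feeling down and sad tonight", "sad"),
    ("cannot believe this what a surprise", "surprised"),
    ("this makes me so angry and mad", "angry"),
]

def train(examples):
    """Count word frequencies per label (multinomial Naive Bayes)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        words = text.split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def classify(text, word_counts, label_counts, vocab):
    """Pick the label with the highest log-probability, with add-one smoothing."""
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify("so sad and down today", *model))  # prints "sad"
```

A real system would train on vastly larger annotated corpora with richer features, but the principle is the one Inkpen describes: annotated examples teach the model which textual characteristics correlate with which emotional tone.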

From there, other indicators are taken into consideration, such as how often posts with the same theme appear and whether the emotional and social media behaviour is unusual for that user ID.
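The "unusual for the user" check can be sketched just as simply: compare a user’s recent share of, say, sad-toned posts against that same user’s historical baseline. The threshold and the example label sequences below are invented for illustration and do not come from Inkpen’s project.

```python
# Toy sketch: flag a user ID whose recent emotional tone deviates
# from that user's own baseline, rather than from a global norm.

def sad_fraction(labels):
    """Fraction of posts in a window labelled 'sad'."""
    return sum(1 for label in labels if label == "sad") / len(labels)

def is_unusual(baseline_labels, recent_labels, jump=0.3):
    """Flag the user if recent sad posts exceed their baseline by `jump`."""
    return sad_fraction(recent_labels) - sad_fraction(baseline_labels) > jump

baseline = ["happy", "happy", "sad", "happy", "happy", "happy", "sad", "happy"]
recent = ["sad", "sad", "happy", "sad"]
print(is_unusual(baseline, recent))  # prints "True"
```

Comparing each user against their own history is what makes normal-but-gloomy accounts uninteresting while a sudden shift in tone stands out, which matches Inkpen’s point that normal behaviour may be no cause for worry.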

Infographic - Canada's social media and mental illness landscape

“If it is normal behaviour, then maybe there is no need to worry,” says Inkpen. “But these predictive models can be used to make predictions for new [users].”

Inkpen’s team is working with partners from the University of Alberta and Université de Montpellier in France. While her team is primarily using public data from Twitter, the Université de Montpellier has already started looking at text from online forums. Some of these online forums are specifically used by cancer survivors who share their experiences and advice with others.

This isn’t the first work linking social media and mental health. Previous studies, however, have sought to correlate heavy social media use with poor mental health.

The Centre for Addiction and Mental Health conducted a survey in which Canadian students were asked about their social media use and personal mental health. It found that students who used social media for more than five hours a day were more likely to report high levels of stress and low self-esteem.

“This is just a snapshot in time,” lead researcher Dr. Robert Mann said of the survey. “We can’t say what came first, the poorer mental health or the high level of social media use.”

If successful, these algorithms could help determine whether the relationship between social media use and mental health is causal or purely correlational.

Inkpen says her team is also working with data gathered from the most recent Bell Let’s Talk campaign. The volume of data from the campaign is nearly impossible to sort through manually, but their software may be able to.

“Our program can classify by type of message,” says Inkpen. “Is it a story, is it fundraising, is it just awareness, or is it someone who might have problems who is starting to speak out?”

Big Brother watching

Since Inkpen’s team is gathering most of its data from Twitter, issues of privacy have arisen. With all this data monitoring, is the project a little too much like “Big Brother is watching”? Inkpen says her team looks only at user IDs, not at specific people.

“We are looking at public data, because we cannot and do not want to look at private data,” she says. “But it’s very interesting how people want to keep their privacy, but share on social media, which is an obvious breach of privacy, and they themselves post a lot of information that could be considered a privacy leak.”

With social media becoming more accessible, activity continues to rise, especially among youths. [Photo © Arianna Danganan]

Social media has changed the way society views privacy. Criticisms of over-sharing follow platforms like Twitter, Instagram and Snapchat – with people posting their meals, daily habits and personal feelings – where posts can be viewed around the world by virtually anyone. While privacy measures exist on social media sites, many people still choose to keep their accounts public.

A study conducted through the Pew Research Center in 2013 found that 64% of teens who use Twitter keep their tweets public.

Inkpen emphasizes that her team is only mining the data. Any final decisions about what to do with the information are in the hands of doctors.

Practical Applications

Inkpen also notes the importance of working together with health professionals in formulating the algorithms and annotating the data.

“We are separating the work of the expert from the programmer,” she says. “Because the expert cannot program, and the programmer is not an expert on diagnosis.”

The data mining research is still in its early stages. Inkpen and her team have much more text to gather and sift through before they can draw any conclusions about the success of the algorithms. Success here will mean an algorithm making a prediction about an individual’s mental health that matches the diagnosis of a psychologist on the team.

Despite being in an early stage of research, Inkpen already has some ideas about practical applications for the algorithms.

“A long term scenario I see is parents keeping an eye on their kids to protect them from cyberbullying.” – Diana Inkpen

She believes the algorithms would be useful to psychologists who want to monitor their patients’ online activity or watch for suicide warning signs.

“It has to be done with agreement of the patient,” says Inkpen, “and our doctors have to be willing to use a technology still developing.”

Inkpen stresses how efficiently the algorithms gather data, but also notes they are not perfect. In language, context is everything. Humans, even through written text, can infer and understand context; an algorithm cannot always do that.

“There is so much information out there,” says Inkpen. “The point is to see if that information can help us. Otherwise, the information is just buried there.”
