Charting the stormy seas of social media
story by Lauren J. Bryant
illustrations by Neil Caudle, adapted from Winslow Homer
On September 6, Clemson launched a second Social Media Listening Center in Daniel Hall, effectively doubling the University’s commitment to social media listening, not only in terms of space but also in terms of technology. The Daniel Hall SMLC will feature a six-screen “hiperwall” that allows users to configure and display data in myriad ways best suited to their particular needs. Directing the new center is Joseph Mazer, assistant professor of communication studies (photo above by Craig Mahaffey). Mazer will be working with students and faculty across campus, as well as clients outside the University, to drill down into millions of social media conversations. “The hiperwall offers a huge advantage,” said Mazer. “It enables us to configure data in ways that make sense graphically—critical for presentation, for teaching, and even more fundamentally, for understanding what we’re observing.”
If you think tweets come from feathered things, posts are for fences, and pinterest is merely a typo, this story may not be for you.
But if you’re among the one billion users of Facebook or YouTube, or the hundreds of millions who use Twitter, Pinterest, LinkedIn, Instagram, and Tumblr, then meet Jason Thatcher and the faculty and students at Clemson’s Social Media Listening Center.
In fact, they may have heard from you already.
Since February 2012, when the SMLC opened under the auspices of the Clemson CyberInstitute, Thatcher (@jasonbthatcher) and dozens of students and faculty members have been “listening” to social media—meaning, they are monitoring information drawn from hundreds of millions of updates, posts, and tweets to track what people are saying about a particular person, place, or thing.
They accomplish this listening using systems from Radian6 (a division of Salesforce Marketing Cloud @marketingcloud) and PeekAnalytics (@peekanalytics), six high-powered computers provided by Dell (@Dell), and six forty-seven-inch monitors arrayed around the often-busy room that constitutes the center.
There’s a lot to hear. As an article on social media analysis in The Guardian put it, in its June 10 edition, “online social networks bleed information.” Every status update and Facebook “like,” every hashtagged tweet and repinned pin, offers potentially valuable data if you know how to look or listen. Conversations via the “social web” generate a dataset like no other—constantly updated, broadly diverse, largely unfiltered, and, in terms of quantity, utterly overwhelming.
“Here’s the problem,” says Thatcher, an associate professor in the Department of Management at Clemson and a faculty lead at the SMLC. “You have seven hundred million data sources, five thousand tweets a second. How do you find something useable in that? How do you identify, extract, and analyze relevant data?”
That’s where the Social Media Listening Center comes in.
Using straightforward keyword searches and sophisticated algorithms, students and faculty at the SMLC monitor words and phrases used in social media conversations, analyzing the sentiment and intentions they may reveal.
“We sort the signal from the noise,” Thatcher says. “Defining keyword sets and refining searches isn’t rocket science, but what you do with the data is. What we’re good at is figuring out what to do with it.”
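To make "sorting the signal from the noise" concrete, here is a toy sketch of keyword filtering followed by a crude sentiment count. It is not the SMLC's method or anything from Radian6. The word lists, function names, and sample posts are all hypothetical, and real sentiment analysis is far more sophisticated than counting words.

```python
import re

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"love", "great", "delicious", "proud"}
NEGATIVE = {"hate", "awful", "terrible", "angry"}


def tokenize(text):
    """Lowercase a post and pull out its words (a hashtag keeps its text)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def matches_keywords(text, keywords):
    """Return True if the post mentions any tracked keyword."""
    return any(word in keywords for word in tokenize(text))


def sentiment_score(text):
    """Positive-word count minus negative-word count; above zero leans positive."""
    words = tokenize(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def monitor(posts, keywords):
    """Keep only posts that mention a keyword, each paired with its score."""
    return [(p, sentiment_score(p)) for p in posts if matches_keywords(p, keywords)]


# Made-up sample posts, not real social media data.
sample_posts = [
    "I love #Clemson football",
    "Traffic was awful today",
    "Clemson parking is terrible, I hate it",
]
results = monitor(sample_posts, {"clemson"})
```

The second post is dropped because it never mentions the keyword. The other two come back with scores of 1 and -2. As Thatcher suggests, the hard part is deciding what to do with the matches once they are found.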
Keeping the trust
There’s no denying that social media listening is surveillance, but Thatcher is quick to point out that the work of the Social Media Listening Center is anything but covert.
“We’re upfront and transparent about collecting data,” he says. “We collect data only where people do not have their privacy settings turned on. We’re not reaching into anyone’s pocket.”
In fact, Thatcher firmly believes that academics should collect and mine data from the social web (as does Clemson’s Chief Information Officer Jim Bottum, who had the original idea of partnering with Dell and Salesforce Radian6 to create the SMLC).
“When I make my pitch about why academics should be doing this work,” says Thatcher, “I tell people that universities are keepers of the public trust. We can ask questions that others can’t. If we don’t do this work, then the intelligence and business communities will know what’s going on, and we won’t.”
Good analysis depends on context, says Thatcher, who stresses this point with students working on projects at the SMLC. “You really have to be a content expert and have some mastery of an event or issue to run a good data analysis,” he says. “You have to apply your understanding of the issue to the data along with the technology.”
Katy Perry and Halo 4
Last fall, with funding from a National Science Foundation Early Concept Grant for Exploratory Research, Thatcher and an SMLC team took a look at the 2012 congressional and national elections, specifically the impact of social media on polarization.
“We were really interested in learning about what made folks ‘pick a team,’” Thatcher explained in an interview on the Salesforce Marketing Cloud blog after the presidential election.
Monitoring conversations related to these issues, the team discovered that medical marijuana talk focused on policy issues, not partying, and that Katy Perry engaged young people to support Obama but her tweeting peaked too soon.
“Her numbers spiked before Election Day with her offer [to retweet photos of outfits people wore to vote] but didn’t jump back up on the day of the election itself,” Thatcher reported in his blog interview.
As for celebrity chatter (Clint Eastwood, Kanye West, etc.), the team found that celebrities encouraged more positive feelings toward Obama and negative ones toward Romney, but overall played a modest role. Memes, too, were low-profile influencers on Election Day.
The search for conversation about Halo 4 (the latest installment of a popular video game series, released on November 6, 2012) was used to track whether people were talking about other aspects of their lives in relation to the election. The team found that Halo 4 fans may have been disgruntled by the timing of the game’s release but managed to attend to both the election and alien enemy attacks.
The most significant results from the NSF study are still to come, though. First, Thatcher and his team have collected a very large dataset reflecting how social media influences the opinions of U.S. citizens about elections, a dataset that can now be used by other researchers. Second, the SMLC team is wrapping up work on a prototype of a digital tool that will take all the snippets of social media data and create an online “dashboard” that displays the data in one place.
“What we’re hoping to have when we’re done with the project is a visualization tool—a piece of software—that researchers will be able to import social media data into for analysis,” Thatcher says. “You’ll be able to manipulate the data, map conversations to see if they are positive or negative. It will be applicable to a number of different areas where researchers are interested in studying conversations on the social web.”
“We’re the pioneers.”
One of those researchers is Vernon Burton (@VernonBurton1), professor of history and computer science and director of the Clemson CyberInstitute. Author of the award-winning The Age of Lincoln and a former president of the Southern Historical Association, Burton is using the SMLC to help him “measure what people think about the South,” he says.
Tracking posts, tweets, and hashtags, Burton and his colleague Simon Appleford (@sjappleford), associate director for humanities, arts, and social sciences at the Clemson CyberInstitute, are looking for mentions of the word “south” and what terms are associated with the mentions. Burton notes that this is considerably more complicated than typing a few words into your Google search bar.
(Jason Thatcher concurs on this point: “Analyzing social media is a long, involved process,” he says. “It’s not necessarily hard to do, but it’s hard to do well.”)
For example, Burton and Appleford had to construct their social media searches to exclude “souths” such as South America, South Africa, South Chicago, South Pole, and South Park.
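One way to picture that kind of exclusion is a small filter that removes the unwanted "souths" before checking whether the word still appears. This is only a sketch under stated assumptions, not the search Burton and Appleford built. The excluded places come from the examples above, while the function name and patterns are hypothetical.

```python
import re

# Places named in the text as "souths" to exclude.
EXCLUDED_PLACES = ["America", "Africa", "Chicago", "Pole", "Park"]

# Matches "South America", "South African", "south park", and so on.
# The trailing \w* also swallows suffixes such as the "n" in "American".
EXCLUDED = re.compile(
    r"\bsouth\s+(?:" + "|".join(re.escape(p) for p in EXCLUDED_PLACES) + r")\w*",
    re.IGNORECASE,
)

# The bare word "south" on its own (this does not match "southern").
SOUTH = re.compile(r"\bsouth\b", re.IGNORECASE)


def mentions_the_south(text):
    """True if 'south' still appears once the excluded phrases are removed."""
    remaining = EXCLUDED.sub(" ", text)
    return bool(SOUTH.search(remaining))


# Made-up sample posts, not real social media data.
examples = {
    "Nothing beats the food in the South": mentions_the_south(
        "Nothing beats the food in the South"
    ),
    "Watching South Park tonight": mentions_the_south("Watching South Park tonight"),
    "Flew from South Africa to the South for barbecue": mentions_the_south(
        "Flew from South Africa to the South for barbecue"
    ),
}
```

Even this tiny filter shows why the task is harder than a search-bar query. Every new false match, whether a street name, a compass direction, or a sports team, needs its own rule.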
Their social media analysis is ongoing, but certain conclusions were quickly evident.
For one thing, Burton says, “Southern history is negative. It’s significantly less important to users of social media than Southern food.” Southern food and culture trend positive, while mentions of religion and gender in connection with the South are negative.
Burton also tracked racist terms and comments on social media but found that geographic differences complicate things. In social media data coming from Philadelphia, New York City, and Louisiana, for example, Burton notes that the “n-word” has been appropriated as an affirmation of identity. Likewise, when tweets from Texas showed up containing the racist term “mammy,” it turned out that it was “an African-American woman in Texas tweeting ‘mammy’ all over the place,” he says.
Nevertheless, Burton, an American history scholar for more than thirty years, is enthusiastic about the future of social media analysis in social science research. “You can learn a lot from it,” he says, “especially when you parse it out by region and put it into historical context. You can look to see if the image of the South is changing. Is there a South and a North anymore?
“We’ve never had the opportunity to gather this kind of data,” he continues. “It was unimaginable even ten years ago. Social media is a different kind of data. It’s unguarded, more out of the heart than the head. We’re the pioneers in figuring out what you can and can’t learn from it.”
The wisdom of the crowd
Clemson students are definitely among the university’s social media analysis pioneers. Kyle LePrevost (@KyleLePrevost), who graduated in May 2013 with a degree in management information systems, was part of a Creative Inquiry group that took advantage of the SMLC shortly after it opened. In early 2012, he and four other students (Scott Cole, James Kaplanges, Brett Smentek, and Paul Smith) decided to explore whether social media analysis could help them predict movements in the foreign currency exchange market (@FOREXcom).
“The basic idea,” says LePrevost, “is that we can scrape actual trade data from Twitter and use it to make real-time decisions about when to buy or sell currencies.”
The students opened a “demo” brokerage account, using pretend money, to conduct practice trades and test their algorithm. It was a wildly successful experiment. Out of 58 trades made using the algorithm in spring 2012, only 13 moved in the opposite direction—a 77 percent prediction rate. The students began with $5,000 in their “dummy” account and ended up with $44,000 in theoretical funds just seven weeks later, a 780 percent increase.
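The figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this story. The function names are hypothetical, and it says nothing about how the students' algorithm itself works.

```python
def hit_rate(total_trades, wrong_trades):
    """Share of trades that moved in the predicted direction, as a percentage."""
    return (total_trades - wrong_trades) * 100 / total_trades


def percent_increase(start, end):
    """Growth from start to end, as a percentage of the starting amount."""
    return (end - start) * 100 / start


# Figures reported in the story: 58 trades, 13 of which went the wrong way.
rate = hit_rate(58, 13)

# Demo account: $5,000 at the start, $44,000 seven weeks later.
growth = percent_increase(5_000, 44_000)

print(f"Prediction rate: {rate:.1f} percent")
print(f"Account growth: {growth:.0f} percent")
```

Forty-five correct trades out of 58 works out to roughly 77.6 percent, in line with the 77 percent reported, and the $39,000 gain on $5,000 is exactly 780 percent.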
As Jason Thatcher put it on the SMLC blog, the students “beat the snot out of the market.”
Over the last year or so, the group—now going by the name #TeamForex—has “backtested” their algorithm using historical market data. The algorithm did not sustain its 780 percent increase level; it dropped to 600 percent. “It still beats out most hedge funds by a wide margin,” LePrevost says. “We have seen about a six hundred percent return per year over two years, which is very exciting.” It’s no doubt exciting too for the investment firms now interested in helping #TeamForex expand their project.
Studies of U.S. elections, Southern history, and foreign currency exchange are just the beginning of the research projects to be carried out at the Social Media Listening Center, according to Jason Thatcher. “Our goal is to create a platform for research projects across the campus and with other universities,” he says. “The questions we can pursue are limitless.”
Jason Thatcher is the faculty leader of the Social Media Listening Center and an associate professor in the Department of Management, College of Business and Behavioral Science. Orville Vernon Burton is the director of the Clemson CyberInstitute and a professor of history in the College of Architecture, Arts, and Humanities. Jim Bottum is the chief information officer and vice provost for computing and information technology; he is also a research professor of electrical and computer engineering in the College of Engineering and Science.
In January, Clemson’s Board of Trustees approved a new Social Technologies and Analytics Research Institute (STARI), which will include the Social Media Listening Center, along with several other Clemson units and industry partners. STARI will conduct research into the role of social media in organizational performance.
Lauren J. Bryant is a science journalist in Bloomington, Indiana.