It’s been a while since my last post, but I am looking forward to the upcoming Sentiment Analysis Symposium in May. One of the new things to think about with sentiment is how it is expressed in different genres. Until recently, linguists did not pay much attention to how genre might influence the lexical and grammatical features used for classification and other tasks. That is certainly starting to change, and I think the impact on sentiment analysis in particular deserves deeper investigation.

For one thing, the microblog document (OK, Twitter) has many more ways to express sentiment than other document types. It is not just emoticons that are of interest but all kinds of textual manifestations of emotion, including representations of sound (ugh! Eeew!) that are fairly rare elsewhere, even in email.

I am also excited by how the concept of “sentiment” has become intertwined with “reputation”. Why is that exciting? Because the traditional polarity expectations change when something as subjective as a “reputation” becomes the topic. When sentiment analysis was applied mostly to product reviews or news snippets, things were fairly predictable: “bad” news events like earthquakes and people complaining about products were tagged as negative, while product raves and good news were tagged as positive. Sure, once in a while a sarcastic review will stump the classifier, as will things like “plummeting” inflation, where a normally negative word describes a welcome change. But reputation is a different animal. Many people do not want certain things exposed simply because of their position relative to other things. For example, a Republican may not want certain types of quotes exposed, even if they are genuine and popular with the public, simply because they hurt his reputation *as a Republican* in the media. Reputations may indeed have polarity; it just is not as invariant as the polarity inherent in events in other contexts.
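
To make the contrast concrete, here is a rough Python sketch of the two situations. Everything in it is an assumption made for illustration: the tiny lexicons, the scores, the `party_a`/`party_b` and `policy_x` labels, and the function names are all hypothetical and not taken from any real system. The first function shows why “plummeting” inflation trips up a word-level lexicon; the second shows polarity that depends on who is being talked about.

```python
# Toy lexicons, invented for this sketch; words and scores are illustrative only.
DIRECTION = {"plummeting": -1, "falling": -1, "soaring": 1, "rising": 1}
# Is more of this thing generally welcome (+1) or unwelcome (-1)?
DESIRABILITY = {"inflation": -1, "unemployment": -1, "profits": 1, "sales": 1}


def phrase_polarity(modifier, noun):
    """Polarity of 'modifier noun': direction of change times desirability of the noun."""
    return DIRECTION[modifier.lower()] * DESIRABILITY[noun.lower()]


# Hypothetical table of the stance each group is expected to hold on a topic.
EXPECTED_STANCE = {
    ("party_a", "policy_x"): -1,
    ("party_b", "policy_x"): 1,
}


def reputation_polarity(group, topic, quote_stance):
    """+1 if a quote matches the stance expected of the group, -1 if it contradicts it."""
    expected = EXPECTED_STANCE[(group, topic)]
    return 1 if quote_stance == expected else -1


if __name__ == "__main__":
    # A word-level lexicon would score "plummeting" as negative in both phrases.
    print(phrase_polarity("plummeting", "inflation"))
    print(phrase_polarity("plummeting", "profits"))
    # The same supportive quote (+1) about policy_x, attributed to two different groups.
    print(reputation_polarity("party_a", "policy_x", 1))
    print(reputation_polarity("party_b", "policy_x", 1))
```

In this toy setup, the same quote comes out negative for one group and positive for the other, which is the sense in which reputation polarity is not invariant. A real system would of course need far more than a lookup table.
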
I admit I don’t quite have my head around this yet, but I am thinking it over ahead of the symposium and wonder whether others are thinking about it too. I originally did not imagine that online communications would vary so much in how content is structured, but I have been surprised.