Learn more about fake reviews, negative reviews, and the power of social content in our ongoing series on user-generated content.
The dark art of rigging reviews is widespread across the web and has even ensnared several content26 clients over the years. The temptation to post fake reviews is high for companies that sell products online. The going rate for a fake five-star review seems to be about $5. Think about it: a mere $500 could pay for 100 five-star reviews on Amazon or Yelp.
But the risks of engaging in this unethical practice are high as well. A recent New York Times investigation into paid reviews by VIP Deals led to the company being delisted from Amazon. When I read the article, I was fascinated by comments from researcher Bing Liu on the methods computer science sleuths are using to detect fake reviews. I recently caught up with Liu, a professor at the University of Illinois at Chicago, to find out more about the elaborate algorithms being used to track down fraudulent reviews. As Bing Liu notes, fake reviews represent an online arms race and review spammers are becoming increasingly clever at avoiding detection. In the end, the best advice anyone can give is "buyer beware."
Content Ping: What is opinion mining? And how is it being used in e-commerce?
Bing Liu: Opinion mining, also called sentiment analysis, is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active research areas in natural language processing and is also widely studied in text mining, data mining, and web mining. In fact, this research has spread outside of computer science to the management sciences and social sciences due to its importance to business and society as a whole.
As for its application and use, opinion mining is basically everywhere because opinions are central to almost all human activities and are key influencers of our behaviors. Whenever we need to make a decision, we want to know others' opinions. In the business world, companies always want to find consumer or public opinions about their products and services. Currently, opinion mining is most commonly used in marketing, branding, and product improvements.
Content Ping: How can opinion mining be used to detect fake product reviews?
Bing Liu: Detecting fake reviews is a sub-area of opinion mining. To detect fake reviews, researchers and companies have built detection models using linguistic features (or signals) from the review text content, and meta-data features such as the star rating, user ID of the reviewer, the time when the review was posted, the host IP address, MAC address of the reviewer's computer, the geo-location of the reviewer, etc. Product and sales information is also useful in many cases. For example, if a product is not selling well, but has a large number of positive reviews, these reviews are clearly suspicious.
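Editor's note: the sales-versus-reviews signal Liu describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method any particular site uses. The `ProductStats` fields, the `is_suspicious` function, and the `max_ratio` and `min_reviews` thresholds are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ProductStats:
    """Hypothetical per-product summary; field names are assumptions."""
    product_id: str
    units_sold: int
    positive_reviews: int  # e.g. reviews with 4 or 5 stars


def is_suspicious(stats, max_ratio=0.5, min_reviews=10):
    """Flag a product with many positive reviews relative to its sales.

    max_ratio and min_reviews are illustrative thresholds, not tuned values.
    """
    # Too few reviews to say anything either way.
    if stats.positive_reviews < min_reviews:
        return False
    # No sales at all but plenty of praise: flag it (also avoids dividing by zero).
    if stats.units_sold == 0:
        return True
    return stats.positive_reviews / stats.units_sold > max_ratio


products = [
    ProductStats("A", units_sold=1000, positive_reviews=40),
    ProductStats("B", units_sold=20, positive_reviews=35),
    ProductStats("C", units_sold=0, positive_reviews=12),
    ProductStats("D", units_sold=5, positive_reviews=3),
]
flagged = [p.product_id for p in products if is_suspicious(p)]
print(flagged)
```

A real detection model would treat this ratio as one feature among many (star rating, posting time, reviewer ID, and so on) rather than as a verdict on its own.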
Content Ping: You've noted that deceptive user reviews, or "opinion spamming" as it's known, are commonplace on e-commerce sites such as Amazon. Just how widespread is it?
Bing Liu: It is very widespread and is on every website, but it is hard to pinpoint a specific number or percentage, as deceptive or fake reviews are very hard to recognize. As opinions are increasingly used by individuals and businesses for making purchase decisions, choices for elections, and choices for marketing and product design, positive opinions often mean profits and fame for businesses and individuals. This is a very strong incentive for people to game the system by posting fake opinions or reviews to promote or discredit certain products, services, organizations, and individuals.
Content Ping: Has opinion spamming begun to erode consumer confidence in the web?
Bing Liu: To some extent, but a large number of consumers still have not realized that many reviews on the web are fake. The good news is that almost all review hosting sites are actively dealing with this problem. It is getting harder and harder for imposters to post fake reviews.
Content Ping: So you think e-commerce sites are doing a good job of identifying and removing deceptive product reviews? And are they using opinion mining techniques to do it?
Bing Liu: I do not have hard evidence. From my interactions with a few companies, I believe some sites are quite successful in detecting spam reviews. They are using different opinion mining, data mining, and machine learning algorithms.
Content Ping: What is the most common type of opinion spamming? And how does it work?
Bing Liu: There are many types of spamming depending on the nature of the products and services. For example, an author who wrote a book may write a fake review himself or herself to promote the book and also ask friends and family to write reviews. The publishers help too. Perhaps the most common, and probably the most damaging, type of opinion spamming is when manufacturers and retailers hire people (even professionals) to write fake reviews on different websites in order to promote their products. There are now companies in the business of writing fake reviews for their clients. Their writing strategies and techniques are also evolving.
Content Ping: Who hires these spammers?
Bing Liu: Retailers, manufacturers, and businesses in general.
Content Ping: Who are the spammers? College students? Midwestern housewives?
Bing Liu: I have not personally heard of housewives doing it, but yes, college students and also IT professionals, mostly from developing countries. Of course, people in businesses are writing for themselves too.
Content Ping: What is some of the most interesting ongoing research in the area of identifying deceptive reviews?
Bing Liu: Researchers are detecting different types of spam using different algorithms. All of them are very interesting. Perhaps the most interesting research direction is the integration of behavioral patterns of reviewers, as well as the psycholinguistics, deception and stylistic signals in the text for building detection models.
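Editor's note: to make the idea of integrating behavioral and textual signals concrete, here is a minimal Python sketch that merges both kinds of features into a single record a detection model could consume. The specific features (first-person pronoun ratio, exclamation marks per word, reviews posted on the same day, share of five-star ratings) and all names such as `Review` and `combined_features` are assumptions made for illustration. They are not taken from Liu's research.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    """Hypothetical review record; field names are assumptions."""
    reviewer_id: str
    posted: date
    stars: int
    text: str


FIRST_PERSON = {"i", "me", "my", "mine", "we", "our", "us"}


def text_features(text):
    """Simple stylistic signals computed from the review text alone."""
    words = re.findall(r"[a-z']+", text.lower())
    n = max(len(words), 1)  # guard against empty reviews
    return {
        "first_person_ratio": sum(w in FIRST_PERSON for w in words) / n,
        "exclamations_per_word": text.count("!") / n,
    }


def behavior_features(review, history):
    """Signals computed from the reviewer's posting history."""
    same_day = sum(1 for r in history if r.posted == review.posted)
    five_star = sum(1 for r in history if r.stars == 5)
    return {
        "reviews_same_day": same_day,
        "five_star_share": five_star / max(len(history), 1),
    }


def combined_features(review, all_reviews):
    """Merge text and behavioral signals into one feature dictionary."""
    history = [r for r in all_reviews if r.reviewer_id == review.reviewer_id]
    return {**text_features(review.text), **behavior_features(review, history)}


reviews = [
    Review("u1", date(2012, 1, 1), 5, "I love it! My best buy!"),
    Review("u1", date(2012, 1, 1), 5, "Great!"),
    Review("u2", date(2012, 1, 2), 3, "It works."),
]
features = combined_features(reviews[0], reviews)
print(features)
```

In practice, a feature dictionary like this would be fed to a trained classifier. The sketch stops at feature extraction because the interview does not specify a model.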
Content Ping: Do you think deceptive reviews will remain a problem five years from now? Or will they be so easy to detect that it no longer pays?
Bing Liu: I think it will be a problem for a long time because the incentive is very strong. It is a hidden form of advertising and marketing, and it is extremely cheap. One cannot imagine doing an advertising campaign with $500 in the traditional media, but in social media it is easily done. However, as the detection algorithms are becoming more mature and sophisticated, it will become harder to spam, so the number of fake reviews will decrease. This is an arms race.
Bing Liu is a professor of computer science at the University of Illinois at Chicago (UIC). He received his PhD in artificial intelligence from the University of Edinburgh. His research interests include opinion mining and sentiment analysis, opinion spam (e.g., fake reviews) detection, web mining, and data mining.
Liu has served as technical program chair of many data mining conferences, including KDD, ICDM, WSDM, SDM, CIKM, and PAKDD, and is on the editorial boards of several leading journals.