K.I.T.’s College of Engineering (Autonomous), Kolhapur. Page 1 of 11

Student Signature: Guide Signature:
Name of Student: Name of Guide :
Date: Date:
Following is the template approved for M. Tech. synopsis submission work:

SYNOPSIS OF M. TECH. DISSERTATION PROGRAM NAME

1. Name of the College : K.I.T.’S College of Engineering, Kolhapur
2. Name of the Course : M. Tech. (Program )
3. Name of the Student : Miss Vaishnavi Pravin Kshirsagar
4. Date of Admission : 18/08/201
5. PRN Number : 1718100043
6. Name of the Guide : Mrs. D.K.Jadhav
Professor, Department of Computer Science & Engineering
K.I.T.’s College of Engineering, Kolhapur.
7. Name of the Co-Guide : –

8. Proposed Title : NetSpam: a Network-based Spam Detection Framework for
Reviews in Online Social Media
9. Type of project : Non-sponsored
10. Name of industry/ : –
Organization

11.1 Relevance:
Social media plays an important role in daily life. Online social media portals in
particular play an influential role in information propagation: they are an important
platform for sellers running advertising campaigns, and they help customers select
products and services. In recent years, people have come to rely heavily on written
reviews in their decision-making, with positive and negative reviews encouraging or
discouraging them in their selection of products and services. Written reviews also help
service providers improve the quality of their products and services. These reviews have
thus become an important factor in the success of a business: positive reviews can bring
benefits to a company, whereas negative reviews can damage its credibility and cause
economic losses.
Companies treat reviews as product feedback. Reviews come in two forms: 1) text
reviews and 2) ratings. Because anyone, under any identity, can write a review, the review
system offers a tempting opportunity for spammers, who write fake reviews designed to
mislead users' opinions. Since people base their purchase choices on online store reviews,
the review system has become a target for spammers, who are usually hired or enticed by
companies to write fake reviews that promote their own products and services and/or draw
customers away from their competitors.

11.2 Literature review:
Reviews are increasingly used by individuals and organizations to make purchase
and business decisions [1]. Reviews written to change users' perception of how good a
product or service is are considered spam [5]. One approach is a classifier that calculates
feature weights showing each feature's level of importance in determining spam reviews.
The general concept of the proposed framework is to model a given review dataset as a
Heterogeneous Information Network (HIN) and to map the problem of spam detection onto
a HIN classification problem [4].
There are generally three types of spam reviews:
1) Untruthful opinion spam
2) Reviews on brands only
3) Non-reviews.
Spam detection can be regarded as a classification problem with two classes, spam and
non-spam [6].
There are three main types of information related to a review:
1) The content of review,
2) The reviewer who wrote the review,
3) The product being reviewed.
There are three types of features:
1) Review centric features,
2) Reviewer centric features,
3) Product centric features [9].
These features are further classified as behavioral and linguistic features.
Content spam adds irrelevant or remotely relevant words to target pages to fool
search engines into ranking those pages highly [7]. Review data also contains a large
number of duplicate and near-duplicate reviews, which are detected using a
machine learning algorithm [8].
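
The synopsis does not name the algorithm used for near-duplicate detection. As a minimal sketch only, the example below swaps in a simple Jaccard-similarity check over word shingles; the function names, the shingle size `k` and the `threshold` value are illustrative assumptions, not part of the cited work.

```python
def shingles(text, k=2):
    """Return the set of k-word shingles of a review text (k is an assumed setting)."""
    words = text.lower().split()
    if len(words) < k:
        # Very short texts are treated as a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.7):
    """Flag two reviews as near-duplicates when their shingle sets overlap enough.

    The threshold is a hypothetical value and would need tuning on real data.
    """
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold
```

Two reviews that differ only in their last word share most of their shingles and would be flagged, while unrelated reviews share none.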

11.3 Problem definition:
To develop a software system that organizes user reviews on the basis of behavioral
and linguistic features, implements a generic graph-based algorithm to determine the weights of
these features, classifies test reviews into spam and non-spam categories, and tests and
analyzes the performance against standard benchmarks.

11.4 Objectives:
Based on the literature review, the following research gaps are identified:

1. It is hard to identify a singleton review as spam or non-spam [4].
2. Classifying users is difficult because one user may have more than one account [6].
3. Reviews given in the form of (star) ratings are difficult to recognize as fake [10].
4. A truthful review written by a spammer is not classified as a spam review [11].

Considering the stated research gaps, the following objectives are defined for the proposed study:
1. To organize user reviews on the basis of behavioral and linguistic features.
2. To implement a generic graph-based algorithm to determine the weights of features.
3. To classify test reviews into spam and non-spam categories.
4. To test and analyze the performance against standard benchmarks.

11.5 Methodology:
The proposed methodology is described in four phases:
i. Prior Knowledge
ii. Network Schema Definition
iii. Metapath Definition and Creation
iv. Classification

1) Prior Knowledge:
This phase computes the probability of a review being spam. The proposed framework
works in two versions; the one described here is:

• Unsupervised Learning
In the unsupervised learning method, the initial probability of a review being
spam is calculated according to each feature in the set of features.
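
As an illustration of the prior-knowledge step, the sketch below combines the per-feature spam probabilities of one review into a single initial probability. Averaging over the features is an assumption made for this example (the synopsis does not state how the per-feature values are combined), and the function and feature names are hypothetical.

```python
def unsupervised_prior(feature_probs):
    """Initial spam probability of one review in the unsupervised setting.

    feature_probs maps a feature name to the probability (in [0, 1]) that the
    review is spam according to that feature. The mean over all features is
    used as the prior; this combination rule is an assumption of the sketch.
    """
    if not feature_probs:
        raise ValueError("at least one feature probability is required")
    for name, prob in feature_probs.items():
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability for {name!r} must lie in [0, 1]")
    return sum(feature_probs.values()) / len(feature_probs)
```

For example, a review scored 0.2 by one feature and 0.8 by another would start with a prior of 0.5.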

2) Network Schema Definition:
1) The list of spam features, which determines the features involved in spam
detection, is used to design the network schema.
2) The metapath is calculated in this phase.

3) Metapath Definition and Creation:

1) A metapath is a sequence of relations in the network schema. The path is established using
the features used in the framework.
2) The level of spam certainty of each review along each metapath (that is, for each feature) is
calculated in this phase:
$m_u^{p_l} = \frac{\lfloor s \times f(x_{lu}) \rfloor}{s}$
where s = number of levels of spamicity,
f(x_{lu}) = probability of review u being spam according to feature l,
m_u^{p_l} = level of spam certainty of review u for the metapath p_l of feature l.
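
The level computation can be sketched as follows, assuming the level of spam certainty is obtained by quantizing the feature probability f(x_lu) into s levels, that is floor(s × f(x_lu)) / s. The function name `metapath_value` is hypothetical.

```python
import math


def metapath_value(prob, s):
    """Quantize a per-feature spam probability into one of s discrete levels.

    prob is f(x_lu), the probability that review u is spam according to
    feature l; s is the number of levels. Returns floor(s * prob) / s.
    """
    if s <= 0:
        raise ValueError("s must be a positive integer")
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    return math.floor(s * prob) / s
```

With s = 10, probabilities between 0.7 and 0.8 fall into the 0.7 level, so a larger s separates reviews more finely.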

3) After the levels of spam certainty have been computed for all reviews and metapaths, two
reviews with the same value for a given feature's metapath are connected, and the link is
added to the review network.
4) In the next step, using a higher number of levels increases the number of each feature's
metapaths. Reviews can be connected to each other through these features.
5) The spamicity of the review with the maximum number of levels is calculated.
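
The linking rule in step 3 can be illustrated with a small sketch: reviews that share the same level for a feature are grouped, and every pair in a group becomes a link in the review network. The data layout (a dictionary of per-review levels, edges as tuples) and the function name are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_review_network(levels):
    """Connect reviews that share the same metapath value for a feature.

    levels maps review id -> {feature name -> level of spam certainty}.
    Returns {feature name -> list of (review_a, review_b, level)} edges.
    """
    # Group review ids by (feature, level).
    buckets = defaultdict(list)
    for review_id, per_feature in levels.items():
        for feature, value in per_feature.items():
            buckets[(feature, value)].append(review_id)

    # Every pair of reviews in the same bucket is linked through that feature.
    edges = defaultdict(list)
    for (feature, value), members in buckets.items():
        for a, b in combinations(sorted(members), 2):
            edges[feature].append((a, b, value))
    return dict(edges)
```

Grouping first keeps the pairing local to each bucket, so reviews with different levels are never compared.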

4) Classification:
It consists of two steps:
1) Weight calculation, which determines the importance of each spam feature in spotting spam
reviews.
2) Labeling, which calculates the final probability of each review being spam.
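
The synopsis does not give formulas for the two classification steps. The sketch below shows one plausible formulation under stated assumptions: a feature's weight is the link-weighted agreement of the prior spam probabilities of the reviews it connects, and a review's final probability is the average, over its linked pairs, of 1 minus the product of (1 - level × weight). The input layout (edges as `(review_a, review_b, level)` tuples per feature, priors per review) and all names are hypothetical.

```python
from collections import defaultdict


def feature_weights(edges, prior):
    """Step 1: weight of each feature from the priors of the reviews it links.

    edges: {feature -> [(review_a, review_b, level), ...]}
    prior: {review id -> initial spam probability}
    """
    weights = {}
    for feature, links in edges.items():
        total = sum(level for _, _, level in links)
        agree = sum(level * prior[a] * prior[b] for a, b, level in links)
        weights[feature] = agree / total if total > 0 else 0.0
    return weights


def label_reviews(edges, weights):
    """Step 2: final spam probability of each linked review."""
    # Probability that a linked pair is NOT spam, multiplied over all features.
    not_spam = defaultdict(lambda: 1.0)
    for feature, links in edges.items():
        for a, b, level in links:
            pair = tuple(sorted((a, b)))
            not_spam[pair] *= 1.0 - level * weights[feature]

    # Average the pair probabilities over every pair a review takes part in.
    per_review = defaultdict(list)
    for (a, b), p in not_spam.items():
        per_review[a].append(1.0 - p)
        per_review[b].append(1.0 - p)
    return {r: sum(ps) / len(ps) for r, ps in per_review.items()}
```

Reviews with no links receive no label in this sketch; how such singleton reviews should be handled is left open.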

11.6 Activity chart:
Month    Activity                                                  Days
Jul-18   Project Kickoff                                           5
Aug-18   Data Gathering                                            15
Sep-18   Extraction of feature set and gathering review data set   25
Oct-18   Design                                                    27
Nov-18   Development                                               30
Dec-18   Compute prior knowledge                                   35
Jan-19   Development                                               15
Feb-19   Compute network schema                                    25
Mar-19   Metapath creation                                         30
Apr-19   Classification                                            30
May-19   Development                                               25
Jun-19   Quality Assurance                                         25
Jul-19   Roll out and Maintenance                                  10

11.7 Cost estimation: 20000
11.8 Resources required: NetBeans, Xampp

Signature of student Signature of Guide

References:
1. J. Donfro. A whopping 20% of Yelp reviews are fake.
http://www.businessinsider.com/20-percent-of-yelp-reviews-fake-2013-9. Accessed: 2015-07-30.
2. M. Ott, C. Cardie, and J. T. Hancock. Estimating the prevalence of deception in online review
communities. In ACM WWW, 2012.
3. M. Ott, Y. Choi, C. Cardie, and J. T. Hancock. Finding deceptive opinion spam by any stretch of
the imagination. In ACL, 2011.
4. Ch. Xu and J. Zhang. Combating product review spam campaigns via multiple heterogeneous
pairwise features. In SIAM International Conference on Data Mining, 2014.
5. N. Jindal and B. Liu. Opinion spam and analysis. In WSDM, 2008.
6. F. Li, M. Huang, Y. Yang, and X. Zhu. Learning to identify review spam. In Proceedings of the 22nd
International Joint Conference on Artificial Intelligence (IJCAI), 2011.
7. G. Fei, A. Mukherjee, B. Liu, M. Hsu, M. Castellanos, and R. Ghosh. Exploiting burstiness in
reviews for review spammer detection. In ICWSM, 2013.
8. A. J. Minnich, N. Chavoshi, A. Mueen, S. Luan, and M. Faloutsos. TrueView: Harnessing the power
of multiple review sites. In ACM WWW, 2015.
9. B. Viswanath, M. Ahmad Bashir, M. Crovella, S. Guha, K. P. Gummadi, B. Krishnamurthy, and
A. Mislove. Towards detecting anomalous user behavior in online social networks. In USENIX,
2014.
10. H. Li, Z. Chen, B. Liu, X. Wei, and J. Shao. Spotting fake reviews via collective PU learning. In
ICDM, 2014.