Implementing a Web Browser with Phishing Detection Techniques
World of Computer Science and Information Technology Journal (WCSIT) ISSN: 2221-0741 Vol. 1, No. 7, 289-291, 2011
Aanchal Jain
Department of Computer Science and Engineering, Lakshmi Naraian College of Technology, Bhopal (M.P.), India
Prof. Vineet Richariya
Department of Computer Science and Engineering, Lakshmi Naraian College of Technology, Bhopal (M.P.), India
Abstract—Phishing is the combination of social engineering and technical exploits designed to convince a victim to provide personal information, usually for the monetary gain of the attacker. Phishing has become the most popular practice among criminals on the Web, and phishing attacks are becoming more frequent and sophisticated. The impact of phishing is significant, since it carries the risk of identity theft and financial loss. Phishing scams have become a problem for online banking and e-commerce users. In this paper we propose a novel approach to detect phishing attacks. We implemented a prototype web browser that acts as an agent and checks each arriving email for phishing attacks. Using email data collected over a period of time, we demonstrate that our approach is able to detect more phishing attacks than existing schemes.
Keywords- Phishing detection; Web browser.
I. INTRODUCTION
The Internet is playing an increasingly significant role in today’s commerce and business activities. Unfortunately, poor security on the Internet and large financial gains provide a strong motivation for attackers to perpetrate such seemingly low-risk, yet high-return online scams. Email messages are not protected as they move across the Internet. The information being transmitted is often valuable and sensitive, so effective protection mechanisms are desirable to prevent information from being manipulated and to protect confidential information from being revealed to unauthorized parties.
Phishing can be classified into different types of attacks depending on the channel of proliferation. These include malware, phishing emails, bogus Web sites, and identity theft. Malware is an application with malicious code that is distributed to the public via email or malicious Web sites. When victims access phishing emails or phishing Web sites, there is a chance that malware will be installed on the host computer and will surreptitiously steal the customer's personal information [2]. Phishing email is another common type of phishing attack, in which phishers send out fraudulent emails impersonating genuine electronic service providers and ask victims to give away personal information or lead them to bogus Web sites. The bogus Web sites look similar to the Web sites of genuine electronic service providers. Once the victims log onto these Web sites, their personal information is recorded by the adversaries. Identity theft is the attack in which phishers use the credentials of the victims to gain control of their accounts and transfer funds out of them. Filtering phishing emails using classification techniques is able to control the problem in a variety of ways [14]. Detecting and blocking phishing emails in the email delivery system allows end users to regain a useful means of communication.
Many different approaches for fighting phishing emails have been proposed. A promising approach is the use of content-based filters, capable of distinguishing phishing from legitimate email messages automatically. Much research on content-based email classification has centered on the more sophisticated classifier-related issues. The success of machine learning techniques in text categorization has led researchers to explore learning algorithms in email classification [1]. Unlike most text categorization tasks, the cost of misclassification is heavily skewed. In order to address the growing problem, users and organizations analyze the available tools to determine how best to counter phishing in their environment [5,6]. However, it is striking that despite the increasing development of anti-phishing services and technologies, the number of phishing email messages continues to increase rapidly [5]. In this paper, we study the common practices involved in phishing attacks and review some anti-phishing solutions. We then focus on an approach that we have developed to detect and prevent phishing.
II. BACKGROUND
Looking at the fact that phishing scammers are reaping enormous financial gains, it can easily be concluded that the motivation behind phishing is almost always financial. Although financial gain is the major motivating factor for
III. METHOD
In our implementation, we experimented with link-related features:
1. visible_links: the total number of links in an email.
2. invisible_links: the total number of invisible links. This feature is calculated by an algorithm that follows the visibility standard provided by the W3C. In particular, if the color difference between the background and the font of a link in an email is less than 500, the link is considered an invisible link.
3. Unmatching_urls: a binary value that shows whether the visible URL is the same as the hidden URL.
After defining the features that we want to look for in our algorithm, we developed a set of methods to extract all three features from each email. The values of all features are numerical but lie in different ranges. If we find the values of “invisible_links” and “unmatching_urls” to be nonzero, we consider the given email a possible phishing attack. The proposed method is illustrated in Figure 1.
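The three link features described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of our prototype: the color difference is assumed to be the sum of the absolute per-channel RGB differences (the formula given in the W3C accessibility evaluation techniques), colors are read only from inline "#rrggbb" style values, the page background is assumed to be white when none is given, and URLs are compared by host name. All function and class names are hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit

HEX_COLOR = re.compile(r"#([0-9a-fA-F]{6})")
URL_LIKE = re.compile(r"(https?://|www\.)\S+$", re.IGNORECASE)


def parse_hex(color):
    """Turn '#rrggbb' into an (r, g, b) tuple; return None for other formats."""
    m = HEX_COLOR.fullmatch(color.strip())
    if not m:
        return None
    value = m.group(1)
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def color_difference(c1, c2):
    """Assumed formula: sum of absolute per-channel differences (0 to 765)."""
    return sum(abs(a - b) for a, b in zip(c1, c2))


def style_color(style, prop):
    """Read a '#rrggbb' value for `prop` from an inline style string."""
    pattern = r"(?:^|;)\s*" + re.escape(prop) + r"\s*:\s*([^;]+)"
    m = re.search(pattern, style, re.IGNORECASE)
    return parse_hex(m.group(1)) if m else None


def host_of(url):
    """Lower-cased host name of a URL (a scheme is assumed if missing)."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()


class LinkFeatureParser(HTMLParser):
    """Collects the three link features while walking the email's HTML."""

    def __init__(self, background=(255, 255, 255)):
        super().__init__()
        self.background = background   # assumed default: white page
        self.total_links = 0
        self.invisible_links = 0
        self.unmatching_urls = 0       # binary flag, as in the feature list
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        self._href = attrs.get("href") or ""
        self._text = []
        self.total_links += 1
        style = attrs.get("style") or ""
        font = style_color(style, "color")
        back = style_color(style, "background-color") or self.background
        # Threshold of 500 taken from the feature definition above.
        if font is not None and color_difference(font, back) < 500:
            self.invisible_links += 1

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        text = "".join(self._text).strip()
        # Only link text that itself looks like a URL can "unmatch".
        if URL_LIKE.match(text) and host_of(text) != host_of(self._href):
            self.unmatching_urls = 1
        self._href = None


def extract_link_features(html_body):
    """Return the three link features for one HTML email body."""
    parser = LinkFeatureParser()
    parser.feed(html_body)
    parser.close()
    return {
        "visible_links": parser.total_links,
        "invisible_links": parser.invisible_links,
        "unmatching_urls": parser.unmatching_urls,
    }
```

Links without an inline font color are treated as visible in this sketch; a fuller implementation would also need to resolve colors inherited from style sheets and enclosing elements.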
Figure 1: Flowchart of proposed method
IV. IMPLEMENTATION
Our prototype currently includes a C
1. The user opens the web browser and opens the email in it.
2. Before the email is opened, it is scanned by the backend phishing detection engine.
3. The “visible_links” are extracted from the email body.
4. The “invisible_links” are extracted from the email body.
5. The “Unmatching_urls” are extracted from the email body.
6. If the count of “Unmatching_urls” or “invisible_links” is greater than 0:
a. Prompt the user that this could be a phishing attack.
b. Advise the user to delete the email.
7. Else:
a. Open the email normally.
b. The status bar shows that the email is marked as safe by our proposed phishing detection engine.
The prototype web browser displays a potential phishing attack as shown in Figure 2.
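The decision in steps 6 and 7 can be sketched in Python as follows. This is an illustrative sketch only, not the code of our prototype; the class LinkFeatures, the function scan_email and the message strings are hypothetical names introduced here, and the feature values are assumed to have been extracted already in steps 3 to 5.

```python
from dataclasses import dataclass


@dataclass
class LinkFeatures:
    """Feature values produced by steps 3 to 5 (hypothetical container)."""
    visible_links: int
    invisible_links: int
    unmatching_urls: int


def scan_email(features):
    """Apply steps 6 and 7: return (is_suspicious, message for the user)."""
    if features.unmatching_urls > 0 or features.invisible_links > 0:
        # Step 6: warn the user and advise deletion.
        return True, ("This email could be a phishing attack. "
                      "It is advisable to delete it.")
    # Step 7: open normally and report the result in the status bar.
    return False, "Marked as safe by the phishing detection engine."


if __name__ == "__main__":
    suspicious, message = scan_email(LinkFeatures(3, 1, 0))
    print(suspicious, message)
```

Note that visible_links is extracted but does not influence the decision in the steps as listed; it is kept in the container so that all three features travel together.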
Figure 2: Screenshot of web browser with our method
V. CONCLUSION
In this paper, we have presented an approach to detect phishing emails using link-based features. The contribution of the work mainly consists of the use of three features: visible links, invisible links and unmatched URLs. Used in conjunction with the proposed prototype web browser, the proposed algorithm notifies the user of possible phishing attacks and helps prevent them from opening suspicious websites.
ACKNOWLEDGMENT
We would like to thank the staff of LNCT, Bhopal, for the supportive environment and for the opportunity and facilities to carry out this work.
REFERENCES
[1] J. Zhang et al., “Modified logistic regression: An approximation to SVM and its applications in large-scale text categorization,” in Proceedings of the 20th International Conference on Machine Learning. AAAI Press, pp. 888–895, 2003.
[2] I. Bose and A. C. M. Leung, “Unveiling the mask of phishing: Threats, preventive measures, and responsibilities,” Communications of the Association for Information Systems, vol. 19, pp. 544–566, 2007.
[3] E. Kirda and C. Kruegel, “Protecting users against phishing attacks,” The Computer Journal, 2005.
[4] E. Kirda and C. Kruegel, “Protecting users against phishing attacks,” The Computer Journal, 2005.
[5] B. Ross, C. Jackson, N. Miyake, D. Boneh, and J. Mitchell, “A browser plug-in solution to the unique password problem,” http://crypto.stanford.edu/PwdHash/, 2005.
[7] L. Wenyin, G. Huang, L. Xiaoyue, Z. Min, and X. Deng, “Detection of phishing webpages based on visual similarity.”
[8] J. R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, pp. 81–106, 1986.
[9] J. R. Quinlan, “Improved use of continuous attributes in C4.5,” Journal of Artificial Intelligence Research, vol. 4, pp. 77–90, 1996.
[10] I. H. Witten and E. Frank, Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann, 2005.
[11] T. M. Mitchell, “Machine learning,” 1997.
[12] M. Dash and H. Liu, “Feature selection for classification,” 1997.
[13] D. Boneh, “Spoofguard,” http://crypto.stanford.edu/SpoofGuard/, Tech. Rep.
[14]
AANCHAL JAIN
Completed a Bachelor of Engineering in Information Technology at RKDF Institute of Science and Technology, Rajiv Gandhi University, Bhopal (M.P.), India, in 2009. Final-year student in the Master of Technology program in Software Engineering (Dec’11) at Lakshmi Naraian College of Technology, Rajiv Gandhi University, Bhopal (M.P.), India.
Vineet Richariya