Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Yuta Takata is active.

Publication


Featured research published by Yuta Takata.


computer software and applications conference | 2015

MineSpider: Extracting URLs from Environment-Dependent Drive-by Download Attacks

Yuta Takata; Mitsuaki Akiyama; Takeshi Yagi; Takeo Hariu; Shigeki Goto

Drive-by download attacks force users to automatically download and install malware by redirecting them to malicious URLs that exploit vulnerabilities in the user's web browser. Attackers profile information on the user's environment, such as the name and version of the browser and browser plugins, and launch a drive-by download attack on only certain targets by changing the destination URL. Malicious-content detection and collection techniques such as honeyclients cannot detect the attack when they do not match the specific environment of the attack target, because they are never redirected. We propose a method that exhaustively analyzes JavaScript code relevant to redirections and extracts the destination URLs in that code. Our method facilitates the detection of attacks by extracting a large number of URLs while controlling the analysis overhead by excluding code not relevant to redirections. We implemented our method in a browser emulator called MineSpider that automatically extracts potential URLs from websites, and validated it using communication data with malicious websites captured over a three-year period. The experimental results demonstrate that MineSpider extracted, in a few seconds per website, 30,000 new URLs that existing techniques missed.
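MineSpider itself analyzes redirection-relevant code paths in an emulator; as a far simpler illustration of the extraction idea, the sketch below merely scans JavaScript source for URL-like string literals with a regular expression (the function name and sample snippet are invented for illustration):

```python
import re

# Naive illustration only: MineSpider evaluates redirection code paths,
# whereas this just scans string literals for URL-like patterns.
URL_PATTERN = re.compile(r"""["'](https?://[^"']+)["']""")

def extract_candidate_urls(js_source: str) -> list[str]:
    """Return URL-like string literals found in JavaScript source."""
    return URL_PATTERN.findall(js_source)

sample = ('if (navigator.userAgent.indexOf("MSIE") > -1) '
          '{ location.href = "http://example.com/exploit"; }')
print(extract_candidate_urls(sample))  # ['http://example.com/exploit']
```

A static scan like this misses URLs built at runtime by string concatenation, which is exactly the gap the paper's path-exhaustive analysis addresses.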


Proceedings of the Asia-Pacific Advanced Network | 2011

Analysis of Redirection Caused by Web-based Malware

Yuta Takata; Shigeki Goto; Tatsuya Mori

Web-based malicious software (malware) has been increasing over the Internet. It poses threats to computer users through websites. Computers are infected with web-based malware via drive-by download attacks, which force users to download and install the malware without being aware of it. These attacks evade detection by automatically redirecting users across various websites. Detecting them is difficult because each redirection uses obfuscation techniques. This paper analyzes the HTTP communication data of drive-by download attacks. The results show significant features of malicious redirections that can be used effectively to detect malware.
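One basic step in analyzing such HTTP communication data is reconstructing the redirect chain from individual responses. The sketch below is illustrative only (not the paper's analysis) and assumes a simplified log of `(url, status, location_header)` records with invented example URLs:

```python
# Illustrative sketch: follow HTTP 3xx Location headers through a list
# of simplified (url, status, location_header) log records.
def redirect_chain(records):
    """Return the chain of URLs reached via 3xx redirections."""
    by_url = {url: (status, loc) for url, status, loc in records}
    chain, url = [], records[0][0]
    while url in by_url and url not in chain:
        chain.append(url)
        status, loc = by_url[url]
        if not (300 <= status < 400 and loc):
            break  # terminal response: no further redirection
        url = loc
    return chain

log = [
    ("http://landing.example/", 302, "http://hop.example/r"),
    ("http://hop.example/r", 302, "http://exploit.example/x"),
    ("http://exploit.example/x", 200, None),
]
print(redirect_chain(log))
```

Real drive-by chains also use JavaScript- and iframe-based redirections, which never appear as Location headers; handling those is part of what makes the analysis hard.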


international conference on communications | 2017

Malicious URL sequence detection using event de-noising convolutional neural network

Toshiki Shibahara; Kohei Yamanishi; Yuta Takata; Daiki Chiba; Mitsuaki Akiyama; Takeshi Yagi; Yuichi Ohsita; Masayuki Murata

Attackers have increased the number of infected hosts by redirecting users of compromised popular websites toward websites that exploit vulnerabilities of a browser and its plugins. To prevent damage, detecting infected hosts from proxy logs, which are generally recorded on enterprise networks, is gaining attention over blacklist-based filtering, because creating blacklists has become difficult due to the short lifetime of malicious domains and the concealment of exploit code. Since the information extracted from one URL is limited, we focus on a sequence of URLs that includes artifacts of malicious redirections. We propose a system for detecting malicious URL sequences from proxy logs with a low false positive rate. To elucidate an effective approach to malicious URL sequence detection, we compared three approaches: an individual-based approach, a convolutional neural network (CNN), and our newly developed event de-noising CNN (EDCNN). The EDCNN is a new CNN that reduces the negative effect of benign URLs redirected from compromised websites included in malicious URL sequences. Our evaluation shows that the EDCNN lowers the operational cost of responding to malware infections, reducing false alerts by 47% compared with a CNN when users access compromised websites but do not obtain exploit code due to browser fingerprinting.
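Before any CNN can classify a URL sequence, each URL must be turned into a numeric vector. The sketch below is not the paper's feature set; it computes a few hand-crafted lexical features per URL (all names and example URLs are invented) of the kind such a model could consume:

```python
import re

# Illustration only: the paper feeds learned URL representations to a
# CNN; this sketch just derives simple lexical features per URL.
def url_features(url: str) -> list[float]:
    host = re.sub(r"^https?://", "", url).split("/")[0]
    path = url.split(host, 1)[-1]
    return [
        float(len(url)),                        # overall URL length
        float(host.count(".")),                 # subdomain depth
        float(sum(c.isdigit() for c in url)),   # digit count
        float(len(path)),                       # path + query length
    ]

sequence = ["http://a.b.evil.example/gate.php?id=123",
            "http://cdn.example/lib.js"]
matrix = [url_features(u) for u in sequence]
print(matrix)
```

A sequence of such vectors forms the 2-D input (URLs × features) that a convolutional layer would slide over.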


mining software repositories | 2017

Understanding the origins of mobile app vulnerabilities: a large-scale measurement study of free and paid apps

Takuya Watanabe; Mitsuaki Akiyama; Fumihiro Kanei; Eitaro Shioji; Yuta Takata; Bo Sun; Yuta Ishi; Toshiki Shibahara; Takeshi Yagi; Tatsuya Mori

This paper reports a large-scale study that aims to understand how mobile application (app) vulnerabilities are associated with software libraries. We analyze both free and paid apps. Studying paid apps was quite meaningful because it helped us understand how differences in app development/maintenance affect the vulnerabilities associated with libraries. We analyzed 30k free and paid apps collected from the official Android marketplace. Our extensive analyses revealed that approximately 70%/50% of vulnerabilities of free/paid apps stem from software libraries, particularly from third-party libraries. Somewhat paradoxically, we found that more expensive/popular paid apps tend to have more vulnerabilities. This comes from the fact that more expensive/popular paid apps tend to have more functionality, i.e., more code and libraries, which increases the probability of vulnerabilities. Based on our findings, we provide suggestions to stakeholders of mobile app distribution ecosystems.


international conference on security and privacy in communication systems | 2016

Website Forensic Investigation to Identify Evidence and Impact of Compromise

Yuta Takata; Mitsuaki Akiyama; Takeshi Yagi; Takeshi Yada; Shigeki Goto

Compromised websites that redirect users to malicious websites are often used by attackers to distribute malware. These attackers compromise popular websites and integrate them into a drive-by download attack scheme to lure unsuspecting users to malicious websites. An incident response organization such as a CSIRT contributes to preventing the spread of malware infection by analyzing compromised websites reported by users and sending abuse reports with detected URLs to webmasters. However, abuse reports containing only URLs are not sufficient to clean up the websites, so webmasters cannot respond appropriately to them. In addition, it is difficult to analyze malicious websites across different client environments, i.e., a CSIRT and a webmaster, because these websites change behavior depending on the client environment. To expedite compromised-website clean-up, it is important to provide fine-grained information such as the precise position of compromised web content, malicious URL relations, and the target range of client environments. In this paper, we propose a method of constructing a redirection graph with context, such as which web content redirects to which malicious websites. Our system analyzes a website in a multi-client environment to identify which client environments are exposed to threats. We evaluated our system using crawling datasets of approximately 2,000 compromised websites. As a result, our system successfully identified compromised web content and malicious URL relations, reducing the amount of web content and the number of URLs to be analyzed by incident responders to 0.8% and 15.0%, respectively. Furthermore, it identified the target range of client environments for 30.4% of the websites and, by leveraging that target information, a vulnerability that had been used in malicious websites. This fine-grained information would make the daily work of incident responders dramatically more efficient.
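A redirection graph with context can be modeled minimally as an adjacency dict; the hedged sketch below (not the authors' system, with invented URLs) traces which pieces of web content ultimately lead to a known-malicious URL, which is the kind of relation a fine-grained abuse report would include:

```python
# Hedged sketch: a redirection graph as an adjacency dict, used to find
# every node (piece of web content) with a path to a malicious URL.
def sources_of(graph: dict, target: str) -> set:
    """Return all nodes from which `target` is reachable."""
    reverse = {}
    for src, dsts in graph.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    found, stack = set(), [target]
    while stack:
        node = stack.pop()
        for src in reverse.get(node, ()):
            if src not in found:
                found.add(src)
                stack.append(src)
    return found

graph = {
    "http://compromised.example/index.html":
        ["http://compromised.example/injected.js"],
    "http://compromised.example/injected.js":
        ["http://malicious.example/exploit"],
}
print(sources_of(graph, "http://malicious.example/exploit"))
```

Here the backward traversal pinpoints `injected.js` as the compromised content, rather than reporting only the final exploit URL.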


computer software and applications conference | 2017

Detecting Malicious Websites by Integrating Malicious, Benign, and Compromised Redirection Subgraph Similarities

Toshiki Shibahara; Yuta Takata; Mitsuaki Akiyama; Takeshi Yagi; Takeshi Yada

To expose more users to threats of drive-by download attacks, attackers compromise vulnerable websites discovered by search engines and redirect clients to malicious websites created with exploit kits. Security researchers and vendors have tried to prevent the attacks by detecting malicious data, i.e., malicious URLs, web content, and redirections. However, attackers conceal a part of malicious data with evasion techniques to circumvent detection systems. In this paper, we propose a system for detecting malicious websites without collecting all malicious data. Even if we cannot observe a part of malicious data, we can always observe compromised websites. Since vulnerable websites are discovered by search engines, compromised websites have similar traits. Therefore, we built a classifier by leveraging not only malicious websites but also compromised websites. More precisely, we convert all websites observed at the time of access into a redirection graph and classify it by integrating similarities between its subgraphs and redirection subgraphs shared across malicious, benign, and compromised websites. As a result of evaluating our system with crawling data of 455,860 websites, we found that the system achieved a 91.7% true positive rate for malicious websites containing exploit URLs at a low false positive rate of 0.1%. Moreover, it detected 143 more evasive malicious websites than conventional systems.
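The paper integrates similarities between redirection subgraphs; as a much simpler stand-in for that similarity (not the authors' measure), the sketch below compares the edge sets of two redirection graphs with Jaccard similarity, using invented node labels:

```python
# Simplified stand-in for subgraph similarity: Jaccard similarity
# between the edge sets of two redirection (sub)graphs.
def edge_jaccard(edges_a: set, edges_b: set) -> float:
    if not edges_a and not edges_b:
        return 1.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)

known_malicious = {("landing", "gate"), ("gate", "exploit")}
observed = {("landing", "gate"), ("gate", "exploit"), ("landing", "ad")}
print(round(edge_jaccard(known_malicious, observed), 2))  # 0.67
```

A classifier can combine such scores against malicious, benign, and compromised reference subgraphs, which is the integration idea the abstract describes.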


Companion of the The Web Conference 2018 on The Web Conference 2018 - WWW '18 | 2018

POSTER: Predicting Website Abuse Using Update Histories.

Yuta Takata; Mitsuaki Akiyama; Takeshi Yagi; Kunio Hato; Shigeki Goto

Threats of abuse against websites that webmasters have stopped updating have increased. In this poster, we propose a method of predicting potentially abusable websites by retrospectively analyzing updates of the software that composes them. The method captures webmaster behavior from archived snapshots of a website and analyzes the changes of web servers and web applications used in the past as update histories. A classifier that predicts website abuse is then built using update histories from snapshots of known malicious websites taken before their detection. Evaluation results showed that the classifier could predict various website abuses, such as drive-by downloads, phishing sites, and defacements, achieving a 76% true positive rate and a 26% false positive rate.
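An update history boils down to dated software-version observations; the illustrative sketch below (not the paper's feature set; the snapshot dates and version strings are invented) derives two features a classifier might use, the number of version changes and how long the site sat on its last version:

```python
from datetime import date

# Illustration only: a website's "update history" as a hand-made list
# of (snapshot_date, server_version) observations.
def update_features(history):
    """Return (number of version changes, days since the last change)."""
    changes = [b[0] for a, b in zip(history, history[1:]) if a[1] != b[1]]
    last_change = changes[-1] if changes else history[0][0]
    return len(changes), (history[-1][0] - last_change).days

history = [
    (date(2015, 1, 1), "Apache/2.2.22"),
    (date(2015, 6, 1), "Apache/2.2.29"),
    (date(2016, 6, 1), "Apache/2.2.29"),  # no update for a year
]
print(update_features(history))  # (1, 366)
```

Intuitively, a long stretch on an unchanged (and aging) version signals a neglected site, which is the abuse-risk signal the poster exploits.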


web information systems engineering | 2017

Understanding Evasion Techniques that Abuse Differences Among JavaScript Implementations

Yuta Takata; Mitsuaki Akiyama; Takeshi Yagi; Takeo Hariu; Shigeki Goto

There is a common approach to detecting drive-by downloads using a classifier based on the static and dynamic features of malicious websites collected using a honeyclient. However, attackers detect the honeyclient and evade analysis using sophisticated JavaScript code. The evasive code indirectly identifies clients by abusing the differences among JavaScript implementations. Attackers deliver malware only to targeted clients on the basis of the evasion results while avoiding honeyclient analysis. Therefore, we are faced with a problem in that honeyclients cannot extract features from malicious websites and the subsequent classifier does not work. Nevertheless, we can observe the evasion nature, i.e., the results in accessing malicious websites by using targeted clients are different from those by using honeyclients. In this paper, we propose a method of extracting evasive code by leveraging the above differences to investigate current evasion techniques and to use them for analyzing malicious websites. Our method analyzes HTTP transactions of the same website obtained using two types of clients, a real browser as a targeted client and a browser emulator as a honeyclient. As a result of evaluating our method with 8,467 JavaScript samples executed in 20,272 malicious websites, we discovered unknown evasion techniques that abuse the differences among JavaScript implementations. These findings will contribute to improving the analysis capabilities of conventional honeyclients.
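The core differential idea, observing how the same site behaves for a real browser versus a honeyclient, can be illustrated very simply. The sketch below is not the authors' tooling; it just diffs the sets of URLs fetched by the two clients (all URLs invented), where divergence hints at cloaking or evasion:

```python
# Hedged sketch of the differential idea: compare the URLs fetched when
# the same site is visited with a real browser vs. a browser emulator.
def divergent_urls(browser_urls: set, emulator_urls: set) -> dict:
    """Return URLs seen by only one of the two clients."""
    return {
        "browser_only": browser_urls - emulator_urls,
        "emulator_only": emulator_urls - browser_urls,
    }

browser = {"http://site.example/", "http://site.example/payload.js"}
emulator = {"http://site.example/"}
print(divergent_urls(browser, emulator))
```

A payload fetched only by the real browser suggests the site's JavaScript fingerprinted and evaded the emulator; the paper then digs into which implementation difference the evasive code abused.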


Proceedings of the 2nd ACM SIGSOFT International Workshop on App Market Analytics | 2017

Understanding the security management of global third-party Android marketplaces

Yuta Ishii; Takuya Watanabe; Fumihiro Kanei; Yuta Takata; Eitaro Shioji; Mitsuaki Akiyama; Takeshi Yagi; Bo Sun; Tatsuya Mori

As an open platform, Android enables the introduction of a variety of third-party marketplaces in which developers can provide mobile apps that are not provided in the official marketplace. Since the initial release of Android OS in 2008, many third-party app marketplaces have been launched all over the world. Their diversity leads us to the following research question: are these third-party marketplaces securely managed? This work aims to answer this question through a large-scale empirical study. We collected more than 4.7 million Android apps from 27 third-party marketplaces, including ones that had not previously been studied in the research community, and analyzed them to study their security measures. Based on the results, we also attempt to quantify a security index for these marketplaces.


computer and communications security | 2015

POSTER: Detecting Malicious Web Pages based on Structural Similarity of Redirection Chains

Toshiki Shibahara; Takeshi Yagi; Mitsuaki Akiyama; Yuta Takata; Takeshi Yada

Detecting malicious web pages used in attacks and building blacklists and signatures from them are done to protect users against drive-by download attacks. Gathering the content on web pages by crawling and evaluating it to check if it is malicious can help in detecting malicious web pages. Methods that apply supervised machine learning to this evaluation have been proposed for detecting malicious web pages from a massive amount of web pages. However, these methods need manual inspection to prepare training data when classifiers are retrained in accordance with changes in the content of malicious web pages. In this paper, we propose a method that evaluates whether web pages are malicious and needs only the discrimination results of web pages identified by high-interaction honeyclients to prepare training data. This method evaluates maliciousness on the basis of the structural similarity of redirection chains arising from drive-by download attacks. The results of our experiments with two years of data showed that the accuracy of our method was about 20% higher than that of the previous method.
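As a simplified stand-in for the structural similarity of redirection chains (not the paper's measure), the sketch below compares two chains, represented as sequences of node labels, with the standard-library `difflib` ratio; the chain labels are invented:

```python
from difflib import SequenceMatcher

# Simplified stand-in for structural similarity: compare two redirection
# chains (sequences of node labels) and get a score in [0, 1].
def chain_similarity(chain_a, chain_b) -> float:
    return SequenceMatcher(None, chain_a, chain_b).ratio()

known = ["compromised", "gate", "exploit", "malware"]
observed = ["compromised", "gate", "exploit", "benign-ad"]
print(round(chain_similarity(known, observed), 2))  # 0.75
```

A high score against chains labeled by a high-interaction honeyclient would flag the observed page as likely malicious without manually labeled training data.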

Collaboration


Dive into Yuta Takata's collaborations.

Top Co-Authors


Mitsuaki Akiyama

Nara Institute of Science and Technology


Takeshi Yagi

The Furukawa Electric Co.
