dc.contributor.author |
Rekha, A G |
|
dc.date.accessioned |
2022-12-16T07:47:11Z |
|
dc.date.available |
2022-12-16T07:47:11Z |
|
dc.date.issued |
2016 |
|
dc.identifier.uri |
http://dspace.iimk.ac.in:80/xmlui/handle/2259/1082 |
|
dc.description |
Research Advisory Committee: Prof. Mohammed Shahid Abdulla (Chairperson), Prof. Asharaf S (Member), Prof. Saji Gopinath (Member). A hard copy of the thesis is available in the library. Please contact the help desk for reference. |
en_US |
dc.description.abstract |
Big data analytics facilitates better-informed business decisions through the analysis of large data sets that remain unexploited by traditional business intelligence systems. ‘Big Data’ as input enhances the inferential power of established algorithms, but it challenges even state-of-the-art computation and analysis methods. Although machine learning can help overcome these problems, its current techniques have to be improved to deal with Big Data. Another drawback of big data analytics is its greater focus on aggregates than on outliers, although in many situations the insights gathered from outliers could be of more significance. In light of this, the focus of this work is on developing machine learning techniques that make outlier detection practical on large business data sets. For over a decade, the Support Vector Data Description (SVDD) technique has shown good predictive accuracy on a wide range of outlier detection tasks, and it has been adapted to numerous business problems. Inspired by this trend, this thesis explores the scalability problems associated with SVDD and tries to address them. Three approaches, namely LT-SVDD, ELT-SVDD, and PELT-SVDD, are proposed. The feasibility of these methods was assessed using a set of experiments on synthetic as well as benchmark data sets; in many of these, the methods showed an order-of-magnitude advantage in running time. The application of these methods to three real-world business problems is also demonstrated. This work contributes to the support vector literature by establishing these methods as efficient for outlier detection on large data sets. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
Indian Institute of Management Kozhikode |
en_US |
dc.subject |
Anomaly Detection |
en_US |
dc.subject |
Big Data |
en_US |
dc.subject |
Support Vector Data Description |
en_US |
dc.title |
SVDD Variants for Anomaly Detection with Implementations using Hadoop & Spark |
en_US |
dc.type |
Thesis |
en_US |