Monday, June 24, 2013

Fraud Detection and Risk Prediction in the Era of Big Data

Fraud detection and risk prediction is a multi-million dollar business, and it grows every year. As mentioned on Wikipedia, the PwC Global Economic Crime Survey of 2009 suggests that close to 30% of companies worldwide reported being victims of fraud in the past year.

Traditional methods of data analysis and mining have long been used to detect fraud. They require complex architectures and time-consuming computations that span domains such as finance, economics and business practices, and the results they produce are still not very accurate. Fraud often consists of many instances or incidents involving repeated offences using the same method. Fraud instances can be similar in content and appearance, but they are usually not identical.

How exactly does big data help to detect fraud or predict the most likely risk factors?
There are thousands of data sources, large in both volume and variety, that traditional fraud analysis techniques and methods ignore. Collectively termed big data, they include social media, transaction logs, application logs, weblogs, geographical data, etc.

For example, a man takes a loan of 1,00,000 from a bank, to be repaid in monthly installments of 10,000. He pays the first four installments on time as per the policy, then stops paying the rest, citing unavailability of funds. Yet he keeps posting pictures of his new car, new home or foreign trip on Twitter or Facebook. Because he is already a defaulter in the bank's records on the grounds of having no funds, bank officials can act on these posts immediately, without waiting for the fraud to happen.
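The check in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Borrower record, the SPENDING_KEYWORDS list and the flag_for_review function are hypothetical names, the posts are assumed to be already collected as plain text, and a real system would need far more than keyword matching.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical phrases hinting at a large purchase; chosen for illustration only.
SPENDING_KEYWORDS = ("new car", "new home", "foreign trip")


@dataclass
class Borrower:
    name: str
    missed_installments: int   # installments overdue in the bank's records
    recent_posts: List[str]    # public posts, assumed already fetched as text


def flag_for_review(borrower: Borrower) -> bool:
    """Return True if a borrower with missed installments posts about big spending."""
    if borrower.missed_installments == 0:
        return False
    # Check each post separately so a phrase cannot match across two posts.
    return any(
        keyword in post.lower()
        for post in borrower.recent_posts
        for keyword in SPENDING_KEYWORDS
    )


defaulter = Borrower("A", missed_installments=2,
                     recent_posts=["Finally got my NEW CAR!"])
print(flag_for_review(defaulter))
```

A flag here would only mean "worth a look by a bank official", not proof of fraud.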

As a second example, a person who lives in India keeps withdrawing, or trying to withdraw, money in Delhi, New York, London and Paris every day. We can look up his geolocation history using Google Maps and compare it with the transaction locations, which allows immediate action.
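A minimal sketch of that comparison in Python is shown here. It assumes the geolocation history has already been exported as a non-empty list of (timestamp, latitude, longitude) tuples; no real Google Maps API is called. The function names and the 100 km threshold are assumptions made for illustration. The distance is computed with the standard haversine formula.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_DISTANCE_KM = 100.0  # assumed tolerance between person and transaction


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_fix(history, when):
    """Pick the location fix whose timestamp is closest to the given time."""
    return min(history, key=lambda fix: abs((fix[0] - when).total_seconds()))


def is_suspicious(history, transaction, max_km=MAX_DISTANCE_KM):
    """Flag a transaction made far from where the person was at that time.

    history: non-empty list of (timestamp, lat, lon) location fixes.
    transaction: (timestamp, lat, lon) of the withdrawal attempt.
    """
    fix = nearest_fix(history, transaction[0])
    return haversine_km(fix[1], fix[2], transaction[1], transaction[2]) > max_km


# The person is in Delhi at 09:00; a withdrawal is attempted in London at 10:00.
history = [(datetime(2013, 6, 24, 9, 0), 28.6139, 77.2090)]
withdrawal = (datetime(2013, 6, 24, 10, 0), 51.5074, -0.1278)
print(is_suspicious(history, withdrawal))
```

Matching on the nearest fix in time keeps the sketch short; with sparse location history the nearest fix may be hours old, so a production check would also take the time gap into account.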

There are many more big data use cases for fraud detection and risk analysis. The most important advantage of big data over traditional systems is the higher accuracy of results and predictions, both of which tend to improve as the size and number of data sources grow.

Nowadays we have technology that can run big data analytics in near real time, without spending much time on computations and calculations, so action can be taken before the fraud happens. High-performance analytics is not just a technology fad. With new distributed computing options like Hadoop and in-memory processing on commodity hardware, insurers can have access to a flexible and scalable real-time big data analytics solution at a reasonable cost.