ILLUSTRATION
Customer Databases: the lengths to which organisations are prepared to go

In data-mining terms, 'selecting' means learning more about your customers in order to sell the right products or services to the right people. In industry jargon this is known as 'Customer Relationship Management'.

Centrica is better known as the owner of British Gas. It also owns the AA roadside assistance service and launched Goldfish in 1996, hiring HFC Bank, a credit specialist, to manage customer accounts. Centrica had already agreed to pay £650m to take back full control of the Goldfish operation, which HFC has helped run since 1996. But in a battle in June 2001 over information on Goldfish's one million customers, the High Court sided with HFC, whose five-year contract to run Goldfish services for Centrica expires in September 2001. Centrica's main aim was to know more about the interests of Goldfish customers and the products they might want.

In August 2001, Centrica agreed to pay HFC an £80m 'divorce' settlement to ensure the handover of Goldfish customer details. Centrica hopes to use the client details as the first step towards developing Goldfish into a telephone and internet bank, in a joint project with retail bank Lloyds TSB. This is a practical application of businesses selecting data from the mass of material they collect. It also indicates the value that these organisations place on such data.

The Goldfish internet bank would be based on infrastructure set up for Evolvebank.com, which Lloyds has hoped to take Europe-wide, but which as yet is operational only in Spain. HFC, which owns the Marbles brand, has voiced plans to launch its own credit card once its contract with Centrica expires. The agreement is likely to save Centrica's deal with Lloyds and end a court battle which threatened to delay the expansion of the Goldfish brand and had raised concerns at partner Lloyds TSB.
Data-mining: The thought of being watched constantly, with even the smallest detail of our lives being monitored and recorded, used to be the stuff of science fiction. Now, with the advent of TV programmes like 'Big Brother', the fear of round-the-clock surveillance is packaged and sold to us as entertainment. But the reality of the computer age is that we are all being studied far more closely than many of us think.

Every time you go online, make a telephone call, withdraw cash from a bank, or even claim reward points from your supermarket shopping, your transactions are likely to be logged and stored by computers. The result is huge quantities of data amassing in the electronic warehouses of governments and businesses. But how do these institutions select from these hoards of data? What use do they make of them, and what are the ethical implications?

Until the mid-1990s the data tended to languish in these electronic warehouses. Since then, tools have become available to help statisticians analyse and sort these data, often turning up valuable and perhaps surprising information.

Data Mining II: What statisticians do with these huge stockpiles of data can be thought of as data mining. This simply means exploring and analysing large data sets in order to learn more about the patterns of behaviour hidden within them. The work requires computers: try studying any large data set by yourself and you'll give yourself a headache very quickly! Computer algorithms are used to try to detect unusual patterns within the data. These patterns are then interpreted by humans to produce knowledge.

Training, testing and validating: Because there is such a vast quantity of data available, just a small sample is selected at first. This is sometimes called 'training' data and may account for only 10% of the total data set.
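As a rough illustration of the three subsets named in this heading, the sketch below shuffles a data set and carves out training, test and validation portions. The 10% training share is the figure quoted above; the equal 10% shares for test and validation, the function name `split_records` and the stand-in records are assumptions made purely for the example.

```python
import random


def split_records(records, train_frac=0.10, test_frac=0.10,
                  validation_frac=0.10, seed=0):
    """Shuffle the records and return (training, test, validation) lists.

    Only the 10% training share comes from the text; the other two
    fractions are assumed. Records left over are simply not returned.
    """
    shuffled = list(records)
    # A fixed seed makes the shuffle repeatable from run to run.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = int(n * train_frac)
    n_test = int(n * test_frac)
    n_val = int(n * validation_frac)

    training = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:n_train + n_test + n_val]
    return training, test, validation


# Stand-in for a large customer data set: 1,000 record identifiers.
records = list(range(1000))
training, test, validation = split_records(records)
print(len(training), len(test), len(validation))
```

Shuffling before slicing matters here: if the records were stored in, say, date order, taking the first 10% unshuffled would give a sample that is not representative of the whole.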
An algorithm is applied to the training data, to explore it and try to find patterns within it. These patterns are then tested on another section of the data set, called the 'test' data. Data miners will often keep aside a further section of the whole data set, which they use to validate their findings.

Banking on the results: Let's look at an example drawn from the business world to illustrate how data mining techniques can be useful. A bank wants to know more about its customers, in particular the types of people who might want a loan from the bank. This could be extremely useful information, because a tightly targeted marketing campaign could save the bank a great deal of money compared with one that tries to reach the general public.

The bank will have a databank which contains the records of its customers over a number of years. The databank is likely to hold detailed information on each customer: age, occupation, marital status, number of children and so on. Using the training data, algorithms detect the characteristics of customers who took out a particular type of loan. These customers are differentiated from those who used other kinds of financial services, including other types of loan. The algorithm will be able to develop 'rules' by which it can identify customers who are likely to be good prospects for the loan. These rules are then applied to the remainder of the database.

The final computing task may be to sort the whole databank into clusters: groups of customers who share similar characteristics. It is at this stage that interesting and perhaps surprising patterns might be detected. These patterns then have to be interpreted by the data miners and banking staff.

Summary of data mining techniques: Data mining can be used to work on a variety of tasks. A selection of these is outlined below.
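Returning to the bank example, the rule-finding step can be sketched in a few lines. This is a deliberately simple single-attribute rule learner written for illustration, not the method any particular bank or data-mining product uses. The customer records, the field names (`age_band`, `marital_status`, `took_loan`) and the two thresholds are all invented for the example.

```python
from collections import defaultdict

# Hypothetical training records: customers whose loan history is known.
training = [
    {"age_band": "25-34", "marital_status": "married", "took_loan": True},
    {"age_band": "25-34", "marital_status": "single", "took_loan": True},
    {"age_band": "25-34", "marital_status": "married", "took_loan": True},
    {"age_band": "35-44", "marital_status": "married", "took_loan": False},
    {"age_band": "35-44", "marital_status": "single", "took_loan": False},
    {"age_band": "55+", "marital_status": "single", "took_loan": False},
]

# Hypothetical test records, kept back to check the rules.
test = [
    {"age_band": "25-34", "marital_status": "single", "took_loan": True},
    {"age_band": "35-44", "marital_status": "married", "took_loan": False},
    {"age_band": "55+", "marital_status": "married", "took_loan": True},
]

# The remainder of the databank: customers not yet offered the loan.
remainder = [
    {"age_band": "25-34", "marital_status": "married"},
    {"age_band": "55+", "marital_status": "single"},
]


def learn_rules(records, target="took_loan", min_support=2, min_rate=0.75):
    """Return the (attribute, value) pairs strongly associated with the loan.

    A pair becomes a rule if at least `min_support` customers share it and
    at least `min_rate` of those customers took the loan.
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [customers, loan takers]
    for rec in records:
        for attr, value in rec.items():
            if attr == target:
                continue
            counts[(attr, value)][0] += 1
            counts[(attr, value)][1] += int(rec[target])
    return {pair for pair, (n, takers) in counts.items()
            if n >= min_support and takers / n >= min_rate}


def is_prospect(record, rules):
    """A customer is a prospect if any learned rule matches the record."""
    return any(record.get(attr) == value for attr, value in rules)


rules = learn_rules(training)

# Test the rules on data the learner has not seen.
correct = sum(is_prospect(rec, rules) == rec["took_loan"] for rec in test)
accuracy = correct / len(test)

# Apply the rules to the remainder of the databank.
prospects = [rec for rec in remainder if is_prospect(rec, rules)]

print("rules:", rules)
print("test accuracy:", accuracy)
print("prospects:", prospects)
```

With real data the rules would combine several attributes and the test step would decide whether they are good enough to use at all. The sketch only shows the order of the steps: learn on training data, check on test data, then apply to the rest.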
Non-commercial uses of data mining: Data mining can prove valuable for businesses, but it also has applications across a wide range of other fields. The police and, in particular, forensic services use data mining techniques to detect patterns in statements and other materials obtained during investigations. Various state revenue bodies have also used these techniques to spot cases of tax fraud. In general, any activity that amasses vast quantities of data stands to benefit from data mining.

Ethical issues: The techniques of data mining can be challenging, but so too are the ethical questions posed by the practice. Data mining might be able to find clusters of customers who are likely to be profitable to companies. But at the same time, this means that less profitable groups are likely to be disadvantaged. This is a form of discrimination and could lead to legal or public relations problems for business organisations further down the road. Clearly the social effects of widespread data mining need to be understood and dealt with, if society is not to come to regard data mining as the unacceptable face of Big Brother.