Tuesday, June 4, 2019
Counterculture Analysis: Blackbeard
Zachariah Chiles
Many groups have been established as countercultures throughout the course of history. However, what makes those groups truly countercultures? Author W. LaVerne Thomas attempts to answer this question in his book, defining a counterculture as "a group that rejects the major values, norms, and practices of the larger society and replaces them with a new set of cultural patterns" (Thomas). One group that closely follows Thomas's definition is Blackbeard's pirates. This group rejected the cultural patterns of the British monarchy to live their own fierce life of stealing, killing, and raping. To this day pirates are still a significant threat to those who travel international waters, and even to those who live in third world countries.
Before Blackbeard acquired his name, he was known as Edward Teach or Edward Thatch. As far as his origin goes, not much is known about Thatch. However, it is recorded that he joined the British navy as a privateer during Queen Anne's War, and turned to piracy shortly afterwards (Division of Archives and History's Office of State Archaeology). Blackbeard began his pirating in 1713 under Captain Benjamin Hornigold (Ossian). Once he had been given a smaller ship by Hornigold and was able to command his own crew as a captain, Blackbeard found the French slave ship La Concorde. This ship, which would become known to many as the Queen Anne's Revenge, was big, fast, and powerful. With such a vessel, Blackbeard knew his men could cause more havoc (Woodard). In 1717, the two pirates were so deadly that the British monarchy offered both Hornigold and Blackbeard money in exchange for giving up pirating. Hornigold accepted, whereas Blackbeard refused the offer and continued ravaging the Caribbean on his esteemed Queen Anne's Revenge.
However, his time came to an end on November 22nd, 1718, when he faced a British Royal Navy contingent sent by Governor Alexander Spotswood. Blackbeard and his crew mainly raided ships for one thing, and that was gold. Everything they did was based upon how much loot they could take, and although he died many years ago, his reputation and name still stand out in the history of pirating.
Both the sociological perspective and the sociological imagination can be used to explain the actions of Blackbeard and his crew. According to author LaVerne Thomas, "The sociological perspective helps you see that all people are social beings. It tells you that your behavior is influenced by social factors and that you have learned your behavior from others" (Thomas). Many heard and saw the stories of Blackbeard and his ferocious crew. Because of this, many saw his actions and adopted them, continuing to pirate and adapting Blackbeard's techniques for more efficient plundering. His name alone put fear in the hearts of men, and many who saw that fear wanted to inspire it themselves, which led them to take up piracy and life on the seas. C. Wright Mills describes the sociological imagination as "the capacity to range from the most impersonal and remote topics to the most intimate features of the human self and to see the relationship between the two" (Thomas). In other words, this describes the insight of "how your social environment shapes you, and how you shape your social environment" (Thomas). The environment of Blackbeard and his crew most likely included a poor social background and the loss of a loved one. Many who are greedy and kill have often grown up in these conditions. They surrounded themselves with murderers and thieves, and thus became murderers and thieves themselves. They shaped their social environment by surrounding others with the same negative behavior, thus leading new people to join Blackbeard's crew.
The more people in his crew, the more people there were to go out and tell the infamous story of Blackbeard, the cutthroat killer.
Ethnocentrism is a large part of any culture. It is described as "the tendency to view one's own culture and group as superior" (Thomas). Countercultures are subcultures; therefore Blackbeard and his crew are technically a subculture of the larger society, the British monarchy. Blackbeard and his crew saw their own norms as superior to the restricting life in the monarchy, and therefore ethnocentrism formed. The British, who already had their own ethnocentrism, saw the opposing moral standards set by Blackbeard's newfound subculture and rejected its views, making Blackbeard and his crew a counterculture. Many examples show why he and his crew are a counterculture. One is that there was no law against killing on Blackbeard's ship, whereas it was outlawed in the British monarchy. A similar case is stealing: Blackbeard plundered and stole from other ships for loot, whereas such acts were against the law in the British monarchy.
Cultural relativism can be defined as "the belief that cultures should be judged by their own standards rather than by applying the standards of another culture" (Thomas). Looting, pillaging, and killing are what pirates know. These simple standards cannot be judged by outside cultural beliefs without noticing the large moral negativity that follows. Blackbeard and his crew had no moral compass, so their actions should not be judged through the eyes of the British monarchy. From a practical point of view, they were strong and picked on the weak in order to gain wealth and become stronger in the world.
Although they may have known that what they did was morally unacceptable and went against the laws of many larger societies, they followed their own standards and traditions and should not be judged outside of that.
My counterculture, Blackbeard and his crew, has many intriguing norms and standards that oppose those of many societies of that time as well as modern times. However, this does not excuse the actions of Blackbeard and his crew. Killing, stealing, and plundering all leave large marks on this world. Whether by devastating the economy of a British town or by killing the last son of a lonely French mother, cultures that directly affect the larger societies in a negative manner should not exist. Cultures having opposing standards is completely fine, as long as the opposing standards do not actively contradict those of a larger society. Blackbeard and his crew have very free standards; however, the deaths they caused force me to disagree with the philosophy and norms of their counterculture.
References
Division of Archives and History's Office of State Archaeology. Queen Anne's Revenge Project. n.d. 12 3 2017.
Ossian, Rob. The Pirate King. n.d. 12 3 2017.
Thomas, W. LaVerne. Holt Sociology: The Study of Human Relationships. Holt, Rinehart and Winston, 2003.
Woodard, Colin. The Republic of Pirates. New York: Houghton Mifflin Harcourt Publishing Company, 2007.
Customer Segments in Retail Supermarket Analysis
CHAPTER 1 INTRODUCTION
BACKGROUND
In today's dynamic retail environment, customers are offered a tremendous range of choices and their loyalty is increasingly becoming transitory due to the severe impact of competitors' actions on existing relationships (Reinartz and Kumar, 2000).
This increased competition to satisfy the diverse needs of the customer forces the traditional production and selling focus of retailers towards customer relationships.
In the context of retail supermarkets, this has resulted in large investments in retail information systems that collect shoppers' data in order to understand customer shopping behaviour (Brijs et al., 2001). Several tools and technologies of data warehousing, data mining, and other customer relationship management (CRM) techniques are exploited to manage and analyse this data. Data mining in particular, which simply means extracting knowledge from large amounts of data, helps organisations to find the patterns and trends in their customers' data and then to drive improved customer relationships (Rygielski, Wang and Yen, 2002).
According to Witten and Frank (2005), data mining techniques such as decision trees (DT), artificial neural networks (ANN), genetic algorithms (GA) and association rules (AR) are commonly used to solve customer-related problems in various fields like engineering, science, finance and business. In the retail supermarket domain, data mining can be applied to identify useful customer behaviour patterns from large amounts of customer and transaction data (Giudici and Passerone, 2002). Consequently, the discovered information can be used to support better decision-making in retail marketing. Data mining techniques have mostly been adopted to make predictions and describe behaviours.
During the past decade, there has been an array of significant developments in data mining techniques. Some of these developments are implemented in customised service (Chen et al., 2005), which is vital in retail markets for developing customer relationships.
Therefore, this research focuses on giving customised service to distinct customer segments in retail supermarkets, by implementing data mining techniques with the help of data mining tools.
Related Work
Researchers have proposed various approaches to mine the sales transaction data of a retail supermarket to improve customer relationships. Previously, customer behavioural variables such as Recency-Frequency-Monetary (RFM) variables were associated with demographic variables to predict customer purchase behaviour (Chen et al., 2005). Current research has improved significantly, as Business Intelligence tools and advanced data mining algorithms are implemented to analyse the data in a much improved way.
Liao et al. (2008) proposed a methodology based on the Apriori and K-means algorithms to mine customer knowledge from household customers for product and brand extension in retailing. Bottcher et al. (2009) presented an approach which aimed to mine changing customer segments in a dynamic market by deriving frequent itemsets as representations of customer segments at distinct points in time, which are then analysed for changes.
Problem Definition
Effective management of sales transaction data is as important as any other asset for a retail supermarket store. The sales transaction data usually contains a great amount of information distributed through numerous transactions.
This study focuses on applying data mining techniques to analyse the sales transaction data of a retail supermarket store and suggests recommendations to provide customised service to defined customer segments. This research specifically uses two data mining techniques, namely clustering and association rule discovery. The research starts by identifying different customer segments based on their purchase frequencies, in order to find out the differences in their purchase behaviour. The definition of behaviour in the retail supermarket domain covers different meanings.
For example, retailers often distinguish between light, medium and heavy users, or between weekday and weekend customers (Brijs et al., 2001). In this research, the differences will be discovered by identifying frequently purchased items for each customer segment and comparing their combinations. The retailer may use this information to customise the offer towards those segments and also to further examine the underlying relationships between those items for purposes of pricing, product placement or promotions.
AIM AND OBJECTIVES
The aim of this research is to provide customised service to defined customer segments in a retail supermarket, by implementing data mining techniques on sales transaction data with the help of data mining tools.
OBJECTIVES
To conduct a critical review of the literature and present the current research within the discipline.
To obtain the customer sales transaction dataset, in order to apply the data mining algorithms.
Based on the literature review, to select the appropriate data mining approach to pre-process the dataset and to implement the algorithms on the pre-processed data.
To analyse the results obtained from the data mining algorithms and propose recommendations to provide customised service.
To draw conclusions, discuss the limitations of this research and suggest areas for future research.
Research Approach
This research follows the quantitative methodology by obtaining the dataset and analysing the data with data mining tools. The dataset for analysis was obtained from the ABC retail supermarket store, Canada, and was available online (http://www.statsci.org/datasets.html). The data required for this project is selected and loaded into SPSS (Statistical Package for the Social Sciences) and Weka, the data mining tools selected for this research.
The data mining algorithms selected for this study are the k-means algorithm for clustering and the Apriori algorithm for association rule mining; the reasoning behind the choice of these algorithms is justified in the literature review. These algorithms are implemented on the dataset with SPSS and Weka. The results obtained from these algorithms need to be justified with the help of charts, tables and graphs, which are plotted in Microsoft Excel. Finally, recommendations are made based on the analysis of the results.
Dissertation Outline
This chapter presents the essence of this dissertation, highlighting the aim and objectives of this research. The rest of this dissertation is structured as follows:
Chapter 2 provides a comprehensive literature review of different aspects relating to the research topic under study.
Chapter 3 discusses in detail the research methods and the data analysis techniques followed in order to achieve the aim of this research.
Chapter 4 presents the analysis of the results obtained from the application of data mining algorithms on the data and provides recommendations.
Chapter 5 summarises the entire project, gives insights on the limitations of this research and points out areas for future research.
CHAPTER 2 LITERATURE REVIEW
Introduction
This chapter provides a critical review of the literature addressing the application of data mining in retail supermarkets. It begins with an introduction to data mining, followed by its evolution and applications in today's business world. It then explores the role of data mining in retail supermarkets to improve customer relationships, followed by a discussion of the typical data mining approach. It also discusses the techniques and algorithms applied in this project and the reason for their choice.
Data Mining: An Introduction
The word mining means extracting something useful or valuable, such as mining gold from the earth (Lappas, 2007). The importance of mining is growing continuously, especially in the business world. Data mining is a process of finding interesting patterns in databases for decision-making. It is one of the fastest growing and most prominent fields, and it can provide a significant advantage to an organization by exploiting its vast databases (Rygielski, Wang and Yen, 2002). Finding patterns in business data is not new; traditionally, business analysts used a statistical approach. The computer revolution and huge databases ranging from a few gigabytes to terabytes changed this scenario. For example, companies like Wal-Mart store huge amounts of sales transaction data, which can be used to analyze customer buying patterns and make predictions (Bose and Mahapatra, 2001). Data warehousing technology has enabled companies to store huge amounts of data from multiple sources under a unified schema.
Data mining has been considered to be a tool of business intelligence for knowledge discovery (Wang and Wang, 2008). Many people consider data mining to be the same as Knowledge Discovery from Data (KDD), but it is actually a part of the larger process called knowledge discovery, which describes the steps that must be taken to ensure the desired results (Han and Jiawei, 2006). A typical data mining process involves various iterative steps. The first step is the selection of appropriate data from a single database or multiple source systems, followed by cleaning and preprocessing for consistency. The data is then analyzed to find patterns and correlations. This approach complements other data analysis techniques like statistics and OLAP (on-line analytical processing) (Bose and Mahapatra, 2001).
Every organization follows a different data mining and modelling process to achieve its business imperatives.
The Evolution of Data Mining
It all started with the need to store data in computers and improve access to it for decision-making. Today the technology enables users to access and navigate real-time data.
At the beginning of the 1960s, data was collected for the purpose of making simple calculations to answer business questions such as the total or average revenue for a specific period of time. In the 1980s and 1990s the use of data warehouses to store data in a structured format emerged, and policies regarding the format of data to be used in an organization were implemented (Therling, 1998). Data warehouses were extended to be multi-dimensional, which allows stakeholders to drill down and navigate through the data.
Nowadays, online analytic tools help to retrieve data in real time, and computers can query data from the past up to the present. In recent years many technologies like statistics, AI (Artificial Intelligence) and machine learning have been evolving as core sectors in the data mining field (Rygielski, Wang and Yen, 2002). These technologies, combined with relational database systems and data integration, provide potential knowledge from the data.
Data mining applications
Data mining can be applied in many fields depending on the aim of the company. Some of the main areas in today's business world where data mining is applied are as follows (Apte et al., 2002):
Finance
Telecom
Marketing
Web analysis
Insurance
Retail
Medicine
Data mining for CRM in retail supermarkets
Swift (2001) defined CRM as an "enterprise approach to understanding and influencing customer behaviour through meaningful communications in order to improve customer acquisition, customer retention, customer loyalty, and customer profitability".
According to research by the American Management Association, "it costs three to five times as much to acquire a new customer than to retain the existing one", and this is especially evident in the services sector (Ennew and Binks, 1996). Therefore it is very important to create a good relationship with existing and new customers rather than simply expanding the customer base.
A large number of companies are adopting various tools and strategies to achieve a more effective CRM, in order to gain an in-depth understanding of their customers. Data mining is a powerful new technique which helps companies to mine the patterns, trends and correlations in their large amounts of customer and product data, to drive improved customer relationships. It is one of the well-known tools applied to customer relationship management (CRM) (Giudici and Passerone, 2002). In the context of retail supermarkets these patterns not only assist retailers in offering high quality products and services to their customers, but also help them to understand the changes in customer needs.
Data mining applications for CRM in retail supermarkets
The use of data mining to improve customer relationships in retail supermarkets is a wide area of research interest. Depending on the retailer's objective, there are various application areas in which data mining can be applied to enhance customer relationship management.
Some of the major data mining applications in retail supermarkets, determined from the literature, are as follows:
Cross-selling (Brijs et al., 1999; Feng and Tsang, 1999)
Product recommendation (Shih and Liu, 2005; Li et al., 2009)
Customer behaviour modelling (Baydar, 2003; Cadez, 2001)
Shelf space allocation (Chen and Lin, 2007; Chen et al., 2006)
Catalogue segmentation (Ester et al., 2004; Lin and Hong, 2006)
Direct marketing (Bhattacharyya, 1999; Prinzie and Poel, 2005)
Price optimization (Chen et al., 2008; Kitts and Hetherington, 2005)
THE DATA MINING PROCESS
Ivancsy and Vajk (2006) defined the three main stages involved in the data mining process: (i) preprocessing, (ii) pattern discovery, and (iii) pattern analysis/interpretation.
Preprocessing
Famili (1997) defined data preprocessing as all the actions taken before the actual data analysis process starts. It is essentially a transformation $T$ that transforms the raw real-world data vectors $X_{ik}$ into a set of new data vectors $Y_{ij}$:
$Y_{ij} = T(X_{ik})$
such that
$Y_{ij}$ preserves the valuable information in $X_{ik}$,
$Y_{ij}$ eliminates at least one of the problems in $X_{ik}$, and
$Y_{ij}$ is more useful than $X_{ik}$.
In the above relation,
$i = 1, \dots, n$, where $n$ = number of objects,
$j = 1, \dots, m$, where $m$ = number of features after preprocessing,
$k = 1, \dots, l$, where $l$ = number of attributes/features before preprocessing,
and in general, $m \neq l$.
The most common data used for mining purchase behaviour in retail supermarkets is customer and transaction data (Giudici and Passerone, 2002). With a huge collection of customers' sales transaction data available in the databases, it is necessary to pre-process the data and extract the useful information from it. In the context of retail supermarkets, Pinto et al. (2006) suggested four key tasks in data preprocessing: data selection, data cleaning, data transformation, and data understanding.
The first preprocessing task is data selection. Here the subset of the data on which pattern discovery is to be performed is identified.
This task is especially helpful in solving the problem of large amounts of data, simply by evaluating and categorizing the data into much smaller datasets. The computational requirements for data analysis and manipulation are also hugely reduced by preprocessing large datasets through data selection techniques like clustering or vector quantization (Famili, 1997).
The second task is data cleaning, where basic operations include removing noise and handling missing data (Fayyad et al., 1996). Other data quality issues which may complicate data analysis, like errors and insufficient attributes, are also addressed in data cleaning. In most cases missing attribute values are replaced by the attribute mean, but traditionally, if more than 20% of attribute values are missing, the entire record is eliminated (Famili, 1997). To handle outliers and noisy data, techniques like binning (partitioning the sorted attribute values into bins), clustering and regression are applied.
The next preprocessing task is data transformation. The application of each data mining algorithm requires the data to be present in a mathematically feasible format (Crone et al., 2006). Inaccuracies in the measurement of inputs or incorrect feeding of data to the data mining algorithm could cause various problems. Hence, operations such as normalization, aggregation, generalization and attribute construction are performed. Normalization deals with scaling the attribute value into a specific range, whereas aggregation and generalization refer to the summary of data in terms of numerical and nominal attributes. Attribute construction handles the replacement or addition of new attributes based on the existing attributes (Markov and Larose, 2007).
Once issues regarding the data are solved and the data are prepared, understanding the nature of the data is useful in many ways.
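The cleaning and normalization steps described above can be illustrated with a minimal Python sketch. The toy records, the function names and the reading of the 20% rule (a record is dropped when more than 20% of its own attribute values are missing) are assumptions made for illustration, not the procedure applied to the supermarket dataset.

```python
from statistics import mean

def drop_sparse_records(records, max_missing=0.2):
    """Drop records in which more than max_missing of the values are None."""
    kept = []
    for rec in records:
        missing = sum(1 for v in rec if v is None)
        if missing / len(rec) <= max_missing:
            kept.append(list(rec))
    return kept

def impute_mean(records):
    """Replace each remaining None with the mean of its attribute (column)."""
    n_attrs = len(records[0])
    col_means = []
    for j in range(n_attrs):
        observed = [rec[j] for rec in records if rec[j] is not None]
        col_means.append(mean(observed))
    return [[col_means[j] if v is None else v for j, v in enumerate(rec)]
            for rec in records]

def min_max_normalize(records):
    """Scale every attribute into the range [0, 1]."""
    cols = list(zip(*records))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(rec, lows, highs)]
            for rec in records]

# Hypothetical raw records with five numeric attributes each.
raw = [
    [2, 10.0, 1, 4, 3],
    [4, None, 3, 2, 5],     # 1 of 5 values missing: kept and imputed
    [None, None, 2, 1, 1],  # 2 of 5 values missing: record eliminated
    [6, 30.0, 5, 0, 7],
]

cleaned = impute_mean(drop_sparse_records(raw))
normalized = min_max_normalize(cleaned)
```

Min-max scaling is only one possible normalization; other ranges or z-score scaling would fit the same description.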
According to Famili (1997), the majority of data analysis tools have some limitations regarding the data characteristics; therefore, it is important to know these characteristics for an appropriate setup of the data analysis process. He further pointed out that techniques like visualization and principal component analysis are useful for better understanding the data.
Pattern discovery
Fayyad et al. (1996) stated that the core of the process is the application of specific data mining methods for pattern discovery and extraction. Pattern discovery is the key stage of the process in this research, and it is where the data is mined. Once the data is pre-processed and the irrelevant information is removed, it is mined using data mining techniques to discover patterns. However, it is not the intent of this paper to describe all the available algorithms and techniques derived from these fields.
This research focuses on two main data mining methods that help to mine the data and find patterns: clustering and association. The reason behind choosing these methods is justified below.
Clustering
Clustering can be defined as a technique to group together a set of items having similar characteristics (Kuo et al., 2002). In the retail domain, cluster analysis is a common tool to segment customers on the basis of their similarity on a chosen segmentation base or set of bases (Stewart, D.W. and Girish, P., 1983). The actual choice of one base or a combination of bases largely depends on the business question under study (Wind, 1978).
Segmentation can be done on the basis of various variables/bases, which can be (1) general or product-specific, and (2) observable or non-observable, as classified by Wedel and Kamakura (2000).
General bases for segmentation are independent of products, services or circumstances, whereas product-specific bases for segmentation are related to the product, the customer or the circumstances.
Observable segmentation bases can be measured directly, whereas non-observable bases must be inferred. The combination of classification of segmentation bases is shown below.
Twedt (1967), as cited in Engel et al. (1972), stated that the existence of huge amounts of transaction data in the retail supermarket domain provides a great impetus for segmentation on the basis of purchase frequencies. Segmentation on this basis divides customers into groups by their intensity of buying a product or products, such as light, medium and heavy buyers. According to Brijs (2002), if customers are classified by their purchase frequency, these segments could then be treated differently in terms of marketing communication (pricing, promotion, product recommendation, etc.) to achieve greater return on investment (ROI) and customer satisfaction. Therefore, in this research clustering is employed to segment the customers into various clusters on the basis of their similarity in purchase frequency.
Several algorithms have been proposed in the literature for clustering, such as ISODATA, CLARA, CLARANS, ScaleKM, P-CLUSTER, DBSCAN, Ejcluster, BIRCH and GRIDCLUS (Kanungo et al., 2002). It is not the objective of this research to use all these algorithms for clustering. Instead, as discussed earlier, the k-means clustering algorithm is used, and its justification is given below.
k-Means Clustering Algorithm
K-means has been considered one of the most effective algorithms in producing good clustering results for many practical applications (Alsabti et al., 1998). The main reason behind this is that, when clustering is done for the purpose of data reduction, the goal is not to find the best partitioning; what is needed is simply a reasonable consolidation of N data points into k clusters and, if necessary, some efficient way to improve the quality of the initial partitioning (Faber, 1994).
Therefore, the k-means algorithm proves to be very effective in data reduction and produces a good clustering output.
The k-means algorithm groups similar data into clusters, namely Cluster 0, Cluster 1 and so on up to Cluster n (Kanungo et al., 2002). Given a set of n data points in real d-dimensional space ($R^d$) and an integer k, the aim is to determine k points in $R^d$, called the centers, so as to minimize the mean squared distance from each data point to its nearest center. This measure is often called the squared-error distortion (Jain and Dubes, 1988).
The diagram below illustrates the standard k-means algorithm. It shows the results during two iterations in the partitioning of nine two-dimensional data points into two well separated clusters. Points in cluster 1 are shown in red and points in cluster 2 are shown in black; data points are denoted by open circles and reference points by filled circles. Clusters are indicated by dashed lines. The iteration converges quickly to the correct clustering even though there was a bad initial choice of reference points.
Lloyd's algorithm is another popular version of k-means clustering, which requires about the same amount of computation for a single pass through all the data points, or a single iteration, as the standard k-means algorithm (Faber, 1994). Lloyd's algorithm is similar to the standard k-means algorithm, except that when the cluster centroids are chosen as reference points in subsequent partitions, the centroids are adjusted both during and after each partition. However, the k-means algorithm continuously updates the clusters and requires comparatively fewer iterations than Lloyd's algorithm; thus, the k-means algorithm is considerably faster. This is the key reason that leads to the selection of the k-means algorithm, since it can group customers with similar purchase frequency into different clusters in fewer iterations. However, Faber (1994) pointed out two major drawbacks to this algorithm.
Firstly, it is computationally inefficient for large datasets. Secondly, although the algorithm will always produce the desired number of clusters, the centroids of these clusters may not be particularly representative of the data.
Association Rules
Association rule discovery was proposed to find all rules in basket data, in order to analyze how items purchased by customers in a shop are related (Gery and Haddad, 2003). The rule refers to the discovery of attribute value associations that occur frequently together within a given data set (Han and Kamber, 2001). It is typically used for market basket analysis to discover rules of the form "x% of customers who buy items A and B also buy item C" (Zaiane, 2001), which is an implication of the form $(A, B) \Rightarrow C$.
Some of the key definitions drawn from the literature that characterize the association rule technique are provided below (Agrawal, Imielinski and Swami, 1993).
Itemset (i): The set of items contained in a single transaction (e.g. milk, sugar, curd).
Support (s): The support expresses the percentage of transactions in the data that contain both the items in the antecedent and the items in the consequent of the rule.
Confidence (c): Confidence estimates the conditional probability of B given A, i.e. $P(B \mid A)$, and it can be computed as $c = s(A \cup B) / s(A)$.
Association rule discovery typically involves a two-phase sequential methodology (Brijs, 2002).
Finding frequent itemsets
The first phase involves looking for so-called frequent itemsets, i.e. itemsets for which the support in the database equals or exceeds the minimum support threshold set by the user. This is computationally the most complex phase because of the number of possible combinations of items that need to be tested for their support.
Generating association rules
Once all frequent itemsets are known, the discovery of association rules is comparatively straightforward.
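The support and confidence definitions and the frequent-itemset phase described above can be sketched in a few lines of Python. This is a minimal illustration on a hypothetical five-transaction basket; the item names, function names and the brute-force enumeration are assumptions for the example, and the enumeration is not the pruned search that Apriori performs.

```python
from itertools import combinations

# Hypothetical toy basket data; the item names are illustrative only.
transactions = [
    {"milk", "sugar", "curd"},
    {"milk", "sugar"},
    {"milk", "bread"},
    {"sugar", "curd"},
    {"milk", "sugar", "bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as s(A u B) / s(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def frequent_itemsets(transactions, min_support):
    """Phase 1 by brute force: test every candidate against min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result

# Rule {milk} => {sugar}: 3 of the 4 milk baskets also contain sugar.
rule_confidence = confidence({"milk"}, {"sugar"}, transactions)
frequent = frequent_itemsets(transactions, min_support=0.6)
```

With a minimum support of 0.6, only {milk}, {sugar} and {milk, sugar} qualify in this toy data. Testing every combination grows exponentially with the number of items, which is why the first phase is described as the most complex one.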
The general scheme is that, if ABCD and AB are frequent itemsets, then it can be calculated whether the rule $AB \Rightarrow CD$ holds with sufficient confidence by computing the ratio confidence = $s(ABCD) / s(AB)$. If the confidence of the rule equals or exceeds the minconf threshold set by the user, then it is a valid rule. For an itemset of size k, there are potentially $2^k - 2$ confident rules.
Association rules can help to discover frequently purchased combinations of products within a customer segment and to provide customised service by promoting certain products or product combinations to the defined segments (Brijs et al., 2001). Therefore, in this research, frequent itemsets for each customer cluster will be generated and their combinations compared to identify the differences in purchase behaviour, in order to provide customised service.
Traditionally, support and confidence are used in association rule discovery, but Aggarwal and Yu (1998) criticized this support-confidence framework for the following main reasons.
First of all, setting good values for the support and confidence parameters in association rule mining is critical. For example, setting the support threshold too low will lead to the generation of more frequent itemsets. But even if these are statistically significant, their support is usually too low to have a significant influence.
On the other hand, setting the support threshold too high increases the probability of finding insignificant relations and of missing some important associations between items.
Further, Aggarwal and Yu (1998) and Brin et al. (1998), as cited in Brijs (2003), introduced the lift (also called interest) measure to overcome the disadvantage that confidence does not take the baseline frequency of the consequent into account.
Lift/Interest (l): Lift is computed as the confidence of the rule divided by the support of the right-hand side (RHS).
Equivalently, lift is the ratio of the probability that A and B occur together to the product of the two individual probabilities for A and B.

Lift (l) = s(A ∪ B) / (s(A) · s(B))

In order to perform predictive analysis, it is useful to discover interesting patterns in the given dataset that serve as the basis for future trends. The best and most popular algorithm used for this analysis is the Apriori algorithm (Varde et al., 2004).

Apriori Algorithm

The Apriori algorithm was proposed by Agrawal et al. (1994) (Varde et al., 2004). The algorithm finds frequent itemsets in a given data set using the anti-monotone constraint (Petrucelli et al., 1999, as cited in Varde et al., 2004).

It works under the principle that all subsets of a frequent itemset must also be frequent. In other words, if at least one subset of an itemset is not frequent, the itemset itself can never be frequent. This principle simplifies the discovery of frequent itemsets significantly because, for some itemsets, it can be determined that they can never be frequent without checking their support against the data. This is the key reason for selecting this algorithm, since the association rules for the items can be discovered more quickly and efficiently.

Given a data set, the problem of association rule mining is to generate all rules that have support and confidence greater than a user-specified minimum support and minimum confidence respectively.

Candidate sets having k items can be generated by joining large sets having k-1 items, and deleting those that contain a subset that is not large (where large refers to support above minimum support).

Frequent sets of items with minimum support form the basis for deriving association rules with minimum confidence. For A => B to hold with confidence C, C% of the transactions having A must also have B.

Though the algorithm is very efficient in association rule mining, it has certain drawbacks, identified by Margahny and Shakour (2006).
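The join and prune steps of the Apriori algorithm described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the WEKA implementation used in this research: the function name apriori, the BASKETS data and the 0.4 threshold are assumptions made for the example, and no attempt is made at the optimisations a real implementation would need.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Sequence

Itemset = FrozenSet[str]

# Hypothetical toy baskets, invented purely for illustration.
BASKETS = [
    frozenset({"milk", "sugar", "curd"}),
    frozenset({"milk", "sugar"}),
    frozenset({"milk", "bread"}),
    frozenset({"sugar", "curd"}),
    frozenset({"milk", "sugar", "bread"}),
]


def apriori(transactions: Sequence[Itemset], min_support: float) -> Dict[Itemset, float]:
    """Return every itemset whose support is at least `min_support`."""
    n = len(transactions)

    def keep_frequent(candidates: Iterable[Itemset]) -> Dict[Itemset, float]:
        # Count each candidate against the data and keep the frequent ones.
        kept = {}
        for cand in candidates:
            count = sum(1 for t in transactions if cand <= t)
            if count / n >= min_support:
                kept[cand] = count / n
        return kept

    # Level 1: frequent single items.
    singles = {frozenset([item]) for t in transactions for item in t}
    current = keep_frequent(singles)
    frequent = dict(current)

    k = 2
    while current:
        previous = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in previous for b in previous if len(a | b) == k}
        # Prune step: drop candidates with any (k-1)-subset that is not frequent,
        # so their support never has to be checked against the data.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


frequent_itemsets = apriori(BASKETS, min_support=0.4)
```

With a minimum support of 0.4, the four single items and the pairs {milk, sugar}, {milk, bread} and {sugar, curd} are frequent. Both three-item candidates are removed in the prune step because each contains an infrequent pair, so their support is never counted, which is the saving the anti-monotone principle provides.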
The first drawback is that, after discovering the 4-frequent itemsets, the algorithm needs extra data structures and processing methods, since the further itemsets can be obtained in different ways. The second is that the method is fast only when handling small data.

There are several tools available for clustering and association rule mining, such as ARMiner, Clementine (SPSS), Enterprise Miner (SAS), Intelligent Miner (IBM) and Decision Series (NeoVista). To mine association rules, WEKA is used, which is a collection of machine learning algorithms for data mining tasks, and SPSS Statistics 17.0 is used for clustering. WEKA is open-source software available online and is very efficient in mining large datasets, whereas SPSS Statistics 17.0 is a statistical analysis package available at the Brunel University computer labs.

Pattern Analysis

Pattern analysis means understanding the results obtained by the algorithms and drawing conclusions. This is the last phase in the data mining process, in which the uninteresting rules or patterns from the set found in the pattern discovery phase are filtered out (Cooley et al., 2000). The uninteresting patterns are filtered out by applying appropriate methodologies to the results, producing some interesting statistical patterns.

SUMMARY

This chapter discussed the concept of data mining, its evolution and its applications in today's business world. It then provided an overview of the role of data mining in retail supermarkets to improve customer relationships, followed by a discussion of the typical data mining approach. It also discussed the techniques and algorithms used in this project and the reasons for their choice. The following chapter will explain the research approach followed in this dissertation.

CHAPTER 3 RESEARCH APPROACH

Introduction

This chapter will discuss the research approach employed in this project.
It starts with a discussion of the research and literature review methods, followed by the data collection and the justification of the data mining approach applied to the data.

Research Methods

The research approach depends upon the objectives and aim of the study, as it assists the researcher in eliciting appropriate responses. Boyatzis (1998) defines research methods as a systematic procedure used for problem solving in which data is first collected based on the research question, hypotheses are stated, data analysis is carried out using appropriate techniques, results are interpreted and conclusions are derived. According to Hussey et al. (1997), research methods can be divided into two types: the qualitative and the quantitative approach. Oates (2006) says that quantitative research is based on numeric data or evidence, whereas qualitative research covers all non-numeric data.

In this research, a quantitative research methodology is used. A quantitative study makes use of numeric data that has been collected from a group of people interested in the subject area, which is then analysed and interprete