What is a support vector? How do you evaluate the accuracy of a classifier? Describe.[5]
Cluster Analysis
1.
Using the k-means++ algorithm and Euclidean distance, find the initial 3 cluster centroids from A1 = (3, 11), A2 = (3, 6), A3 = (9, 5), A4 = (6, 9), A6 = (7, 5), A7 = (2, 3), A8 = (5, 10). Choose (3, 11) as one of the initial centroids.[5]
2.
Differentiate between the k-means and k-medoids clustering algorithms.[5]
Data Cube Technology
1.
Explain the general strategies for cube computation.[5]
2.
List any two OLAP operations with examples. How do you compute rule coverage and rule accuracy?[5]
Data Preprocessing
1.
Describe any two methods of handling noisy data.[5]
Graph Mining and Social Network Analysis
1.
Define graph mining. Discuss the conflict between the theory of balance and the theory of status.[5]
2.
Define link mining. What are the roles of epsilon and MinPts in DBSCAN?[5]
Introduction to Data Mining
1.
Distinguish between data characterization and data discrimination. What are the challenges of multimedia mining?[5]
Introduction to Data Warehousing
1.
When do we prefer the trimmed mean for the statistical description of data? Justify with an example. Describe the multidimensional data model and the conceptual modeling of a data warehouse.[10]
Mining Frequent Patterns
1.
How do you generate strong association rules? From the following dataset, find the frequent itemsets using the FP-growth algorithm with a minimum support of 3.