1. What are the differences among the three:

(1) boxplot (2) scatter plot (3) Q-Q plot?

2. Assume a base cuboid of 10 dimensions contains only two base cells:

(1) (a1, a2, a3, b4, ..., b19, b20), (2) (b1, b2, b3, ..., b19, b20),

where ai ≠ bi for any i. The measure of the cube is count.

(a) How many nonempty aggregated cells will a complete cube contain?

(b) How many nonempty aggregated cells will an iceberg cube contain if the condition of the iceberg cube is "count ≥ 2"?

(c) How many closed cells are in the full cube?
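The definitions used in this question can be checked by brute force on a much smaller cube. The sketch below is only an illustration under stated assumptions: the 3-dimensional base cells, the `STAR` marker, and all function names are hypothetical and not part of the assignment, and each base cell is assumed to have count 1. It enumerates every nonempty cell, then filters for aggregated cells, iceberg cells, and closed cells.

```python
from itertools import product

STAR = "*"  # marks an aggregated ("any value") dimension


def generalizations(cell):
    """All cells obtained by replacing any subset of the values with '*'."""
    return product(*[(value, STAR) for value in cell])


def cube_counts(base_cells):
    """Map every nonempty cell to its count (number of base cells it covers).

    Assumes the base cells are distinct and each has count 1.
    """
    counts = {}
    for base in base_cells:
        for cell in generalizations(base):
            counts[cell] = counts.get(cell, 0) + 1
    return counts


def specializes(child, parent):
    """True if child is strictly more specific than parent."""
    return child != parent and all(
        p == STAR or p == c for p, c in zip(parent, child)
    )


def closed_cells(counts):
    """Cells that have no strictly more specific cell with the same count."""
    return [
        cell
        for cell, n in counts.items()
        if not any(m == n and specializes(other, cell) for other, m in counts.items())
    ]


# Hypothetical 3-dimensional toy cube (not the cube from the question).
toy_base = [("a1", "a2", "b3"), ("b1", "b2", "b3")]
counts = cube_counts(toy_base)
aggregated = [c for c in counts if STAR in c]        # at least one '*'
iceberg = [c for c in aggregated if counts[c] >= 2]  # condition: count >= 2
closed = closed_cells(counts)                        # base cells may be closed too
```

The enumeration grows as 2^n per base cell, so this approach is only practical for small n; for the cube in the question a counting argument is needed instead.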

3. Since items have different values and expected frequencies of sale, it is desirable to use group-based minimum support thresholds set up by users. For example, one may set up a small min support for the group of diamonds but a rather large one for the group of shoes. Outline an Apriori-like algorithm that derives the set of frequent items efficiently in a transaction database.
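One possible shape for such an algorithm is sketched below; it is not the only valid answer. It rests on two assumptions that the question does not state: an itemset mixing several groups is judged against the smallest threshold among its items' groups, and the smallest threshold over all groups (`floor`) is used for pruning, since that bound stays anti-monotone. The function name `group_apriori`, the parameters `group_of` and `min_sup`, and the toy data are all hypothetical.

```python
from itertools import combinations


def group_apriori(transactions, group_of, min_sup):
    """Level-wise search with group-based minimum support (absolute counts).

    group_of: item -> group name; min_sup: group name -> minimum support count.
    """
    transactions = [frozenset(t) for t in transactions]
    floor = min(min_sup.values())  # weakest threshold, safe for pruning

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def threshold(itemset):
        # Assumption: a mixed itemset uses the smallest threshold of its groups.
        return min(min_sup[group_of[item]] for item in itemset)

    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    while candidates:
        kept = set()  # itemsets that may still be extended
        for cand in candidates:
            s = support(cand)
            if s >= floor:
                kept.add(cand)
                if s >= threshold(cand):
                    frequent[cand] = s
        size = len(next(iter(kept))) + 1 if kept else 0
        joined = {a | b for a in kept for b in kept if len(a | b) == size}
        # Apriori pruning: every (size - 1)-subset must itself have been kept.
        candidates = [
            c for c in joined
            if all(frozenset(sub) in kept for sub in combinations(c, size - 1))
        ]
    return frequent


# Hypothetical toy data: a low threshold for jewelry, a higher one for footwear.
toy_transactions = [
    {"diamond", "shoes"},
    {"shoes", "socks"},
    {"shoes", "socks"},
    {"shoes", "socks"},
    {"socks"},
]
toy_groups = {"diamond": "jewelry", "shoes": "footwear", "socks": "footwear"}
toy_min_sup = {"jewelry": 1, "footwear": 3}
result = group_apriori(toy_transactions, toy_groups, toy_min_sup)
```

Pruning with the weakest threshold keeps the search complete, at the cost of counting some candidates that later fail their own group's threshold.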

4. For mining correlated patterns in a transaction database, all confidence ($\alpha$) has been used as an interestingness measure. A set of items $\{A_1, A_2, \ldots, A_k\}$ is strongly correlated if

\[
\frac{\mathrm{sup}(A_1, A_2, \ldots, A_k)}{\max(\mathrm{sup}(A_1), \ldots, \mathrm{sup}(A_k))} \ge \mathit{min}_\alpha \qquad (1)
\]

where $\mathit{min}_\alpha$ is the minimal all confidence threshold and $\max(\mathrm{sup}(A_1), \ldots, \mathrm{sup}(A_k))$ is the maximal support among that of all the single items.

Based on the equation above, prove that if the current k-itemset cannot satisfy the constraint, its corresponding (k+1)-itemset cannot satisfy it either.
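To make the measure concrete, the following minimal sketch computes all confidence on a toy database. The function names and the four transactions are hypothetical and chosen only for illustration; the code shows the quantity being bounded and does not replace the requested proof.

```python
def support(transactions, itemset):
    """Number of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= frozenset(t))


def all_confidence(transactions, itemset):
    """Joint support divided by the largest single-item support."""
    joint = support(transactions, itemset)
    largest_single = max(support(transactions, [item]) for item in itemset)
    return joint / largest_single if largest_single else 0.0


# Hypothetical toy database.
toy_db = [{"A", "B", "C"}, {"A", "B"}, {"A"}, {"B", "C"}]
ac_ab = all_confidence(toy_db, {"A", "B"})        # sup(A,B) = 2, max single = 3
ac_abc = all_confidence(toy_db, {"A", "B", "C"})  # sup(A,B,C) = 1, max single = 3
```

In this example extending {A, B} with C lowers the value, which is the behavior the question asks to prove in general.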

5. What are the major differences among the three:

(1) information gain (2) gain ratio (3) foil-gain
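As a point of reference for comparing the three measures, the sketch below implements their usual textbook definitions on toy data. The function names and the sample values are hypothetical; `foil_gain` takes the positive and negative counts covered by a rule before (`pos`, `neg`) and after (`new_pos`, `new_neg`) adding a condition.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on an attribute."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the split information of the attribute."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info else 0.0


def foil_gain(pos, neg, new_pos, new_neg):
    """FOIL gain of extending a rule: favors rules that keep many positives."""
    if new_pos == 0:
        return 0.0
    return new_pos * (log2(new_pos / (new_pos + new_neg)) - log2(pos / (pos + neg)))


# Hypothetical toy data: a two-valued attribute and an ID-like attribute.
labels = ["+", "+", "-", "-"]
binary_attr = ["x", "x", "y", "y"]
id_attr = ["1", "2", "3", "4"]
ig_binary = information_gain(binary_attr, labels)
ig_id = information_gain(id_attr, labels)
gr_binary = gain_ratio(binary_attr, labels)
gr_id = gain_ratio(id_attr, labels)
fg = foil_gain(pos=4, neg=4, new_pos=3, new_neg=1)
```

Both attributes separate the classes perfectly and so receive the same information gain, but the ID-like attribute gets a lower gain ratio because its split information is larger.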

6. What are the major differences between:

(1) bagging (2) boosting?

7. Given a 50 GB data set with 40 attributes, each containing 100 distinct values, and 512 MB of main memory in a laptop, outline an efficient method that constructs decision trees, and answer the following questions explicitly:

(a) How many scans of the database does your algorithm take if the maximal depth of decision tree derived is 5?

(b) How do you use your memory space in your tree induction?
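One way to reason about memory is to keep per-node count tables (attribute value by class label, in the style of RainForest AVC-sets) instead of the raw data. The back-of-the-envelope sketch below estimates their size. The question does not give the number of classes or the counter width, so `num_classes=2` and `bytes_per_count=4` are assumptions, and the function name is hypothetical.

```python
def avc_group_bytes(num_attributes, distinct_values, num_classes, bytes_per_count=4):
    """Size of one node's count tables: one counter per (attribute, value, class)."""
    return num_attributes * distinct_values * num_classes * bytes_per_count


MAIN_MEMORY_BYTES = 512 * 2**20  # 512 MB, from the question

# Assumed: 2 classes and 4-byte counters (neither is given in the question).
per_node = avc_group_bytes(num_attributes=40, distinct_values=100, num_classes=2)
nodes_in_memory = MAIN_MEMORY_BYTES // per_node
```

Under these assumptions the count tables for one node are far smaller than main memory, so the tables for many nodes of the same tree level could be held at once and filled in a single scan.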

 