Talk:Association rule learning

Untitled
This page really needs good, simple definitions of frequent set, support, confidence, and lift. Right now it's all jargon. dfrankow 16:58, 1 November 2006 (UTC)

Also it would be a good idea to check the overall use of certain words: It's certainly not a good idea to call the support threshold a "confidence threshold". Torf (talk) 20:29, 6 December 2007 (UTC)

The measure "conviction" should also be added. The conviction of a rule measures how strong the implication from A to B is: $$\mathrm{conv}(A \longrightarrow B) = \frac{P(A)\,(1 - P(B))}{P(A) - P(A,B)}$$
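For anyone checking this formula, here is a minimal Python sketch. It assumes a made-up list of transactions and hypothetical helper names (`support`, `conviction_from_confidence`, `conviction_from_probabilities`); none of these come from the article. It only checks, on toy data, that the probability form quoted above agrees with the commonly cited form (1 - supp(B)) / (1 - conf(A -> B)).

```python
from fractions import Fraction

# Hypothetical toy transactions, made up for illustration only.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread"},
    {"milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def conviction_from_confidence(a, b, transactions):
    """(1 - supp(B)) / (1 - conf(A -> B)); undefined when conf = 1."""
    conf = support(a | b, transactions) / support(a, transactions)
    return (1 - support(b, transactions)) / (1 - conf)

def conviction_from_probabilities(a, b, transactions):
    """P(A) * (1 - P(B)) / (P(A) - P(A,B)), the form quoted above."""
    p_a = support(a, transactions)
    p_b = support(b, transactions)
    p_ab = support(a | b, transactions)  # P(A,B): both A and B in the transaction
    return p_a * (1 - p_b) / (p_a - p_ab)

A, B = {"milk"}, {"bread"}
print(conviction_from_confidence(A, B, transactions))     # 4/5
print(conviction_from_probabilities(A, B, transactions))  # 4/5
```

Both forms raise `ZeroDivisionError` when every transaction containing A also contains B (confidence 1), which matches conviction being undefined (or treated as infinite) in that case.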

I rewrote part of the page and provided a more formal definition. I hope that makes it less confusing, so I removed the confusion tag. Mhahsler (talk) 16:21, 25 March 2008 (UTC)

Merger Proposal
The Association rule mining page seems very similar to this page. I think it should be merged. In fact, I'm going to do it now. —Preceding unsigned comment added by Jnnnnn (talk • contribs) 10:03, 24 September 2007 (UTC)


 * The merge was performed; this issue is resolved. WilliamKF (talk) 17:07, 22 May 2014 (UTC)

Merge One-attribute-rule to here
One-attribute-rule is an orphan and is very stubby. I propose that it should be merged to here. NHSavage (talk) 07:19, 17 May 2008 (UTC)
 * Merger complete. Also added a section on the Apriori algorithm. --NHSavage (talk) 08:53, 25 May 2008 (UTC)

I think One-attribute-rule is not a great fit for the association rule learning page. It looks more like a classification strategy than an association rule algorithm, whose main focus is precisely the large search space generated by allowing multiple items in the left-hand side. I propose undoing the merge. Mhahsler (talk) 19:39, 6 June 2008 (UTC)

I've removed it. It's 2011 and the rule was still sitting here, where it doesn't belong. — Preceding unsigned comment added by 114.134.0.90 (talk) 05:19, 16 August 2011 (UTC)

Implementations
The article has incomplete information on how to implement the method in practice and on which software can do it. I added a link to Statistica, but perhaps other software such as SPSS can do this too? Mango bush (talk) 07:09, 31 March 2010 (UTC)

Review grammar edit
I made a small grammatical fix to the article here, but it could have been fixed in another sense, so could an expert please review this and ensure we have the correct meaning now?

Current:


 * Looking for techniques that can model what the user has known (and using these models as interestingness measures) is currently an active research trend under the name of "Subjective Interestingness."

Alternate:


 * Looking for techniques that can model what the user has known (and using this model as an interestingness measure) is currently an active research trend under the name of "Subjective Interestingness."

Thanks for checking. WilliamKF (talk) 17:05, 22 May 2014 (UTC)

Possible error in definitions of support and lift
I was reading this through, and at a quick glance it seems that the definition of lift may have an error. Shouldn't the lift have an intersection instead of the union of the events in the numerator? Or is it just another definition of lift? Anyone who has the time and expertise, please review that formula and modify it if needed. — Preceding unsigned comment added by 188.238.110.230 (talk) 19:32, 24 April 2016 (UTC)

I agree. If the analogy is to conditional probability, then confidence and lift need to be expressed in terms of the intersection, not the union, of two supports (relative frequencies). But I notice that this same error occurs in the documentation for the R arules package.

I think that it's wrong to use intersection in support notation, but it's fine in probabilities. The support of $$X \longrightarrow Y$$ is $$\text{Supp}(X \cup Y)$$ and corresponds to (an estimate of) $$\mathrm{Pr}(X \cap Y)$$. Assuming $$K_T(A) = \{t \in T \mid A \subseteq t \}$$, we have $$K_T(A \cup B) = K_T(A) \cap K_T(B)$$. 37.65.4.87 (talk) 15:15, 4 November 2022 (UTC)
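The identity $$K_T(A \cup B) = K_T(A) \cap K_T(B)$$ can be checked on a small example. The following is only a sketch: the transaction list `T` and the helper names `cover` and `supp` are made up for illustration, with `cover` playing the role of $$K_T$$ by returning transaction indices.

```python
# Hypothetical transaction database T, made up for illustration only.
T = [
    {"apple", "orange"},
    {"apple", "orange", "mango"},
    {"mango"},
    {"apple", "mango"},
]

def cover(itemset, transactions):
    """K_T(A): indices of the transactions t with A a subset of t."""
    return {i for i, t in enumerate(transactions) if itemset <= t}

def supp(itemset, transactions):
    """Relative frequency of transactions that contain the whole itemset."""
    return len(cover(itemset, transactions)) / len(transactions)

A, B = {"apple"}, {"mango"}

# Union of the itemsets corresponds to intersection of their covers.
assert cover(A | B, T) == cover(A, T) & cover(B, T)

print(sorted(cover(A, T)))      # [0, 1, 3]
print(sorted(cover(B, T)))      # [1, 2, 3]
print(sorted(cover(A | B, T)))  # [1, 3]
print(supp(A | B, T))           # 0.5, estimates Pr(A and B both occur)

# The itemset intersection is empty here, and the empty itemset is
# contained in every transaction, so its support is 1.0.
print(supp(A & B, T))           # 1.0
```

The last line shows why an intersection of itemsets would not give the intended quantity: on this toy data it yields 1.0 rather than the joint frequency 0.5.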

I agree that the usage of "intersection" is incorrect in the definitions of support and lift. Unfortunately, the text that the definitions cite does use the "intersection" notation (page 250 for the definition of support). My reasoning for it being incorrect is that $$A$$ and $$B$$ are defined as itemsets (e.g., $$A$$:={apple, orange, lemon}, $$B$$:={mango, orange}), and when we take the union of the sets to create a superset, we use the word "and" to describe the elements of the superset (e.g., ($$A \cup B$$)={apple, orange, lemon, mango} = "apple, orange, lemon, and mango"); that is where we conflate plain English with mathematical set notation.
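The apple/orange example above can be spelled out with Python sets. This is only a sketch: the `baskets` list and the helper `count_containing` are hypothetical and made up for illustration; only the itemsets `A` and `B` come from the comment.

```python
# Itemsets from the comment above.
A = {"apple", "orange", "lemon"}
B = {"mango", "orange"}

union = A | B         # "apple, orange, lemon, and mango"
intersection = A & B  # only the shared item

# Hypothetical baskets, made up for illustration only.
baskets = [
    {"apple", "orange", "lemon", "mango"},
    {"apple", "orange", "lemon"},
    {"mango", "orange"},
    {"orange"},
]

def count_containing(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b)

print(sorted(union))                            # ['apple', 'lemon', 'mango', 'orange']
print(sorted(intersection))                     # ['orange']
print(count_containing(union, baskets))         # 1
print(count_containing(intersection, baskets))  # 4
```

Counting baskets that contain "A and B" means counting those that contain the union itemset (one basket here), whereas the intersection itemset {orange} is found in all four.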