Talk:Fractal tree index

Notability and COI
I've added notability and COI tags to the article. I've been unable to find articles on this topic independent of the group that invented the data structure. Without independent, in-depth reliable sources per WP:RS, this article may fail to satisfy notability guidelines per WP:GNG. The COI tag comes from the article creator also being one of the inventors of the algorithm and also an author on two of the cited papers: "Cache-Oblivoius streaming B-trees" and "The TokuFS Streaming File System". Per WP:COI, this can lead to problems with the neutrality of the article. Editors knowledgable about the topic should review the content for neutrality--although without independent reliable sources to rely on, this may be difficult. Thanks, --Mark viking (talk) 18:24, 23 April 2014 (UTC)

I've removed the notability tag. Database indexing is an important topic, B-Trees and their variants are very widely used despite their limitations. A new technique that overcomes them is notable.

I too would like to see more independent sources. Paul Foxworthy (talk) 05:55, 1 April 2016 (UTC)
 * That's not how notability works, please see WP:GNG for the criteria. I am restoring the notability tag. -- intgr [talk] 08:29, 1 April 2016 (UTC)
 * I respectfully leave your judgement there. However, the GNG guidelines say "Notability is a property of a subject and not of a Wikipedia article". I agree with that sentiment.Paul Foxworthy (talk)
 * I don't see how that quote applies here. My point is, notability is established by the existence of independent reliable sources about the subject, not hand-wavy claims like "A new technique that overcomes them is notable". A common and useful way to demonstrate notability is adding citations to such sources to the article, but you could also simply list them on the talk page. -- intgr [talk] 07:20, 4 April 2016 (UTC)

Terse line maybe?
The article says "There are several choices for how the buffers are flushed, all leading to similar I/O complexity."

It seems you are hinting at whether all buffers in a node are flushed or only the fullest buffer is flushed, and comparing/contrasting the costs resulting due to one of the 2 decisions. Is that so? If that is the case, then it is totally not clear that it is what is meant. Some more elaboration would be super helpful! — Preceding unsigned comment added by Dhruvbird (talk • contribs) 02:06, 28 April 2014 (UTC)

There are a bunch of choices for how and when to flush: as you point out, you can flush only the fullest buffer or all. But you can also flush-on-full only or also flush-on-query. None of these change the asymptotic analysis. It's all an engineering choice, based on typical workloads. Does that help? Farach (talk) 20:30, 28 April 2014 (UTC)

Thanks! That makes sense. However, I would like to see some proof or a sketch (maybe hand-wavey) of why it doesn't affect the asymptotics and whether the constants involved are different for either case. Dhruvbird (talk) 21:03, 30 April 2014 (UTC)

This looks like a cache-oblivious data structure/(crystalized) algorithm
However, there are no links to https://en.wikipedia.org/wiki/Cache-oblivious_algorithm — Preceding unsigned comment added by Dhruvbird (talk • contribs) 15:42, 28 April 2014 (UTC)

Although the COLA is cache oblivious, the current version of the FTI is not. It is based on the B-epsilon tree, which explicitly codes for a block transfer size. Farach (talk) 20:28, 28 April 2014 (UTC)

Ah! Yes, I forget - thanks for the clarification. Dhruvbird (talk) 21:03, 30 April 2014 (UTC)

Refs. for why the current version was chosen over the COLA?
You mentioned in one of the previous comments that the FT was earlier implemented using a COLA, but subsequent implementations have since been based on the B-epsilon tree. Are there any references to why this decision was made? What were the tradeoffs and why the current choice is a better one? — Preceding unsigned comment added by Dhruvbird (talk • contribs) 04:10, 1 May 2014 (UTC)