Talk:Two-phase commit protocol

Wikipedia text on Tree and Dynamic 2PC (D2PC) copied "as is" to a COIT 2008 conference paper
The entire text was copied to the following paper:


 * An Efficient Fault Tolerance Protocols for Mobile Computing Systems
 * *Kumar Surender, **Kumar Parveen, ***Chauhan R.K
 * *Lecturer Deptt. Of I.T., H.C.T.M. Kaithal
 * **Professor Deptt. Of CSE, APIIT, Panipat.
 * ***Chairman Deptt. Comp. Sc. & Application KUK

(see paper here)

that appeared in COIT 2008.

More Wikipedia text is embedded there with reference to neither Wikipedia nor the original D2PC paper:

Yoav Raz, The Dynamic Two Phase Commitment (D2PC) protocol, ICDT 1995

This is a strange practice for a scientific paper and a clear case of plagiarism, which should be strongly discouraged.

Quote:

"Plagiarism

From Wikipedia, the free encyclopedia

Plagiarism, as defined in the 1995 Random House Compact Unabridged Dictionary, is the "use or close imitation of the language and thoughts of another author and the representation of them as one's own original work." Within academia, plagiarism by students, professors, or researchers is considered academic dishonesty or academic fraud and offenders are subject to academic censure, up to and including expulsion."

-- Comps (talk) 17:18, 8 September 2009 (UTC)


 * I haven't had much time to delve into this (and am by no means an expert on the subject), but the details at Copyrights might help you with this.


 * I do not intend to go further into this or to take any measures. This is a negative phenomenon, disallowed by academic standards, and a problem for the paper's authors themselves. Exposure is the best way to fight it. I'm sure their respective institutions would view this negatively if they knew. I do not think it is a problem for Wikipedia: since I have not seen any copyright note in the paper, I do not think they have violated any of Wikipedia's guidelines regarding text reproduction and modification. It looks as if their text is also left for free use. If conference proceedings containing this paper are published with a copyright note that prohibits free use, this might be a violation.


 * On the other hand, with all the concerns about Wikipedia quality and credibility, it is quite amusing to find Wikipedia text in a scientific article (though without acknowledgment, of course)... -- Comps (talk) 14:18, 9 September 2009 (UTC)


 * Default copyright in most countries is that if there's no statement, it's not free for use. --Jerome Baum (talk) 11:39, 5 January 2012 (UTC)

See also Help desk -- Comps (talk) 16:00, 2 October 2009 (UTC)

Merged in Two-phase commit
I just merged in the content from Two-phase commit. There was almost 100% overlap in content, though there were differences in terminology and formatting. I hope you all like the result. Jamie 01:12, 14 November 2005 (UTC)

--- The following paragraph in "Disadvantages" is incorrect and needs to be removed:

"Another disadvantage is the protocol is conservative. It is biased to the abort case rather than the complete case. Also it cannot recover from cases where a node has failed in the commit stage (due to internal or network failures) after indicating that it is ready to commit. In this case, resources that committed prior to this failure cannot be rolled back."

First, it can be biased toward either commit or abort (see the literature). Also, all known commercial implementations recover correctly (to abort) if such a failure occurs. No resource commits before completion (the decision by the coordinator) --

How can it be that "all known commercial implementations recover correctly (to abort) if such a failure occurs"? At some point the coordinator must start calling the FINAL commits on all the participants, either all at once or one at a time (same thing). If one of the participants fails the final commit, won't the data be left in an inconsistent state? This is what "resources that committed prior to this failure cannot be rolled back" is suggesting.

I believe this is a MAJOR question (really, "is 'two-phase commit' a magic bullet?") and even IF it is true that "all known commercial implementations recover correctly," I think that this magical point should be included in the article somehow. --Daniel Rosenstark 01:24, 12 June 2006 (UTC)

-- As per earlier comments, I wonder if more discussion could be given in the "Disadvantages" paragraph regarding what it actually means to be biased towards aborts rather than towards commits. Nels Beckman 19:11, 6 September 2006 (UTC)

- I just modified the section on disadvantages, since it contained some mistakes.

1. The usual 2PC uses a timeout so that the coordinator does not block forever. This is the reason why the protocol is biased towards aborts.

2. I removed the passage "Also it cannot recover from cases where a node has failed in the commit stage (due to internal or network failures) after indicating that it is ready to commit. In this case, resources that committed prior to this failure cannot be rolled back." This will never occur if the protocol is implemented properly: a cohort will only vote for commit if it can guarantee successful completion, even in the case of a failure. So, prior to voting for commit, the cohort will log the necessary information for transaction redo. In addition, each cohort will keep the undo log until it has received the final decision from the coordinator. So a cohort will never receive an abort message without being prepared to abort.

- I don't see how this addresses the root issue. If a cohort has committed, and then later another cohort fails to commit, you can't undo the finalized commitment. You seem to be suggesting that the commitment isn't finalized until everyone has committed. That's a contradiction in terms. A commitment, by definition, is finalized. This isn't just a semantic argument. What you are suggesting is really just taking the problem out one level. If the commitment isn't yet finalized (whatever that means), then what happens if there's a failure in the 'finalization of the commitment'? You could have a finalization of the finalization of the commitment, but then a failure could happen there, so you'd need a finalization of the finalization of the finalization of the commitment. This is like repeatedly traveling halfway to your destination. You keep getting closer but you'll never get there.

And undoing a committed change is also unworkable. Once committed, the change is visible to other transactions. This violates the whole concept of committing transactions.

Unless someone can show some external documentation, I remain convinced that the two phase commit cannot recover reliably from a failure in the commit phase. Dubwai 21:11, 25 January 2007 (UTC)
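
For illustration only, here is a minimal Python sketch of the cohort behaviour described earlier in this thread: a "yes" vote puts the cohort into a prepared state, which is not yet a commit, and nothing becomes visible to other transactions until the coordinator's decision arrives. It is not taken from the article or from any real implementation. The names `Cohort`, `State`, `prepare` and `decide` are hypothetical, the Python list `log` merely stands in for a durable log, and the sketch assumes deferred updates (tentative writes are kept aside), so an abort only has to discard them. A system that updates data in place would need the undo log mentioned above instead.

```python
from enum import Enum


class State(Enum):
    WORKING = "working"
    PREPARED = "prepared"    # voted yes; outcome not yet known
    COMMITTED = "committed"
    ABORTED = "aborted"


class Cohort:
    """Hypothetical participant; `log` stands in for a durable on-disk log."""

    def __init__(self, name):
        self.name = name
        self.state = State.WORKING
        self.log = []        # assumed to survive crashes in this sketch
        self.visible = {}    # data other transactions can see
        self.pending = {}    # tentative writes, hidden until commit

    def write(self, key, value):
        self.pending[key] = value

    def prepare(self):
        # Phase 1: record redo information BEFORE voting yes.
        self.log.append(("prepared", dict(self.pending)))
        self.state = State.PREPARED
        return True          # the yes vote; nothing is visible yet

    def decide(self, commit):
        # Phase 2: apply the coordinator's decision; repeating it is harmless.
        if self.state in (State.COMMITTED, State.ABORTED):
            return self.state
        if commit:
            self.visible.update(self.pending)
            self.state = State.COMMITTED
        else:
            self.state = State.ABORTED
        self.pending = {}
        self.log.append((self.state.value,))
        return self.state
```

In this sketch a cohort that has voted yes can still go either way, which is the distinction between "prepared" and "committed" that the disagreement above turns on.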

- The "correct" 2PC is blocking. The coordinator resends on timeout, but NEVER gives up on waiting. This is how 2PC ensures that it always recovers reliably.

Once a cohort agrees to commit, it is assumed that it can eventually commit (the commit cannot fail). The machine can fail, but it will come back up, it will read the logs, and will wait for the coordinator to resend the message (to see if it is a commit or an abort message). The same goes for the coordinator - if the machine fails, it will read the log and find that it was supposed to commit/abort and will start resending the messages; cohorts which have already finalized will also respond to these messages. The article isn't clear about the waiting and resending - the coordinator waits for all cohorts to finalize, resending the message on timeout.

Two-phase commit is actually resilient to any number of failures. Once the coordinator decides what to do and writes it to the log file, no matter how many failures occur, at some point the transaction will finalize. The wiki article is incorrect: the referenced paper does not say that 2PC FAILS; it says that in some cases everything must block for potentially a long time (which they argue is not acceptable in most applications). But (the basic, blocking) 2PC IS resilient to multiple failures. The problem is that it is blocking, and yes, if you remove the blocking part, it is no longer resilient (duh). Hope this helps. I think the article should be corrected, but the author might want to make the corrections himself. Radu
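
The waiting and resending described above can be sketched roughly as follows. This is a toy simulation under stated assumptions, not the article's algorithm or any product's implementation: `Participant`, `deliver`, `down_for` and `run_phase_two` are hypothetical names, Python lists stand in for durable logs, and a crashed cohort is modelled simply as one that loses a fixed number of incoming messages before it comes back up.

```python
class Participant:
    """Hypothetical cohort that has already voted yes and logged 'prepared'."""

    def __init__(self, down_for=0):
        self.log = ["prepared"]
        self.down_for = down_for    # delivery attempts lost while crashed

    def deliver(self, decision):
        if self.down_for > 0:       # crashed: the message is lost, no ack
            self.down_for -= 1
            return False
        if self.log[-1] == "prepared":
            self.log.append(decision)   # finalize exactly once
        return True                     # ack, also for duplicate messages


def run_phase_two(coordinator_log, participants):
    """Resend the logged decision until every participant has acked."""
    # After a coordinator crash, the decision would be re-read from its log.
    decision = coordinator_log[-1]
    pending = list(participants)
    rounds = 0
    while pending:                  # blocking: there is no give-up path
        rounds += 1                 # each round models one timeout and resend
        pending = [p for p in pending if not p.deliver(decision)]
    return rounds
```

For example, with one healthy cohort and one that misses its first three messages, `run_phase_two(["commit"], [Participant(), Participant(down_for=3)])` takes four rounds and both logs end in `"commit"`. The loop only terminates here because the simulated outage is finite, which mirrors the point made above: the protocol stays consistent by blocking, not by giving up.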

redo log
The redo log is mentioned just once. The article needs to describe how it is actually used after an entry is added to the redo log. Mre5765 (talk) 15:40, 9 August 2011 (UTC)

Yoav Raz
The neutrality of part of this page is disputed, as part of a wider discussion. See Talk:Commitment ordering and Wikipedia talk:WikiProject Computer science. —Ruud 14:36, 23 December 2011 (UTC)
 * I find this entire discussion improper. The "disputed neutrality" is not explained. What facts in the article(s) are disputed? The "discussion" has never been carried out in Wikipedia talk:WikiProject Computer science, except for a request for attention to a long list of related "suspected" articles, at least some of which (if not all) have been similarly tagged. The "discussion" was archived in March 2012 with nothing factual said or concluded. The tag should be removed. I also see that it recently ignited another similar baseless "discussion" named "The Raz infection is spreading" in Talk:Commitment ordering. I cannot avoid thinking of "The Jews infection" in Nazi slurs. 65.96.201.116 (talk) 14:03, 15 August 2012 (UTC)
 * The archived main discussion: Wikipedia_talk:WikiProject_Computer_science/Archive_10
 * 65.96.201.116 (talk) 15:04, 15 August 2012 (UTC)