Talk:Application checkpointing

Expanding

 * Since my work is heavily related to checkpointing techniques, I'm going to revamp and expand this article. I will remove the definition already here, since it seems too small and "informal", and start from scratch. Of course, feel free to put it back where you see fit, and please tell me what you think about how this article evolves. Charles Dexter Ward 16:08, 13 January 2006 (UTC)


 * I have begun the work by detailing the (very) general properties of checkpointing techniques. I have written just a basic stub, because I am not sure how to progress. I am the author of several articles on the topic, but the problem is that I have transferred the copyright to the publishers. Among other things, the article could be extended in the following ways:


 * · Extending the properties section, covering each one in detail and explaining relations between them.


 * · Existing checkpointing solutions. Since I am the author of an existing solution, I surely wouldn't be a perfect candidate to write an NPOV contribution here.


 * · Specific properties and problems arising when checkpointing parallel systems, such as multiprocessors, multicomputers or the Grid.


 * · Detailing the design/implementation techniques used. Maybe this is a bit too advanced to be covered here, since I don't feel like turning Wikipedia into a scientific publication.

I would very much appreciate your opinions and contributions here. Also, I don't feel I should remove the stub notice myself, so if someone feels that this article is reaching "completion", feel free to do it. Charles Dexter Ward 17:07, 13 January 2006 (UTC)

There is a redirection from checkpointing to application checkpointing. I am not sure whether it is correct. Also, I think the article should cover the two major checkpointing techniques in distributed environments: coordinated and communication-induced checkpoints. Checkpointing techniques are also quite different in message-passing systems and DSM systems. I will try to write something very soon now (tm) Szopen 12:44, 31 January 2006 (UTC)

I don't think there's a problem with the redirection. I mean, the first time I arrived here was through that redirection :) And I guess someone interested in application checkpointing is probably going to search for "checkpoint" in the first place? As for checkpointing techniques in distributed environments, thanks in advance. I'm eager to see the results :) Charles Dexter Ward 13:59, 31 January 2006 (UTC)

Actually, DSM checkpointing is different from message-passing checkpointing only in details. I mean, e.g. log-based checkpointing is quite reasonable in message-passing systems, while not so in DSM systems (logging messages holding whole page contents?). (Hmmmm, I just saw I forgot that "quite different" has quite a different meaning in English than "dosc rozne" in Polish :))) ) Please check my changes for accuracy and English. Especially the explanation of the domino effect would need MAJOR rewording. I may do it some time in the future, if I have the time. Szopen 12:00, 2 February 2006 (UTC)

I will check it as soon as I have some time. I work with message-passing systems, so I am more or less familiar with the domino effect. I will try to upload some images, as it is usually better explained using a simple example. Charles Dexter Ward 20:49, 4 February 2006 (UTC)

Article quality
I am sorry, but this needs a total rewrite, given that the logic has many flaws, etc. I may get to it one day, but probably not in 2012. I tagged it anyway. History2007 (talk) 22:41, 8 February 2012 (UTC)

Does it really belong to parallel computing?
The scope of this technique does not seem to be limited to parallel computing at all.