Talk:File verification

Removal of reference links
I noticed that links to formats and software used has been removed. Surely it's useful to have examples rather than just talk about it as "theory". What if a user is looking for a tool to perform a verification, or how exactly a verification is performed? -- Lee Carré 18:47, 20 November 2006 (UTC)


 * I replaced the two external links with another internal link. Are there other internal links available? JonHarder 21:05, 20 November 2006 (UTC)


 * For existing articles; not that I know of, but I've added links for future articles if we're insisting on internal-only links -- Lee Carré 03:41, 21 November 2006 (UTC)


 * Although I prefer internal links over external links, I'm not insisting. If the externally linked products are notable, then definitely articles should be created for them. Otherwise I have a slight preference for the external links over redlinks. JonHarder 14:36, 21 November 2006 (UTC)


 * I agree with the ideals, but it seems a bit excessive to make a software page for a relatively minor tool. I completely understand for something like Adobe Photoshop, where I imagine there would be lots of detailed information, links to plugin sites, etc.
 * The reason for adding a link to HashCalc especially is that it's an easy-to-use, free, GUI, multi-input (text string, hex string, file), multi-output (several popular hash functions), generally quite handy tool; and I imagine users coming here would appreciate an example of a verification application, but an internal link it is for now :) -- Lee Carré 15:47, 21 November 2006 (UTC)

Inadequate scope.
There are clear situations where the checks listed here will not work.

The situation is both complex and simple. The data being checked may come from buffers and caches in the drive electronics, chipset, memory, virtual paging, and operating system. Apart from compensating errors producing false positives, the software/hardware may fetch the values from this chain of caches before the recording has actually reached the disk surface. The practice of writing and reading short sections encourages this. So the whole chain has to be flushed to disk and the values collected from the disk surface (there is potential for corruption between the read from disk and the data reaching the software, but most of the time that will be negligible). I can't remember the other side issues, as I am not an engineer in these areas (and neither are many software writers). However, I encourage an editor to find definitive information on this, and an engineer in this area, to get correct information.
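The flush-then-re-read idea above can be sketched in Python. This is a minimal illustrative helper (the function name and structure are my own, not from any cited tool): it copies a file, calls `os.fsync()` to ask the operating system to push its buffers to the device, then re-reads the copy and compares SHA-256 digests. Note the caveat in the comments, which is exactly the point made above: even after `fsync`, the re-read may be served from the OS page cache rather than the disk surface.

```python
import hashlib
import os

def copy_and_verify(src_path, dst_path, chunk_size=1 << 20):
    """Copy src to dst, flush to the device, then re-read dst and
    compare SHA-256 hashes. Illustrative sketch only."""
    src_hash = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while chunk := src.read(chunk_size):
            src_hash.update(chunk)
            dst.write(chunk)
        dst.flush()             # flush Python's user-space buffer
        os.fsync(dst.fileno())  # ask the OS to flush its caches to the device
    # Caveat: this re-read may still be satisfied from the OS page cache,
    # not the physical disk surface, so it is not a true end-to-end check.
    dst_hash = hashlib.sha256()
    with open(dst_path, "rb") as dst:
        while chunk := dst.read(chunk_size):
            dst_hash.update(chunk)
    return src_hash.hexdigest() == dst_hash.hexdigest()
```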

It is difficult to get software that guarantees a perfect copy, which is what brought me to this article. In situations of backup of vital data, and situations where thousands of terabytes may be involved, this can lead to an accumulation of errors.

For instance, of importance to educating readers: some users in video production mistakenly rely on regularly copying files to new drives as an archival strategy. The data is compressed, and such errors can lead to major image problems in a frame or a group of frames under heavy compression schemes, which can mean half a second of footage, and can be unrecoverable. In normal daily filming such errors may not come up, but in long-term archive replication, vital footage that needs to be kept can become corrupt. Hard drives have a very limited life, requiring eventual transfer; cheap flash storage can last as little as a year. So transfer is responsible practice in these usage situations, but selection of competent software is an issue.

The same situation arises in an everyday user's data recording. A user's data can run to many terabytes, even thousands of terabytes, going corrupt or being copied in a fashion that can go corrupt. The basis of much cloud storage is privacy invasion, as it is rather expensive to offer at reduced prices, as is commonly commented on about much of the internet economy. For each person there are privacy concerns, and once collected, the data can pass on to others around the clock. So backup to local storage is part of an answer.


 * Here is a page that lightly discusses it, but not its interaction with checks on unwritten data. The software there focuses on byte-to-byte comparison instead, which is top-level verification. The page is no longer maintained, as the developer passed away in 2017.

http://www.xxcopy.com/xxtb_045.htm
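For reference, the byte-to-byte comparison mentioned above (as opposed to hash-based verification) can be done in Python's standard library with `filecmp`. This is a generic sketch, not the method used by the linked tool; the function name is my own.

```python
import filecmp

def files_match(path_a, path_b):
    # shallow=False forces a byte-by-byte read of both files, instead of
    # comparing only os.stat() metadata (size and modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)
```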

— Preceding unsigned comment added by 49.197.126.248 (talk) 10:02, 15 February 2021 (UTC)

Merge from Simple file verification
Per comments at Deletion review/Log/2021 September 1, maybe we could merge something here? The problem is that the sfv article has next to no footnotes. Perhaps someone could rewrite it into a properly referenced section here? Ping User:S Marshall who suggested this as a merge target, if I understood him correctly, and User:Cryptic, who also mentioned Cyclic redundancy check as another target? Piotr Konieczny aka Prokonsul Piotrus| reply here 02:46, 9 September 2021 (UTC)


 * Both articles should be cleaned up. •  Sbmeirow  •  Talk  • 10:48, 9 September 2021 (UTC)


 * I'm not opposed to a merge, as long as it genuinely is a merge. I'm getting a little tired of seeing IDONTLIKEIT articles being made to disappear by redirection.  As a minimum, information on what the file format actually is needs to be here.  If that is UNDUE for this level of article, then merging is inappropriate. SpinningSpark 10:26, 11 September 2021 (UTC)


 * IMO, the SFV article should focus on the .sfv format proper, rather than attempt to cover the fundamentals of file integrity checks. This includes compatibility differences between programs that draw additional metadata from specially-formatted comment lines, support for multiline comments (useful for banners) and other syntactic extensions. This should be plenty of material for the SFV page. OmenBreeze (talk) 22:39, 21 September 2022 (UTC)

False positive or false negative?
Re your change, "false negative" means that a given condition, when tested, is found to be false when it is not. False positive means the condition is found to be true when it is not. The "given condition" here, as stated in the paragraph in question is the "file has not been corrupted". A hash collision results in the file being corrupted but testing as uncorrupted. That is a false positive for uncorrupted. SpinningSpark 06:06, 23 July 2022 (UTC)
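 * The collision scenario described above can be made concrete with a deliberately weak checksum (a toy function of my own, far weaker than the CRCs or hashes real tools use): two different byte strings produce the same check value, so a corrupted file would test as uncorrupted, i.e. a false positive for "uncorrupted".

```python
def weak_checksum(data: bytes) -> int:
    # Deliberately weak toy checksum: sum of byte values modulo 2**16.
    # Any reordering of the same bytes collides.
    return sum(data) % 65536

original = b"transfer 100 to account A"
corrupted = b"transfer 001 to account A"  # digits reordered: same byte sum
assert original != corrupted
assert weak_checksum(original) == weak_checksum(corrupted)  # collision
```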


 * Ack. Thanks for the clarification. Luchostein (talk) 21:07, 22 November 2022 (UTC)