Talk:Web document

See and contribute
See and contribute to the "source section here": there are a lot of sources about the "web document" concept! —Preceding unsigned comment added by 201.6.212.207 (talk) 15:29, 27 December 2007 (UTC)

Article motivations
There is NO STANDARD (or "more authoritative") definition.

HERE, in the article, we can settle on a better (consensual) definition. It is needed to support other articles.

-- User:Krauss

Examples of other definitions

"An HTML document that is browsable on the Web", umich.edu.

"A file or set of related files that can be transferred from a Web server to a Web client. The document can contain text, graphics, sound, video, or links to other documents", sympatico.ca

"(...) Many related Web document make up a Web presentation", scism.sbu.ac.uk.

Definition notes about old definition
Web Document is an extended (and more informal) concept than web page, meant to be protocol-independent and format-independent. It is required only that a web document:
 * 1) Is transferred over any Internet communication protocol.
 * 2) Format: has any valid MIME Content-Type and a usual format.

A PDF document requested over the SFTP or SMTP protocols, for example, is a web document, but not a web page. On the other hand, every web page is also a web document.

About the 2 conditions:
 * They are not axioms; they are informal, like a fuzzy set.
 * About cond. 1 (transferred): see also application layer. The primary objective (and the difference from generic documents) is web accessibility, since the benefits are many and obvious.
 * Cond. 2 (another way to say "usual format"): there is a "standard method" to transform (exactly) the web document into a file (or a "hub file"), and this file can be viewed with a usual (and usually configured) web browser.
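
The two conditions and the PDF example above can be sketched in a few lines of Python. This is only an illustration, not an authoritative definition: the `Transfer` type and both function names are hypothetical, and narrowing "web page" to HTML over HTTP(S) is an assumption made for the sketch.

```python
from dataclasses import dataclass

# Assumption for this sketch: a "web page" is HTML (or XHTML) served over HTTP(S).
WEB_PAGE_PROTOCOLS = {"http", "https"}
WEB_PAGE_TYPES = {"text/html", "application/xhtml+xml"}


@dataclass(frozen=True)
class Transfer:
    """Hypothetical record of one document transfer."""
    protocol: str      # e.g. "http", "sftp", "smtp"
    content_type: str  # MIME Content-Type, e.g. "application/pdf"


def is_web_document(t: Transfer) -> bool:
    # Cond. 1: transferred over some protocol.
    # Cond. 2: carries a syntactically plausible MIME type ("type/subtype").
    main, sep, sub = t.content_type.partition("/")
    return bool(t.protocol) and bool(main) and sep == "/" and bool(sub)


def is_web_page(t: Transfer) -> bool:
    # Every web page is also a web document (container concept).
    return (
        is_web_document(t)
        and t.protocol.lower() in WEB_PAGE_PROTOCOLS
        and t.content_type.lower() in WEB_PAGE_TYPES
    )


pdf_over_sftp = Transfer("sftp", "application/pdf")
html_over_http = Transfer("http", "text/html")
```

With these definitions, `pdf_over_sftp` counts as a web document but not as a web page, while `html_over_http` counts as both. Note that the check in `is_web_document` accepts almost any transfer, which reflects how informal the two conditions are.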

What is and what not?

 * What is a web page? a  is a web page?
 * Are all XML documents web documents?

Original research or need of references?
Do we need this article (and the web document concept) on Wikipedia? Is it an "original research concept"?

Please state your position below.
 * Just try to keep adding citations (as I see you have done already) to help put this article in the context of pre-existing usage. If I come across good cites I will add as well. IMHO, the article is steadily improving, so this should be less and less an issue over time. Thanks! dr.ef.tymac 16:57, 24 November 2006 (UTC)
 * I added the "original research" and "Primarysources" templates. Note that the sources listed here in the talk page under "Some sources" are not the kind of third-party "reliable sources" required by Wikipedia. And they may just be using "web document" as a natural fuzzy term, without trying to define it as a useful term in and of itself. Any sort of computer file at all could be called a "web document" according to the criteria listed here, since communications protocols can transport anything. MIME is flexible enough that it can be used to tag any kind of file format at all, so it does not provide a useful basis for distinguishing "web documents" from anything else. Any program can be associated with any MIME type, so the term "MIME-compatible application" is not helpful. If you can find third-party assertions to the contrary, please support the assertions in the article with proper citations. If not, it may be appropriate to delete this article. ★NealMcB★ (talk) 22:17, 26 December 2010 (UTC)

Some sources
THERE ARE A LOT OF SOURCES ABOUT "WEB DOCUMENT" CONCEPT!


 * Articles:
 * "Query type classification for web document retrieval", see ACM SIGIR Conference.
 * "Web document clustering: a feasibility demonstration", O. Zamir and O. Etzioni. See ACM SIGIR Conference.
 * "Flexible Web Document Analysis for Delivery to Narrow-Bandwidth Devices", G. Penn, J. Hu, H. Luo, R. McDonald. See ICDAR'01.
 * "On reliable and scalable peer-to-peer Web document sharing", L. Xiao, X. Zhang, and  Z. Xu. See IEEE Symp.
 * Patents:
 * "Web document based graphical user interface", Arthur A. Van Hoff, Patent number: 5802530.
 * Books:
 * "Web Document Analysis: Challenges and Opportunities", Apostolos Antonacopoulos. World Scientific 2003. ISBN 9812385827.
 * Whole conferences:
 * WDA2001, the "First International Workshop on Web Document Analysis".
 * WDA2005, "Web Document Analysis 2005".
 * ... —Preceding unsigned comment added by 201.6.212.207 (talk) 15:35, 27 December 2007 (UTC)

Suggestions
New suggestions for reviewing and/or redoing parts of the article.

About Lead section
See WP Lead Sec. and WP intro style.


 * 1) put the comparison list into a table, like below.
 * 2) put back the basic concepts of the definition (see : the 2 requirements/conditions/axioms and an explicit form of "every web page is also a web document" (container concept)). Perhaps something like "... while XML and Fast Infoset are not widely used ... it is necessary to remember web docs".

Comparison table suggestion:

              WPage     Wdoc
 Main prot.   HTTP      HTTP or etc.
 Main format  HTML      HTML or etc.
 Context      Page      Page or comp., attach., etc.
 Viewer       Browser   Browser or Mime-def. app.

"The first sentence should give (...) relevant characterization of the subject. If the subject is amenable to definition, the first sentence should give a concise one that puts the article in context. Rather than being typically technical, it should be a concise, conceptually sound, characterization driven, encyclopedic definition." WP intro style. I think that if the table is concise (and does not grow), it can stay there. -- Krauss 2 December 2006

"web template series" table
I think this article is not a relevant part of the "Web Template Systems outdoor" (table in the lead section). The (wiki history of) creation/motivations stays on the Talk page and does not go into the article. If webdoc keeps its links in the "See also" section, and the articles of the series are consistent in their links, terminology, definitions, etc. about webdoc, that is sufficient. -- Krauss 2 December 2006
 * DONE: Remove lead table. I do think the lead table is related to this, especially since 'web document' is a central concept to the template system series, but 'web document' also applies to other areas (such as web services, email attachments etc.) so I agree with taking it out. This presents another situation, however, where extra support may be needed from primary authority. If we say "web document" is an essential term to understand the template system series, it is easier to justify it as an independent term. If we say it is an independent term, and it stands on its own merit as a separate article, we may need additional support for that, so the article can withstand closer scrutiny. dr.ef.tymac 15:12, 2 December 2006 (UTC)

use class=wikitable?
Hello! Is there a reason why you do not want class=wikitable? Without it, the uneven alignment of the text makes it more difficult to read and understand. dr.ef.tymac 17:02, 22 December 2006 (UTC)

Appearance of terms
(section "Appearance of terms" moved from article to here)

The term "Web Document" appears in Google searches; "Internet document" appears less frequently.

The term "Dynamic Web Documents" appears in the titles of scientific articles, such as article1 or article2. —Preceding unsigned comment added by 201.52.194.78 (talk) 13:17, 6 February 2008 (UTC)

W3C definition?
Is this a good reference? See http://www.w3.org/TR/cooluris/#oldweb

--187.39.190.83 (talk) 05:03, 27 December 2010 (UTC)
 * "Like everything on the traditional Web, each of the pages mentioned above are Web documents. Every Web document has its own URI. Note that a Web document is not the same as a file: a single Web document can be available in many different formats and languages, and a single file, for example a PHP script, may be responsible for generating a large number of Web documents with different URIs. A Web document is defined as something that has a URI and can return representations (responses in a format such as HTML or JPEG or RDF) of the identified resource in response to HTTP requests".
 * "In technical literature, such as Architecture of the World Wide Web, Volume One [AWWW], the term Information Resource is used instead of Web document".
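
The quoted definition (one URI, several representations returned in response to requests) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the URI, the stored bodies, and the `respond` function are hypothetical, and the `Accept` handling is deliberately simplified (no q-values, no wildcards), so it is not a faithful HTTP implementation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical data: one web document (one URI) with two representations.
REPRESENTATIONS: Dict[str, Dict[str, bytes]] = {
    "http://example.org/about": {
        "text/html": b"<html><body>About</body></html>",
        "application/rdf+xml": b"<rdf:RDF/>",
    },
}


def respond(uri: str, accept: str) -> Optional[Tuple[str, bytes]]:
    """Return (media type, body) for the first acceptable representation.

    Simplification: `accept` is treated as a comma-separated list in
    preference order; parameters after ';' are ignored.
    """
    available = REPRESENTATIONS.get(uri)
    if available is None:
        return None  # no web document is identified by this URI
    for part in accept.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in available:
            return media_type, available[media_type]
    return None  # the document exists, but not in a requested format


html = respond("http://example.org/about", "text/html")
rdf = respond("http://example.org/about", "application/rdf+xml, text/html")
```

Both calls address the same web document; only the returned representation differs, which is the distinction between a web document and a file that the quote draws.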

Original Research
I have removed all the original research, just leaving the base reference (expanded) and some explanation. Justinc (talk) 16:21, 1 May 2011 (UTC)