Talk:Component (graph theory)/GA1

GA Review
The edit link for this section can be used to add comments to the review.

Reviewer: Ovinus (talk · contribs) 23:19, 28 February 2022 (UTC)


 * GA review (see here for what the criteria are, and here for what they are not)


 * 1) It is reasonably well written.
 * a (prose, spelling, and grammar): b (MoS for lead, layout, word choice, fiction, and lists):
 * 2) It is factually accurate and verifiable.
 * a (reference section): b (citations to reliable sources): c (OR): d (copyvio and plagiarism):
 * 3) It is broad in its coverage.
 * a (major aspects): b (focused):
 * 4) It follows the neutral point of view policy.
 * Fair representation without bias:
 * 5) It is stable.
 * No edit wars, etc.:
 * 6) It is illustrated by images and other media, where possible and appropriate.
 * a (images are tagged and non-free content have non-free use rationales): b (appropriate use with suitable captions):
 * 7) Overall:
 * Pass/Fail:

Hi, I'll take this one up. A cursory look through the article reveals it's in good shape. For reference, I know graph theory concepts that would be found in a typical undergrad computer science sequence. Any edits I make to the article will be minor (and not something I'll be a stickler on). Ovinus (talk) 23:19, 28 February 2022 (UTC)

Lead

 * Would it be fair to include something like "the analogous concept for directed graphs is called a strongly connected component" somewhere, perhaps at the end of the first paragraph?
 * Is it really "the" analogous concept? Turn it around: if you're seeking the analogous concept to strongly connected components in undirected graphs, would it be connected components, biconnected components, or something else? Algorithmically, biconnected and strongly connected are a lot closer to each other than they are to the components of this article. Knuth would argue (and has, recently) that the correct analogy from connected components of undirected graphs is to weak components of directed graphs. Another concept frequently used for directed graphs, and also analogous in some ways, is the components of a directed graph obtained by forgetting the directions and using undirected components. —David Eppstein (talk) 23:47, 28 February 2022 (UTC)
 * Fair point; maybe "Similar concepts" or "Similar concepts accounting for additional structure". My immediate reaction to the lead sentence was generalizing the idea to other types of graphs. Ovinus (talk) 00:03, 1 March 2022 (UTC)
 * So this is intended as a summary (as everything in the lead should be a summary) of the last two lines of the "Definitions and examples" section? It would have to be very brief to summarize those lines rather than repeating them. —David Eppstein (talk) 00:49, 1 March 2022 (UTC)
 * True; I'm convinced. Ovinus (talk) 01:26, 1 March 2022 (UTC)
 * Set off the definition of "giant component" with dashes or parens
 * It is correct grammar to set off parenthetical clauses such as this one with commas. They could alternatively be replaced by dashes, but according to, "dashes have a slightly more emphatic feel, making the reader focus on this information". I'm not convinced that adding additional emphasis to these glosses is desirable here. Also, whatever is done to the definition of giant component should certainly be done to the definition of percolation threshold (because those two parts of the sentence are parallel), and I'm not convinced that it's a good idea to have sentences with more than one parenthetical clause (it makes the structure look confused and cumbersome). —David Eppstein (talk) 00:55, 1 March 2022 (UTC)
 * This is true in isolation and is fine throughout the article, but in this instance it is part of a list. (Indeed the "incidence of" groups implies the commas are parenthetical and not in a list, but this is only confirmed in the second "of" (for a percolation threshold), which means that a reader might have to backtrack.) If dashes or parentheses are objectionable, a semicolon after "others" works too. Ovinus (talk) 01:10, 1 March 2022 (UTC)
 * Ok, changed to a semicolon. —David Eppstein (talk) 07:36, 1 March 2022 (UTC)
 * constructed in linear time I'd specify "linear time in the number of vertices" to ensure it's not confused as linear in component count. ("sublinear time algorithms ... " later is clear)
 * It's not actually linear in the number of vertices. Linear time always means, linear in the total size of the input, and here that means vertices+edges. —David Eppstein (talk) 00:55, 1 March 2022 (UTC)
 * Alright. Ovinus (talk) 01:13, 1 March 2022 (UTC)
 * low time per change Would "constant time", "roughly constant time" or "amortized constant time" be too technical here? Not actually true, my bad

Definitions and examples

 * For instance, the graph shown in the first illustration has three components. I'd be a bit more explicit that the components are disjoint, or we're just stating something obvious; also, "three components" is stated in the caption. Maybe something like "For instance, the three components of the graph in the first illustration are disjoint, and together constitute the whole graph."
 * Writing it that way would suggest, incorrectly, that it's somehow possible for other examples to have components that are not disjoint, or do not cover the graph. —David Eppstein (talk) 23:51, 28 February 2022 (UTC)
 * How does it suggest that? We're explicitly considering examples of this rule ("For instance"). In any case, the disjoint part should be mentioned, or else the statement feels vacuous. Ovinus (talk) 00:05, 1 March 2022 (UTC)
 * The graph has exactly three components, in exactly the sense of the word "component" used in this article. There is no other number of components that it could have. Qualifiers do not help make the statement more true. Adding qualifiers creates the sense that the qualifiers are added for a reason, rather than being completely redundant, and therefore suggest that without the qualifier there might be some other number of components, which is false. —David Eppstein (talk) 00:47, 1 March 2022 (UTC)
 * I'm confused. Is the "For instance" sentence an example of the definition of component instead of the immediately preceding sentence? If so, I'd remove "For instance" entirely and maybe connect it to the next sentence with a semicolon, to show the natural connection between the ideas. Ovinus (talk) 00:53, 1 March 2022 (UTC)
 * I moved the "for instance" sentence earlier in the paragraph, to make clearer that what it is an example of is the definition in the first sentence, and not so much the additional properties in the later sentences of the paragraph. —David Eppstein (talk) 07:44, 1 March 2022 (UTC)
 * as sets of vertices, the equivalence classes of vertices, The second part can be removed as the meaning is clear
 * Ok, but I changed "as sets" to "as the sets" to make it less ambiguous that it is the same sets already being discussed.
 * for more advanced forms of graph connectivity I'd just say "for other forms"; advancedness is subjective
 * Ok; added weak component with another reference. —David Eppstein (talk) 07:44, 1 March 2022 (UTC)

Number of components

 * topological connected components I'd say "topologically", but this is minor and perhaps slightly imprecise
 * I think it is a little imprecise. I meant "connected components as defined for topological spaces", with "topological" modifying the noun phrase "connected components", rather than "components that are topologically connected", with "topologically" modifying the adjective "connected". —David Eppstein (talk) 07:47, 1 March 2022 (UTC)
 * and representing its edges as line segments between those points I wouldn't call this a "representation" per se, since information about vertices of degree two is lost in the topological setting. Maybe "considering"? nvm
 * and in topological graph theory it can be interpreted as the zeroth Betti number of the graph This seems a bit redundant, since a construction was just essentially given (so it's just terminology)
 * Besides percolation theory–type questions which you discuss, are there any important applications to infinite graphs? It seems infinite graphs aren’t as frequently studied as I thought
 * Not anywhere near as much as finite graphs, definitely, to the point where a lot of references are sloppy and make certain claims as being true for all graphs when really they're true only for finite graphs and require more care for infinite graphs. That's true in many of our Wikipedia articles, too. I sprinkled a few more "finite"s through the article to be safe in cases where this was a little unclear. —David Eppstein (talk) 07:56, 1 March 2022 (UTC)
 * Makes sense. I have a conceptual question if you don't mind: is the notion of an infinite graph, treated as an object in its own right, ever useful in your experience/research? Most of the graph theory concepts I know of can be sensibly applied to countably infinite graphs, although connectedness would probably require some adjustment (maybe requiring finite paths or indexing the path with $$\mathbb{Z}$$?). Or is it usually easier to deal with the finite case and let the number of vertices tend to infinity? Ovinus (talk) 17:33, 1 March 2022 (UTC)
 * Do you mean, have I ever used the concept in my own research? Or do you mean, does my experience lead me to believe that it's a worthwhile concept to explore? It would be difficult to do without the infinite grid graphs and their relatives; they've come up repeatedly as an organizing tool in my papers despite those papers mostly being about finite things. So yes, infinite graphs are useful. I've only rarely studied infinite graphs directly in my own research; an exception is the paper "Combinatorics and geometry of finite and infinite squaregraphs". —David Eppstein (talk) 06:01, 2 March 2022 (UTC)
 * Interesting! Thanks. Ovinus (talk) 17:27, 2 March 2022 (UTC)
 * "The same number arises in other ways in graph theory as well." Specify number of components so it's not confused with the rank of the graph; can replace "Numbers of components plays" later with "This number plays" if wanted
 * Ok. —David Eppstein (talk) 07:56, 1 March 2022 (UTC)
 * Perhaps in the sentence on topology, a brief mention of topological separability as the analog of disjointness in the graph case?
 * Added something about this as a gloss for the meaning of topological connected components. —David Eppstein (talk) 06:06, 2 March 2022 (UTC)

Algorithms
Mostly prose nitpicks here:
 * To find all the components of a graph, loop through its vertices The imperative mood feels a bit off here; maybe just "Finding all the components of a graph is/can be done by looping ... "
 * Ok, reworded in declarative. —David Eppstein (talk) 07:46, 2 March 2022 (UTC)
 * breadth first or depth first hyphenate
 * Done. Yes, it should be consistent with the earlier use in the same paragraph. —David Eppstein (talk) 07:46, 2 March 2022 (UTC)
 * Hopcroft & Tarjan (1973) describe ... "well known" What is the utility of this vs. a plain citation? If the main thrust is "well known" then I think "essentially this algorithm" can be removed
 * It was intended primarily as a way of giving some history: that the algorithm was already well known in 1973, and therefore presumably invented significantly earlier. It was "essentially this algorithm" because the text of the reference is a little vague about how to find the next vertex that is not yet included in a component, and the more precise part of the reference is formatted as a flow chart at a much lower level of detail than would typically be used now, so I wanted to phrase it in a safe way that would allow for the possibility of some of those low-level details being different in an inessential way. —David Eppstein (talk) 07:52, 2 March 2022 (UTC)
 * Sounds good. Ovinus (talk) 17:27, 2 March 2022 (UTC)
 * or as likely to be part of objects depicted in the image rm "depicted in the image"
 * Well, I can see that this is somewhat redundant, but we somehow need to disambiguate "objects" meaning things that are visible in the image from "objects" meaning data in the computer code for processing the image. —David Eppstein (talk) 07:54, 2 March 2022 (UTC)
 * How about "part of depicted objects" as a compromise? Ovinus (talk) 17:29, 2 March 2022 (UTC)
 * Ok, done. —David Eppstein (talk) 17:46, 2 March 2022 (UTC)
 * Are there any appealing images of CCL available?
 * The ones I know about are in connected-component labeling and Commons:Category:Connected component labeling but I'm not convinced they are adequate. File:Two-pass connected component labeling.svg at least tries to show off something of how it works but fails to provide a non-trivial test case (one where a top-down scan has to merge what it thinks are multiple components). And I think the use of the 8-neighborhood instead of the 4-neighborhood in the other ones may be unnecessarily confusing. —David Eppstein (talk) 08:03, 2 March 2022 (UTC)
 * Researchers in this area rm "in this area"
 * Ok. —David Eppstein (talk) 17:45, 2 March 2022 (UTC)
 * and labeling them as objects This seems quite clear; removing it would make the sentence more concise
 * Ok, removed. —David Eppstein (talk) 17:47, 2 March 2022 (UTC)
 * allowing it to be processed in pixel order Provide why this is important (I assume it's to do with cache locality?)
 * The source had a line or two about this, so I added a line here based on that. That's one reason but not the only reason. Certain kinds of hierarchical image representations, for instance, might allow fast sequential access but not fast random access to the pixels. —David Eppstein (talk) 17:45, 2 March 2022 (UTC)
 * Great. Ovinus (talk) 18:35, 2 March 2022 (UTC)
 * The second $$O(\alpha(n))$$ should be $$\alpha(n)$$ or just $$\alpha$$
 * Ok. —David Eppstein (talk) 17:45, 2 March 2022 (UTC)
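The loop-plus-traversal approach discussed in this section can be sketched as follows. This is an illustrative sketch only, not code from the article; the function and variable names are my own choosing. It runs in linear time in vertices plus edges, as noted above: loop through the vertices, and start a breadth-first search from each vertex not yet assigned to a component.

```python
from collections import deque

def connected_components(adjacency):
    """Return the connected components of an undirected graph.

    `adjacency` maps each vertex to an iterable of its neighbours.
    Each vertex and edge is examined a constant number of times,
    so the total running time is linear in vertices + edges.
    """
    component_of = {}   # vertex -> index of its component
    components = []     # list of components, each a list of vertices
    for start in adjacency:
        if start in component_of:
            continue                    # already found by an earlier search
        label = len(components)
        component_of[start] = label
        members = [start]
        queue = deque([start])
        while queue:                    # breadth-first search from `start`
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in component_of:
                    component_of[w] = label
                    members.append(w)
                    queue.append(w)
        components.append(members)
    return components
```

For example, a six-vertex graph with edge sets {1,2}, {1,3}, {4,5} and an isolated vertex 6 (three components, like the article's first illustration) yields three member lists. A depth-first search would work equally well here; only the traversal order differs.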

In random graphs

 * "but should not depend on the choice of $$n$$" What exactly do you mean here?
 * There's a limiting argument going on in the earlier text about "high probability": we're taking a limit as $$n\to\infty$$. As we change $$n$$ in this way, $$\varepsilon$$ should remain unchanged. It doesn't work, for instance, to set $$\varepsilon=1/n$$ and then try to take the limit. You can define a limit that way, but the thing that you want to be true with high probability will not actually be true in that limit. —David Eppstein (talk) 07:59, 1 March 2022 (UTC)
 * I understand that part, but this feels like a convoluted way of saying "fix $$\varepsilon$$ first, then let $$n\to\infty$$". Is "The analysis depends on a fixed $$\varepsilon>0$$ which can be chosen arbitrarily close to 0" too imprecise? Also, is the asymptotic width of the transition period in terms of $$n$$ known? Ovinus (talk) 17:27, 2 March 2022 (UTC)
 * Just this one point here. Ovinus (talk) 18:35, 2 March 2022 (UTC)
 * It's really a question of quantifier order. The correct order is something like "for all epsilon: for all delta: exists N: for all n: (n > N and p < (1-epsilon)/n) => Pr[all components are small] > 1-delta". The wrong order is something like "forall delta: exists N: for all epsilon: forall n: ..." Delta and N aren't even explicit in the text but are implied by the wording "arbitrarily close to one for sufficiently large values". Anyway, I'm not sure "a fixed epsilon>0" really conveys the intended meaning, because maybe that would allow us to set epsilon=1/n? That's fixing it, isn't it? In fact this sort of choice of epsilon to be a function of n rather than a value independent of n is commonly done (with other functions than 1/n), in studying the transition area in finer detail. A lot more is known about what happens there, but I didn't think this article was the place to get into details. —David Eppstein (talk) 22:01, 2 March 2022 (UTC)
 * Indeed it's a question of order of two limit processes, hence why I thought it'd be easiest to just say "choose epsilon first." But I agree it's distracting to mention at the beginning, "choose epsilon > 0," before introducing n. How about "a positive constant which can be chosen to be arbitrarily close to zero but independently of n"? Ovinus (talk) 22:11, 2 March 2022 (UTC)
 * Ok, but I reversed it to put the independence first before the close to zero part; I think it's a little tighter that way. —David Eppstein (talk) 06:51, 5 March 2022 (UTC)
 * "y=y(np)" I'm guessing you mean that y is only dependent on the value np, but the notation is confusing. Maybe just "where y is the solution to ..., only dependent on the value np"
 * I think someone else must have added that phrasing, so I can only speculate on the intended meaning. I agree that it is confusing, and unnecessary. I just removed the "=y(np)" part. I don't think we need to be explicitly told that it only depends on $$n\cdot p$$ and not separately on both variables; what use are we making of this information? —David Eppstein (talk) 17:50, 2 March 2022 (UTC)
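The threshold behaviour discussed in this section, that for $$p<(1-\varepsilon)/n$$ all components are small with high probability while for $$p>(1+\varepsilon)/n$$ a giant component covering a constant fraction of the vertices appears, can be observed empirically. The sketch below is mine, not from the article: it samples a $$G(n,p)$$ random graph and tracks components with a disjoint-set (union-find) structure, the same structure mentioned in the algorithms section as taking inverse-Ackermann amortized time per operation.

```python
import random

class UnionFind:
    """Disjoint-set forest with union by size and path halving."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra              # attach smaller root to larger
        self.size[ra] += self.size[rb]

def largest_component_fraction(n, p, rng):
    """Sample G(n, p) and return |largest component| / n."""
    uf = UnionFind(n)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:          # each edge present independently
                uf.union(i, j)
    return max(uf.size[uf.find(v)] for v in range(n)) / n
```

With $$np=0.5$$ (subcritical) the largest component is typically a vanishing fraction of the vertices, while with $$np=2$$ (supercritical) it typically covers well over half of them, illustrating the transition around $$p=1/n$$.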

Note
I totally overestimated my knowledge in this area; if you felt my review wasn't thorough let me know. Ovinus (talk) 00:32, 1 March 2022 (UTC)
 * I don't have any complaints so far. But if you see something that you think could be understandable for someone with your level of technical background, but isn't, please let me know; some of this material is unavoidably technical but I don't want it to be technical when it doesn't have to be (and neither does GA criterion 1a). An undergraduate CS student level is a good target for much of this material; at least, for the material in the definition and algorithms sections. The number of components and random graphs sections may need to be a little more advanced, maybe an undergraduate math student level. —David Eppstein (talk) 08:02, 1 March 2022 (UTC)
 * Great, thanks. Overall it was quite understandable, despite being a formal mathematical approach and including the random graphs part. Ovinus (talk) 17:27, 2 March 2022 (UTC)

I think I've responded to everything above, so please take another look. —David Eppstein (talk) 17:51, 2 March 2022 (UTC)
 * One point above and one last nitpick, allows them to be subjected to additional processing -> "allows additional processing", since you immediately describe the processing afterward and it's clear that it's on the identified components. I'll promote after these are addressed. Ovinus (talk) 18:35, 2 March 2022 (UTC)
 * Awesome; passing. Thank you for the excellent work! Ovinus (talk) 16:00, 5 March 2022 (UTC)